Protein

Mucin-5AC

Gene

MUC5AC

Organism
Homo sapiens (Human)
Status
Reviewed
Annotation score: 5 out of 5
Protein existence: Experimental evidence at protein level

Function

Gel-forming glycoprotein of gastric and respiratory tract epithelia that protects the mucosa from infection and chemical damage by binding to inhaled microorganisms and particles that are subsequently removed by the mucociliary system.

GO - Molecular function

  • extracellular matrix structural constituent Source: UniProtKB

GO - Biological process

Enzyme and pathway databases

Reactome - a knowledgebase of biological pathways and processes
R-HSA-5083625 Defective GALNT3 causes familial hyperphosphatemic tumoral calcinosis (HFTC)
R-HSA-5083632 Defective C1GALT1C1 causes Tn polyagglutination syndrome (TNPS)
R-HSA-5083636 Defective GALNT12 causes colorectal cancer 1 (CRCS1)
R-HSA-5621480 Dectin-2 family
R-HSA-913709 O-linked glycosylation of mucins
R-HSA-977068 Termination of O-glycan biosynthesis

Protein family/group databases

MEROPS protease database
I08.951

Names & Taxonomy

Protein names
Recommended name: Mucin-5AC (Curated)
Short name: MUC-5AC (Curated)
Alternative name(s):
Gastric mucin (1 Publication)
Lewis B blood group antigen (1 Publication)
Short name: LeB (1 Publication)
Major airway glycoprotein (1 Publication)
Mucin-5 subtype AC, tracheobronchial
Tracheobronchial mucin (1 Publication)
Short name: TBM (1 Publication)
Gene names
Name: MUC5AC (1 Publication, Imported)
Synonyms: MUC5 (1 Publication)
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 11

Organism-specific databases

Eukaryotic Pathogen Database Resources (EuPathDB)
HostDB:ENSG00000215182.8

Human Gene Nomenclature Database (HGNC)
HGNC:7515 MUC5AC

Online Mendelian Inheritance in Man (OMIM)
158373 gene

neXtProt; the human protein knowledge platform
NX_P98088

Subcellular location

Keywords - Cellular component

Secreted

Pathology & Biotech

Mutagenesis

Feature key | Position(s) | Description | Evidence | Length
Mutagenesis | 2122 | W → A: No binding to mannose-specific lectin. Loss of secretion from the endoplasmic reticulum. | 1 Publication | 1
Mutagenesis | 4926 | D → A or E: Abolishes cleavage. | 1 Publication | 1
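
The mutagenesis rows above can be cross-checked against the canonical sequence displayed in the Sequence section of this entry. The following is a minimal sketch, not part of the entry itself: the helper names `residue_at` and `mutate` are hypothetical, and the two short fragments are copied from residues 2121 to 2130 and 4921 to 4930 of the displayed sequence.

```python
def residue_at(fragment: str, fragment_start: int, position: int) -> str:
    """Return the residue at a 1-based protein position inside a fragment."""
    index = position - fragment_start
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} is outside the fragment")
    return fragment[index]


def mutate(fragment: str, fragment_start: int, position: int,
           wild_type: str, replacement: str) -> str:
    """Apply a single substitution after confirming the wild-type residue."""
    found = residue_at(fragment, fragment_start, position)
    if found != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {found}")
    index = position - fragment_start
    return fragment[:index] + replacement + fragment[index + 1:]


# Fragments copied from the displayed canonical sequence (illustrative only).
FRAG_2121 = "TWTTWFDVDF"  # residues 2121-2130
FRAG_4921 = "CSGWGDPHYI"  # residues 4921-4930

w2122a = mutate(FRAG_2121, 2121, 2122, "W", "A")
d4926a = mutate(FRAG_4921, 4921, 4926, "D", "A")
print(w2122a)  # TATTWFDVDF
print(d4926a)  # CSGWGAPHYI
```

Checking the wild-type residue before substituting catches off-by-one errors in the coordinates, which are 1-based in this entry.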

Organism-specific databases

DisGeNET
4586

Open Targets
ENSG00000215182

Chemistry databases

ChEMBL database of bioactive drug-like small molecules
CHEMBL3713020

Polymorphism and mutation databases

Domain mapping of disease mutations (DMDM)
160370004

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Evidence | Length
Signal peptide | 1 – 27 | - | Sequence analysis | 27
Chain (PRO_0000158957) | 28 – 5654 | Mucin-5AC | Sequence analysis | 5627
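
Feature positions in this entry are 1-based and inclusive, so turning them into string slices needs a small adjustment. The sketch below illustrates this with the signal peptide (1 to 27) and the start of the mature chain (28 onward). The helper name `to_slice` is hypothetical, and `N_TERMINUS` holds only the first 50 residues copied from the displayed sequence.

```python
def to_slice(start: int, end: int) -> slice:
    """Convert 1-based inclusive feature coordinates to a Python slice."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return slice(start - 1, end)


# Residues 1-50 of the canonical sequence shown in this entry.
N_TERMINUS = "MSVGRRKLALLWALALALACTRHTGHAQDGSSESSYKHHPALSPIARGPS"

signal_peptide = N_TERMINUS[to_slice(1, 27)]
mature_start = N_TERMINUS[to_slice(28, 50)]

print(signal_peptide)              # the 27-residue signal peptide
print(mature_start[:3])            # first residues of the mature chain
print(5654 - len(signal_peptide))  # chain length once the signal peptide is removed
```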

Amino acid modifications

Feature key | Position(s) | Description | Evidence | Length
Disulfide bond | 103 ↔ 111 | - | PROSITE-ProRule annotation | -
Glycosylation | 205 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 258 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 415 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Disulfide bond | 456 ↔ 464 | - | PROSITE-ProRule annotation | -
Glycosylation | 524 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 1308 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 1389 | C-linked (Man) tryptophan | Curated | 1
Glycosylation | 1584 | C-linked (Man) tryptophan | Curated | 1
Glycosylation | 1749 | C-linked (Man) tryptophan | Curated | 1
Glycosylation | 1957 | C-linked (Man) tryptophan | Curated | 1
Glycosylation | 2122 | C-linked (Man) tryptophan | 1 Publication | 1
Glycosylation | 3228 | C-linked (Man) tryptophan | Curated | 1
Glycosylation | 3526 | C-linked (Man) tryptophan | Curated | 1
Glycosylation | 3774 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 3959 | C-linked (Man) tryptophan | Curated | 1
Glycosylation | 4633 | C-linked (Man) tryptophan | Curated | 1
Glycosylation | 4869 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 4942 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 5057 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 5093 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 5236 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 5347 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 5377 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 5386 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 5455 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 5528 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Disulfide bond | 5532 ↔ 5582 | - | PROSITE-ProRule annotation | -
Disulfide bond | 5546 ↔ 5596 | - | PROSITE-ProRule annotation | -
Disulfide bond | 5557 ↔ 5612 | - | PROSITE-ProRule annotation | -
Disulfide bond | 5561 ↔ 5614 | - | PROSITE-ProRule annotation | -
Glycosylation | 5591 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
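
The N-linked sites above carry the evidence tag "Sequence analysis", meaning they are predicted from the sequence rather than observed. A common rule of thumb for such predictions is the N-X-S/T sequon, where X is any residue except proline. The sketch below scans a fragment for that motif. It is an illustration under that assumption only; the entry does not state which prediction method was actually used, and the name `find_sequons` is hypothetical.

```python
import re

# N, then any residue except P, then S or T. The lookahead lets
# overlapping candidates be reported independently.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(fragment: str, fragment_start: int = 1) -> list[int]:
    """Return 1-based positions of asparagines that start an N-X-S/T motif."""
    return [match.start() + fragment_start for match in SEQUON.finditer(fragment)]


# Residues 201-260 copied from the displayed canonical sequence.
FRAGMENT_201_260 = (
    "TKYANKTCGLCGDFNGMPVVSELLSHNTKL"
    "TPMEFGNLQKMDDPTDQCQDPVPEPPRNCS"
)

sites = find_sequons(FRAGMENT_201_260, fragment_start=201)
print(sites)  # [205, 258]
```

For this fragment the motif scan recovers the two annotated positions, 205 and 258, while the asparagines at 215, 227 and 237 do not match.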

Post-translational modification

C-, O- and N-glycosylated. O-glycosylated on the Thr-/Ser-rich tandem repeats. C-mannosylation in the Cys-rich subdomains may be required for proper folding of these regions and for export from the endoplasmic reticulum during biosynthesis. (1 Publication)
Proteolytic cleavage in the C-terminal region is initiated early in the secretory pathway and does not involve a serine protease. The extent of cleavage is increased in the acidic parts of the secretory pathway. Cleavage generates a reactive group which could link the protein to a primary amide. (1 Publication)

Sites

Feature key | Position(s) | Description | Length
Site | 4926 – 4927 | Cleavage | 2

Keywords - PTM

Disulfide bond, Glycoprotein

Proteomic databases

PaxDb, a database of protein abundance averages across all three domains of life
P98088

PeptideAtlas
P98088

PRoteomics IDEntifications database (PRIDE)
P98088

ProteomicsDB human proteome resource
57789

PTM databases

GlyConnect protein glycosylation platform
375

iPTMnet integrated resource for PTMs in systems biology context
P98088

UniCarbKB; an annotated and curated database of glycan structures
P98088

Expression

Tissue specificity

Highly expressed in surface mucosal cells of respiratory tract and stomach epithelia. Overexpressed in a number of carcinomas. Also expressed in Barrett's esophagus epithelium and in the proximal duodenum. (2 Publications)

Gene expression databases

Bgee dataBase for Gene Expression Evolution
ENSG00000215182 Expressed in 94 organ(s), highest expression level in nasal cavity epithelium

Genevisible search portal to normalized and curated expression data from Genevestigator
P98088 HS

Organism-specific databases

Human Protein Atlas (HPA)
CAB002774
CAB009395
HPA040456
HPA040615

Interaction

Subunit structure

Multimeric. Interacts with H. pylori in the gastric epithelium, in Barrett's esophagus and in gastric metaplasia of the duodenum (GMD). (1 Publication)

Protein-protein interaction databases

Protein interaction database and analysis system (IntAct)
P98088, 1 interactor

STRING: functional protein association networks
9606.ENSP00000435591

Structure

3D structure databases

Protein Model Portal of the PSI-Nature Structural Biology Knowledgebase (ProteinModelPortal)
P98088

SWISS-MODEL Repository - a database of annotated 3D protein structure models (SMR)
P98088

Database of comparative protein structure models (ModBase)

MobiDB: a database of protein disorder and mobility annotations

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Evidence | Length
Domain | 80 – 281 | VWFD 1 | PROSITE-ProRule annotation | 202
Domain | 338 – 394 | TIL 1 | Sequence analysis | 57
Domain | 394 – 465 | VWFC 1 | PROSITE-ProRule annotation | 72
Domain | 433 – 647 | VWFD 2 | PROSITE-ProRule annotation | 215
Domain | 704 – 761 | TIL 2 | Sequence analysis | 58
Domain | 818 – 863 | TIL 3 | Sequence analysis | 46
Domain | 902 – 1109 | VWFD 3 | PROSITE-ProRule annotation | 208
Repeat | 1383 – 1481 | Cys-rich subdomain 1 | - | 99
Repeat | 1577 – 1677 | Cys-rich subdomain 2 | - | 101
Repeat | 1743 – 1847 | Cys-rich subdomain 3 | - | 105
Repeat | 1950 – 2050 | Cys-rich subdomain 4 | - | 101
Repeat | 2116 – 2220 | Cys-rich subdomain 5 | - | 105
Repeat | 3222 – 3326 | Cys-rich subdomain 6 | - | 105
Repeat | 3520 – 3660 | Cys-rich subdomain 7 | - | 141
Repeat | 3953 – 4057 | Cys-rich subdomain 8 | - | 105
Repeat | 4627 – 4731 | Cys-rich subdomain 9 | - | 105
Domain | 4852 – 4918 | VWFC 2 | PROSITE-ProRule annotation | 67
Domain | 4920 – 5131 | VWFD 4 | PROSITE-ProRule annotation | 212
Domain | 5276 – 5345 | VWFC 3 | PROSITE-ProRule annotation | 70
Domain | 5381 – 5448 | VWFC 4 | PROSITE-ProRule annotation | 68
Domain | 5532 – 5620 | CTCK | PROSITE-ProRule annotation | 89

Region

Feature key | Position(s) | Description | Length
Region | 1383 – 4731 | 9 X Cys-rich subdomain repeats | 3349
Region | 2257 – 3200 | 107 X 8 AA approximate tandem repeats of T-T-S-T-T-S-A-P | 944
Region | 3363 – 3498 | 17 X 8 AA approximate tandem repeats of T-T-S-T-T-S-A-P | 136
Region | 3661 – 3931 | 34 X 8 AA approximate tandem repeats of T-T-S-T-T-S-A-P | 271
Region | 4093 – 4595 | 58 X 8 AA approximate tandem repeats of T-T-S-T-T-S-A-P | 503
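
The four tandem-repeat regions listed above can be summed to see how much of the 5,654-residue protein they cover. The sketch below does that arithmetic using the positions and repeat counts from the table. The variable names are hypothetical and chosen only for this illustration.

```python
PROTEIN_LENGTH = 5654  # from the Sequence section of this entry

# (start, end, annotated repeat count) for each tandem-repeat region.
TANDEM_REPEAT_REGIONS = [
    (2257, 3200, 107),
    (3363, 3498, 17),
    (3661, 3931, 34),
    (4093, 4595, 58),
]


def region_length(start: int, end: int) -> int:
    """Length of a region given 1-based inclusive coordinates."""
    return end - start + 1


total_residues = sum(region_length(s, e) for s, e, _ in TANDEM_REPEAT_REGIONS)
total_repeats = sum(count for _, _, count in TANDEM_REPEAT_REGIONS)
fraction = total_residues / PROTEIN_LENGTH

print(total_residues)      # residues inside tandem-repeat regions
print(total_repeats)       # annotated 8-residue repeats
print(round(fraction, 3))  # share of the protein covered
```

Multiplying the repeat count by 8 gives fewer residues than the summed region lengths, which is consistent with the repeats being described as approximate.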

Compositional bias

Feature key | Position(s) | Description | Evidence | Length
Compositional bias | 1490 – 1585 | Thr-rich | PROSITE-ProRule annotation | 96
Compositional bias | 1681 – 1750 | Thr-rich | PROSITE-ProRule annotation | 70
Compositional bias | 1850 – 1958 | Thr-rich | PROSITE-ProRule annotation | 109
Compositional bias | 2054 – 2124 | Thr-rich | PROSITE-ProRule annotation | 71
Compositional bias | 2223 – 4618 | Thr-rich | PROSITE-ProRule annotation | 2396
Compositional bias | 2231 – 4615 | Ser-rich | PROSITE-ProRule annotation | 2385
Compositional bias | 4750 – 4778 | Ser-rich | PROSITE-ProRule annotation | 29
Compositional bias | 5520 – 5525 | Poly-Pro | Sequence analysis | 6

Domain

The cysteine residues in the Cys-rich subdomain repeats are not involved in disulfide bonding.

Keywords - Domain

Repeat, Signal

Phylogenomic databases

Ensembl GeneTree
ENSGT00940000156076

InParanoid: Eukaryotic Ortholog Groups
P98088

KEGG Orthology (KO)
K21125

Identification of Orthologs from Complete Genome Data (OMA)
KWFDVEF

Database of Orthologous Groups (OrthoDB)
EOG091G0006

Family and domain databases

Integrated resource of protein families, domains and functional sites (InterPro)
IPR006207 Cys_knot_C
IPR036084 Ser_inhib-like_sf
IPR002919 TIL_dom
IPR014853 Unchr_dom_Cys-rich
IPR001007 VWF_dom
IPR001846 VWF_type-D
IPR025155 WxxW_domain

Pfam protein domain database
PF08742 C8, 4 hits
PF13330 Mucin2_WxxW, 9 hits
PF01826 TIL, 2 hits
PF00094 VWD, 4 hits

Simple Modular Architecture Research Tool; a protein domain database (SMART)
SM00832 C8, 4 hits
SM00041 CT, 1 hit
SM00214 VWC, 6 hits
SM00215 VWC_out, 2 hits
SM00216 VWD, 4 hits

Superfamily database of structural and functional annotation (SUPFAM)
SSF57567 SSF57567, 4 hits

PROSITE; a protein domain and family database
PS01185 CTCK_1, 1 hit
PS01225 CTCK_2, 1 hit
PS01208 VWFC_1, 2 hits
PS50184 VWFC_2, 2 hits
PS51233 VWFD, 4 hits

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

P98088-1 [UniParc]
        10         20         30         40         50
MSVGRRKLAL LWALALALAC TRHTGHAQDG SSESSYKHHP ALSPIARGPS
60 70 80 90 100
GVPLRGATVF PSLRTIPVVR ASNPAHNGRV CSTWGSFHYK TFDGDVFRFP
110 120 130 140 150
GLCNYVFSEH CGAAYEDFNI QLRRSQESAA PTLSRVLMKV DGVVIQLTKG
160 170 180 190 200
SVLVNGHPVL LPFSQSGVLI QQSSSYTKVE ARLGLVLMWN HDDSLLLELD
210 220 230 240 250
TKYANKTCGL CGDFNGMPVV SELLSHNTKL TPMEFGNLQK MDDPTDQCQD
260 270 280 290 300
PVPEPPRNCS TGFGICEELL HGQLFSGCVA LVDVGSYLEA CRQDLCFCED
310 320 330 340 350
TDLLSCVCHT LAEYSRQCTH AGGLPQDWRG PDFCPQKCPN NMQYHECRSP
360 370 380 390 400
CADTCSNQEH SRACEDHCVA GCFCPEGTVL DDIGQTGCVP VSKCACVYNG
410 420 430 440 450
AAYAPGATYS TDCTNCTCSG GRWSCQEVPC PGTCSVLGGA HFSTFDGKQY
460 470 480 490 500
TVHGDCSYVL TKPCDSSAFT VLAELRRCGL TDSETCLKSV TLSLDGAQTV
510 520 530 540 550
VVIKASGEVF LNQIYTQLPI SAANVTIFRP STFFIIAQTS LGLQLNLQLV
560 570 580 590 600
PTMQLFMQLA PKLRGQTCGL CGNFNSIQAD DFRTLSGVVE ATAAAFFNTF
610 620 630 640 650
KTQAACPNIR NSFEDPCSLS VENEKYAQHW CSQLTDADGP FGRCHAAVKP
660 670 680 690 700
GTYYSNCMFD TCNCERSEDC LCAALSSYVH ACAAKGVQLG GWRDGVCTKP
710 720 730 740 750
MTTCPKSMTY HYHVSTCQPT CRSLSEGDIT CSVGFIPVDG CICPKGTFLD
760 770 780 790 800
DTGKCVQASN CPCYHRGSMI PNGESVHDSG AICTCTHGKL SCIGGQAPAP
810 820 830 840 850
VCAAPMVFFD CRNATPGDTG AGCQKSCHTL DMTCYSPQCV PGCVCPDGLV
860 870 880 890 900
ADGEGGCITA EDCPCVHNEA SYRAGQTIRV GCNTCTCDSR MWRCTDDPCL
910 920 930 940 950
ATCAVYGDGH YLTFDGQSYS FNGDCEYTLV QNHCGGKDST QDSFRVVTEN
960 970 980 990 1000
VPCGTTGTTC SKAIKIFLGG FELKLSHGKV EVIGTDESQE VPYTIRQMGI
1010 1020 1030 1040 1050
YLVVDTDIGL VLLWDKKTSI FINLSPEFKG RVCGLCGNFD DIAVNDFATR
1060 1070 1080 1090 1100
SRSVVGDVLE FGNSWKLSPS CPDALAPKDP CTANPFRKSW AQKQCSILHG
1110 1120 1130 1140 1150
PTFAACHAHV EPARYYEACV NDACACDSGG DCECFCTAVA AYAQACHEVG
1160 1170 1180 1190 1200
LCVSWRTPSI CPLFCDYYNP EGQCEWHYQP CGVPCLRTCR NPRGDCLRDV
1210 1220 1230 1240 1250
RGLEGCYPKC PPEAPIFDED KMQCVATCPT PPLPPRCHVH GKSYRPGAVV
1260 1270 1280 1290 1300
PSDKNCQSCL CTERGVECTY KAEACVCTYN GQRFHPGDVI YHTTDGTGGC
1310 1320 1330 1340 1350
ISARCGANGT IERRVYPCSP TTPVPPTTFS FSTPPLVVSS THTPSNGPSS
1360 1370 1380 1390 1400
AHTGPPSSAW PTTAGTSPRT RLPTASASLP PVCGEKCLWS PWMDVSRPGR
1410 1420 1430 1440 1450
GTDSGDFDTL ENLRAHGYRV CESPRSVECR AEDAPGVPLR ALGQRVQCSP
1460 1470 1480 1490 1500
DVGLTCRNRE QASGLCYNYQ IRVQCCTPLP CSTSSSPAQT TPPTTSKTTE
1510 1520 1530 1540 1550
TRASGSSAPS STPGTVSLST ARTTPAPGTA TSVKKTFSTP SPPPVPATST
1560 1570 1580 1590 1600
SSMSTTAPGT SVVSSKPTPT EPSTSSCLQE LCTWTEWIDG SYPAPGINGG
1610 1620 1630 1640 1650
DFDTFQNLRD EGYTFCESPR SVQCRAESFP NTPLADLGQD VICSHTEGLI
1660 1670 1680 1690 1700
CLNKNQLPPI CYNYEIRIQC CETVNVCRDI TRLPKTVATT RPTPHPTGAQ
1710 1720 1730 1740 1750
TQTTFTTHMP SASTEQPTAT SRGGPTATSV TQGTHTTLVT RNCHPRCTWT
1760 1770 1780 1790 1800
KWFDVDFPSP GPHGGDKETY NNIIRSGEKI CRRPEEITRL QCRAKSHPEV
1810 1820 1830 1840 1850
SIEHLGQVVQ CSREEGLVCR NQDQQGPFKM CLNYEVRVLC CETPRGCHMT
1860 1870 1880 1890 1900
STPGSTSSSP AQTTPSTTSK TTETQASGSS APSSTPGTVS LSTARTTPAP
1910 1920 1930 1940 1950
GTATSVKKTF STPSPPPVPA TSTSSMSTTA PGTSVVSSKP TPTEPSTSSC
1960 1970 1980 1990 2000
LQELCTWTEW IDGSYPAPGI NGGDFDTFQN LRDEGYTFCE SPRSVQCRAE
2010 2020 2030 2040 2050
SFPNTPLADL GQDVICSHTE GLICLNKNQL PPICYNYEIR IQCCETVNVC
2060 2070 2080 2090 2100
RDITRPPKTV ATTRPTPHPT GAQTQTTFTT HMPSASTEQP TATSRGGPTA
2110 2120 2130 2140 2150
TSVTQGTHTT PVTRNCHPRC TWTTWFDVDF PSPGPHGGDK ETYNNIIRSG
2160 2170 2180 2190 2200
EKICRRPEEI TRLQCRAKSH PEVSIEHLGQ VVQCSREEGL VCRNQDQQGP
2210 2220 2230 2240 2250
FKMCLNYEVR VLCCETPKGC PVTSTPVTAP STPSGRATSP TQSTSSWQKS
2260 2270 2280 2290 2300
RTTTLVTTST TSTPQTSTTY AHTTSTTSAP TARTTSAPTT RTTSASPAST
2310 2320 2330 2340 2350
TSGPGNTPSP VPTTSTISAP TTSITSAPTT STTSAPTSST TSGPGTTPSP
2360 2370 2380 2390 2400
VPTTSITSAP TTSTTSAPTT STTSARTSST TSATTTSRIS GPETTPSPVP
2410 2420 2430 2440 2450
TTSTTSATTT STTSAPTTST TSAPTSSTTS SPQTSTTSAP TTSTTSGPGT
2460 2470 2480 2490 2500
TPSPVPTTST TSAPTTRTTS APKSSTTSAA TTSTTSGPET TPRPVPTTST
2510 2520 2530 2540 2550
TSSPTTSTTS APTTSTTSAS TTSTTSGAGT TPSPVPTTST TSAPTTSTTS
2560 2570 2580 2590 2600
APISSTTSAT TTSTTSGPGT TPSPVPTTST TSAPTTSTTS GPGTTPSAVP
2610 2620 2630 2640 2650
TTSITSAPTT STNSAPISST TSATTTSRIS GPETTPSPVP TASTTSASTT
2660 2670 2680 2690 2700
STTSGPGTTP SPVPTTSTIS VPTTSTTSAS TTSTTSASTT STTSGPGTTP
2710 2720 2730 2740 2750
SPVPTTSTTS APTTSTTSAP TTSTISAPTT STTSATTTST TSAPTPRRTS
2760 2770 2780 2790 2800
APTTSTISAS TTSTTSATTT STTSATTTST ISAPTTSTTL SPTTSTTSTT
2810 2820 2830 2840 2850
ITSTTSAPIS STTSTPQTST TSAPTTSTTS GPGTTSSPVP TTSTTSAPTT
2860 2870 2880 2890 2900
STTSAPTTRT TSVPTSSTTS TATTSTTSGP GTTPSPVPTT STTSAPTTRT
2910 2920 2930 2940 2950
TSAPTTSTTS APTTSTTSAP TSSTTSATTT STISVPTTST TSVPGTTPSP
2960 2970 2980 2990 3000
VPTTSTISVP TTSTTSASTT STTSGPGTTP SPVPTTSTTS APTTSTTSAP
3010 3020 3030 3040 3050
TTSTISAPTT STPSAPTTST TLAPTTSTTS APTTSTTSTP TSSTTSSPQT
3060 3070 3080 3090 3100
STTSASTTSI TSGPGTTPSP VPTTSTTSAP TTSTTSAATT STISAPTTST
3110 3120 3130 3140 3150
TSAPTTSTTS ASTASKTSGL GTTPSPIPTT STTSPPTTST TSASTASKTS
3160 3170 3180 3190 3200
GPGTTPSPVP TTSTIFAPRT STTSASTTST TPGPGTTPSP VPTTSTASVS
3210 3220 3230 3240 3250
KTSTSHVSIS KTTHSQPVTR DCHLRCTWTK WFDIDFPSPG PHGGDKETYN
3260 3270 3280 3290 3300
NIIRSGEKIC RRPEEITRLQ CRAESHPEVS IEHLGQVVQC SREEGLVCRN
3310 3320 3330 3340 3350
QDQQGPFKMC LNYEVRVLCC ETPKGCPVTS TPVTAPSTPS GRATSPTQST
3360 3370 3380 3390 3400
SSWQKSRTTT LVTTSTTSTP QTSTTSAPTT STTSAPTTST TSAPTTSTTS
3410 3420 3430 3440 3450
TPQTSISSAP TSSTTSAPTS STISARTTSI ISAPTTSTTS SPTTSTTSAT
3460 3470 3480 3490 3500
TTSTTSAPTS STTSTPQTSK TSAATSSTTS GSGTTPSPVT TTSTASVSKT
3510 3520 3530 3540 3550
STSHVSVSKT THSQPVTRDC HPRCTWTKWF DVDFPSPGPH GGDKETYNNI
3560 3570 3580 3590 3600
IRSGEKICRR PEEITRLQCR AKSHPEVSIE HLGQVVQCSR EEGLVCRNQD
3610 3620 3630 3640 3650
QQGPFKMCLN YEVRVLCCET PKGCPVTSTS VTAPSTPSGR ATSPTQSTSS
3660 3670 3680 3690 3700
WQKSRTTTLV TSSITSTTQT STTSAPTTST TPASIPSTTS APTTSTTSAP
3710 3720 3730 3740 3750
TTSTTSAPTT STTSTPQTTT SSAPTSSTTS APTTSTISAP TTSTISAPTT
3760 3770 3780 3790 3800
STTSAPTAST TSAPTSTSSA PTTNTTSAPT TSTTSAPITS TISAPTTSTT
3810 3820 3830 3840 3850
STPQTSTISS PTTSTTSTPQ TSTTSSPTTS TTSAPTTSTT SAPTTSTTST
3860 3870 3880 3890 3900
PQTSISSAPT SSTTSAPTAS TISAPTTSTT SFHTTSTTSP PTSSTSSTPQ
3910 3920 3930 3940 3950
TSKTSAATSS TTSGSGTTPS PVPTTSTASV SKTSTSHVSV SKTTHSQPVT
3960 3970 3980 3990 4000
RDCHPRCTWT KWFDVDFPSP GPHGGDKETY NNIIRSGEKI CRRPEEITRL
4010 4020 4030 4040 4050
QCRAESHPEV SIEHLGQVVQ CSREEGLVCR NQDQQGPFKM CLNYEVRVLC
4060 4070 4080 4090 4100
CETPKGCPVT STPVTAPSTP SGRATSPTQS TSSWQKSRTT TLVTTSTTST
4110 4120 4130 4140 4150
PQTSTTSAPT TSTIPASTPS TTSAPTTSTT SAPTTSTTSA PTHRTTSGPT
4160 4170 4180 4190 4200
TSTTLAPTTS TTSAPTTSTN SAPTTSTISA STTSTISAPT TSTISSPTSS
4210 4220 4230 4240 4250
TTSTPQTSKT SAATSSTTSG SGTTPSPVPT TSTTSASTTS TTSAPTTSTT
4260 4270 4280 4290 4300
SGPGTTPSPV PSTSTTSAAT TSTTSAPTTR TTSAPTSSMT SGPGTTPSPV
4310 4320 4330 4340 4350
PTTSTTSAPT TSTTSGPGTT PSPVPTTSTT SAPITSTTSG PGSTPSPVPT
4360 4370 4380 4390 4400
TSTTSAPTTS TTSASTASTT SGPGTTPSPV PTTSTTSAPT TRTTSASTAS
4410 4420 4430 4440 4450
TTSGPGSTPS PVPTTSTTSA PTTRTTPAST ASTTSGPGTT PSPVPTTSTT
4460 4470 4480 4490 4500
SASTTSTISL PTTSTTSAPI TSMTSGPGTT PSPVPTTSTT SAPTTSTTSA
4510 4520 4530 4540 4550
STASTTSGPG TTPSPVPTTS TTSAPTTSTT SASTASTTSG PGTSLSPVPT
4560 4570 4580 4590 4600
TSTTSAPTTS TTSGPGTTPS PVPTTSTTSA PTTSTTSGPG TTPSPVPTTS
4610 4620 4630 4640 4650
TTPVSKTSTS HLSVSKTTHS QPVTSDCHPL CAWTKWFDVD FPSPGPHGGD
4660 4670 4680 4690 4700
KETYNNIIRS GEKICRRPEE ITRLQCRAES HPEVNIEHLG QVVQCSREEG
4710 4720 4730 4740 4750
LVCRNQDQQG PFKMCLNYEV RVLCCETPRG CPVTSVTPYG TSPTNALYPS
4760 4770 4780 4790 4800
LSTSMVSASV ASTSVASSSV ASSSVAYSTQ TCFCNVADRL YPAGSTIYRH
4810 4820 4830 4840 4850
RDLAGHCYYA LCSQDCQVVR GVDSDCPSTT LPPAPATSPS ISTSEPVTEL
4860 4870 4880 4890 4900
GCPNAVPPRK KGETWATPNC SEATCEGNNV ISLRPRTCPR VEKPTCANGY
4910 4920 4930 4940 4950
PAVKVADQDG CCHHYQCQCV CSGWGDPHYI TFDGTYYTFL DNCTYVLVQQ
4960 4970 4980 4990 5000
IVPVYGHFRV LVDNYFCGAE DGLSCPRSII LEYHQDRVVL TRKPVHGVMT
5010 5020 5030 5040 5050
NEIIFNNKVV SPGFRKNGIV VSRIGVKMYA TIPELGVQVM FSGLIFSVEV
5060 5070 5080 5090 5100
PFSKFANNTE GQCGTCTNDR KDECRTPRGT VVASCSEMSG LWNVSIPDQP
5110 5120 5130 5140 5150
ACHRPHPTPT TVGPTTVGST TVGPTTVGST TVGPTTPPAP CLPSPICQLI
5160 5170 5180 5190 5200
LSKVFEPCHT VIPPLLFYEG CVFDRCHMTD LDVVCSSLEL YAALCASHDI
5210 5220 5230 5240 5250
CIDWRGRTGH MCPFTCPADK VYQPCGPSNP SYCYGNDSAS LGALPEAGPI
5260 5270 5280 5290 5300
TEGCFCPEGM TLFSTSAQVC VPTGCPRCLG PHGEPVKVGH TVGMDCQECT
5310 5320 5330 5340 5350
CEAATWTLTC RPKLCPLPPA CPLPGFVPVP AAPQAGQCCP QYSCACNTSR
5360 5370 5380 5390 5400
CPAPVGCPEG ARAIPTYQEG ACCPVQNCSW TVCSINGTLY QPGAVVSSSL
5410 5420 5430 5440 5450
CETCRCELPG GPPSDAFVVS CETQICNTHC PVGFEYQEQS GQCCGTCVQV
5460 5470 5480 5490 5500
ACVTNTSKSP AHLFYPGETW SDAGNHCVTH QCEKHQDGLV VVTTKKACPP
5510 5520 5530 5540 5550
LSCSLDEARM SKDGCCRFCP PPPPPYQNQS TCAVYHRSLI IQQQGCSSSE
5560 5570 5580 5590 5600
PVRLAYCRGN CGDSSSMYSL EGNTVEHRCQ CCQELRTSLR NVTLHCTDGS
5610 5620 5630 5640 5650
SRAFSYTEVE ECGCMGRRCP APGDTQHSEE AEPEPSQEAE SGSWERGVPV

SPMH
Length:5,654
Mass (Da):585,570
Last modified:April 1, 2015 - v4
<p>The checksum is a form of redundancy check that is calculated from the sequence. It is useful for tracking sequence updates.</p> <p>It should be noted that while, in theory, two different sequences could have the same checksum value, the likelihood that this would happen is extremely low.</p> <p>However UniProtKB may contain entries with identical sequences in case of multiple genes (paralogs).</p> <p>The checksum is computed as the sequence 64-bit Cyclic Redundancy Check value (CRC64) using the generator polynomial: x<sup>64</sup> + x<sup>4</sup> + x<sup>3</sup> + x + 1. The algorithm is described in the ISO 3309 standard. </p> <p class="publication">Press W.H., Flannery B.P., Teukolsky S.A. and Vetterling W.T.<br /> <strong>Cyclic redundancy and other checksums</strong><br /> <a href="http://www.nrbook.com/b/bookcpdf.php">Numerical recipes in C 2nd ed., pp896-902, Cambridge University Press (1993)</a>)</p> Checksum:i13217A8257E8E2DE

Sequence caution

This subsection of the ‘Sequence’ section reports difference(s) between the protein sequence shown in the UniProtKB entry and other available protein sequences derived from the same gene.

The sequence AAA18431 differs from that shown. Reason: Frameshift at several positions. Curated
The sequence AAA18431 differs from that shown. Reason: Erroneous termination at position 4999. Translated as Met. Curated
The sequence AAC15950 differs from that shown. Reason: Frameshift at positions 24, 44, 671 and 683. Curated
The sequence CAA88307 differs from that shown. Reason: Frameshift at position 5649. Curated
The sequence CAH56330 differs from that shown. Reason: Frameshift at positions 5240, 5247 and 5253. Curated

Experimental Info

Sequence conflict

This subsection of the ‘Sequence’ section reports difference(s) between the canonical sequence (displayed by default in the entry) and the different sequence submissions merged in the entry. These various submissions may originate from different sequencing projects, different types of experiments, or different biological samples. Sequence conflicts are usually of unknown origin.

Feature key        Position(s)  Description  Length
Sequence conflict  25           G → S in AAC15950 (PubMed:9506983). Curated  1
Sequence conflict  221          S → R in AAC15950 (PubMed:9506983). Curated  1
Sequence conflict  246          D → E in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  246          D → E in AAC15950 (PubMed:9506983). Curated  1
Sequence conflict  432          G → D in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  549          L → P in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  658          M → V in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  702          T → I in AAC15950 (PubMed:9506983). Curated  1
Sequence conflict  716          T → A in AAC15950 (PubMed:9506983). Curated  1
Sequence conflict  817 – 818    GD → RG in AAC15950 (PubMed:9506983). Curated  2
Sequence conflict  869          E → K in AAC15950 (PubMed:9506983). Curated  1
Sequence conflict  978          G → R in AAC15950 (PubMed:9506983). Curated  1
Sequence conflict  996          R → Q in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  1141         A → R in CAA57309 (PubMed:8948439). Curated  1
Sequence conflict  1151 – 1155  LCVSW → TCVCL in CAA57309 (PubMed:8948439). Curated  5
Sequence conflict  1154 – 1155  SW → CL in CAC83674 (PubMed:11535137). Curated  2
Sequence conflict  1480         P → A in CAA57309 (PubMed:8948439). Curated  1
Sequence conflict  1683         L → P in CAA57309 (PubMed:8948439). Curated  1
Sequence conflict  1738         L → P in CAA57309 (PubMed:8948439). Curated  1
Sequence conflict  1790         L → V in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  1790         L → V in CAA57309 (PubMed:8948439). Curated  1
Sequence conflict  1803         E → N AA sequence (PubMed:2656675). Curated  1
Sequence conflict  1874         T → I in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  2008 – 2009  AD → GR in CAC83674 (PubMed:11535137). Curated  2
Sequence conflict  2176         E → N AA sequence (PubMed:2656675). Curated  1
Sequence conflict  2207         Y → I in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  2238         T → I in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  2289 – 4237  Missing in CAC83674 (PubMed:11535137). Curated  1949
Sequence conflict  3047         S → T in CAC83675 (PubMed:11535137). Curated  1
Sequence conflict  3088         A → S in CAC83676 (PubMed:11535137). Curated  1
Sequence conflict  3095 – 3096  AP → PL in CAC83676 (PubMed:11535137). Curated  2
Sequence conflict  3105         T → I in CAC83676 (PubMed:11535137). Curated  1
Sequence conflict  3107 – 4287  Missing in CAC83676 (PubMed:11535137). Curated  1181
Sequence conflict  3234         I → V in CAC83675 (PubMed:11535137). Curated  1
Sequence conflict  3481         G → S in CAC83675 (PubMed:11535137). Curated  1
Sequence conflict  3562         E → Q in CAC83675 (PubMed:11535137). Curated  1
Sequence conflict  3580         E → N AA sequence (PubMed:2656675). Curated  1
Sequence conflict  3636 – 3644  TPSGRATSP → PLVGEPPAQ in CAC83675 (PubMed:11535137). Curated  9
Sequence conflict  3817         S → P in CAC83675 (PubMed:11535137). Curated  1
Sequence conflict  4244         A → V in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  4250 – 4254  TSGPG → ISGPK in CAC83674 (PubMed:11535137). Curated  5
Sequence conflict  4262         S → T in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  4265         T → I in CAC83675 (PubMed:11535137). Curated  1
Sequence conflict  4274         T → I in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  4280 – 4284  RTTSA → STTSV in CAC83674 (PubMed:11535137). Curated  5
Sequence conflict  4286 – 4373  Missing in CAC83674 (PubMed:11535137). Curated  88
Sequence conflict  4290         T → P in CAC83676 (PubMed:11535137). Curated  1
Sequence conflict  4314 – 4473  Missing in CAC83676 (PubMed:11535137). Curated  160
Sequence conflict  4381         P → L in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  4398 – 4400  TAS → PAG in CAC83674 (PubMed:11535137). Curated  3
Sequence conflict  4407         S → N in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  4418         T → I in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  4421 – 4484  Missing in CAC83674 (PubMed:11535137). Curated  64
Sequence conflict  4489         T → I in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  4501 – 4503  STA → PTS in CAC83674 (PubMed:11535137). Curated  3
Sequence conflict  4521         T → I in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  4533 – 4572  Missing in CAC83674 (PubMed:11535137). Curated  40
Sequence conflict  4588         G → A in CAC83674 (PubMed:11535137). Curated  1
Sequence conflict  4614 – 4615  VS → HE in AAA18431 (PubMed:7513696). Curated  2
Sequence conflict  4827         P → R in CAA88307 (PubMed:7775418). Curated  1
Sequence conflict  4884         R → S in CAA04737 (PubMed:9620876). Curated  1
Sequence conflict  4884         R → S in CAA04738 (PubMed:9620876). Curated  1
Sequence conflict  4884         R → S in CAA88307 (PubMed:7775418). Curated  1
Sequence conflict  4886         R → P in AAA18431 (PubMed:7513696). Curated  1
Sequence conflict  4899         G → A in AAA18431 (PubMed:7513696). Curated  1
Sequence conflict  4946 – 4947  VL → AW in AAA18431 (PubMed:7513696). Curated  2
Sequence conflict  5013         G → A in CAA88307 (PubMed:7775418). Curated  1
Sequence conflict  5081 – 5084  VVAS → HASA in AAH33831 (PubMed:15489334). Curated  4
Sequence conflict  5148         Q → H in CAA88307 (PubMed:7775418). Curated  1
Sequence conflict  5148         Q → H in AAA18431 (PubMed:7513696). Curated  1
Sequence conflict  5209 – 5210  GH → RD in AAA18431 (PubMed:7513696). Curated  2
Sequence conflict  5245         P → R in CAA04737 (PubMed:9620876). Curated  1
Sequence conflict  5245         P → R in CAA04738 (PubMed:9620876). Curated  1
Sequence conflict  5245         P → R in AAA18431 (PubMed:7513696). Curated  1
Sequence conflict  5264         S → T in CAA04737 (PubMed:9620876). Curated  1
Sequence conflict  5264         S → T in CAA04738 (PubMed:9620876). Curated  1
Sequence conflict  5356         G → R in AAA18431 (PubMed:7513696). Curated  1
Sequence conflict  5363         A → R in AAA18431 (PubMed:7513696). Curated  1
Sequence conflict  5433         G → R in AAA18431 (PubMed:7513696). Curated  1
Sequence conflict  5441         G → A in AAA18431 (PubMed:7513696). Curated  1
Sequence conflict  5546         C → S in AAA18431 (PubMed:7513696). Curated  1
Sequence conflict  5622         P → A in AAA18431 (PubMed:7513696). Curated  1
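
Each substitution conflict gives a position (or a range) followed by the canonical residues and the residues seen in the other submission. A minimal sketch of how such a descriptor could be parsed and checked against the canonical sequence follows; the helper names are hypothetical, and only residues 5501 to 5654, copied from the listing above, are included.

```python
import re

# Residues 5501-5654 of the canonical sequence, copied from the entry.
TAIL_START = 5501
TAIL = (
    "LSCSLDEARM SKDGCCRFCP PPPPPYQNQS TCAVYHRSLI IQQQGCSSSE "
    "PVRLAYCRGN CGDSSSMYSL EGNTVEHRCQ CCQELRTSLR NVTLHCTDGS "
    "SRAFSYTEVE ECGCMGRRCP APGDTQHSEE AEPEPSQEAE SGSWERGVPV "
    "SPMH"
).replace(" ", "")

# Position or "start – end" range, then "REF → ALT" in one-letter codes.
SUBSTITUTION = re.compile(r"^(\d+)(?:\s*–\s*(\d+))?\s*([A-Z]+)\s*→\s*([A-Z]+)$")


def parse_substitution(text):
    """Return (start, end, reference, alternative) for a descriptor."""
    match = SUBSTITUTION.match(text.strip())
    if match is None:
        raise ValueError(f"not a substitution descriptor: {text!r}")
    start = int(match.group(1))
    end = int(match.group(2) or start)
    return start, end, match.group(3), match.group(4)


def reference_matches(start, end, reference):
    """Check the reference residues against the stored tail (1-based positions)."""
    offset = start - TAIL_START
    return TAIL[offset:end - TAIL_START + 1] == reference


start, end, ref, alt = parse_substitution("5546 C → S")
ok_5546 = reference_matches(start, end, ref)
```

Descriptors for "Missing" ranges carry no residues and are deliberately rejected by this pattern.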

Natural variant

This subsection of the ‘Sequence’ section describes natural variant(s) of the protein sequence.

Feature key      Position(s)  Description  Length
Natural variant  5521         P → L (VAR_036832). 2 Publications. Corresponds to variant dbSNP:rs1132436 (Ensembl).  1

Sequence databases

EMBL, GenBank and DDBJ (DNA Data Bank of Japan) nucleotide sequence databases
FO680660 Genomic DNA No translation available.
FP326773 Genomic DNA No translation available.
KC800812 Genomic DNA No translation available.
AJ298317 mRNA Translation: CAC83674.1
AJ298318 Genomic DNA Translation: CAC83675.1
AJ298319 Genomic DNA Translation: CAC83676.1
AF015521 mRNA Translation: AAC15950.1 Frameshift.
X81649 mRNA Translation: CAA57309.1
AJ001402 mRNA Translation: CAA04737.1
AJ001403 Genomic DNA Translation: CAA04738.1
U06711 mRNA Translation: AAA18431.1 Sequence problems.
Z48314 mRNA Translation: CAA88307.1 Frameshift.
BC033831 mRNA Translation: AAH33831.1
AL833060 mRNA Translation: CAH56330.1 Frameshift.

The Consensus CDS (CCDS) project
CCDS: CCDS76369.1

Protein sequence database of the Protein Information Resource
PIR: A33811, JE0095

NCBI Reference Sequences
RefSeq: NP_001291288.1, NM_001304359.1

UniGene gene-oriented nucleotide sequence clusters
UniGene: Hs.534332, Hs.558950, Hs.703588, Hs.703728, Hs.721515

Genome annotation databases

Ensembl eukaryotic genome annotation project
Ensembl: ENST00000621226; ENSP00000485659; ENSG00000215182
ENST00000636567; ENSP00000490794; ENSG00000283158

Database of genes from NCBI RefSeq genomes
GeneID: 4586

KEGG: Kyoto Encyclopedia of Genes and Genomes
KEGG: hsa:4586

UCSC genome browser
UCSC: uc031xcx.2 human

Keywords - Coding sequence diversity

Polymorphism

Similar proteins

This section provides links to proteins that are similar to the protein sequence(s) described in this entry at different levels of sequence identity thresholds (100%, 90% and 50%) based on their membership in UniProt Reference Clusters (UniRef, http://www.uniprot.org/help/uniref).

Cross-references

This section is used to point to information related to entries and found in data collections other than UniProtKB.

Web resources

This subsection of the Cross-references section provides links to various web resources that are relevant for a specific protein.

Mucin database
Atlas of Genetics and Cytogenetics in Oncology and Haematology

3D structure databases

Protein Data Bank Europe (PDBe), Protein Data Bank RCSB (RCSB PDB), Protein Data Bank Japan (PDBj)

PDB entry  Method  Resolution (Å)  Chain  Positions
5AJN       X-ray   1.67            P      4254-4268
5AJO       X-ray   1.48            B      2528-2543
5AJP       X-ray   1.65            B      2528-2543

ProteinModelPortal: P98088
SMR: P98088

Protein-protein interaction databases

IntAct: P98088, 1 interactor
STRING: 9606.ENSP00000435591

Chemistry databases

ChEMBL: CHEMBL3713020

Protein family/group databases

MEROPS: I08.951

PTM databases

GlyConnect: 375
iPTMnet: P98088
UniCarbKB: P98088

Polymorphism and mutation databases

DMDM: 160370004

Proteomic databases

PaxDb: P98088
PeptideAtlas: P98088
PRIDE: P98088
ProteomicsDB: 57789

Organism-specific databases

Comparative Toxicogenomics Database
CTD: 4586
DisGeNET: 4586
EuPathDB: HostDB:ENSG00000215182.8

GeneCards: human genes, protein and diseases
GeneCards: MUC5AC

H-Invitational Database, human transcriptome db
H-InvDB: HIX0201650
HGNC: HGNC:7515 MUC5AC
HPA: CAB002774, CAB009395, HPA040456, HPA040615
MIM: 158373 gene
neXtProt: NX_P98088
OpenTargets: ENSG00000215182

Phylogenomic databases

GeneTree: ENSGT00940000156076
InParanoid: P98088
KO: K21125
OMA: KWFDVEF
OrthoDB: EOG091G0006

Enzyme and pathway databases

Reactome: R-HSA-5083625 Defective GALNT3 causes familial hyperphosphatemic tumoral calcinosis (HFTC)
R-HSA-5083632 Defective C1GALT1C1 causes Tn polyagglutination syndrome (TNPS)
R-HSA-5083636 Defective GALNT12 causes colorectal cancer 1 (CRCS1)
R-HSA-5621480 Dectin-2 family
R-HSA-913709 O-linked glycosylation of mucins
R-HSA-977068 Termination of O-glycan biosynthesis

Miscellaneous databases

Database of phenotypes from RNA interference screens in Drosophila and Homo sapiens
GenomeRNAi: 4586

Protein Ontology
PRO: PR:P98088

Gene expression databases

Bgee: ENSG00000215182 Expressed in 94 organ(s), highest expression level in nasal cavity epithelium
Genevisible: P98088 HS

Family and domain databases

InterPro:
IPR006207 Cys_knot_C
IPR036084 Ser_inhib-like_sf
IPR002919 TIL_dom
IPR014853 Unchr_dom_Cys-rich
IPR001007 VWF_dom
IPR001846 VWF_type-D
IPR025155 WxxW_domain
Pfam:
PF08742 C8, 4 hits
PF13330 Mucin2_WxxW, 9 hits
PF01826 TIL, 2 hits
PF00094 VWD, 4 hits
SMART:
SM00832 C8, 4 hits
SM00041 CT, 1 hit
SM00214 VWC, 6 hits
SM00215 VWC_out, 2 hits
SM00216 VWD, 4 hits
SUPFAM: SSF57567 SSF57567, 4 hits
PROSITE:
PS01185 CTCK_1, 1 hit
PS01225 CTCK_2, 1 hit
PS01208 VWFC_1, 2 hits
PS50184 VWFC_2, 2 hits
PS51233 VWFD, 4 hits

Entry information

This section provides general information on the entry.

Entry name: MUC5A_HUMAN
This subsection of the ‘Entry information’ section provides a mnemonic identifier for a UniProtKB entry, but it is not a stable identifier. Each reviewed entry is assigned a unique entry name upon integration into UniProtKB/Swiss-Prot.

Accession
This subsection of the ‘Entry information’ section provides one or more accession number(s). These are stable identifiers and should be used to cite UniProtKB entries. Upon integration into UniProtKB, each entry is assigned a unique accession number, which is called ‘Primary (citable) accession number’.
Primary (citable) accession number: P98088
Secondary accession number(s): A0A096LPK4, O60460, O76065, Q13792, Q14425, Q658Q1, Q7M4S5, Q8N4M9, Q8WWQ3, Q8WWQ4, Q8WWQ5
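
Because accession numbers are the stable identifiers to cite, it can be useful to validate them before using them in a lookup. The sketch below uses the accession-number regular expression as recalled from the UniProt help pages; treat the pattern as an assumption to be checked against the current documentation. The helper name `is_accession` is hypothetical.

```python
import re

# Accession pattern as recalled from the UniProt help pages (an assumption):
# either a 6-character O/P/Q form, or a 6- or 10-character form.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_accession(text):
    """Return True if the whole string looks like a UniProtKB accession."""
    return ACCESSION_RE.fullmatch(text) is not None


# Primary and secondary accessions listed for this entry.
accessions = [
    "P98088", "A0A096LPK4", "O60460", "O76065", "Q13792", "Q14425",
    "Q658Q1", "Q7M4S5", "Q8N4M9", "Q8WWQ3", "Q8WWQ4", "Q8WWQ5",
]
all_valid = all(is_accession(acc) for acc in accessions)
```

The entry name `MUC5A_HUMAN` does not match this pattern, which fits the note above that the entry name is a mnemonic and not a stable identifier.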
Entry history
This subsection of the ‘Entry information’ section shows the date of integration of the entry into UniProtKB, the date of the last sequence update and the date of the last annotation modification (‘Last modified’). The version number for both the entry and the canonical sequence are also displayed.
Integrated into UniProtKB/Swiss-Prot: February 1, 1996
Last sequence update: April 1, 2015
Last modified: December 5, 2018
This is version 169 of the entry and version 4 of the sequence.

Entry status: Reviewed (UniProtKB/Swiss-Prot)
This subsection of the ‘Entry information’ section indicates whether the entry has been manually annotated and reviewed by UniProtKB curators or not, in other words, if the entry belongs to the Swiss-Prot section of UniProtKB (reviewed) or to the computer-annotated TrEMBL section (unreviewed).

Annotation program: Chordata Protein Annotation Program

Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

This section contains any relevant information that doesn’t fit in any other defined sections.

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  2. Human chromosome 11
    Human chromosome 11: entries, gene names and cross-references to MIM
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  5. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
UniProt is an ELIXIR core data resource
Main funding by: National Institutes of Health
