Entry version 159 (03 Jul 2019)
Sequence version 3 (01 Nov 1997)
Protein

Collagen alpha-1(XII) chain

Gene

COL12A1

Organism
Gallus gallus (Chicken)
Status
Reviewed

Annotation score: 5 out of 5

Protein existence: Experimental evidence at protein level

Function

Type XII collagen interacts with type I collagen-containing fibrils; the COL1 domain could be associated with the surface of the fibrils, and the COL2 and NC3 domains may be localized in the perifibrillar matrix.

Keywords

Biological process: Cell adhesion

Names & Taxonomy

Protein names
Recommended name: Collagen alpha-1(XII) chain
Alternative name(s): Fibrochimerin
Gene names
Name: COL12A1
Organism: Gallus gallus (Chicken)
Taxonomic identifier: 9031 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Archelosauria > Archosauria > Dinosauria > Saurischia > Theropoda > Coelurosauria > Aves > Neognathae > Galloanserae > Galliformes > Phasianidae > Phasianinae > Gallus
Proteomes
  • UP000000539 Component: Unplaced

Subcellular location

Keywords - Cellular component

Extracellular matrix, Secreted

PTM / Processing

Molecule processing

Feature key     Position(s)  Description                                   Evidence           Length
Signal peptide  1 – 23       -                                             Sequence analysis  23
Chain           24 – 3124    Collagen alpha-1(XII) chain (PRO_0000005782)  -                  3101
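
Positions in these feature tables are 1-based and inclusive, so a feature's length is end - start + 1 (for the chain, 3124 - 24 + 1 = 3101). The following is a minimal Python sketch of that convention; the helper names `feature_length` and `extract_feature` are illustrative and are not part of any UniProt tool.

```python
def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive positions."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_feature(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a sequence."""
    feature_length(start, end)  # reuse the range validation
    if end > len(sequence):
        raise ValueError("feature extends past the end of the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


print(feature_length(1, 23))     # signal peptide
print(feature_length(24, 3124))  # mature chain
```

Under the same convention, `extract_feature(sequence, 24, 3124)` would return the chain without its signal peptide when given the full canonical sequence.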

Amino acid modifications

Feature key    Position(s)  Description                                     Evidence           Length
Glycosylation  32           N-linked (GlcNAc...) asparagine                 Sequence analysis  1
Glycosylation  797          O-linked (Xyl...) (chondroitin sulfate) serine  Sequence analysis  1
Glycosylation  890          O-linked (Xyl...) (chondroitin sulfate) serine  Sequence analysis  1
Glycosylation  981          O-linked (Xyl...) (chondroitin sulfate) serine  Sequence analysis  1
Glycosylation  1006         N-linked (GlcNAc...) asparagine                 Sequence analysis  1
Glycosylation  1032         N-linked (GlcNAc...) asparagine                 Sequence analysis  1
Glycosylation  1044         N-linked (GlcNAc...) asparagine                 Sequence analysis  1
Glycosylation  1512         N-linked (GlcNAc...) asparagine                 Sequence analysis  1
Glycosylation  1767         N-linked (GlcNAc...) asparagine                 Sequence analysis  1
Glycosylation  2210         N-linked (GlcNAc...) asparagine                 Sequence analysis  1
Glycosylation  2273         N-linked (GlcNAc...) asparagine                 Sequence analysis  1
Glycosylation  2532         N-linked (GlcNAc...) asparagine                 Sequence analysis  1
Glycosylation  2683         N-linked (GlcNAc...) asparagine                 Sequence analysis  1
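
The N-linked sites listed here are annotated by sequence analysis rather than experiment. One common way to find candidate sites is to scan for the consensus sequon Asn-X-Ser/Thr, where X is any residue except Pro. The sketch below assumes that rule only; it is not a description of the method UniProt used, and a sequon is necessary but not sufficient for glycosylation, so a scan of this kind may report more positions than the entry annotates. The function name `find_n_glyc_sequons` is hypothetical.

```python
import re

# Consensus sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead allows overlapping sequons (e.g. "NNTS") to both be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glyc_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues in N-X-[S/T] sequons (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# First 50 residues of the canonical sequence shown in this entry.
n_terminus = "MRTALCSAVAALCAAALLSSIEAEVNPPSDLNFTIIDEHNVQMSWKRPPD"
print(find_n_glyc_sequons(n_terminus))  # [32]
```

In this fragment Asn-26 is followed by Pro and Asn-40 is not followed by Ser or Thr at the +2 position, which leaves position 32, the first N-linked site in the table.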

Post-translational modification

The triple-helical tail is stabilized by disulfide bonds at each end.
Prolines at the third position of the tripeptide repeating unit (G-X-Y) are hydroxylated in some or all of the chains.
O-glycosylated; glycosaminoglycan of chondroitin-sulfate type (by similarity).
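
To make the "third position of the tripeptide repeating unit (G-X-Y)" concrete, the sketch below lists the prolines that sit in the Y slot of a stretch of G-X-Y triplets. It is only an illustration: it assumes the segment starts on a glycine and stays in perfect register, and it identifies candidate residues only, since the entry states that hydroxylation occurs in some or all of the chains. The function name `y_position_prolines` is hypothetical.

```python
def y_position_prolines(segment: str, offset: int = 1) -> list[int]:
    """Positions of Pro in the Y slot of consecutive G-X-Y triplets.

    Assumes `segment` starts on a Gly and stays in G-X-Y register;
    `offset` is the 1-based sequence position of the first residue.
    """
    positions = []
    for i in range(0, len(segment) - 2, 3):
        gly, _x, y = segment[i:i + 3]
        if gly != "G":
            break  # register lost (an imperfection); stop rather than guess
        if y == "P":
            positions.append(offset + i + 2)
    return positions


# Residues 2751-2760 of the canonical sequence (start of the COL2 region).
print(y_position_prolines("GPPGPPGPPG", offset=2751))  # [2753, 2756, 2759]
```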

Keywords - PTM

Disulfide bond, Glycoprotein, Hydroxylation, Proteoglycan

Proteomic databases

PaxDb (a database of protein abundance averages across all three domains of life): P13944
PRIDE (PRoteomics IDEntifications database): P13944

Expression

Tissue specificity

Type XII collagen is present in tendons, ligaments, perichondrium, and periosteum, all dense connective tissues containing type I collagen.

Interaction

Subunit structure

Trimer of identical chains each containing 190 kDa of non-triple-helical sequences.

Protein-protein interaction databases

ComplexPortal (manually curated resource of macromolecular complexes): CPX-3109 Collagen type XII trimer
STRING (functional protein association networks): 9031.ENSGALP00000025593

Structure

3D structure databases

SMR (SWISS-MODEL Repository, a database of annotated 3D protein structure models): P13944

Family & Domains

Domains and Repeats

Feature key  Position(s)  Description              Evidence                    Length
Domain       27 – 117     Fibronectin type-III 1   PROSITE-ProRule annotation  91
Domain       139 – 311    VWFA 1                   PROSITE-ProRule annotation  173
Domain       335 – 424    Fibronectin type-III 2   PROSITE-ProRule annotation  90
Domain       439 – 615    VWFA 2                   PROSITE-ProRule annotation  177
Domain       633 – 722    Fibronectin type-III 3   PROSITE-ProRule annotation  90
Domain       724 – 815    Fibronectin type-III 4   PROSITE-ProRule annotation  92
Domain       816 – 906    Fibronectin type-III 5   PROSITE-ProRule annotation  91
Domain       908 – 998    Fibronectin type-III 6   PROSITE-ProRule annotation  91
Domain       999 – 1087   Fibronectin type-III 7   PROSITE-ProRule annotation  89
Domain       1089 – 1179  Fibronectin type-III 8   PROSITE-ProRule annotation  91
Domain       1199 – 1371  VWFA 3                   PROSITE-ProRule annotation  173
Domain       1387 – 1476  Fibronectin type-III 9   PROSITE-ProRule annotation  90
Domain       1477 – 1568  Fibronectin type-III 10  PROSITE-ProRule annotation  92
Domain       1569 – 1659  Fibronectin type-III 11  PROSITE-ProRule annotation  91
Domain       1660 – 1756  Fibronectin type-III 12  PROSITE-ProRule annotation  97
Domain       1759 – 1853  Fibronectin type-III 13  PROSITE-ProRule annotation  95
Domain       1854 – 1939  Fibronectin type-III 14  PROSITE-ProRule annotation  86
Domain       1940 – 2030  Fibronectin type-III 15  PROSITE-ProRule annotation  91
Domain       2031 – 2121  Fibronectin type-III 16  PROSITE-ProRule annotation  91
Domain       2122 – 2210  Fibronectin type-III 17  PROSITE-ProRule annotation  89
Domain       2211 – 2299  Fibronectin type-III 18  PROSITE-ProRule annotation  89
Domain       2327 – 2500  VWFA 4                   PROSITE-ProRule annotation  174
Domain       2524 – 2716  Laminin G-like           -                           193
Domain       2751 – 2802  Collagen-like 1          -                           52
Domain       2807 – 2858  Collagen-like 2          -                           52
Domain       2859 – 2900  Collagen-like 3          -                           42
Domain       2945 – 2994  Collagen-like 4          -                           50

Region

Feature key  Position(s)  Description                                        Length
Region       2455 – 2750  Nonhelical region (NC3)                            296
Region       2751 – 2902  Triple-helical region (COL2) with 1 imperfection   152
Region       2903 – 2945  Nonhelical region (NC2)                            43
Region       2946 – 3048  Triple-helical region (COL1) with 2 imperfections  103
Region       3049 – 3124  Nonhelical region (NC1)                            76

Motif

Feature key  Position(s)  Description           Evidence           Length
Motif        2899 – 2901  Cell attachment site  Sequence analysis  3

Compositional bias

Feature key         Position(s)  Description            Length
Compositional bias  3086 – 3096  Asp/Glu-rich (acidic)  11
Compositional bias  3111 – 3123  Arg/Lys-rich (basic)   13

Domain

This sequence defines five distinct domains: two triple-helical domains (COL1 and COL2) and three non-triple-helical domains (NC1, NC2, and NC3).

Keywords - Domain

Collagen, Repeat, Signal

Phylogenomic databases

eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups):
  ENOG410IS7D Eukaryota
  ENOG4111RUM LUCA
HOGENOM (The HOGENOM Database of Homologous Genes from Fully Sequenced Organisms): HOG000111877
InParanoid (Eukaryotic Ortholog Groups): P13944
OrthoDB (Database of Orthologous Groups): 67372at2759
PhylomeDB (Database for complete collections of gene phylogenies): P13944

Family and domain databases

CDD (Conserved Domains Database):
  cd00063 FN3, 18 hits
Gene3D (Structural and Functional Annotation of Protein Families):
  2.60.40.10, 18 hits
  3.40.50.410, 4 hits
InterPro (Integrated resource of protein families, domains and functional sites):
  IPR008160 Collagen
  IPR013320 ConA-like_dom_sf
  IPR003961 FN3_dom
  IPR036116 FN3_sf
  IPR013783 Ig-like_fold
  IPR001791 Laminin_G
  IPR002035 VWF_A
  IPR036465 vWFA_dom_sf
Pfam (protein domain database):
  PF01391 Collagen, 4 hits
  PF00041 fn3, 17 hits
  PF00092 VWA, 4 hits
SMART (Simple Modular Architecture Research Tool; a protein domain database):
  SM00060 FN3, 18 hits
  SM00210 TSPN, 1 hit
  SM00327 VWA, 4 hits
SUPFAM (Superfamily database of structural and functional annotation):
  SSF49265 SSF49265, 11 hits
  SSF49899 SSF49899, 1 hit
  SSF53300 SSF53300, 4 hits
PROSITE (a protein domain and family database):
  PS50853 FN3, 18 hits
  PS50234 VWFA, 4 hits

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative splicing.
Note: The final tissue form of collagen XII may contain homotrimers of either isoform Long or isoform Short or any combination of isoform Long and isoform Short. Only isoform Long is a proteoglycan. Isoform Long has more restricted expression in embryonic tissue than isoform Short.
Isoform Long (identifier: P13944-1)

This isoform has been chosen as the canonical sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MRTALCSAVA ALCAAALLSS IEAEVNPPSD LNFTIIDEHN VQMSWKRPPD
        60         70         80         90        100
AIVGYRITVV PTNDGPTKEF TLSPSTTQTV LSDLIPEIEY VVSIASYDEV
       110        120        130        140        150
EESLPVFGQL TIQTGGPGIP EEKKVEAQIQ KCSISAMTDL VFLVDGSWSV
       160        170        180        190        200
GRNNFRYILD FMVALVSAFD IGEEKTRVGV VQYSSDTRTE FNLNQYFRRS
       210        220        230        240        250
DLLDAIKRIP YKGGNTMTGE AIDYLVKNTF TESAGARKGF PKVAIVITDG
       260        270        280        290        300
KAQDEVEIPA RELRNIGVEV FSLGIKAADA KELKLIASQP SLKHVFNVAN
       310        320        330        340        350
FDGIVDIQNE IILQVCSGVD EQLGELVSGE EVVEPASNLV ATQISSKSVR
       360        370        380        390        400
ITWDPSTSQI TGYRVQFIPM IAGGKQHVLS VGPQTTALNV KDLSPDTEYQ
       410        420        430        440        450
INVYAMKGLT PSEPITIMEK TQQVKVQVEC SRGVDVKADV VFLVDGSYSI
       460        470        480        490        500
GIANFVKVRA FLEVLVKSFE ISPRKVQISL VQYSRDPHME FSLNRYNRVK
       510        520        530        540        550
DIIQAINTFP YRGGSTNTGK AMTYVREKVF VTSKGSRPNV PRVMILITDG
       560        570        580        590        600
KSSDAFKEPA IKLRDADVEI FAVGVKDAVR TELEAIASPP AETHVYTVED
       610        620        630        640        650
FDAFQRISFE LTQSVCLRIE QELAAIRKKS YVPAKNMVFS DVTSDSFKVS
       660        670        680        690        700
WSAAGSEEKS YLIKYKVAIG GDEFIVSVPA SSTSSVLTNL LPETTYAVSV
       710        720        730        740        750
IAEYEDGDGP PLDGEETTLE VKGAPRNLRI TDETTDSFIV GWTPAPGNVL
       760        770        780        790        800
RYRLVYRPLT GGERRQVTVS ANERSTTLRN LIPDTRYEVS VIAEYQSGPG
       810        820        830        840        850
NALNGYAKTD EVRGNPRNLR VSDATTSTTM KLSWSAAPGK VQHVLYNLHT
       860        870        880        890        900
RYAGVETKEL TVKGDTTSKE LKGLDEATRY ALTVSALYAS GAGEALSGEG
       910        920        930        940        950
ETLEERGSPR NLITTDITDT TVGLSWTPAP GTVNNYRIVW KSLYDDTMGE
       960        970        980        990       1000
KRVPGNTVDA VLDGLEPETK YRISIYAAYS SGEGDPVEGE AFTDVSQSAR
      1010       1020       1030       1040       1050
TVTVDNETEN TMRVSVAALT WEGLVLARVL PNRSGGRQMF GKVNASATSI
      1060       1070       1080       1090       1100
VLKRLKPRTT YDLSVVPIYD FGQGKSRKAE GTTASPFKPP RNLRTSDSTM
      1110       1120       1130       1140       1150
SSFRVTWEPA PGRVKGYKVT FHPTEDDRNL GELVVGPYDS TVVLEELRAG
      1160       1170       1180       1190       1200
TTYKVNVFGM FDGGESNPLV GQEMTTLSDT TTEPFLSRGL ECRTRAEADI
      1210       1220       1230       1240       1250
VLLVDGSWSI GRPNFKTVRN FISRIVEVFD IGPDKVQIGL AQYSGDPRTE
      1260       1270       1280       1290       1300
WNLNAYRTKE ALLDAVTNLP YKGGNTLTGM ALDFILKNNF KQEAGLRPRA
      1310       1320       1330       1340       1350
RKIGVLITDG KSQDDVVTPS RRLRDEGVEL YAIGIKNADE NELKQIATDP
      1360       1370       1380       1390       1400
DDIHAYNVAD FSFLASIGED VTTNLCNSVK GPGDLPPPSN LVISEVTPHS
      1410       1420       1430       1440       1450
FRLRWSPPPE SVDRYRVEYY PTTGGPPKQF YVSRMETTTV LKDLTPETEY
      1460       1470       1480       1490       1500
IVNVFSVVED ESSEPLIGRE ITYPLSSVRN LNVYDIGSTS MRVRWEPVNG
      1510       1520       1530       1540       1550
ATGYLLTYEP VNATVPTTEK EMRVGPSVNE VQLVDLIPNT EYTLTAYVLY
      1560       1570       1580       1590       1600
GDITSDPLTS QEVTLPLPGP RGVTIRDVTH STMNVLWDPA PGKVRKYIIR
      1610       1620       1630       1640       1650
YKIADEADVK EVEIDRLKTS TTLTDLSSQR LYNVKVVAVY DEGESLPVVA
      1660       1670       1680       1690       1700
SCYSAVPSPV NLRITEITKN SFRGTWDHGA PDVSLYRITW GPYGRSEKAE
      1710       1720       1730       1740       1750
SIVNGDVNSL LFENLNPDTL YEVSVTAIYP DESETVDDLI GSERTLPLVP
      1760       1770       1780       1790       1800
ITTPAPKSGP RNLQVYNATS HSLTVKWDPA SGRVQRYKII YQPINGDGPE
      1810       1820       1830       1840       1850
QSTMVGGRQN SVVIQKLQPD TPYAITVSSM YADGEGGRMT GRGRTKPLTT
      1860       1870       1880       1890       1900
VKNMLVYDPT TSTLNVRWDH AEGNPRQYKV FYRPTAGGAE EMTTVPGNTN
      1910       1920       1930       1940       1950
YVILRSLEPN TPYTVTVVPV FPEGDGGRTT DTGRTLERGT PRNIQVYNPT
      1960       1970       1980       1990       2000
PNSMNVRWEP APGPVQQYRV NYSPLSGPRP SESIVVPANT RDVMLERLTP
      2010       2020       2030       2040       2050
DTAYSINVIA LYADGEGNPS QAQGRTLPRS GPRNLRVFDE TTNSLSVQWD
      2060       2070       2080       2090       2100
HADGPVQQYR IIYSPTVGDP IDEYTTVPGI RNNVILQPLQ SDTPYKITVV
      2110       2120       2130       2140       2150
AVYEDGDGGQ LTGNGRTVGL LPPQNIYITD EWYTRFRVSW DPSPSPVLGY
      2160       2170       2180       2190       2200
KIVYKPVGSN EPMEVFVGEV TSYTLHNLSP STTYDVNVYA QYDSGMSIPL
      2210       2220       2230       2240       2250
TDQGTTLYLN VTDLTTYKIG WDTFCIRWSP HRSATSYRLK LNPADGSRGQ
      2260       2270       2280       2290       2300
EITVRGSETS HCFTGLSPDT EYNATVFVQT PNLEGPPVSV REHTVLKPTE
      2310       2320       2330       2340       2350
APTPPPTPPP PPTIPPARDV CRGAKADIVF LTDASWSIGD DNFNKVVKFV
      2360       2370       2380       2390       2400
FNTVGAFDLI NPAGIQVSLV QYSDEAQSEF KLNTFDDKAQ ALGALQNVQY
      2410       2420       2430       2440       2450
RGGNTRTGKA LTFIKEKVLT WESGMRRGVP KVLVVVTDGR SQDEVRKAAT
      2460       2470       2480       2490       2500
VIQHSGFSVF VVGVADVDYN ELAKIASKPS ERHVFIVDDF DAFEKIQDNL
      2510       2520       2530       2540       2550
VTFVCETATS TCPLIYLEGY TSPGFKMLES YNLTEKHFAS VQGVSLESGS
      2560       2570       2580       2590       2600
FPSYVAYRLH KNAFVSQPIR EIHPEGLPQA YTIIMLFRLL PESPSEPFAI
      2610       2620       2630       2640       2650
WQITDRDYKP QVGVVLDPGS KVLSFFNKDT RGEVQTVTFD NDEVKKIFYG
      2660       2670       2680       2690       2700
SFHKVHIVVT SSNVKIYIDC SEILEKPIKE AGNITTDGYE ILGKLLKGDR
      2710       2720       2730       2740       2750
RSATLEIQNF DIVCSPVWTS RDRCCDLPSM RDEAKCPALP NACTCTQDSV
      2760       2770       2780       2790       2800
GPPGPPGPPG GPGAKGPRGE RGLTGSSGPP GPRGETGPPG PQGPPGPQGP
      2810       2820       2830       2840       2850
NGLQIPGEPG RQGMKGDAGQ PGLPGRSGTP GLPGPPGPVG PPGERGFTGK
      2860       2870       2880       2890       2900
DGPTGPRGPP GPAGAPGVPG VAGPSGKPGK PGDRGTPGTP GMKGEKGDRG
      2910       2920       2930       2940       2950
DIASQNMMRA VARQVCEQLI NGQMSRFNQM LNQIPNDYYS NRNQPGPPGP
      2960       2970       2980       2990       3000
PGPPGAAGTR GEPGPGGRPG FPGPPGVQGP PGERGMPGEK GERGTGSQGP
      3010       3020       3030       3040       3050
RGLPGPPGPQ GESRTGPPGS TGSRGPPGPP GRPGNAGIRG PPGPPGYCDS
      3060       3070       3080       3090       3100
SQCASIPYNG QGFPEPYVPE SGPYQPEGEP FIVPMESERR EDEYEDYGVE
      3110       3120
MHSPEYPEHM RWKRSLSRKA KRKP
Length: 3,124
Mass (Da): 340,582
Last modified: November 1, 1997 - v3
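
The listing above interleaves position rulers with blocks of ten residues, so it has to be cleaned before it can be used as a plain sequence string. The sketch below shows one way to do that in Python; `clean_sequence_block` is a hypothetical helper. Applied to the complete listing, the result should be 3,124 residues long if the listing is intact, although that check has not been run here.

```python
def clean_sequence_block(block: str) -> str:
    """Strip ruler numbers and whitespace from a numbered sequence listing."""
    residues = []
    for line in block.splitlines():
        for token in line.split():
            if token.isdigit():
                continue  # ruler value such as "10" or "3120"
            if not token.isalpha():
                raise ValueError(f"unexpected token: {token!r}")
            residues.append(token.upper())
    return "".join(residues)


# First two ten-residue blocks of the listing, with their ruler line.
example = """\
        10         20
MRTALCSAVA ALCAAALLSS
"""
sequence = clean_sequence_block(example)
print(len(sequence), sequence)  # 20 MRTALCSAVAALCAAALLSS
```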
Checksum: 094285AFE7F346CF
The checksum is a form of redundancy check that is calculated from the sequence. It is useful for tracking sequence updates. In theory, two different sequences could have the same checksum value, but the likelihood that this would happen is extremely low. UniProtKB may, however, contain entries with identical sequences in the case of multiple genes (paralogs). The checksum is computed as the 64-bit Cyclic Redundancy Check value (CRC64) of the sequence, using the generator polynomial x^64 + x^4 + x^3 + x + 1. The algorithm is described in the ISO 3309 standard. Reference: Press W.H., Flannery B.P., Teukolsky S.A. and Vetterling W.T., "Cyclic redundancy and other checksums", Numerical Recipes in C, 2nd ed., pp. 896-902, Cambridge University Press (1993).
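
A CRC64 over the generator polynomial x^64 + x^4 + x^3 + x + 1 can be sketched in a few lines of Python. The polynomial comes from the text above; the other conventions used here (bit-reflected processing, an initial value of zero, no final XOR, one ASCII byte per residue) are assumptions borrowed from commonly circulated Swiss-Prot-style implementations, and this sketch has not been checked against the checksum shown for this entry.

```python
# x^64 + x^4 + x^3 + x + 1 with the x^64 term implicit: low bits 0x1B,
# written here in bit-reflected form for right-shifting CRC arithmetic.
POLY = 0xD800000000000000


def crc64(sequence: str) -> str:
    """Bitwise, reflected CRC64 of an ASCII sequence (init 0, no final XOR)."""
    crc = 0
    for byte in sequence.encode("ascii"):
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial when that bit was set.
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
    return f"{crc:016X}"


print(crc64("MRTALCSAVA"))  # 16 hexadecimal digits
```

If the assumed conventions match the ones UniProt uses, passing the full 3,124-residue canonical sequence to `crc64` should reproduce the checksum listed above; a mismatch would point to a different initial value or bit order rather than to a different polynomial.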
Isoform Short (identifier: P13944-2)

The sequence of this isoform differs from the canonical sequence as follows:
     25-1188: Missing.

Length: 1,960
Mass (Da): 212,902
Checksum: 23D3F54E10AD88CC
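
The stated length of isoform Short follows from the deletion: residues 25-1188 are 1188 - 25 + 1 = 1,164 residues, and 3,124 - 1,164 = 1,960. A minimal Python sketch of the operation is given below, assuming 1-based inclusive positions as used throughout the entry; `delete_range` is an illustrative name.

```python
def delete_range(sequence: str, start: int, end: int) -> str:
    """Remove residues start..end (1-based, inclusive) from a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range must lie within the sequence")
    # Keep everything before `start` and everything after `end`.
    return sequence[:start - 1] + sequence[end:]


canonical_length = 3124
deleted = 1188 - 25 + 1
print(canonical_length - deleted)  # 1960
```

Calling `delete_range(canonical, 25, 1188)` on the full isoform Long sequence would give the isoform Short sequence under this reading of "25-1188: Missing".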

Experimental Info

Feature key        Position(s)  Description                            Evidence  Length
Sequence conflict  1258         T → S in CAA47744 (PubMed:1420368).    Curated   1
Sequence conflict  1264         D → E in CAA47744 (PubMed:1420368).    Curated   1
Sequence conflict  2759         P → A in AAA48635 (PubMed:2584192).    Curated   1
Sequence conflict  2803         L → F in AAA48635 (PubMed:2584192).    Curated   1
Sequence conflict  2977         V → F in AAA48635 (PubMed:2584192).    Curated   1
Sequence conflict  2977         V → F in AAA48718 (PubMed:3476925).    Curated   1
Sequence conflict  3075 – 3076  QP → AG in AAA48718 (PubMed:3476925).  Curated   2
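
Each conflict names the canonical residue(s) at a position and the residue(s) reported in another submission. The sketch below shows how such a substitution could be applied in Python while first checking that the canonical residues really are at the stated position; `apply_substitution` is a hypothetical helper, and the fragment used is residues 1251-1270 of the canonical sequence, so positions are shifted by 1250.

```python
def apply_substitution(sequence: str, start: int, ref: str, alt: str) -> str:
    """Replace `ref` with `alt` at a 1-based start position, checking `ref` first."""
    if start < 1:
        raise ValueError("positions are 1-based")
    end = start - 1 + len(ref)
    observed = sequence[start - 1:end]
    if observed != ref:
        raise ValueError(f"expected {ref!r} at {start}, found {observed!r}")
    return sequence[:start - 1] + alt + sequence[end:]


fragment = "WNLNAYRTKEALLDAVTNLP"  # canonical residues 1251-1270
variant = apply_substitution(fragment, 1258 - 1250, "T", "S")
variant = apply_substitution(variant, 1264 - 1250, "D", "E")
print(variant)  # WNLNAYRSKEALLEAVTNLP
```

The reference check is the useful part: it fails loudly if the coordinates and the sequence are out of step, for example when positions are applied to an isoform other than the canonical one.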

Alternative sequence

Feature key           Position(s)  Description                            Evidence       Length
Alternative sequence  25 – 1188    Missing in isoform Short (VSP_001148)  1 publication  1164

Sequence databases

EMBL / GenBank / DDBJ (nucleotide sequence databases):
  D00824 mRNA Translation: BAA00701.1
  X61024 mRNA Translation: CAA43358.1
  J05137 mRNA Translation: AAA48635.1
  M17375 mRNA Translation: AAA48718.1
  X67327 mRNA Translation: CAA47744.1
PIR (Protein sequence database of the Protein Information Resource): A40020

Keywords - Coding sequence diversity

Alternative splicing

Entry information

Entry name: COCA1_CHICK
Accession
Primary (citable) accession number: P13944
Secondary accession number(s): Q04509
Entry history
Integrated into UniProtKB/Swiss-Prot: January 1, 1990
Last sequence update: November 1, 1997
Last modified: July 3, 2019
This is version 159 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. SIMILARITY comments
    Index of protein domains and families
UniProt is an ELIXIR core data resource
Main funding by: National Institutes of Health
