Protein

Collagen alpha-1(II) chain

Gene

col2a1

Organism
Xenopus laevis (African clawed frog)
Status
Reviewed - Annotation score: 4 out of 5 - Experimental evidence at transcript level

Function

Type II collagen is specific for cartilaginous tissues. It is essential for the normal embryonic development of the skeleton, for linear growth and for the ability of cartilage to resist compressive forces (by similarity).

Sites

Feature key    Position(s)  Description                   Evidence       Length
Metal binding  1300         Calcium                       By similarity  1
Metal binding  1302         Calcium                       By similarity  1
Metal binding  1303         Calcium; via carbonyl oxygen  By similarity  1
Metal binding  1305         Calcium; via carbonyl oxygen  By similarity  1
Metal binding  1308         Calcium                       By similarity  1

Keywords - Ligand

Calcium, Metal-binding

Names & Taxonomy

Protein names
Recommended name:
Collagen alpha-1(II) chain
Alternative name(s):
Alpha-1 type II collagen
Gene names
Name: col2a1
Organism: Xenopus laevis (African clawed frog)
Taxonomic identifier: 8355 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Amphibia > Batrachia > Anura > Pipoidea > Pipidae > Xenopodinae > Xenopus > Xenopus

Organism-specific databases

Xenbase: XB-GENE-6252613. col2a1.

Subcellular location

Keywords - Cellular component

Extracellular matrix, Secreted

PTM / Processing

Molecule processing

Feature key     Position(s)  Description                 Identifier      Evidence           Length
Signal peptide  1 – 26       -                           -               Sequence analysis  26
Propeptide      27 – 183     N-terminal propeptide       PRO_0000286178  By similarity      157
Chain           184 – 1243   Collagen alpha-1(II) chain  PRO_0000286179  -                  1060
Propeptide      1244 – 1486  C-terminal propeptide       PRO_0000286180  By similarity      243
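
The positions in the table above are 1-based and inclusive, so each length equals end - start + 1, and the four features tile the 1,486-residue precursor without gaps or overlaps. The following Python sketch checks that bookkeeping. The Feature class and the check_tiling and slice_feature helpers are illustrative names chosen here, not part of any UniProt tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    key: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Coordinates copied from the Molecule processing table of this entry.
FEATURES = [
    Feature("Signal peptide", 1, 26),
    Feature("N-terminal propeptide", 27, 183),
    Feature("Collagen alpha-1(II) chain", 184, 1243),
    Feature("C-terminal propeptide", 1244, 1486),
]


def check_tiling(features, total_length):
    """Return True if the features cover 1..total_length with no gap or overlap."""
    expected_start = 1
    for feature in sorted(features, key=lambda f: f.start):
        if feature.start != expected_start:
            return False
        expected_start = feature.end + 1
    return expected_start == total_length + 1


def slice_feature(sequence, feature):
    """Extract a feature from a precursor string (Python slices are 0-based, end-exclusive)."""
    return sequence[feature.start - 1:feature.end]


if __name__ == "__main__":
    for feature in FEATURES:
        print(f"{feature.key}: {feature.start}-{feature.end}, length {feature.length}")
    print("tiles precursor:", check_tiling(FEATURES, 1486))
```

Passing the full sequence of the entry to slice_feature together with the Chain feature would return the 1,060-residue mature chain.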

Amino acid modifications

Feature key     Position(s)  Description               Evidence                    Length
Disulfide bond  1282 ↔ 1314  -                         PROSITE-ProRule annotation  -
Disulfide bond  1288         Interchain (with C-1305)  PROSITE-ProRule annotation  -
Disulfide bond  1305         Interchain (with C-1288)  PROSITE-ProRule annotation  -
Disulfide bond  1322 ↔ 1484  -                         PROSITE-ProRule annotation  -
Glycosylation   1387         N-linked (GlcNAc...)      Sequence analysis           1
Disulfide bond  1392 ↔ 1437  -                         PROSITE-ProRule annotation  -
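
The N-linked glycosylation site at position 1387 is annotated from sequence analysis. Such predictions are commonly based on the N-X-S/T sequon, where X is any residue except proline. The sketch below scans residues 1381-1400 of the displayed sequence for that motif. The fragment is copied from this entry; the function name and the fragment choice are assumptions made for illustration, and a matching sequon is no proof that the site is glycosylated.

```python
import re

# Residues 1381-1400 of the displayed sequence (copied from this entry).
FRAGMENT = "STDASQNITYHCKNSIAFMD"
FRAGMENT_START = 1381  # 1-based position of the first residue in FRAGMENT

# N-X-S/T with X != P; the lookahead allows overlapping sequons to be found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(fragment, start=1):
    """Return the 1-based positions of the Asn in every N-X-S/T sequon."""
    return [start + match.start() for match in SEQUON.finditer(fragment)]


if __name__ == "__main__":
    print(find_sequons(FRAGMENT, FRAGMENT_START))  # prints [1387]
```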

Post-translational modification

Prolines at the third position of the tripeptide repeating unit (G-X-Y) are hydroxylated in some or all of the chains (by similarity).
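
As a rough illustration of the G-X-Y rule, the sketch below walks residues 203-250 (the start of the triple-helical region, copied from the displayed sequence) in steps of three and reports prolines in the Y slot. These are candidate sites only: the entry does not state which prolines are actually hydroxylated, and the function name is an assumption made for this example.

```python
# Residues 203-250 of the displayed sequence (start of the triple-helical region).
HELIX_FRAGMENT = "GPMGPMGPRGPPGPTGAPGPQGFQGNPGEPGEPGAGGPMGPRGPPGPS"
HELIX_START = 203  # 1-based position of the first residue in HELIX_FRAGMENT


def y_position_prolines(fragment, start):
    """Return 1-based positions of Pro in the Y slot of consecutive G-X-Y triplets.

    Raises ValueError if the fragment is not an uninterrupted G-X-Y repeat.
    """
    if len(fragment) % 3 != 0:
        raise ValueError("fragment length must be a multiple of 3")
    positions = []
    for offset in range(0, len(fragment), 3):
        g, _x, y = fragment[offset:offset + 3]
        if g != "G":
            raise ValueError(f"triplet at position {start + offset} does not start with G")
        if y == "P":
            positions.append(start + offset + 2)
    return positions


if __name__ == "__main__":
    print(y_position_prolines(HELIX_FRAGMENT, HELIX_START))
    # prints [214, 220, 229, 232, 235, 247]
```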

Sites

Feature key  Position(s)  Description                               Evidence       Length
Site         183 – 184    Cleavage; by procollagen N-endopeptidase  By similarity  2
Site         1243 – 1244  Cleavage; by procollagen C-endopeptidase  By similarity  2

Keywords - PTM

Disulfide bond, Glycoprotein, Hydroxylation

Proteomic databases

PRIDE: Q91717.

Expression

Developmental stage

Initially, the transcripts are localized to notochord, somites, and the dorsal region of the lateral plate mesoderm. At later stages of development and parallel to increased mRNA accumulation, collagen expression becomes progressively more confined to chondrogenic regions of the tadpole (1 publication).

Interaction

Subunit structure

Homotrimers of alpha 1(II) chains (by similarity).

Structure

3D structure databases

ProteinModelPortal: Q91717.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Description             Evidence                    Length
Domain       36 – 94      VWFC                    PROSITE-ProRule annotation  59
Domain       1252 – 1486  Fibrillar collagen NC1  PROSITE-ProRule annotation  235

Region

Feature key  Position(s)  Description                     Length
Region       203 – 1216   Triple-helical region           1014
Region       1217 – 1243  Nonhelical region (C-terminal)  27
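
Because all features share one coordinate system, any annotated position can be assigned to a domain or region by a simple interval test. The sketch below does this for the coordinates listed in the Domains and Region tables. The REGIONS dictionary and the regions_containing helper are illustrative names, not part of the entry.

```python
# Coordinates (1-based, inclusive) copied from the Domains and Region tables.
REGIONS = {
    "VWFC": (36, 94),
    "Triple-helical region": (203, 1216),
    "Nonhelical region (C-terminal)": (1217, 1243),
    "Fibrillar collagen NC1": (1252, 1486),
}

# Calcium-binding positions copied from the Sites table of this entry.
METAL_BINDING = [1300, 1302, 1303, 1305, 1308]


def regions_containing(position, regions=REGIONS):
    """Return the names of all regions whose inclusive range contains position."""
    return [name for name, (start, end) in regions.items() if start <= position <= end]


if __name__ == "__main__":
    for position in METAL_BINDING:
        print(position, regions_containing(position))
```

With these coordinates, all five calcium-binding positions fall inside the fibrillar collagen NC1 domain, while a position such as 1248 lies in none of the listed ranges.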

Domain

The C-terminal propeptide, also known as the COLFI domain, has crucial roles in tissue growth and repair by controlling both the intracellular assembly of procollagen molecules and the extracellular assembly of collagen fibrils. It binds a calcium ion which is essential for its function (by similarity).

Sequence similarities

Belongs to the fibrillar collagen family (PROSITE-ProRule annotation).
Contains 1 fibrillar collagen NC1 domain (PROSITE-ProRule annotation).
Contains 1 VWFC domain (PROSITE-ProRule annotation).

Keywords - Domain

Collagen, Repeat, Signal

Phylogenomic databases

HOVERGEN: HBG004933.
KO: K19719.

Family and domain databases

InterPro: IPR008160. Collagen.
          IPR000885. Fib_collagen_C.
          IPR001007. VWF_dom.
Pfam:     PF01410. COLFI. 1 hit.
          PF01391. Collagen. 8 hits.
          PF00093. VWC. 1 hit.
ProDom:   PD002078. Fib_collagen_C. 1 hit.
SMART:    SM00038. COLFI. 1 hit.
          SM00214. VWC. 1 hit.
PROSITE:  PS51461. NC1_FIB. 1 hit.
          PS01208. VWFC_1. 1 hit.
          PS50184. VWFC_2. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

Q91717-1 [UniParc]

        10         20         30         40         50
MFSFVDSRTL VLFAATQVIL LAVVRCQDEE DVLDTGSCVQ HGQRYSDKDV
        60         70         80         90        100
WKPEPCQICV CDTGTVLCDD IICEESKDCP NAEIPFGECC PICPTEQSST
       110        120        130        140        150
SSGQGVLKGQ KGEPGDIKDV LGPRGPPGPQ GPSGEQGSRG ERGDKGEKGA
       160        170        180        190        200
PGPRGRDGEP GTPGNPGPVG PPGPPGLGGN FAAQMTGGFD EKAGGAQMGV
       210        220        230        240        250
MQGPMGPMGP RGPPGPTGAP GPQGFQGNPG EPGEPGAGGP MGPRGPPGPS
       260        270        280        290        300
GKPGDDGEAG KPGKSGERGP PGPQGARGFP GTPGLPGVKG HRGYPGLDGA
       310        320        330        340        350
KGEAGAAGAK GEGGATGEAG SPGPMGPRGL PGERGRPGSS GAAGARGNDG
       360        370        380        390        400
LPGPAGPPGP VGPAGAPGFP GAPGSKGEAG PTGARGPEGA QGPRGESGTP
       410        420        430        440        450
GSPGPAGASG NPGTDGIPGA KGSSGGPGIA GAPGFPGPRG PPGPQGATGP
       460        470        480        490        500
LGPKGQTGDP GVAGFKGEQG PKGEIGSAGP QGAPGPAGEE GKRGARGEPG
       510        520        530        540        550
AAGPNGPPGE RGAPGNRGFP GQDGLAGPKG APGERGVPGL GGPKGGNGDP
       560        570        580        590        600
GRPGEPGLPG ARGLTGRPGD AGPQGKVGPS GASGEDGRPG PPGPQGARGQ
       610        620        630        640        650
PGVMGFPGPK GANGEPGKAG EKGLVGAPGL RGLPGKDGET GSQGPNGPAG
       660        670        680        690        700
PAGERGEQGP PGPSGFQGLP GPPGSPGEGG KPGDQGVPGE AGAPGLVGPR
       710        720        730        740        750
GERGFPGERG SSGPQGLQGP RGLPGTPGTD GPKGASGPSG PNGAQGPPGL
       760        770        780        790        800
QGMPGERGAA GISGPKGDRG DTGEKGPEGA SGKDGSRGLT GPIGPPGPAG
       810        820        830        840        850
PNGEKGESGP SGPPGIVGAR GAPGDRGENG PPGPAGFAGP PGADGQSGLK
       860        870        880        890        900
GDQGESGQKG DAGAPGPQGP SGAPGPQGPT GVFGPKGARG AQGPAGATGF
       910        920        930        940        950
PGAAGRVGTP GPNGNPGPPG PPGSAGKEGP KGVRGDAGPP GRAGDPGLQG
       960        970        980        990       1000
AAGAPGEKGE PGEDGPSGPD GPPGPQGLSG QRGIVGLPGQ RGERGFPGLP
      1010       1020       1030       1040       1050
GPSGEPGKQG GPGSSGDRGP PGPVGPPGLT GPSGEPGREG NPGSDGPPGR
      1060       1070       1080       1090       1100
DGATGIKGDR GETGPLGAPG APGAPGAPGS VGPTGKQGDR GESGPQGPLG
      1110       1120       1130       1140       1150
PSGPAGARGL AGPQGPRGDK GEAGEAGERG QKGHRGFTGL QGLPGPPGSA
      1160       1170       1180       1190       1200
GDQGATGPAG PAGPRGPPGP VGPSGKDGSN GISGPIGPPG PRGRSGETGP
      1210       1220       1230       1240       1250
SGPPGQPGPP GPPGPPGPGI DMSAFAGLSQ PEKGPDPMRY MRADQASNSL
      1260       1270       1280       1290       1300
PVDVEATLKS LNNQIENIRS PDGTKKNPAR TCRDLKLCHP EWKSGDYWID
      1310       1320       1330       1340       1350
PNQGCTVDAI KVFCDMETGE TCVYPNPSKI PKKNWWSAKG KEKKHIWFGE
      1360       1370       1380       1390       1400
TINGGFQFSY GDDSSAPNTA NIQMTFLRLL STDASQNITY HCKNSIAFMD
      1410       1420       1430       1440       1450
EASGNLKKAV LLQGSNDVEI RAEGNSRFTY NALEDGCKKH TGKWSKTVIE
      1460       1470       1480
YRTQKTSRLP IVDIAPMDIG GADQEFGVDI GPVCFL
Length: 1,486
Mass (Da): 142,263
Last modified: May 1, 2007 - v2
Checksum: 02C18E5F5807100E
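
The numbered display above groups residues in blocks of ten with a position ruler over each row, so it has to be flattened before any computation. The sketch below shows one way to do that, demonstrated on the first two rows only. The parse_sequence_block name is an assumption made for this example; applied to the complete display it should yield a string whose length matches the stated 1,486 residues.

```python
def parse_sequence_block(text):
    """Concatenate residues from a numbered, 10-residue-grouped sequence display."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Ruler lines consist only of integers; blank lines have no tokens.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)


# First two rows of the displayed sequence, copied from this entry.
SAMPLE = """\
        10         20         30         40         50
MFSFVDSRTL VLFAATQVIL LAVVRCQDEE DVLDTGSCVQ HGQRYSDKDV
        60         70         80         90        100
WKPEPCQICV CDTGTVLCDD IICEESKDCP NAEIPFGECC PICPTEQSST
"""

if __name__ == "__main__":
    sequence = parse_sequence_block(SAMPLE)
    print(len(sequence))  # prints 100
```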

Experimental Info

Feature key        Position(s)  Description                          Evidence  Length
Sequence conflict  456          Q → E in AAA49678 (PubMed:1918153).  Curated   1
Sequence conflict  1287         L → I in AAA49678 (PubMed:1918153).  Curated   1
Sequence conflict  1315         D → N in AAA49678 (PubMed:1918153).  Curated   1
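
Each conflict in the table above describes a single-residue difference between the displayed sequence and the translation AAA49678. The sketch below applies such a difference to a short fragment and first checks that the displayed residue matches the table. The compact notation (for example Q456E) and the apply_substitution helper are conventions chosen for this example, not UniProt syntax; the fragments are copied from the displayed sequence.

```python
import re

# The three conflicts from the table, written as <displayed residue><position><AAA49678 residue>.
CONFLICTS = ["Q456E", "L1287I", "D1315N"]


def apply_substitution(fragment, fragment_start, change):
    """Apply one substitution to a fragment whose first residue sits at fragment_start (1-based)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", change)
    if match is None:
        raise ValueError(f"cannot parse substitution {change!r}")
    ref, position, alt = match.group(1), int(match.group(2)), match.group(3)
    index = position - fragment_start
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} lies outside the fragment")
    if fragment[index] != ref:
        raise ValueError(f"expected {ref} at {position}, found {fragment[index]}")
    return fragment[:index] + alt + fragment[index + 1:]


if __name__ == "__main__":
    # Residues 451-460, 1281-1290 and 1311-1320 of the displayed sequence.
    print(apply_substitution("LGPKGQTGDP", 451, "Q456E"))
    print(apply_substitution("TCRDLKLCHP", 1281, "L1287I"))
    print(apply_substitution("KVFCDMETGE", 1311, "D1315N"))
```

The reference-residue check guards against off-by-one errors when converting between the 1-based positions of the entry and 0-based string indices.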

Sequence databases

EMBL / GenBank / DDBJ:
M63595 mRNA. Translation: AAA49678.1.
BC048221 mRNA. Translation: AAH48221.1.
BC111515 mRNA. Translation: AAI11516.1.
PIR: A40333. B40333.
RefSeq: NP_001081258.1. NM_001087789.1.
UniGene: Xl.606.

Genome annotation databases

GeneID: 397738.
KEGG: xla:397738.

Cross-references

Organism-specific databases

CTD: 1280.
Xenbase: XB-GENE-6252613. col2a1.

Entry information

Entry name: CO2A1_XENLA
Accession: Primary (citable) accession number: Q91717
Secondary accession number(s): Q7ZTI6
Entry history
Integrated into UniProtKB/Swiss-Prot: May 1, 2007
Last sequence update: May 1, 2007
Last modified: October 5, 2016
This is version 78 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Documents

  1. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
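
The UniRef90 and UniRef50 criteria above combine an identity threshold with an 80% overlap requirement against the seed sequence. The sketch below expresses that as a simple predicate. It is an illustration only: the AlignmentSummary class, the way identity and overlap are computed from alignment counts, and the numbers in the usage lines are assumptions made here and do not describe the actual UniRef clustering pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignmentSummary:
    identical: int    # aligned residue pairs that are identical
    aligned: int      # aligned residue pairs (gap-free columns)
    seed_length: int  # length of the seed (longest) sequence


def joins_cluster(alignment, min_identity, min_overlap=0.80):
    """Return True if a sequence meets both thresholds relative to the seed.

    Assumed definitions: identity = identical / aligned,
    overlap = aligned / seed_length.
    """
    if alignment.aligned == 0 or alignment.seed_length == 0:
        return False
    identity = alignment.identical / alignment.aligned
    overlap = alignment.aligned / alignment.seed_length
    return identity >= min_identity and overlap >= min_overlap


if __name__ == "__main__":
    # Hypothetical alignment against a 1,486-residue seed.
    candidate = AlignmentSummary(identical=1200, aligned=1400, seed_length=1486)
    print(joins_cluster(candidate, min_identity=0.90))  # UniRef90-style threshold
    print(joins_cluster(candidate, min_identity=0.50))  # UniRef50-style threshold
```

With these hypothetical counts the candidate is about 86% identical over about 94% of the seed, so it would fail the 90% threshold but pass the 50% one.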