Protein

Collagen alpha-1(II) chain

Gene

col2a1

Organism
Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
Status
Reviewed. Annotation score: 4 out of 5. Experimental evidence at transcript level.

Function

Type II collagen is specific for cartilaginous tissues. It is essential for the normal embryonic development of the skeleton, for linear growth and for the ability of cartilage to resist compressive forces (By similarity).

Sites

Feature key    Position(s)  Description                   Evidence       Length
Metal binding  1306         Calcium                       By similarity  1
Metal binding  1308         Calcium                       By similarity  1
Metal binding  1309         Calcium; via carbonyl oxygen  By similarity  1
Metal binding  1311         Calcium; via carbonyl oxygen  By similarity  1
Metal binding  1314         Calcium                       By similarity  1

Keywords - Ligand

Calcium, Metal-binding

Names & Taxonomy

Protein names
Recommended name: Collagen alpha-1(II) chain
Alternative name(s): Alpha-1 type II collagen
Gene names
Name: col2a1
Organism: Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
Taxonomic identifier: 8364 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Amphibia > Batrachia > Anura > Pipoidea > Pipidae > Xenopodinae > Xenopus > Silurana
Proteomes
  • UP000008143 Component: Unassembled WGS sequence

Organism-specific databases

Xenbase: XB-GENE-6258353. col2a1.

Subcellular location

Keywords - Cellular component

Extracellular matrix, Secreted

PTM / Processing

Molecule processing

Feature key     Feature ID      Position(s)  Description                 Evidence           Length
Signal peptide  -               1 – 26       -                           Sequence analysis  26
Propeptide      PRO_0000286181  27 – 186     N-terminal propeptide       By similarity      160
Chain           PRO_0000286182  187 – 1246   Collagen alpha-1(II) chain  -                  1060
Propeptide      PRO_0000286183  1247 – 1492  C-terminal propeptide       By similarity      246
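
The positions in the table above are 1-based and inclusive at both ends, which is easy to get wrong when slicing a sequence string. The following is a minimal Python sketch, not part of the entry itself, that recomputes the feature lengths and checks that the four segments tile the 1,492-residue precursor without gaps. The names `FEATURES`, `feature_length`, `extract` and `is_contiguous` are hypothetical helpers introduced only for illustration.

```python
# Feature coordinates copied from the Molecule processing table
# (1-based, inclusive on both ends).
FEATURES = [
    ("Signal peptide", 1, 26),
    ("N-terminal propeptide", 27, 186),
    ("Collagen alpha-1(II) chain", 187, 1246),
    ("C-terminal propeptide", 1247, 1492),
]


def feature_length(start, end):
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def extract(sequence, start, end):
    """Slice a feature out of a sequence string (0-based Python slicing)."""
    return sequence[start - 1:end]


def is_contiguous(features):
    """True if each feature starts right after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(features, features[1:]))


lengths = {name: feature_length(start, end) for name, start, end in FEATURES}

# First 30 residues of the displayed sequence; the signal peptide is 1-26.
n_terminus = "MFSFVDSRTLVLFAATQVILLAVVRCQDEE"
signal_peptide = extract(n_terminus, 1, 26)
```

The same `extract` helper could be applied to the full sequence to obtain the mature chain (187 – 1246), assuming the sequence has first been stripped of ruler numbers and spaces.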

Amino acid modifications

Feature key     Position(s)  Description              Evidence                    Length
Disulfide bond  1288 ↔ 1320  -                        PROSITE-ProRule annotation  -
Disulfide bond  1294         Interchain (with C-1311)  PROSITE-ProRule annotation  -
Disulfide bond  1311         Interchain (with C-1294)  PROSITE-ProRule annotation  -
Disulfide bond  1328 ↔ 1490  -                        PROSITE-ProRule annotation  -
Glycosylation   1393         N-linked (GlcNAc...)     Sequence analysis           1
Disulfide bond  1398 ↔ 1443  -                        PROSITE-ProRule annotation  -

Post-translational modification

Prolines at the third position of the tripeptide repeating unit (G-X-Y) are hydroxylated in some or all of the chains (By similarity).
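
To make the G-X-Y statement concrete, here is a small Python sketch that walks a stretch of the triple-helical region in steps of three and reports prolines in the Y (third) position. The fragment is residues 206 – 235 of the sequence shown in this entry; the function name `y_position_prolines` is hypothetical. The output lists only candidate positions: the entry does not say which of them are actually hydroxylated.

```python
def y_position_prolines(fragment, start):
    """Return 1-based positions of Pro found at the Y position of G-X-Y triplets.

    `fragment` must begin at a Gly that opens a triplet, and `start` is the
    1-based sequence position of that first residue.
    """
    positions = []
    for offset in range(0, len(fragment) - 2, 3):
        gly, _x, y = fragment[offset:offset + 3]
        if gly != "G":
            raise ValueError(f"triplet at position {start + offset} does not start with Gly")
        if y == "P":
            positions.append(start + offset + 2)
    return positions


# Residues 206-235, the start of the annotated triple-helical region.
helix_start = "GPMGPMGPRGPPGPTGAPGPQGFQGNPGEP"
candidates = y_position_prolines(helix_start, 206)
```

Raising on a non-Gly triplet start is a deliberate choice for this sketch: it flags a frame shift immediately instead of silently reporting prolines from the wrong position.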

Sites

Feature key  Position(s)  Description                               Evidence       Length
Site         186 – 187    Cleavage; by procollagen N-endopeptidase  By similarity  2
Site         1246 – 1247  Cleavage; by procollagen C-endopeptidase  By similarity  2

Keywords - PTM

Disulfide bond, Glycoprotein, Hydroxylation

Proteomic databases

PaxDb: Q6P4Z2.
PRIDE: Q6P4Z2.

Interaction

Subunit structure

Homotrimers of alpha 1(II) chains (By similarity).

Protein-protein interaction databases

STRING: 8364.ENSXETP00000043834.

Structure

3D structure databases

ProteinModelPortal: Q6P4Z2.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Description             Evidence                    Length
Domain       36 – 94      VWFC                    PROSITE-ProRule annotation  59
Domain       1258 – 1492  Fibrillar collagen NC1  PROSITE-ProRule annotation  235

Region

Feature key  Position(s)  Description                     Length
Region       206 – 1219   Triple-helical region           1014
Region       1220 – 1246  Nonhelical region (C-terminal)  27

Domain

The C-terminal propeptide, also known as the COLFI domain, has crucial roles in tissue growth and repair by controlling both the intracellular assembly of procollagen molecules and the extracellular assembly of collagen fibrils. It binds a calcium ion which is essential for its function (By similarity).

Sequence similarities

Belongs to the fibrillar collagen family (PROSITE-ProRule annotation).
Contains 1 fibrillar collagen NC1 domain (PROSITE-ProRule annotation).
Contains 1 VWFC domain (PROSITE-ProRule annotation).

Keywords - Domain

Collagen, Repeat, Signal

Phylogenomic databases

eggNOG: KOG3544. Eukaryota.
        ENOG410XNMM. LUCA.
HOVERGEN: HBG004933.
InParanoid: Q6P4Z2.
KO: K19719.

Family and domain databases

InterPro: IPR008160. Collagen.
          IPR000885. Fib_collagen_C.
          IPR001007. VWF_dom.
Pfam: PF01410. COLFI. 1 hit.
      PF01391. Collagen. 6 hits.
      PF00093. VWC. 1 hit.
ProDom: PD002078. Fib_collagen_C. 1 hit.
SMART: SM00038. COLFI. 1 hit.
       SM00214. VWC. 1 hit.
PROSITE: PS51461. NC1_FIB. 1 hit.
         PS01208. VWFC_1. 1 hit.
         PS50184. VWFC_2. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

Q6P4Z2-1 [UniParc]

        10         20         30         40         50
MFSFVDSRTL VLFAATQVIL LAVVRCQDEE DVLATGSCVQ HGQRYSDKDV
60 70 80 90 100
WKPEPCQICV CDTGNVLCDE IICEDPKDCP NAEIPFGECC PICPTEQSST
110 120 130 140 150
SSGQGVLKGQ KGEPGDIKDV VGPKGPPGPQ GPSGEQGPRG DRGDKGEKGA
160 170 180 190 200
PGPRGRDGEP GTPGNPGPVG PPGPPGPPGL GGNFAAQMTG GFDEKAGGAQ
210 220 230 240 250
MGVMQGPMGP MGPRGPPGPT GAPGPQGFQG NPGEPGEPGA GGPMGPRGPP
260 270 280 290 300
GPAGKPGDDG EAGKPGKSGE RGPPGPQGAR GFPGTPGLPG VKGHRGYPGL
310 320 330 340 350
DGSKGEAGAA GAKGEGGATG EAGSPGPMGP RGLPGERGRP GASGAAGARG
360 370 380 390 400
NDGLPGPAGP PGPVGPAGAP GFPGAPGSKG EAGPTGARGP EGAQGPRGES
410 420 430 440 450
GTPGSPGPAG ASGNPGTDGI PGAKGSSGAP GIAGAPGFPG PRGPPGPQGA
460 470 480 490 500
TGPLGPKGQT GDPGVAGFKG EHGPKGEIGS AGPQGAPGPA GEEGKRGARG
510 520 530 540 550
EPGAAGPLGP PGERGAPGNR GFPGQDGLAG PKGAPGERGV PGLGGPKGAN
560 570 580 590 600
GDPGRPGEPG LPGARGLTGR PGDAGPQGKV GPSGASGEDG RPGPPGPQGA
610 620 630 640 650
RGQPGVMGFP GPKGANGEPG KAGEKGLLGA PGLRGLPGKD GETGAQGPNG
660 670 680 690 700
PAGPAGERGE QGPPGPSGFQ GLPGPPGSPG EGGKPGDQGV PGEAGAPGLV
710 720 730 740 750
GPRGERGFPG ERGSSGPQGL QGPRGLPGTP GTDGPKGATG PSGPNGAQGP
760 770 780 790 800
PGLQGMPGER GAAGISGPKG DRGDTGEKGP EGAPGKDGSR GLTGPIGPPG
810 820 830 840 850
PSGPNGEKGE SGPSGPAGIV GARGAPGDRG ETGPPGPAGF AGPPGADGQA
860 870 880 890 900
GLKGDQGESG QKGDAGAPGP QGPSGAPGPQ GPTGVNGPKG ARGAQGPPGA
910 920 930 940 950
TGFPGAAGRV GPPGPNGNPG PSGAPGSAGK EGPKGARGDA GPTGRAGDPG
960 970 980 990 1000
LQGPAGVPGE KGESGEDGPS GPDGPPGPQG LSGQRGIVGL PGQRGERGFP
1010 1020 1030 1040 1050
GLPGPSGEPG KQGGPGSAGD RGPPGPVGPP GLTGPAGEPG REGNAGSDGP
1060 1070 1080 1090 1100
PGRDGATGIK GDRGETGPLG APGAPGAPGA PGPVGPTGKQ GDRGESGPQG
1110 1120 1130 1140 1150
PLGPSGPAGA RGLPGPQGPR GDKGEAGEAG ERGQKGHRGF TGLQGLPGPP
1160 1170 1180 1190 1200
GTAGDQGASG PAGPGGPRGP PGPVGPSGKD GSNGLPGPIG PPGPRGRGGE
1210 1220 1230 1240 1250
TGPAGPPGQP GPPGPPGPPG PGIDMSAFAG LSQPEKGPDP MRYMRADQAS
1260 1270 1280 1290 1300
SSVPQRDVDV EATLKSLNNQ IESIRSPDGT KKNPARTCRD LKLCHPEWKS
1310 1320 1330 1340 1350
GDYWIDPNQG CTVDAIKVFC NMETGETCVY PNPSKIPKKN WWSAKGKEKK
1360 1370 1380 1390 1400
HIWFGETING GFQFSYGDDS SAPNTANIQM TFLRLLSTDA TQNITYHCKN
1410 1420 1430 1440 1450
SIAFMDEASG NLKKAVLLQG SNDVEIRAEG NSRFTYNALE DGCKKHTGKW
1460 1470 1480 1490
SKTVIEYRTQ KTSRLPIVDI APMDIGGADQ EFGVDIGPVC FL
Length: 1,492
Mass (Da): 142,696
Last modified: July 5, 2004 - v1
Checksum: DB7AF42B94210EB7
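
The sequence above is displayed in blocks of ten residues with a ruler line of position numbers, so it has to be cleaned before it can be used programmatically. A minimal Python sketch of that clean-up follows; `parse_display_block` is a hypothetical helper, and it assumes ruler lines contain only digits and whitespace, as they do in this entry.

```python
def parse_display_block(text):
    """Join a ruler-annotated sequence display into one contiguous string."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and ruler lines made up only of position numbers.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)


# The last ruler/sequence pair of the display, covering positions 1451-1492.
tail = """\
1460 1470 1480 1490
SKTVIEYRTQ KTSRLPIVDI APMDIGGADQ EFGVDIGPVC FL
"""
tail_sequence = parse_display_block(tail)
```

Running the same function over the whole display should give a string whose length matches the stated length of 1,492, which is a quick check that no residues were lost in copying.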

Sequence databases

EMBL/GenBank/DDBJ: BC063191 mRNA. Translation: AAH63191.1.
RefSeq: NP_989220.1. NM_203889.1.
UniGene: Str.54515.

Genome annotation databases

GeneID: 394828.
KEGG: xtr:394828.

Cross-references

Organism-specific databases

CTD: 1280.
Xenbase: XB-GENE-6258353. col2a1.

Entry information

Entry name: CO2A1_XENTR
Accession: Primary (citable) accession number: Q6P4Z2
Entry history
Integrated into UniProtKB/Swiss-Prot: May 1, 2007
Last sequence update: July 5, 2004
Last modified: October 5, 2016
This is version 70 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
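
The UniRef90 and UniRef50 membership criteria described above can be expressed as a simple threshold check. The Python sketch below only encodes the two stated conditions (minimum identity to the seed and 80% overlap with it); it is not the clustering procedure UniRef actually runs, and the names `UNIREF_MIN_IDENTITY`, `MIN_OVERLAP` and `meets_threshold` are hypothetical.

```python
# Thresholds as stated in the UniRef descriptions, written as fractions.
UNIREF_MIN_IDENTITY = {"UniRef90": 0.90, "UniRef50": 0.50}
MIN_OVERLAP = 0.80


def meets_threshold(level, identity, overlap):
    """True if a sequence satisfies the stated criteria for joining a cluster.

    `identity` is the fractional sequence identity to the seed (longest)
    sequence, and `overlap` is the fraction of the seed that is covered.
    """
    return identity >= UNIREF_MIN_IDENTITY[level] and overlap >= MIN_OVERLAP
```

For example, a sequence with 93% identity but only 70% overlap with the seed would fail the UniRef90 check, because both conditions must hold at once.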