
P05539 (CO2A1_RAT) Reviewed, UniProtKB/Swiss-Prot

Last modified April 16, 2014. Version 109.

Names and origin

Protein names
Recommended name: Collagen alpha-1(II) chain
Alternative name(s): Alpha-1 type II collagen

Cleaved into the following 2 chains:

  1. Collagen alpha-1(II) chain
  2. Chondrocalcin

Gene names
Name: Col2a1

Organism: Rattus norvegicus (Rat) [Reference proteome]
Taxonomic identifier: 10116 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Rattus

Protein attributes

Sequence length: 1419 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level

General annotation (Comments)

Function

Type II collagen is specific for cartilaginous tissues. It is essential for the normal embryonic development of the skeleton, for linear growth and for the ability of cartilage to resist compressive forces.

Subunit structure

Homotrimers of alpha 1(II) chains.

Subcellular location

Secreted, extracellular space, extracellular matrix. By similarity.

Tissue specificity

Expressed in chondrocytes. Ref.5

Domain

The C-terminal propeptide, also known as the COLFI domain, has crucial roles in tissue growth and repair: it controls both the intracellular assembly of procollagen molecules and the extracellular assembly of collagen fibrils. It binds a calcium ion, which is essential for its function. By similarity.

Post-translational modification

Prolines at the third position of the tripeptide repeating unit (G-X-P) are hydroxylated in some or all of the chains. Probably 3-hydroxylated on Pro-602, Pro-839, Pro-1076, Pro-1133, Pro-1139 and Pro-1145 by LEPREL1.
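The tripeptide repeat described above can be checked directly against the displayed sequence. The following is a minimal sketch, not part of the entry: it takes residues 133-180 (the start of the annotated triple-helical region, copied from the Sequences section) and confirms that every third residue is glycine. The helper names `triplets` and `broken_repeats` are hypothetical.

```python
# Residues 133-180 of the displayed sequence (start of the triple-helical region).
FRAGMENT = "GPMGPMGPRGPPGPAGAPGPQGFQGNPGEPGEPGVSGPIGPRGPPGPA"
FRAGMENT_START = 133  # 1-based position of the first residue in FRAGMENT


def triplets(seq):
    """Split a sequence into consecutive, non-overlapping tripeptides."""
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing unit
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def broken_repeats(seq, start=1):
    """Return (position, triplet) pairs whose first residue is not glycine."""
    return [
        (start + 3 * i, unit)
        for i, unit in enumerate(triplets(seq))
        if unit[0] != "G"
    ]


units = triplets(FRAGMENT)
violations = broken_repeats(FRAGMENT, FRAGMENT_START)
# Triplets with proline in the third position are candidate hydroxylation sites.
third_position_pro = [unit for unit in units if unit[2] == "P"]

print(len(units), "triplets,", len(violations), "without a leading glycine")
print("Proline in third position:", third_position_pro)
```

This only tests the repeat pattern in a short excerpt; it says nothing about which prolines are actually hydroxylated, which the entry attributes to experimental evidence (Ref.5).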

Sequence similarities

Belongs to the fibrillar collagen family.

Contains 1 fibrillar collagen NC1 domain.

Sequence annotation (Features)

Feature key | Position(s) | Length | Description | Feature identifier

Molecule processing

Signal peptide | 1 – 25 | 25 | Potential
Propeptide | 26 – 113 | 88 | N-terminal propeptide. By similarity | PRO_0000005735
Chain | 114 – 1173 | 1060 | Collagen alpha-1(II) chain | PRO_0000005736
Chain | 1174 – 1419 | 246 | Chondrocalcin. By similarity | PRO_0000043407

Regions

Domain | 1185 – 1419 | 235 | Fibrillar collagen NC1
Region | 133 – 1146 | 1014 | Triple-helical region
Region | 1147 – 1173 | 27 | Nonhelical region (C-terminal)

Sites

Metal binding | 1233 | 1 | Calcium. By similarity
Metal binding | 1235 | 1 | Calcium. By similarity
Metal binding | 1236 | 1 | Calcium; via carbonyl oxygen. By similarity
Metal binding | 1238 | 1 | Calcium; via carbonyl oxygen. By similarity
Metal binding | 1241 | 1 | Calcium. By similarity
Site | 113 – 114 | 2 | Cleavage; by procollagen N-endopeptidase. By similarity
Site | 1173 – 1174 | 2 | Cleavage; by procollagen C-endopeptidase. By similarity

Amino acid modifications

Modified residue | 122 | 1 | 5-hydroxylysine. By similarity
Modified residue | 219 | 1 | 5-hydroxylysine. By similarity
Modified residue | 231 | 1 | 5-hydroxylysine. By similarity
Modified residue | 240 | 1 | 5-hydroxylysine. By similarity
Modified residue | 306 | 1 | 5-hydroxylysine. By similarity
Modified residue | 540 | 1 | 5-hydroxylysine. By similarity
Modified residue | 552 | 1 | 5-hydroxylysine. By similarity
Modified residue | 602 | 1 | 3-hydroxyproline; partial. Ref.5
Modified residue | 839 | 1 | 3-hydroxyproline; partial. Ref.5
Modified residue | 1076 | 1 | 3-hydroxyproline; partial. Ref.5
Modified residue | 1118 | 1 | 3-hydroxyproline. Ref.5
Modified residue | 1133 | 1 | 3-hydroxyproline; partial. Ref.5
Modified residue | 1139 | 1 | 3-hydroxyproline; partial. Ref.5
Modified residue | 1145 | 1 | 3-hydroxyproline; partial. Ref.5
Glycosylation | 122 | 1 | O-linked (Gal...). By similarity
Glycosylation | 219 | 1 | O-linked (Gal...). By similarity
Glycosylation | 231 | 1 | O-linked (Gal...). By similarity
Glycosylation | 240 | 1 | O-linked (Gal...). By similarity
Glycosylation | 306 | 1 | O-linked (Gal...). By similarity
Glycosylation | 540 | 1 | O-linked (Gal...). By similarity
Glycosylation | 552 | 1 | O-linked (Gal...). By similarity
Glycosylation | 1320 | 1 | N-linked (GlcNAc...). Potential
Disulfide bond | 1215 ↔ 1247 | | By similarity
Disulfide bond | 1221 | | Interchain (with C-1238). By similarity
Disulfide bond | 1238 | | Interchain (with C-1221). By similarity
Disulfide bond | 1255 ↔ 1417 | | By similarity
Disulfide bond | 1325 ↔ 1370 | | By similarity

Experimental info

Sequence conflict | 121 | 1 | E → Q in AAA40919. Ref.2
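The Length column in the feature table is inclusive: length = end − start + 1. As a rough consistency check (a sketch, not part of the entry), the snippet below recomputes the lengths of the range features and confirms that the signal peptide, propeptide and the two chains cover positions 1 to 1419 without gaps or overlaps. The names `feature_length` and `tiles` are hypothetical; the coordinates are copied from the table.

```python
SEQUENCE_LENGTH = 1419

# (feature key, start, end, length as listed in the entry)
FEATURES = [
    ("Signal peptide", 1, 25, 25),
    ("Propeptide", 26, 113, 88),
    ("Chain", 114, 1173, 1060),
    ("Chain", 1174, 1419, 246),
    ("Domain", 1185, 1419, 235),
    ("Region", 133, 1146, 1014),
    ("Region", 1147, 1173, 27),
]


def feature_length(start, end):
    """Inclusive length of a 1-based position range."""
    return end - start + 1


def tiles(segments, total):
    """True if the (start, end) segments cover 1..total with no gap or overlap."""
    expected_start = 1
    for start, end in sorted(segments):
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == total + 1


# Features whose listed length disagrees with the computed one.
mismatches = [f for f in FEATURES if feature_length(f[1], f[2]) != f[3]]

# Molecule-processing features should partition the precursor sequence.
processing = [
    (start, end)
    for key, start, end, _ in FEATURES
    if key in ("Signal peptide", "Propeptide", "Chain")
]

print("length mismatches:", mismatches)
print("processing features tile the sequence:", tiles(processing, SEQUENCE_LENGTH))
```

The same check is a cheap way to confirm that fused position and length digits in a scraped copy of the table have been split correctly.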

Sequences

Sequence: P05539 [UniParc]
Length: 1,419 AA
Mass: 134,570 Da
Last modified December 6, 2005. Version 2.
Checksum: B7C63B77819CE50B

        10         20         30         40         50         60 
MIRLGAPQSL VLLTLLIATV LQCQGQDARK LGPKGQKGEP GDIKDIIGPK GPPGPQGPAG 

        70         80         90        100        110        120 
EQGPRGDRGD KGERGAPGPR GRDGEPGTPG NPGPPGPPGP PGPPGLGGGN FAAQMAGGFD 

       130        140        150        160        170        180 
EKAGGAQMGV MQGPMGPMGP RGPPGPAGAP GPQGFQGNPG EPGEPGVSGP IGPRGPPGPA 

       190        200        210        220        230        240 
GKPGDDGEAG KPGKAGERGL PGPQGARGFP GTPGLPGVKG HRGYPGLDGA KGEAGAPGVK 

       250        260        270        280        290        300 
GESGSPGENG SPGPMGPRGL PGERGRTGPA GAAGARGNDG QPGPAGPPGP VGPAGGPGFL 

       310        320        330        340        350        360 
GAPGAKGEAG PTGARGPEGA QGSRGEPGNP GSPGPAGASG NPGTDGIPGA KGSAGAPGIA 

       370        380        390        400        410        420 
GAPGFPGPRG PPGPQGATGP LGPKGQTGEP GIAGFKGEQG PKGETGPAGP QGAPGPAGEE 

       430        440        450        460        470        480 
GKRGARGEPG GAGPIGPPGE RGAPGNRGFP GQDGLAGPKG APGERGPSGL AGPKGANGDP 

       490        500        510        520        530        540 
GRPGEPGLPG ARGLTGRPGD AGPQGKVGPS GAPGEDGRPG PPGPQGARGQ PGVMGFPGPK 

       550        560        570        580        590        600 
GANGEPGKAG EKGLAGAPGL RGLPGKDGET GAAGPPGPSG PAGERGEQGA PGPSGFQGLP 

       610        620        630        640        650        660 
GPPGPPGEGG KQGDQGIPGE AGAPGLVGPR GERGFPGERG SPGAQGLQGP RGLPGTPGTD 

       670        680        690        700        710        720 
GPKGAAGPDG PPGAQGPPGL QGMPGERGAA GIAGPKGDRG DVGEKGPEGA PGKDGGRGLT 

       730        740        750        760        770        780 
GPIGPPGPAG ANGEKGEVGP PGPSGSTGAR GAPGERGETG PPGPAGFAGP PGADGQPGAK 

       790        800        810        820        830        840 
GDQGEAGQKG DAGAPGPQGP SGAPGPQGPT GVTGPKGARG AQGPPGATGF PGAAGRVGPP 

       850        860        870        880        890        900 
GSNGNPGPAG PPGPAGKDGP KGARGDTGAP GRAGDPGLQG PAGAPGEKGE PGDDGPSGSD 

       910        920        930        940        950        960 
GPPGPQGLAG QRGIVGLPGQ RGERGFPGLP GPSGEPGKQG APGASGDRGP PGPVGPPGLT 

       970        980        990       1000       1010       1020 
GPAGEPGREG SPGADGPPGR DGAAGVKGDR GETGALGAPG APGPPGSPGP AGPTGKQGDR 

      1030       1040       1050       1060       1070       1080 
GEAGAQGPMG PSGPAGARGI AGPQGPRGDK GEAGEPGERG LKGHRGFTGL QGLPGPPGPS 

      1090       1100       1110       1120       1130       1140 
GDQGTSGPAG PSGPRGPPGP VGPSGKDGSN GIPGPIGPPG PRGRSGETGP AGPPGNPGPP 

      1150       1160       1170       1180       1190       1200 
GPPGPPGPGI DMSAFAGLGQ REKGPDPLQY MRADEADSTL RQHDVEVDAT LKSLNNQIES 

      1210       1220       1230       1240       1250       1260 
IRSPDGSRKN PARTCQDLKL CHPEWKSGDY WIDPNQGCTL DAMKVFCNME TGESCVYPNP 

      1270       1280       1290       1300       1310       1320 
ATVPRKNWWS SKSKEKKHIW FGETMNGGFH FSYGDGNLAP NTANVQMTFL RLLSTEGSQN 

      1330       1340       1350       1360       1370       1380 
ITYHCKNSIA YLDEAAGNLK KALLIQGSND VEMRAEGNSR FTYTALKDGC TKHTGKWGKT 

      1390       1400       1410 
IIEYRSQKTS RLPIVDIAPM DIGGPDQEFG VDIGPVCFL 
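The display above interleaves ruler lines (position numbers) with sequence lines in blocks of ten residues. A minimal sketch of how this layout could be turned back into a plain string, and how 1-based feature coordinates map onto it, is shown below. The function names are hypothetical, and only the first 120 residues are included to keep the example short.

```python
def parse_numbered_blocks(text):
    """Join the residue blocks of a ruler-annotated sequence display."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Ruler lines hold only numbers; sequence lines hold only letters.
        if tokens and all(token.isalpha() for token in tokens):
            residues.extend(tokens)
    return "".join(residues)


def feature(seq, start, end):
    """Return the residues for an inclusive, 1-based position range."""
    return seq[start - 1:end]


# First two rows of the displayed sequence (positions 1-120).
EXCERPT = """
        10         20         30         40         50         60
MIRLGAPQSL VLLTLLIATV LQCQGQDARK LGPKGQKGEP GDIKDIIGPK GPPGPQGPAG

        70         80         90        100        110        120
EQGPRGDRGD KGERGAPGPR GRDGEPGTPG NPGPPGPPGP PGPPGLGGGN FAAQMAGGFD
"""

seq = parse_numbered_blocks(EXCERPT)
signal_peptide = feature(seq, 1, 25)       # annotated signal peptide
n_cleavage_site = feature(seq, 113, 114)   # procollagen N-endopeptidase site

print(len(seq), signal_peptide, n_cleavage_site)
```

Applied to the full display, the parsed string should have 1419 residues, matching the sequence length given in the entry.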


References

[1] "Complete rat type II collagen cDNA sequence."
Urabe K., Sarkar G., Bolander M.E.
Submitted (OCT-1995) to the EMBL/GenBank/DDBJ databases
Cited for: NUCLEOTIDE SEQUENCE [MRNA].
Tissue: Bone.

[2] "Isolation and characterization of a cDNA clone for the amino-terminal portion of the pro-alpha 1(II) chain of cartilage collagen."
Kohno K., Martin G.R., Yamada Y.
J. Biol. Chem. 259:13668-13673(1984)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 1-122.

[3] "Structure of the promoter of the rat type II procollagen gene."
Kohno K., Sullivan M., Yamada Y.
J. Biol. Chem. 260:4441-4447(1985)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 1-29.

[4] "T-cell recognition of carbohydrates on type II collagen."
Michaelson E., Malmstrom V., Reis S., Engstrom A., Burkhardt H., Holmdahl R.
J. Exp. Med. 180:745-749(1994)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 370-422.
Strain: DA.
Tissue: Cartilage.

[5] "A role for prolyl 3-hydroxylase 2 in post-translational modification of fibril-forming collagens."
Fernandes R.J., Farnand A.W., Traeger G.R., Weis M.A., Eyre D.R.
J. Biol. Chem. 286:30662-30669(2011)
Cited for: HYDROXYLATION AT PRO-602; PRO-839; PRO-1076; PRO-1118; PRO-1133; PRO-1139 AND PRO-1145, TISSUE SPECIFICITY, IDENTIFICATION BY MASS SPECTROMETRY.

Cross-references

Sequence databases

EMBL/GenBank/DDBJ:
  L48440 mRNA. Translation: AAA79780.1.
  K02804 mRNA. Translation: AAA40919.1.
  M10613 Genomic DNA. Translation: AAA40920.1.
  X79816 mRNA. Translation: CAA56213.1.
PIR: A05152. I60384.
RefSeq: NP_037061.1. NM_012929.1.
UniGene: Rn.10124.

3D structure databases

ModBase: Search...
MobiDB: Search...

Protein-protein interaction databases

IntAct: P05539. 1 interaction.

Proteomic databases

PaxDb: P05539.
PRIDE: P05539.

Protocols and materials databases

StructuralBiologyKnowledgebase: Search...

Genome annotation databases

GeneID: 25412.
KEGG: rno:25412.
UCSC: RGD:2375. rat.

Organism-specific databases

CTD: 1280.
RGD: 2375. Col2a1.

Phylogenomic databases

eggNOG: NOG12793.
HOGENOM: HOG000085654.
HOVERGEN: HBG004933.
KO: K06236.
PhylomeDB: P05539.

Gene expression databases

Genevestigator: P05539.

Family and domain databases

InterPro:
  IPR008160. Collagen.
  IPR000885. Fib_collagen_C.
Pfam:
  PF01410. COLFI. 1 hit.
  PF01391. Collagen. 4 hits.
ProDom: PD002078. Fib_collagen_C. 1 hit.
SMART: SM00038. COLFI. 1 hit.
PROSITE: PS51461. NC1_FIB. 1 hit.
ProtoNet: Search...

Other

NextBio: 606543.
PRO: P05539.

Entry information

Entry name: CO2A1_RAT
Accession:
  Primary (citable) accession number: P05539
  Secondary accession number(s): Q63123, Q63565, Q78DY3
Entry history:
  Integrated into UniProtKB/Swiss-Prot: November 1, 1988
  Last sequence update: December 6, 2005
  Last modified: April 16, 2014
  This is version 109 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Relevant documents

SIMILARITY comments

Index of protein domains and families