
P02459 (CO2A1_BOVIN) Reviewed, UniProtKB/Swiss-Prot

Last modified April 16, 2014. Version 122.


Names and origin

Protein names
Recommended name: Collagen alpha-1(II) chain
Alternative name(s): Alpha-1 type II collagen
Gene names
Name: COL2A1
Organism: Bos taurus (Bovine) [Reference proteome]
Taxonomic identifier: 9913 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Laurasiatheria > Cetartiodactyla > Ruminantia > Pecora > Bovidae > Bovinae > Bos

Protein attributes

Sequence length: 1487 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level

General annotation (Comments)

Function

Type II collagen is specific for cartilaginous tissues. It is essential for the normal embryonic development of the skeleton, for linear growth and for the ability of cartilage to resist compressive forces.

Subunit structure

Homotrimers of alpha 1(II) chains.

Subcellular location

Secreted > extracellular space > extracellular matrix. By similarity.

Domain

The C-terminal propeptide, also known as the COLFI domain, has crucial roles in tissue growth and repair: it controls both the intracellular assembly of procollagen molecules and the extracellular assembly of collagen fibrils. It binds a calcium ion, which is essential for its function. By similarity.

Post-translational modification

Probably 3-hydroxylated on prolines by LEPREL1. By similarity. Proline residues at the third position of the tripeptide repeating unit (G-X-P) are hydroxylated in some or all of the chains. Proline residues at the second position of the tripeptide repeating unit (G-P-X) are hydroxylated in some of the chains.

O-linked glycans consist of Glc-Gal disaccharides bound to the oxygen atom of post-translationally added hydroxyl groups.
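
The second and third positions of the tripeptide repeat can be read directly off the sequence once the frame is fixed on a glycine. The following is a minimal Python sketch, assuming residues 201-230 of the displayed sequence (the start of the triple-helical region) as input; the names `SEGMENT`, `SEGMENT_START` and `classify_prolines` are illustrative and not part of any UniProt tool.

```python
# Residues 201-230 of the displayed sequence, copied from the entry.
SEGMENT = "GPMGPMGPRGPPGPAGAPGPQGFQGNPGEP"
SEGMENT_START = 201  # 1-based position of SEGMENT[0] in the full sequence


def classify_prolines(segment: str, start: int):
    """Return (second-position, third-position) proline positions.

    Assumes the segment begins on the Gly of a tripeptide repeat.
    """
    second, third = [], []
    for offset, residue in enumerate(segment):
        frame = offset % 3  # 0 = Gly, 1 = second position, 2 = third position
        if frame == 0 and residue != "G":
            raise ValueError(f"expected Gly at position {start + offset}")
        if residue == "P":
            (second if frame == 1 else third).append(start + offset)
    return second, third


second_pos, third_pos = classify_prolines(SEGMENT, SEGMENT_START)
print("second position (G-P-X):", second_pos)
print("third position (G-X-P):", third_pos)
```

Three of the third-position prolines found this way (212, 218 and 230) are listed as hydroxyproline in the feature table of this entry. The frame position alone does not show that a residue is modified; it only identifies candidates.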

Sequence similarities

Belongs to the fibrillar collagen family.

Contains 1 fibrillar collagen NC1 domain.

Contains 1 VWFC domain.

Ontologies

Keywords
   Cellular component: Extracellular matrix; Secreted
   Coding sequence diversity: Polymorphism
   Domain: Collagen; Repeat; Signal
   Ligand: Calcium; Metal-binding
   PTM: Disulfide bond; Glycoprotein; Hydroxylation
   Technical term: Complete proteome; Direct protein sequencing; Reference proteome
Gene Ontology (GO)
   Biological_process

cartilage condensation

Inferred from electronic annotation. Source: Ensembl

cartilage development involved in endochondral bone morphogenesis

Inferred from electronic annotation. Source: Ensembl

cellular response to BMP stimulus

Inferred from electronic annotation. Source: Ensembl

central nervous system development

Inferred from electronic annotation. Source: Ensembl

chondrocyte differentiation

Inferred from electronic annotation. Source: Ensembl

collagen fibril organization

Inferred from electronic annotation. Source: Ensembl

embryonic skeletal joint morphogenesis

Inferred from electronic annotation. Source: Ensembl

endochondral ossification

Inferred from electronic annotation. Source: Ensembl

heart morphogenesis

Inferred from electronic annotation. Source: Ensembl

inner ear morphogenesis

Inferred from electronic annotation. Source: Ensembl

limb bud formation

Inferred from electronic annotation. Source: Ensembl

negative regulation of extrinsic apoptotic signaling pathway in absence of ligand

Inferred from electronic annotation. Source: Ensembl

notochord development

Inferred from electronic annotation. Source: Ensembl

otic vesicle development

Inferred from electronic annotation. Source: Ensembl

palate development

Inferred from electronic annotation. Source: Ensembl

proteoglycan metabolic process

Inferred from electronic annotation. Source: Ensembl

regulation of gene expression

Inferred from electronic annotation. Source: Ensembl

sensory perception of sound

Inferred from electronic annotation. Source: Ensembl

tissue homeostasis

Inferred from electronic annotation. Source: Ensembl

visual perception

Inferred from electronic annotation. Source: Ensembl

   Cellular_component

basement membrane

Inferred from electronic annotation. Source: Ensembl

collagen type II

Inferred from electronic annotation. Source: Ensembl

cytoplasm

Inferred from electronic annotation. Source: Ensembl

extracellular region

Traceable author statement. Source: Reactome

extracellular space

Inferred from electronic annotation. Source: Ensembl

   Molecular_function

extracellular matrix structural constituent

Inferred from electronic annotation. Source: InterPro

metal ion binding

Inferred from electronic annotation. Source: UniProtKB-KW


Sequence annotation (Features)

Feature key | Position(s) | Length | Description | Feature identifier

Molecule processing

Signal peptide | 1 – 25 | 25 | Potential
Propeptide | 26 – 181 | 156 | N-terminal propeptide By similarity | PRO_0000401210
Chain | 182 – 1487 | 1306 | Collagen alpha-1(II) chain | PRO_0000005725
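
The length of each feature follows from its inclusive, 1-based start and end positions. A minimal Python sketch, assuming the three molecule-processing features listed above; `feature_length` and `FEATURES` are illustrative names, not part of UniProt.

```python
def feature_length(start: int, end: int) -> int:
    """Length of a feature given inclusive 1-based start and end positions."""
    if start < 1 or end < start:
        raise ValueError("positions must satisfy 1 <= start <= end")
    return end - start + 1


# (feature key, start, end) as listed under "Molecule processing".
FEATURES = [
    ("Signal peptide", 1, 25),
    ("Propeptide", 26, 181),
    ("Chain", 182, 1487),
]

lengths = {key: feature_length(start, end) for key, start, end in FEATURES}
total = sum(lengths.values())
print(lengths, total)
```

The three features tile the precursor without gaps or overlaps, so their lengths sum to the stated sequence length of 1487 residues.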

Regions

Domain | 32 – 90 | 59 | VWFC
Domain | 1253 – 1487 | 235 | Fibrillar collagen NC1
Region | 201 – 1214 | 1014 | Triple-helical region By similarity
Region | 1215 – 1241 | 27 | Nonhelical region (C-terminal) By similarity

Sites

Metal binding | 1301 | 1 | Calcium By similarity
Metal binding | 1303 | 1 | Calcium By similarity
Metal binding | 1304 | 1 | Calcium; via carbonyl oxygen By similarity
Metal binding | 1306 | 1 | Calcium; via carbonyl oxygen By similarity
Metal binding | 1309 | 1 | Calcium By similarity
Site | 181 – 182 | 2 | Cleavage; by procollagen N-endopeptidase By similarity
Site | 1241 – 1242 | 2 | Cleavage; by procollagen C-endopeptidase By similarity

Amino acid modifications

Modified residue | 190 | 1 | 5-hydroxylysine By similarity
Modified residue | 212 | 1 | Hydroxyproline
Modified residue | 218 | 1 | Hydroxyproline
Modified residue | 230 | 1 | Hydroxyproline
Modified residue | 233 | 1 | Hydroxyproline
Modified residue | 245 | 1 | Hydroxyproline
Modified residue | 248 | 1 | Hydroxyproline
Modified residue | 251 | 1 | Hydroxyproline
Modified residue | 260 | 1 | Hydroxyproline
Modified residue | 269 | 1 | Hydroxyproline
Modified residue | 278 | 1 | Hydroxyproline
Modified residue | 281 | 1 | Hydroxyproline
Modified residue | 284 | 1 | Hydroxyproline
Modified residue | 287 | 1 | 5-hydroxylysine Ref.2
Modified residue | 293 | 1 | Hydroxyproline
Modified residue | 299 | 1 | 5-hydroxylysine Ref.2
Modified residue | 305 | 1 | Hydroxyproline
Modified residue | 308 | 1 | 5-hydroxylysine Ref.2
Modified residue | 314 | 1 | Hydroxyproline
Modified residue | 320 | 1 | Hydroxyproline
Modified residue | 329 | 1 | Hydroxyproline
Modified residue | 350 | 1 | Hydroxyproline
Modified residue | 356 | 1 | Hydroxyproline
Modified residue | 365 | 1 | Hydroxyproline
Modified residue | 368 | 1 | Hydroxyproline
Modified residue | 371 | 1 | Hydroxyproline
Modified residue | 374 | 1 | 5-hydroxylysine Ref.4
Modified residue | 395 | 1 | Hydroxyproline
Modified residue | 398 | 1 | Hydroxyproline
Modified residue | 401 | 1 | Hydroxyproline
Modified residue | 410 | 1 | Hydroxyproline
Modified residue | 416 | 1 | Hydroxyproline
Modified residue | 419 | 1 | 5-hydroxylysine Ref.4
Modified residue | 425 | 1 | Hydroxyproline
Modified residue | 431 | 1 | Hydroxyproline
Modified residue | 434 | 1 | Hydroxyproline
Modified residue | 440 | 1 | Hydroxyproline
Modified residue | 452 | 1 | 5-hydroxylysine Ref.4
Modified residue | 458 | 1 | Hydroxyproline
Modified residue | 464 | 1 | 5-hydroxylysine Ref.4
Modified residue | 470 | 1 | 5-hydroxylysine Ref.4
Modified residue | 473 | 1 | Hydroxyproline
Modified residue | 482 | 1 | Hydroxyproline
Modified residue | 497 | 1 | Hydroxyproline
Modified residue | 506 | 1 | Hydroxyproline
Modified residue | 512 | 1 | Hydroxyproline
Modified residue | 518 | 1 | Hydroxyproline
Modified residue | 527 | 1 | 5-hydroxylysine Ref.4
Modified residue | 530 | 1 | Hydroxyproline
Modified residue | 542 | 1 | 5-hydroxylysine Ref.4
Modified residue | 551 | 1 | Hydroxyproline
Modified residue | 557 | 1 | Hydroxyproline
Modified residue | 566 | 1 | Hydroxyproline
Modified residue | 581 | 1 | Hydroxyproline
Modified residue | 587 | 1 | Hydroxyproline
Modified residue | 590 | 1 | Hydroxyproline
Modified residue | 599 | 1 | Hydroxyproline
Modified residue | 605 | 1 | Hydroxyproline
Modified residue | 608 | 1 | 5-hydroxylysine Ref.5
Modified residue | 614 | 1 | Hydroxyproline
Modified residue | 620 | 1 | 5-hydroxylysine Ref.5
Modified residue | 623 | 1 | Hydroxyproline
Modified residue | 626 | 1 | Hydroxyproline
Modified residue | 632 | 1 | Hydroxyproline
Modified residue | 644 | 1 | Hydroxyproline
Modified residue | 659 | 1 | Hydroxyproline
Modified residue | 668 | 1 | Hydroxyproline
Modified residue | 670 | 1 | 3-hydroxyproline By similarity
Modified residue | 671 | 1 | Hydroxyproline
Modified residue | 674 | 1 | Hydroxyproline
Modified residue | 907 | 1 | 3-hydroxyproline By similarity
Modified residue | 1130 | 1 | 5-hydroxylysine By similarity
Modified residue | 1144 | 1 | 3-hydroxyproline By similarity
Modified residue | 1186 | 1 | 3-hydroxyproline By similarity
Modified residue | 1201 | 1 | 3-hydroxyproline By similarity
Modified residue | 1207 | 1 | 3-hydroxyproline By similarity
Modified residue | 1213 | 1 | 3-hydroxyproline By similarity
Glycosylation | 190 | 1 | O-linked (Gal...) By similarity
Glycosylation | 287 | 1 | O-linked (Gal...) Ref.2
Glycosylation | 299 | 1 | O-linked (Gal...) Ref.2
Glycosylation | 308 | 1 | O-linked (Gal...) Ref.2
Glycosylation | 374 | 1 | O-linked (Gal...) By similarity
Glycosylation | 608 | 1 | O-linked (Gal...) Ref.5
Glycosylation | 620 | 1 | O-linked (Gal...) Ref.5
Glycosylation | 1130 | 1 | O-linked (Gal...) By similarity
Glycosylation | 1388 | 1 | N-linked (GlcNAc...) Potential
Disulfide bond | 1283 ↔ 1315 | | By similarity
Disulfide bond | 1289 | | Interchain (with C-1306) By similarity
Disulfide bond | 1306 | | Interchain (with C-1289) By similarity
Disulfide bond | 1323 ↔ 1485 | | By similarity
Disulfide bond | 1393 ↔ 1438 | | By similarity
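
A quick consistency check on position-based annotations is to confirm that each annotated residue is of the expected type. The sketch below does this in Python for four of the disulfide-bond cysteines, assuming residues 1281-1320 of the displayed sequence as input; `SEGMENT`, `SEGMENT_START` and `residue_at` are illustrative names.

```python
# Residues 1281-1320 of the displayed sequence, copied from the entry.
SEGMENT = "RTCRDLKLCHPEWKSGDYWIDPNQGCTLDAMKVFCNMETG"
SEGMENT_START = 1281  # 1-based position of SEGMENT[0] in the full sequence


def residue_at(position: int) -> str:
    """Return the residue at a 1-based position of the full sequence."""
    index = position - SEGMENT_START
    if not 0 <= index < len(SEGMENT):
        raise IndexError(f"position {position} is outside the segment")
    return SEGMENT[index]


# Disulfide-bond positions from the feature table that fall in this segment.
DISULFIDE_POSITIONS = [1283, 1289, 1306, 1315]

non_cys = [p for p in DISULFIDE_POSITIONS if residue_at(p) != "C"]
print("positions that are not Cys:", non_cys)
```

The same lookup works for any single-residue feature, provided the segment offset is handled as 1-based, as in the entry.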

Natural variations

Natural variant | 349 | 1 | Q → L. Ref.3
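
A single-residue variant can be applied to the reference sequence by checking the reference residue first and then substituting. A minimal Python sketch, assuming residues 341-350 of the displayed sequence; `apply_substitution` is a hypothetical helper, not a UniProt function.

```python
# Residues 341-350 of the displayed sequence, copied from the entry.
SEGMENT = "AGARGNDGQP"
SEGMENT_START = 341  # 1-based position of SEGMENT[0] in the full sequence


def apply_substitution(segment: str, start: int, position: int,
                       ref: str, alt: str) -> str:
    """Replace one residue, verifying the reference residue beforehand."""
    index = position - start
    if not 0 <= index < len(segment):
        raise IndexError(f"position {position} is outside the segment")
    if segment[index] != ref:
        raise ValueError(
            f"expected {ref} at {position}, found {segment[index]}"
        )
    return segment[:index] + alt + segment[index + 1:]


# Natural variant at position 349: Q -> L.
variant = apply_substitution(SEGMENT, SEGMENT_START, 349, "Q", "L")
print(variant)
```

Checking the reference residue first guards against off-by-one errors when converting between 1-based positions and 0-based string indices.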

Experimental info

Sequence conflict | 202 | 1 | P → V in AA sequence. Ref.2
Sequence conflict | 380 | 1 | T → Q in AA sequence. Ref.4
Sequence conflict | 400 | 1 | S → A in AA sequence. Ref.4
Sequence conflict | 412 | 1 | T → A in AA sequence. Ref.4
Sequence conflict | 436 | 1 | P → A in AA sequence. Ref.4
Sequence conflict | 443 | 1 | Q → T in AA sequence. Ref.4
Sequence conflict | 446 | 1 | T → S in AA sequence. Ref.4
Sequence conflict | 476 | 1 | A → T in AAA30436. Ref.6
Sequence conflict | 478 | 1 | P → V in AA sequence. Ref.4
Sequence conflict | 514 | 1 | N → S in AA sequence. Ref.4
Sequence conflict | 518 | 1 | P → S in AAA30436. Ref.6
Sequence conflict | 523 | 1 | L → I in AA sequence. Ref.4
Sequence conflict | 529 | 1 | A → P in AA sequence. Ref.4
Sequence conflict | 535 – 539 | 5 | PSGLA → SPGAV in AA sequence. Ref.4
Sequence conflict | 544 – 548 | 5 | ANGDP → SPGEA in AA sequence. Ref.4
Sequence conflict | 554 | 1 | P → A in AA sequence. Ref.4
Sequence conflict | 560 | 1 | R → K in AA sequence. Ref.4
Sequence conflict | 677 | 1 | G → P in AA sequence. Ref.5
Sequence conflict | 712 | 1 | S → A in AAD42347. Ref.7
Sequence conflict | 718 | 1 | A → P in AAD42347. Ref.7
Sequence conflict | 734 | 1 | A → S in AAD42347. Ref.7
Sequence conflict | 1372 | 1 | N → D in CAA26269. Ref.8

Sequences

P02459 [UniParc].

Last modified November 30, 2010. Version 4.
Checksum: F99891F6FD1E47F9

Length: 1,487 AA. Mass: 141,828 Da.
        10         20         30         40         50         60 
MIRLGAPQTL VLLTLLVAAV LRCHGQDVQK AGSCVQDGQR YNDKDVWKPE PCRICVCDTG 

        70         80         90        100        110        120 
TVLCDDIICE DMKDCLSPET PFGECCPICS ADLPTASGQP GPKGQKGEPG DIKDIVGPKG 

       130        140        150        160        170        180 
PPGPQGPAGE QGPRGDRGDK GEKGAPGPRG RDGEPGTPGN PGPPGPPGPP GPPGLGGNFA 

       190        200        210        220        230        240 
AQMAGGFDEK AGGAQMGVMQ GPMGPMGPRG PPGPAGAPGP QGFQGNPGEP GEPGVSGPMG 

       250        260        270        280        290        300 
PRGPPGPPGK PGDDGEAGKP GKSGERGPPG PQGARGFPGT PGLPGVKGHR GYPGLDGAKG 

       310        320        330        340        350        360 
EAGAPGVKGE SGSPGENGSP GPMGPRGLPG ERGRTGPAGA AGARGNDGQP GPAGPPGPVG 

       370        380        390        400        410        420 
PAGGPGFPGA PGAKGEAGPT GARGPEGAQG PRGEPGTPGS PGPAGAAGNP GTDGIPGAKG 

       430        440        450        460        470        480 
SAGAPGIAGA PGFPGPRGPP GPQGATGPLG PKGQTGEPGI AGFKGEQGPK GEPGPAGPQG 

       490        500        510        520        530        540 
APGPAGEEGK RGARGEPGGA GPAGPPGERG APGNRGFPGQ DGLAGPKGAP GERGPSGLAG 

       550        560        570        580        590        600 
PKGANGDPGR PGEPGLPGAR GLTGRPGDAG PQGKVGPSGA PGEDGRPGPP GPQGARGQPG 

       610        620        630        640        650        660 
VMGFPGPKGA NGEPGKAGEK GLPGAPGLRG LPGKDGETGA AGPPGPAGPA GERGEQGAPG 

       670        680        690        700        710        720 
PSGFQGLPGP PGPPGEGGKP GDQGVPGEAG APGLVGPRGE RGFPGERGSP GSQGLQGARG 

       730        740        750        760        770        780 
LPGTPGTDGP KGAAGPAGPP GAQGPPGLQG MPGERGAAGI AGPKGDRGDV GEKGPEGAPG 

       790        800        810        820        830        840 
KDGGRGLTGP IGPPGPAGAN GEKGEVGPPG PAGTAGARGA PGERGETGPP GPAGFAGPPG 

       850        860        870        880        890        900 
ADGQPGAKGE QGEAGQKGDA GAPGPQGPSG APGPQGPTGV TGPKGARGAQ GPPGATGFPG 

       910        920        930        940        950        960 
AAGRVGPPGS NGNPGPPGPP GPSGKDGPKG ARGDSGPPGR AGDPGLQGPA GPPGEKGEPG 

       970        980        990       1000       1010       1020 
DDGPSGPDGP PGPQGLAGQR GIVGLPGQRG ERGFPGLPGP SGEPGKQGAP GASGDRGPPG 

      1030       1040       1050       1060       1070       1080 
PVGPPGLTGP AGEPGREGSP GADGPPGRDG AAGVKGDRGE TGAVGAPGAP GPPGSPGPAG 

      1090       1100       1110       1120       1130       1140 
PIGKQGDRGE AGAQGPMGPA GPAGARGMPG PQGPRGDKGE TGEAGERGLK GHRGFTGLQG 

      1150       1160       1170       1180       1190       1200 
LPGPPGPSGD QGASGPAGPS GPRGPPGPVG PSGKDGANGI PGPIGPPGPR GRSGETGPAG 

      1210       1220       1230       1240       1250       1260 
PPGNPGPPGP PGPPGPGIDM SAFAGLGQRE KGPDPLQYMR ADEAAGNLRQ HDAEVDATLK 

      1270       1280       1290       1300       1310       1320 
SLNNQIESLR SPEGSRKNPA RTCRDLKLCH PEWKSGDYWI DPNQGCTLDA MKVFCNMETG 

      1330       1340       1350       1360       1370       1380 
ETCVYPNPAS VPKKNWWSSK SKDKKHIWFG ETINGGFHFS YGDDNLAPNT ANVQMTFLRL 

      1390       1400       1410       1420       1430       1440 
LSTEGSQNIT YHCKNSIAYL DEAAGNLKKA LLIQGSNDVE IRAEGNSRFT YTVLKDGCTK 

      1450       1460       1470       1480 
HTGKWGKTMI EYRSQKTSRL PIIDIAPMDI GGPEQEFGVD IGPVCFL 
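
The display format above interleaves position rulers with residues in groups of ten. A minimal Python sketch of turning such a block back into a plain sequence string, using only the first two rows of this entry as sample input; `parse_sequence_block` is an illustrative name.

```python
# First two rows of the displayed sequence, copied from the entry.
BLOCK = """\
        10         20         30         40         50         60
MIRLGAPQTL VLLTLLVAAV LRCHGQDVQK AGSCVQDGQR YNDKDVWKPE PCRICVCDTG

        70         80         90        100        110        120
TVLCDDIICE DMKDCLSPET PFGECCPICS ADLPTASGQP GPKGQKGEPG DIKDIVGPKG
"""


def parse_sequence_block(text: str) -> str:
    """Strip ruler lines and spacing from a UniProt-style sequence display."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and ruler lines made only of position numbers.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.extend(tokens)
    return "".join(residues)


sequence = parse_sequence_block(BLOCK)
signal_peptide = sequence[0:25]  # positions 1-25, annotated as signal peptide
print(len(sequence), signal_peptide)
```

Applied to the whole block, the same function should give a string of 1,487 residues, matching the stated sequence length.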


References

[1] "The genome sequence of taurine cattle: a window to ruminant biology and evolution."
The bovine genome sequencing and analysis consortium
Science 324:522-528(2009)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
Strain: Hereford.
[2] "The covalent structure of cartilage collagen. Amino acid sequence of the NH2-terminal helical portion of the alpha 1 (II) chain."
Butler W.T., Miller E.J., Finch J.E. Jr.
Biochemistry 15:3000-3006(1976)
Cited for: PROTEIN SEQUENCE OF 201-362, HYDROXYLATION AT PRO-212; PRO-218; PRO-230; PRO-233; PRO-245; PRO-248; PRO-251; PRO-260; PRO-269; PRO-278; PRO-281; PRO-284; LYS-287; PRO-293; LYS-299; PRO-305; LYS-308; PRO-314; PRO-320; PRO-329; PRO-350 AND PRO-356, GLYCOSYLATION AT LYS-287; LYS-299 AND LYS-308.
Tissue: Cartilage.
[3] "The covalent structure of cartilage collagen. Evidence for sequence heterogeneity of bovine alpha1(II) chains."
Butler W.T., Finch J.E. Jr., Miller E.J.
J. Biol. Chem. 252:639-643(1977)
Cited for: PROTEIN SEQUENCE OF 345-359, HYDROXYLATION AT PRO-350 AND PRO-356, VARIANT LEU-349.
Tissue: Cartilage.
[4] "Covalent structure of collagen. Amino acid sequence of an arthritogenic cyanogen bromide peptide from type II collagen of bovine cartilage."
Seyer J.M., Hasty K.A., Kang A.H.
Eur. J. Biochem. 181:159-173(1989)
Cited for: PROTEIN SEQUENCE OF 324-602, HYDROXYLATION AT PRO-329; PRO-350; PRO-356; PRO-365; PRO-368; PRO-371; LYS-374; PRO-395; PRO-398; PRO-401; PRO-410; PRO-416; LYS-419; PRO-425; PRO-431; PRO-434; PRO-440; LYS-452; PRO-458; LYS-464; LYS-470; PRO-473; PRO-482; PRO-497; PRO-506; PRO-512; PRO-518; LYS-527; PRO-530; LYS-542; PRO-551; PRO-557; PRO-566; PRO-581; PRO-587; PRO-590 AND PRO-599, GLYCOSYLATION AT LYS-374; LYS-419; LYS-452; LYS-464; LYS-470; LYS-527 AND LYS-542.
Tissue: Cartilage.
[5] "Homologous regions of collagen alpha1(I) and alpha1(II) chains: apparent clustering of variable and invariant amino acid residues."
Butler W.T., Miller E.J., Finch J.E. Jr., Inagami T.
Biochem. Biophys. Res. Commun. 57:190-195(1974)
Cited for: PROTEIN SEQUENCE OF 603-677, HYDROXYLATION AT PRO-605; LYS-608; PRO-614; LYS-620; PRO-623; PRO-626; PRO-632; PRO-644; PRO-659; PRO-668; PRO-671 AND PRO-674, GLYCOSYLATION AT LYS-608 AND LYS-620.
[6] "Characterization of the T cell determinants in the induction of autoimmune arthritis by bovine alpha 1(II)-CB11 in H-2q mice."
Brand D.D., Myers L.K., Terato K., Whittington K.B., Stuart J.M., Kang A.H., Rosloniec E.F.
J. Immunol. 152:3088-3097(1994)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 323-602.
Tissue: Chondrocyte.
[7] "Molecular definition and characterization of recombinant bovine CB8 and CB10: immunogenicity and arthritogenicity."
Tang B., Chiang T.M., Brand D.D., Gumanovskaya M.L., Stuart J.M., Kang A.H., Myers L.K.
Clin. Immunol. 92:256-264(1999)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 602-751.
Tissue: Chondrocyte.
[8] "Analysis of cDNA and genomic clones coding for the pro alpha 1 chain of calf type II collagen."
Sangiorgi F.O., Benson-Chanda V., de Wet W.J., Sobel M.E., Ramirez F.
Nucleic Acids Res. 13:2815-2826(1985)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 1307-1487.

Cross-references

Sequence databases

EMBL / GenBank / DDBJ:
AAFC03017082 Genomic DNA. No translation available.
AAFC03017085 Genomic DNA. No translation available.
AAFC03056593 Genomic DNA. No translation available.
L28918 mRNA. Translation: AAA30436.2.
AF138883 mRNA. Translation: AAD42346.1.
AF138957 mRNA. Translation: AAD42347.1.
X02420 mRNA. Translation: CAA26269.1.
PIR: CGBO6C. A90369.
I45876.
RefSeq: NP_001001135.2. NM_001001135.2.
UniGene: Bt.21390.

Protein-protein interaction databases

IntAct: P02459. 1 interaction.
STRING: 9913.ENSBTAP00000017505.

Proteomic databases

PRIDE: P02459.

Genome annotation databases

Ensembl: ENSBTAT00000017505; ENSBTAP00000017505; ENSBTAG00000013155.
GeneID: 407142.
KEGG: bta:407142.

Organism-specific databases

CTD: 1280.

Phylogenomic databases

eggNOG: NOG12793.
GeneTree: ENSGT00740000114967.
HOGENOM: HOG000085654.
HOVERGEN: HBG004933.
InParanoid: Q9XT25.
KO: K06236.
OMA: PLQYMRA.
TreeFam: TF344135.

Family and domain databases

InterPro: IPR008160. Collagen.
IPR000885. Fib_collagen_C.
IPR001007. VWF_C.
Pfam: PF01410. COLFI. 1 hit.
PF01391. Collagen. 10 hits.
PF00093. VWC. 1 hit.
ProDom: PD002078. Fib_collagen_C. 1 hit.
SMART: SM00038. COLFI. 1 hit.
SM00214. VWC. 1 hit.
PROSITE: PS51461. NC1_FIB. 1 hit.
PS01208. VWFC_1. 1 hit.
PS50184. VWFC_2. 1 hit.

Other

NextBio: 20818406.
PMAP-CutDB: P02459.

Entry information

Entry name: CO2A1_BOVIN
Accession: Primary (citable) accession number: P02459
Secondary accession number(s): Q28070, Q9XT24, Q9XT25
Entry history
Integrated into UniProtKB/Swiss-Prot: July 21, 1986
Last sequence update: November 30, 2010
Last modified: April 16, 2014
This is version 122 of the entry and version 4 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Relevant documents

SIMILARITY comments

Index of protein domains and families