Protein

Collagen alpha-2(I) chain

Gene

Col1a2

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at transcript level

Function

Type I collagen is a member of group I collagen (fibrillar forming collagen).

Sites

Feature key    Position(s)  Length  Description
Metal binding  1187         1       Calcium (by similarity)
Metal binding  1189         1       Calcium (by similarity)
Metal binding  1190         1       Calcium; via carbonyl oxygen (by similarity)
Metal binding  1192         1       Calcium; via carbonyl oxygen (by similarity)
Metal binding  1195         1       Calcium (by similarity)

GO - Molecular function

  1. extracellular matrix structural constituent Source: InterPro
  2. identical protein binding Source: MGI
  3. metal ion binding Source: UniProtKB-KW
  4. platelet-derived growth factor binding Source: MGI
  5. protein binding, bridging Source: MGI
  6. SMAD binding Source: MGI

GO - Biological process

  1. blood vessel development Source: MGI
  2. cellular response to amino acid stimulus Source: MGI
  3. collagen fibril organization Source: MGI
  4. protein heterotrimerization Source: MGI
  5. regulation of blood pressure Source: MGI
  6. Rho protein signal transduction Source: MGI
  7. skeletal system development Source: MGI
  8. skin morphogenesis Source: MGI
  9. transforming growth factor beta receptor signaling pathway Source: MGI

Keywords - Ligand

Calcium, Metal-binding

Enzyme and pathway databases

Reactome: REACT_278886. Cell surface interactions at the vascular wall.
REACT_285754. Collagen biosynthesis and modifying enzymes.
REACT_299762. Crosslinking of collagen fibrils.
REACT_300420. Extracellular matrix organization.
REACT_313067. Collagen degradation.
REACT_318656. Assembly of collagen fibrils and other multimeric structures.
REACT_319261. Integrin cell surface interactions.
REACT_320075. Non-integrin membrane-ECM interactions.
REACT_326610. Syndecan interactions.
REACT_337915. GPVI-mediated activation cascade.
REACT_343076. Anchoring fibril formation.
REACT_344422. Platelet Adhesion to exposed collagen.
REACT_346699. Scavenging by Class A Receptors.
REACT_354321. ECM proteoglycans.

Names & Taxonomy

Protein names
Recommended name: Collagen alpha-2(I) chain
Alternative name(s): Alpha-2 type I collagen
Gene names
Name: Col1a2
Synonyms: Cola2
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes: UP000000589. Component: Chromosome 6

Organism-specific databases

MGI: MGI:88468. Col1a2.

Subcellular location

Secreted > extracellular space > extracellular matrix (PROSITE-ProRule annotation)

GO - Cellular component

  1. collagen trimer Source: MGI
  2. collagen type I trimer Source: MGI
  3. extracellular matrix Source: UniProtKB
  4. extracellular space Source: MGI
  5. extracellular vesicular exosome Source: MGI

Keywords - Cellular component

Extracellular matrix, Secreted

PTM / Processing

Molecule processing

Feature key     Position(s)  Length  Description                            Feature identifier
Signal peptide  1 – 22       22      (sequence analysis)
Propeptide      23 – 85      63      N-terminal propeptide (by similarity)  PRO_0000005807
Chain           86 – 1108    1023    Collagen alpha-2(I) chain              PRO_0000005808
Propeptide      1109 – 1372  264     C-terminal propeptide (by similarity)  PRO_0000005809
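The processing coordinates above are 1-based and inclusive, so each length is `end - start + 1` and the four features should tile the 1,372-residue precursor without gaps. A minimal sketch of that bookkeeping follows; the names `FEATURES`, `feature_length`, `slice_feature` and `is_contiguous` are illustrative and not part of any UniProt tool.

```python
# Coordinates copied from the "Molecule processing" table (1-based, inclusive).
FEATURES = [
    ("Signal peptide", 1, 22),
    ("N-terminal propeptide", 23, 85),
    ("Collagen alpha-2(I) chain", 86, 1108),
    ("C-terminal propeptide", 1109, 1372),
]


def feature_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def slice_feature(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a precursor sequence."""
    return sequence[start - 1:end]


def is_contiguous(features) -> bool:
    """True if every feature starts right after the previous one ends."""
    return all(nxt[1] == prev[2] + 1 for prev, nxt in zip(features, features[1:]))


lengths = {name: feature_length(start, end) for name, start, end in FEATURES}
total_length = sum(lengths.values())
```

Applied to the full displayed sequence, `slice_feature(seq, 86, 1108)` would return the mature chain; the sketch only checks coordinates and does not model the proteolytic processing itself.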

Amino acid modifications

Feature key       Position(s)  Length  Description
Modified residue  90           1       Allysine (by similarity)
Disulfide bond    1169 ↔ 1201          PROSITE-ProRule annotation
Disulfide bond    1209 ↔ 1370          PROSITE-ProRule annotation
Glycosylation     1273         1       N-linked (GlcNAc...) (sequence analysis)
Disulfide bond    1278 ↔ 1323          PROSITE-ProRule annotation
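A quick sanity check on feature coordinates like these is to confirm that every disulfide endpoint lands on a cysteine. The sketch below does this for the three endpoints that fall within residues 1161-1210 of the displayed sequence; `SEGMENT`, `SEGMENT_START` and `residue_at` are hypothetical names, and the partner residues 1370 and 1323 (and 1278) lie outside the excerpt, so they are not checked here.

```python
# Residues 1161-1210, copied from the displayed sequence in 10-residue blocks.
SEGMENT_START = 1161
SEGMENT = (
    "SRKNPARTCR"  # 1161-1170
    "DLRLSHPEWN"  # 1171-1180
    "SDYYWIDPNQ"  # 1181-1190
    "GCTMDAIKVY"  # 1191-1200
    "CDFSTGETCI"  # 1201-1210
)


def residue_at(position: int) -> str:
    """Residue at a 1-based precursor position that lies inside SEGMENT."""
    index = position - SEGMENT_START
    if not 0 <= index < len(SEGMENT):
        raise ValueError(f"position {position} is outside the excerpt")
    return SEGMENT[index]


# Disulfide endpoints from the table that fall inside the excerpt.
ENDPOINTS_IN_SEGMENT = [1169, 1201, 1209]
endpoint_residues = {pos: residue_at(pos) for pos in ENDPOINTS_IN_SEGMENT}
all_cysteines = all(res == "C" for res in endpoint_residues.values())
```

The same offset arithmetic (`position - SEGMENT_START`) applies to any excerpt, which is why the start coordinate is kept next to the string.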

Post-translational modification

Prolines at the third position of the tripeptide repeating unit (G-X-Y) are hydroxylated in some or all of the chains.
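To make the G-X-Y rule concrete, the sketch below splits a short stretch of the displayed sequence (residues 103-129) into triplets and lists the prolines in the Y position. These are only candidate sites: the annotation says hydroxylation occurs "in some or all of the chains", so sequence alone does not tell which residues are actually modified. `FRAGMENT`, `triplets`, `is_gxy_repeat` and `y_position_prolines` are illustrative names.

```python
# Residues 103-129 of the displayed sequence (nine G-X-Y triplets).
FRAGMENT_START = 103
FRAGMENT = "GPRGPPGAVGAPGPQGFQGPAGEPGEP"


def triplets(fragment: str) -> list[str]:
    """Split a fragment into consecutive, complete 3-residue units."""
    usable = len(fragment) - len(fragment) % 3
    return [fragment[i:i + 3] for i in range(0, usable, 3)]


def is_gxy_repeat(fragment: str) -> bool:
    """True if the fragment is a whole number of triplets that all start with G."""
    return len(fragment) % 3 == 0 and all(t[0] == "G" for t in triplets(fragment))


def y_position_prolines(fragment: str, start: int) -> list[int]:
    """1-based positions of prolines in the third (Y) slot of each triplet."""
    return [
        start + i + 2
        for i in range(0, len(fragment) - 2, 3)
        if fragment[i + 2] == "P"
    ]


candidate_sites = y_position_prolines(FRAGMENT, FRAGMENT_START)
```

For this fragment the candidates are the Y-position prolines of the GPP, GAP and two GEP triplets.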

Keywords - PTM

Disulfide bond, Glycoprotein, Hydroxylation

Proteomic databases

MaxQB: Q01149.
PaxDb: Q01149.
PRIDE: Q01149.

PTM databases

PhosphoSite: Q01149.

Expression

Tissue specificity

Forms the fibrils of tendon, ligaments and bones. In bones the fibrils are mineralized with calcium hydroxyapatite.

Gene expression databases

Bgee: Q01149.
CleanEx: MM_COL1A2.
ExpressionAtlas: Q01149. baseline and differential.
Genevestigator: Q01149.

Interaction

Subunit structure

Trimers of one alpha 2(I) and two alpha 1(I) chains.

Protein-protein interaction databases

BioGrid: 198832. 1 interaction.
IntAct: Q01149. 1 interaction.
MINT: MINT-4091346.

Structure

3D structure databases

ProteinModelPortal: Q01149.
SMR: Q01149. Positions 1158-1371.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Description
Domain       1139 – 1372  234     Fibrillar collagen NC1 (PROSITE-ProRule annotation)

Domain

The C-terminal propeptide, also known as the COLFI domain, has crucial roles in tissue growth and repair by controlling both the intracellular assembly of procollagen molecules and the extracellular assembly of collagen fibrils. It binds a calcium ion which is essential for its function (by similarity).

Sequence similarities

Belongs to the fibrillar collagen family (PROSITE-ProRule annotation).
Contains 1 fibrillar collagen NC1 domain (PROSITE-ProRule annotation).

Keywords - Domain

Collagen, Repeat, Signal

Phylogenomic databases

eggNOG: NOG12793.
HOGENOM: HOG000085654.
HOVERGEN: HBG004933.
InParanoid: Q01149.
KO: K06236.
OMA: NWYRSSK.
OrthoDB: EOG7TJ3HH.
PhylomeDB: Q01149.
TreeFam: TF344135.

Family and domain databases

InterPro: IPR008160. Collagen.
InterPro: IPR000885. Fib_collagen_C.
Pfam: PF01410. COLFI. 1 hit.
Pfam: PF01391. Collagen. 6 hits.
ProDom: PD002078. Fib_collagen_C. 1 hit.
SMART: SM00038. COLFI. 1 hit.
PROSITE: PS51461. NC1_FIB. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

Q01149-1 [UniParc]

        10         20         30         40         50
MLSFVDTRTL LLLAVTSCLA TCQYLQSGSV RKGPTGDRGP RGQRGPAGPR
        60         70         80         90        100
GRDGVDGPMG PPGPPGSPGP PGSPAPPGLT GNFAAQYSDK GVSSGPGPMG
       110        120        130        140        150
LMGPRGPPGA VGAPGPQGFQ GPAGEPGEPG QTGPAGPRGP AGSPGKAGED
       160        170        180        190        200
GHPGKPGRPG ERGVVGPQGA RGFPGTPGLP GFKGVKGHSG MDGLKGQPGA
       210        220        230        240        250
QGVKGEPGAP GENGTPGQAG ARGLPGERGR VGAPGPAGAR GSDGSVGPVG
       260        270        280        290        300
PAGPIGSAGP PGFPGAPGPK GELGPVGNPG PAGPAGPRGE VGLPGLSGPV
       310        320        330        340        350
GPPGNPGTNG LTGAKGATGL PGVAGAPGLP GPRGIPGPAG AAGATGARGL
       360        370        380        390        400
VGEPGPAGSK GESGNKGEPG SVGAQGPPGP SGEEGKRGSP GEAGSAGPAG
       410        420        430        440        450
PPGLRGSPGS RGLPGADGRA GVMGPPGNRG STGPAGIRGP NGDAGRPGEP
       460        470        480        490        500
GLMGPRGLPG SPGNVGPSGK EGPVGLPGID GRPGPIGPAG PRGEAGNIGF
       510        520        530        540        550
PGPKGPSGDP GKPGERGHPG LAGARGAPGP DGNNGAQGPP GPQGVQGGKG
       560        570        580        590        600
EQGPAGPPGF QGLPGPSGTT GEVGKPGERG LPGEFGLPGP AGPRGERGTP
       610        620        630        640        650
GESGAAGPSG PIGSRGPSGA PGPDGNKGEA GAVGAPGSAG ASGPGGLPGE
       660        670        680        690        700
RGAAGIPGGK GEKGETGLRG DTGNTGRDGA RGIPGAVGAP GPAGASGDRG
       710        720        730        740        750
EAGAAGPSGP AGPRGSPGER GEVGPAGPNG FAGPAGAAGQ PGAKGEKGTK
       760        770        780        790        800
GPKGENGIVG PTGSVGAAGP SGPNGPPGPV GSRGDGGPPG MTGFPGAAGR
       810        820        830        840        850
TGPPGPSGIA GPPGPPGAAG KEGIRGPRGD QGPVGRTGET GASGPPGFVG
       860        870        880        890        900
EKGPSGEPGT AGAPGTAGPQ GLLGAPGILG LPGSRGERGL PGIAGALGEP
       910        920        930        940        950
GPLGISGPPG ARGPPGAVGS PGVNGAPGEA GRDGNPGSDG PPGRDGQPGH
       960        970        980        990       1000
KGERGYPGSI GPTGAAGAPG PHGSVGPAGK HGNRGEPGPA GSVGPVGAVG
      1010       1020       1030       1040       1050
PRGPSGPQGI RGDKGEPGDK GHRGLPGLKG YSGLQGLPGL AGLHGDQGAP
      1060       1070       1080       1090       1100
GPVGPAGPRG PAGPSGPVGK DGRSGQPGPV GPAGVRGSQG SQGPAGPPGP
      1110       1120       1130       1140       1150
PGPPGPPGVS GGGYDFGFEG DFYRADQPRS QPSLRPKDYE VDATLKSLNN
      1160       1170       1180       1190       1200
QIETLLTPEG SRKNPARTCR DLRLSHPEWN SDYYWIDPNQ GCTMDAIKVY
      1210       1220       1230       1240       1250
CDFSTGETCI QAQPVNTPAK NSYSRAQANK HVWLGETING GSQFEYNVEG
      1260       1270       1280       1290       1300
VSSKEMATQL AFMRLLANRA SQNITYHCKN SIAYLDEETG SLNKAVLLQG
      1310       1320       1330       1340       1350
SNDVELVAEG NSRFTYSVLV DGCSKKTNEW GKTIIEYKTN KPSRLPFLDI
      1360       1370
APLDIGGADQ EFRVEVGPVC FK
Length: 1,372
Mass (Da): 129,557
Last modified: August 28, 2001 - v2
Checksum: 0D17DF5D6C1452D1

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  15           1       V → A in AAA37331 (PubMed:3039494). Curated.
Sequence conflict  1167         1       R → TT in CAA41205 (PubMed:1505972). Curated.
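A sequence conflict is a positional edit against the displayed sequence: the reference residue at the 1-based position is replaced by the alternative, which may differ in length (R → TT inserts one residue). The sketch below applies the V → A conflict at position 15 to the first 22 residues of the displayed sequence; `apply_conflict` is a hypothetical helper, not a UniProt function.

```python
def apply_conflict(sequence: str, position: int, reference: str, alternative: str) -> str:
    """Replace `reference` at a 1-based `position` with `alternative`.

    Raises ValueError if the sequence does not carry `reference` there,
    which guards against off-by-one coordinate mistakes.
    """
    index = position - 1
    found = sequence[index:index + len(reference)]
    if found != reference:
        raise ValueError(f"expected {reference!r} at {position}, found {found!r}")
    return sequence[:index] + alternative + sequence[index + len(reference):]


# First 22 residues of the displayed sequence (the signal peptide).
SIGNAL_PEPTIDE = "MLSFVDTRTLLLLAVTSCLATC"

# Conflict reported for AAA37331: V -> A at position 15.
aaa37331_like = apply_conflict(SIGNAL_PEPTIDE, 15, "V", "A")
```

Because the R → TT conflict lengthens the sequence by one residue, any coordinates downstream of position 1167 would shift by one in the CAA41205 numbering.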

Sequence databases

EMBL / GenBank / DDBJ:
X58251 mRNA. Translation: CAA41205.1.
BC007158 mRNA. Translation: AAH07158.1.
BC042503 mRNA. Translation: AAH42503.2.
K01832 Genomic DNA. Translation: AAA37331.1.
CCDS: CCDS39420.1.
PIR: A43291.
RefSeq: NP_031769.2. NM_007743.2.
UniGene: Mm.277792.

Genome annotation databases

Ensembl: ENSMUST00000031668; ENSMUSP00000031668; ENSMUSG00000029661.
GeneID: 12843.
KEGG: mmu:12843.
UCSC: uc009avm.1. mouse.

Cross-references

Organism-specific databases

CTD: 1278.
MGI: MGI:88468. Col1a2.

Miscellaneous databases

ChiTaRS: Col1a2. mouse.
NextBio: 282380.
PRO: Q01149.

Publications

  1. "Sequence analysis of a full-length cDNA for the murine pro alpha 2(I) collagen chain: comparison of the derived primary structure with human pro alpha 2(I) collagen."
    Phillips C.L., Morgan A.L., Lever L.W., Wenstrup R.J.
    Genomics 13:1345-1346(1992)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
    Tissue: Calvaria.
  2. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Strain: C57BL/6J.
    Tissue: Mammary gland.
  3. "Construction of a full-length murine pro alpha 2(I) collagen cDNA by the polymerase chain reaction."
    Phillips C.L., Lever L.W., Pinnell S.R., Quarles L.D., Wenstrup R.J.
    J. Invest. Dermatol. 97:980-984(1991)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 1-110.
    Tissue: Calvaria.
  4. "Identification of a cell-specific transcriptional enhancer in the first intron of the mouse alpha 2 (type I) collagen gene."
    Rossi P., de Crombrugghe B.
    Proc. Natl. Acad. Sci. U.S.A. 84:5590-5594(1987)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 1-23.

Entry information

Entry name: CO1A2_MOUSE
Accession: Primary (citable) accession number: Q01149
Secondary accession number(s): Q8CGA5
Entry history
Integrated into UniProtKB/Swiss-Prot: March 31, 1993
Last sequence update: August 28, 2001
Last modified: March 31, 2015
This is version 134 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families


Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
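
The UniRef90 and UniRef50 rules above reduce to two thresholds applied against the seed sequence: a minimum identity and a minimum overlap. The sketch below encodes only that membership test, assuming identity and overlap have already been computed by some alignment step and are given as fractions; `Comparison`, `THRESHOLDS`, `MIN_OVERLAP` and `joins_cluster` are illustrative names, and this is not how the UniRef pipeline itself is implemented. UniRef100 is left out because it groups identical sequences and sub-fragments rather than applying a threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """Result of comparing a candidate sequence with a cluster's seed sequence."""
    identity: float  # fraction of identical residues in the alignment, 0..1
    overlap: float   # fraction of the seed sequence covered by the alignment, 0..1


# Identity thresholds taken from the UniRef90 and UniRef50 descriptions.
THRESHOLDS = {"UniRef90": 0.90, "UniRef50": 0.50}
# Both levels require at least 80% overlap with the seed.
MIN_OVERLAP = 0.80


def joins_cluster(level: str, comparison: Comparison) -> bool:
    """True if the candidate meets both the identity and overlap criteria."""
    return (
        comparison.identity >= THRESHOLDS[level]
        and comparison.overlap >= MIN_OVERLAP
    )
```

For example, a candidate with 93% identity but only 70% overlap fails the UniRef90 test despite passing the identity threshold.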