Protein

Collagen alpha-1(I) chain

Gene

Col1a1

Organism
Rattus norvegicus (Rat)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Type I collagen is a member of group I collagen (fibrillar forming collagen).

Sites

Feature key    Position(s)  Description                   Evidence       Length
Metal binding  1266         Calcium                       By similarity  1
Metal binding  1268         Calcium                       By similarity  1
Metal binding  1269         Calcium; via carbonyl oxygen  By similarity  1
Metal binding  1271         Calcium; via carbonyl oxygen  By similarity  1
Metal binding  1274         Calcium                       By similarity  1

GO - Molecular function

GO - Biological process

  • blood vessel development Source: Ensembl
  • bone trabecula formation Source: Ensembl
  • cartilage development involved in endochondral bone morphogenesis Source: Ensembl
  • cellular response to amino acid stimulus Source: Ensembl
  • cellular response to epidermal growth factor stimulus Source: RGD
  • cellular response to fibroblast growth factor stimulus Source: RGD
  • cellular response to fluoride Source: RGD
  • cellular response to mechanical stimulus Source: RGD
  • cellular response to retinoic acid Source: RGD
  • cellular response to transforming growth factor beta stimulus Source: UniProtKB
  • cellular response to tumor necrosis factor Source: RGD
  • cellular response to vitamin E Source: RGD
  • collagen-activated tyrosine kinase receptor signaling pathway Source: MGI
  • collagen biosynthetic process Source: Ensembl
  • collagen fibril organization Source: Ensembl
  • embryonic skeletal system development Source: Ensembl
  • endochondral ossification Source: Ensembl
  • extracellular matrix organization Source: Reactome
  • face morphogenesis Source: Ensembl
  • intramembranous ossification Source: Ensembl
  • negative regulation of cell-substrate adhesion Source: Ensembl
  • ossification Source: RGD
  • osteoblast differentiation Source: Ensembl
  • positive regulation of canonical Wnt signaling pathway Source: Ensembl
  • positive regulation of cell migration Source: Ensembl
  • positive regulation of epithelial to mesenchymal transition Source: Ensembl
  • positive regulation of transcription, DNA-templated Source: Ensembl
  • protein heterotrimerization Source: Ensembl
  • protein localization to nucleus Source: Ensembl
  • protein transport Source: Ensembl
  • response to cAMP Source: RGD
  • response to corticosteroid Source: RGD
  • response to drug Source: RGD
  • response to estradiol Source: RGD
  • response to fluoride Source: RGD
  • response to hydrogen peroxide Source: RGD
  • response to hyperoxia Source: RGD
  • response to mechanical stimulus Source: RGD
  • response to nutrient Source: RGD
  • response to nutrient levels Source: RGD
  • response to peptide hormone Source: RGD
  • response to steroid hormone Source: RGD
  • sensory perception of sound Source: Ensembl
  • skin morphogenesis Source: Ensembl
  • tooth eruption Source: RGD
  • tooth mineralization Source: Ensembl
  • visual perception Source: Ensembl
  • wound healing Source: RGD

Keywords - Ligand

Calcium, Metal-binding

Enzyme and pathway databases

Reactome:
R-RNO-114604. GPVI-mediated activation cascade.
R-RNO-1442490. Collagen degradation.
R-RNO-1474244. Extracellular matrix organization.
R-RNO-1650814. Collagen biosynthesis and modifying enzymes.
R-RNO-198933. Immunoregulatory interactions between a Lymphoid and a non-Lymphoid cell.
R-RNO-2022090. Assembly of collagen fibrils and other multimeric structures.
R-RNO-202733. Cell surface interactions at the vascular wall.
R-RNO-216083. Integrin cell surface interactions.
R-RNO-2214320. Anchoring fibril formation.
R-RNO-2243919. Crosslinking of collagen fibrils.
R-RNO-3000171. Non-integrin membrane-ECM interactions.
R-RNO-3000178. ECM proteoglycans.
R-RNO-430116. GP1b-IX-V activation signalling.
R-RNO-75892. Platelet Adhesion to exposed collagen.
R-RNO-76009. Platelet Aggregation (Plug Formation).
R-RNO-8874081. MET activates PTK2 signaling.

Names & Taxonomy

Protein names
Recommended name:
Collagen alpha-1(I) chain
Alternative name(s):
Alpha-1 type I collagen
Gene names
Name: Col1a1
Organism: Rattus norvegicus (Rat)
Taxonomic identifier: 10116 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Rattus
Proteomes
  • UP000002494 Component: Chromosome 10

Organism-specific databases

RGD: 61817. Col1a1.

Subcellular location

GO - Cellular component

  • collagen type I trimer Source: Ensembl
  • endoplasmic reticulum Source: RGD
  • extracellular region Source: Reactome
  • extracellular space Source: RGD
  • Golgi apparatus Source: RGD
  • secretory granule Source: RGD

Keywords - Cellular component

Extracellular matrix, Secreted

PTM / Processing

Molecule processing

Feature key     Identifier      Position(s)  Description                Evidence           Length
Signal peptide  -               1 – 22       -                          Sequence analysis  22
Propeptide      PRO_0000043358  23 – 151     N-terminal propeptide      1 Publication      129
Chain           PRO_0000043359  152 – 1207   Collagen alpha-1(I) chain  -                  1056
Propeptide      PRO_0000043360  1208 – 1453  C-terminal propeptide      By similarity      246
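
The Length column follows from the inclusive residue positions (length = end - start + 1), and the four processed segments together should cover the full precursor. A minimal Python sketch of that check, using the ranges from the molecule processing table; the `FEATURES` list and `feature_length` helper are illustrative names, not part of any UniProt tool:

```python
# Inclusive, 1-based residue ranges copied from the molecule processing table.
FEATURES = [
    ("Signal peptide", 1, 22),
    ("N-terminal propeptide", 23, 151),
    ("Collagen alpha-1(I) chain", 152, 1207),
    ("C-terminal propeptide", 1208, 1453),
]


def feature_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based residue range."""
    return end - start + 1


# Per-feature lengths, and their sum across the whole precursor.
lengths = {name: feature_length(start, end) for name, start, end in FEATURES}
total = sum(lengths.values())

for name, length in lengths.items():
    print(f"{name}: {length}")
print(f"Total: {total}")
```

Because the ranges are contiguous and non-overlapping, the total can be compared directly with the sequence length reported for the entry.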

Amino acid modifications

Feature key       Position(s)  Description                   Evidence                    Length
Glycosylation     56           N-linked (GlcNAc...)          Sequence analysis           1
Modified residue  152          Pyrrolidone carboxylic acid   Curated                     1
Modified residue  160          Allysine                      -                           1
Modified residue  161          Phosphoserine                 Combined sources            1
Modified residue  179          4-hydroxyproline              Curated                     1
Modified residue  182          4-hydroxyproline              Curated                     1
Modified residue  185          4-hydroxyproline              Curated                     1
Modified residue  194          4-hydroxyproline              Curated                     1
Modified residue  197          4-hydroxyproline              Curated                     1
Modified residue  200          4-hydroxyproline              Curated                     1
Modified residue  254          5-hydroxylysine; alternate    1 Publication               1
Glycosylation     254          O-linked (Gal...); alternate  1 Publication               1
Modified residue  260          Phosphoserine                 Combined sources            1
Modified residue  575          5-hydroxylysine               Curated                     1
Modified residue  698          5-hydroxylysine               Curated                     1
Modified residue  776          Phosphoserine                 Combined sources            1
Modified residue  1153         3-hydroxyproline              By similarity               1
Disulfide bond    1248 ↔ 1280  -                             PROSITE-ProRule annotation  -
Disulfide bond    1254         Interchain (with C-1271)      PROSITE-ProRule annotation  -
Disulfide bond    1271         Interchain (with C-1254)      PROSITE-ProRule annotation  -
Disulfide bond    1288 ↔ 1451  -                             PROSITE-ProRule annotation  -
Glycosylation     1354         N-linked (GlcNAc...)          Sequence analysis           1
Disulfide bond    1359 ↔ 1404  -                             PROSITE-ProRule annotation  -

Post-translational modification

Proline residues at the third position of the tripeptide repeating unit (G-X-P) are hydroxylated in some or all of the chains. Proline residues at the second position of the tripeptide repeating unit (G-P-X) are hydroxylated in some of the chains. (1 publication)
O-linked glycan consists of a Glc-Gal disaccharide bound to the oxygen atom of a post-translationally added hydroxyl group. (1 publication)
Hydroxylation on proline residues within the sequence motif GXPG is most likely to be 4-hydroxy, as this fits the requirement for 4-hydroxylation in vertebrates. (1 publication)
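
The positional rule can be illustrated on residues 168-200 of the displayed sequence, the start of the triple-helical region. The sketch below is only a candidate scan under two assumptions: that the G-X-Y frame starts at residue 168 (the first residue of the triple-helical region listed for this entry), and that the hypothetical helper `proline_positions` is an adequate stand-in for the rule. It does not predict which prolines are actually hydroxylated.

```python
# Residues 168-200 of the displayed sequence (start of the triple-helical region).
FRAGMENT = "GPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEP"
FRAGMENT_START = 168  # 1-based position of the first residue in FRAGMENT


def proline_positions(fragment: str, start: int, slot: int) -> list[int]:
    """Return 1-based positions of prolines found at `slot` (0-based offset
    within each triplet: 1 = second position, 2 = third position) of
    consecutive in-frame triplets that begin with glycine."""
    positions = []
    for i in range(0, len(fragment) - 2, 3):
        triplet = fragment[i:i + 3]
        if triplet[0] == "G" and triplet[slot] == "P":
            positions.append(start + i + slot)
    return positions


third_position = proline_positions(FRAGMENT, FRAGMENT_START, slot=2)   # G-X-P
second_position = proline_positions(FRAGMENT, FRAGMENT_START, slot=1)  # G-P-X

print("G-X-P prolines:", third_position)
print("G-P-X prolines:", second_position)
```

For this fragment the third-position candidates are 179, 182, 185, 194, 197 and 200, which coincide with the residues annotated as 4-hydroxyproline in the amino acid modifications for this entry.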

Keywords - PTM

Disulfide bond, Glycoprotein, Hydroxylation, Phosphoprotein, Pyrrolidone carboxylic acid

Proteomic databases

PaxDb: P02454.
PRIDE: P02454.

PTM databases

iPTMnet: P02454.
PhosphoSitePlus: P02454.

Expression

Tissue specificity

Forms the fibrils of tendon, ligaments and bones. In bones the fibrils are mineralized with calcium hydroxyapatite.

Gene expression databases

Bgee: ENSRNOG00000003897.
Genevisible: P02454. RN.

Interaction

Subunit structure

Trimers of one alpha 2(I) and two alpha 1(I) chains. Interacts with MRC2 (PubMed:15817460). Interacts with TRAM2. Interacts with MFAP4 in a Ca(2+)-dependent manner (By similarity).

Protein-protein interaction databases

BioGrid: 248045. 1 interactor.
DIP: DIP-36887N.
IntAct: P02454. 2 interactors.
STRING: 10116.ENSRNOP00000005311.

Structure

3D structure databases

PDB entry  Method             Resolution (Å)  Chain  Positions
3HQV       fiber diffraction  5.16            A/C    152-1207
3HR2       fiber diffraction  5.16            A/C    152-1207

ProteinModelPortal: P02454.
SMR: P02454.

Miscellaneous databases

EvolutionaryTrace: P02454.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Description             Evidence                    Length
Domain       29 – 87      VWFC                    PROSITE-ProRule annotation  59
Domain       1218 – 1453  Fibrillar collagen NC1  PROSITE-ProRule annotation  236

Region

Feature key  Position(s)  Description                                                                Length
Region       152 – 167    Nonhelical region (N-terminal)                                             16
Region       168 – 1181   Triple-helical region                                                      1014
Region       1176 – 1186  Major antigenic determinant (of neutral salt-extracted rat skin collagen)  11
Region       1182 – 1207  Nonhelical region (C-terminal)                                             26

Motif

Feature key  Position(s)  Description           Evidence           Length
Motif        734 – 736    Cell attachment site  Sequence analysis  3
Motif        1082 – 1084  Cell attachment site  Sequence analysis  3
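
In the displayed sequence, the residues at both annotated cell attachment sites read Arg-Gly-Asp (RGD). The sketch below locates that tripeptide in two ten-residue windows copied from the sequence (residues 731-740 and 1081-1090) and converts the match to 1-based coordinates. The helper `find_motif` and the window choice are illustrative assumptions, not the method behind the "Sequence analysis" evidence tag.

```python
def find_motif(fragment: str, start: int, motif: str = "RGD") -> list[tuple[int, int]]:
    """Return inclusive 1-based (first, last) positions of every occurrence
    of `motif` in `fragment`, where `start` is the position of fragment[0]."""
    hits = []
    i = fragment.find(motif)
    while i != -1:
        hits.append((start + i, start + i + len(motif) - 1))
        i = fragment.find(motif, i + 1)
    return hits


# Ten-residue windows copied from the displayed sequence.
WINDOWS = [
    (731, "KGDRGDAGPK"),   # residues 731-740
    (1081, "PRGDKGETGE"),  # residues 1081-1090
]

sites = [hit for start, fragment in WINDOWS for hit in find_motif(fragment, start)]
print(sites)
```

Scanning the full 1,453-residue sequence works the same way with `start=1`; windows are used here only to keep the example short.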

Domain

The C-terminal propeptide, also known as the COLFI domain, has crucial roles in tissue growth and repair by controlling both the intracellular assembly of procollagen molecules and the extracellular assembly of collagen fibrils. It binds a calcium ion which is essential for its function (By similarity).

Sequence similarities

Belongs to the fibrillar collagen family (PROSITE-ProRule annotation).
Contains 1 fibrillar collagen NC1 domain (PROSITE-ProRule annotation).
Contains 1 VWFC domain (PROSITE-ProRule annotation).

Keywords - Domain

Collagen, Repeat, Signal

Phylogenomic databases

eggNOG:
KOG3544. Eukaryota.
ENOG410XNMM. LUCA.
GeneTree: ENSGT00840000129673.
HOGENOM: HOG000085654.
HOVERGEN: HBG004933.
InParanoid: P02454.
KO: K06236.
OMA: PEACRIC.
OrthoDB: EOG091G03LV.
PhylomeDB: P02454.
TreeFam: TF344135.

Family and domain databases

InterPro:
IPR008160. Collagen.
IPR000885. Fib_collagen_C.
IPR001007. VWF_dom.
Pfam:
PF01410. COLFI. 1 hit.
PF01391. Collagen. 12 hits.
PF00093. VWC. 1 hit.
ProDom:
PD002078. Fib_collagen_C. 1 hit.
SMART:
SM00038. COLFI. 1 hit.
SM00214. VWC. 1 hit.
PROSITE:
PS51461. NC1_FIB. 1 hit.
PS01208. VWFC_1. 1 hit.
PS50184. VWFC_2. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

P02454-1 [UniParc]

        10         20         30         40         50
MFSFVDLRLL LLLGATALLT HGQEDIPEVS CIHNGLRVPN GETWKPDVCL
60 70 80 90 100
ICICHNGTAV CDGVLCKEDL DCPNPQKREG ECCPFCPEEY VSPDAEVIGV
110 120 130 140 150
EGPKGDPGPQ GPRGPVGPPG QDGIPGQPGL PGPPGPPGPP GPPGLGGNFA
160 170 180 190 200
SQMSYGYDEK SAGVSVPGPM GPSGPRGLPG PPGAPGPQGF QGPPGEPGEP
210 220 230 240 250
GASGPMGPRG PPGPPGKNGD DGEAGKPGRP GERGPPGPQG ARGLPGTAGL
260 270 280 290 300
PGMKGHRGFS GLDGAKGDTG PAGPKGEPGS PGENGAPGQM GPRGLPGERG
310 320 330 340 350
RPGPPGSAGA RGNDGAVGAA GPPGPTGPTG PPGFPGAAGA KGEAGPQGAR
360 370 380 390 400
GSEGPQGVRG EPGPPGPAGA AGPAGNPGAD GQPGAKGANG APGIAGAPGF
410 420 430 440 450
PGARGPSGPQ GPSGAPGPKG NSGEPGAPGN KGDTGAKGEP GPAGVQGPPG
460 470 480 490 500
PAGEEGKRGA RGEPGPSGLP GPPGERGGPG SRGFPGADGV AGPKGPAGER
510 520 530 540 550
GSPGPAGPKG SPGEAGRPGE AGLPGAKGLT GSPGSPGPDG KTGPPGPAGQ
560 570 580 590 600
DGRPGPAGPP GARGQAGVMG FPGPKGTAGE PGKAGERGVP GPPGAVGPAG
610 620 630 640 650
KDGEAGAQGA PGPAGPAGER GEQGPAGSPG FQGLPGPAGP PGEAGKPGEQ
660 670 680 690 700
GVPGDLGAPG PSGARGERGF PGERGVQGPP GPAGPRGNNG APGNDGAKGD
710 720 730 740 750
TGAPGAPGSQ GAPGLQGMPG ERGAAGLPGP KGDRGDAGPK GADGSPGKDG
760 770 780 790 800
VRGLTGPIGP PGPAGAPGDK GETGPSGPAG PTGARGAPGD RGEPGPPGPA
810 820 830 840 850
GFAGPPGADG QPGAKGEPGD TGVKGDAGPP GPAGPAGPPG PIGNVGAPGP
860 870 880 890 900
KGSRGAAGPP GATGFPGAAG RVGPPGPSGN AGPPGPPGPV GKEGGKGPRG
910 920 930 940 950
ETGPAGRPGE VGPPGPPGPA GEKGSPGADG PAGSPGTPGP QGIAGQRGVV
960 970 980 990 1000
GLPGQRGERG FPGLPGPSGE PGKQGPSGAS GERGPPGPMG PPGLAGPPGE
1010 1020 1030 1040 1050
SGREGSPGAE GSPGRDGAPG AKGDRGETGP AGPPGAPGAP GAPGPVGPAG
1060 1070 1080 1090 1100
KNGDRGETGP AGPAGPIGPA GARGPAGPQG PRGDKGETGE QGDRGIKGHR
1110 1120 1130 1140 1150
GFSGLQGPPG SPGSPGEQGP SGASGPAGPR GPPGSAGSPG KDGLNGLPGP
1160 1170 1180 1190 1200
IGPPGPRGRT GDSGPAGPPG PPGPPGPPGP PSGGYDFSFL PQPPQEKSQD
1210 1220 1230 1240 1250
GGRYYRADDA NVVRDRDLEV DTTLKSLSQQ IENIRSPEGS RKNPARTCRD
1260 1270 1280 1290 1300
LKMCHSDWKS GEYWIDPNQG CNLDAIKVYC NMETGQTCVF PTQPSVPQKN
1310 1320 1330 1340 1350
WYISPNPKEK KHVWFGESMT DGFQFEYGSE GSDPADVAIQ LTFLRLMSTE
1360 1370 1380 1390 1400
ASQNITYHCK NSVAYMDQQT GNLKKSLLLQ GSNEIELRGE GNSRFTYSTL
1410 1420 1430 1440 1450
VDGCTSHTGT WGKTVIEYKT TKTSRLPIID VAPLDIGAPD QEFGMDIGPA

CFV
Length: 1,453
Mass (Da): 137,953
Last modified: September 22, 2009 - v5
Checksum: BCDDC40C3167AE59
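
The sequence is displayed in rows of 50 residues, split into blocks of 10, with position rulers between the rows. A minimal sketch of turning that layout back into a plain string, assuming ruler lines contain only digits and spaces; `parse_sequence_block` is a hypothetical helper and `BLOCK` holds just the first two rows of the entry:

```python
def parse_sequence_block(text: str) -> str:
    """Join sequence rows into one string, skipping blank lines and
    position rulers (lines made only of digits and spaces)."""
    residues = []
    for line in text.splitlines():
        compact = line.replace(" ", "")
        if not compact or compact.isdigit():
            continue  # blank line or position ruler
        residues.append(compact)
    return "".join(residues)


# First two rows of the displayed sequence, rulers included.
BLOCK = """\
        10         20         30         40         50
MFSFVDLRLL LLLGATALLT HGQEDIPEVS CIHNGLRVPN GETWKPDVCL
60 70 80 90 100
ICICHNGTAV CDGVLCKEDL DCPNPQKREG ECCPFCPEEY VSPDAEVIGV
"""

sequence = parse_sequence_block(BLOCK)
print(len(sequence))   # number of residues recovered from the two rows
print(sequence[:22])   # residues 1-22, the annotated signal peptide
```

Applied to the whole display, 29 full rows of 50 residues plus the final three residues (CFV) give the stated length of 1,453.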

Experimental Info

Feature key        Position(s)  Description                            Evidence  Length
Sequence conflict  143          P → L in CAB01633 (PubMed:10065941).   Curated   1
Sequence conflict  202          A → G in CAB01633 (PubMed:10065941).   Curated   1
Sequence conflict  209          R → P in CAB01633 (PubMed:10065941).   Curated   1
Sequence conflict  232          E → Q AA sequence (PubMed:4327399).    Curated   1
Sequence conflict  268          D → N AA sequence (PubMed:5411206).    Curated   1
Sequence conflict  286          A → T in CAB01633 (PubMed:10065941).   Curated   1
Sequence conflict  307          S → T in CAB01633 (PubMed:10065941).   Curated   1
Sequence conflict  313          N → D AA sequence (PubMed:4335087).    Curated   1
Sequence conflict  421          N → T in CAB01633 (PubMed:10065941).   Curated   1
Sequence conflict  497          A → S in CAB01633 (PubMed:10065941).   Curated   1
Sequence conflict  773          T → A in CAB01633 (PubMed:10065941).   Curated   1
Sequence conflict  794          P → A in CAB01633 (PubMed:10065941).   Curated   1
Sequence conflict  958          E → K in CAB01633 (PubMed:10065941).   Curated   1
Sequence conflict  1111         S → P AA sequence (PubMed:4126850).    Curated   1
Sequence conflict  1163         S → A AA sequence (PubMed:4126850).    Curated   1
Sequence conflict  1166         A → S AA sequence (PubMed:4126850).    Curated   1
Sequence conflict  1187         F → L AA sequence (PubMed:4636751).    Curated   1
Sequence conflict  1190         L → F AA sequence (PubMed:4636751).    Curated   1
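
Each conflict records a position, the residue in the displayed sequence, and the residue reported by the other source. The sketch below parses one such record and applies it to a short window of the displayed sequence (residues 141-150), checking first that the reference residue matches. The `CONFLICT` pattern and `apply_conflict` helper are assumptions made for illustration and only handle single-residue substitutions written as "143P → L".

```python
import re

# Position, displayed residue, arrow, alternative residue (e.g. "143P → L").
CONFLICT = re.compile(r"(\d+)\s*([A-Z])\s*→\s*([A-Z])")


def apply_conflict(fragment: str, start: int, entry: str) -> str:
    """Return `fragment` with the substitution described by `entry` applied.
    `start` is the 1-based position of fragment[0]."""
    match = CONFLICT.match(entry)
    if match is None:
        raise ValueError(f"unrecognised conflict record: {entry!r}")
    position = int(match.group(1))
    displayed, alternative = match.group(2), match.group(3)
    index = position - start
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} lies outside the fragment")
    if fragment[index] != displayed:
        raise ValueError(f"expected {displayed} at {position}, found {fragment[index]}")
    return fragment[:index] + alternative + fragment[index + 1:]


# Residues 141-150 of the displayed sequence, with the CAB01633 conflict at 143.
variant = apply_conflict("GPPGLGGNFA", 141, "143P → L")
print(variant)
```

The reference-residue check is the useful part: it catches off-by-one errors in the window offset before any substitution is made.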

Sequence databases

EMBL / GenBank / DDBJ:
Z78279 mRNA. Translation: CAB01633.1.
CH473948 Genomic DNA. Translation: EDM05727.1.
BC133728 mRNA. Translation: AAI33729.1.
M11432 mRNA. Translation: AAA40832.1. Sequence problems.
PIR: A90559. CGRT1S.
RefSeq: NP_445756.1. NM_053304.1.
UniGene: Rn.2953.

Genome annotation databases

Ensembl: ENSRNOT00000005311; ENSRNOP00000005311; ENSRNOG00000003897.
GeneID: 29393.
KEGG: rno:29393.
UCSC: RGD:61817. rat.

Cross-references

Organism-specific databases

CTD: 1277.
RGD: 61817. Col1a1.

Miscellaneous databases

EvolutionaryTrace: P02454.
PRO: P02454.

Entry information

Entry name: CO1A1_RAT
Accession: Primary (citable) accession number: P02454
Secondary accession number(s): A3KNA1, P02455, Q63079
Entry history
Integrated into UniProtKB/Swiss-Prot: July 21, 1986
Last sequence update: September 22, 2009
Last modified: November 30, 2016
This is version 151 of the entry and version 5 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.