P02454 (CO1A1_RAT) Reviewed, UniProtKB/Swiss-Prot

Last modified April 16, 2014. Version 126.

Names and origin

Protein names
Recommended name: Collagen alpha-1(I) chain
Alternative name(s): Alpha-1 type I collagen
Gene names
Name: Col1a1
Organism: Rattus norvegicus (Rat) [Reference proteome]
Taxonomic identifier: 10116 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Rattus

Protein attributes

Sequence length: 1453 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level

General annotation (Comments)

Function

Type I collagen is a member of group I collagen (fibrillar forming collagen).

Subunit structure

Trimers of one alpha 2(I) and two alpha 1(I) chains. Interacts with MRC2 (Ref.15). Interacts with TRAM2 (by similarity).

Subcellular location

Secreted > extracellular space > extracellular matrix (by similarity).

Tissue specificity

Forms the fibrils of tendon, ligaments and bones. In bones the fibrils are mineralized with calcium hydroxyapatite.

Domain

The C-terminal propeptide, also known as the COLFI domain, has crucial roles in tissue growth and repair: it controls both the intracellular assembly of procollagen molecules and the extracellular assembly of collagen fibrils. It binds a calcium ion which is essential for its function (by similarity).

Post-translational modification

Proline residues at the third position of the tripeptide repeating unit (G-X-P) are hydroxylated in some or all of the chains. Proline residues at the second position of the tripeptide repeating unit (G-P-X) are hydroxylated in some of the chains.

O-linked glycan consists of a Glc-Gal disaccharide bound to the oxygen atom of a post-translationally added hydroxyl group.

Hydroxylation on proline residues within the sequence motif, GXPG, is most likely to be 4-hydroxy as this fits the requirement for 4-hydroxylation in vertebrates.
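
The positional rule described above can be illustrated with a minimal sketch. It takes residues 168-200 from the sequence displayed in this entry (the start of the triple-helical region) and lists the prolines that fall in the second and third positions of each Gly-X-Y triplet. The function name, the choice of fragment, and the assumption that the fragment is an unbroken run of triplets beginning on glycine are made for this example only; the code reports positions and does not predict which residues are actually hydroxylated.

```python
# Illustrative sketch: locate prolines in the X and Y positions of Gly-X-Y triplets.
# FRAGMENT is residues 168-200 copied from the sequence displayed in this entry.
# The function name and the choice of fragment are assumptions for this example.

FRAGMENT_START = 168
FRAGMENT = "GPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEP"


def prolines_by_triplet_position(fragment, start):
    """Return (second_position, third_position) lists of 1-based proline coordinates.

    Assumes the fragment begins on a glycine and is an unbroken run of
    Gly-X-Y triplets; raises ValueError if a triplet does not start with G.
    """
    second, third = [], []
    for offset in range(0, len(fragment) - 2, 3):
        gly, x, y = fragment[offset:offset + 3]
        if gly != "G":
            raise ValueError(f"triplet at {start + offset} does not start with Gly")
        if x == "P":
            second.append(start + offset + 1)  # G-P-X: proline in second position
        if y == "P":
            third.append(start + offset + 2)   # G-X-P: proline in third position
    return second, third


second_pos, third_pos = prolines_by_triplet_position(FRAGMENT, FRAGMENT_START)
print("G-P-X prolines:", second_pos)
print("G-X-P prolines:", third_pos)
```

For this fragment the third-position list is 179, 182, 185, 194, 197 and 200, which coincides with the 4-hydroxyproline residues annotated in the amino acid modifications of this entry.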

Sequence similarities

Belongs to the fibrillar collagen family.

Contains 1 fibrillar collagen NC1 domain.

Contains 1 VWFC domain.

Ontologies

Keywords
   Cellular component: Extracellular matrix, Secreted
   Domain: Collagen, Repeat, Signal
   Ligand: Calcium, Metal-binding
   PTM: Disulfide bond, Glycoprotein, Hydroxylation, Pyrrolidone carboxylic acid
   Technical term: 3D-structure, Complete proteome, Direct protein sequencing, Reference proteome
Gene Ontology (GO)

GO term | Evidence | Source

   Biological process
blood vessel development | Inferred from electronic annotation | Ensembl
bone trabecula formation | Inferred from electronic annotation | Ensembl
cartilage development involved in endochondral bone morphogenesis | Inferred from electronic annotation | Ensembl
cellular response to amino acid stimulus | Inferred from electronic annotation | Ensembl
cellular response to growth factor stimulus | Inferred from expression pattern, PubMed 2016852 | RGD
cellular response to mechanical stimulus | Inferred from electronic annotation | Ensembl
cellular response to retinoic acid | Inferred from expression pattern, PubMed 1921334 | RGD
cellular response to transforming growth factor beta stimulus | Inferred from expression pattern, PubMed 19362560 | UniProtKB
collagen biosynthetic process | Inferred from electronic annotation | Ensembl
collagen fibril organization | Inferred from electronic annotation | Ensembl
embryonic skeletal system development | Inferred from electronic annotation | Ensembl
endochondral ossification | Inferred from electronic annotation | Ensembl
face morphogenesis | Inferred from electronic annotation | Ensembl
intramembranous ossification | Inferred from electronic annotation | Ensembl
negative regulation of cell-substrate adhesion | Inferred from electronic annotation | Ensembl
ossification | Inferred from expression pattern, PubMed 14595530 | RGD
osteoblast differentiation | Inferred from electronic annotation | Ensembl
positive regulation of canonical Wnt signaling pathway | Inferred from electronic annotation | Ensembl
positive regulation of cell migration | Inferred from electronic annotation | Ensembl
positive regulation of epithelial to mesenchymal transition | Inferred from electronic annotation | Ensembl
positive regulation of transcription, DNA-templated | Inferred from electronic annotation | Ensembl
protein heterotrimerization | Inferred from electronic annotation | Ensembl
protein localization to nucleus | Inferred from electronic annotation | Ensembl
protein transport | Inferred from electronic annotation | Ensembl
response to cAMP | Inferred from expression pattern, PubMed 10679825 | RGD
response to corticosteroid | Inferred from expression pattern, PubMed 14993121 | RGD
response to estradiol | Inferred from expression pattern, PubMed 2456904 | RGD
response to hydrogen peroxide | Inferred from expression pattern, PubMed 10051504 | RGD
response to inorganic substance | Inferred from expression pattern, PubMed 15552839 | RGD
response to mechanical stimulus | Inferred from expression pattern, PubMed 11748224, PubMed 18988763 | RGD
response to nutrient | Inferred from expression pattern, PubMed 15042712 | RGD
response to peptide hormone | Inferred from expression pattern, PubMed 10679825 | RGD
sensory perception of sound | Inferred from electronic annotation | Ensembl
skin morphogenesis | Inferred from electronic annotation | Ensembl
tooth mineralization | Inferred from electronic annotation | Ensembl
visual perception | Inferred from electronic annotation | Ensembl
wound healing | Inferred from mutant phenotype, PubMed 20647009 | RGD

   Cellular component
collagen type I | Inferred from electronic annotation | Ensembl
cytoplasm | Inferred from electronic annotation | Ensembl
extracellular region | Traceable author statement | Reactome
extracellular space | Inferred from direct assay, PubMed 2659889 | RGD

   Molecular function
extracellular matrix structural constituent | Inferred from electronic annotation | Ensembl
metal ion binding | Inferred from electronic annotation | UniProtKB-KW

Sequence annotation (Features)

Feature key | Position(s) | Length | Description | Feature identifier

Molecule processing

Signal peptide | 1 – 22 | 22 | Potential |
Propeptide | 23 – 151 | 129 | N-terminal propeptide | PRO_0000043358
Chain | 152 – 1207 | 1056 | Collagen alpha-1(I) chain | PRO_0000043359
Propeptide | 1208 – 1453 | 246 | C-terminal propeptide (by similarity) | PRO_0000043360
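
The Length column in the rows above follows from the 1-based, inclusive coordinates as end minus start plus one. The short sketch below recomputes the four lengths and checks that the processed segments tile the full 1453-residue precursor without gaps. The coordinates are copied from this entry; the tuple layout and the names are assumptions made for illustration.

```python
# Sketch: recompute feature lengths from 1-based inclusive coordinates.
# Coordinates are copied from the "Molecule processing" rows of this entry;
# the data layout and names are hypothetical, chosen for this example.

SEQUENCE_LENGTH = 1453

MOLECULE_PROCESSING = [
    ("Signal peptide", 1, 22),
    ("N-terminal propeptide", 23, 151),
    ("Collagen alpha-1(I) chain", 152, 1207),
    ("C-terminal propeptide", 1208, 1453),
]


def feature_length(start, end):
    """Length of a feature given 1-based inclusive start and end positions."""
    return end - start + 1


lengths = {name: feature_length(s, e) for name, s, e in MOLECULE_PROCESSING}

# Each segment should begin one residue after the previous segment ends.
contiguous = all(
    prev[2] + 1 == nxt[1]
    for prev, nxt in zip(MOLECULE_PROCESSING, MOLECULE_PROCESSING[1:])
)
total = sum(lengths.values())

for name, length in lengths.items():
    print(f"{name}: {length}")
print("contiguous:", contiguous, "| total:", total, "| expected:", SEQUENCE_LENGTH)
```

The same arithmetic applies to the Regions rows that follow, where overlapping features are allowed and no tiling check is expected to hold.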

Regions

Domain | 29 – 87 | 59 | VWFC
Domain | 1218 – 1453 | 236 | Fibrillar collagen NC1
Region | 152 – 167 | 16 | Nonhelical region (N-terminal)
Region | 168 – 1181 | 1014 | Triple-helical region
Region | 1176 – 1186 | 11 | Major antigenic determinant (of neutral salt-extracted rat skin collagen)
Region | 1182 – 1207 | 26 | Nonhelical region (C-terminal)
Motif | 734 – 736 | 3 | Cell attachment site (potential)
Motif | 1082 – 1084 | 3 | Cell attachment site (potential)
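
In the displayed sequence, the residues at both cell attachment site positions read Arg-Gly-Asp (RGD). As a hedged illustration of how such three-residue motifs map onto coordinates, the sketch below searches two short fragments copied from the sequence (residues 731-740 and 1081-1090) and reports 1-based inclusive positions. Treating RGD as the search pattern is an assumption drawn from what the sequence shows at those coordinates; the function name is hypothetical.

```python
import re

# Residues 731-740 and 1081-1090, copied from the sequence displayed in this entry.
# Keys are the 1-based coordinate of the first residue of each fragment.
FRAGMENTS = {731: "KGDRGDAGPK", 1081: "PRGDKGETGE"}


def find_motif(fragment, start, motif="RGD"):
    """Return 1-based inclusive (first, last) coordinates of each motif occurrence.

    A zero-width lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [
        (start + m.start(), start + m.start() + len(motif) - 1)
        for m in pattern.finditer(fragment)
    ]


hits = [hit for start, frag in FRAGMENTS.items() for hit in find_motif(frag, start)]
print(hits)
```

Run on the full 1453-residue sequence with a start of 1, the same function would list every RGD occurrence; only the two fragments are used here to keep the example small.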

Sites

Metal binding | 1266 | 1 | Calcium (by similarity)
Metal binding | 1268 | 1 | Calcium (by similarity)
Metal binding | 1269 | 1 | Calcium; via carbonyl oxygen (by similarity)
Metal binding | 1271 | 1 | Calcium; via carbonyl oxygen (by similarity)
Metal binding | 1274 | 1 | Calcium (by similarity)

Amino acid modifications

Modified residue | 152 | 1 | Pyrrolidone carboxylic acid (probable)
Modified residue | 160 | 1 | Allysine
Modified residue | 179 | 1 | 4-hydroxyproline (probable)
Modified residue | 182 | 1 | 4-hydroxyproline (probable)
Modified residue | 185 | 1 | 4-hydroxyproline (probable)
Modified residue | 194 | 1 | 4-hydroxyproline (probable)
Modified residue | 197 | 1 | 4-hydroxyproline (probable)
Modified residue | 200 | 1 | 4-hydroxyproline (probable)
Modified residue | 254 | 1 | 5-hydroxylysine
Modified residue | 575 | 1 | 5-hydroxylysine (probable)
Modified residue | 698 | 1 | 5-hydroxylysine (probable)
Modified residue | 1153 | 1 | 3-hydroxyproline (by similarity)
Glycosylation | 56 | 1 | N-linked (GlcNAc...) (potential)
Glycosylation | 254 | 1 | O-linked (Gal...)
Glycosylation | 1354 | 1 | N-linked (GlcNAc...) (potential)
Disulfide bond | 1248 ↔ 1280 | | By similarity
Disulfide bond | 1254 | | Interchain (with C-1271) (by similarity)
Disulfide bond | 1271 | | Interchain (with C-1254) (by similarity)
Disulfide bond | 1288 ↔ 1451 | | By similarity
Disulfide bond | 1359 ↔ 1404 | | By similarity

Experimental info

Sequence conflict | 143 | 1 | P → L in CAB01633. Ref.1
Sequence conflict | 202 | 1 | A → G in CAB01633. Ref.1
Sequence conflict | 209 | 1 | R → P in CAB01633. Ref.1
Sequence conflict | 232 | 1 | E → Q (AA sequence). Ref.7
Sequence conflict | 268 | 1 | D → N (AA sequence). Ref.8
Sequence conflict | 286 | 1 | A → T in CAB01633. Ref.1
Sequence conflict | 307 | 1 | S → T in CAB01633. Ref.1
Sequence conflict | 313 | 1 | N → D (AA sequence). Ref.9
Sequence conflict | 421 | 1 | N → T in CAB01633. Ref.1
Sequence conflict | 497 | 1 | A → S in CAB01633. Ref.1
Sequence conflict | 773 | 1 | T → A in CAB01633. Ref.1
Sequence conflict | 794 | 1 | P → A in CAB01633. Ref.1
Sequence conflict | 958 | 1 | E → K in CAB01633. Ref.1
Sequence conflict | 1111 | 1 | S → P (AA sequence). Ref.13
Sequence conflict | 1163 | 1 | S → A (AA sequence). Ref.13
Sequence conflict | 1166 | 1 | A → S (AA sequence). Ref.13
Sequence conflict | 1187 | 1 | F → L (AA sequence). Ref.14
Sequence conflict | 1190 | 1 | L → F (AA sequence). Ref.14

Sequences

P02454 [UniParc].

Last modified September 22, 2009. Version 5.
Checksum: BCDDC40C3167AE59

Length: 1,453 AA. Mass: 137,953 Da.
        10         20         30         40         50         60 
MFSFVDLRLL LLLGATALLT HGQEDIPEVS CIHNGLRVPN GETWKPDVCL ICICHNGTAV 

        70         80         90        100        110        120 
CDGVLCKEDL DCPNPQKREG ECCPFCPEEY VSPDAEVIGV EGPKGDPGPQ GPRGPVGPPG 

       130        140        150        160        170        180 
QDGIPGQPGL PGPPGPPGPP GPPGLGGNFA SQMSYGYDEK SAGVSVPGPM GPSGPRGLPG 

       190        200        210        220        230        240 
PPGAPGPQGF QGPPGEPGEP GASGPMGPRG PPGPPGKNGD DGEAGKPGRP GERGPPGPQG 

       250        260        270        280        290        300 
ARGLPGTAGL PGMKGHRGFS GLDGAKGDTG PAGPKGEPGS PGENGAPGQM GPRGLPGERG 

       310        320        330        340        350        360 
RPGPPGSAGA RGNDGAVGAA GPPGPTGPTG PPGFPGAAGA KGEAGPQGAR GSEGPQGVRG 

       370        380        390        400        410        420 
EPGPPGPAGA AGPAGNPGAD GQPGAKGANG APGIAGAPGF PGARGPSGPQ GPSGAPGPKG 

       430        440        450        460        470        480 
NSGEPGAPGN KGDTGAKGEP GPAGVQGPPG PAGEEGKRGA RGEPGPSGLP GPPGERGGPG 

       490        500        510        520        530        540 
SRGFPGADGV AGPKGPAGER GSPGPAGPKG SPGEAGRPGE AGLPGAKGLT GSPGSPGPDG 

       550        560        570        580        590        600 
KTGPPGPAGQ DGRPGPAGPP GARGQAGVMG FPGPKGTAGE PGKAGERGVP GPPGAVGPAG 

       610        620        630        640        650        660 
KDGEAGAQGA PGPAGPAGER GEQGPAGSPG FQGLPGPAGP PGEAGKPGEQ GVPGDLGAPG 

       670        680        690        700        710        720 
PSGARGERGF PGERGVQGPP GPAGPRGNNG APGNDGAKGD TGAPGAPGSQ GAPGLQGMPG 

       730        740        750        760        770        780 
ERGAAGLPGP KGDRGDAGPK GADGSPGKDG VRGLTGPIGP PGPAGAPGDK GETGPSGPAG 

       790        800        810        820        830        840 
PTGARGAPGD RGEPGPPGPA GFAGPPGADG QPGAKGEPGD TGVKGDAGPP GPAGPAGPPG 

       850        860        870        880        890        900 
PIGNVGAPGP KGSRGAAGPP GATGFPGAAG RVGPPGPSGN AGPPGPPGPV GKEGGKGPRG 

       910        920        930        940        950        960 
ETGPAGRPGE VGPPGPPGPA GEKGSPGADG PAGSPGTPGP QGIAGQRGVV GLPGQRGERG 

       970        980        990       1000       1010       1020 
FPGLPGPSGE PGKQGPSGAS GERGPPGPMG PPGLAGPPGE SGREGSPGAE GSPGRDGAPG 

      1030       1040       1050       1060       1070       1080 
AKGDRGETGP AGPPGAPGAP GAPGPVGPAG KNGDRGETGP AGPAGPIGPA GARGPAGPQG 

      1090       1100       1110       1120       1130       1140 
PRGDKGETGE QGDRGIKGHR GFSGLQGPPG SPGSPGEQGP SGASGPAGPR GPPGSAGSPG 

      1150       1160       1170       1180       1190       1200 
KDGLNGLPGP IGPPGPRGRT GDSGPAGPPG PPGPPGPPGP PSGGYDFSFL PQPPQEKSQD 

      1210       1220       1230       1240       1250       1260 
GGRYYRADDA NVVRDRDLEV DTTLKSLSQQ IENIRSPEGS RKNPARTCRD LKMCHSDWKS 

      1270       1280       1290       1300       1310       1320 
GEYWIDPNQG CNLDAIKVYC NMETGQTCVF PTQPSVPQKN WYISPNPKEK KHVWFGESMT 

      1330       1340       1350       1360       1370       1380 
DGFQFEYGSE GSDPADVAIQ LTFLRLMSTE ASQNITYHCK NSVAYMDQQT GNLKKSLLLQ 

      1390       1400       1410       1420       1430       1440 
GSNEIELRGE GNSRFTYSTL VDGCTSHTGT WGKTVIEYKT TKTSRLPIID VAPLDIGAPD 

      1450 
QEFGMDIGPA CFV 
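
The listing above interleaves ruler lines (position numbers) with residue lines split into groups of ten. A minimal sketch of turning that layout back into one continuous string is given below; it is shown on the first two residue lines only, and the function name and the rule used to recognise ruler lines (digits and spaces only) are assumptions made for this example.

```python
# Sketch: rebuild a continuous sequence from the numbered display layout.
# The function name and the ruler-detection rule are hypothetical.


def parse_sequence_block(text):
    """Join residue groups, skipping blank lines and digit-only ruler lines."""
    residues = []
    for line in text.splitlines():
        stripped = line.strip()
        # Ruler lines contain only position numbers separated by spaces.
        if not stripped or stripped.replace(" ", "").isdigit():
            continue
        residues.append(stripped.replace(" ", ""))
    return "".join(residues)


# First two residue lines of this entry, with their rulers.
SAMPLE = """\
        10         20         30         40         50         60
MFSFVDLRLL LLLGATALLT HGQEDIPEVS CIHNGLRVPN GETWKPDVCL ICICHNGTAV

        70         80         90        100        110        120
CDGVLCKEDL DCPNPQKREG ECCPFCPEEY VSPDAEVIGV EGPKGDPGPQ GPRGPVGPPG
"""

sequence = parse_sequence_block(SAMPLE)
# Python strings are 0-based, so residue 56 is sequence[55].
print(len(sequence), sequence[:10], sequence[55])
```

Because feature coordinates in this entry are 1-based and inclusive, a feature spanning positions a to b corresponds to the Python slice `sequence[a - 1:b]`.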

References

[1] "Expression of collagen alpha1(I) mRNA variants during tooth and bone formation in the rat."
Brandsten C., Lundmark C., Christersson C., Hammarstroem L., Wurtz T.
J. Dent. Res. 78:11-19(1999)
Cited for: NUCLEOTIDE SEQUENCE [MRNA].
Strain: Sprague-Dawley.
Tissue: Bone and Tooth.
[2] Mural R.J., Adams M.D., Myers E.W., Smith H.O., Venter J.C.
Submitted (JUL-2005) to the EMBL/GenBank/DDBJ databases
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
Strain: Brown Norway.
[3] "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
The MGC Project Team
Genome Res. 14:2121-2127(2004)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
Strain: Brown Norway.
Tissue: Lung.
[4] "Comparative sequence studies of rat skin and tendon collagen. II. The absence of a short sequence at the amino terminus of the skin alpha-1 chain."
Bornstein P.
Biochemistry 8:63-71(1969)
Cited for: PROTEIN SEQUENCE OF 152-170.
[5] "The amino acid sequence of peptides from the cross-linking region of rat skin collagen."
Kang A.H., Bornstein P., Piez K.A.
Biochemistry 6:788-795(1967)
Cited for: PROTEIN SEQUENCE OF 156-170.
[6] "The incomplete hydroxylation of individual prolyl residues in collagen."
Bornstein P.
J. Biol. Chem. 242:2572-2574(1967)
Cited for: PROTEIN SEQUENCE OF 171-206.
[7] "Chemical studies on the cyanogen bromide peptides of rat skin collagen. Amino acid sequence of alpha 1-CB4."
Butler W.T., Ponds S.L.
Biochemistry 10:2076-2081(1971)
Cited for: PROTEIN SEQUENCE OF 207-253.
[8] "Chemical studies on the cyanogen bromide peptides of rat skin collagen. The covalent structure of alpha 1-CB5, the major hexose-containing cyanogen bromide peptide of alpha 1."
Butler W.T.
Biochemistry 9:44-50(1970)
Cited for: PROTEIN SEQUENCE OF 254-290.
[9] "Structure of rat skin collagen alpha 1-CB8. Amino acid sequence of the hydroxylamine-produced fragment HA1."
Balian G., Click E.M., Bornstein P.
Biochemistry 10:4470-4478(1971)
Cited for: PROTEIN SEQUENCE OF 291-389.
[10] "Structure of rat skin collagen alpha 1-CBB. Amino acid sequence of the hydroxyl amine-produced fragment HA2."
Balian G., Click E.M., Hermodson M.A., Bornstein P.
Biochemistry 11:3798-3806(1972)
Cited for: PROTEIN SEQUENCE OF 390-569.
[11] "Chemical studies on the cyanogen bromide peptides of rat skin collagen. Amino acid sequence of alpha 1-CB3."
Butler W.T., Underwood S.P., Finch J.E. Jr.
Biochemistry 13:2946-2953(1974)
Cited for: PROTEIN SEQUENCE OF 570-718.
[12] "Construction of DNA sequences complementary to rat alpha 1 and alpha 2 collagen mRNA and their use in studying the regulation of type I collagen synthesis by 1,25-dihydroxyvitamin D."
Genovese C., Rowe D., Kream B.
Biochemistry 23:6210-6216(1984)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 680-718.
[13] "Structural and immunogenic properties of a major antigenic determinant in neutral salt-extracted rat-skin collagen."
Stoltz M., Timpl R., Furthmayr H., Kuehn K.
Eur. J. Biochem. 37:287-294(1973)
Cited for: PROTEIN SEQUENCE OF 1103-1186.
[14] "Non-helical regions in rat collagen alpha 1-chain."
Stoltz M., Timpl R., Kuehn K.
FEBS Lett. 26:61-65(1972)
Cited for: PROTEIN SEQUENCE OF 1186-1206.
[15] "Endo180 binds to the C-terminal region of type I collagen."
Thomas E.K., Nakamura M., Wienke D., Isacke C.M., Pozzi A., Liang P.
J. Biol. Chem. 280:22596-22605(2005)
Cited for: INTERACTION WITH MRC2.
Strain: Sprague-Dawley.
[16] "A new procedure for rapid, high yield purification of Type I collagen for tissue engineering."
Xiong X., Ghosh R., Hiller E., Drepper F., Knapp B., Brunner H., Rupp S.
Process Biochem. 44:1200-1212(2009)
Cited for: IDENTIFICATION BY MASS SPECTROMETRY, PHOSPHORYLATION.
[17] "Microfibrillar structure of type I collagen in situ."
Orgel J.P.R.O., Irving T.C., Miller A., Wess T.J.
Proc. Natl. Acad. Sci. U.S.A. 103:9001-9005(2006)
Cited for: X-RAY CRYSTALLOGRAPHY (5.16 ANGSTROMS) OF 152-1207.

Cross-references

Sequence databases

EMBL/GenBank/DDBJ: Z78279 mRNA. Translation: CAB01633.1.
EMBL/GenBank/DDBJ: CH473948 Genomic DNA. Translation: EDM05727.1.
EMBL/GenBank/DDBJ: BC133728 mRNA. Translation: AAI33729.1.
EMBL/GenBank/DDBJ: M11432 mRNA. Translation: AAA40832.1. Sequence problems.
PIR: CGRT1S. A90559.
RefSeq: NP_445756.1. NM_053304.1.
UniGene: Rn.2953.

3D structure databases

PDBe / RCSB PDB / PDBj:
Entry | Method | Resolution (Å) | Chain | Positions
3HQV | fiber diffraction | 5.16 | A/C | 152-1207
3HR2 | fiber diffraction | 5.16 | A/C | 152-1207
ProteinModelPortal: P02454.

Protein-protein interaction databases

IntAct: P02454. 2 interactions.

Proteomic databases

PaxDb: P02454.
PRIDE: P02454.

Genome annotation databases

Ensembl: ENSRNOT00000005311; ENSRNOP00000005311; ENSRNOG00000003897.
GeneID: 29393.
KEGG: rno:29393.
UCSC: RGD:61817. rat.

Organism-specific databases

CTD: 1277.
RGD: 61817. Col1a1.

Phylogenomic databases

eggNOG: NOG12793.
GeneTree: ENSGT00740000114967.
HOGENOM: HOG000085654.
HOVERGEN: HBG004933.
InParanoid: A3KNA1.
KO: K06236.
OMA: GSTMGTD.
OrthoDB: EOG7TJ3HH.
PhylomeDB: P02454.
TreeFam: TF344135.

Enzyme and pathway databases

Reactome: REACT_150387. Gelatin degradation by MMP19.
Reactome: REACT_195275. Extracellular matrix organization.
Reactome: REACT_196873. Extracellular matrix organization.

Gene expression databases

Genevestigator: P02454.

Family and domain databases

InterPro: IPR008160. Collagen.
InterPro: IPR000885. Fib_collagen_C.
InterPro: IPR001007. VWF_C.
Pfam: PF01410. COLFI. 1 hit.
Pfam: PF01391. Collagen. 12 hits.
Pfam: PF00093. VWC. 1 hit.
ProDom: PD002078. Fib_collagen_C. 1 hit.
SMART: SM00038. COLFI. 1 hit.
SMART: SM00214. VWC. 1 hit.
PROSITE: PS51461. NC1_FIB. 1 hit.
PROSITE: PS01208. VWFC_1. 1 hit.
PROSITE: PS50184. VWFC_2. 1 hit.

Other

EvolutionaryTrace: P02454.
NextBio: 609017.
PRO: P02454.

Entry information

Entry name: CO1A1_RAT
Accession
Primary (citable) accession number: P02454
Secondary accession number(s): A3KNA1, P02455, Q63079
Entry history
Integrated into UniProtKB/Swiss-Prot: July 21, 1986
Last sequence update: September 22, 2009
Last modified: April 16, 2014
This is version 126 of the entry and version 5 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Relevant documents

SIMILARITY comments

Index of protein domains and families

PDB cross-references

Index of Protein Data Bank (PDB) cross-references