P11087 (CO1A1_MOUSE) Reviewed, UniProtKB/Swiss-Prot

Last modified April 16, 2014. Version 145.

Names and origin

Protein names
Recommended name: Collagen alpha-1(I) chain
Alternative name(s): Alpha-1 type I collagen
Gene names
Name: Col1a1
Synonyms: Cola1
Organism: Mus musculus (Mouse) [Reference proteome]
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus

Protein attributes

Sequence length: 1453 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level.

General annotation (Comments)

Function

Type I collagen is a member of group I collagen (fibrillar forming collagen).

Subunit structure

Trimers of one alpha 2(I) and two alpha 1(I) chains. Interacts with MRC2 (by similarity). Interacts with TRAM2 (by similarity).

Subcellular location

Secreted, extracellular space, extracellular matrix (by similarity).

Tissue specificity

Forms the fibrils of tendon, ligaments and bones. In bones the fibrils are mineralized with calcium hydroxyapatite.

Domain

The C-terminal propeptide, also known as the COLFI domain, has crucial roles in tissue growth and repair: it controls both the intracellular assembly of procollagen molecules and the extracellular assembly of collagen fibrils. It binds a calcium ion that is essential for its function (by similarity).

Post-translational modification

Proline residues at the third position of the tripeptide repeating unit (G-X-P) are hydroxylated in some or all of the chains. Proline residues at the second position of the tripeptide repeating unit (G-P-X) are hydroxylated in some of the chains.
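
The two proline positions described above can be located mechanically in a G-X-Y repeat. The following is a minimal Python sketch, not part of the entry: the function name `candidate_prolines` is hypothetical, and the fragment is residues 168-188 of the displayed isoform 1 sequence (the start of the triple-helical region). It only lists prolines that occupy the second or third position of a triplet; the entry does not state which of them are actually hydroxylated.

```python
def candidate_prolines(fragment: str, first_position: int):
    """List prolines at the 2nd (G-P-X) and 3rd (G-X-P) triplet positions.

    `fragment` must start on the glycine of a G-X-Y triplet;
    `first_position` is the 1-based position of that glycine.
    """
    second, third = [], []
    for i in range(0, len(fragment) - 2, 3):
        g, x, y = fragment[i:i + 3]
        if g != "G":
            continue  # triplet is out of frame or not a collagen repeat
        if x == "P":
            second.append(first_position + i + 1)
        if y == "P":
            third.append(first_position + i + 2)
    return second, third


# Residues 168-188 of the displayed sequence (assumed to be in frame).
fragment = "GPMGPSGPRGLPGPPGAPGPQ"
second, third = candidate_prolines(fragment, 168)
print("G-P-X prolines:", second)
print("G-X-P prolines:", third)
```

The positions returned are candidates only; confirming hydroxylation would need experimental evidence not given in this entry.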

Sequence similarities

Belongs to the fibrillar collagen family.

Contains 1 fibrillar collagen NC1 domain.

Contains 1 VWFC domain.

Sequence caution

The sequence CAA38657.1 differs from that shown. Reason: Erroneous gene model prediction.

Ontologies

Keywords
   Cellular component: Extracellular matrix, Secreted
   Coding sequence diversity: Alternative splicing
   Domain: Collagen, Repeat, Signal
   Ligand: Calcium, Metal-binding
   PTM: Disulfide bond, Glycoprotein, Hydroxylation, Pyrrolidone carboxylic acid
   Technical term: Complete proteome, Reference proteome
Gene Ontology (GO)
   Biological_process

blood vessel development

Inferred from mutant phenotype PubMed 14630726. Source: MGI

bone trabecula formation

Inferred from genetic interaction PubMed 17440987. Source: MGI

cartilage development involved in endochondral bone morphogenesis

Inferred from mutant phenotype PubMed 18248096. Source: MGI

cellular response to amino acid stimulus

Inferred from direct assay PubMed 20548288. Source: MGI

cellular response to mechanical stimulus

Inferred from expression pattern PubMed 21625049. Source: UniProtKB

cellular response to retinoic acid

Inferred from electronic annotation. Source: Ensembl

cellular response to transforming growth factor beta stimulus

Inferred from electronic annotation. Source: Ensembl

collagen biosynthetic process

Inferred from electronic annotation. Source: Ensembl

collagen fibril organization

Inferred from electronic annotation. Source: Ensembl

embryonic skeletal system development

Inferred from electronic annotation. Source: Ensembl

endochondral ossification

Inferred from mutant phenotype PubMed 8130375. Source: MGI

face morphogenesis

Inferred from genetic interaction PubMed 17440987. Source: MGI

intramembranous ossification

Inferred from genetic interaction PubMed 17440987. Source: MGI

negative regulation of cell-substrate adhesion

Inferred from direct assay PubMed 17018525. Source: MGI

osteoblast differentiation

Inferred from expression pattern PubMed 16311053. Source: BHF-UCL

positive regulation of canonical Wnt signaling pathway

Inferred from electronic annotation. Source: Ensembl

positive regulation of cell migration

Inferred from electronic annotation. Source: Ensembl

positive regulation of epithelial to mesenchymal transition

Inferred from electronic annotation. Source: Ensembl

positive regulation of transcription, DNA-templated

Inferred from electronic annotation. Source: Ensembl

protein heterotrimerization

Inferred from direct assay PubMed 9213002. Source: MGI

protein localization to nucleus

Inferred from electronic annotation. Source: Ensembl

protein transport

Inferred from mutant phenotype PubMed 8906420. Source: MGI

response to cAMP

Inferred from electronic annotation. Source: Ensembl

response to corticosteroid

Inferred from electronic annotation. Source: Ensembl

response to estradiol

Inferred from electronic annotation. Source: Ensembl

response to hydrogen peroxide

Inferred from electronic annotation. Source: Ensembl

response to nutrient

Inferred from electronic annotation. Source: Ensembl

response to peptide hormone

Inferred from electronic annotation. Source: Ensembl

sensory perception of sound

Inferred from electronic annotation. Source: Ensembl

skeletal system development

Inferred from mutant phenotype PubMed 8163675. Source: MGI

skeletal system morphogenesis

Inferred from genetic interaction PubMed 17440987. Source: MGI

skin development

Inferred from mutant phenotype PubMed 18248096. Source: MGI

skin morphogenesis

Inferred from electronic annotation. Source: Ensembl

tooth mineralization

Inferred from electronic annotation. Source: Ensembl

visual perception

Inferred from electronic annotation. Source: Ensembl

wound healing

Inferred from electronic annotation. Source: Ensembl

   Cellular_component

collagen

Inferred from direct assay PubMed 10608859, PubMed 15383546, PubMed 17662583, PubMed 7704020, PubMed 8686743. Source: MGI

collagen type I

Inferred from direct assay PubMed 8018053. Source: MGI

cytoplasm

Inferred from direct assay PubMed 8018053. Source: MGI

extracellular matrix

Inferred from direct assay PubMed 17029294, PubMed 2209468, PubMed 7573384, PubMed 7589890, PubMed 7629088, PubMed 8906420, PubMed 8984825. Source: MGI

extracellular space

Inferred from electronic annotation. Source: Ensembl

proteinaceous extracellular matrix

Inferred from direct assay PubMed 10878613. Source: MGI

   Molecular_function

extracellular matrix structural constituent

Inferred from direct assay PubMed 15383546. Source: MGI

metal ion binding

Inferred from electronic annotation. Source: UniProtKB-KW

Alternative products

This entry describes 2 isoforms produced by alternative splicing.
Isoform 1 (identifier: P11087-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Isoform 2 (identifier: P11087-2)

The sequence of this isoform differs from the canonical sequence as follows:
     803-1030: Missing.
Note: No experimental confirmation available.
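
The relationship between the two isoforms can be sketched in a few lines of Python. This is an illustrative sketch rather than anything UniProt provides: `apply_missing` is a hypothetical helper, and a placeholder string of the canonical length stands in for the real isoform 1 sequence. Removing residues 803-1030 (1-based, inclusive) from a 1453-residue sequence leaves 1225 residues, which matches the length listed for isoform 2.

```python
def apply_missing(canonical: str, start: int, end: int) -> str:
    """Return `canonical` with residues start..end (1-based, inclusive) removed."""
    if not 1 <= start <= end <= len(canonical):
        raise ValueError("range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return canonical[:start - 1] + canonical[end:]


# Placeholder with the canonical length; a real run would use the
# isoform 1 sequence shown in the Sequences section.
canonical = "A" * 1453
isoform2 = apply_missing(canonical, 803, 1030)
print(len(isoform2))  # 1225
```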

Sequence annotation (Features)

Feature key           Position(s)    Length  Description                Feature identifier

Molecule processing

Signal peptide        1 – 22         22
Propeptide            23 – 151       129     N-terminal propeptide      PRO_0000005722
Chain                 152 – 1207     1056    Collagen alpha-1(I) chain  PRO_0000005723
Propeptide            1208 – 1453    246     C-terminal propeptide      PRO_0000005724
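
The molecule processing features above should tile the 1453-residue precursor without gaps or overlaps, and each listed length should equal end - start + 1. A small Python sketch of that consistency check follows; the tuple layout and the name `check_tiling` are illustrative assumptions, and the coordinates are copied from the rows above.

```python
SEQUENCE_LENGTH = 1453

# (feature key, start, end, listed length), 1-based inclusive coordinates.
PROCESSING = [
    ("Signal peptide", 1, 22, 22),
    ("Propeptide", 23, 151, 129),
    ("Chain", 152, 1207, 1056),
    ("Propeptide", 1208, 1453, 246),
]


def check_tiling(features, total_length):
    """Return True if the features cover 1..total_length contiguously."""
    expected_start = 1
    for key, start, end, length in features:
        if end - start + 1 != length:
            raise ValueError(f"{key}: listed length {length} does not match range")
        if start != expected_start:
            raise ValueError(f"{key}: gap or overlap before position {start}")
        expected_start = end + 1
    if expected_start != total_length + 1:
        raise ValueError("features do not reach the end of the sequence")
    return True


print(check_tiling(PROCESSING, SEQUENCE_LENGTH))  # True
```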

Regions

Domain                29 – 87        59      VWFC
Domain                1218 – 1453    236     Fibrillar collagen NC1
Region                152 – 167      16      Nonhelical region (N-terminal)
Region                168 – 1181     1014    Triple-helical region
Region                1182 – 1207    26      Nonhelical region (C-terminal)
Motif                 734 – 736      3       Cell attachment site (potential)
Motif                 1082 – 1084    3       Cell attachment site (potential)
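
In the displayed isoform 1 sequence, both cell attachment sites read R-G-D. The Python sketch below shows one way to recover their 1-based coordinates from sequence fragments. The helper `find_motif` is hypothetical, and the two fragments are residues 721-740 and 1081-1090 copied from the Sequences section.

```python
def find_motif(fragment: str, motif: str, first_position: int):
    """Return (start, end) pairs, 1-based inclusive, for each motif occurrence.

    `first_position` is the sequence position of fragment[0].
    """
    hits = []
    index = fragment.find(motif)
    while index != -1:
        start = first_position + index
        hits.append((start, start + len(motif) - 1))
        index = fragment.find(motif, index + 1)  # also finds overlapping hits
    return hits


print(find_motif("ERGAAGLPGPKGDRGDAGPK", "RGD", 721))  # [(734, 736)]
print(find_motif("PRGDKGETGE", "RGD", 1081))            # [(1082, 1084)]
```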

Sites

Metal binding         1266           1       Calcium (by similarity)
Metal binding         1268           1       Calcium (by similarity)
Metal binding         1269           1       Calcium; via carbonyl oxygen (by similarity)
Metal binding         1271           1       Calcium; via carbonyl oxygen (by similarity)
Metal binding         1274           1       Calcium (by similarity)

Amino acid modifications

Modified residue      152            1       Pyrrolidone carboxylic acid (by similarity)
Modified residue      160            1       Allysine (by similarity)
Modified residue      254            1       5-hydroxylysine; alternate (by similarity)
Modified residue      1153           1       3-hydroxyproline (by similarity)
Glycosylation         56             1       N-linked (GlcNAc...) (potential)
Glycosylation         254            1       O-linked (Gal...); alternate (by similarity)
Glycosylation         1354           1       N-linked (GlcNAc...) Ref.12
Disulfide bond        1248 ↔ 1280            (by similarity)
Disulfide bond        1254                   Interchain (with C-1271) (by similarity)
Disulfide bond        1271                   Interchain (with C-1254) (by similarity)
Disulfide bond        1288 ↔ 1451            (by similarity)
Disulfide bond        1359 ↔ 1404            (by similarity)

Natural variations

Alternative sequence  803 – 1030     228     Missing in isoform 2.      VSP_016548

Experimental info

Sequence conflict     81             1       E → G in AAA88912. Ref.1
Sequence conflict     106            1       D → G in AAA88912. Ref.1
Sequence conflict     136            1       P → H in AAH59281. Ref.3
Sequence conflict     1202           1       G → D in AAA88912. Ref.1
Sequence conflict     1219           1       E → A in AAA88912. Ref.1
Sequence conflict     1222           1       T → A in AAA88912. Ref.1
Sequence conflict     1335           1       A → T in AAA88912. Ref.1
Sequence conflict     1399 – 1400    2       TL → RV in AAA88912. Ref.1
Sequence conflict     1450           1       A → V in CAA29927. Ref.10
Sequence conflict     1450           1       A → V in CAA33904. Ref.10

Sequences

Isoform 1 [UniParc].

Last modified December 6, 2005. Version 4.
Checksum: 0B7F06BBB9A1D5EA

Length: 1,453 AA. Mass: 138,032 Da.
        10         20         30         40         50         60 
MFSFVDLRLL LLLGATALLT HGQEDIPEVS CIHNGLRVPN GETWKPEVCL ICICHNGTAV 

        70         80         90        100        110        120 
CDDVQCNEEL DCPNPQRREG ECCAFCPEEY VSPNSEDVGV EGPKGDPGPQ GPRGPVGPPG 

       130        140        150        160        170        180 
RDGIPGQPGL PGPPGPPGPP GPPGLGGNFA SQMSYGYDEK SAGVSVPGPM GPSGPRGLPG 

       190        200        210        220        230        240 
PPGAPGPQGF QGPPGEPGEP GGSGPMGPRG PPGPPGKNGD DGEAGKPGRP GERGPPGPQG 

       250        260        270        280        290        300 
ARGLPGTAGL PGMKGHRGFS GLDGAKGDAG PAGPKGEPGS PGENGAPGQM GPRGLPGERG 

       310        320        330        340        350        360 
RPGPPGTAGA RGNDGAVGAA GPPGPTGPTG PPGFPGAVGA KGEAGPQGAR GSEGPQGVRG 

       370        380        390        400        410        420 
EPGPPGPAGA AGPAGNPGAD GQPGAKGANG APGIAGAPGF PGARGPSGPQ GPSGPPGPKG 

       430        440        450        460        470        480 
NSGEPGAPGN KGDTGAKGEP GATGVQGPPG PAGEEGKRGA RGEPGPSGLP GPPGERGGPG 

       490        500        510        520        530        540 
SRGFPGADGV AGPKGPSGER GAPGPAGPKG SPGEAGRPGE AGLPGAKGLT GSPGSPGPDG 

       550        560        570        580        590        600 
KTGPPGPAGQ DGRPGPAGPP GARGQAGVMG FPGPKGTAGE PGKAGERGLP GPPGAVGPAG 

       610        620        630        640        650        660 
KDGEAGAQGA PGPAGPAGER GEQGPAGSPG FQGLPGPAGP PGEAGKPGEQ GVPGDLGAPG 

       670        680        690        700        710        720 
PSGARGERGF PGERGVQGPP GPAGPRGNNG APGNDGAKGD TGAPGAPGSQ GAPGLQGMPG 

       730        740        750        760        770        780 
ERGAAGLPGP KGDRGDAGPK GADGSPGKDG ARGLTGPIGP PGPAGAPGDK GEAGPSGPPG 

       790        800        810        820        830        840 
PTGARGAPGD RGEAGPPGPA GFAGPPGADG QPGAKGEPGD TGVKGDAGPP GPAGPAGPPG 

       850        860        870        880        890        900 
PIGNVGAPGP KGPRGAAGPP GATGFPGAAG RVGPPGPSGN AGPPGPPGPV GKEGGKGPRG 

       910        920        930        940        950        960 
ETGPAGRPGE VGPPGPPGPA GEKGSPGADG PAGSPGTPGP QGIAGQRGVV GLPGQRGERG 

       970        980        990       1000       1010       1020 
FPGLPGPSGE PGKQGPSGSS GERGPPGPMG PPGLAGPPGE SGREGSPGAE GSPGRDGAPG 

      1030       1040       1050       1060       1070       1080 
AKGDRGETGP AGPPGAPGAP GAPGPVGPAG KNGDRGETGP AGPAGPIGPA GARGPAGPQG 

      1090       1100       1110       1120       1130       1140 
PRGDKGETGE QGDRGIKGHR GFSGLQGPPG SPGSPGEQGP SGASGPAGPR GPPGSAGSPG 

      1150       1160       1170       1180       1190       1200 
KDGLNGLPGP IGPPGPRGRT GDSGPAGPPG PPGPPGPPGP PSGGYDFSFL PQPPQEKSQD 

      1210       1220       1230       1240       1250       1260 
GGRYYRADDA NVVRDRDLEV DTTLKSLSQQ IENIRSPEGS RKNPARTCRD LKMCHSDWKS 

      1270       1280       1290       1300       1310       1320 
GEYWIDPNQG CNLDAIKVYC NMETGQTCVF PTQPSVPQKN WYISPNPKEK KHVWFGESMT 

      1330       1340       1350       1360       1370       1380 
DGFPFEYGSE GSDPADVAIQ LTFLRLMSTE ASQNITYHCK NSVAYMDQQT GNLKKALLLQ 

      1390       1400       1410       1420       1430       1440 
GSNEIELRGE GNSRFTYSTL VDGCTSHTGT WGKTVIEYKT TKTSRLPIID VAPLDIGAPD 

      1450 
QEFGLDIGPA CFV 
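
The sequence above is laid out in blocks of ten residues under a ruler of position numbers, so it needs light cleanup before it can be used programmatically. A minimal Python sketch, assuming only that layout: the function name `parse_sequence_block` is hypothetical, and the sample text is the first thirty residues of the displayed sequence.

```python
def parse_sequence_block(text: str) -> str:
    """Join a ruler-annotated sequence display into one residue string."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and ruler lines, which contain only numbers.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)


sample = """
        10         20         30
MFSFVDLRLL LLLGATALLT HGQEDIPEVS
"""
sequence = parse_sequence_block(sample)
print(sequence)       # MFSFVDLRLLLLLGATALLTHGQEDIPEVS
print(len(sequence))  # 30
```

Applied to the full block, the result should have the 1,453 residues stated for isoform 1; comparing `len(sequence)` with that figure is a quick check that nothing was lost in copying.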

Isoform 2 [UniParc].

Checksum: 5EBF34F79AD32E86

Length: 1,225 AA. Mass: 117,820 Da.

References

[1] "The complete cDNA coding sequence for the mouse pro alpha 1(I) chain of type I procollagen."
Li S.W., Khillan J., Prockop D.J.
Matrix Biol. 14:593-595(1995)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1).
Strain: FVB/N.
[2] "Lineage-specific biology revealed by a finished genome assembly of the mouse."
Church D.M., Goodstadt L., Hillier L.W., Zody M.C., Goldstein S., She X., Bult C.J., Agarwala R., Cherry J.L., DiCuccio M., Hlavina W., Kapustin Y., Meric P., Maglott D., Birtle Z., Marques A.C., Graves T., Zhou S., Teague B., Potamousis K., Churas C., Place M., Herschleb J., Runnheim R., Forrest D., Amos-Landgraf J., Schwartz D.C., Cheng Z., Lindblad-Toh K., Eichler E.E., Ponting C.P.
PLoS Biol. 7:E1000112-E1000112(2009)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
Strain: C57BL/6J.
[3] "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
The MGC Project Team
Genome Res. 14:2121-2127(2004)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1 AND 2).
Strain: FVB/N.
Tissue: Colon.
[4] "Insertion of retrovirus into the first intron of alpha1(I) collagen gene leads to embryonic lethal mutation in mice."
Harbers K., Kuehn M., Delius H., Jaenisch R.
Proc. Natl. Acad. Sci. U.S.A. 81:1504-1508(1984)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 1-25.
[5] "Genomic sequence of mouse COL1A1 encoding the collagen propeptides."
Fenton S.P., Lamande S.R., Hannagan M., Stacey A., Jaenisch R., Bateman J.F.
Biochim. Biophys. Acta 1216:469-474(1993)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 1-185 AND 1030-1453.
[6] "DNA methylation represses the murine alpha 1(I) collagen promoter by an indirect mechanism."
Rhodes K., Rippe R.A., Umezawa A., Nehls M., Brenner D.A., Breindl M.
Mol. Cell. Biol. 14:5950-5960(1994)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 1-942.
Strain: C57BL/6.
Tissue: Liver.
[7] "Nucleotide sequence of a cDNA clone for mouse pro alpha 1(I) collagen protein."
French B.T., Lee W.-H., Maul G.G.
Gene 39:311-312(1985)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 518-1128 (ISOFORM 1).
[8] "DNA sequence analysis of a mouse pro alpha 1 (I) procollagen gene: evidence for a mouse B1 element within the gene."
Monson J.M., Friedman J., McCarthy B.J.
Mol. Cell. Biol. 2:1362-1371(1982)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 735-1130.
[9] "Identification of a Balb/c mouse pro alpha 1(I) procollagen gene: evidence for insertions or deletions in gene coding sequences."
Monson J.M., McCarthy B.J.
DNA 1:59-69(1981)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 735-878 AND 1005-1058.
[10] "Two mRNAs of mouse pro alpha 1(I) collagen gene differ in the size of the 3'-untranslated region."
Mooslehner K., Harbers K.
Nucleic Acids Res. 16:773-773(1988)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 1442-1453.
[11] "Specific hybridization probes for mouse type I, II, III and IX collagen mRNAs."
Metsaeranta M., Toman D., de Crombrugghe B., Vuorio E.
Biochim. Biophys. Acta 1089:241-243(1991)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 1442-1453.
[12] "Enhanced analysis of the mouse plasma proteome using cysteine-containing tryptic glycopeptides."
Bernhard O.K., Kapp E.A., Simpson R.J.
J. Proteome Res. 6:987-995(2007)
Cited for: GLYCOSYLATION [LARGE SCALE ANALYSIS] AT ASN-1354.
Strain: C57BL/6.
Tissue: Plasma.

Cross-references

Sequence databases

EMBL / GenBank / DDBJ:
U08020 mRNA. Translation: AAA88912.1.
AL662790, AL606480 Genomic DNA. Translation: CAI25880.1.
AL606480, AL662790 Genomic DNA. Translation: CAI23970.1.
BC050014 mRNA. Translation: AAH50014.1.
BC059281 mRNA. Translation: AAH59281.1.
K01688 Genomic DNA. Translation: AAA37330.1.
S67530 Genomic DNA. Translation: AAB29424.1.
S67482 Genomic DNA. No translation available.
X54876 Genomic DNA. Translation: CAA38657.1. Sequence problems.
M14423 mRNA. Translation: AAA37333.1.
M17491 Genomic DNA. Translation: AAA37334.1.
K03036, K03029, K03030, K03031, K03032, K03033, K03034, K03035 Genomic DNA. Translation: AAA37332.1.
X06753 Genomic DNA. Translation: CAA29927.1.
X15896 Genomic DNA. Translation: CAA33904.1.
X57981 Genomic DNA. Translation: CAA41046.1.
PIR: I49558. S21626. S57243.
RefSeq: NP_031768.2. NM_007742.3.
UniGene: Mm.277735. Mm.458212.

3D structure databases

ProteinModelPortal: P11087.
SMR: P11087. Positions 30-89, 184-220, 1236-1453.

Protein-protein interaction databases

BioGrid: 198831. 7 interactions.
IntAct: P11087. 2 interactions.
MINT: MINT-4091294.
STRING: 10090.ENSMUSP00000001547.

PTM databases

PhosphoSite: P11087.

Proteomic databases

PaxDb: P11087.
PRIDE: P11087.

Genome annotation databases

Ensembl: ENSMUST00000001547; ENSMUSP00000001547; ENSMUSG00000001506. [P11087-1]
GeneID: 12842.
KEGG: mmu:12842.
UCSC: uc007kzn.1. mouse. [P11087-1]

Organism-specific databases

CTD: 1277.
MGI: MGI:88467. Col1a1.

Phylogenomic databases

eggNOG: NOG12793.
GeneTree: ENSGT00740000114967.
HOVERGEN: HBG004933.
InParanoid: P11087.
KO: K06236.
OMA: KNCPGAQ.
OrthoDB: EOG7TJ3HH.
PhylomeDB: P11087.
TreeFam: TF344135.

Gene expression databases

ArrayExpress: P11087.
Bgee: P11087.
CleanEx: MM_COL1A1.
Genevestigator: P11087.

Family and domain databases

InterPro: IPR008160. Collagen.
   IPR000885. Fib_collagen_C.
   IPR001007. VWF_C.
Pfam: PF01410. COLFI. 1 hit.
   PF01391. Collagen. 9 hits.
   PF00093. VWC. 1 hit.
ProDom: PD002078. Fib_collagen_C. 1 hit.
SMART: SM00038. COLFI. 1 hit.
   SM00214. VWC. 1 hit.
PROSITE: PS51461. NC1_FIB. 1 hit.
   PS01208. VWFC_1. 1 hit.
   PS50184. VWFC_2. 1 hit.

Other

ChiTaRS: COL1A1. mouse.
NextBio: 282376.
PMAP-CutDB: P11087.
PRO: P11087.

Entry information

Entry name: CO1A1_MOUSE
Accession
Primary (citable) accession number: P11087
Secondary accession number(s): Q53WT0, Q60635, Q61367, Q61427, Q63919, Q6PCL3, Q810J9
Entry history
Integrated into UniProtKB/Swiss-Prot: July 1, 1989
Last sequence update: December 6, 2005
Last modified: April 16, 2014
This is version 145 of the entry and version 4 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Relevant documents

SIMILARITY comments: Index of protein domains and families

MGD cross-references: Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot