Q9YIB4 (CO1A1_CYNPY) Reviewed, UniProtKB/Swiss-Prot

Last modified July 9, 2014. Version 56.

Names and origin

Protein names
  Recommended name: Collagen alpha-1(I) chain
  Alternative name(s): Alpha-1 type I collagen
Gene names
  Name: COL1A1
Organism: Cynops pyrrhogaster (Japanese common newt)
Taxonomic identifier: 8330 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Amphibia > Batrachia > Caudata > Salamandroidea > Salamandridae > Pleurodelinae > Cynops

Protein attributes

Sequence length: 1450 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at transcript level.

General annotation (Comments)

Function

Type I collagen is a member of group I collagen (fibrillar forming collagen). (By similarity)

Subunit structure

Trimers of one alpha 2(I) and two alpha 1(I) chains (By similarity).

Subcellular location

Secreted, extracellular space, extracellular matrix (By similarity).

Domain

The C-terminal propeptide, also known as the COLFI domain, has crucial roles in tissue growth and repair by controlling both the intracellular assembly of procollagen molecules and the extracellular assembly of collagen fibrils. It binds a calcium ion which is essential for its function (By similarity).

Post-translational modification

Proline residues at the third position of the tripeptide repeating unit (G-X-Y) are hydroxylated in some or all of the chains (By similarity).
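As a rough illustration of the G-X-Y repeating unit mentioned above, the following Python sketch splits a short excerpt of the displayed sequence (residues 165 to 191, copied from the Sequences section) into triplets and lists the prolines found at the third (Y) position. The function names and the choice of excerpt are illustrative assumptions. The entry does not list individual hydroxylation sites, so the result is only a set of candidate positions, not annotated modifications.

```python
# Excerpt of Q9YIB4, residues 165-191, copied from the displayed sequence.
EXCERPT_START = 165
EXCERPT = "GPMGPMGPRGPPGPSGSPGPQGFQGPS"


def split_triplets(seq):
    """Split a sequence into consecutive 3-residue units (G-X-Y frame)."""
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing unit
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def y_position_prolines(seq, start):
    """Return 1-based positions of Pro at the third (Y) position of G-X-Y units."""
    positions = []
    for index, triplet in enumerate(split_triplets(seq)):
        if triplet[0] == "G" and triplet[2] == "P":
            positions.append(start + index * 3 + 2)
    return positions


triplets = split_triplets(EXCERPT)
candidates = y_position_prolines(EXCERPT, EXCERPT_START)
print(triplets)
print(candidates)
```

Whether any given candidate is actually hydroxylated in this species cannot be read from the sequence alone; the annotation is inferred by similarity.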

Sequence similarities

Belongs to the fibrillar collagen family.

Contains 1 fibrillar collagen NC1 domain.

Contains 1 VWFC domain.

Sequence annotation (Features)

Feature key        Position(s)   Length  Description                                     Feature identifier

Molecule processing

Signal peptide     1-22          22      (Potential)
Propeptide         23-148        126     N-terminal propeptide (By similarity)           PRO_0000286160
Chain              149-1202      1054    Collagen alpha-1(I) chain                       PRO_0000286161
Propeptide         1203-1450     248     C-terminal propeptide (By similarity)           PRO_0000286162
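The length column follows from the inclusive 1-based coordinates (length = end - start + 1). A minimal Python sketch, assuming the four molecule-processing coordinates listed above, recomputes each length and checks that the features cover residues 1 to 1450 without gaps or overlaps. The helper names are hypothetical and not part of any UniProt tool.

```python
SEQUENCE_LENGTH = 1450

# (feature key, start, end) with 1-based inclusive coordinates from the entry.
features = [
    ("Signal peptide", 1, 22),
    ("Propeptide (N-terminal)", 23, 148),
    ("Chain", 149, 1202),
    ("Propeptide (C-terminal)", 1203, 1450),
]


def feature_length(start, end):
    """Length of a feature given inclusive 1-based coordinates."""
    return end - start + 1


def tiles_sequence(feature_list, total_length):
    """True if the features cover 1..total_length with no gaps or overlaps."""
    expected_start = 1
    for _, start, end in sorted(feature_list, key=lambda f: f[1]):
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


lengths = [feature_length(start, end) for _, start, end in features]
print(lengths)
print(tiles_sequence(features, SEQUENCE_LENGTH))
```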

Regions

Domain             31-90         60      VWFC
Domain             1215-1450     236     Fibrillar collagen NC1

Sites

Metal binding      1263          1       Calcium (By similarity)
Metal binding      1265          1       Calcium (By similarity)
Metal binding      1266          1       Calcium; via carbonyl oxygen (By similarity)
Metal binding      1268          1       Calcium; via carbonyl oxygen (By similarity)
Metal binding      1271          1       Calcium (By similarity)

Amino acid modifications

Modified residue   149           1       Pyrrolidone carboxylic acid (By similarity)
Glycosylation      1351          1       N-linked (GlcNAc...) (Potential)
Disulfide bond     1245 ↔ 1277           (By similarity)
Disulfide bond     1251                  Interchain (with C-1268) (By similarity)
Disulfide bond     1268                  Interchain (with C-1251) (By similarity)
Disulfide bond     1285 ↔ 1448           (By similarity)
Disulfide bond     1356 ↔ 1401           (By similarity)
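One simple sanity check on positional annotations is to confirm that each disulfide position points at a cysteine. The sketch below is an assumption-laden illustration: it uses only a 50-residue excerpt (residues 1241 to 1290, copied from the displayed sequence), so it covers the disulfide positions that fall inside that window. The helper `residue_at` is hypothetical.

```python
# Excerpt of Q9YIB4, residues 1241-1290, copied from the displayed sequence.
EXCERPT_START = 1241
EXCERPT = "PARTCRDLKMCHSDWKSGDYWIDPNQGCNLDAIKVHCNMETGETCVYPSQ"

# Disulfide-bonded positions from the entry that fall inside the excerpt.
CYS_POSITIONS = [1245, 1251, 1268, 1277, 1285]


def residue_at(position, seq=EXCERPT, start=EXCERPT_START):
    """Return the residue at a 1-based entry position within the excerpt."""
    index = position - start
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} is outside the excerpt")
    return seq[index]


residues = {pos: residue_at(pos) for pos in CYS_POSITIONS}
print(residues)
```

The same lookup applied to the full 1450-residue sequence would cover the remaining positions (1356, 1401, 1448) as well.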

Sequences

Q9YIB4 [UniParc].

Last modified May 1, 1999. Version 1.
Checksum: ABF8A74841B87B7C

Length: 1,450 AA. Mass: 137,563 Da.
        10         20         30         40         50         60 
MFSFVDNRLL VLLAACVLLV RALDQEDIES GLCHQEGTTY SDKDVWKPEP CVICVCDNGN 

        70         80         90        100        110        120 
IMCDDVTCGD YPVDCPNAEI PFGECCPVCP DGDGTSYSEQ TGVEGPKGEV GPKGDRGLPG 

       130        140        150        160        170        180 
PPGRDGNPGL PGPPGPPGPP GLGGNFAPQM SYGYDEKSAG ISVPGPMGPM GPRGPPGPSG 

       190        200        210        220        230        240 
SPGPQGFQGP SGEPGEPGAA GALGPRGLPG PPGKNGDDGE SGKPGRPGER GPSGPQGARG 

       250        260        270        280        290        300 
LPGTAGLPGM KGHRGFNGLD GAKGDNGPAG PKGEPGNPGE NGAPGQAGPR GLPGERGRPG 

       310        320        330        340        350        360 
APGPAGARGN DGSPGAAGPP GPTGPTGPPG FPGAVGAKGD AGPQGSRGSE GPQGARGEPG 

       370        380        390        400        410        420 
APGPAGAAGP SGNPGTDGQP GGKGATGSPG IAGAPGFPGA RGAPGPQGPA GAPGPKGNNG 

       430        440        450        460        470        480 
EPGAQGNKGE PGAKGEPGPA GVQGPPGPSG EEGKRGSRGE PGPAGPPGPA GERGGPGSRG 

       490        500        510        520        530        540 
FPGSDGASGP KGAPGERGSV GPAGPKGSTG ESGRPGEPGL PGAKGLTGSP GSPGPDGKTG 

       550        560        570        580        590        600 
PAGAAGQDGH PGPPGPSGAR GQSGVMGFPG PKGAAGEPGK SGERGVAGPP GATGAPGKDG 

       610        620        630        640        650        660 
EAGAQGPPGP SGPSGERGEQ GPAGSPGFQG LPGSPGPAGE AGKPGEQGAP GDAGGPGPSG 

       670        680        690        700        710        720 
PRGERGFPGE RGGQGPAGAQ GPRGSPGSPG NDGAKGEAGA AGAPGGRGPP GLQGMPGERG 

       730        740        750        760        770        780 
SAGMPGAKGD RGDAGTKGAD GAPGKDGARG LTGPIGPPGP SGAPGDKGEG GPSGPAGPTG 

       790        800        810        820        830        840 
ARGSPGERGE PGAPGPAGIC GPPGADGQPG AKGESGDAGP KGDAGAPGPA GPTGAPGPAG 

       850        860        870        880        890        900 
NVGAPGPKGT RGAAGPPGAT GFPGAAGRLG PPGPSGNAGP PGPPGPGGKE GAKGSRGETG 

       910        920        930        940        950        960 
PAGRSGEPGP AGPPGPSGEK GSPGSDGPAG APGIPGPQGI AGQRGVVGLP GQRGERGFSG 

       970        980        990       1000       1010       1020 
LPGPAGEPGK QGPSGPNGER GPPGPSGPPG LGGPPGEPGR EGSPGSEGAP GRDGSPGPKG 

      1030       1040       1050       1060       1070       1080 
DRGENGPSGP PGAPGAPGAP GPVGPAGKNG DRGETGPAGP AGPAGPSGVR GAPGPAGARG 

      1090       1100       1110       1120       1130       1140 
DKGEAGEQGE RGMKGHRGFN GMQGPPGPPG SSGEQGAPGP SGPAGPRGPP GSSGSTGKDG 

      1150       1160       1170       1180       1190       1200 
VNGLPGPIGP PGPRGRNGDV GPAGPPGPPG PPGPPGPPSG GFDFSFMPQP PEPKSHGDGR 

      1210       1220       1230       1240       1250       1260 
YFRADDANVV RDRDLEVDTT LKSLSAQIEN IRSPEGTRKN PARTCRDLKM CHSDWKSGDY 

      1270       1280       1290       1300       1310       1320 
WIDPNQGCNL DAIKVHCNME TGETCVYPSQ ASISQKNWYT SKNPREKKHV WFGETMSDGF 

      1330       1340       1350       1360       1370       1380 
QFEYGGEGSD PADVNIQLTF LRLMATEASQ NITYHCKNSV AYMDQETGNL KKAVLLQGSN 

      1390       1400       1410       1420       1430       1440 
EIEIRAEGNS RFTYGVTEDG CTQHTGEWGK TVIEYKTTKT SRLPIIDIAP MDVGTPDQEF 

      1450 
GIDIGPVCFL 
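The sequence above is displayed in rows of six 10-residue blocks, each row preceded by a ruler line of positions. A minimal Python sketch, assuming exactly this layout, joins the blocks into one contiguous string; it is shown on the first two rows only, and the function name is hypothetical.

```python
# First two rows of the displayed sequence, including their ruler lines.
DISPLAYED = """
        10         20         30         40         50         60
MFSFVDNRLL VLLAACVLLV RALDQEDIES GLCHQEGTTY SDKDVWKPEP CVICVCDNGN

        70         80         90        100        110        120
IMCDDVTCGD YPVDCPNAEI PFGECCPVCP DGDGTSYSEQ TGVEGPKGEV GPKGDRGLPG
"""


def parse_numbered_blocks(text):
    """Join 10-residue blocks into one string, skipping blank and ruler lines."""
    parts = []
    for line in text.splitlines():
        tokens = line.split()
        # Ruler lines consist only of position numbers; blank lines have no tokens.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        parts.append("".join(tokens))
    return "".join(parts)


sequence = parse_numbered_blocks(DISPLAYED)
signal_peptide = sequence[:22]  # residues 1-22 per the feature table
print(len(sequence))
print(signal_peptide)
```

With the full display as input, the resulting string should be 1450 residues long, and 1-based feature coordinates map to Python slices as `sequence[start - 1:end]`.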

References

[1] "Expression of genes of type I and type II collagen in the formation and development of the blastema of regenerating newt limb."
Asahina K., Obara M., Yoshizato K.
Dev. Dyn. 216:59-71(1999)
Cited for: NUCLEOTIDE SEQUENCE [MRNA].
Tissue: Regenerating forelimb blastema.

Cross-references

Sequence databases

EMBL/GenBank/DDBJ: AB015438 mRNA. Translation: BAA36973.1.

3D structure databases

ProteinModelPortal: Q9YIB4.

Phylogenomic databases

HOVERGEN: HBG004933.

Family and domain databases

InterPro: IPR008160. Collagen.
          IPR000885. Fib_collagen_C.
          IPR001007. VWF_C.
Pfam:     PF01410. COLFI. 1 hit.
          PF01391. Collagen. 12 hits.
          PF00093. VWC. 1 hit.
ProDom:   PD002078. Fib_collagen_C. 1 hit.
SMART:    SM00038. COLFI. 1 hit.
          SM00214. VWC. 1 hit.
PROSITE:  PS51461. NC1_FIB. 1 hit.
          PS01208. VWFC_1. 1 hit.
          PS50184. VWFC_2. 1 hit.

Entry information

Entry name: CO1A1_CYNPY
Accession: Primary (citable) accession number: Q9YIB4
Entry history:
  Integrated into UniProtKB/Swiss-Prot: May 1, 2007
  Last sequence update: May 1, 1999
  Last modified: July 9, 2014
  This is version 56 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Relevant documents

SIMILARITY comments: Index of protein domains and families