Q9XSJ7 (CO1A1_CANFA) Reviewed, UniProtKB/Swiss-Prot

Last modified November 13, 2013. Version 88.

Names and origin

Protein names
Recommended name: Collagen alpha-1(I) chain
Alternative name(s): Alpha-1 type I collagen
Gene names
Name: COL1A1
Organism: Canis familiaris (Dog) (Canis lupus familiaris) [Reference proteome]
Taxonomic identifier: 9615 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Laurasiatheria > Carnivora > Caniformia > Canidae > Canis

Protein attributes

Sequence length: 1460 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level

General annotation (Comments)

Function

Type I collagen is a member of group I collagen (fibrillar forming collagen).

Subunit structure

Trimers of one alpha 2(I) and two alpha 1(I) chains. Interacts with MRC2 (By similarity). Interacts with TRAM2 (By similarity).

Subcellular location

Secreted, extracellular space, extracellular matrix (By similarity).

Domain

The C-terminal propeptide, also known as the COLFI domain, has crucial roles in tissue growth and repair by controlling both the intracellular assembly of procollagen molecules and the extracellular assembly of collagen fibrils. It binds a calcium ion which is essential for its function (By similarity).

Post-translational modification

Proline residues at the third position of the tripeptide repeating unit (G-X-P) are hydroxylated in some or all of the chains. Proline residues at the second position of the tripeptide repeating unit (G-P-X) are hydroxylated in some of the chains.
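
The two positional rules can be illustrated on a short stretch of the triple-helical region. The sketch below takes residues 175 to 210 of the displayed sequence, splits them into tripeptide units, and lists the prolines found at the second and third positions. The fragment boundaries and the function name are illustrative choices, not part of the entry, and the code only locates prolines by position; it says nothing about which residues are actually hydroxylated.

```python
# Residues 175-210 of the displayed sequence (start of the triple-helical region).
FRAGMENT = "GPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGAS"
FRAGMENT_START = 175  # 1-based position of the first residue in the full sequence


def proline_positions(fragment, start):
    """Return (second_position, third_position) lists of 1-based proline positions.

    Assumes the fragment begins in phase, i.e. its first residue is the
    glycine of a tripeptide unit.
    """
    second, third = [], []
    for i in range(0, len(fragment) - 2, 3):
        unit = fragment[i:i + 3]
        if unit[0] != "G":
            raise ValueError(f"unit {unit!r} at {start + i} does not start with G")
        if unit[1] == "P":
            second.append(start + i + 1)  # G-P-X: proline in the second position
        if unit[2] == "P":
            third.append(start + i + 2)   # G-X-P: proline in the third position
    return second, third


second, third = proline_positions(FRAGMENT, FRAGMENT_START)
print("G-P-X prolines:", second)
print("G-X-P prolines:", third)
```

In this 36-residue fragment the two classes happen to be equally represented, with six prolines in each position.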

Involvement in disease

Defects in COL1A1 are a cause of osteogenesis imperfecta (OI). Ref.1

Sequence similarities

Belongs to the fibrillar collagen family.

Contains 1 fibrillar collagen NC1 domain.

Contains 1 VWFC domain.

Sequence annotation (Features)

Feature key       Position(s)    Length  Description  Feature identifier

Molecule processing

Signal peptide    1 – 22         22      (By similarity)
Propeptide        23 – 157       135     N-terminal propeptide  PRO_0000005713
Chain             158 – 1214     1057    Collagen alpha-1(I) chain  PRO_0000005714
Propeptide        1215 – 1460    246     C-terminal propeptide  PRO_0000005715
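
As a consistency check on the coordinates above, the sketch below recomputes each length as end - start + 1 and confirms that the four processing products cover residues 1 to 1460 without gaps or overlaps. The tuple layout and the function name are assumptions made for illustration; they are not part of the entry.

```python
# (feature key, start, end, listed length) copied from the Molecule processing table.
PROCESSING = [
    ("Signal peptide", 1, 22, 22),
    ("Propeptide (N-terminal)", 23, 157, 135),
    ("Chain", 158, 1214, 1057),
    ("Propeptide (C-terminal)", 1215, 1460, 246),
]
SEQUENCE_LENGTH = 1460


def check_tiling(features, total):
    """Verify listed lengths and that features tile positions 1..total in order."""
    expected_start = 1
    for key, start, end, listed in features:
        if end - start + 1 != listed:
            raise ValueError(f"{key}: listed length {listed} != {end - start + 1}")
        if start != expected_start:
            raise ValueError(f"{key}: expected start {expected_start}, found {start}")
        expected_start = end + 1
    return expected_start - 1 == total


print(check_tiling(PROCESSING, SEQUENCE_LENGTH))
```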

Regions

Domain            34 – 92        59      VWFC
Domain            1225 – 1460    236     Fibrillar collagen NC1
Region            158 – 174      17      Nonhelical region (N-terminal)
Region            175 – 1188     1014    Triple-helical region
Region            1189 – 1214    26      Nonhelical region (C-terminal)
Motif             741 – 743      3       Cell attachment site (Potential)
Motif             1089 – 1091    3       Cell attachment site (Potential)

Sites

Metal binding     1273           1       Calcium (By similarity)
Metal binding     1275           1       Calcium (By similarity)
Metal binding     1276           1       Calcium; via carbonyl oxygen (By similarity)
Metal binding     1278           1       Calcium; via carbonyl oxygen (By similarity)
Metal binding     1281           1       Calcium (By similarity)

Amino acid modifications

Modified residue  158            1       Pyrrolidone carboxylic acid (By similarity)
Modified residue  166            1       Allysine (By similarity)
Modified residue  261            1       5-hydroxylysine; alternate (By similarity)
Modified residue  1160           1       3-hydroxyproline (By similarity)
Glycosylation     261            1       O-linked (Gal...); alternate (By similarity)
Glycosylation     1361           1       N-linked (GlcNAc...) (By similarity)
Disulfide bond    1255 ↔ 1287            (By similarity)
Disulfide bond    1261                   Interchain (with C-1278) (By similarity)
Disulfide bond    1278                   Interchain (with C-1261) (By similarity)
Disulfide bond    1295 ↔ 1458            (By similarity)
Disulfide bond    1366 ↔ 1411            (By similarity)

Natural variations

Natural variant   208            1       G → A in OI; severe. Ref.1
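
To locate the Gly208 variant relative to the tripeptide repeat, the sketch below checks that residue 208 lies inside the triple-helical region (175 to 1188) and falls on the first, glycine, position of a tripeptide unit when the repeat is counted from residue 175. The fragment is residues 175 to 210 of the displayed sequence; the helper name and the phase calculation are illustrative assumptions rather than anything stated in the entry.

```python
FRAGMENT = "GPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGAS"  # residues 175-210
HELIX_START, HELIX_END = 175, 1188  # triple-helical region from the Regions table


def apply_variant(fragment, fragment_start, position, ref, alt):
    """Return the fragment with the residue at a 1-based sequence position replaced."""
    index = position - fragment_start
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} is outside the fragment")
    if fragment[index] != ref:
        raise ValueError(f"expected {ref} at {position}, found {fragment[index]}")
    return fragment[:index] + alt + fragment[index + 1:]


position = 208
in_helix = HELIX_START <= position <= HELIX_END
in_phase = (position - HELIX_START) % 3 == 0  # first position of a tripeptide unit
mutant = apply_variant(FRAGMENT, HELIX_START, position, "G", "A")

print(in_helix, in_phase)
print(FRAGMENT[-3:], "->", mutant[-3:])
```

The last tripeptide unit of the fragment changes from GAS to AAS, so the substitution removes the glycine that otherwise recurs at every third residue in this stretch.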

Sequences

Q9XSJ7 [UniParc].

Last modified November 1, 1999. Version 1.
Checksum: 58E3674D2B570697
Length: 1,460 AA
Mass: 138,762 Da
        10         20         30         40         50         60 
MFSFVDLRLL LLLAATALLT HGQEEGQEED IPPVTCVQNG LRYYDRDVWK PEACRICVCD 

        70         80         90        100        110        120 
NGNVLCDDVI CDETKNCPGA QVPPGECCPV CPDGEASPTD QETTGVEGPK GDTGPRGPRG 

       130        140        150        160        170        180 
PAGPPGRDGI PGQPGLPGPP GPPGPPGPPG LGGNFAPQMS YGYDEKSTGG ISVPGPMGPS 

       190        200        210        220        230        240 
GPRGLPGPPG APGPQGFQGP PGEPGEPGAS GPMGPRGPPG PPGKNGDDGE AGKPGRPGER 

       250        260        270        280        290        300 
GPPGPQGARG LPGTAGLPGM KGHRGFSGLD GAKGDAGPAG PKGEPGSPGE NGAPGQMGPR 

       310        320        330        340        350        360 
GLPGERGRPG APGPAGARGN DGATGAAGPP GPTGPAGPPG FPGAVGAKGE AGPQGARGSE 

       370        380        390        400        410        420 
GPQGVRGEPG PPGPAGAAGP AGNPGADGQP GAKGANGAPG IAGAPGFPGA RGPSGPQGPS 

       430        440        450        460        470        480 
GPPGPKGNSG EPGAPGNKGD TGAKGEPGPT GIQGPPGPAG EEGKRGARGE PGPTGLPGPP 

       490        500        510        520        530        540 
GERGGPGSRG FPGADGVAGP KGPAGERGSP GPAGPKGSPG EAGRPGEAGL PGAKGLTGSP 

       550        560        570        580        590        600 
GSPGPDGKTG PPGPAGQDGR PGPPGPPGAR GQAGVMGFPG PKGAAGEPGK AGERGVPGPP 

       610        620        630        640        650        660 
GAVGPAGKDG EAGAQGPPGP AGPAGERGEQ GPAGSPGFQG LPGPAGPPGE AGKPGEQGVP 

       670        680        690        700        710        720 
GDLGAPGPSG ARGERGFPGE RGVQGPPGPA GPRGANGAPG NDGAKGDAGA PGAPGSQGAP 

       730        740        750        760        770        780 
GLQGMPGERG AAGLPGPKGD RGDAGPKGAD GSPGKDGVRG LTGPIGPPGP AGAPGDKGEA 

       790        800        810        820        830        840 
GPSGPAGPTG ARGAPGDRGE PGPPGPAGFA GPPGADGQPG AKGEPGDAGA KGDAGPPGPA 

       850        860        870        880        890        900 
GPTGPPGPIG NVGAPGPKGA RGSAGPPGAT GFPGAAGRVG PPGPSGNAGP PGPPGPAGKE 

       910        920        930        940        950        960 
GGKGARGETG PAGRPGEVGP PGPPGPAGEK GSPGADGPAG APGTPGPQGI AGQRGVVGLP 

       970        980        990       1000       1010       1020 
GQRGERGFPG LPGPSGEPGK QGPSGTSGER GPPGPMGPPG LAGPPGESGR EGSPGAEGSP 

      1030       1040       1050       1060       1070       1080 
GRDGSPGPKG DRGETGPAGP PGAPGAPGAP GPVGPAGKNG DRGETGPAGP AGPIGPVGAR 

      1090       1100       1110       1120       1130       1140 
GPAGPQGPRG DKGETGEQGD RGIKGHRGFS GLQGPPGPPG SPGEQGPSGA SGPAGPRGPP 

      1150       1160       1170       1180       1190       1200 
GSAGSPGKDG LNGLPGPIGP PGPRGRTGDA GPVGPPGPPG PPGPPGPPSG GFDFSFLPQP 

      1210       1220       1230       1240       1250       1260 
PQEKAHDGGR YYRADDANVV RDRDLEVDTT LKSLSQQIEN IRSPEGSRKN PARTCRDLKM 

      1270       1280       1290       1300       1310       1320 
CHSDWKSGEY WIDPNQGCNL DAIKVFCNME TGETCVYPTQ PQVAQKNWYI SKNPKEKRHV 

      1330       1340       1350       1360       1370       1380 
WYGESMTDGF QFEYGGQGSD PADVAIQLTF LRLMSTEASQ NITYHCKNSV AYMDQQTGNL 

      1390       1400       1410       1420       1430       1440 
KKALLLQGSN EIEIRAEGNS RFTYSVTYDG CTSHTGAWGK TVIEYKTTKT SRLPIIDVAP 

      1450       1460 
LDVGAPDQEF GMDIGPVCFL 
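
The display above interleaves position rulers with residues in blocks of ten. A minimal sketch for turning such a listing into one contiguous string follows; it assumes ruler lines contain only digits and spaces, and the function name is hypothetical. The sample is the first two rows of the sequence shown above.

```python
def join_sequence_listing(text):
    """Concatenate residue blocks, skipping blank lines and numeric ruler lines."""
    residues = []
    for line in text.splitlines():
        parts = line.split()
        if not parts or all(part.isdigit() for part in parts):
            continue  # blank line or position ruler
        residues.extend(parts)
    return "".join(residues)


SAMPLE = """\
        10         20         30         40         50         60
MFSFVDLRLL LLLAATALLT HGQEEGQEED IPPVTCVQNG LRYYDRDVWK PEACRICVCD

        70         80         90        100        110        120
NGNVLCDDVI CDETKNCPGA QVPPGECCPV CPDGEASPTD QETTGVEGPK GDTGPRGPRG
"""

sequence = join_sequence_listing(SAMPLE)
print(len(sequence))
print(sequence[:22])  # signal peptide, positions 1-22
```

Applied to the full listing, the same function should yield a string whose length matches the 1460 AA stated for the entry.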

References

[1]"Sequence of normal canine COL1A1 cDNA and identification of a heterozygous alpha1(I) collagen Gly208Ala mutation in a severe case of canine osteogenesis imperfecta."
Campbell B.G., Wootton J.A.M., MacLeod J.N., Minor R.R.
Arch. Biochem. Biophys. 384:37-46(2000)
Cited for: NUCLEOTIDE SEQUENCE [MRNA], VARIANT OI ALA-208.
Tissue: Skin.

Cross-references

Sequence databases

EMBL/GenBank/DDBJ: AF153062 mRNA. Translation: AAD34619.1.
RefSeq: NP_001003090.1. NM_001003090.1.
UniGene: Cfa.100.

Proteomic databases

PaxDb: Q9XSJ7.

Genome annotation databases

GeneID: 403651.
KEGG: cfa:403651.

Organism-specific databases

CTD: 1277.

Phylogenomic databases

eggNOG: NOG12793.
HOGENOM: HOG000085654.
HOVERGEN: HBG004933.
InParanoid: Q9XSJ7.
KO: K06236.

Family and domain databases

InterPro: IPR008160. Collagen.
    IPR000885. Fib_collagen_C.
    IPR001007. VWF_C.
Pfam: PF01410. COLFI. 1 hit.
    PF01391. Collagen. 12 hits.
    PF00093. VWC. 1 hit.
ProDom: PD002078. Fib_collagen_C. 1 hit.
SMART: SM00038. COLFI. 1 hit.
    SM00214. VWC. 1 hit.
PROSITE: PS51461. NC1_FIB. 1 hit.
    PS01208. VWFC_1. 1 hit.
    PS50184. VWFC_2. 1 hit.

Other

NextBio: 20817156.

Entry information

Entry name: CO1A1_CANFA
Accession: Primary (citable) accession number: Q9XSJ7
Entry history
Integrated into UniProtKB/Swiss-Prot: May 30, 2000
Last sequence update: November 1, 1999
Last modified: November 13, 2013
This is version 88 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Relevant documents

SIMILARITY comments

Index of protein domains and families