Q14993 (COJA1_HUMAN) Reviewed, UniProtKB/Swiss-Prot

Last modified July 9, 2014. Version 133.

Names and origin

Protein names
Recommended name: Collagen alpha-1(XIX) chain
Alternative name(s): Collagen alpha-1(Y) chain
Gene names
Name: COL19A1
Organism: Homo sapiens (Human) [Reference proteome]
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo

Protein attributes

Sequence length: 1142 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level

General annotation (Comments)

Function

May act as a cross-bridge between fibrils and other extracellular matrix molecules. Involved in skeletal myogenesis in the developing esophagus. May play a role in organization of the pericellular matrix or the sphincteric smooth muscle. Ref.8

Subunit structure

Oligomer; disulfide-linked. Ref.8

Subcellular location

Secreted, extracellular space, extracellular matrix (by similarity).

Tissue specificity

Localized to vascular, neuronal, mesenchymal, and some epithelial basement membrane zones in umbilical cord. Ref.8

Domain

The numerous interruptions in the triple helix may make this molecule either elastic or flexible. Ref.8

Post-translational modification

Prolines at the third position of the tripeptide repeating unit (G-X-Y) are hydroxylated in some or all of the chains.
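
The G-X-Y repeat described above can be located with a simple scan. The following is a minimal sketch, not a method used by UniProt: the function names `gxy_runs` and `y_prolines` and the `min_triplets` threshold are assumptions introduced for illustration. It reports runs where glycine recurs at every third residue and lists the Y-position prolines, which are only candidates for hydroxylation.

```python
def gxy_runs(seq, min_triplets=3):
    """Find maximal runs of consecutive G-X-Y triplets.

    Returns (start, end, n_triplets) tuples using 1-based inclusive positions.
    """
    runs = []
    i = 0
    n = len(seq)
    while i < n:
        if seq[i] == "G":
            j = i
            # Extend while a full triplet fits and it begins with glycine.
            while j + 3 <= n and seq[j] == "G":
                j += 3
            count = (j - i) // 3
            if count >= min_triplets:
                runs.append((i + 1, j, count))
                i = j
                continue
        i += 1
    return runs


def y_prolines(seq, run):
    """List 1-based positions of prolines at the Y (third) position of a run."""
    start, end, _ = run
    return [pos for pos in range(start + 2, end + 1, 3) if seq[pos - 1] == "P"]


# Residues 421-430 of the displayed sequence, used here as a short fragment.
fragment = "GPPGPPGPPG"
for run in gxy_runs(fragment):
    print(run, y_prolines(fragment, run))
```

Positions are relative to the fragment passed in; add the fragment's offset to map them back to entry coordinates.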

Sequence similarities

Belongs to the fibril-associated collagens with interrupted helices (FACIT) family.

Contains 11 collagen-like domains.

Contains 1 laminin G-like domain.

Sequence caution

The sequence CAC12699.3 differs from that shown. Reason: Erroneous gene model prediction.

The sequence CAI42319.2 differs from that shown. Reason: Erroneous gene model prediction.

The sequence CAI42496.2 differs from that shown. Reason: Erroneous gene model prediction.

Ontologies

Keywords
   Biological process: Cell adhesion, Differentiation, Myogenesis
   Cellular component: Extracellular matrix, Secreted
   Coding sequence diversity: Polymorphism
   Domain: Collagen, Repeat, Signal
   Molecular function: Developmental protein
   PTM: Disulfide bond, Hydroxylation
   Technical term: Complete proteome, Reference proteome
Gene Ontology (GO)
   Biological process

cell adhesion

Non-traceable author statement Ref.1. Source: UniProtKB

cell differentiation

Inferred from electronic annotation. Source: UniProtKB-KW

collagen catabolic process

Traceable author statement. Source: Reactome

extracellular matrix disassembly

Traceable author statement. Source: Reactome

extracellular matrix organization

Non-traceable author statement Ref.5. Source: UniProtKB

single organismal cell-cell adhesion

Non-traceable author statement Ref.5. Source: UniProtKB

skeletal muscle tissue development

Inferred from electronic annotation. Source: Ensembl

skeletal system development

Traceable author statement PubMed 9143499. Source: ProtInc

   Cellular component

collagen trimer

Non-traceable author statement Ref.1. Source: UniProtKB

endoplasmic reticulum lumen

Traceable author statement. Source: Reactome

extracellular region

Traceable author statement. Source: Reactome

proteinaceous extracellular matrix

Non-traceable author statement Ref.5. Source: UniProtKB

   Molecular function

extracellular matrix structural constituent

Non-traceable author statement Ref.5, Ref.1. Source: UniProtKB

protein binding, bridging

Non-traceable author statement Ref.5. Source: UniProtKB

Sequence annotation (Features)

Feature key      Position(s)    Length  Description                      Feature identifier

Molecule processing

Signal peptide   1 – 23         23      Potential
Chain            24 – 1142      1119    Collagen alpha-1(XIX) chain      PRO_0000005797

Regions

Domain           50 – 234       185     Laminin G-like
Domain           292 – 349      58      Collagen-like 1
Domain           350 – 391      42      Collagen-like 2
Domain           392 – 433      42      Collagen-like 3
Domain           474 – 516      43      Collagen-like 4
Domain           568 – 624      57      Collagen-like 5
Domain           626 – 678      53      Collagen-like 6
Domain           728 – 778      51      Collagen-like 7
Domain           779 – 814      36      Collagen-like 8
Domain           845 – 903      59      Collagen-like 9
Domain           904 – 947      44      Collagen-like 10
Domain           948 – 1004     57      Collagen-like 11
Region           292 – 351      60      Triple-helical region 1 (COL1)
Region           370 – 429      60      Triple-helical region 2 (COL2)
Region           448 – 688      241     Triple-helical region 3 (COL3)
Region           700 – 818      119     Triple-helical region 4 (COL4)
Region           833 – 1012     180     Triple-helical region 5 (COL5)
Region           1054 – 1111    58      Triple-helical region 6 (COL6)
Motif            952 – 954      3       Cell attachment site (Potential)
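
Feature positions in this entry are 1-based and inclusive, so a feature's length is end minus start plus one. The sketch below illustrates that arithmetic and how a feature could be cut out of a sequence fragment. The `Feature` class and its `extract` method are hypothetical helpers written for this example, not part of any UniProt tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    key: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    description: str = ""

    @property
    def length(self) -> int:
        # Inclusive coordinates: both endpoints count.
        return self.end - self.start + 1

    def extract(self, fragment: str, fragment_start: int = 1) -> str:
        """Return the residues of this feature from a fragment that begins
        at entry position `fragment_start`."""
        lo = self.start - fragment_start
        hi = self.end - fragment_start + 1
        if lo < 0 or hi > len(fragment):
            raise ValueError("feature lies outside the supplied fragment")
        return fragment[lo:hi]


signal = Feature("Signal peptide", 1, 23, "Potential")
chain = Feature("Chain", 24, 1142, "Collagen alpha-1(XIX) chain")
motif = Feature("Motif", 952, 954, "Cell attachment site")

print(signal.length, chain.length, signal.length + chain.length)
# Residues 951-960 of the displayed sequence.
print(motif.extract("ERGDQGIPGD", fragment_start=951))
```

The signal peptide and the mature chain together account for the full 1142-residue precursor.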

Natural variations

Natural variant  352            1       A → G. Corresponds to variant rs2273426 [dbSNP | Ensembl].    VAR_024419
Natural variant  361            1       G → D in a breast cancer sample; somatic mutation. Ref.9      VAR_035746
Natural variant  406            1       G → E. Corresponds to variant rs13204209 [dbSNP | Ensembl].   VAR_048782
Natural variant  496            1       E → G. Corresponds to variant rs13204209 [dbSNP | Ensembl].   VAR_048783
Natural variant  1019           1       K → N in a breast cancer sample; somatic mutation. Ref.9      VAR_035747
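
A single-residue variant such as G → D at position 361 can be applied to a sequence fragment after checking that the reference residue matches. This is an illustrative sketch only: `apply_substitution` is a hypothetical helper, and it handles one-for-one substitutions, not insertions or multi-residue conflicts.

```python
def apply_substitution(fragment, fragment_start, position, ref, alt):
    """Apply a single-residue substitution given in 1-based entry coordinates.

    `fragment_start` is the entry position of the fragment's first residue.
    """
    index = position - fragment_start
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} is outside the fragment")
    if fragment[index] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {fragment[index]}"
        )
    return fragment[:index] + alt + fragment[index + 1:]


# Residues 361-370 of the displayed sequence; variant VAR_035746 (G361D).
print(apply_substitution("GLKGDLGPHG", 361, 361, "G", "D"))
```

Checking the reference residue first guards against off-by-one errors when mapping entry coordinates onto a fragment.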

Experimental info

Sequence conflict  89           1       I → MY in BAA23309. Ref.2
Sequence conflict  106 – 119    14      FRVRR…ERWFL → ETTVPFWRFFVLET in AAA36358. Ref.6
Sequence conflict  279          1       Q → L in BAA07368. Ref.1
Sequence conflict  354 – 355    2       AG → GC in AAA58468. Ref.5
Sequence conflict  365          1       D → V in BAA07368. Ref.1
Sequence conflict  365          1       D → V in AAA58468. Ref.5
Sequence conflict  441 – 442    2       YY → DD in BAA07368. Ref.1
Sequence conflict  441 – 442    2       YY → DD in AAA58468. Ref.5
Sequence conflict  622 – 624    3       PQG → QRD in AAA58468. Ref.5
Sequence conflict  816 – 823    8       GIPFNERN → VSCSRLKI in AAA36358. Ref.6
Sequence conflict  937          1       Q → E in BAA07368. Ref.1
Sequence conflict  1140         1       G → C in BAA23309. Ref.2

Sequences

Q14993 [UniParc].

Length: 1,142 AA. Mass: 115,221 Da.
Last modified July 5, 2005. Version 3.
Checksum: F1153CE751387943

        10         20         30         40         50         60 
MRLTGPWKLW LWMSIFLLPA STSVTVRDKT EESCPILRIE GHQLTYDNIN KLEVSGFDLG 

        70         80         90        100        110        120 
DSFSLRRAFC ESDKTCFKLG SALLIRDTIK IFPKGLPEEY SVAAMFRVRR NAKKERWFLW 

       130        140        150        160        170        180 
QVLNQQNIPQ ISIVVDGGKK VVEFMFQATE GDVLNYIFRN RELRPLFDRQ WHKLGISIQS 

       190        200        210        220        230        240 
QVISLYMDCN LIARRQTDEK DTVDFHGRTV IATRASDGKP VDIELHQLKI YCSANLIAQE 

       250        260        270        280        290        300 
TCCEISDTKC PEQDGFGNIA SSWVTAHASK MSSYLPAKQE LKDQCQCIPN KGEAGLPGAP 

       310        320        330        340        350        360 
GSPGQKGHKG EPGENGLHGA PGFPGQKGEQ GFEGSKGETG EKGEQGEKGD PALAGLNGEN 

       370        380        390        400        410        420 
GLKGDLGPHG PPGPKGEKGD TGPPGPPALP GSLGIQGPQG PPGKEGQRGR RGKTGPPGKP 

       430        440        450        460        470        480 
GPPGPPGPPG IQGIHQTLGG YYNKDNKGND EHEAGGLKGD KGETGLPGFP GSVGPKGQKG 

       490        500        510        520        530        540 
EPGEPFTKGE KGDRGEPGVI GSQGVKGEPG DPGPPGLIGS PGLKGQQGSA GSMGPRGPPG 

       550        560        570        580        590        600 
DVGLPGEHGI PGKQGIKGEK GDPGGIIGPP GLPGPKGEAG PPGKSLPGEP GLDGNPGAPG 

       610        620        630        640        650        660 
PRGPKGERGL PGVHGSPGDI GPQGIGIPGR TGAQGPAGEP GIQGPRGLPG LPGTPGTPGN 

       670        680        690        700        710        720 
DGVPGRDGKP GLPGPPGDPI ALPLLGDIGA LLKNFCGNCQ ASVPGLKSNK GEEGGAGEPG 

       730        740        750        760        770        780 
KYDSMARKGD IGPRGPPGIP GREGPKGSKG ERGYPGIPGE KGDEGLQGIP GIPGAPGPTG 

       790        800        810        820        830        840 
PPGLMGRTGH PGPTGAKGEK GSDGPPGKPG PPGPPGIPFN ERNGMSSLYK IKGGVNVPSY 

       850        860        870        880        890        900 
PGPPGPPGPK GDPGPVGEPG AMGLPGLEGF PGVKGDRGPA GPPGIAGMSG KPGAPGPPGV 

       910        920        930        940        950        960 
PGEPGERGPV GDIGFPGPEG PSGKPGINGK DGIPGAQGIM GKPGDRGPKG ERGDQGIPGD 

       970        980        990       1000       1010       1020 
RGSQGERGKP GLTGMKGAIG PMGPPGNKGS MGSPGHQGPP GSPGIPGIPA DAVSFEEIKK 

      1030       1040       1050       1060       1070       1080 
YINQEVLRIF EERMAVFLSQ LKLPAAMLAA QAYGRPGPPG KDGLPGPPGD PGPQGYRGQK 

      1090       1100       1110       1120       1130       1140 
GERGEPGIGL PGSPGLPGTS ALGLPGSPGA PGPQGPPGPS GRCNPEDCLY PVSHAHQRTG 


GN 
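
The sequence above is laid out in rows of six 10-residue groups under a ruler of position numbers. A minimal sketch for turning such a block back into a plain string follows; `parse_numbered_sequence` is a hypothetical helper, and it assumes ruler lines contain only digits and whitespace.

```python
import re


def parse_numbered_sequence(text: str) -> str:
    """Strip ruler lines and spacing from a numbered sequence block."""
    residues = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and ruler lines made up only of position numbers.
        if not stripped or re.fullmatch(r"[\d\s]+", stripped):
            continue
        residues.append(re.sub(r"\s+", "", stripped))
    return "".join(residues)


# First two rows of the displayed sequence (residues 1-120).
example = """
        10         20         30         40         50         60
MRLTGPWKLW LWMSIFLLPA STSVTVRDKT EESCPILRIE GHQLTYDNIN KLEVSGFDLG

        70         80         90        100        110        120
DSFSLRRAFC ESDKTCFKLG SALLIRDTIK IFPKGLPEEY SVAAMFRVRR NAKKERWFLW
"""

sequence = parse_numbered_sequence(example)
print(len(sequence))
print(sequence[105:119])  # residues 106-119, the span of the AAA36358 conflict
```

Run on the full block, the resulting length can be compared with the 1142 AA stated under Protein attributes as a consistency check.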

References

[1] "The mRNA for alpha 1(XIX) collagen chain, a new member of FACITs, contains a long unusual 3' untranslated region and displays many unique splicing variants."
Inoguchi K., Yoshioka H., Khaleduzzaman M., Ninomiya Y.
J. Biochem. 117:137-146(1995)
Cited for: NUCLEOTIDE SEQUENCE [MRNA].
[2] "Structure of the human type XIX collagen (COL19A1) gene, which suggests it has arisen from an ancestor gene of the FACIT family."
Khaleduzzaman M., Sumiyoshi H., Ueki Y., Inoguchi K., Ninomiya Y., Yoshioka H.
Genomics 45:304-312(1997)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA].
[3] "The DNA sequence and analysis of human chromosome 6."
Mungall A.J., Palmer S.A., Sims S.K., Edwards C.A., Ashurst J.L., Wilming L., Jones M.C., Horton R., Hunt S.E., Scott C.E., Gilbert J.G.R., Clamp M.E., Bethel G., Milne S., Ainscough R., Almeida J.P., Ambrose K.D., Andrews T.D., Ashwell R.I.S., Babbage A.K., Bagguley C.L., Bailey J., Banerjee R., Barker D.J., Barlow K.F., Bates K., Beare D.M., Beasley H., Beasley O., Bird C.P., Blakey S.E., Bray-Allen S., Brook J., Brown A.J., Brown J.Y., Burford D.C., Burrill W., Burton J., Carder C., Carter N.P., Chapman J.C., Clark S.Y., Clark G., Clee C.M., Clegg S., Cobley V., Collier R.E., Collins J.E., Colman L.K., Corby N.R., Coville G.J., Culley K.M., Dhami P., Davies J., Dunn M., Earthrowl M.E., Ellington A.E., Evans K.A., Faulkner L., Francis M.D., Frankish A., Frankland J., French L., Garner P., Garnett J., Ghori M.J., Gilby L.M., Gillson C.J., Glithero R.J., Grafham D.V., Grant M., Gribble S., Griffiths C., Griffiths M.N.D., Hall R., Halls K.S., Hammond S., Harley J.L., Hart E.A., Heath P.D., Heathcott R., Holmes S.J., Howden P.J., Howe K.L., Howell G.R., Huckle E., Humphray S.J., Humphries M.D., Hunt A.R., Johnson C.M., Joy A.A., Kay M., Keenan S.J., Kimberley A.M., King A., Laird G.K., Langford C., Lawlor S., Leongamornlert D.A., Leversha M., Lloyd C.R., Lloyd D.M., Loveland J.E., Lovell J., Martin S., Mashreghi-Mohammadi M., Maslen G.L., Matthews L., McCann O.T., McLaren S.J., McLay K., McMurray A., Moore M.J.F., Mullikin J.C., Niblett D., Nickerson T., Novik K.L., Oliver K., Overton-Larty E.K., Parker A., Patel R., Pearce A.V., Peck A.I., Phillimore B.J.C.T., Phillips S., Plumb R.W., Porter K.M., Ramsey Y., Ranby S.A., Rice C.M., Ross M.T., Searle S.M., Sehra H.K., Sheridan E., Skuce C.D., Smith S., Smith M., Spraggon L., Squares S.L., Steward C.A., Sycamore N., Tamlyn-Hall G., Tester J., Theaker A.J., Thomas D.W., Thorpe A., Tracey A., Tromans A., Tubby B., Wall M., Wallis J.M., West A.P., White S.S., Whitehead S.L., Whittaker H., Wild A., Willey D.J., Wilmer T.E., Wood J.M., Wray P.W., Wyatt J.C., Young L., Younger R.M., Bentley D.R., Coulson A., Durbin R.M., Hubbard T., Sulston J.E., Dunham I., Rogers J., Beck S.
Nature 425:805-811(2003)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
[4] "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
The MGC Project Team
Genome Res. 14:2121-2127(2004)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
[5] "Synteny between the loci for a novel FACIT-like collagen locus (D6S228E) and alpha 1 (IX) collagen (COL9A1) on 6q12-q14 in humans."
Yoshioka H., Zhang H., Ramirez F., Mattei M.-G., Moradi-Ameli M., van der Rest M., Gordon M.K.
Genomics 13:884-886(1992)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 26-624.
Tissue: Rhabdomyosarcoma.
[6] "Human cDNA clones transcribed from an unusually high-molecular-weight RNA encode a new collagen chain."
Myers J.C., Sun M.J., D'Ippolito J.A., Jabs E.W., Neilson E.G., Dion A.S.
Gene 123:211-217(1993)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 106-1020.
[7] "The triple-helical region of human type XIX collagen consists of multiple collagenous subdomains and exhibits limited sequence homology to alpha 1(XVI)."
Myers J.C., Yang H., D'Ippolito J.A., Presente A., Miller M.K., Dion A.S.
J. Biol. Chem. 269:18549-18557(1994)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 738-1142.
Tissue: Skin.
[8] "Type XIX collagen purified from human umbilical cord is characterized by multiple sharp kinks delineating collagenous subdomains and by intermolecular aggregates via globular, disulfide-linked, and heparin-binding amino termini."
Myers J.C., Li D., Amenta P.S., Clark C.C., Nagaswami C., Weisel J.W.
J. Biol. Chem. 278:32047-32057(2003)
Cited for: FUNCTION, SUBUNIT, TISSUE SPECIFICITY, DOMAIN.
[9] "The consensus coding sequences of human breast and colorectal cancers."
Sjoeblom T., Jones S., Wood L.D., Parsons D.W., Lin J., Barber T.D., Mandelker D., Leary R.J., Ptak J., Silliman N., Szabo S., Buckhaults P., Farrell C., Meeh P., Markowitz S.D., Willis J., Dawson D., Willson J.K.V., Gazdar A.F., Hartigan J., Wu L., Liu C., Parmigiani G., Park B.H., Bachman K.E., Papadopoulos N., Vogelstein B., Kinzler K.W., Velculescu V.E.
Science 314:268-274(2006)
Cited for: VARIANTS [LARGE SCALE ANALYSIS] ASP-361 AND ASN-1019.

Cross-references

Sequence databases

EMBL / GenBank / DDBJ:
D38163 mRNA. Translation: BAA07368.1.
AB004629 Genomic DNA. Translation: BAA23309.1.
AL118519 Genomic DNA. Translation: CAB99331.2.
AL133388, AL136445, AL160262 Genomic DNA. Translation: CAI42496.2. Sequence problems.
AL133388, AL118519, AL136445, AL359539, AL160262 Genomic DNA. Translation: CAI42497.1.
AL136445, AL133388, AL160262 Genomic DNA. Translation: CAC12699.3. Sequence problems.
AL136445, AL118519, AL133388, AL160262, AL359539 Genomic DNA. Translation: CAI42716.1.
AL160262, AL133388, AL136445 Genomic DNA. Translation: CAI42319.2. Sequence problems.
AL160262, AL118519, AL133388, AL136445, AL359539 Genomic DNA. Translation: CAI42322.1.
AL359539, AL118519, AL133388, AL136445, AL160262 Genomic DNA. Translation: CAI16492.1.
BC113362 mRNA. Translation: AAI13363.1.
BC113364 mRNA. Translation: AAI13365.1.
M63597 mRNA. Translation: AAA58468.1.
L12347 mRNA. Translation: AAA36358.1.
U09279 mRNA. Translation: AAA21146.1.
AH000850 Genomic DNA. Translation: AAA21147.1.
CCDS: CCDS4970.1.
PIR: JX0369.
RefSeq: NP_001849.2. NM_001858.5.
UniGene: Hs.444842.

3D structure databases

ProteinModelPortal: Q14993.
SMR: Q14993. Positions 51-245.

Protein-protein interaction databases

BioGrid: 107705. 1 interaction.

PTM databases

PhosphoSite: Q14993.

Polymorphism databases

DMDM: 68840003.

Proteomic databases

PaxDb: Q14993.
PRIDE: Q14993.

Genome annotation databases

Ensembl: ENST00000322773; ENSP00000316030; ENSG00000082293.
GeneID: 1310.
KEGG: hsa:1310.
UCSC: uc003pfc.1. human.

Organism-specific databases

CTD: 1310.
GeneCards: GC06P070633.
HGNC: HGNC:2196. COL19A1.
HPA: HPA042422.
MIM: 120165. gene.
neXtProt: NX_Q14993.
PharmGKB: PA26712.

Phylogenomic databases

eggNOG: NOG275976.
HOVERGEN: HBG060240.
InParanoid: Q14993.
OMA: ERWFLWQ.
OrthoDB: EOG7353W7.
PhylomeDB: Q14993.
TreeFam: TF351778.

Enzyme and pathway databases

Reactome: REACT_118779. Extracellular matrix organization.

Gene expression databases

ArrayExpress: Q14993.
Bgee: Q14993.
Genevestigator: Q14993.

Family and domain databases

Gene3D: 2.60.120.200. 1 hit.
InterPro: IPR008160. Collagen.
IPR008985. ConA-like_lec_gl_sf.
IPR013320. ConA-like_subgrp.
IPR001791. Laminin_G.
Pfam: PF01391. Collagen. 11 hits.
SMART: SM00210. TSPN. 1 hit.
SUPFAM: SSF49899. SSF49899. 1 hit.

Other

ChiTaRS: COL19A1. human.
GeneWiki: Collagen,_type_XIX,_alpha_1.
GenomeRNAi: 1310.
NextBio: 5357.
PRO: Q14993.

Entry information

Entry name: COJA1_HUMAN
Accession
Primary (citable) accession number: Q14993
Secondary accession number(s): Q00559, Q05850, Q12885, Q13676, Q14DH1, Q5JUF0, Q5T424, Q9H572, Q9NPZ2, Q9NQP2
Entry history
Integrated into UniProtKB/Swiss-Prot: November 16, 2001
Last sequence update: July 5, 2005
Last modified: July 9, 2014
This is version 133 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Relevant documents

SIMILARITY comments

Index of protein domains and families

MIM cross-references

Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot

Human polymorphisms and disease mutations

Index of human polymorphisms and disease mutations

Human entries with polymorphisms or disease mutations

List of human entries with polymorphisms or disease mutations

Human chromosome 6

Human chromosome 6: entries, gene names and cross-references to MIM