Protein

Collagen alpha-1(IX) chain

Gene

Col9a1

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at transcript level

Function

Structural component of hyaline cartilage and vitreous of the eye.

Sites

Feature key    Position(s)  Description           Length
Metal binding  213          Zinc (by similarity)  1
Metal binding  215          Zinc (by similarity)  1
Metal binding  253          Zinc (by similarity)  1

GO - Molecular function

GO - Biological process

  • cartilage development Source: MGI
  • growth plate cartilage development Source: MGI
  • tissue homeostasis Source: MGI

Keywords - Ligand

Metal-binding, Zinc

Names & Taxonomy

Protein names
Recommended name: Collagen alpha-1(IX) chain
Gene names
Name: Col9a1
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Unplaced

Organism-specific databases

MGI: MGI:88465. Col9a1.

Subcellular location

GO - Cellular component

Keywords - Cellular component

Extracellular matrix, Secreted

PTM / Processing

Molecule processing

Feature key     Position(s)  Description                                  Length
Signal peptide  1 – 23       Sequence analysis                            23
Chain           24 – 921     Collagen alpha-1(IX) chain (PRO_0000005766)  898

Amino acid modifications

Feature key     Position(s)  Description
Disulfide bond  44 ↔ 242     By similarity
Disulfide bond  198 ↔ 252    By similarity
Disulfide bond  411          Interchain (by similarity)
Disulfide bond  415          Interchain (by similarity)

Post-translational modification

Covalently linked to the telopeptides of type II collagen by lysine-derived cross-links.
Prolines at the third position of the tripeptide repeating unit (G-X-Y) are hydroxylated in some or all of the chains.
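
The statement about prolines in the third (Y) position of the G-X-Y repeat can be illustrated with a short sketch. It only lists candidate sites in a fragment of the canonical sequence (residues 269-287, the start of Collagen-like 1); it does not say which of them are actually hydroxylated, since the entry states only that some or all chains carry the modification. The function name `y_position_prolines` is hypothetical and introduced here for illustration.

```python
def y_position_prolines(fragment: str, start: int) -> list[int]:
    """Return 1-based positions of prolines in the Y slot of G-X-Y triplets.

    `fragment` must begin at a glycine that opens a triplet; `start` is the
    1-based position of that glycine in the full chain.
    """
    sites = []
    # Walk the fragment three residues at a time; a trailing partial
    # triplet is ignored.
    for offset in range(0, len(fragment) - 2, 3):
        g, _x, y = fragment[offset:offset + 3]
        if g != "G":
            break  # the G-X-Y repeat is interrupted, stop scanning
        if y == "P":
            sites.append(start + offset + 2)
    return sites


# Residues 269-287 of the canonical sequence shown in this entry.
fragment = "GPPGEQGPPGPPGPPGVPG"
candidates = y_position_prolines(fragment, start=269)
print(candidates)
```

These are candidate positions only; confirming hydroxylation would need experimental data that this entry does not provide.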

Keywords - PTM

Disulfide bond, Hydroxylation

Proteomic databases

MaxQB: Q05722.
PaxDb: Q05722.
PRIDE: Q05722.

PTM databases

iPTMnet: Q05722.
PhosphoSitePlus: Q05722.

Expression

Gene expression databases

CleanEx: MM_COL9A1.

Interaction

Subunit structure

Heterotrimer of an alpha 1(IX), an alpha 2(IX) and an alpha 3(IX) chain.

Protein-protein interaction databases

STRING: 10090.ENSMUSP00000051579.

Structure

3D structure databases

ProteinModelPortal: Q05722.
SMR: Q05722.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Description      Length
Domain       50 – 244     Laminin G-like   195
Domain       269 – 325    Collagen-like 1  57
Domain       326 – 356    Collagen-like 2  31
Domain       358 – 403    Collagen-like 3  46
Domain       416 – 472    Collagen-like 4  57
Domain       473 – 512    Collagen-like 5  40
Domain       604 – 656    Collagen-like 6  53
Domain       657 – 711    Collagen-like 7  55
Domain       712 – 755    Collagen-like 8  44
Domain       790 – 847    Collagen-like 9  58

Region

Feature key  Position(s)  Description                  Length
Region       24 – 268     Nonhelical region (NC4)      245
Region       269 – 405    Triple-helical region (COL3) 137
Region       406 – 417    Nonhelical region (NC3)      12
Region       418 – 756    Triple-helical region (COL2) 339
Region       757 – 786    Nonhelical region (NC2)      30
Region       787 – 901    Triple-helical region (COL1) 115
Region       902 – 921    Nonhelical region (NC1)      20
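
As a quick consistency check on the region table, the following sketch recomputes each length as end - start + 1 and confirms that the seven regions tile the mature chain (positions 24 to 921, 898 residues) without gaps or overlaps. The coordinates are copied from the table; the helper names are hypothetical.

```python
# (name, start, end) with 1-based inclusive coordinates from the Region table.
regions = [
    ("NC4", 24, 268),
    ("COL3", 269, 405),
    ("NC3", 406, 417),
    ("COL2", 418, 756),
    ("NC2", 757, 786),
    ("COL1", 787, 901),
    ("NC1", 902, 921),
]


def region_lengths(regions):
    """Length of each region, counting both end positions."""
    return {name: end - start + 1 for name, start, end in regions}


def is_contiguous(regions):
    """True when every region starts right after the previous one ends."""
    return all(prev[2] + 1 == nxt[1] for prev, nxt in zip(regions, regions[1:]))


lengths = region_lengths(regions)
total = sum(lengths.values())
print(lengths, total, is_contiguous(regions))
```

The total agrees with the chain length listed under Molecule processing.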

Domain

Each subunit is composed of three triple-helical domains interspersed with non-collagenous domains. The globular domain at the N-terminus of type IX collagen molecules represents the NC4 domain which may participate in electrostatic interactions with polyanionic glycosaminoglycans in cartilage.

Sequence similarities

Contains 9 collagen-like domains. (Curated)
Contains 1 laminin G-like domain. (Curated)

Keywords - Domain

Collagen, Repeat, Signal

Phylogenomic databases

eggNOG: KOG3544. Eukaryota.
ENOG410Y4B3. LUCA.
HOGENOM: HOG000085653.
HOVERGEN: HBG004933.
InParanoid: Q05722.
KO: K08131.
PhylomeDB: Q05722.

Family and domain databases

InterPro: IPR008160. Collagen.
IPR013320. ConA-like_dom.
IPR001791. Laminin_G.
Pfam: PF01391. Collagen. 9 hits.
SMART: SM00210. TSPN. 1 hit.
SUPFAM: SSF49899. SSF49899. 1 hit.

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative splicing.

Note: Additional isoforms seem to exist.
Isoform Long (identifier: Q05722-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MKNFWKISVF FCVCSCLGPW VSATLKRRAR FPANSISNGG SELCPKIRIG
        60         70         80         90        100
QDDLPGFDLI SQFQIEKAAS RRTIQRVVGS TALQVAYKLG SNVDFRIPTR
       110        120        130        140        150
HLYPSGLPEE YSFLTTFRMT GSTLEKHWNI WQIQDSAGRE QVGVKINGQT
       160        170        180        190        200
KSVAFSYKGL DGSLQTAAFL NLPSLFDSRW HKLMIGVERT SATLFIDCIR
       210        220        230        240        250
IESLPIKPRG QIDADGFAVL GKLVDNPQVS VPFELQWMLI HCDPLRPRRE
       260        270        280        290        300
TCHELPIRIT TSQTTDERGP PGEQGPPGPP GPPGVPGIDG IDGDRGPKGP
       310        320        330        340        350
PGPPGPPGDP GKPGAPGKPG TPGADGLTGP DGSPGSVGPR GQKGEPGVPG
       360        370        380        390        400
SRGFPGRGIP GPPGPPGTTG LPGELGRVGP IGDPGKRGPP GPPGPPGPSG
       410        420        430        440        450
TIGFHDGDPL CPNSCPPGRS GYPGLPGMRG HKGAKGEIGE PGRQGHKGEE
       460        470        480        490        500
GDQGELGEVG AQGPPGPQGL RGITGIVGDK GEKGARGFDG EPGPQGIPGA
       510        520        530        540        550
AGDQGQRGPP GETGPKGDRG IQGSRGIPGS PGPKGDTGLP GVDGRDGIPG
       560        570        580        590        600
MPGTKGEAGK PGPPGDVGLQ GLPGVPGIPG AKGVAGEKGN TGAPGKPGQL
       610        620        630        640        650
GSSGKPGQQG PPGEVGPRGP RGLPGSRGPV GPEGSPGIPG KLGSVGSPGL
       660        670        680        690        700
PGLPGPPGLP GMKGDRGVFG EPGPKGEQGA SGEEGEAGAR GDLGDMGQPG
       710        720        730        740        750
PKGSVGNPGE PGLRGPEGIR GLPGVEGPRG PPGPRGMQGE QGATGLPGIQ
       760        770        780        790        800
GPPGRAPTDQ HIKQVCMRVV QEHFVEMAAS LKRPDTGASG LPGRPGPPGP
       810        820        830        840        850
PGPPGENGFP GQMGIRGLPG IKGPPGALGL RGPKGDLGEK GERGPPGRGP
       860        870        880        890        900
KGLPGAIGLP GDPGPASYGK NGRDGEQGPP GVAGIPGVPG PPGPPGPPGF
       910        920
CEPASCTLQS GQRAFSKGPD K
Length: 921
Mass (Da): 92,092
Last modified: July 15, 1999 - v2
Checksum: BC79177D36DCFFFC
Isoform Short (identifier: Q05722-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-260: Missing.
     261-261: T → MAWAAWGRGVLGLSLMLSGLRLCAAQT

Length: 687
Mass (Da): 65,493
Checksum: 9F20E93756155B19
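
The two differences listed for isoform Short can be applied mechanically to the canonical sequence. The sketch below does that with 1-based coordinates converted to Python slices. To keep it short it runs on a synthetic 921-residue placeholder (all alanine with a threonine at position 261), which is an assumption for illustration and not the real sequence; substituting the canonical sequence above should give the actual isoform. The function name `build_short_isoform` is hypothetical.

```python
# Replacement for residue 261, as listed in the isoform description.
N_TERMINAL_INSERT = "MAWAAWGRGVLGLSLMLSGLRLCAAQT"


def build_short_isoform(canonical: str) -> str:
    """Apply '1-260: Missing' and '261-261: T -> insert' to a canonical sequence."""
    # Position 261 (1-based) is index 260 in Python.
    if canonical[260] != "T":
        raise ValueError("expected T at position 261 of the canonical sequence")
    # Dropping residues 1-260 and replacing residue 261 leaves residues 262
    # onward, prefixed by the replacement peptide.
    return N_TERMINAL_INSERT + canonical[261:]


# Synthetic placeholder with the right length and a T at position 261.
placeholder = "A" * 260 + "T" + "A" * 660
short = build_short_isoform(placeholder)
print(len(placeholder), len(short))
```

The length arithmetic matches the entry: 921 - 261 + 27 = 687 residues.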

Experimental Info

Feature key        Position(s)  Description                                       Length
Sequence conflict  110          E → R in AAA21834 (PubMed:8061915). (Curated)     1
Sequence conflict  238          M → V in AAA21834 (PubMed:8061915). (Curated)     1
Sequence conflict  289          D → G in AAA21834 (PubMed:8061915). (Curated)     1
Sequence conflict  461          A → D in AAA21834 (PubMed:8061915). (Curated)     1
Sequence conflict  516          K → E in AAA21834 (PubMed:8061915). (Curated)     1
Sequence conflict  569 – 570    LQ → IA in AAA21834 (PubMed:8061915). (Curated)   2
Sequence conflict  740          E → D in AAA21834 (PubMed:8061915). (Curated)     1
Sequence conflict  775          V → A in AAA21834 (PubMed:8061915). (Curated)     1
Sequence conflict  775          V → A in CAA41049 (PubMed:2054384). (Curated)     1

Alternative sequence

Feature key           Position(s)  Description                                                               Length
Alternative sequence  1 – 260      Missing in isoform Short (VSP_001143). (Curated)                          260
Alternative sequence  261          T → MAWAAWGRGVLGLSLMLSGLRLCAAQT in isoform Short (VSP_001144). (Curated)  1

Sequence databases

EMBL / GenBank / DDBJ:
D17511 mRNA. Translation: BAA04463.1.
L12215 mRNA. Translation: AAA21834.1.
X57984 Genomic DNA. Translation: CAA41049.1.
M32136, M32132 Genomic DNA. Translation: AAA53523.1.
M32136, M32134 Genomic DNA. Translation: AAA53522.1.
CCDS: CCDS35527.1. [Q05722-1]
PIR: A35980.
S40495.
S42617.
RefSeq: NP_031766.3. NM_007740.3.
UniGene: Mm.154662.

Genome annotation databases

GeneID: 12839.
KEGG: mmu:12839.

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 1297.
MGI: MGI:88465. Col9a1.

Miscellaneous databases

PRO: Q05722.

Entry information

Entry name: CO9A1_MOUSE
Accession: Primary (citable) accession number: Q05722
Secondary accession number(s): Q61269, Q61270, Q61433, Q61940
Entry history
Integrated into UniProtKB/Swiss-Prot: February 1, 1995
Last sequence update: July 15, 1999
Last modified: November 2, 2016
This is version 137 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.