Reviewed, UniProtKB/Swiss-Prot Q9QZR9 (CO4A4_MOUSE)

Last modified June 16, 2009. Version 55.

Names and origin · Protein attributes · General annotation (Comments) · Ontologies · Alternative products · Sequence annotation (Features) · Sequences · References · Cross-references · Entry information · Relevant documents

Names and origin

Protein names
Recommended name:
    Collagen alpha-4(IV) chain
Gene names
Name: Col4a4
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota › Metazoa › Chordata › Craniata › Vertebrata › Euteleostomi › Mammalia › Eutheria › Euarchontoglires › Glires › Rodentia › Sciurognathi › Muroidea › Muridae › Murinae › Mus

Protein attributes

Sequence length: 1682 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at transcript level.

General annotation (Comments)

Function

Type IV collagen is the major structural component of glomerular basement membranes (GBM), forming a 'chicken-wire' meshwork together with laminins, proteoglycans and entactin/nidogen. Ref.3 Ref.4 UniProtKB P53420

Subunit structure

There are six type IV collagen isoforms, alpha 1(IV)-alpha 6(IV), each of which can form a triple-helical structure with two other chains to generate the type IV collagen network. The alpha 3(IV) chain forms a triple-helical protomer with alpha 4(IV) and alpha 5(IV); this triple-helical structure dimerizes through NC1-NC1 domain interactions such that the alpha 3(IV), alpha 4(IV) and alpha 5(IV) chains of one protomer connect with the alpha 5(IV), alpha 4(IV) and alpha 3(IV) chains of the opposite protomer, respectively (By similarity). Associates with LAMB2 at the neuromuscular junction and in GBM. Ref.3 UniProtKB P53420

Subcellular location

Secreted › extracellular space › extracellular matrix › basement membrane. Note: Colocalizes with COL4A3 and COL4A5 in GBM, tubular basement membrane (TBM) and synaptic basal lamina (BL). Ref.3

Tissue specificity

Highly expressed in kidney and lung. Detected at lower levels in heart, muscle and skin. Ref.3

Developmental stage

The expression of collagen IV undergoes a developmental shift in the developing lens capsule. During the early stages of lens capsule development expression of collagens alpha 1(IV), alpha 2(IV), alpha 5(IV) and alpha 6(IV) is observed; this is consistent with the presence of fibrillar alpha 1(IV)-alpha 1(IV)-alpha 2(IV) protomers and of elastic alpha 5(IV)-alpha 5(IV)-alpha 6(IV) protomers. In the later stages of development components of the more cross-linked alpha 3(IV)-alpha 4(IV)-alpha 5(IV) protomer appear. Ref.5

Domain

Alpha chains of type IV collagen have a non-collagenous domain (NC1) at their C-terminus, frequent interruptions of the G-X-Y repeats in the long central triple-helical domain (which may cause flexibility in the triple helix), and a short N-terminal triple-helical 7S domain (By similarity). UniProtKB P53420
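
The interruptions of the G-X-Y repeat described above can be located with a simple scan. The following is a minimal sketch, not the method used to annotate this entry: `find_gxy_interruptions` is a hypothetical helper, and it assumes a greedy reading frame that resumes at the next glycine after a break.

```python
def find_gxy_interruptions(segment: str, offset: int = 1) -> list[int]:
    """Return sequence positions where a Gly was expected to open a G-X-Y
    triplet but another residue was found.

    `offset` is the sequence position of segment[0] (1-based numbering),
    e.g. 57 for the triple-helical region annotated in this entry.
    """
    interruptions = []
    i = 0
    while i < len(segment):
        if segment[i] == "G":
            i += 3  # consume one G-X-Y triplet
        else:
            interruptions.append(i + offset)  # residue breaks the repeat
            i += 1  # slide forward until the repeat resumes at a Gly
    return interruptions


# Toy input: two perfect triplets, one extra Ala, then one more triplet.
print(find_gxy_interruptions("GPPGPPAGPP"))
```

Passing residues 57 to 1451 of the canonical sequence with `offset=57` would report candidate interruption positions in entry coordinates.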

Post-translational modification

Prolines at the third position of the tripeptide repeating unit (G-X-Y) are hydroxylated in some or all of the chains (By similarity). UniProtKB P53420

Type IV collagens contain numerous cysteine residues, which are involved in inter- and intramolecular disulfide bonding. Twelve of these, located in the NC1 domain, are conserved in all known type IV collagens (By similarity). UniProtKB P53420
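
As a cross-check of the statement about the NC1 cysteines, the sketch below counts the cysteines in residues 1457 to 1682 of the canonical sequence given in the Sequences section of this entry. The names `NC1`, `NC1_START` and `nc1_cysteine_positions` are illustrative only; the residues are copied from the displayed sequence.

```python
# NC1 domain of isoform 1 (positions 1457-1682), copied from the
# sequence display in blocks of 10; the first block is only 4 residues
# long because the domain starts at position 1457.
NC1_START = 1457
NC1 = (
    "GFLL "
    "VLHSQTDQEP ACPVGMPRLW TGYSLLYMEG QEKAHNQDLG "
    "LAGSCLPVFS TLPFAYCNIH QVCHYAQRND RSYWLSSAAP LPMMPLSEEE IRSYISRCAV "
    "CEAPAQAVAV HSQDQSIPPC PRTWRSLWIG YSFLMHTGAG DQGGGQALMS PGSCLEDFRA "
    "APFVECQGRQ GTCHFFANEY SFWLTTVNPD LQFASGPSPD TLKEVQAQRR KISRCQVCMK "
    "HS"
).replace(" ", "")


def nc1_cysteine_positions() -> list[int]:
    """Return the positions (entry numbering) of every Cys in the NC1 domain."""
    return [NC1_START + i for i, residue in enumerate(NC1) if residue == "C"]


positions = nc1_cysteine_positions()
print(len(positions), positions)
```

The twelve positions returned are the same twelve residues that appear as partners in the six disulfide bond features of this entry.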

Miscellaneous

The kidneys of transgenic mice where the 5' portions of both COL4A3 and COL4A4 and the shared intergenic promoter region were deleted exhibit morphological and ultrastructural features characteristic of the human hereditary disorder Alport syndrome, including disorganization and multilamellar structure of the GBM and delayed onset glomerulonephritis. Ref.1

Sequence similarities

Belongs to the type IV collagen family.

Contains 1 collagen IV NC1 (C-terminal non-collagenous) domain.

Alternative products

This entry describes 2 isoforms produced by alternative splicing.
Isoform 1 Ref.3 Ref.1 (identifier: Q9QZR9-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Isoform 2 Ref.2 (identifier: Q9QZR9-2)

The sequence of this isoform differs from the canonical sequence as follows:
     470-488: GRRGAKGAKGNKGLCTCPP → PLWIKQTLYMWSCSPFSFY
     489-1682: Missing.
Note: No experimental confirmation available.
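
The two differences listed for isoform 2 can be applied to the canonical sequence mechanically. The sketch below is a hypothetical helper (`derive_isoform_2` is not a UniProt function); it assumes 1-based, inclusive positions as used throughout this entry and checks that the replaced segment matches before editing.

```python
ORIGINAL_470_488 = "GRRGAKGAKGNKGLCTCPP"
REPLACEMENT_470_488 = "PLWIKQTLYMWSCSPFSFY"


def derive_isoform_2(canonical: str) -> str:
    """Apply the isoform 2 differences to the canonical (isoform 1) sequence.

    470-488 is replaced by a segment of equal length and 489-1682 is
    missing, so the result is the first 469 residues plus 19 new ones.
    """
    start, end = 470, 488  # 1-based, inclusive
    found = canonical[start - 1:end]
    if found != ORIGINAL_470_488:
        raise ValueError(f"unexpected residues at {start}-{end}: {found!r}")
    return canonical[:start - 1] + REPLACEMENT_470_488


# Placeholder input with the right layout (NOT the real protein):
# 469 filler residues, the expected segment, then 1194 filler residues.
toy = "A" * 469 + ORIGINAL_470_488 + "C" * 1194
print(len(toy), len(derive_isoform_2(toy)))
```

With a 1682-residue input the result is 488 residues long, which agrees with the length reported for isoform 2 in the Sequences section.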

Sequence annotation (Features)

Feature key           Position(s)    Length  Description                                               Feature identifier

Molecule processing

Signal peptide        1 – 32         32      Potential
Chain                 33 – 1682      1650    Collagen alpha-4(IV) chain                                PRO_0000283793

Regions

Domain                1457 – 1682    226     Collagen IV NC1
Region                31 – 56        26      7S domain By similarity UniProtKB P53420
Region                57 – 1451      1395    Triple-helical region By similarity UniProtKB P53420
Motif                 86 – 88        3       Cell attachment site Potential
Motif                 137 – 139      3       Cell attachment site Potential
Motif                 181 – 183      3       Cell attachment site Potential
Motif                 587 – 589      3       Cell attachment site Potential
Motif                 593 – 595      3       Cell attachment site Potential
Motif                 716 – 718      3       Cell attachment site Potential
Motif                 980 – 982      3       Cell attachment site Potential
Motif                 992 – 994      3       Cell attachment site Potential
Motif                 1144 – 1146    3       Cell attachment site Potential

Sites

Site                  1197 – 1198    2       Cleavage; by collagenase By similarity UniProtKB P53420

Amino acid modifications

Glycosylation         43             1       N-linked (GlcNAc...) Potential
Glycosylation         134            1       N-linked (GlcNAc...) Potential
Glycosylation         661            1       N-linked (GlcNAc...) Potential
Disulfide bond        1472 ↔ 1561            Or C-1472 with C-1558 By similarity UniProtKB P53420
Disulfide bond        1505 ↔ 1558            Or C-1505 with C-1561 By similarity UniProtKB P53420
Disulfide bond        1517 ↔ 1523            By similarity UniProtKB P53420
Disulfide bond        1580 ↔ 1678            Or C-1580 with C-1675 By similarity UniProtKB P53420
Disulfide bond        1614 ↔ 1675            Or C-1614 with C-1678 By similarity UniProtKB P53420
Disulfide bond        1626 ↔ 1633            By similarity UniProtKB P53420

Natural variations

Alternative sequence  470 – 488      19      GRRGA…CTCPP → PLWIKQTLYMWSCSPFSFY in isoform 2. Ref.2     VSP_052356
Alternative sequence  489 – 1682     1194    Missing in isoform 2. Ref.2                               VSP_052357

Experimental info

Sequence conflict     1373           1       L → F in CAA84530. Ref.3
Sequence conflict     1654           1       A → P in CAA84530. Ref.3
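
Several of the "Potential" features above correspond to short sequence patterns, and every Length value follows from the position range. The sketch below illustrates both on the first 120 residues of the canonical sequence. It is an assumption-laden illustration rather than the annotation procedure behind this entry: `find_motif` and `feature_length` are hypothetical helpers, RGD is taken as the cell attachment motif because that is the tripeptide found at each annotated motif position, and N-x-[ST] (x not Pro) is the commonly used N-glycosylation consensus.

```python
import re

# Residues 1-120 of isoform 1, copied from the sequence display.
FRAGMENT = (
    "MRCFFRWTKS FVTAPWSLIF ILFTIQYEYG SGKKYGGPCG GRNCSVCQCF PEKGSRGHPG "
    "PLGPQGPIGP LGPLGPIGIP GEKGERGDSG SPGPPGEKGD KGPTGVPGFP GVDGVPGHPG"
).replace(" ", "")

CELL_ATTACHMENT = "RGD"       # assumed motif for "Cell attachment site"
N_GLYC_SEQUON = "N[^P][ST]"   # assumed consensus for N-linked glycosylation


def find_motif(sequence: str, pattern: str) -> list[int]:
    """Return 1-based start positions of all matches, overlapping included."""
    # The lookahead makes each match zero-width so overlaps are not skipped.
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", sequence)]


def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive start and end positions."""
    return end - start + 1


print(find_motif(FRAGMENT, CELL_ATTACHMENT))  # [86]
print(find_motif(FRAGMENT, N_GLYC_SEQUON))    # [43]
print(feature_length(33, 1682))               # 1650
```

Within this fragment the scan returns position 86 for the motif and position 43 for the sequon, matching the first Motif and Glycosylation rows of the table.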

Sequences

Isoform 1.

Last modified May 1, 2000. Version 1.
Checksum: 6F7B679EDD76E904
Length: 1,682 AA. Mass: 164,096 Da.
        10         20         30         40         50         60 
MRCFFRWTKS FVTAPWSLIF ILFTIQYEYG SGKKYGGPCG GRNCSVCQCF PEKGSRGHPG 

        70         80         90        100        110        120 
PLGPQGPIGP LGPLGPIGIP GEKGERGDSG SPGPPGEKGD KGPTGVPGFP GVDGVPGHPG 

       130        140        150        160        170        180 
PPGPRGKPGV DGYNGSRGDP GYPGERGAPG PGGPPGQPGE NGEKGRSVYI TGGVKGIQGD 

       190        200        210        220        230        240 
RGDPGPPGLP GSRGAQGSPG PMGHAGAPGL AGPIGHPGSP GLKGNPATGL KGQRGEPGEV 

       250        260        270        280        290        300 
GQRGPPGPTL LVQPPDLSIY KGEKGVKGMP GMIGPPGPPG RKGAPGVGIK GEKGIPGFPG 

       310        320        330        340        350        360 
PRGEPGSHGP PGFPGFKGIQ GAAGEPGLFG FLGPKGDLGD RGYPGPPGIL LTPAPPLKGV 

       370        380        390        400        410        420 
PGDPGPPGYY GEIGDVGLPG PPGPPGRPGE TCPGMMGPPG PPGVPGPPGF PGEAGVPGRL 

       430        440        450        460        470        480 
DCAPGKPGKP GLPGLPGAPG PEGPPGSDVI YCRPGCPGPM GEKGKVGPPG RRGAKGAKGN 

       490        500        510        520        530        540 
KGLCTCPPGP MGPPGPPGPP GRQGSKGDLG LPGWHGEKGD PGQPGAEGPP GPPGRPGAMG 

       550        560        570        580        590        600 
PPGHKGEKGD MVISRVKGQK GERGLDGPPG FPGPHGQDGG DGRPGERGDP GPRGDHKDAA 

       610        620        630        640        650        660 
PGERGLPGLP GPPGRTGPEG PPGLGFPGPP GQRGLPGEPG RPGTRGFDGT KGQKGDSILC 

       670        680        690        700        710        720 
NVSYPGKPGL PGLDGPPGLK GFPGPPGAPG MRCPDGQKGQ RGKPGMSGIP GPPGFRGDMG 

       730        740        750        760        770        780 
DPGIKGEKGT SPIGPPGPPG SPGKDGQKGI PGDPAFGDPG PPGERGLPGA PGMKGQKGHP 

       790        800        810        820        830        840 
GCPGAGGPPG IPGSPGLKGP KGREGSRGFP GIPGSPGHSC ERGAPGIPGQ PGLPGTPGDP 

       850        860        870        880        890        900 
GAPGWKGQPG DMGPSGPAGM KGLPGLPGLP GADGLRGPPG IPGPNGEDGL PGLPGLKGLP 

       910        920        930        940        950        960 
GLPGFPGFPG ERGKPGPDGE PGRKGEVGEK GWPGLKGDLG ERGAKGDRGL PGDAGEAVTS 

       970        980        990       1000       1010       1020 
RKGEPGDAGP PGDGGFSGER GDKGSSGMRG GRGDPGRDGL PGLHRGQPGI DGPPGPPGPP 

      1030       1040       1050       1060       1070       1080 
GPPGSPGLRG VIGFPGFPGD QGDPGSPGPP GFPGDDGARG PKGYKGDPAS QCGPPGPKGE 

      1090       1100       1110       1120       1130       1140 
PGSPGYQGRT GVPGEKGFPG DEGPRGPPGR PGQPGSFGPP GCPGDPGMPG LKGHPGEVGD 

      1150       1160       1170       1180       1190       1200 
PGPRGDAGDF GRPGPAGVKG PLGSPGLNGL HGLKGEKGTK GASGLLEMGP PGPMGMPGQK 

      1210       1220       1230       1240       1250       1260 
GEKGDPGSPG ISPPGLPGEK GFPGPPGRPG PPGPAGAPGR AAKGDIPDPG PPGDRGPPGP 

      1270       1280       1290       1300       1310       1320 
DGPRGVPGPP GSPGNVDLLK GDPGDCGLPG PPGSRGPPGP PGCQGPPGCD GKDGQKGPMG 

      1330       1340       1350       1360       1370       1380 
LPGLPGPPGL PGAPGEKGLP GPPGRKGPVG PPGCRGEPGP PADVDSCPRI PGLPGVPGPR 

      1390       1400       1410       1420       1430       1440 
GPEGAMGEPG RRGLPGPGCK GEPGPDGRRG QDGIPGSPGP PGRKGDTGEA GCPGAPGPPG 

      1450       1460       1470       1480       1490       1500 
PTGDPGPKGF GPGSLSGFLL VLHSQTDQEP ACPVGMPRLW TGYSLLYMEG QEKAHNQDLG 

      1510       1520       1530       1540       1550       1560 
LAGSCLPVFS TLPFAYCNIH QVCHYAQRND RSYWLSSAAP LPMMPLSEEE IRSYISRCAV 

      1570       1580       1590       1600       1610       1620 
CEAPAQAVAV HSQDQSIPPC PRTWRSLWIG YSFLMHTGAG DQGGGQALMS PGSCLEDFRA 

      1630       1640       1650       1660       1670       1680 
APFVECQGRQ GTCHFFANEY SFWLTTVNPD LQFASGPSPD TLKEVQAQRR KISRCQVCMK 


HS 
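
The display above interleaves ruler lines with residues in blocks of ten, so it has to be flattened before positions can be indexed. A minimal sketch, assuming the pasted text contains only ruler numbers, whitespace and one-letter residue codes (`parse_display_sequence` is a hypothetical helper, not a UniProt tool):

```python
def parse_display_sequence(text: str) -> str:
    """Flatten a blocked sequence display into one contiguous string.

    Ruler lines consist of digits and spaces only, so keeping just the
    alphabetic characters drops rulers, spaces and line breaks together.
    """
    return "".join(ch for ch in text if ch.isalpha())


# First two rows of the display, rulers included.
display = """
        10         20         30         40         50         60
MRCFFRWTKS FVTAPWSLIF ILFTIQYEYG SGKKYGGPCG GRNCSVCQCF PEKGSRGHPG

        70         80         90        100        110        120
PLGPQGPIGP LGPLGPIGIP GEKGERGDSG SPGPPGEKGD KGPTGVPGFP GVDGVPGHPG
"""

sequence = parse_display_sequence(display)
print(len(sequence))  # 120
print(sequence[:10])  # MRCFFRWTKS
```

Position `p` in the entry's 1-based numbering corresponds to `sequence[p - 1]` in the flattened string.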

Isoform 2.

Checksum: 480DF110A11293C3
Length: 488 AA. Mass: 47,945 Da.

References

[1] "Insertional mutation of the collagen genes Col4a3 and Col4a4 in a mouse model of Alport syndrome."
Lu W., Phillips C.L., Killen P.D., Hlaing T., Harrison W.R., Elder F.F.B., Miner J.H., Overbeek P.A., Meisler M.H.
Genomics 61:113-124(1999) [PubMed: 10534397]
Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1).
Tissue: Embryonic kidney.
[2] "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
The MGC Project Team
Genome Res. 14:2121-2127(2004) [PubMed: 15489334]
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).
[3] "Collagen IV alpha 3, alpha 4, and alpha 5 chains in rodent basal laminae: sequence, distribution, association with laminins, and developmental switches."
Miner J.H., Sanes J.R.
J. Cell Biol. 127:879-891(1994) [PubMed: 7962065]
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 1371-1682 (ISOFORM 1), FUNCTION, ASSOCIATED WITH LAMB2, SUBCELLULAR LOCATION, TISSUE SPECIFICITY.
Strain: BALB/c.
Tissue: Kidney.
[4] "Nidogen-1. Expression and ultrastructural localization during the onset of mesoderm formation in the early mouse embryo."
Miosge N., Quondamatteo F., Klenczar C., Herken R.
J. Histochem. Cytochem. 48:229-238(2000) [PubMed: 10639489]
Cited for: FUNCTION.
[5] "Collagen IV in the developing lens capsule."
Kelley P.B., Sado Y., Duncan M.K.
Matrix Biol. 21:415-423(2002) [PubMed: 12225806]
Cited for: DEVELOPMENTAL STAGE.

Cross-references

Sequence databases

AF169388 mRNA. Translation: AAD50450.1.
BC117709 mRNA. Translation: AAI17710.1.
Z35167 mRNA. Translation: CAA84530.1.
IPI: IPI00626353.
    IPI00844776.
PIR: I48303.
RefSeq: NP_031761.1.
UniGene: Mm.40253
    Mm.460425

3D structure databases

HSSP: HSSP built from PDB template 1LI1 based on UniProtKB P08572.

PTM databases

PhosphoSite: Q9QZR9.

Genome annotation databases

Ensembl: ENSMUSG00000067158. Mus musculus.
GeneID: 12829.
KEGG: mmu:12829.

Organism-specific databases

MGI: MGI:104687. Col4a4.

Phylogenomic databases

HOVERGEN: Q9QZR9.
OMA: Q9QZR9. DMGDPGF.

Gene expression databases

ArrayExpress: Q9QZR9.
Bgee: Q9QZR9.
CleanEx: MM_COL4A4.

Family and domain databases

InterPro: IPR008160. Collagen.
    IPR001442. Procollagn4_C.
Gene3D: G3DSA:2.170.240.10. Procollagn4_C. 1 hit.
Pfam: PF01413. C4. 2 hits.
    PF01391. Collagen. 23 hits.
ProDom: PD000007. Clg_helix. 6 hits.
    PD003923. Procollagn4_C. 2 hits.
SMART: SM00111. C4. 2 hits.
PROSITE: PS51403. NC1_IV. 1 hit.

Other Resources

NextBio: 282326.

Entry information

Entry name: CO4A4_MOUSE
Accession
Primary (citable) accession number: Q9QZR9
Secondary accession number(s): Q149M2, Q64457
Entry history
Integrated into UniProtKB/Swiss-Prot: April 17, 2007
Last sequence update: May 1, 2000
Last modified: June 16, 2009
This is version 55 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation project: HPI (Human Proteome Initiative)

Relevant documents

MGD cross-references

Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot

SIMILARITY comments

Index of protein domains and families
