Protein

Collagen alpha-1(IV) chain

Gene

Col4a1

Organism
Mus musculus (Mouse)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Type IV collagen is the major structural component of glomerular basement membranes (GBM), forming a 'chicken-wire' meshwork together with laminins, proteoglycans and entactin/nidogen.
Arresten, comprising the C-terminal NC1 domain, inhibits angiogenesis and tumor formation. The anti-angiogenic activity resides in its C-terminal half. It specifically inhibits endothelial cell proliferation, migration and tube formation, and it inhibits expression of hypoxia-inducible factor 1alpha as well as ERK1/2 and p38 MAPK activation. Ligand for alpha1/beta1 integrin (by similarity).

Keywords - Biological process

Angiogenesis

Enzyme and pathway databases

Reactome: R-BTA-3000480. Scavenging by Class A Receptors.
R-MMU-1442490. Collagen degradation.
R-MMU-1474244. Extracellular matrix organization.
R-MMU-1650814. Collagen biosynthesis and modifying enzymes.
R-MMU-186797. Signaling by PDGF.
R-MMU-2022090. Assembly of collagen fibrils and other multimeric structures.
R-MMU-216083. Integrin cell surface interactions.
R-MMU-2214320. Anchoring fibril formation.
R-MMU-3000157. Laminin interactions.
R-MMU-3000171. Non-integrin membrane-ECM interactions.
R-MMU-419037. NCAM1 interactions.

Names & Taxonomy

Protein names
Recommended name: Collagen alpha-1(IV) chain
Cleaved into the following chain: Arresten
Gene names
Name: Col4a1
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 8

Organism-specific databases

MGI: MGI:88454. Col4a1.

Subcellular location

GO - Cellular component

  • basement membrane (Source: UniProtKB)
  • collagen type IV trimer (Source: MGI)
  • extracellular matrix (Source: UniProtKB)
  • extracellular region (Source: Reactome)

Keywords - Cellular component

Basement membrane, Extracellular matrix, Secreted

Pathology & Biotech

Disruption phenotype

Mice develop perinatal cerebral hemorrhage and porencephaly. The mutant protein inhibits the secretion of mutant and normal proteins into the basement membrane of embryonic origin. The mutation is semidominant. (1 publication)

PTM / Processing

Molecule processing

Feature key     Position(s)   Identifier      Description                        Length
Signal peptide  1 – 27        -               -                                  27
Propeptide      28 – 172      PRO_0000005750  N-terminal propeptide (7S domain)  145
Chain           173 – 1669    PRO_0000005751  Collagen alpha-1(IV) chain         1497
Chain           1445 – 1669   PRO_0000390483  Arresten                           225
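
The positions in this table are 1-based and inclusive, so a feature spanning start to end covers end - start + 1 residues. The following is a minimal sketch of that arithmetic, assuming the coordinates are typed in by hand from the table; the names `FEATURES`, `feature_length` and `feature_slice` are illustrative and not part of any UniProt tool.

```python
# Coordinates copied from the Molecule processing table (1-based, inclusive).
FEATURES = [
    ("Signal peptide", 1, 27),
    ("Propeptide (7S domain)", 28, 172),
    ("Collagen alpha-1(IV) chain", 173, 1669),
    ("Arresten", 1445, 1669),
]


def feature_length(start, end):
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def feature_slice(sequence, start, end):
    """Extract a feature from a sequence string.

    Python slices are 0-based and end-exclusive, so only the start shifts by one.
    """
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, start, end in FEATURES:
        print(f"{name}: {start}-{end}, length {feature_length(start, end)}")
```

The computed lengths can be compared against the Length column as a quick consistency check when coordinates are transcribed manually.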

Amino acid modifications

Feature key     Position(s)   Description                                                        Evidence
Glycosylation   126           N-linked (GlcNAc...)                                               Sequence analysis
Disulfide bond  1460 ↔ 1551   Or C-1460 with C-1548                                              PROSITE-ProRule annotation
Disulfide bond  1493 ↔ 1548   Or C-1493 with C-1551                                              PROSITE-ProRule annotation
Disulfide bond  1505 ↔ 1511   -                                                                  PROSITE-ProRule annotation
Cross-link      1533          S-Lysyl-methionine sulfilimine (Met-Lys) (interchain with K-1651)  By similarity
Disulfide bond  1570 ↔ 1665   Or C-1570 with C-1662                                              PROSITE-ProRule annotation
Disulfide bond  1604 ↔ 1662   Or C-1604 with C-1665                                              PROSITE-ProRule annotation
Disulfide bond  1616 ↔ 1622   -                                                                  PROSITE-ProRule annotation
Cross-link      1651          S-Lysyl-methionine sulfilimine (Lys-Met) (interchain with M-1533)  By similarity
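
A simple sanity check on this table is that every disulfide position is a cysteine and that the sulfilimine cross-link joins a methionine and a lysine. The sketch below does that against residues 1451-1669 as displayed in the Sequence section of this entry; the helper names (`residue_at`, `check_annotations`) are hypothetical, and the check says nothing about which of the alternative pairings actually forms.

```python
# Residues 1451-1669, copied from the sequence display of this entry
# (blocks of 10 residues; whitespace is stripped below).
TAIL_START = 1451
TAIL = "".join("""
HSQTTDDPLC PPGTKILYHG YSLLYVQGNE RAHGQDLGTA GSCLRKFSTM
PFLFCNINNV CNFASRNDYS YWLSTPEPMP MSMAPISGDN IRPFISRCAV
CEAPAMVMAV HSQTIQIPQC PNGWSSLWIG YSFVMHTSAG AEGSGQALAS
PGSCLEEFRS APFIECHGRG TCNYYANAYS FWLATIERSE MFKKPTPSTL
KAGELRTHVS RCQVCMRRT
""".split())

# Annotated pairs from the Amino acid modifications table.
DISULFIDES = [(1460, 1551), (1493, 1548), (1505, 1511),
              (1570, 1665), (1604, 1662), (1616, 1622)]
SULFILIMINE = (1533, "M", 1651, "K")


def residue_at(position):
    """Return the residue at a 1-based position within 1451-1669."""
    index = position - TAIL_START
    if not 0 <= index < len(TAIL):
        raise IndexError(f"position {position} is outside 1451-1669")
    return TAIL[index]


def check_annotations():
    """Return a list of mismatch messages (empty if everything is consistent)."""
    problems = []
    for first, second in DISULFIDES:
        for position in (first, second):
            if residue_at(position) != "C":
                problems.append(f"{position} is {residue_at(position)}, expected C")
    met_pos, met, lys_pos, lys = SULFILIMINE
    if residue_at(met_pos) != met:
        problems.append(f"{met_pos} is {residue_at(met_pos)}, expected {met}")
    if residue_at(lys_pos) != lys:
        problems.append(f"{lys_pos} is {residue_at(lys_pos)}, expected {lys}")
    return problems


if __name__ == "__main__":
    print(check_annotations() or "all annotated positions match")
```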

Post-translational modification

Prolines at the third position of the tripeptide repeating unit (G-X-Y) are hydroxylated in some or all of the chains.
Type IV collagens contain numerous cysteine residues which are involved in inter- and intramolecular disulfide bonding. 12 of these, located in the NC1 domain, are conserved in all known type IV collagens.
Proteolytic processing produces the C-terminal NC1 peptide, arresten. (By similarity)
The trimeric structure of the NC1 domains is stabilized by covalent bonds between Lys and Met residues. (By similarity)
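
To make the first statement concrete, the sketch below lists prolines that sit in the Y position of consecutive G-X-Y triplets. It assumes an uninterrupted stretch that starts on a glycine, and it only enumerates candidate sites: the entry states that hydroxylation occurs in some or all chains, so the output is not a prediction. The function name `y_position_prolines` is illustrative.

```python
def y_position_prolines(triple_helix, first_position=1):
    """Return 1-based positions of Pro residues in the Y position.

    Assumes `triple_helix` is an uninterrupted run of G-X-Y triplets that
    starts on Gly; `first_position` is the entry coordinate of that Gly.
    """
    sites = []
    for i in range(0, len(triple_helix) - 2, 3):
        triplet = triple_helix[i:i + 3]
        if triplet[0] != "G":
            raise ValueError(f"triplet {triplet!r} at offset {i} does not start with G")
        if triplet[2] == "P":
            sites.append(first_position + i + 2)
    return sites


if __name__ == "__main__":
    # Residues 203-217 as displayed in the Sequence section of this entry.
    print(y_position_prolines("GPPGPPGPPGPPGEK", first_position=203))
```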

Keywords - PTM

Disulfide bond, Glycoprotein, Hydroxylation

Proteomic databases

MaxQB: P02463.
PaxDb: P02463.
PeptideAtlas: P02463.
PRIDE: P02463.

PTM databases

iPTMnet: P02463.
PhosphoSitePlus: P02463.

Expression

Gene expression databases

Bgee: ENSMUSG00000031502.
CleanEx: MM_COL4A1.
ExpressionAtlas: P02463. baseline and differential.
Genevisible: P02463. MM.

Interaction

Subunit structure

There are six type IV collagen isoforms, alpha 1(IV) to alpha 6(IV), each of which can form a triple helix structure with 2 other chains to generate the type IV collagen network.

Protein-protein interaction databases

BioGrid: 198816. 1 interactor.
IntAct: P02463. 1 interactor.
MINT: MINT-4388747.
STRING: 10090.ENSMUSP00000033898.

Structure

3D structure databases

ProteinModelPortal: P02463.
SMR: P02463.

Family & Domains

Domains and Repeats

Feature key  Position(s)   Description      Evidence                    Length
Domain       1445 – 1669   Collagen IV NC1  PROSITE-ProRule annotation  225

Region

Feature key  Position(s)   Description            Length
Region       173 – 1440    Triple-helical region  1268

Domain

Alpha chains of type IV collagen have a non-collagenous domain (NC1) at their C-terminus, frequent interruptions of the G-X-Y repeats in the long central triple-helical domain (which may cause flexibility in the triple helix), and a short N-terminal triple-helical 7S domain.
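
The interruptions of the G-X-Y repeats mentioned above can be located with a simple scan. The following is a rough sketch, assuming that a "run" means at least two consecutive triplets beginning with Gly and that anything between two runs counts as an interruption; both the threshold and the function name `find_interruptions` are assumptions made for illustration, not a UniProt definition.

```python
import re

# One or more consecutive triplets that each start with Gly.
TRIPLET_RUN = re.compile(r"(?:G..)+")


def find_interruptions(region, offset=1, min_triplets=2):
    """Return (start, end, residues) for gaps between G-X-Y runs.

    `offset` is the entry coordinate of the first residue in `region`;
    returned coordinates are 1-based and inclusive. Runs shorter than
    `min_triplets` are treated as part of the interruption.
    """
    runs = [m for m in TRIPLET_RUN.finditer(region)
            if (m.end() - m.start()) // 3 >= min_triplets]
    interruptions = []
    for previous, following in zip(runs, runs[1:]):
        gap = region[previous.end():following.start()]
        interruptions.append(
            (previous.end() + offset, following.start() + offset - 1, gap))
    return interruptions


if __name__ == "__main__":
    # Residues 151-200 as displayed in the Sequence section of this entry.
    stretch = "GLPGMKGDPGEILGHVPGTLLKGERGFPGIPGMPGSPGLPGLQGPVGPPG"
    for start, end, gap in find_interruptions(stretch, offset=151):
        print(f"{start}-{end}: {gap}")
```

On this stretch the scan reports a single gap, residues 163-172, which ends where the table above places the start of the triple-helical region (173).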

Sequence similarities

Belongs to the type IV collagen family. (PROSITE-ProRule annotation)
Contains 1 collagen IV NC1 (C-terminal non-collagenous) domain. (PROSITE-ProRule annotation)

Keywords - Domain

Collagen, Repeat, Signal

Phylogenomic databases

eggNOG: KOG3544. Eukaryota.
        ENOG410XNMM. LUCA.
GeneTree: ENSGT00840000129673.
HOVERGEN: HBG004933.
InParanoid: P02463.
KO: K06237.
OMA: CEANGPP.
OrthoDB: EOG091G0613.
PhylomeDB: P02463.
TreeFam: TF316865.

Family and domain databases

Gene3D: 2.170.240.10. 1 hit.
InterPro: IPR008160. Collagen.
          IPR001442. Collagen_VI_NC.
          IPR016187. CTDL_fold.
Pfam: PF01413. C4. 2 hits.
      PF01391. Collagen. 17 hits.
SMART: SM00111. C4. 2 hits.
SUPFAM: SSF56436. SSF56436. 2 hits.
PROSITE: PS51403. NC1_IV. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

P02463-1 [UniParc]

        10         20         30         40         50
MGPRLSVWLL LLFAALLLHE ERSRAAAKGD CGGSGCGKCD CHGVKGQKGE
60 70 80 90 100
RGLPGLQGVI GFPGMQGPEG PHGPPGQKGD AGEPGLPGTK GTRGPPGAAG
110 120 130 140 150
YPGNPGLPGI PGQDGPPGPP GIPGCNGTKG ERGPLGPPGL PGFSGNPGPP
160 170 180 190 200
GLPGMKGDPG EILGHVPGTL LKGERGFPGI PGMPGSPGLP GLQGPVGPPG
210 220 230 240 250
FTGPPGPPGP PGPPGEKGQM GSSFQGPKGD KGEQGVSGPP GVPGQAQVKE
260 270 280 290 300
KGDFAPTGEK GQKGEPGFPG VPGYGEKGEP GKQGPRGKPG KDGEKGERGS
310 320 330 340 350
PGIPGDSGYP GLPGRQGPQG EKGEAGLPGP PGTVIGTMPL GEKGDRGYPG
360 370 380 390 400
APGLRGEPGP KGFPGTPGQP GPPGFPTPGQ AGAPGFPGER GEKGDQGFPG
410 420 430 440 450
VSLPGPSGRD GAPGPPGPPG PPGQPGHTNG IVECQPGPPG DQGPPGTPGQ
460 470 480 490 500
PGLTGEVGQK GQKGESCLAC DTEGLRGPPG PQGPPGEIGF PGQPGAKGDR
510 520 530 540 550
GLPGRDGLEG LPGPQGSPGL IGQPGAKGEP GEIFFDMRLK GDKGDPGFPG
560 570 580 590 600
QPGMPGRAGT PGRDGHPGLP GPKGSPGSIG LKGERGPPGG VGFPGSRGDI
610 620 630 640 650
GPPGPPGVGP IGPVGEKGQA GFPGGPGSPG LPGPKGEAGK VVPLPGPPGA
660 670 680 690 700
AGLPGSPGFP GPQGDRGFPG TPGRPGIPGE KGAVGQPGIG FPGLPGPKGV
710 720 730 740 750
DGLPGEIGRP GSPGRPGFNG LPGNPGPQGQ KGEPGIGLPG LKGQPGLPGI
760 770 780 790 800
PGTPGEKGSI GGPGVPGEQG LTGPPGLQGI RGDPGPPGVQ GPAGPPGVPG
810 820 830 840 850
IGPPGAMGPP GGQGPPGSSG PPGIKGEKGF PGFPGLDMPG PKGDKGSQGL
860 870 880 890 900
PGLTGQSGLP GLPGQQGTPG VPGFPGSKGE MGVMGTPGQP GSPGPAGTPG
910 920 930 940 950
LPGEKGDHGL PGSSGPRGDP GFKGDKGDVG LPGMPGSMEH VDMGSMKGQK
960 970 980 990 1000
GDQGEKGQIG PTGDKGSRGD PGTPGVPGKD GQAGHPGQPG PKGDPGLSGT
1010 1020 1030 1040 1050
PGSPGLPGPK GSVGGMGLPG SPGEKGVPGI PGSQGVPGSP GEKGAKGEKG
1060 1070 1080 1090 1100
QSGLPGIGIP GRPGDKGDQG LAGFPGSPGE KGEKGSAGTP GMPGSPGPRG
1110 1120 1130 1140 1150
SPGNIGHPGS PGLPGEKGDK GLPGLDGVPG VKGEAGLPGT PGPTGPAGQK
1160 1170 1180 1190 1200
GEPGSDGIPG SAGEKGEQGV PGRGFPGFPG SKGDKGSKGE VGFPGLAGSP
1210 1220 1230 1240 1250
GIPGVKGEQG FMGPPGPQGQ PGLPGTPGHP VEGPKGDRGP QGQPGLPGHP
1260 1270 1280 1290 1300
GPMGPPGFPG INGPKGDKGN QGWPGAPGVP GPKGDPGFQG MPGIGGSPGI
1310 1320 1330 1340 1350
TGSKGDMGLP GVPGFQGQKG LPGLQGVKGD QGDQGVPGPK GLQGPPGPPG
1360 1370 1380 1390 1400
PYDVIKGEPG LPGPEGPPGL KGLQGPPGPK GQQGVTGSVG LPGPPGVPGF
1410 1420 1430 1440 1450
DGAPGQKGET GPFGPPGPRG FPGPPGPDGL PGSMGPPGTP SVDHGFLVTR
1460 1470 1480 1490 1500
HSQTTDDPLC PPGTKILYHG YSLLYVQGNE RAHGQDLGTA GSCLRKFSTM
1510 1520 1530 1540 1550
PFLFCNINNV CNFASRNDYS YWLSTPEPMP MSMAPISGDN IRPFISRCAV
1560 1570 1580 1590 1600
CEAPAMVMAV HSQTIQIPQC PNGWSSLWIG YSFVMHTSAG AEGSGQALAS
1610 1620 1630 1640 1650
PGSCLEEFRS APFIECHGRG TCNYYANAYS FWLATIERSE MFKKPTPSTL
1660
KAGELRTHVS RCQVCMRRT
Length: 1,669
Mass (Da): 160,679
Last modified: February 20, 2007 - v4
Checksum: EFEEC72AF301E5CF

Sequence caution

The sequence AAH72650 differs from that shown. Insertion sequence. (Curated)
The sequence AAH72650 differs from that shown. Reason: Frameshift at position 1547. (Curated)

Experimental Info

Feature key        Position  Description                          Evidence  Length
Sequence conflict  26        A → P in CAA29946 (PubMed:3338568).  Curated   1
Sequence conflict  186       S → L in CAA29946 (PubMed:3338568).  Curated   1
Sequence conflict  319       Q → S in CAA29946 (PubMed:3338568).  Curated   1
Sequence conflict  369       Q → L in CAA29946 (PubMed:3338568).  Curated   1
Sequence conflict  403       L → F in CAA29946 (PubMed:3338568).  Curated   1
Sequence conflict  481       P → L in CAA29946 (PubMed:3338568).  Curated   1
Sequence conflict  493       Q → H in CAA29946 (PubMed:3338568).  Curated   1
Sequence conflict  621       G → S in BAE27208 (PubMed:16141072). Curated   1
Sequence conflict  712       S → I in CAA29946 (PubMed:3338568).  Curated   1
Sequence conflict  813       Q → E in AAA50292 (PubMed:2703490).  Curated   1
Sequence conflict  982       Q → H in CAA29946 (PubMed:3338568).  Curated   1
Sequence conflict  1397      V → S in AAA37342 (PubMed:3755692).  Curated   1
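
Each row reads as "the displayed sequence has the first residue at this position, while the cited submission reports the second". A minimal sketch of applying one such conflict to a sequence string follows, with a guard that the displayed residue really is the one the table names; the `Conflict` tuple and `apply_conflict` function are hypothetical helpers written for this illustration.

```python
from typing import NamedTuple


class Conflict(NamedTuple):
    position: int      # 1-based position in the displayed sequence
    reference: str     # residue in the displayed sequence
    alternative: str   # residue reported by the other submission
    source: str        # accession of the conflicting submission


def apply_conflict(sequence, conflict):
    """Return the sequence with one conflict applied.

    Raises ValueError if the residue at the given position is not the
    reference residue named in the table.
    """
    index = conflict.position - 1
    if sequence[index] != conflict.reference:
        raise ValueError(
            f"position {conflict.position} is {sequence[index]}, "
            f"expected {conflict.reference}")
    return sequence[:index] + conflict.alternative + sequence[index + 1:]


if __name__ == "__main__":
    # Residues 1-30 as displayed in the Sequence section of this entry.
    n_terminus = "MGPRLSVWLLLLFAALLLHEERSRAAAKGD"
    print(apply_conflict(n_terminus, Conflict(26, "A", "P", "CAA29946")))
```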

Sequence databases

EMBL / GenBank / DDBJ
J04694 mRNA. Translation: AAA50292.1.
AK142097 mRNA. Translation: BAE24936.1.
AK146487 mRNA. Translation: BAE27208.1.
AK147284 mRNA. Translation: BAE27820.1.
AK147355 mRNA. Translation: BAE27863.1.
AK147661 mRNA. Translation: BAE28055.1.
BC002269 mRNA. Translation: AAH02269.1.
BC056620 mRNA. Translation: AAH56620.1.
BC072650 mRNA. Translation: AAH72650.1. Sequence problems.
X06777 mRNA. Translation: CAA29946.1.
J03758 mRNA. Translation: AAA37439.1.
J03944 Genomic DNA. Translation: AAA37442.1.
J04448 Genomic DNA. Translation: AAA37437.1.
M23333 Genomic DNA. Translation: AAA51625.1.
M12879 Genomic DNA. Translation: AAA37343.1.
M13024 Genomic DNA. No translation available.
M13025 Genomic DNA. No translation available.
M13026 Genomic DNA. Translation: AAA37344.1.
M13027 Genomic DNA. Translation: AAA37345.1.
M13043 Genomic DNA. Translation: AAA37346.1.
M14042 mRNA. Translation: AAA37342.1.
X02201 mRNA. Translation: CAA26132.1.
M15832 mRNA. Translation: AAA37340.1.
CCDS: CCDS40219.1.
PIR: A33525. CGMS4B.
RefSeq: NP_034061.2. NM_009931.2.
UniGene: Mm.738.

Genome annotation databases

Ensembl: ENSMUST00000033898; ENSMUSP00000033898; ENSMUSG00000031502.
GeneID: 12826.
KEGG: mmu:12826.
UCSC: uc009kvb.2. mouse.

Cross-references

Organism-specific databases

CTD: 1282.
MGI: MGI:88454. Col4a1.

Miscellaneous databases

ChiTaRS: Col4a1. mouse.
PRO: P02463.

Entry information

Entry name: CO4A1_MOUSE
Accession: Primary (citable) accession number: P02463
Secondary accession number(s): Q3UHJ4, Q3UJE7, Q3UQV2, Q53X35, Q6GQS7, Q6PHB5, Q99LQ8
Entry history
Integrated into UniProtKB/Swiss-Prot: July 21, 1986
Last sequence update: February 20, 2007
Last modified: November 30, 2016
This is version 162 of the entry and version 4 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.