Overview

Status: Reference proteome
Proteins: 55,315. Number of protein entries associated with this proteome: UniProtKB entries for regular proteomes or UniParc entries for redundant proteomes.
Gene count: -. This is the total number of unique genes found in the proteome set, algorithmically computed. For each gene, a single representative protein sequence is chosen from the proteome. Where possible, reviewed (Swiss-Prot) protein sequences are chosen as the representatives.
Proteome ID: UP000000589. The proteome identifier (UPID) is the unique identifier assigned to the set of proteins that constitute the proteome. It consists of the characters 'UP' followed by 9 digits, is stable across releases and can therefore be used to cite a UniProt proteome.
Taxonomy: 10090 - Mus musculus
Strain: C57BL/6J
Genome assembly and annotation: GCA_000001635.8 from Ensembl full. This is the identifier for the genome assembly.
BUSCO: C:99.7%[S:50.8%,D:48.9%],F:0.1%,M:0.3%,n:13798 glires_odb10. The Benchmarking Universal Single-Copy Ortholog (BUSCO) assessment tool is used, for eukaryotic and bacterial proteomes, to provide quantitative measures of UniProt proteome data completeness in terms of expected gene content. BUSCO scores include percentages of complete (C) single-copy (S) genes, complete (C) duplicated (D) genes, fragmented (F) and missing (M) genes, as well as the total number of orthologous clusters (n) used in the BUSCO assessment, and the name of the taxonomic lineage dataset used.
Completeness: Close to standard (high value). Complete Proteome Detector (CPD) is an algorithm that statistically evaluates the completeness and quality of proteomes in UniProt by looking at the sizes of taxonomically close proteomes. Possible values are 'Standard', 'Close to Standard' and 'Outlier'.
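
The identifier format and the BUSCO summary described above are both regular enough to check or parse mechanically. The following is a minimal Python sketch, assuming the BUSCO string always has the shape shown on this page. The function names `is_valid_upid` and `parse_busco` are hypothetical helpers introduced for illustration and are not part of any UniProt tool or API.

```python
import re

# 'UP' followed by exactly 9 digits, as described for the proteome identifier.
UPID_PATTERN = re.compile(r"UP\d{9}")

# Assumed layout of the BUSCO summary, based only on the string shown on this page.
BUSCO_PATTERN = re.compile(
    r"C:(?P<complete>[\d.]+)%"
    r"\[S:(?P<single>[\d.]+)%,D:(?P<duplicated>[\d.]+)%\],"
    r"F:(?P<fragmented>[\d.]+)%,"
    r"M:(?P<missing>[\d.]+)%,"
    r"n:(?P<n>\d+)"
)


def is_valid_upid(identifier: str) -> bool:
    """Return True if the whole string matches the UPID format."""
    return UPID_PATTERN.fullmatch(identifier) is not None


def parse_busco(summary: str) -> dict:
    """Extract the percentage fields and cluster count from a BUSCO summary."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError(f"unrecognised BUSCO summary: {summary!r}")
    return {
        "complete": float(match.group("complete")),
        "single": float(match.group("single")),
        "duplicated": float(match.group("duplicated")),
        "fragmented": float(match.group("fragmented")),
        "missing": float(match.group("missing")),
        "n": int(match.group("n")),
    }


scores = parse_busco("C:99.7%[S:50.8%,D:48.9%],F:0.1%,M:0.3%,n:13798 glires_odb10")
print(is_valid_upid("UP000000589"), scores)
```

In the parsed result, the single-copy and duplicated percentages are the two parts of the complete percentage, which matches the bracketed notation in the summary.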

The house mouse (Mus musculus) is a common rodent distributed throughout the world. Its small size, short life cycle and rapid breeding have made it a frequently used model for understanding human disease and development.

The mouse was the second mammal to have its genome sequenced. The strain used for sequencing was C57BL/6J, the most widely used inbred strain. This strain is a permissive background for the expression of most mutations, but is resistant to many tumors.

C57BL/6J mice are susceptible to atherosclerosis, type 2 diabetes and diet-induced obesity. They also develop age-related hearing loss. Of the mouse genes, 75% are in 1:1 orthologous relationships with human genes and have most likely maintained their ancestral function in both species.

The reference proteome of Mus musculus is derived from the genome sequence of strain C57BL/6J, published in 2009. The genome is 2.6 Gb in size.

Components

Genomic components encoding the proteome.

Component name    Proteins
Chromosome 1      3182
Chromosome 2      4045
Chromosome 3      2522
Chromosome 4      2949
Chromosome 5      3479
Chromosome 6      3016
Chromosome 7      5020
Chromosome 8      2476
Chromosome 9      3240
Chromosome 10     2460
Chromosome 11     3540
Chromosome 12     1860
Chromosome 13     1993
Chromosome 14     2189
Chromosome 15     2013
Chromosome 16     1718
Chromosome 17     2835
Chromosome 18     1365
Chromosome 19     1816
Chromosome Y      130
Chromosome X      1756
Unplaced          1748
Mitochondrion     13
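
The per-component protein counts listed above can be tallied and ranked with a few lines of Python. This is only a sketch: the numbers are transcribed from this page, the variable names are illustrative, and the code does not assume that the per-component sum equals the protein total shown in the overview.

```python
# Protein counts per genomic component, transcribed from the table on this page.
component_proteins = {
    "Chromosome 1": 3182, "Chromosome 2": 4045, "Chromosome 3": 2522,
    "Chromosome 4": 2949, "Chromosome 5": 3479, "Chromosome 6": 3016,
    "Chromosome 7": 5020, "Chromosome 8": 2476, "Chromosome 9": 3240,
    "Chromosome 10": 2460, "Chromosome 11": 3540, "Chromosome 12": 1860,
    "Chromosome 13": 1993, "Chromosome 14": 2189, "Chromosome 15": 2013,
    "Chromosome 16": 1718, "Chromosome 17": 2835, "Chromosome 18": 1365,
    "Chromosome 19": 1816, "Chromosome Y": 130, "Chromosome X": 1756,
    "Unplaced": 1748, "Mitochondrion": 13,
}

# Sum over all components and find the component with the most proteins.
component_total = sum(component_proteins.values())
largest_component = max(component_proteins, key=component_proteins.get)

# Fraction of the tallied proteins contributed by each component.
shares = {name: count / component_total for name, count in component_proteins.items()}

print(f"components: {len(component_proteins)}, tallied proteins: {component_total}")
print(f"largest component: {largest_component} ({shares[largest_component]:.1%})")
```

Comparing `component_total` with the overview figure is a quick consistency check; this page does not explain any difference between the two.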
UniProt is an ELIXIR core data resource
Main funding by: National Institutes of Health
