Chordata protein annotation project

Statistics

UniProt release 2017_12 (20 December 2017) contains a total of 556,388 reviewed entries, including 85,252 entries from 3,171 species of Chordata.

Homo sapiens (Human) - 20,244 reviewed entries.

Number of canonical and isoform protein sequences: 47,177 (download data in FASTA format)
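
A sequence count like the one above can be checked against a downloaded FASTA file by counting header lines. The following is a minimal sketch, assuming an uncompressed plain-text FASTA file in which every record begins with a line starting with ">". The function name and the sample records are hypothetical and serve only as illustration.

```python
import io


def count_fasta_records(handle):
    """Count FASTA records by counting header lines (lines starting with '>')."""
    return sum(1 for line in handle if line.startswith(">"))


# Hypothetical two-record sample: one canonical sequence and one isoform.
sample = io.StringIO(
    ">EXAMPLE1 hypothetical canonical sequence\n"
    "MKTAYIAKQR\n"
    "QISFVKSHFS\n"
    ">EXAMPLE1-2 hypothetical isoform\n"
    "MKTAYIAKQR\n"
)
sample_count = count_fasta_records(sample)
print(sample_count)
```

For a real download, the same function can be applied to an open file handle, for example `with open(path) as fh: count_fasta_records(fh)`, where `path` is wherever the file was saved.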

Evidence for the existence of protein    Percentage of entries
at protein level                         75.8%
at transcript level                      17.3%
inferred from homology                   3.5%
predicted                                0.7%
uncertain                                2.8%

Annotation categories       Entries with  Number of annotations  Coverage
PubMed citations            20,244        72,642 unique          100%
Alternative products        10,633        32,975                 52.5%
General annotation          19,641        118,410                97%
  Function                  15,997        16,734                 79%
  Catalytic activity        3,399         3,953                  16.8%
  Subcellular location      16,349        N/A                    80.8%
Sequence annotation         19,753        536,238                97.6%
  Amino acid modifications  13,507        95,606                 66.7%
  Natural variant           12,865        78,969                 63.5%
Cross-references            20,244        1,623,461              100%
  EMBL                      20,229        151,458                99.9%
  InterPro                  19,310        76,675                 95.4%
  PDB                       6,226         45,570                 30.8%
  RefSeq                    18,938        55,435                 93.5%
  MIM                       14,902        20,673                 73.6%
  HGNC                      20,034        20,176                 99%
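
The Coverage column appears to be the number of entries carrying an annotation divided by the total number of reviewed entries for the species. This is an inference from the figures, not something the page states. A minimal sketch of that arithmetic, using human figures from the table above (the function and variable names are illustrative only):

```python
def coverage(entries_with, total_entries):
    """Percentage of entries that carry an annotation, rounded to one decimal place."""
    if total_entries <= 0:
        raise ValueError("total_entries must be positive")
    return round(100 * entries_with / total_entries, 1)


# Total reviewed human entries and a few "Entries with" values from the table.
HUMAN_TOTAL = 20_244
human_rows = {
    "Function": 15_997,
    "Catalytic activity": 3_399,
    "Subcellular location": 16_349,
    "PDB": 6_226,
}

human_coverage = {name: coverage(n, HUMAN_TOTAL) for name, n in human_rows.items()}
for name, pct in human_coverage.items():
    print(f"{name}: {pct}%")
```

The same calculation applies to the mouse and rat tables by substituting their totals (16,950 and 8,020 reviewed entries respectively).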

100% of reviewed human entries are annotated with at least one keyword.

93.4% of reviewed human entries are annotated with at least one GO (Gene Ontology) term.

Mus musculus (Mouse) - 16,950 reviewed entries.

Number of canonical and isoform protein sequences: 31,919 (download data in FASTA format)

Evidence for the existence of protein    Percentage of entries
at protein level                         73%
at transcript level                      24.3%
inferred from homology                   2.2%
predicted                                0.4%
uncertain                                0.1%

Annotation categories       Entries with  Number of annotations  Coverage
PubMed citations            16,950        28,574 unique          100%
Alternative products        4,885         13,255                 28.8%
General annotation          16,435        82,772                 97%
  Function                  14,052        14,410                 82.9%
  Catalytic activity        3,273         3,767                  19.3%
  Subcellular location      14,227        N/A                    83.9%
Sequence annotation         16,660        282,186                98.3%
  Amino acid modifications  12,108        83,924                 71.4%
  Natural variant           382           1,189                  2.3%
Cross-references            16,949        875,289                100%
  EMBL                      16,822        77,291                 99.2%
  InterPro                  16,658        66,959                 98.3%
  PDB                       1,675         5,785                  9.9%
  RefSeq                    15,942        31,781                 94.1%
  MGI                       16,778        16,818                 99%

100% of reviewed mouse entries are annotated with at least one keyword.

96.7% of reviewed mouse entries are annotated with at least one GO (Gene Ontology) term.

Rattus norvegicus (Rat) - 8,020 reviewed entries.

Number of canonical and isoform protein sequences: 12,812 (download data in FASTA format)

Evidence for the existence of protein    Percentage of entries
at protein level                         55.6%
at transcript level                      39.6%
inferred from homology                   4.7%
predicted                                0.1%
uncertain                                0%

Annotation categories       Entries with  Number of annotations  Coverage
PubMed citations            8,020         11,926 unique          100%
Alternative products        984           2,716                  12.3%
General annotation          7,827         35,119                 97.6%
  Function                  7,110         7,322                  88.7%
  Catalytic activity        1,901         2,225                  23.7%
  Subcellular location      7,056         N/A                    88%
Sequence annotation         7,859         124,997                98%
  Amino acid modifications  6,211         43,411                 77.4%
  Natural variant           112           277                    1.4%
Cross-references            8,014         329,071                100%
  EMBL                      7,913         15,830                 98.7%
  InterPro                  7,898         32,892                 98.5%
  PDB                       585           3,202                  7.3%
  RefSeq                    7,117         11,465                 88.7%
  RGD                       7,943         7,944                  99%

100% of reviewed rat entries are annotated with at least one keyword.

98.2% of reviewed rat entries are annotated with at least one GO (Gene Ontology) term.