Chordata protein annotation project

Statistics

UniProt release 2017_11 (Nov-22, 2017) contains a total of 556,196 reviewed entries, including 85,225 entries from 3,170 species of Chordata.

Homo sapiens (Human) - 20,243 reviewed entries.

Number of canonical and isoform protein sequences: 47,221 (download data in FASTA format)

Evidence for the existence of protein   Percentage of entries
at protein level                        75.7%
at transcript level                     17.3%
inferred from homology                  3.5%
predicted                               0.7%
uncertain                               2.8%

Annotation categories     Entries with  Number of annotations                  Coverage
PubMed citations          20,243        72,440 unique                          100%
Alternative products      10,632        32,982                                 52.5%
General annotation        19,638        118,250                                97%
Function                  15,977        16,706                                 78.9%
Catalytic activity        3,402         3,950                                  16.8%
Subcellular location      16,338        uniprot:(organism:9606 reviewed:yes)   N/A
Sequence annotation       19,751        535,009                                97.6%
Amino acid modifications  13,506        95,598                                 66.7%
Natural variant           12,856        78,570                                 63.5%
Cross-references          20,243        1,620,953                              100%
EMBL                      20,228        151,433                                99.9%
InterPro                  19,295        76,394                                 95.3%
PDB                       6,197         45,251                                 30.6%
RefSeq                    18,936        55,429                                 93.5%
MIM                       14,857        20,616                                 73.4%
HGNC                      20,032        20,174                                 99%

100% of reviewed human entries are annotated with at least one keyword.

93.4% of reviewed human entries are annotated with at least one GO (Gene Ontology) term.
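
The page does not define the Coverage column, but the figures are consistent with it being the number of entries carrying an annotation divided by the total number of reviewed entries for the species. The following is a minimal sketch under that assumption, using three rows from the human table; the function name `coverage_percent` and the variable names are illustrative only and do not come from UniProt.

```python
# Assumption: coverage = entries with the annotation / total reviewed entries.
# The counts below are copied from the human (Homo sapiens) table.
TOTAL_HUMAN_REVIEWED = 20_243

entries_with = {
    "Alternative products": 10_632,
    "Catalytic activity": 3_402,
    "PDB": 6_197,
}


def coverage_percent(entries: int, total: int) -> float:
    """Return entries/total as a percentage rounded to one decimal place."""
    return round(100 * entries / total, 1)


for category, count in entries_with.items():
    print(f"{category}: {coverage_percent(count, TOTAL_HUMAN_REVIEWED)}%")
```

Under this assumption the three rows reproduce the published values of 52.5%, 16.8% and 30.6%. The same calculation can be applied to the mouse and rat tables by substituting their totals of 16,944 and 8,017 reviewed entries.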

Mus musculus (Mouse) - 16,944 reviewed entries.

Number of canonical and isoform protein sequences: 31,977 (download data in FASTA format)

Evidence for the existence of protein   Percentage of entries
at protein level                        73%
at transcript level                     24.3%
inferred from homology                  2.2%
predicted                               0.4%
uncertain                               0.1%

Annotation categories     Entries with  Number of annotations                  Coverage
PubMed citations          16,944        28,485 unique                          100%
Alternative products      4,885         13,264                                 28.8%
General annotation        16,428        82,647                                 97%
Function                  14,030        14,383                                 82.8%
Catalytic activity        3,275         3,764                                  19.3%
Subcellular location      14,211        uniprot:(organism:10090 reviewed:yes)  83.9%
Sequence annotation       16,653        281,808                                98.3%
Amino acid modifications  12,103        83,903                                 71.4%
Natural variant           381           1,188                                  2.2%
Cross-references          16,943        873,556                                100%
EMBL                      16,816        77,251                                 99.2%
InterPro                  16,642        66,724                                 98.2%
PDB                       1,671         5,745                                  9.9%
RefSeq                    15,933        31,739                                 94%
MGI                       16,772        16,812                                 99%

100% of reviewed mouse entries are annotated with at least one keyword.

96.6% of reviewed mouse entries are annotated with at least one GO (Gene Ontology) term.

Rattus norvegicus (Rat) - 8,017 reviewed entries.

Number of canonical and isoform protein sequences: 12,907 (download data in FASTA format)

Evidence for the existence of protein   Percentage of entries
at protein level                        55.5%
at transcript level                     39.6%
inferred from homology                  4.7%
predicted                               0.1%
uncertain                               0%

Annotation categories     Entries with  Number of annotations                  Coverage
PubMed citations          8,017         11,916 unique                          100%
Alternative products      984           2,723                                  12.3%
General annotation        7,824         35,073                                 97.6%
Function                  7,104         7,314                                  88.6%
Catalytic activity        1,900         2,220                                  23.7%
Subcellular location      7,050         uniprot:(organism:10116 reviewed:yes)  87.9%
Sequence annotation       7,856         124,885                                98%
Amino acid modifications  6,209         43,386                                 77.4%
Natural variant           112           277                                    1.4%
Cross-references          8,011         328,733                                100%
EMBL                      7,910         15,819                                 98.7%
InterPro                  7,892         32,805                                 98.4%
PDB                       584           3,192                                  7.3%
RefSeq                    7,114         11,459                                 88.7%
RGD                       7,940         7,941                                  99%

100% of reviewed rat entries are annotated with at least one keyword.

98.2% of reviewed rat entries are annotated with at least one GO (Gene Ontology) term.