
Chordata protein annotation program

Statistics

UniProt release 2013_05 (May 1, 2013) contains a total of 540,052 reviewed entries, including 83,807 entries from 3,121 species of Chordata.

Homo sapiens (Human) - 20,255 reviewed entries.

Number of canonical and isoform protein sequences: 38,517 (download data in FASTA format)
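
A sequence count like the one above could be checked against a downloaded FASTA file by counting header lines, since each FASTA record begins with a line starting with `>`. The following is a minimal sketch under that assumption only; the function name `count_fasta_records` and the three example records are invented for illustration and are not real entries.

```python
import io


def count_fasta_records(handle) -> int:
    """Count FASTA records in a text stream.

    Each record starts with a header line beginning with '>', so the
    number of such lines equals the number of sequences.
    """
    return sum(1 for line in handle if line.startswith(">"))


# Invented example data: two canonical sequences and one isoform.
# Identifiers and sequences are hypothetical placeholders.
EXAMPLE_FASTA = """\
>sp|X00001|EXAMPLE1_HUMAN Hypothetical protein 1
MKTAYIAKQR
QISFVKSHFS
>sp|X00001-2|EXAMPLE1_HUMAN Isoform 2 of hypothetical protein 1
MKTAYIAKQR
>sp|X00002|EXAMPLE2_HUMAN Hypothetical protein 2
MSDNGELEDK
"""

if __name__ == "__main__":
    print(count_fasta_records(io.StringIO(EXAMPLE_FASTA)))
```

For a file on disk, the same function could be called as `count_fasta_records(open(path))`, where `path` is wherever the FASTA download was saved.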

Evidence for the existence of protein  Percentage of entries
at protein level                       66.9%
at transcript level                    28.4%
inferred from homology                 1%
predicted                              0.5%
uncertain                              3.1%

Annotation categories     Entries with annotation  Number of annotations  Coverage
PubMed citations          20,171                   60,592 unique          99.6%
Alternative products      9,043                    27,538                 44.6%
General annotation        19,887                   153,781                98.2%
Function                  14,810                   15,155                 73.1%
Catalytic activity        3,059                    3,488                  15.1%
Subcellular location      15,481                   31,908                 76.4%
Sequence annotation       19,634                   438,861                96.9%
Amino acid modifications  11,349                   65,421                 56%
Natural Variant           12,489                   68,418                 61.7%
Cross-references          20,255                   1,186,934              100%
EMBL                      20,139                   144,614                99.4%
InterPro                  18,332                   62,194                 90.5%
PDB                       4,809                    25,847                 23.7%
RefSeq                    18,777                   32,729                 92.7%
MIM                       13,687                   18,052                 67.6%
HGNC                      19,684                   19,852                 97.2%
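
The page does not state how the coverage column is derived. The figures are consistent with dividing the number of entries carrying an annotation by the total number of reviewed entries for the species. The sketch below assumes that definition and reproduces a few of the human values; the function and variable names are illustrative only.

```python
# Total reviewed human entries, as stated in this section.
TOTAL_HUMAN_ENTRIES = 20_255

# Entries with at least one annotation, copied from the human table.
ENTRIES_WITH = {
    "PubMed citations": 20_171,
    "Alternative products": 9_043,
    "Function": 14_810,
    "PDB": 4_809,
}


def coverage(entries_with: int, total: int) -> float:
    """Return coverage as a percentage rounded to one decimal place.

    Assumed definition: entries with the annotation / total entries.
    """
    return round(100 * entries_with / total, 1)


def coverage_table(entries_with: dict, total: int) -> dict:
    """Map each annotation category to its assumed coverage percentage."""
    return {name: coverage(n, total) for name, n in entries_with.items()}


if __name__ == "__main__":
    for name, pct in coverage_table(ENTRIES_WITH, TOTAL_HUMAN_ENTRIES).items():
        print(f"{name}: {pct}%")
```

Under this assumption, PubMed citations come out at 99.6% and PDB at 23.7%, matching the table. The same calculation with the mouse and rat totals (16,612 and 7,853) matches their tables as well.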

100% of reviewed human entries are annotated with at least one keyword.

90.4% of reviewed human entries are annotated with at least one GO (Gene Ontology) term.

Mus musculus (Mouse) - 16,612 reviewed entries.

Number of canonical and isoform protein sequences: 24,497 (download data in FASTA format)

Evidence for the existence of protein  Percentage of entries
at protein level                       47.6%
at transcript level                    51.1%
inferred from homology                 0.8%
predicted                              0.4%
uncertain                              0.1%

Annotation categories     Entries with annotation  Number of annotations  Coverage
PubMed citations          16,536                   23,117 unique          99.5%
Alternative products      4,736                    12,779                 28.5%
General annotation        16,393                   115,616                98.7%
Function                  12,877                   13,077                 77.5%
Catalytic activity        2,896                    3,273                  17.4%
Subcellular location      13,315                   27,417                 80.2%
Sequence annotation       16,204                   232,668                97.5%
Amino acid modifications  10,094                   56,548                 60.8%
Natural Variant           367                      1,160                  2.2%
Cross-references          16,612                   676,074                100%
EMBL                      16,481                   74,965                 99.2%
InterPro                  15,607                   53,200                 94%
PDB                       1,307                    3,927                  7.9%
RefSeq                    15,384                   19,720                 92.6%
MGI                       16,443                   16,489                 99%

100% of reviewed mouse entries are annotated with at least one keyword.

93.7% of reviewed mouse entries are annotated with at least one GO (Gene Ontology) term.

Rattus norvegicus (Rat) - 7,853 reviewed entries.

Number of canonical and isoform protein sequences: 9,386 (download data in FASTA format)

Evidence for the existence of protein  Percentage of entries
at protein level                       40.2%
at transcript level                    55.5%
inferred from homology                 4%
predicted                              0.3%
uncertain                              0%

Annotation categories     Entries with annotation  Number of annotations  Coverage
PubMed citations          7,405                    10,692 unique          94.3%
Alternative products      945                      2,589                  12%
General annotation        7,773                    53,071                 99%
Function                  6,659                    6,793                  84.8%
Catalytic activity        1,758                    2,019                  22.4%
Subcellular location      6,703                    14,715                 85.4%
Sequence annotation       7,580                    102,252                96.5%
Amino acid modifications  5,220                    28,973                 66.5%
Natural Variant           114                      273                    1.5%
Cross-references          7,853                    262,065                100%
EMBL                      7,745                    14,976                 98.6%
InterPro                  7,474                    26,316                 95.2%
PDB                       502                      2,173                  6.4%
RefSeq                    6,940                    7,697                  88.4%
RGD                       7,763                    7,767                  98.9%

100% of reviewed rat entries are annotated with at least one keyword.

96.6% of reviewed rat entries are annotated with at least one GO (Gene Ontology) term.