
Introduction

Number of entries
New entries 5,117,258
Updated entries 11,294,535
Unchanged entries 103,832,056
Total 120,243,849
Entries with updated sequences 971
With a fragmented AA sequence 11,171,747
With known alternative products 0
Protein Existence (PE) Number of entries
1 Evidence at protein level 144,063
2 Evidence at transcript level 1,158,492
3 Inferred from homology 29,451,155
4 Predicted 89,490,139
5 Uncertain 0
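
The two tables above can be cross-checked against each other. The following is a minimal sketch, with the counts copied by hand from this page and all variable names chosen for illustration only. It confirms that both the status breakdown and the protein existence breakdown add up to the reported total, and it derives the share of entries at each PE level.

```python
# Counts copied from the "Number of entries" and "Protein Existence (PE)" tables.
# The dictionary and variable names are illustrative, not part of any UniProt API.
status_counts = {
    "New entries": 5_117_258,
    "Updated entries": 11_294_535,
    "Unchanged entries": 103_832_056,
}
pe_counts = {
    "1 Evidence at protein level": 144_063,
    "2 Evidence at transcript level": 1_158_492,
    "3 Inferred from homology": 29_451_155,
    "4 Predicted": 89_490_139,
    "5 Uncertain": 0,
}
reported_total = 120_243_849

# Both breakdowns partition the same set of entries, so each should sum to the total.
status_total = sum(status_counts.values())
pe_total = sum(pe_counts.values())

# Percentage of all entries at each protein existence level.
pe_share = {level: 100 * n / reported_total for level, n in pe_counts.items()}

print("status breakdown matches total:", status_total == reported_total)
print("PE breakdown matches total:", pe_total == reported_total)
for level, share in pe_share.items():
    print(f"{level}: {share:.2f}%")
```

With these figures, roughly three quarters of all entries fall under "Predicted", and well under one percent have evidence at protein level.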

Taxonomic Origin


Statistics on the number of species

Number of species in
New entries 29,845
Updated entries 107,181
Unchanged entries 609,948
Total 678,574

Sequence data

The shortest sequence is C4PYW0 at 2 AA, while the longest sequence is A0A1V4K6M4 at 36,991 AA.

Some annotation statistics

General Annotation (comments)

Comment type Annotations Entries
Allergenic properties 0 0
Alternative products 0 0
Biophysicochemical properties 0 0
Biotechnological use 0 0
Catalytic activity 13,785,224 12,536,308
Caution 66,979,763 65,495,692
Cofactor 9,864,568 0
Developmental stage 0 0
Involvement in disease 0 0
Disruption phenotype 0 0
Domain 918,156 869,065
Enzyme regulation 330,204 330,202
Function 15,921,833 15,159,638
Induction 57,847 57,847
Mass spectrometry 0 0
Miscellaneous 546,370 537,982
Pathway 7,011,048 6,313,502
Pharmaceutical use 0 0
Polymorphism 0 0
Post-translational modification 903,234 754,535
RNA Editing 0 0
Sequence caution 0 0
Sequence similarities 29,515,102 29,113,756
Subcellular Location 0 0
Subunit structure 8,328,250 8,237,115
Tissue specificity 0 0
Toxic dose 0 0

Sequence Annotation (features)

Feature type Annotations Entries
Molecule processing 17,370,423 8,707,010
  Chain 8,669,599 8,657,761
  Initiator methionine 37,770 37,770
  Peptide 682 417
  Propeptide 17,089 17,089
  Signal peptide 8,645,150 8,645,140
  Transit peptide 133 133
Regions 229,542,602 78,875,555
  Calcium binding 260,654 128,858
  Coiled-coil 17,624,213 11,719,738
  Compositional bias 4,470 4,470
  DNA binding 3,001,289 2,659,607
  Domain 84,311,796 60,813,376
  Motif 1,383,766 965,010
  Nucleotide binding 6,758,001 4,267,490
  Repeat 4,684,076 1,119,371
  Region 5,025,770 2,675,109
  Topological domain 257,614 99,875
  Transmembrane 105,810,969 23,302,915
  Zinc finger 418,852 330,372
Sites 37,399,436 8,103,967
  Active site 7,147,764 4,346,467
  Metal binding 12,624,102 3,358,137
  Binding site 15,675,913 4,026,691
  Other 1,951,657 1,156,875
Amino acid modifications 4,504,281 2,527,420
  Cross-link 28,217 26,267
  Disulfide bond 1,892,583 506,517
  Glycosylation 21,296 20,153
  Lipidation 293,700 154,070
  Modified residue 2,263,476 2,033,902
  Non-standard residue 5,009 4,816
Experimental info 16,967,367 11,250,450
  Mutagenesis 0 0
  Non-adjacent residues 0 0
  Non-terminal residue 16,880,148 11,222,669
  Sequence conflict 0 0
  Sequence uncertainty 87,219 72,673
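
The two numeric columns of the feature table can be combined into a density figure: annotations divided by entries gives the mean number of features of a given type per entry that carries at least one. The sketch below uses a handful of rows copied from the table; the helper name `mean_per_annotated_entry` is hypothetical. It also checks that the "Sites" group total equals the sum of its four feature types, as the figures on this page suggest.

```python
# (annotations, entries) pairs copied from the sequence annotation table.
# Names below are illustrative only.
features = {
    "Transmembrane": (105_810_969, 23_302_915),
    "Domain": (84_311_796, 60_813_376),
    "Signal peptide": (8_645_150, 8_645_140),
    "Disulfide bond": (1_892_583, 506_517),
}

def mean_per_annotated_entry(annotations: int, entries: int) -> float:
    """Mean number of annotations per entry that has at least one."""
    if entries == 0:
        return 0.0  # rows such as "Mutagenesis 0 0" have nothing to average
    return annotations / entries

density = {name: mean_per_annotated_entry(a, e) for name, (a, e) in features.items()}

# Annotation counts for the "Sites" group and its member feature types.
sites_total = 37_399_436
sites_members = {
    "Active site": 7_147_764,
    "Metal binding": 12_624_102,
    "Binding site": 15_675_913,
    "Other": 1_951_657,
}
sites_sum = sum(sites_members.values())

for name, value in density.items():
    print(f"{name}: {value:.2f} annotations per annotated entry")
print("Sites group total matches its members:", sites_sum == sites_total)
```

Only the annotation column is additive in this way. The entries column of a group is smaller than the sum of its members, presumably because a single entry can carry several feature types from the same group.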

Citation usage

Citation type Citations Entries
Submission 101,870,091 90,222,864
Journal article 39,055,349 36,888,555
Book 11,378 11,313
Thesis 14,951 14,892
Patent 1 1
Unpublished observations 0 0
Online journal article 0 0

Additional automatically mapped literature

Citation type Citations Entries
Journal articles 735,040 447,348

For information about which journals are used in citing or mapping to UniProtKB see the journals section.

Database Cross-Reference Statistics

Database Entities linked to Entries
Sequence databases
EMBL 130,896,949 116,382,363
PIR 162,693 130,450
RefSeq 44,180,454 43,100,764
UniGene 866,109 732,349
3D structure databases
DisProt 96 96
PDB 37,074 18,219
PDBsum 36,527 17,874
ProteinModelPortal 7,207,227 7,207,227
SMR 1,203,395 1,203,395
Protein-protein interaction databases
CORUM 114 114
ComplexPortal 157 121
DIP 3,218 3,217
ELM 107 107
IntAct 28,727 28,727
MINT 2,630 2,630
STRING 6,443,483 6,443,234
Chemistry
BindingDB 260 260
ChEMBL 965 965
DrugBank 742 449
GuidetoPHARMACOLOGY 4 4
SwissLipids 82 82
Protein family/group databases
Allergome 3,949 3,184
CAZy 129,092 120,803
ESTHER 74,552 74,254
MEROPS 243,563 243,562
MoonDB 1 1
MoonProt 64 64
PeroxiBase 2,475 2,467
REBASE 31,478 31,467
TCDB 8,168 8,157
UniLectin 158 158
mycoCLAP 447 447
PTM databases
CarbonylDB 265 265
GlyConnect 13 13
PhosphoSitePlus 2,242 2,242
SwissPalm 1,973 1,973
UniCarbKB 17 17
iPTMnet 5,143 5,143
Polymorphism and mutation databases
2D gel databases
COMPLUYEAST-2DPAGE 4 4
OGP 3 3
REPRODUCTION-2DPAGE 62 61
SWISS-2DPAGE 1 1
World-2DPAGE 316 311
Proteomic databases
EPD 14,152 14,152
MaxQB 42,428 42,428
PRIDE 341,047 341,047
PaxDb 328,130 328,130
PeptideAtlas 129,018 129,018
ProMEX 3,275 3,275
TopDownProteomics 280 280
Protocols and materials databases
DNASU 41,306 40,867
Genome annotation databases
Ensembl 1,908,039 1,864,114
EnsemblBacteria 39,180,104 36,966,740
EnsemblFungi 6,182,800 6,076,500
EnsemblMetazoa 1,178,955 1,126,765
EnsemblPlants 2,164,456 1,964,933
EnsemblProtists 1,872,783 1,760,841
GeneDB 114,676 112,896
GeneID 10,485,780 10,379,434
Gramene 2,164,456 1,964,933
KEGG 16,130,860 15,708,492
PATRIC 17,407,631 17,400,486
UCSC 93,162 92,957
VectorBase 578,280 559,614
WBParaSite 854,112 845,705
Organism-specific databases
ArachnoServer 200 200
Araport 15,238 15,171
CGD 20,805 20,739
CTD 1,137,852 1,135,895
ConoServer 159 159
EuPathDB 671,630 670,935
FlyBase 208,353 207,056
GeneCards 1,315 1,296
H-InvDB 587 440
HGNC 52,006 51,910
LegioList 2,496 2,483
Leproma 1,271 1,269
MGI 61,803 61,368
MIM 4 4
MalaCards 12 12
OpenTargets 49,941 49,892
PharmGKB 3,136 3,136
PomBase 2 2
PseudoCAP 4,449 4,445
RGD 21,625 20,732
SGD 7 7
TAIR 11,899 11,837
TubercuList 1,000 999
VGNC 78,529 78,529
WormBase 55,874 55,490
Xenbase 34,432 34,353
ZFIN 53,083 52,461
dictyBase 7,987 7,765
euHCVdb 75,267 75,264
Phylogenomic databases
GeneTree 1,831,584 1,831,506
HOGENOM 2,998,504 2,998,423
HOVERGEN 300,387 300,374
InParanoid 2,347,050 2,347,050
KO 7,077,362 7,048,433
OMA 6,855,057 6,854,980
OrthoDB 14,257,002 14,256,882
PhylomeDB 461,358 461,358
TreeFam 558,543 558,507
eggNOG 13,813,818 6,924,169
Enzyme and pathway databases
BRENDA 9,568 9,278
BioCyc 6,073,629 6,055,343
Reactome 276,229 100,066
SABIO-RK 633 633
SIGNOR 7 7
SignaLink 3,798 3,798
UniPathway 6,986,460 6,292,474
Other
ChiTaRS 131,488 131,487
EvolutionaryTrace 5,945 5,945
GenomeRNAi 30,028 30,028
PMAP-CutDB 130 130
PRO 2,260 2,260
Gene expression databases
Bgee 534,182 533,882
CollecTF 199 199
ExpressionAtlas 643,201 642,942
Genevisible 15,848 15,841
Ontologies
Family and domain databases
CDD 21,099,030 18,559,937
Gene3D 51,721,657 43,061,762
HAMAP 13,159,128 13,011,552
InterPro 302,541,744 92,727,589
PANTHER 24,588,970 23,734,061
PIRSF 10,366,411 10,279,751
PRINTS 15,677,307 14,138,235
PROSITE 59,821,215 39,872,218
Pfam 116,410,568 84,576,904
ProDom 1,696,565 1,624,401
SFLD 1,075,766 559,125
SMART 28,297,395 21,502,817
SUPFAM 77,448,491 61,319,406
TIGRFAMs 24,697,903 22,720,987

Web resource

0 UniProtKB/TrEMBL entries have at least one link to a webpage of general interest on the protein.

Amino acid distribution statistics

  • 9.1% Alanine
  • 5.7% Arginine
  • 3.8% Asparagine
  • 5.4% Aspartate
  • 1.2% Cysteine
  • 3.7% Glutamine
  • 6.1% Glutamate
  • 7.2% Glycine
  • 2.1% Histidine
  • 5.7% Isoleucine
  • 9.8% Leucine
  • 4.9% Lysine
  • 2.3% Methionine
  • 3.9% Phenylalanine
  • 4.8% Proline
  • 6.6% Serine
  • 5.5% Threonine
  • 1.2% Tryptophan
  • 2.9% Tyrosine
  • 6.8% Valine

Chart legend (residue classes): Aliphatic, Acidic, Small hydroxy, Basic, Amide, Aromatic, Sulfur
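
As a quick sanity check on the amino acid distribution, the twenty listed percentages can be summed. The sketch below copies the values from this page into a dictionary whose name is illustrative only. The listed values do not add up to exactly 100%; the shortfall is presumably due to rounding to one decimal place and to residue codes outside the twenty standard amino acids, though the page does not say so.

```python
# Percentages copied from the amino acid distribution list on this page.
aa_percent = {
    "Alanine": 9.1, "Arginine": 5.7, "Asparagine": 3.8, "Aspartate": 5.4,
    "Cysteine": 1.2, "Glutamine": 3.7, "Glutamate": 6.1, "Glycine": 7.2,
    "Histidine": 2.1, "Isoleucine": 5.7, "Leucine": 9.8, "Lysine": 4.9,
    "Methionine": 2.3, "Phenylalanine": 3.9, "Proline": 4.8, "Serine": 6.6,
    "Threonine": 5.5, "Tryptophan": 1.2, "Tyrosine": 2.9, "Valine": 6.8,
}

# Round to one decimal place to suppress floating-point noise in the sum.
total = round(sum(aa_percent.values()), 1)
unaccounted = round(100 - total, 1)

# Most frequent residue and the lowest listed frequency.
most_common = max(aa_percent, key=aa_percent.get)
lowest_value = min(aa_percent.values())
rarest = sorted(name for name, pct in aa_percent.items() if pct == lowest_value)

print(f"sum of listed percentages: {total}%")
print(f"not accounted for: {unaccounted}%")
print(f"most common residue: {most_common}")
print(f"rarest residues: {', '.join(rarest)}")
```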

Miscellaneous Statistics

1,863,750 entries are encoded on a mitochondrion, and 759,160 are encoded on a plasmid.

764,358 entries are encoded on a plastid, of which 785 are encoded on apicoplasts, 639,857 on chloroplasts, 1 on organellar chromatophores, 8 on cyanelles, 1,521 on non-photosynthetic plastids and 3,190 on unspecified types of plastid.

UniProt is an ELIXIR core data resource
Main funding by: National Institutes of Health