UniProt release 15.8

Published September 22, 2009


300'000 HAMAP cross-references in UniProtKB/Swiss-Prot

Bacteria and archaea live in almost every environmental niche we know of. From the ocean floor to arctic ice, from wastewater treatment sludge to animal- and plant-associated environments, they are everywhere. To explore this diversity, the number of bacterial (and to a lesser extent archaeal) genomes being sequenced is rising almost exponentially, giving rise to huge numbers of protein sequences whose annotation varies widely in quality. Using this data appropriately, however, requires high-quality annotation. To supply it, we started the HAMAP project (High-quality Automatic and Manual Annotation of microbial Proteins) in 2000. In this project, proteins from complete bacterial and archaeal proteomes, together with related plastid proteins, are automatically annotated using manually created annotation templates, which provide complete protein annotation including template-based feature propagation. The annotation templates and much more are available on the HAMAP website. Since January 2008, the sequences annotated by the HAMAP pipeline that fulfill all of its stringent criteria have entered the Swiss-Prot section of UniProtKB automatically (see release 54.7 news).

There are now 304'013 UniProtKB/Swiss-Prot entries with a HAMAP cross-reference line to at least one of the 1'595 HAMAP families; 278'635 are bacterial, 14'601 are archaeal and 10'777 are encoded in plastids. Note that some of these entries are the templates for their families (see for example P31120); they include extra information that is not propagated to all members (for example biophysicochemical characterization, mutagenesis experiments, 3D structures and induction), which has allowed their use as models to annotate further entries (compare entry B6I1P9, which contains annotation propagated from the family rule MF_01554, with model entry P31120).

This large number of semi-automatically annotated entries means that nearly 60% of UniProtKB/Swiss-Prot consists of HAMAP entries; add to this the approximately 30'000 other bacterial and archaeal entries that are not members of a HAMAP family and you find that the total number of bacterial and archaeal entries in UniProtKB/Swiss-Prot begins to reflect their preponderance in nature...
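The counts quoted in the two paragraphs above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the variable names are hypothetical, the input numbers are copied from the text, and the figure of 30'000 non-HAMAP entries is the approximate value given above, so the last result is an estimate rather than an exact count.

```python
# Counts quoted in the release notes (the apostrophe is a thousands separator).
hamap_by_origin = {
    "bacterial": 278_635,
    "archaeal": 14_601,
    "plastid-encoded": 10_777,
}
reported_total = 304_013

# The three origin classes should add up to the reported total.
computed_total = sum(hamap_by_origin.values())

# Share of each origin class among HAMAP entries, in percent.
shares = {
    origin: round(100 * count / reported_total, 1)
    for origin, count in hamap_by_origin.items()
}

# Approximate figure from the text; not an exact count.
approx_other_prokaryotic = 30_000
approx_bacterial_archaeal = (
    hamap_by_origin["bacterial"]
    + hamap_by_origin["archaeal"]
    + approx_other_prokaryotic
)

print("sum of origin classes:", computed_total)
print("matches reported total:", computed_total == reported_total)
for origin, pct in shares.items():
    print(f"{origin}: {pct}% of HAMAP entries")
print("approx. bacterial + archaeal entries:", approx_bacterial_archaeal)
```

The three origin classes sum exactly to the reported 304'013, and bacterial entries account for roughly nine tenths of all entries with a HAMAP cross-reference.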

UniProtKB News

Changes in subcellular location controlled vocabulary

New subcellular locations:

  • Acrosome outer membrane
  • Spindle pole

Changes concerning the controlled vocabulary for PTMs

New terms for the feature key 'Cross-link' ('CROSSLNK' in the flat file):

  • 3-(O4'-tyrosyl)-valine (Val-Tyr)
  • 5-(methoxymethyl)thiazole-4-carboxylic acid (Val-Cys)
  • 5-methyloxazole-4-carboxylic acid (Ser-Thr)
  • Glutamyl lysine isopeptide (Glu-Lys) (interchain with K-...)
  • Glutamyl lysine isopeptide (Lys-Glu) (interchain with E-...)
  • Oxazole-4-carboxylic acid (Cys-Ser)
  • Oxazole-4-carboxylic acid (Gly-Ser)
  • Oxazoline-4-carboxylic acid (Cys-Ser)
  • Thiazole-4-carboxylic acid (Gly-Cys)

New terms for the feature key 'Modified residue' ('MOD_RES' in the flat file):

  • (3R)-3-hydroxyasparagine
  • (3R)-3-hydroxyaspartate
  • (3R,4R)-3,4-dihydroxyproline
  • (3S)-3-hydroxyasparagine
  • 1-amino-2-propanone
  • 4,5,5'-trihydroxyleucine
  • 5-glutamyl polyglutamate
  • 5-glutamyl polyglycine
  • 5-hydroxy-3-methylproline
  • Cyclo[(prolylserin)-O-yl] cysteinate
  • Glutamyl 5-glycerylphosphorylethanolamine
  • N-carbamoylalanine
  • N6-(pyridoxal phosphate)lysine
  • N6-(retinylidene)lysine
  • N6-biotinyllysine
  • N6-lipoyllysine
  • O-(2-aminoethylphosphoryl)serine
  • O-(2-cholinephosphoryl)serine
  • O-(pantetheine 4'-phosphoryl)serine
  • O-(phosphoribosyl dephospho-coenzyme A)serine
  • O-AMP-threonine

Deleted terms:

  • 2-oxazoline-4-carboxylic acid (Cys-Ser)
  • 3-hydroxy-5-methylproline
  • 5-methyloxazole (Ser-Thr)
  • Oxazole (Cys-Ser)
  • Oxazole (Gly-Ser)
  • Thiazole (Gly-Cys)
  • Thiazoline-4-carboxylic acid (Thr-Cys)
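Anyone who keeps local copies of feature descriptions may want to check them against the deleted terms listed above. The following is a minimal sketch under stated assumptions: the function name `find_deleted_terms` and the sample descriptions are hypothetical, the input is assumed to be a plain list of description strings that has already been extracted from the entries, and a trailing period is assumed to be the only decoration that needs stripping. The release notes do not state which new term, if any, replaces each deleted one, so the sketch only flags matches and does not rename anything.

```python
# Terms listed under "Deleted terms" in this release.
DELETED_TERMS = frozenset({
    "2-oxazoline-4-carboxylic acid (Cys-Ser)",
    "3-hydroxy-5-methylproline",
    "5-methyloxazole (Ser-Thr)",
    "Oxazole (Cys-Ser)",
    "Oxazole (Gly-Ser)",
    "Thiazole (Gly-Cys)",
    "Thiazoline-4-carboxylic acid (Thr-Cys)",
})


def find_deleted_terms(descriptions):
    """Return (index, term) pairs for descriptions that match a deleted term.

    Hypothetical helper: comparison is exact after removing surrounding
    whitespace and any trailing period.
    """
    hits = []
    for index, description in enumerate(descriptions):
        term = description.strip().rstrip(".")
        if term in DELETED_TERMS:
            hits.append((index, term))
    return hits


# Hypothetical sample input, for illustration only.
sample = [
    "Thiazole (Gly-Cys).",
    "N6-biotinyllysine",
    "Oxazole (Cys-Ser)",
]

for index, term in find_deleted_terms(sample):
    print(f"description {index} uses deleted term: {term}")
```

Matching is deliberately exact, so a current term such as 'Oxazole-4-carboxylic acid (Cys-Ser)' is not flagged merely because it resembles a deleted one.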