Skip Header

You are using a version of Internet Explorer that may not display all features of this website. Please upgrade to a modern browser.

UniProt release 4.1

Published February 15, 2005


Massive number of changes to entry names

We recently allowed entry names to consist of up to 11 characters instead of 10. An entry name consists of two parts, a prefix which is a mnemonic code representing the protein name and a suffix which is a mnemonic species identification code. Example: RECA_BACSU is the entry name for the recA protein of Bacillus subtilis. The increase from 10 to 11 characters allows the protein name mnemonic to increase from 4 to 5 characters.

Thanks to this change, we are now able to assign more meaningful entry names to a significant number of entries. In the past month we have gone through almost all of the Swiss-Prot entries and have checked to see if they could benefit from an entry name update. As a consequence of this process we updated more than 35'000 entry names (about 20% of all the entries in Swiss-Prot). In about 33'000 cases we created ID prefixes consisting of 5 characters and in the rest of the cases we changed existing prefixes of 3 to 4 characters to more meaningful and consistent prefixes of the same length.

Due to this massive changes in entry names it is probable that the names of some protein entries that you are using have changed. It is therefore useful to remind our users that we provide a tool, the IDtracker which allows to trace the identifiers (ID) of protein entries. You can use this tool to enquire on the whereabouts of one or more entry names and to obtain the newly assigned names and primary accession numbers.

It can seem paradoxical that we insist in warning users that they should always use accession numbers when citing an entry, yet we strive to provide meaningful entry names. The reason for this dichotomy is simple: if you want to refer to specific entries in a publication or in any document , you need to ensure that your reference is stable, unique and unambiguous. Such a mechanism is provided by the accession numbers. Accession numbers are stable identifiers and you can be sure that you will be always able to track down a specific entry if you use an accession number. But when you want to access an entry from a web server or a sequence analysis program, then it is much easier to remember an entry name than an accession number. The human mind is structured in such a way that is generally easier for most of us to remember something like "APOA5_HUMAN" rather than something like "Q6Q788".

UniProtKB News

Cross-references to GeneDB_Spombe

We changed the Data bank identifier for the Schizosaccharomyces pombe GeneDB Prototype from GeneDB_SPombe to GeneDB_Spombe.

Cross-references to LegioList

Cross-references have been added to LegioList, a database which provides a complete dataset of DNA and protein sequences derived from L. pneumophila strain Paris and strain Lens, linked to the relevant annotations and functional assignments. LegioList is available at

The format of the explicit links in the flat file is:

Resource abbreviation LegioList
Resource identifier Ordered locus name.
DR   LegioList; lpp2301; -.