UniProt release 2011_09
Published September 21, 2011
Reference proteomes in UniProt
With the significant increase in the number of complete genomes sequenced, it is critically important to organise this data in a way that allows users to effectively navigate the growing number of available complete proteome sequences. The approach adopted by UniProt to meet this challenge is to define a set of “reference proteomes” which are “landmarks” in proteome space.
Reference proteomes have been selected to provide broad coverage of the tree of life, and constitute a representative cross-section of the taxonomic diversity to be found within UniProtKB. They include the proteomes of well-studied model organisms and other proteomes of interest for biomedical and biotechnological research. Species of particular importance may be represented by numerous reference proteomes for specific ecotypes or strains of interest.
Currently, UniProt has defined 455 reference proteomes in collaboration with Ensembl and the NCBI Reference Sequence collection. The keyword ‘Reference proteome’ has been created to allow their easy retrieval, and the keyword ‘Virus reference strain’ has been deprecated to reflect this.
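As an illustration of keyword-based retrieval, the sketch below checks whether an entry in UniProtKB flat-file (text) format carries the ‘Reference proteome’ keyword by reading its KW lines. It assumes the usual flat-file layout, in which keywords are separated by semicolons and the list ends with a full stop. The function names and the sample entry are hypothetical and are not part of any UniProt tool.

```python
def keywords_of(entry_text):
    """Return the keywords listed on the KW lines of one flat-file entry."""
    # KW lines start with the two-letter line code followed by three spaces.
    joined = " ".join(
        line[5:].strip()
        for line in entry_text.splitlines()
        if line.startswith("KW   ")
    )
    # Keywords are separated by ';' and the whole list is terminated by '.'.
    return [kw.strip() for kw in joined.rstrip(".").split(";") if kw.strip()]


def is_reference_proteome(entry_text):
    """True if the entry carries the 'Reference proteome' keyword."""
    return "Reference proteome" in keywords_of(entry_text)


# Hypothetical, heavily abbreviated entry used only for illustration.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN           Reviewed;         100 AA.
KW   Complete proteome; Cytoplasm;
KW   Reference proteome.
//
"""

if __name__ == "__main__":
    print(keywords_of(SAMPLE_ENTRY))
    print(is_reference_proteome(SAMPLE_ENTRY))
```

The same check can be applied to each entry of a downloaded flat file after splitting the file on the `//` terminator lines.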
The set of reference proteomes will be continuously reviewed as new proteomes of interest become available and as existing taxonomic classifications are revised. We would very much welcome feedback on our current list of reference proteomes and suggestions for new candidates via firstname.lastname@example.org.
Changes to keywords
Replacement of the keyword ‘Virus reference strain’ by ‘Reference proteome’
We have introduced the more widely applicable keyword ‘Reference proteome’ to replace the keyword ‘Virus reference strain’. All ‘Virus reference strains’ are now defined as ‘Reference proteomes’. See preceding text for further information on ‘Reference proteomes’.
New keyword: