Automatic gene-centric isoform mapping for eukaryotic reference proteome entries

Last modified January 26, 2021

In eukaryotic reference proteomes, unreviewed entries that are likely to belong to the same gene are computationally mapped, based on gene identifiers from Ensembl, EnsemblGenomes and model organism databases.

Some proteomes have been (manually and algorithmically) selected as reference proteomes. They cover well-studied model organisms and other organisms of interest for biomedical research and phylogeny. In this context, we provide data sets for reference proteomes where only one form of a protein, usually the best annotated version in UniProtKB, is present. The relationships identified when generating these data sets are also used when displaying individual entries on the UniProt website.

A single gene can code for multiple proteins through biological events such as alternative splicing, alternative initiation and alternative promoter usage. The UniProtKB/Swiss-Prot expert curation process includes the identification and review of different forms of a protein and their description in a single UniProtKB/Swiss-Prot entry, but its focus is the functional annotation of proteins. For this reason, not all potential isoforms of a protein that are available in UniProtKB/TrEMBL can be reviewed and merged into a single entry. As a result, many of the eukaryotic reference proteomes contain more UniProtKB entries than genes. In order to identify potential isoforms that have not (yet) been reviewed by a biocurator, we have established an automatic gene-centric mapping between entries from eukaryotic reference proteomes that are likely to belong to the same gene. This mapping is based on gene identifiers from Ensembl, EnsemblGenomes and model organism databases and, in cases where none of these are available, on gene names assigned by the original sequencing projects.
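The grouping logic described above can be illustrated with a minimal Python sketch. It is not UniProt's actual pipeline: the `Entry` class, its field names, the accessions in the usage note and the exact-match comparison of gene names are all assumptions made for illustration. The only behaviour taken from the text is the order of preference, that is, a gene identifier from an external database first and a gene name from the sequencing project only when no identifier is available.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical, simplified view of a reference proteome entry."""
    accession: str
    reviewed: bool
    gene_id: Optional[str] = None    # e.g. an Ensembl gene identifier
    gene_name: Optional[str] = None  # name assigned by the sequencing project


def gene_key(entry):
    """Return the key used to group an entry, or None if it has none.

    A gene identifier takes precedence; the gene name is only a fallback.
    Keys are tagged so that an identifier never collides with a name.
    """
    if entry.gene_id:
        return ("id", entry.gene_id)
    if entry.gene_name:
        return ("name", entry.gene_name)
    return None


def group_by_gene(entries):
    """Group accessions that are likely to belong to the same gene."""
    groups = defaultdict(list)
    unmapped = []
    for entry in entries:
        key = gene_key(entry)
        if key is None:
            unmapped.append(entry.accession)
        else:
            groups[key].append(entry.accession)
    return dict(groups), unmapped
```

Calling `group_by_gene` on a list of entries returns a dictionary from gene key to accessions, plus a list of accessions that carry neither an identifier nor a name. Any group with more than one accession is a candidate set of isoforms of the same gene.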
