Annotation guidelines

Last modified September 20, 2013

Standard operating procedure (SOP) for UniProt manual curation

This document describes the manual curation procedure used by the UniProt Consortium members. The UniProt manual curation process comprises manual review of results from a range of sequence analysis programs, literature curation of experimental data, and attribution of all information to its original source.

UniProtKB/Swiss-Prot document: Standard operating procedure (SOP) for UniProt manual curation

UniProt web page: UniProt manual annotation program

How do we manually annotate a UniProtKB entry?

Why is UniProtKB composed of 2 sections, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL?

Protein naming guidelines

Ambiguities regarding gene/protein names are a major problem in the literature and in sequence databases, which tend to propagate the confusion. As administrators of UniProt, we feel that we can play a major role in the standardization of protein nomenclature. The following document lists the rules and syntax applied to all UniProtKB/Swiss-Prot proteins.

UniProtKB/Swiss-Prot document: Protein naming guidelines

User manual: Protein names

Why does the UniProtKB use so many different names for the same protein?

The document ‘Protein nomenclature publication list’ lists references that are important in defining the nomenclature or terminology for proteins in general and, in particular, for specific families or groups of proteins.

The document ‘Generalised protein naming guidelines’ is a subset of the UniProtKB document ‘Protein naming guidelines’ which has been developed with the International Nucleotide Sequence Database Collaboration (INSDC) to provide guidelines for their submitters.

The document ‘Prokaryotic protein naming guidelines’ is a subset of the UniProtKB document ‘Protein naming guidelines’ which has been developed with the International Nucleotide Sequence Database Collaboration (INSDC) to provide guidelines for submitters of prokaryotic data.

Criteria description for protein existence

Some protein sequences exhibit strong similarity to known proteins in closely related species. For other proteins, there is experimental evidence, such as Edman sequencing, clear identification by mass spectrometry (MSI), X-ray or NMR structure, detection by antibodies, etc. For still others, there is no evidence at all. To indicate these different levels of evidence for the existence of a protein, we have introduced a PE (Protein Existence) level. The following document lists the criteria used to assign a PE level to entries.

UniProtKB/Swiss-Prot document: Criteria used to assign the PE level of entries

User manual: Protein existence

Why do we keep dubious sequences in UniProtKB? How to discard them from a protein set?
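As an illustration of how the PE level can be used to discard dubious sequences from a protein set, here is a minimal sketch in Python. It assumes the input is a UniProtKB-style FASTA file whose header lines carry a `PE=<n>` field, and that levels run from 1 (strongest evidence) to 5 (most uncertain); neither detail is spelled out on this page. The names `pe_level` and `filter_fasta` and the `max_level` parameter are hypothetical, chosen for this example only.

```python
import re

# Assumption: FASTA headers contain a field such as "PE=1" ... "PE=5".
PE_PATTERN = re.compile(r"\bPE=([1-5])\b")


def pe_level(header):
    """Return the PE level found in a FASTA header line, or None if absent."""
    match = PE_PATTERN.search(header)
    return int(match.group(1)) if match else None


def filter_fasta(lines, max_level=4):
    """Yield the lines of every FASTA record whose PE level is <= max_level.

    Records whose header has no PE field are dropped, since their
    evidence level cannot be determined.
    """
    keep = False
    for line in lines:
        if line.startswith(">"):
            level = pe_level(line)
            keep = level is not None and level <= max_level
        if keep:
            yield line
```

With the default `max_level=4`, only records marked with level 5 (or lacking a PE field) are removed; a stricter threshold such as `max_level=1` would keep only entries with the strongest evidence. The function accepts any iterable of lines, so an open file handle can be passed directly.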