Last modified September 21, 2015
UniProtKB Keywords constitute a controlled vocabulary with a hierarchical structure. Keywords summarise the content of a UniProtKB entry and facilitate the search for proteins of interest.
Keywords are classified in 10 categories:
- Biological process
- Cellular component
- Coding sequence diversity
- Developmental stage
- Disease
- Domain
- Ligand
- Molecular function
- Post-translational modification
- Technical term
An entry often contains several keywords. Within a category, the keywords are listed in alphabetical order.
Keywords can be used to retrieve subsets of protein entries or to generate indexes of entries based on functional, structural, or other categories.
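As an illustration of generating such an index, the following minimal Python sketch reads the AC (accession) and KW (keyword) lines of flat-file style text and builds a keyword-to-accession index. The function names `parse_keywords` and `build_keyword_index` and the accessions in `SAMPLE` are made up for this example. The parser assumes plain KW lines without evidence tags, so it is a starting point rather than a complete flat-file reader.

```python
from collections import defaultdict

# Made-up sample in flat-file style: two-letter line code, three spaces, content.
SAMPLE = """\
AC   X00001;
KW   ATP-binding; Kinase;
KW   Transferase.
//
AC   X00002;
KW   Kinase; Membrane.
//
"""


def parse_keywords(flat_text):
    """Return {primary accession: [keywords]} from flat-file style text."""
    entries = {}
    accession, keywords = None, []
    for line in flat_text.splitlines():
        code, value = line[:2], line[5:].strip()
        if code == "AC" and accession is None:
            # The first accession on the first AC line is the primary one.
            accession = value.split(";")[0].strip()
        elif code == "KW":
            # Keywords are separated by ';' and the last KW line ends with '.'.
            keywords += [k.strip() for k in value.rstrip(".").split(";") if k.strip()]
        elif code == "//":
            # '//' terminates an entry.
            if accession is not None:
                entries[accession] = keywords
            accession, keywords = None, []
    return entries


def build_keyword_index(entries):
    """Invert {accession: [keywords]} into {keyword: [accessions]}, both sorted."""
    index = defaultdict(list)
    for accession, kws in entries.items():
        for kw in kws:
            index[kw].append(accession)
    return {kw: sorted(accs) for kw, accs in sorted(index.items())}


if __name__ == "__main__":
    for keyword, accessions in build_keyword_index(parse_keywords(SAMPLE)).items():
        print(keyword, ", ".join(accessions))
```

Looking up a single keyword in the resulting dictionary gives the subset of entries carrying it, which corresponds to the retrieval use case described above.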
Keywords in UniProtKB/TrEMBL
UniProtKB/TrEMBL uses the same list of keywords as UniProtKB/Swiss-Prot. However, because most keywords in an entry are added during manual annotation, UniProtKB/TrEMBL entries generally contain fewer keywords than UniProtKB/Swiss-Prot entries. The main sources of UniProtKB/TrEMBL keywords are:
- The underlying nucleotide entry. The nucleotide databases (e.g. EMBL) contain keywords that are transferred to the corresponding UniProtKB/TrEMBL entry provided they are also present in the UniProtKB keyword list.
- The program which creates UniProtKB/TrEMBL entries. This adds keywords based on information in the underlying nucleotide entry. For example, if a nucleotide entry contains the word “kinase” in the description field, the program will add the keyword “Kinase” to the corresponding UniProtKB/TrEMBL entry.
- Automatic annotation.
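The first two sources can be sketched in Python as follows. This is an illustrative sketch only, not the program that creates UniProtKB/TrEMBL entries: `UNIPROT_KEYWORDS` is a tiny stand-in for the real keyword list, `DESCRIPTION_RULES` is a hypothetical rule table containing only the "kinase" rule mentioned above, and exact-match transfer and whole-word matching are assumptions.

```python
import re

# Tiny stand-in for the UniProtKB keyword list (the real list is far larger).
UNIPROT_KEYWORDS = {"Kinase", "Membrane", "Transferase"}

# Hypothetical rule table: word found in the description -> keyword to add.
# Only the "kinase" rule comes from the text; the real rules are not shown here.
DESCRIPTION_RULES = {"kinase": "Kinase"}


def trembl_keywords(nucleotide_keywords, description):
    """Combine transferred and description-derived keywords for one entry."""
    # Source 1: keep nucleotide-entry keywords only if they also appear
    # in the UniProtKB keyword list (exact match assumed).
    transferred = {kw for kw in nucleotide_keywords if kw in UNIPROT_KEYWORDS}

    # Source 2: add a keyword when its trigger word occurs as a whole word
    # in the description field (case-insensitive match assumed).
    words = set(re.findall(r"[a-z]+", description.lower()))
    derived = {kw for word, kw in DESCRIPTION_RULES.items() if word in words}

    # Keywords are kept in alphabetical order.
    return sorted(transferred | derived)


if __name__ == "__main__":
    print(trembl_keywords(
        ["Membrane", "unlisted tag"],
        "Putative serine/threonine protein kinase",
    ))
```

Here "Membrane" is transferred because it is in the stand-in list, "unlisted tag" is dropped because it is not, and "Kinase" is added because the description contains the word "kinase".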