UniProt release 14.7
Published January 20, 2009
UniPathway, a metabolic door to UniProtKB/Swiss-Prot
Due to the importance of using standardized nomenclature, annotations in UniProtKB/Swiss-Prot are progressively moving towards structured controlled vocabularies. In this context, the UniPathway project (a collaborative project involving the SIB and INRIA) aims at providing an extra resource dedicated to the exploration of metabolism using a structured controlled vocabulary for concisely describing the role of a protein in metabolism.
The metabolism of living organisms can be understood as a network of biochemical reactions, generally catalyzed by enzymes. Dealing with this network as a whole is a complex task, and a classical approach is to divide it into more manageable segments, called pathways. This division is always somewhat arbitrary and depends on the intended use. Usually, a first level of segmentation is achieved on the basis of biological criteria. For instance, one could consider the sub-network of all reactions involved in amino-acid biosynthesis or, more specifically, in L-lysine biosynthesis only, or, even more specifically, in L-lysine biosynthesis via the AAA pathway. This results in a series of coarse- to fine-grained divisions (the coarsest is called a 'super-pathway').
Whenever possible, we further refine this first-level segmentation into a second-level one, splitting the pathways into linear segments (i.e. sub-networks without branches) called 'sub-pathways'. Such a fine-grained segmentation allows the representation of pathway variants: depending on the organism (or set of organisms), the chemical route from one compound to another can take different forms. It is important to represent these variations within the same pathway since UniProtKB covers a large number of species. In addition, this segmentation offers a convenient way to label the enzymatic reactions that constitute a metabolic pathway by their relative position ('step') in the sub-pathway.
The role of a protein in metabolism is described in the 'Pathway' subsection of the 'General annotation (Comments)' section. The syntax is 'super-pathway; pathway; sub-pathway: step n/m'. For examples of metabolic pathway annotations, see: P49367, P38998 and P11454. In this last example, the biochemical reactions of the pathway are not yet known. P11454 was therefore only annotated at the level of the pathway.
In the current version of UniProtKB/Swiss-Prot, close to 82'000 entries are annotated with the UniPathway controlled vocabulary. The UniProt web site supplies direct links to the UniPathway web server that provides more detailed information on pathways, sub-pathways and biochemical reactions.
New document on pathway controlled vocabulary
The document pathlist.txt is available by ftp and on the Web site. It describes the controlled vocabulary used in the 'Pathway' subsection of the 'General annotation (Comments)' section in the following format:
---------  -------------------------------  ----------------------------
Line code  Content                          Occurrence in an entry
---------  -------------------------------  ----------------------------
ID         Identifier                       Once; starts an entry
AC         Accession number                 Once
CL         UniPathway class                 Once
DE         Definition                       Once or more
SY         Synonym(s)                       Optional; once or more
HI         Relationship is-a                Optional; once or more
HP         Relationship part-of             Optional; once or more
DR         Cross-reference(s)               Optional; once or more
//         Terminator                       Once; ends an entry
ID   D-alanine biosynthesis.
AC   UPA00042
CL   Pathway.
DE   Biosynthesis of D-alanine. D-alanine is used either as an energy
DE   source or as a component of bacterial cell wall, where it is directly
DE   involved in the cross-linking of adjacent peptidoglycan chains. In
DE   Gram-positive bacteria, D-alanine can also be found to variable
DE   extents in cell wall teichoic acid and lipoteichoic acid residues.
SY   D-2-aminopropionic acid biosynthesis.
HI   UPA00402; amino-acid biosynthesis.
DR   GO; GO:0030632; P:D-alanine biosynthetic process.
DR   KEGG; map00252; Alanine and aspartate metabolism.
DR   KEGG; map00473; D-Alanine metabolism.
DR   MetaCyc; ALADEG-PWY.
//
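As an illustration of how the line codes above could be consumed, here is a minimal Python sketch that splits pathlist-style text into entries. It is not an official UniProt parser. The function names (parse_pathlist, summarize), the returned dictionary keys and the '; ' splitting of HI, HP and DR values are assumptions made for this example, and the sketch assumes that any header text preceding the first entry in the real file has already been removed.

```python
KNOWN_CODES = {"ID", "AC", "CL", "DE", "SY", "HI", "HP", "DR"}


def parse_pathlist(text):
    """Split pathlist-style text into entries (dict: line code -> list of values)."""
    entries, fields = [], {}
    for line in text.splitlines():
        if line.startswith("//"):  # terminator: close the current entry
            if fields:
                entries.append(fields)
            fields = {}
            continue
        code, value = line[:2], line[2:].strip()
        if code not in KNOWN_CODES or not value:
            continue  # ignore blank lines and anything that is not an entry line
        fields.setdefault(code, []).append(value)
    return entries  # an entry without a '//' terminator is not returned


def _pairs(values):
    """Assumed convention: drop the trailing period, then split on '; '."""
    return [tuple(v.rstrip(".").split("; ")) for v in values]


def summarize(entry):
    """Turn the raw line-code lists of one entry into named fields (hypothetical keys)."""
    return {
        "id": entry["ID"][0].rstrip("."),
        "accession": entry["AC"][0],
        "class": entry["CL"][0].rstrip("."),
        "definition": " ".join(entry.get("DE", [])),  # DE may span several lines
        "synonyms": [s.rstrip(".") for s in entry.get("SY", [])],
        "is_a": _pairs(entry.get("HI", [])),
        "part_of": _pairs(entry.get("HP", [])),
        "xrefs": _pairs(entry.get("DR", [])),
    }


SAMPLE = """\
ID   D-alanine biosynthesis.
AC   UPA00042
CL   Pathway.
DE   Biosynthesis of D-alanine. D-alanine is used either as an energy
DE   source or as a component of bacterial cell wall, where it is directly
DE   involved in the cross-linking of adjacent peptidoglycan chains.
SY   D-2-aminopropionic acid biosynthesis.
HI   UPA00402; amino-acid biosynthesis.
DR   GO; GO:0030632; P:D-alanine biosynthetic process.
DR   MetaCyc; ALADEG-PWY.
//
"""

if __name__ == "__main__":
    for raw_entry in parse_pathlist(SAMPLE):
        info = summarize(raw_entry)
        print(info["accession"], info["id"], info["is_a"])
```

Keeping the raw per-code lists separate from the summarized view means that optional codes (SY, HI, HP, DR) can simply be absent without special handling.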
Syntax modification of the 'Pathway' subsection
We have structured the 'Pathway' subsection of the 'General annotation (Comments)' section (comment line (CC) topic PATHWAY in the flat file), using the controlled vocabulary provided by the UniPathway resource, in order to improve the consistency of annotation and to make its content machine-parsable.
The new format of PATHWAY topic in the flat file is:
CC   -!- PATHWAY: Super-pathway; Pathway(; Sub-pathway: Enzymatic_reaction)?([regulation])?.
Where:
- Super-pathway: Describes a class of metabolic pathways, e.g.
Amino-acid biosynthesis
- Pathway: Describes a metabolic pathway, e.g.
L-lysine biosynthesis via AAA pathway
- Sub-pathway: Describes a linear sequence of enzymatic reactions in the format:
final_product from initial_substrate
where final_product and initial_substrate are the labels of the corresponding chemical compounds, e.g.
L-alpha-aminoadipate from 2-oxoglutarate
- Enzymatic_reaction: Describes the enzymatic reaction catalyzed by the protein in the format:
step n/m
where n is the relative position of the enzymatic reaction in the sub-pathway and m is the total number of enzymatic reactions in the sub-pathway.
- [regulation]: Indicates that the protein acts as a transcriptional regulator of the genes coding for enzymes of the pathway.
Note: Perl-style multipliers indicate whether a pattern (as delimited by parentheses) is optional.
CC   -!- PATHWAY: Amino-acid biosynthesis; L-lysine biosynthesis via AAA
CC       pathway; L-alpha-aminoadipate from 2-oxoglutarate: step 2/4.
P0A877:
CC   -!- PATHWAY: Amino-acid biosynthesis; L-tryptophan biosynthesis; L-
CC       tryptophan from chorismate: step 5/5.
P95477:
CC   -!- PATHWAY: Siderophore biosynthesis; pseudomonine biosynthesis.
P52957:
CC   -!- PATHWAY: Mycotoxin biosynthesis; sterigmatocystin biosynthesis
CC       [regulation].
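The structured syntax lends itself to parsing with a regular expression. The following Python sketch is one possible approach under stated assumptions and is not part of any UniProt tool: the names PATHWAY_RE, unwrap_pathway_comment and parse_pathway are hypothetical, an optional space is accepted before '[regulation]', and a wrapped line ending in a hyphen is assumed to continue the same word on the next line.

```python
import re

# Super-pathway; Pathway(; Sub-pathway: step n/m)?([regulation])?.
PATHWAY_RE = re.compile(
    r"^(?P<super_pathway>[^;]+); (?P<pathway>[^;]+?)"
    r"(?:; (?P<sub_pathway>[^;]+?): step (?P<step>\d+)/(?P<total>\d+))?"
    r"(?: ?\[(?P<regulation>regulation)\])?\.$"
)

TOPIC = "-!- PATHWAY:"


def unwrap_pathway_comment(lines):
    """Join the CC lines of one PATHWAY comment into a single string."""
    text = ""
    for line in lines:
        chunk = line[2:].strip() if line.startswith("CC") else line.strip()
        if chunk.startswith(TOPIC):
            chunk = chunk[len(TOPIC):].strip()
        # Assumed heuristic: no space is inserted after a line-final hyphen.
        sep = "" if (not text or text.endswith("-")) else " "
        text += sep + chunk
    return text


def parse_pathway(text):
    """Return the components of an unwrapped PATHWAY comment, or None if it does not match."""
    match = PATHWAY_RE.match(text)
    if match is None:
        return None
    groups = match.groupdict()
    return {
        "super_pathway": groups["super_pathway"],
        "pathway": groups["pathway"],
        "sub_pathway": groups["sub_pathway"],
        "step": int(groups["step"]) if groups["step"] else None,
        "total_steps": int(groups["total"]) if groups["total"] else None,
        "regulation": groups["regulation"] is not None,
    }


if __name__ == "__main__":
    comment = [
        "CC   -!- PATHWAY: Amino-acid biosynthesis; L-tryptophan biosynthesis; L-",
        "CC       tryptophan from chorismate: step 5/5.",
    ]
    print(parse_pathway(unwrap_pathway_comment(comment)))
```

Entries annotated only at the pathway level yield None for the sub-pathway and step fields, and the regulation flag is True only when the '[regulation]' tag is present.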
Changes concerning keywords