Skip Header

UniProt release 14.7

Published January 20, 2009

Headlines

UniPathway, a metabolic door to UniProtKB/Swiss-Prot

Due to the importance of using standardized nomenclature, annotations in UniProtKB/Swiss-Prot are progressively moving towards structured controlled vocabularies. In this context, the UniPathway project (a collaborative project involving the SIB and INRIA) aims at providing an extra resource dedicated to the exploration of metabolism using a structured controlled vocabulary for concisely describing the role of a protein in metabolism.

The metabolism of living organisms can be understood as a network of biochemical reactions, generally catalyzed by enzymes. Dealing with this network as a whole is a complex task and a classical approach is to divide it into more manageable segments, called pathways. This approach is always somewhat arbitrary and depends upon the final usage. Usually, a first level of segmentation is achieved on the basis of biological criteria. For instance, one could divide by considering the sub-network of all reactions involved in the amino-acid biosynthesis or, more specifically, in L-lysine biosynthesis only, or even more specifically, in L-lysine biosynthesis via the AAA pathway. It results in a series of coarse- to fine-grained divisions (the coarsest is called a 'super-pathway').

Whenever possible, we further refine this first-level segmentation to a second-level one, in order to split the pathways into linear segments (i.e. sub-networks without branches) called 'sub-pathways'. Such a fine-grained segmentation allows representation of pathway variants. Indeed, depending on an organism (or a set of organisms), the chemical route from one compound to another can be performed in different ways. It is important to represent these variations within the same pathway since UniProtKB covers a large number of species. In addition, it offers a convenient way to label the enzymatic reactions that constitute a metabolic pathway by their relative position ('step') in the sub-pathway.

The role of a protein in metabolism is described in the 'Pathway' subsection of the 'General annotation (Comments)' section. The syntax is 'super-pathway; pathway; sub-pathway: step n/m'. For examples of metabolic pathway annotations, see: P49367, P38998 and P11454. In this last example, the biochemical reactions of the pathway are not yet known. P11454 was therefore only annotated at the level of the pathway.

In the current version of UniProtKB/Swiss-Prot, close to 82'000 entries are annotated with the UniPathway controlled vocabulary. The UniProt web site supplies direct links to the UniPathway web server that provides more detailed information on pathways, sub-pathways and biochemical reactions.

UniProtKB News

New document on pathway controlled vocabulary

The document pathlist.txt is available by ftp and on the Web site. It describes the controlled vocabulary used in the 'Pathway' subsection of the 'General annotation (Comments)' section in the following format:

---------  -------------------------------   ----------------------------
Line code  Content                           Occurrence in an entry
---------  -------------------------------   ----------------------------
ID         Identifier                        Once; starts an entry
AC         Accession number                  Once
CL         UniPathway class                  Once
DE         Definition                        Once or more
SY         Synonym(s)                        Optional; once or more
HI         Relationship is-a                 Optional; once or more
HP         Relationship part-of              Optional; once or more
DR         Cross-reference(s)                Optional; once or more
//         Terminator                        Once; ends an entry

Example:

ID   D-alanine biosynthesis.
AC   UPA00042
CL   Pathway.
DE   Biosynthesis of D-alanine. D-alanine is used either as an energy
DE   source or as a component of bacterial cell wall, where it is directly
DE   involved in the cross-linking of adjacent peptidoglycan chains. In
DE   Gram-positive bacteria, D-alanine can also be found to variable
DE   extents in cell wall teichoic acid and lipoteichoic acid residues.
SY   D-2-aminopropionic acid biosynthesis.
HI   UPA00402; amino-acid biosynthesis.
DR   GO; GO:0030632; P:D-alanine biosynthetic process.
DR   KEGG; map00252; Alanine and aspartate metabolism.
DR   KEGG; map00473; D-Alanine metabolism.
DR   MetaCyc; ALADEG-PWY.
//

Syntax modification of the 'Pathway' subsection

We have structured the 'Pathway' subsection of the 'General annotation (Comments)' section (comment line (CC) topic PATHWAY in the flat file), using the controlled vocabulary provided by the UniPathway resource, in order to improve the consistency of annotation and to allow to parse its content.

The new format of PATHWAY topic in the flat file is:

CC   -!- PATHWAY: Super-pathway; Pathway(; Sub-pathway: Enzymatic_reaction)?([regulation])?.
     
Where:
  • Super-pathway: Describes a class of metabolic pathways, e.g. Amino-acid biosynthesis
  • Pathway: Describes a metabolic pathway, e.g. L-lysine biosynthesis via AAA pathway
  • Sub-pathway: Describes a linear sequence of enzymatic reactions in the format:
    final_product from initial_substrate
    where final_product and initial_substrate are the labels of the corresponding chemical compounds, e.g. L-alpha-aminoadipate from 2-oxoglutarate
  • Enzymatic_reaction: Describes the enzymatic reaction catalyzed by the protein in the format:
    step n/m
    where n is the relative position of the enzymatic reaction in the sub-pathway and m is the total number of enzymatic reactions in the sub-pathway.
  • [regulation]: Indicates that a protein acts as transcriptional regulator of the genes coding for enzymes of the pathway.

Note: Perl-style multipliers indicate whether a pattern (as delimited by parentheses) is optional.

Examples:

P49367:
      CC   -!- PATHWAY: Amino-acid biosynthesis; L-lysine biosynthesis via AAA
      CC       pathway; L-alpha-aminoadipate from 2-oxoglutarate: step 2/4.
     
P0A877:
      CC   -!- PATHWAY: Amino-acid biosynthesis; L-tryptophan biosynthesis; L-
      CC       tryptophan from chorismate: step 5/5.
     
P95477:
      CC   -!- PATHWAY: Siderophore biosynthesis; pseudomonine biosynthesis.
     
P52957:
      CC   -!- PATHWAY: Mycotoxin biosynthesis; sterigmatocystin biosynthesis
      CC       [regulation].
     

Changes concerning keywords

New keywords: