Standardization of 'Catalytic activity' annotations

A ‘Catalytic activity’ annotation describes a catalytic activity of an enzyme, i.e. a chemical reaction that the enzyme catalyzes. Until now, UniProt has followed the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) for describing enzymatic activities, except for reactions that are described in the scientific literature but are not (yet) covered by the NC-IUBMB.

The focus of the NC-IUBMB is the nomenclature and classification of enzymes by the reactions they catalyze. For this purpose, the NC-IUBMB typically describes one exemplary reaction for each class of enzymes, with the understanding that individual members of the class may use alternative reactants. The NC-IUBMB also uses its own names for the reactants.

To allow UniProt to curate reactions at the level of specific enzymes instead of enzyme classes, and to use standardized names for reactants, we will use chemical reaction descriptions from the Rhea database whenever possible. For catalytic activities that can only be described as free text, we will continue to follow the NC-IUBMB descriptions. In the future, we will also curate the physiological direction of a reaction, i.e. the direction of the net flow of reactants in vivo, where evidence for it is available.

Because Enzyme Commission (EC) numbers focus on nomenclature, cross-references to them have historically been added to the Protein names subsection of UniProtKB entries. To link the EC numbers to the reactions on which they are based, we will also add them to ‘Catalytic activity’ annotations.

‘Catalytic activity’ annotations are found in UniProtKB entries, as well as in UniRule and SAAS annotation rules.

Below is a description of how this change will affect the different file formats in which UniProt entries are distributed.

Text format

Note: Regex symbols indicate whether a pattern (as delimited by parentheses) is optional (?) or may occur 1 or more times (+).

Reaction description from Rhea:

CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=<RheaText>; Xref=<RheaXref>(, <ReactantXref>)+;
CC        ( EC=<EcNumber>;)?( Evidence={<Evidences>};)?
(CC       PhysiologicalDirection=left-to-right; Xref=<RheaXref>; Evidence={<Evidences>};)?
(CC       PhysiologicalDirection=right-to-left; Xref=<RheaXref>; Evidence={<Evidences>};)?

Where:

  • <RheaText>: Textual representation of an undirected Rhea reaction, i.e. one with no direction specified.
  • <RheaXref>: Cross-reference to a Rhea reaction.
  • <ReactantXref>: Cross-reference to a reactant.
  • <EcNumber>: EC number of the corresponding enzyme class, when available.
  • <Evidences>: List of evidences, when available.

Example: Q9ULW8

Current format (based on NC-IUBMB):

CC   -!- CATALYTIC ACTIVITY: Protein L-arginine + H(2)O = protein L-
CC       citrulline + NH(3). {ECO:0000269|PubMed:27866708}.

New format (based on Rhea):

CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=H2O + L-arginyl-[protein] = L-citrullyl-[protein] +
CC         NH4(+); Xref=Rhea:RHEA:18089, Rhea:RHEA-COMP:10532, Rhea:RHEA-
CC         COMP:10588, ChEBI:CHEBI:15377, ChEBI:CHEBI:28938,
CC         ChEBI:CHEBI:29965, ChEBI:CHEBI:83397; EC=3.5.3.15;
CC         Evidence={ECO:0000269|PubMed:27866708};
CC       PhysiologicalDirection=left-to-right; Xref=RHEA:18090;
CC         Evidence={ECO:0000269|PubMed:27866708};
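
The Rhea-based pattern above can be read back with a small parser. The following is a minimal sketch, not a UniProt tool: the function names, the regular expressions and the line-joining rule are assumptions made for illustration. In particular, it assumes that a wrapped line ending in a hyphen was broken inside a token (as in "Rhea:RHEA-" / "COMP:10588") and is joined without a space, and it tolerates a missing Evidence field on the PhysiologicalDirection line.

```python
import re

# The example lines for Q9ULW8, copied verbatim from the text above.
CC_LINES = [
    "CC   -!- CATALYTIC ACTIVITY:",
    "CC       Reaction=H2O + L-arginyl-[protein] = L-citrullyl-[protein] +",
    "CC         NH4(+); Xref=Rhea:RHEA:18089, Rhea:RHEA-COMP:10532, Rhea:RHEA-",
    "CC         COMP:10588, ChEBI:CHEBI:15377, ChEBI:CHEBI:28938,",
    "CC         ChEBI:CHEBI:29965, ChEBI:CHEBI:83397; EC=3.5.3.15;",
    "CC         Evidence={ECO:0000269|PubMed:27866708};",
    "CC       PhysiologicalDirection=left-to-right; Xref=RHEA:18090;",
    "CC         Evidence={ECO:0000269|PubMed:27866708};",
]

# Hypothetical patterns that mirror the template given in this section.
REACTION_RE = re.compile(
    r"Reaction=(?P<text>.+?); Xref=(?P<xrefs>[^;]+);"
    r"(?: EC=(?P<ec>[^;]+);)?"
    r"(?: Evidence=\{(?P<evidence>[^}]*)\};)?"
)
DIRECTION_RE = re.compile(
    r"PhysiologicalDirection=(?P<direction>left-to-right|right-to-left);"
    r" Xref=(?P<xref>[^;]+);"
    r"(?: Evidence=\{(?P<evidence>[^}]*)\};)?"
)


def join_cc_lines(lines):
    """Join wrapped CC lines into one string (heuristic, see lead-in)."""
    joined = ""
    for line in lines:
        text = line[2:].strip()  # drop the 'CC' line code and indentation
        if not joined or joined.endswith("-"):
            joined += text        # token was split at a hyphen
        else:
            joined += " " + text  # ordinary wrap at a space
    return joined


def parse_rhea_activity(lines):
    """Return the fields of a Rhea-based annotation, or None if absent."""
    flat = join_cc_lines(lines)
    match = REACTION_RE.search(flat)
    if match is None:
        return None
    return {
        "text": match.group("text"),
        "xrefs": [x.strip() for x in match.group("xrefs").split(",")],
        "ec": match.group("ec"),
        "evidence": match.group("evidence"),
        "directions": [
            (d.group("direction"), d.group("xref"))
            for d in DIRECTION_RE.finditer(flat)
        ],
    }


activity = parse_rhea_activity(CC_LINES)
print(activity["text"])
print(activity["xrefs"])
print(activity["directions"])
```

The first cross-reference in the list is the Rhea reaction itself; the remaining entries are the reactant cross-references, as in the pattern.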

Reaction description from NC-IUBMB:

CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=<IUBMBText>; EC=<EcNumber>;( Evidence={<Evidences>};)?

Where:

  • <IUBMBText>: An NC-IUBMB reaction description.
  • <EcNumber>: EC number of the corresponding enzyme class.
  • <Evidences>: List of evidences, when available.

Example: P17050

Current format (based on NC-IUBMB):

CC   -!- CATALYTIC ACTIVITY: Cleavage of non-reducing alpha-(1->3)-N-
CC       acetylgalactosamine residues from human blood group A and AB mucin
CC       glycoproteins, Forssman hapten and blood group A lacto series
CC       glycolipids. {ECO:0000269|PubMed:19683538}.

New format (based on NC-IUBMB):

CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=Cleavage of non-reducing alpha-(1->3)-N-
CC         acetylgalactosamine residues from human blood group A and AB
CC         mucin glycoproteins, Forssman hapten and blood group A lacto
CC         series glycolipids.; EC=3.2.1.49;
CC         Evidence={ECO:0000269|PubMed:19683538};
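
The NC-IUBMB pattern translates almost directly into a regular expression. The sketch below is illustrative only: the function name and the expression are assumptions, and the input is assumed to be the annotation already joined into one line (with the hyphen-wrapped "N-" / "acetylgalactosamine" rejoined). Because free text may itself contain semicolons, the reaction text is matched greedily up to the last "; EC=".

```python
import re

# Hypothetical pattern mirroring:
#   Reaction=<IUBMBText>; EC=<EcNumber>;( Evidence={<Evidences>};)?
IUBMB_RE = re.compile(
    r"^Reaction=(?P<text>.+); EC=(?P<ec>[^;]+);"
    r"(?: Evidence=\{(?P<evidence>[^}]*)\};)?$"
)


def parse_iubmb_activity(flat):
    """Parse a free-text (NC-IUBMB) annotation; return None otherwise."""
    if "; Xref=" in flat:
        # An Xref field after the reaction marks the Rhea-based form.
        return None
    match = IUBMB_RE.match(flat)
    return match.groupdict() if match else None


# The P17050 example from above, joined into a single line.
FLAT = (
    "Reaction=Cleavage of non-reducing alpha-(1->3)-N-acetylgalactosamine "
    "residues from human blood group A and AB mucin glycoproteins, "
    "Forssman hapten and blood group A lacto series glycolipids.; "
    "EC=3.2.1.49; Evidence={ECO:0000269|PubMed:19683538};"
)

parsed = parse_iubmb_activity(FLAT)
print(parsed["ec"])
print(parsed["text"])
```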

XML format

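We will extend the UniProt XSD with new elements and types. In the excerpt below, the additions are the reaction and physiologicalReaction elements in commentType and the new types reactionType and physiologicalReactionType: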

    <xs:complexType name="commentType">
        ...
        <xs:sequence>
            <xs:element name="molecule" type="moleculeType" minOccurs="0"/>
            <xs:choice minOccurs="0">
                ...
                <xs:sequence>
                    <xs:annotation>
                        <xs:documentation>Used in 'catalytic activity' annotations.</xs:documentation>
                    </xs:annotation>
                    <xs:element name="reaction" type="reactionType"/>
                    <xs:element name="physiologicalReaction" type="physiologicalReactionType" minOccurs="0" maxOccurs="2"/>
                </xs:sequence>
                ...
            </xs:choice>
            ...
        </xs:sequence>
        ...
    </xs:complexType>
    ...
    <xs:complexType name="reactionType">
        <xs:annotation>
            <xs:documentation>Describes a chemical reaction.</xs:documentation>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="text" type="xs:string"/>
            <xs:element name="dbReference" type="dbReferenceType" minOccurs="1" maxOccurs="unbounded"/>
        </xs:sequence>
        <xs:attribute name="evidence" type="intListType" use="optional"/>
    </xs:complexType>

    <xs:complexType name="physiologicalReactionType">
        <xs:annotation>
            <xs:documentation>Describes a physiological reaction.</xs:documentation>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="dbReference" type="dbReferenceType"/>
        </xs:sequence>
        <xs:attribute name="direction" use="required">
            <xs:simpleType>
                <xs:restriction base="xs:string">
                    <xs:enumeration value="left-to-right"/>
                    <xs:enumeration value="right-to-left"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:attribute>
        <xs:attribute name="evidence" type="intListType" use="optional"/>
    </xs:complexType>

Reaction description from Rhea:

Example: Q9ULW8

Current format (based on NC-IUBMB):

<comment type="catalytic activity">
  <text evidence="3">Protein L-arginine + H(2)O = protein L-citrulline + NH(3).</text>
</comment>

New format (based on Rhea):

<comment type="catalytic activity">
  <reaction evidence="3">
    <text>H2O + L-arginyl-[protein] = L-citrullyl-[protein] + NH4(+)</text>
    <dbReference type="Rhea" id="RHEA:18089"/>
    <dbReference type="Rhea" id="RHEA-COMP:10532"/>
    <dbReference type="Rhea" id="RHEA-COMP:10588"/>
    <dbReference type="ChEBI" id="CHEBI:15377"/>
    <dbReference type="ChEBI" id="CHEBI:28938"/>
    <dbReference type="ChEBI" id="CHEBI:29965"/>
    <dbReference type="ChEBI" id="CHEBI:83397"/>
    <dbReference type="EC" id="3.5.3.15"/>
  </reaction>
  <physiologicalReaction direction="left-to-right" evidence="3">
    <dbReference type="Rhea" id="RHEA:18090"/>
  </physiologicalReaction>
</comment>
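
As an illustration of how the new elements might be consumed, the following sketch reads the comment above with Python's standard xml.etree.ElementTree module. The function name and the shape of the returned dictionary are assumptions. The snippet is parsed exactly as printed here, without an XML namespace; a full UniProt XML file declares a default namespace, so the element lookups would need to be namespace-qualified there.

```python
import xml.etree.ElementTree as ET

# The new-format comment for Q9ULW8, copied from the example above.
SNIPPET = """<comment type="catalytic activity">
  <reaction evidence="3">
    <text>H2O + L-arginyl-[protein] = L-citrullyl-[protein] + NH4(+)</text>
    <dbReference type="Rhea" id="RHEA:18089"/>
    <dbReference type="Rhea" id="RHEA-COMP:10532"/>
    <dbReference type="Rhea" id="RHEA-COMP:10588"/>
    <dbReference type="ChEBI" id="CHEBI:15377"/>
    <dbReference type="ChEBI" id="CHEBI:28938"/>
    <dbReference type="ChEBI" id="CHEBI:29965"/>
    <dbReference type="ChEBI" id="CHEBI:83397"/>
    <dbReference type="EC" id="3.5.3.15"/>
  </reaction>
  <physiologicalReaction direction="left-to-right" evidence="3">
    <dbReference type="Rhea" id="RHEA:18090"/>
  </physiologicalReaction>
</comment>"""


def read_catalytic_activity(comment):
    """Collect reaction text, cross-references and directions (sketch)."""
    reaction = comment.find("reaction")
    xrefs = {}
    for ref in reaction.findall("dbReference"):
        # Group cross-reference ids by database type (Rhea, ChEBI, EC).
        xrefs.setdefault(ref.get("type"), []).append(ref.get("id"))
    directions = [
        (phys.get("direction"), phys.find("dbReference").get("id"))
        for phys in comment.findall("physiologicalReaction")
    ]
    return {
        "text": reaction.findtext("text"),
        # The evidence attribute is a whitespace-separated list of keys.
        "evidence": reaction.get("evidence", "").split(),
        "xrefs": xrefs,
        "directions": directions,
    }


result = read_catalytic_activity(ET.fromstring(SNIPPET))
print(result["text"])
print(result["xrefs"])
print(result["directions"])
```

The same function should also handle an NC-IUBMB-based comment, in which the only dbReference is the EC number and no physiologicalReaction element is present.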

Reaction description from NC-IUBMB:

Example: P17050

Current format (based on NC-IUBMB):

<comment type="catalytic activity">
  <text evidence="6">Cleavage of non-reducing alpha-(1->3)-N-acetylgalactosamine residues from human blood group A and AB mucin glycoproteins, Forssman hapten and blood group A lacto series glycolipids.</text>
</comment>

New format (based on NC-IUBMB):

<comment type="catalytic activity">
  <reaction evidence="6">
    <text>Cleavage of non-reducing alpha-(1->3)-N-acetylgalactosamine residues from human blood group A and AB mucin glycoproteins, Forssman hapten and blood group A lacto series glycolipids.</text>
    <dbReference type="EC" id="3.2.1.49"/>
  </reaction>
</comment>

RDF format

Note: Evidence-related statements are omitted since their format will not change. In the current format, evidence is attributed via reification of the rdfs:comment statement. In the new format, the up:catalyticActivity and up:catalyzedPhysiologicalReaction statements will be reified.

Reaction description from Rhea:

Example: Q9ULW8

Current format (based on NC-IUBMB):

uniprot:Q9ULW8
  up:annotation <Q9ULW8#SIP017EC216DF0EDC2E> .

<Q9ULW8#SIP017EC216DF0EDC2E>
  rdf:type up:Catalytic_Activity_Annotation ;
  rdfs:comment "Protein L-arginine + H(2)O = protein L-citrulline + NH(3)." .

New format (based on Rhea):

uniprot:Q9ULW8
  up:annotation <Q9ULW8#SIP017EC216DF0EDC2E> .

<Q9ULW8#SIP017EC216DF0EDC2E>
  rdf:type up:Catalytic_Activity_Annotation ;
  up:catalyticActivity <Q9ULW8#SIP017EC216DF0EDC2F> ;
  up:catalyzedPhysiologicalReaction <http://rdf.rhea-db.org/18090> .

<Q9ULW8#SIP017EC216DF0EDC2F>
  rdf:type up:Catalytic_Activity ;
  up:catalyzedReaction <http://rdf.rhea-db.org/18089> ;
  up:enzymeClass enzyme:3.5.3.15 .
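
To make the structure of the new statements explicit, the sketch below assembles them from their parts: the annotation node points to a separate activity node, which in turn points to the Rhea reaction and the enzyme class, while the physiological reaction hangs off the annotation itself. This is an illustration only, not how UniProt generates its RDF. The function and parameter names are hypothetical, the node identifiers are treated as opaque strings supplied by the caller, and the Rhea IRI pattern is simply copied from the example.

```python
RHEA_IRI = "http://rdf.rhea-db.org/"  # IRI prefix as seen in the example


def turtle_block(subject, properties):
    """Format one subject with its predicate-object pairs."""
    return subject + "\n  " + " ;\n  ".join(properties) + " ."


def catalytic_activity_turtle(accession, annotation_id, activity_id,
                              rhea_id, ec_number=None,
                              physiological_rhea_id=None):
    """Build the new-format statements for a Rhea-based annotation."""
    annotation = f"<{accession}#{annotation_id}>"
    activity = f"<{accession}#{activity_id}>"

    annotation_props = [
        "rdf:type up:Catalytic_Activity_Annotation",
        f"up:catalyticActivity {activity}",
    ]
    if physiological_rhea_id is not None:
        annotation_props.append(
            f"up:catalyzedPhysiologicalReaction "
            f"<{RHEA_IRI}{physiological_rhea_id}>"
        )

    activity_props = [
        "rdf:type up:Catalytic_Activity",
        f"up:catalyzedReaction <{RHEA_IRI}{rhea_id}>",
    ]
    if ec_number is not None:
        activity_props.append(f"up:enzymeClass enzyme:{ec_number}")

    return "\n\n".join([
        turtle_block(f"uniprot:{accession}", [f"up:annotation {annotation}"]),
        turtle_block(annotation, annotation_props),
        turtle_block(activity, activity_props),
    ])


turtle = catalytic_activity_turtle(
    "Q9ULW8", "SIP017EC216DF0EDC2E", "SIP017EC216DF0EDC2F",
    rhea_id="18089", ec_number="3.5.3.15", physiological_rhea_id="18090",
)
print(turtle)
```

Called with the values of the Q9ULW8 example, the function reproduces the three statement groups shown above. Prefix declarations and the reified evidence statements are left out, as in the examples.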

Reaction description from NC-IUBMB:

Example: P17050

Current format (based on NC-IUBMB):

uniprot:P17050
  up:annotation <P17050#SIP0FD272930B1683DE> .

<P17050#SIP0FD272930B1683DE>
  rdf:type up:Catalytic_Activity_Annotation ;
  rdfs:comment "Cleavage of non-reducing alpha-(1->3)-N-acetylgalactosamine residues from human blood group A and AB mucin glycoproteins, Forssman hapten and blood group A lacto series glycolipids." .
  

New format (based on NC-IUBMB):

uniprot:P17050
  up:annotation <P17050#SIP0FD272930B1683DE> .

<P17050#SIP0FD272930B1683DE>
  rdf:type up:Catalytic_Activity_Annotation ;
  up:catalyticActivity <P17050#SIP0FD272930B1683DF> .

<P17050#SIP0FD272930B1683DF>
  rdf:type up:Catalytic_Activity ;
  skos:closeMatch enzyme:3.2.1.49#SIP0FD272930B1683DG ;
  up:enzymeClass enzyme:3.2.1.49 .

Note that this change will be accompanied by a change in the RDF representation of enzyme-related data.

