UniProt release 2014_03

Published March 19, 2014


Minority report

We are a minority in our own body. Over 90% of our cells are actually not human, but microbial. The majority of these microbes reside in the gut. The gut microbiota is typically dominated by bacteria, more specifically by Bacteroidetes and Firmicutes. Its exact composition varies between individuals and depends upon lifestyle, diet, hygienic preferences, use of antibiotics, etc. Gut microbes have a profound influence on human physiology and nutrition. Among other things, they contribute to harvesting energy from food.

All guidelines for a healthy diet emphasize the necessity of eating fruit, vegetables and whole grains. These products are rich in dietary fibers, i.e. non-starch polysaccharides, most of which cannot be digested by the hydrolases encoded by our genome. Our inherent ability to digest carbohydrates is restricted to starch and simple saccharides. It does not extend to xyloglucans (XyGs), a family of highly branched plant cell wall polysaccharides that are abundant in plants. In view of the prevalence of XyGs in our diet, the mechanism by which bacteria degrade these complex polysaccharides was expected to be important to human energy acquisition, but it remained unclear until recently. Very interesting work by Larsbrink et al., published in February, sheds light on XyG metabolism. The authors identified a polysaccharide utilization locus (PUL) in the genome of a common human gut symbiont, Bacteroides ovatus. This PUL is transcriptionally upregulated in response to growth on galactoxyloglucan. It is predicted to encode 10 proteins, including 8 glycoside hydrolases. All of them were subjected to in-depth molecular characterization through reverse genetics, in vitro protein biochemistry and enzymology. Finally, the 3D structure of the endo-xyloglucanase BoGH5A, which generates short XyG oligosaccharides, was solved. This study unraveled all the details of the enzymatic pathways by which the most common dietary polysaccharides are digested in our gut.

Although XyG utilization loci (XyGULs) have been identified in only a few other gut-resident Bacteroidetes, including B. cellulosilyticus, B. uniformis, B. fluxus, Dysgonomonas mossii and D. gadei, most human beings harbor at least one of these Bacteroides XyGULs in their gut, suggesting their importance in human nutrition.

The importance of the gut microbiome goes far beyond an active role in food digestion. It also acts on intestinal function, promoting gut-associated lymphoid tissue maturation, tissue regeneration, gut motility, and morphogenesis of the vascular system surrounding the gut. It additionally affects many other physiopathological aspects, such as the nervous system and bone homeostasis. Not surprisingly, changes in the microbiota composition or a complete lack of a gut microbiota have been shown to affect metabolism, tissue homeostasis and behavior.

As of this release, manually reviewed B. ovatus XyGUL gene products are available in UniProtKB/Swiss-Prot. Let’s bet that they will be followed by many more proteins encoded by our other genome(s) in the near future.

UniProtKB news

Cross-references for isoform sequences

Some of the resources to which we link contain information that is specific to an isoform sequence. Where this is known, we now indicate the corresponding UniProtKB isoform sequence identifier in our cross-references, as described below. The first resources for which we provide such isoform-specific cross-references are Ensembl and UCSC.

Text format

The UniProtKB isoform sequence identifier is shown in square brackets at the end of the DR line as an optional field:

DR   ResourceAbbreviation; ResourceIdentifier(; AdditionalField)+. [IsoId]


DR   Ensembl; ENST00000281772; ENSP00000281772; ENSG00000144445. [A0AUZ9-1]
DR   Ensembl; ENST00000418791; ENSP00000405724; ENSG00000144445. [A0AUZ9-2]
DR   Ensembl; ENST00000452086; ENSP00000401408; ENSG00000144445. [A0AUZ9-3]
DR   Ensembl; ENST00000457374; ENSP00000393432; ENSG00000144445. [A0AUZ9-3]
DR   UCSC; uc002vds.3; human. [A0AUZ9-1]
DR   UCSC; uc002vdt.3; human. [A0AUZ9-2]
DR   UCSC; uc002vdx.1; human. [A0AUZ9-4]
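
As an illustration of how the optional isoform field might be read from the text format, here is a minimal Python sketch. The regular expression and the function name `parse_dr_line` are assumptions made for this example; they are derived only from the DR line layout shown above and are not part of any UniProt tool.

```python
import re

# Assumed pattern for the layout shown above:
#   DR   ResourceAbbreviation; ResourceIdentifier(; AdditionalField)+. [IsoId]
# The trailing "[IsoId]" part is optional.
DR_LINE = re.compile(
    r"^DR\s+(?P<resource>[^;]+);\s*(?P<fields>.*?)\."
    r"(?:\s*\[(?P<isoform>[^\]]+)\])?\s*$"
)


def parse_dr_line(line):
    """Split one DR line into resource, identifier fields and optional isoform ID."""
    match = DR_LINE.match(line)
    if match is None:
        raise ValueError(f"not a DR line: {line!r}")
    fields = [field.strip() for field in match.group("fields").split(";")]
    return {
        "resource": match.group("resource").strip(),
        "fields": fields,  # resource identifier followed by any additional fields
        "isoform": match.group("isoform"),  # None when no [IsoId] is present
    }


if __name__ == "__main__":
    example = "DR   UCSC; uc002vdx.1; human. [A0AUZ9-4]"
    print(parse_dr_line(example))
```

Because the isoform identifier is optional, code that consumes DR lines should treat a missing value as "not isoform-specific" rather than as an error.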

XML format

To show the UniProtKB isoform sequence identifier in dbReference elements, we added an optional molecule element to the dbReferenceType. For consistency, we also changed the type of the molecule element that is found in the commentType. The XSD has been changed as highlighted below:

    <xs:complexType name="commentType">
        ...
            <xs:documentation>Used in 'subcellular location' annotations.</xs:documentation>
            <!-- Changed: was <xs:element name="molecule" type="xs:string" minOccurs="0"/> -->
            <xs:element name="molecule" type="moleculeType" minOccurs="0"/>
            <xs:element name="subcellularLocation" type="subcellularLocationType" maxOccurs="unbounded"/>
        ...
    <xs:complexType name="dbReferenceType">
        ...
            <!-- New element: -->
            <xs:element name="molecule" type="moleculeType" minOccurs="0"/>
            <xs:element name="property" type="propertyType" minOccurs="0" maxOccurs="unbounded"/>
        ...
    <!-- New type: -->
    <xs:complexType name="moleculeType">
            <xs:documentation>Describes a molecule by name or unique identifier.</xs:documentation>
            <xs:extension base="xs:string">
                <xs:attribute name="id" type="xs:string" use="optional"/>
        ...


<dbReference type="Ensembl" id="ENST00000281772">
  <molecule id="A0AUZ9-1"/>
  <property type="protein sequence ID" value="ENSP00000281772"/>
  <property type="gene ID" value="ENSG00000144445"/>
</dbReference>
<dbReference type="Ensembl" id="ENST00000418791">
  <molecule id="A0AUZ9-2"/>
  <property type="protein sequence ID" value="ENSP00000405724"/>
  <property type="gene ID" value="ENSG00000144445"/>
</dbReference>
<dbReference type="Ensembl" id="ENST00000452086">
  <molecule id="A0AUZ9-3"/>
  <property type="protein sequence ID" value="ENSP00000401408"/>
  <property type="gene ID" value="ENSG00000144445"/>
</dbReference>
<dbReference type="Ensembl" id="ENST00000457374">
  <molecule id="A0AUZ9-3"/>
  <property type="protein sequence ID" value="ENSP00000393432"/>
  <property type="gene ID" value="ENSG00000144445"/>
</dbReference>
<dbReference type="UCSC" id="uc002vds.3">
  <molecule id="A0AUZ9-1"/>
  <property type="organism name" value="human"/>
</dbReference>
<dbReference type="UCSC" id="uc002vdt.3">
  <molecule id="A0AUZ9-2"/>
  <property type="organism name" value="human"/>
</dbReference>
<dbReference type="UCSC" id="uc002vdx.1">
  <molecule id="A0AUZ9-4"/>
  <property type="organism name" value="human"/>
</dbReference>
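
For readers who process the XML format, the following Python sketch shows one possible way to pick up the optional molecule element from dbReference elements using the standard library. The `<entry>` wrapper, the third cross-reference and the function name `isoform_cross_references` are hypothetical and exist only to make the example self-contained. Real UniProtKB XML files declare a default namespace, which this sketch omits; element lookups would need to be namespace-qualified when reading actual files.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the dbReference examples above, wrapped in a
# hypothetical <entry> root so that it parses as one well-formed document.
# The last dbReference is made up to show the case without a molecule element.
XML = """
<entry>
  <dbReference type="Ensembl" id="ENST00000281772">
    <molecule id="A0AUZ9-1"/>
    <property type="protein sequence ID" value="ENSP00000281772"/>
    <property type="gene ID" value="ENSG00000144445"/>
  </dbReference>
  <dbReference type="UCSC" id="uc002vdx.1">
    <molecule id="A0AUZ9-4"/>
    <property type="organism name" value="human"/>
  </dbReference>
  <dbReference type="Example" id="no-isoform"/>
</entry>
"""


def isoform_cross_references(xml_text):
    """Return (resource, identifier, isoform ID or None) for each dbReference."""
    root = ET.fromstring(xml_text)
    rows = []
    for ref in root.iter("dbReference"):
        molecule = ref.find("molecule")  # optional child (minOccurs="0")
        isoform = molecule.get("id") if molecule is not None else None
        rows.append((ref.get("type"), ref.get("id"), isoform))
    return rows


if __name__ == "__main__":
    for row in isoform_cross_references(XML):
        print(row)
```

Since the molecule element is declared with minOccurs="0" and its id attribute is optional, a consumer should be prepared for cross-references that carry no isoform identifier at all.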

Changes to the controlled vocabulary of human diseases

New diseases:

Modified diseases:

Changes to the controlled vocabulary for PTMs

New terms for the feature key ‘Cross-link’ (‘CROSSLNK’ in the flat file):

  • Isoaspartyl lysine isopeptide (Lys-Asp)
