UniProt release 2015_05

Published April 29, 2015

Headline

A never-ending race between evolution and genomic integrity

Primate evolution has been accompanied by several waves of retrotransposon insertions. Today, about 50% of our genome is composed of endogenous retroelements (EREs). Although many of them have lost their ability to transpose, some remain quite active. For instance, among the 500,000 copies of long interspersed element-1 (LINE1 or L1) present in the human genome, about 100 are retrotransposition-competent, and over 40 of them are highly active. Other EREs, such as short interspersed nuclear elements (SINEs), including Alu repeats, and SINE-VNTR-Alu (SVA), a composite hominid-restricted ERE, also actively move in the genome. It is currently estimated that new, non-parental L1 integrations occur in nearly 1 in 100 births and that roughly 1 in 20 newborns carries a new Alu retrotransposon somewhere in its DNA.

Having DNA jump around our genome can be quite harmful, and our cells work hard to repress EREs. Transcriptional silencing is controlled by TRIM28 and KRAB domain-containing zinc finger proteins (KRAB-ZNFs). TRIM28 forms a repressive complex (KAP1 complex) by interacting with CHD3, a subunit of the nucleosome remodeling and deacetylation (NuRD) complex, and SETDB1, which specifically methylates histone H3 at ‘Lys-9’, inducing heterochromatinization. KRAB-ZNFs bind DNA and recruit the KAP1 complex to target sites.

KRAB-ZNF genes are among the fastest-growing gene families in primates, possibly to limit the activity of newly emerged ERE classes. This hypothesis has gained support from an elegant study recently published in Nature. In this article, Jacobs et al. used a heterologous cell system in which murine embryonic stem cells harbored a copy of human chromosome 11, which contains a number of EREs, including SVA and the L1 subfamily L1PA. In this cellular environment, the primate-specific EREs were derepressed. Individual overexpression of highly expressed human KRAB-ZNFs, confirmed by reporter gene assays, allowed the identification of genes involved in the repression of specific ERE (sub)families: ZNF91 and ZNF93, which acted on SVA and L1PA4, respectively. The authors then traced back the phylogenetic history of these genes in the primate lineage and analyzed the parallel evolution of their target EREs. They showed that a new wave of L1PA insertions in great ape genomes was made possible by the deletion of a 129-bp element in L1PA3, which destroyed the ZNF93-binding site. This could be interpreted as an ERE response to a series of structural changes in ZNF93 that occurred shortly before and improved host repression of L1PA activity.

In conclusion, the expansion of a new ERE drives the evolution of a host repressor which leads to a subsequent change in ERE to escape repression, and so on. It is a never-ending race of our genome with itself, which leads inexorably to greater and greater complexity.

As of this release, updated human ZNF91 and ZNF93 entries are available in UniProtKB/Swiss-Prot.

UniProtKB news

Removal of IPI species proteome data sets from FTP site

Since the closure of IPI in 2011, UniProt has provided proteome data sets for IPI species on its FTP site. In UniProt release 2015_03, we started to provide new data sets for reference proteomes, which also cover the IPI species, and we have now removed the old ‘proteomes’ FTP directory, which contained data only for the IPI species.

UniProtKB XSD change for evidence attribution

We have made the following changes to the UniProtKB XSD to allow a more fine-grained attribution of evidence to the parts of comment annotations that contain “free-text” descriptions:

  • The cardinality of all existing text elements was changed from maxOccurs="1" to maxOccurs="unbounded".
  • The phDependence, redoxPotential and temperatureDependence child elements of the bpcCommentGroup now have a sequence of text child elements.
  • The note child element of the isoformType was replaced by a sequence of text child elements.

The XSD changes are shown below:

    <xs:complexType name="commentType">
        ...
            <xs:element name="text" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
        ...
    <xs:group name="bpcCommentGroup">
       ...
             <xs:element name="absorption" minOccurs="0">
                ...
                        <xs:element name="text" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                ...
            <xs:element name="kinetics" minOccurs="0">
                ...
                        <xs:element name="text" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                ...

            <!-- The following 3 elements will in future each have a sequence of <text> child elements:
            <xs:element name="phDependence" type="evidencedStringType" minOccurs="0"/>
            <xs:element name="redoxPotential" type="evidencedStringType" minOccurs="0"/>
            <xs:element name="temperatureDependence" type="evidencedStringType" minOccurs="0"/>
            -->
            <xs:element name="phDependence" minOccurs="0">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="text" type="evidencedStringType" maxOccurs="unbounded"/>
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
            <xs:element name="redoxPotential" minOccurs="0">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="text" type="evidencedStringType" maxOccurs="unbounded"/>
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
            <xs:element name="temperatureDependence" minOccurs="0">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="text" type="evidencedStringType" maxOccurs="unbounded"/>
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        ...
    <xs:complexType name="isoformType">
        ...
            <!-- The <note> element will be replaced by a sequence of <text> elements:
            <xs:element name="note" minOccurs="0">
                <xs:complexType>
                    <xs:simpleContent>
                        <xs:extension base="xs:string">
                            <xs:attribute name="evidence" type="intListType" use="optional"/>
                        </xs:extension>
                    </xs:simpleContent>
                </xs:complexType>
            </xs:element>
            -->
            <xs:element name="text" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
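
For consumers of the XML, the practical effect is that code reading a single text child now has to iterate over all of them. The following is a minimal sketch of that, not an official parser. The sample entry fragment is invented for illustration, the helper name evidenced_texts is hypothetical, and the sketch assumes that the evidence attribute of each text element is a whitespace-separated list of integers (as intListType suggests) and that elements live in the http://uniprot.org/uniprot namespace.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for UniProtKB XML entries.
NS = {"up": "http://uniprot.org/uniprot"}

# Invented sample fragment: a phDependence element with two <text> children,
# each carrying its own (hypothetical) evidence identifier.
SAMPLE = """\
<entry xmlns="http://uniprot.org/uniprot">
  <comment type="biophysicochemical properties">
    <phDependence>
      <text evidence="1">Optimum pH is 7.5.</text>
      <text evidence="2">Active from pH 6 to 9.</text>
    </phDependence>
  </comment>
</entry>
"""


def evidenced_texts(parent):
    """Return (text, [evidence ids]) for every <text> child of parent."""
    result = []
    for node in parent.findall("up:text", NS):
        # A missing evidence attribute yields an empty list.
        ids = [int(i) for i in node.get("evidence", "").split()]
        result.append((node.text or "", ids))
    return result


root = ET.fromstring(SAMPLE)
ph = root.find("up:comment/up:phDependence", NS)
texts = evidenced_texts(ph)
for text, ids in texts:
    print(ids, text)
```

The same helper could be applied to a comment or isoform element, since after this change those also carry a sequence of text children rather than a single text or note element.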

Cross-references to BioMuta

Cross-references have been added to BioMuta, a curated single-nucleotide variation and disease association database.

BioMuta is available at https://hive.biochemistry.gwu.edu/tools/biomuta/.

The format of the explicit links is:

Resource abbreviation: BioMuta
Resource identifier: Gene name

Example: P02787

Text format

Example: P02787

DR   BioMuta; TF; -.

XML format

Example: P02787

<dbReference type="BioMuta" id="TF"/>
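
As a rough illustration of how the two representations above relate, the sketch below extracts the resource abbreviation and identifier from both the text and the XML example. The helper name parse_dr_line is hypothetical, and splitting on semicolons is a simplifying assumption that fits the BioMuta line shown here but may not hold for every DR line.

```python
import xml.etree.ElementTree as ET


def parse_dr_line(line):
    """Split a flat-file DR line into (resource, [remaining fields]).

    Simplified: assumes fields are separated by ';' and the line ends with '.'.
    """
    if not line.startswith("DR   "):
        raise ValueError("not a DR line: %r" % line)
    body = line[5:].rstrip()
    if body.endswith("."):
        body = body[:-1]  # drop only the terminating period
    fields = [field.strip() for field in body.split(";")]
    return fields[0], fields[1:]


# Text format example from P02787.
resource, fields = parse_dr_line("DR   BioMuta; TF; -.")
print(resource, fields)

# XML format example from P02787.
ref = ET.fromstring('<dbReference type="BioMuta" id="TF"/>')
print(ref.get("type"), ref.get("id"))
```

In both cases the identifier is the gene name (TF for P02787); the "-" in the text format is an empty optional field.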

Changes to the controlled vocabulary of human diseases

New diseases:

Modified diseases:

Changes to the controlled vocabulary for PTMs

New term for the feature key ‘Lipidation’ (‘LIPID’ in the flat file):

  • O-palmitoleyl serine