UniProt Knowledgebase
Swiss-Prot Protein Knowledgebase
TrEMBL Protein Database

Forthcoming changes
Release 15.10 of 03-Nov-2009

Also read about recent changes, and recent and forthcoming changes for the XML version of the UniProt Knowledgebase.

Table of contents

Format change in the cross-references to HOGENOM and HOVERGEN
Introduction of the new event types 'Protein splicing' and 'Miscellaneous' in the comment line (CC) topic ALTERNATIVE PRODUCTS
Change of the comment line (CC) topic INTERACTION

Format change in the cross-references to HOGENOM and HOVERGEN

Not before: 15-Dec-2009

We are going to modify the cross-references to the HOGENOM and HOVERGEN databases: the primary identifier, which is currently a UniProtKB accession number, will be replaced by a HOGENOM or HOVERGEN identifier, respectively. The secondary identifier will remain a dash ('-').

Example:

Current format:

DR   HOGENOM; Q9D8H7; -.

DR   HOVERGEN; Q9D8H7; -.

New format:

DR   HOGENOM; HBG025762; -.

DR   HOVERGEN; HBG074595; -.
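
For tools that read these lines, the following minimal sketch shows one way the two formats might be told apart. It assumes that new identifiers always look like the two examples above (the prefix HBG followed by digits), which this announcement does not guarantee; the function name classify_hogenom_xref is hypothetical.

```python
import re

# Assumption: new-style identifiers look like the examples above (HBG + digits).
NEW_ID = re.compile(r"^HBG\d+$")
# Matches only DR lines that point to HOGENOM or HOVERGEN.
DR_LINE = re.compile(r"^DR\s+(HOGENOM|HOVERGEN);\s*([^;]+);\s*([^;]+?)\.$")


def classify_hogenom_xref(line):
    """Return (database, primary_id, style), or None for any other line."""
    m = DR_LINE.match(line.strip())
    if m is None:
        return None
    database, primary_id, _secondary = m.groups()
    style = "new" if NEW_ID.match(primary_id) else "old"
    return database, primary_id, style
```

Anything that is not a HOGENOM or HOVERGEN cross-reference is ignored, so the function can be applied to every DR line of an entry.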

Introduction of the new event types 'Protein splicing' and 'Miscellaneous' in the comment line (CC) topic ALTERNATIVE PRODUCTS

Not before: 15-Dec-2009

The comment line topic ALTERNATIVE PRODUCTS, together with the feature key VAR_SEQ, describes alternative protein sequences (isoforms) that are the result of alternative splicing, alternative initiation, alternative promoter usage and ribosomal frameshifting events.

We are going to broaden this topic with the new event type Protein splicing to describe protein sequences that arise by intein processing events. Note that other protein maturation events, such as hedgehog protein processing or other types of cleavage, will not be described by this event type.

We will also introduce the general event type Miscellaneous to describe uncommon molecular mechanisms, such as ribosome shunt, ribosome skipping (PMID:12522142) or ribosome termination-reinitiation (PMID:18056426) events.

Example with intein:

P17255:

Current format:

FT   INIT_MET      1      1       Removed.
FT   CHAIN         2    283       Vacuolar ATP synthase catalytic subunit
FT                                A, 1st part.
FT                                /FTId=PRO_0000002458.
FT   CHAIN       284    737       Endonuclease PI-SceI.
FT                                /FTId=PRO_0000002459.
FT   CHAIN       738   1071       Vacuolar ATP synthase catalytic subunit
FT                                A, 2nd part.
FT                                /FTId=PRO_0000002460.

New format:

CC   -!- ALTERNATIVE PRODUCTS:
CC       Event=Protein splicing; Named isoforms=3;
CC         Comment=This protein undergoes a protein self splicing that
CC         involves a post-translational excision of the intervening region
CC         (intein) followed by peptide ligation;
CC       Name=Intein-containing vacuolar ATP synthase catalytic subunit A;
CC         IsoId=P17255-1; Sequence=Displayed;
CC         Note=Unprocessed;
CC       Name=Vacuolar ATP synthase catalytic subunit A;
CC         IsoId=P17255-2; Sequence=VSP_000002;
CC         Note=Mature;
CC       Name=Endonuclease PI-SceI;
CC         IsoId=P17255-3; Sequence=VSP_000001, VSP_000003;
CC         Note=Intein;
..
FT   INIT_MET      1      1       Removed.
FT   CHAIN         2    737       Intein-containing vacuolar ATP synthase
FT                                catalytic subunit A.
FT   REGION      284    737       Endonuclease PI-SceI.
FT   VAR_SEQ       2    283       Missing (in isoform Endonuclease PI-SceI).
FT                                /FTId=VSP_000001.
FT   VAR_SEQ     284    737       Missing (in isoform Vacuolar ATP synthase
FT                                catalytic subunit A).
FT                                /FTId=VSP_000002.
FT   VAR_SEQ     738   1071       Missing (in isoform Endonuclease PI-SceI).
FT                                /FTId=VSP_000003.
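
To illustrate how the VAR_SEQ features in the new format relate each isoform to the displayed sequence, the following sketch removes the 'Missing' ranges from a placeholder sequence. The helper apply_missing is hypothetical, and a dummy string of 1071 residues stands in for the real P17255 sequence; only the coordinates are taken from the example above.

```python
def apply_missing(sequence, missing_ranges):
    """Delete 1-based, inclusive ranges from sequence and return the rest."""
    drop = set()
    for start, end in missing_ranges:
        drop.update(range(start, end + 1))
    return "".join(
        res for pos, res in enumerate(sequence, start=1) if pos not in drop
    )


# Placeholder for the 1071-residue displayed sequence (not the real residues).
displayed = "X" * 1071

# Coordinates copied from the VAR_SEQ lines above.
mature = apply_missing(displayed, [(284, 737)])             # VSP_000002
intein = apply_missing(displayed, [(2, 283), (738, 1071)])  # VSP_000001, VSP_000003

print(len(mature), len(intein))  # prints: 617 455
```

Read literally, the coordinates leave residue 1 in the intein isoform, since VSP_000001 starts at position 2.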

Example for ribosomal termination-reinitiation:

Q672H9:

Current format:

CC   -!- MISCELLANEOUS: Translated by a ribosomal termination-reinitiation
CC       process from the bicistronic mRNA encoding for VP1 and VP2.

New format:

CC   -!- ALTERNATIVE PRODUCTS:
CC       Event=Alternative initiation, Miscellaneous; Named isoforms=3;
CC       Name=Protein VP2; Synonyms=VP2;
CC         IsoId=Q672H9-1; Sequence=Displayed;
CC         Note=Produced by ribosomal termination-reinitiation at the end
CC         of VP1 ORF;
CC       Name=Uncharacterized protein VP3;
CC         IsoId=Q672I0-1; Sequence=External;
CC         Note=Produced by alternative initiation from the subgenomic RNA;
CC       Name=Subgenomic capsid protein; Synonyms=VP1;
CC         IsoId=Q672I1-2; Sequence=External;
CC         Note=Produced from the subgenomic RNA;
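
Because the Event field can list several event types, including the new ones, a reader of this topic has to split the field rather than compare it with a single value. The following is a minimal sketch under the assumption that the CC lines of one block have already been collected and that free text contains no semicolons; parse_alternative_products is a hypothetical helper that handles only the fields shown in the examples above.

```python
import re


def parse_alternative_products(cc_lines):
    """Return (events, isoforms) from the CC lines of one block."""
    # Strip the 'CC' line code (and the '-!-' marker), then join continuations.
    text = " ".join(
        re.sub(r"^CC\s+(-!-\s+)?", "", line).strip() for line in cc_lines
    )
    text = text.replace("ALTERNATIVE PRODUCTS:", "", 1)

    events, isoforms = [], []
    for field in text.split(";"):
        key, sep, value = field.strip().partition("=")
        if not sep:
            continue  # empty trailing piece
        if key == "Event":
            events = [e.strip() for e in value.split(",")]
        elif key == "Name":
            isoforms.append({"Name": value})  # a Name field starts an isoform
        elif isoforms:
            isoforms[-1][key] = value  # IsoId, Sequence, Note, Synonyms
    return events, isoforms
```

Fields that appear before the first Name, such as Named isoforms or the block-level Comment, are skipped in this sketch.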

Change of the comment line (CC) topic INTERACTION

Not before: 15-Dec-2009

The CC line topic INTERACTION conveys information about binary protein-protein interactions. A description of its current format is available in the UniProtKB User Manual. Currently, all interaction data is automatically derived from the IntAct database. In the future, we will start to add manually curated binary protein-protein interactions to this topic (these are currently described in the CC line topic SUBUNIT).

In order to represent isoform- and chain-specific interactions (e.g. for viral polyproteins) and to add interactor-specific comments (e.g. PTMs and binding regions), we are going to modify the format of the INTERACTION lines. Each binary interaction will be represented by a block of 3 to 4 lines:

Note: Variable values are represented in italics. Perl-style multipliers indicate whether a pattern (as delimited by parentheses) is optional (?), may occur 0 or more times (*), or 1 or more times (+). Alternative values are separated by a pipe symbol (|). Special characters are escaped by a backslash (\).

 CC   -!- INTERACTION:
(CC       Interact=status \(source|By similarity\)( \(Potential\))?;( Xref=xref;)?
(CC         Comment=free_text;)?
 CC         Protein1=name [id(:ft_id)?];( Note=free_text;)?
 CC         Protein2=name( [id(:ft_id)?])?;( Organism=tax_name [NCBI_TaxID=tax_id];)?( Note=free_text;)?)+
Where:

Examples:

CC   -!- INTERACTION:
CC       Interact=Yes (PubMed:11533489);
CC         Comment=HDAC3 mediates the deacetylation of RELA;
CC         Protein1=RELA [Q04206];
CC         Protein2=HDAC3 [O15379];
CC   -!- INTERACTION:
CC       Interact=Yes (Ref.4);
CC         Comment=Heterodimers are called polygalacturonase-1 (PG1);
CC         Protein1=GP1 [Q40161];
CC         Protein2=PG2 [P05117];
CC   -!- INTERACTION:
CC       Interact=Yes (PubMed:11501947) (Potential);
CC         Protein1=ABI3 [Q9P2A4];
CC         Protein2=ABI3BP [Q7Z7G0];
CC   -!- INTERACTION:
CC       Interact=Yes (By similarity);
CC         Protein1=GABRG2 [Q5REA1];
CC         Protein2=GABARAP;
Isoform-specific interaction:
CC   -!- INTERACTION:
CC       Interact=Yes (PubMed:10837489);
CC         Protein1=MCL1 [Q07820-1];
CC         Protein2=BAK1 [Q16611];
CC       Interact=Yes (PubMed:15901672, PubMed:17097560); Xref=IntAct:EBI-1003422,EBI-519866;
CC         Protein1=MCL1 [Q07820];
CC         Protein2=BAK1 [Q16611];
Negative isoform-specific interaction:
CC   -!- INTERACTION:
CC       Interact=Yes (PubMed:11418237); Xref=IntAct:EBI-375446,EBI-389883;
CC         Protein1=ABI1 [Q8IZP0];
CC         Protein2=NCK1 [P16333]; Note=SH3 1 domain;
CC       Interact=No (PubMed:12681507);
CC         Protein1=ABI1 [Q8IZP0-6];
CC         Protein2=NCK1 [P16333];
CC       Interact=Yes (By similarity);
CC         Protein1=ABI1 [Q8IZP0]; Note=N-terminus;
CC         Protein2=WASF1 [Q92558];
Chain-specific host-virus interaction:
CC   -!- INTERACTION:
CC       Interact=Yes (PubMed:11086025);
CC         Protein1=C1QR1 [Q9NPY3];
CC         Protein2=Core protein p21 [P27958:PRO_0000037566]; Organism=Hepatitis C virus genotype 1a (isolate H) [NCBI_TaxID=11108]; Note=See also other virus strains;
Chain-specific virus-host interaction:
CC   -!- INTERACTION:
CC       Interact=Yes (PubMed:11086025);
CC         Protein1=Core protein p21 [P27958:PRO_0000037566];
CC         Protein2=C1QR1 [Q9NPY3]; Organism=Homo sapiens [NCBI_TaxID=9606];
Heterologous interaction between Bos taurus and Homo sapiens proteins:
CC   -!- INTERACTION:
CC       Interact=Yes (PubMed:16470652); Xref=IntAct:EBI-907934,EBI-907894;
CC         Protein1=CNP [P06623];
CC         Protein2=CABP1 [Q9NZU7]; Organism=Homo sapiens [NCBI_TaxID=9606];
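
The block pattern given earlier in this section can be approximated with ordinary regular expressions. The sketch below is one possible reading of that pattern, not a reference parser: it assumes each field line is unwrapped as in the examples, it ignores the Organism and Note fields, and the names parse_interactions, INTERACT and PROTEIN are hypothetical.

```python
import re

# 'Interact=' line: status, source in parentheses, optional '(Potential)', optional Xref.
INTERACT = re.compile(
    r"Interact=(?P<status>\w+) \((?P<source>[^)]*)\)"
    r"(?P<potential> \(Potential\))?;"
    r"(?: Xref=(?P<xref>[^;]+);)?"
)
# 'Protein1='/'Protein2=' line: name, optional [id] or [id:ft_id].
PROTEIN = re.compile(
    r"(?P<key>Protein[12])=(?P<name>[^;\[]+?)"
    r"(?: \[(?P<id>[^\]:]+)(?::(?P<ft_id>[^\]]+))?\])?;"
)


def parse_interactions(cc_lines):
    """Return one dict per 'Interact=' block found in the given CC lines."""
    records = []
    for line in cc_lines:
        body = re.sub(r"^CC\s+", "", line).strip()
        if body.startswith("Interact="):
            m = INTERACT.match(body)
            if m is None:
                raise ValueError(f"unrecognised Interact line: {body!r}")
            records.append({
                "status": m["status"],
                "source": m["source"],
                "potential": m["potential"] is not None,
                "xref": m["xref"],
            })
        elif not records:
            continue  # e.g. the '-!- INTERACTION:' header line
        elif body.startswith("Comment="):
            records[-1]["comment"] = body[len("Comment="):].rstrip(";")
        else:
            m = PROTEIN.match(body)
            if m:
                records[-1][m["key"]] = {
                    "name": m["name"], "id": m["id"], "ft_id": m["ft_id"],
                }
    return records
```

With this reading, an isoform-specific interactor such as Q8IZP0-6 is returned as the id, and a chain-specific interactor such as P27958:PRO_0000037566 is split into id and ft_id.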