UniProt release 2010_05
Published April 20, 2010
Nonsense-mediated mRNA decay: To be or not to be… integrated in UniProtKB
It has been known for over 30 years that, in yeast, nonsense mutations reduce mRNA levels and that the strength of the reduction depends on the position of the nonsense codon within the locus. This observation, followed by many others in a great variety of eukaryotic organisms, led to the concept of ‘Nonsense-mediated mRNA decay’ (NMD), ‘a surveillance mechanism that detects and degrades mRNAs with premature termination codons (PTCs), thereby preventing the production of faulty proteins’. The key question was what Mother Nature considers a ‘premature’ stop. For mammals, a rule was established stating that ‘if a termination codon is more than about 50 nucleotides upstream of the final exon, it is a PTC and the mRNA that harbors it will be degraded’ (see Nagy and Maquat, 1998). Although we know today that NMD is a much more sophisticated mechanism than previously anticipated (see reviews), the ‘50 nucleotide rule’ is still used to predict potential NMD targets and, on this basis, some databases deleted them from their collections. Since many PTCs are generated by alternative splicing (at least one third of the human alternatively spliced mRNAs contain PTCs), several alternatively spliced isoforms have disappeared from databases, victims of the ‘50 nucleotide rule’.
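The ‘50 nucleotide rule’ quoted above can be sketched in a few lines of code. This is only an illustration of the rule as stated, not the procedure used by any particular database. The function name, the 1-based transcript coordinates, and the choice to measure the distance from the end of the stop codon to the last exon-exon junction are all assumptions made for the example.

```python
def is_predicted_nmd_target(exon_lengths, stop_codon_end, threshold=50):
    """Apply the '50 nucleotide rule' to a spliced transcript (illustrative only).

    exon_lengths   -- exon lengths in nucleotides, in transcript order
    stop_codon_end -- 1-based transcript position of the last nucleotide
                      of the termination codon (assumed convention)
    threshold      -- distance in nucleotides; 50 follows the rule as quoted
    """
    # A transcript without introns has no exon-exon junction,
    # so the rule cannot flag its termination codon.
    if len(exon_lengths) < 2:
        return False

    # Position of the last nucleotide before the final exon,
    # i.e. the last exon-exon junction in transcript coordinates.
    last_junction = sum(exon_lengths[:-1])

    # Nucleotides between the stop codon and the last junction;
    # zero or negative means the stop codon reaches or lies in the final exon.
    distance = last_junction - stop_codon_end

    return distance > threshold


# Hypothetical transcript with three exons; the last junction is at position 350.
exons = [200, 150, 300]
print(is_predicted_nmd_target(exons, 250))  # stop codon 100 nt upstream of the junction
print(is_predicted_nmd_target(exons, 320))  # stop codon 30 nt upstream of the junction
print(is_predicted_nmd_target(exons, 500))  # stop codon in the final exon
```

A fixed cut-off of this kind ignores the regulatory factors described in the rest of this article, which is why a positive prediction is treated here as a hint rather than as a verdict.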
Eukaryotic cells detect PTCs during the first round of translation undergone by mRNAs freshly exported from the nucleus. During this ‘pioneer’ round of translation, if the ribosome terminates at a termination codon (TC) in the vicinity of the poly(A) tail, PABPC1, a poly(A)-binding protein, sends a signal that promotes proper termination of translation. This results in efficient reinitiation of the ribosome at the 5’ end of the mRNA, and the production of a stable mRNP. If the ribosome terminates at a TC that is too far away from the poly(A) tail for it to receive the PABPC1-mediated translation-termination-promoting signal, the UPF1 protein binds to the stalled ribosome instead, thereby marking this TC as premature. Subsequently, a PTC-specific protein complex forms around UPF1, promoting UPF1 phosphorylation and committing the mRNA to rapid degradation.
It is thought that the physical distance, rather than the number of nucleotides, between a TC and the poly(A) tail is a crucial determinant in defining a TC as premature (Eberle et al., 2008). This distance depends on the 3D structure of the mRNA 3’ UTR. This structure can be modified by altering (1) intramolecular base pairing, (2) interaction of the mRNA with RNA-binding proteins and (3) interactions between the involved proteins through post-translational modifications (PTMs). In other words, it can be regulated in a tissue-specific manner, during development, and by environmental cues.
In higher eukaryotes, an additional level of complexity exists which links PTC detection and mRNA splicing. During pre-mRNA processing, the spliceosome removes intron sequences and a set of proteins called the exon-junction complex (EJC) is deposited 20-24 nucleotides upstream of the sites of intron removal. EJCs located within the ORF are removed from the mRNA by elongating ribosomes, and only EJCs located downstream of the TC will still be present when the first ribosome terminates. In organisms producing a large number of PTC-containing mRNAs by extensive alternative pre-mRNA splicing, such as humans, the EJC may have evolved to facilitate efficient recognition and degradation of these transcripts. An EJC downstream of a TC functions as an NMD enhancer by shortening the time window between UPF1 binding and its phosphorylation, hence promoting mRNA degradation.
NMD rarely downregulates the expression of a transcript completely. More commonly, 10-30% of the PTC-containing transcripts survive and may allow the production of physiologically relevant levels of protein products (Neu-Yilik et al., 2004). This is why, in UniProtKB, we favour a conservative approach when dealing with protein isoforms predicted to be encoded by an NMD target mRNA. We do not delete them from the database, but rather tag them with the comment: ‘May be produced at very low levels due to a premature stop codon in the mRNA, leading to nonsense-mediated mRNA decay.’ For instance, in entry Q9HB09 (human Bcl-2-like protein 12), two isoforms are described, one of which has been predicted to be an NMD target by Hillman et al., 2004 (see also the ‘References’ section of the entry). In some cases, despite the presence of a PTC in the encoding mRNA, the isoform produced seems to be the predominant form, at least in some tissues (see human Gamma-aminobutyric acid type B receptor subunit 1 isoform 1E in entry Q9UBS5).
Currently in UniProtKB/Swiss-Prot, over 300 protein entries describe isoforms that could be produced at low levels due to NMD. In addition, 228 proteins from different species are directly involved in the NMD process itself and can be retrieved from UniProtKB with the keyword ‘Nonsense-mediated mRNA decay’.
Cross-references to UCD-2DPAGE
Cross-references have been added to the University College Dublin 2-DE Proteome Database (UCD-2DPAGE). The database HSC-2DPAGE, previously hosted at Harefield Hospital (and previously also cross-referenced from UniProtKB/Swiss-Prot), has been integrated into UCD-2DPAGE. UCD-2DPAGE currently contains data from Canis familiaris (dog), Homo sapiens (human), Mus musculus (mouse), Rattus norvegicus (rat) and Saccharomyces cerevisiae (baker’s yeast).
UCD-2DPAGE is available at http://proteomics-portal.ucd.ie:8082/cgi-bin/2d/2d.cgi.
The format of the explicit links in the flat file is:
Resource identifier: UCD-2DPAGE accession number (in most cases the primary UniProtKB accession number)
DR   UCD-2DPAGE; P02648; -.
O75112:
DR   UCD-2DPAGE; O75112; -.
DR   UCD-2DPAGE; Q9Y4Z5; -.
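Cross-reference lines of this form can be picked out of a flat-file entry with a short script. The sketch below is a minimal illustration, not an official parser: the function name and the sample entry text are hypothetical, and the pattern only assumes what the examples show, namely a ‘DR’ line code, the resource abbreviation, one identifier, and a dash in the final field.

```python
import re

# Matches one UCD-2DPAGE cross-reference line, e.g. "DR   UCD-2DPAGE; O75112; -."
# The amount of whitespace after the "DR" line code is deliberately left flexible.
UCD_2DPAGE_LINE = re.compile(r"^DR\s+UCD-2DPAGE;\s*([^;\s]+);\s*-\.\s*$")


def ucd_2dpage_identifiers(flat_file_text):
    """Return the UCD-2DPAGE identifiers found in a flat-file entry, in order."""
    identifiers = []
    for line in flat_file_text.splitlines():
        match = UCD_2DPAGE_LINE.match(line)
        if match:
            identifiers.append(match.group(1))
    return identifiers


# Hypothetical fragment of an entry; only the two UCD-2DPAGE lines
# are taken from the examples above.
sample_entry = """\
AC   O75112;
DR   UCD-2DPAGE; O75112; -.
DR   UCD-2DPAGE; Q9Y4Z5; -.
KW   Example keyword.
"""

print(ucd_2dpage_identifiers(sample_entry))
```

Lines belonging to other resources, or lines that do not end with the dash field, are simply ignored.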
Changes concerning cross-references to HSC-2DPAGE
Cross-references to HSC-2DPAGE have been removed.
Changes concerning keywords
New keyword:
Changes in subcellular location controlled vocabulary
New subcellular location:
Changes concerning the controlled vocabulary for PTMs
New terms for the feature key ‘Cross-link’ (‘CROSSLNK’ in the flat file):
- Glycyl serine ester (interchain with G-Cter in ubiquitin)
- Glycyl threonine ester (interchain with G-Cter in ubiquitin)
- 2-(S-cysteinyl)pyruvic acid O-phosphothioketal