UniProt release 13.4
Published May 20, 2008
Swiss-Prot in the Wonderland of protein names
Successful basic research demands many skills of scientists: not only creativity, but also precision, critical analysis of experimental results, willingness to reconsider the starting hypotheses, continuous controls, and days, nights and weekends of sometimes tedious work in the lab. So when proteins are eventually purified, genes are cloned and a nice story is wrapped around the data, one of the rewards is naming the proteins and genes. There lies the fun.
Telling names can be useful for remembering a function or a phenotype. Interaction of Drosophila Cleopatra mutants with the asp gene product is lethal. Indeed, Cleopatra, Ancient Egypt's queen, allegedly committed suicide by way of an asp bite. Groucho mutants have more bristles than the norm on their face, much like Groucho Marx. Ken and Barbie protein mutants lack external genitalia... In Arabidopsis thaliana, Superman mutants have extra stamens (male genitals) in their flowers, and fans of the famous cartoon will not be surprised to learn that Kryptonite protein suppresses the function of Superman.
Acronyms are another part of the naming game. You would expect the RING1 protein to have a specific 3D structure related to its name, a round one, for instance. Actually, RING stands for "Really Interesting New Gene". In the same vein, you would not expect POSH to be any ordinary protein, and yet all it contains are "Plenty Of SH3" domains! JAK1 kinase has two phosphate-transferring domains and was named after Janus, the Roman god of gates, usually depicted with two heads looking in opposite directions. However, JAK is also said to stand for "Just Another Kinase", one among the hundreds of essential kinases described so far. And last, but not least, the name of the Drosophila INDY protein refers to the movie "Monty Python and the Holy Grail", in which a living person about to be buried rightly protests: "I'm Not Dead Yet!". The name is hardly surprising, since mutations in this gene result in a near doubling of the average adult life-span. For more amazing protein/gene names, see the excellent website established by Mikael Niku and Mikko Taipale.
Scientific creativity can be somewhat hampered by economic interests. The name of the Pokemon oncogene, for instance, which stands for POK erythroid myeloid ontogenic factor, had to be withdrawn after the US branch of the Japanese video-game franchise Pokémon threatened researchers with legal action. The protein ended up with the far more sober, not to say boring, name of 'Zinc finger and BTB domain-containing protein 7A' (ZBTB7A).
Much ink has been spilled over the lack of standardization of protein names. Inconsistency among orthologs, family members and so on makes the systematic search through the literature a complicated task. UniProt provides a few guidelines for protein naming. Such a document should help to improve consistency, keeping a given protein's 'hypokeimenon', while not curbing creativity!
Cross-references to CGD
Cross-references have been added to the Candida Genome Database. CGD is a resource for genomic sequence data and gene and protein information for Candida albicans. CGD is based on the Saccharomyces Genome Database and is funded by the National Institute of Dental and Craniofacial Research at the US National Institutes of Health.
The Candida Genome Database is available at http://www.candidagenome.org/.
The format of the explicit links in the flat file is:
Resource identifier: CGD identifier.
Optional information 1: Gene name.
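As an illustration of how such a cross-reference might be read from the flat file, here is a minimal Python sketch. It assumes the usual layout of a UniProtKB DR line: the line code "DR" and three spaces, then the resource abbreviation, the resource identifier and the optional information fields, separated by semicolons and closed by a period. The function name parse_dr_line and the sample identifier and gene name are made up for the example and are not real CGD entries.

```python
def parse_dr_line(line):
    """Split a flat-file DR line into (resource, identifier, optional fields).

    Assumed layout: 'DR   RESOURCE; IDENTIFIER; OPTIONAL_1[; ...].'
    """
    if not line.startswith("DR   "):
        raise ValueError("not a DR line: %r" % line)

    # Drop the line code and the period that closes the line.
    body = line[5:].strip()
    if body.endswith("."):
        body = body[:-1]

    fields = [field.strip() for field in body.split(";")]
    if len(fields) < 2:
        raise ValueError("DR line needs a resource and an identifier")

    return fields[0], fields[1], fields[2:]


# Hypothetical sample line; identifier and gene name are placeholders.
sample = "DR   CGD; CAL0000000; GENE1."
resource, identifier, optional = parse_dr_line(sample)
print(resource, identifier, optional)
```

For a CGD cross-reference, the first optional field returned by this sketch would hold the gene name.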
Changes concerning keywords