UniProt release 14.5

Published November 25, 2008


The plastid: the most important organelle!

The world is full of plastids. Most of us know the green photosynthetic chloroplast which houses the machinery that fixes CO2 (with O2 as a "mere" by-product) and synthesizes sugars, lipids, amino acids, etc.; in short, the basis of our food chain. Found in plants and algae, chloroplasts are absolutely essential to life as we know it.

Plastids contain DNA; they are the remnants of a cyanobacterium that was engulfed by a eukaryotic heterotroph, which had previously engulfed an alphaproteobacterium that eventually became the mitochondrion. These are primary endosymbiotic events: the organism taken up by the host was not digested but survived in the cytoplasm, eventually transferring genes to the host nucleus and being, in effect, enslaved. Most of the products of these transferred genes are imported back into their respective organelles using transit peptides. Plastid genomes now contain between 28 and 250 protein-coding genes. The primary plastid endosymbiosis gave rise to 3 lineages: green algae, red algae and the glaucophytes.

Subsequent engulfment of green or red algae by other eukaryotes has given rise to secondary endosymbionts, which in some cases have been engulfed again, sometimes with plastid replacement, to give an array of tertiary endosymbionts. These secondary and tertiary events gave rise to (among others) cryptophytes, diatoms, heterokont algae and apicomplexa, organisms such as Plasmodium that are no longer photosynthetic. To further complicate matters, it was thought that there were only 2 primary endosymbiotic events; recent work on a thecate amoeba, Paulinella chromatophora, has however cast doubt on this assumption.

Due to their small size, plastid genomes are easily sequenced. A list of fully sequenced plastid genomes, their genes and the nomenclature of known plastid-encoded proteins can be found in our document plastid.txt.

In UniProtKB, we indicate whether a protein is encoded by plastid, mitochondrial or plasmid DNA in the 'Names and origin' section, 'Encoded on' subsection (OG line in the flat file). 6 categories have been created for plastids:

  • 'Plastid; Chloroplast' indicates that the organism is photosynthetic, whether its plastid arose from a primary, secondary or higher-order endosymbiotic event.
  • 'Plastid; Non-photosynthetic plastid' is used when the organism is from a photosynthetic lineage but genetically unable to photosynthesize, as happens with some parasitic plants (Epifagus virginiana, Aneura mirabilis), a parasitic "green" alga (Helicosporidium sp. subsp. Simulium jonesii) and a euglenoid (Astasia longa).
  • 'Plastid; Cyanelle' is used for the plastid of the glaucophyte algae. It has the remnants of a cell wall between its surrounding membranes.
  • 'Plastid; Apicoplast' is used for plastids from the non-photosynthetic apicomplexan parasites such as Plasmodium, Toxoplasma and Eimeria, which cause malaria, toxoplasmosis and coccidian diseases respectively. Although the plastid remnant has a reduced coding capacity, it is essential for cell survival and is interesting as a drug target.
  • 'Plastid; Chromatophore' is used for the plastid of the thecate amoeba Paulinella chromatophora, which has a very large endosymbiont genome (1.0 Mb, encoding almost 900 proteins).
  • 'Plastid' (without any qualifier) is used for some parasitic plants (mostly from the genus Cuscuta) which may be briefly photosynthetic when very young.
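As an illustration of how these six values might be recognized when reading flat-file entries, here is a minimal Python sketch. It assumes that an OG line consists of the 'OG' line code, some whitespace, and the value followed by a period (for example 'OG   Plastid; Chloroplast.'). The function name plastid_category is hypothetical and is not part of any UniProt tool.

```python
from typing import Optional

# The six plastid values listed above, as they appear in the 'Encoded on' subsection.
PLASTID_CATEGORIES = {
    "Plastid; Chloroplast",
    "Plastid; Non-photosynthetic plastid",
    "Plastid; Cyanelle",
    "Plastid; Apicoplast",
    "Plastid; Chromatophore",
    "Plastid",
}


def plastid_category(og_line: str) -> Optional[str]:
    """Return the plastid category named on an OG line, or None.

    Assumption: the line looks like 'OG   <value>.' (line code,
    whitespace, value, trailing period).
    """
    if not og_line.startswith("OG"):
        return None
    # Drop the line code, surrounding whitespace and the final period.
    value = og_line[2:].strip().rstrip(".")
    return value if value in PLASTID_CATEGORIES else None


print(plastid_category("OG   Plastid; Chloroplast."))
print(plastid_category("OG   Mitochondrion."))
```

Lines that name a mitochondrion or a plasmid simply yield None here, since only the plastid values are of interest.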

Currently, in UniProtKB/Swiss-Prot, there are close to 11'000 entries encoded by a plastid genome: 10'130 by chloroplasts, 145 by cyanelles, 142 by non-photosynthetic plastids, 18 by apicoplasts, 22 by chromatophores and 165 by unspecified types of plastids.

UniProtKB News

Changes concerning keywords

New keywords:

Modified keyword:

Deleted keyword:

  • Structural protein

Website News

New UniParc query field 'isoform'

The existing query field uniprot allows you to search UniParc for the canonical sequence of a UniProtKB entry, e.g. uniprot:P00750. With the new query field isoform, you can retrieve the UniParc record that corresponds to the sequence of a specific UniProtKB isoform, e.g. isoform:P00750-2, or all isoforms of a UniProtKB entry, e.g. isoform:P00750-*.

This can also be done with the website's toolbar:

  1. Select Search in: Sequence Archive (UniParc)
  2. Click on Fields » to open the query builder
  3. Select Field: UniProtKB isoform ID
  4. Type the identifier, e.g. P00750-2
  5. Click on Add & Search
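
For scripted use, the uniprot and isoform query strings described above could be assembled as in the following Python sketch. The helper names and the base URL are placeholders chosen for illustration (they are assumptions, not taken from this page); only the query syntax itself comes from the text.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: placeholder address, to be replaced by the real UniParc search URL.
UNIPARC_SEARCH_URL = "https://example.org/uniparc/"


def canonical_query(accession: str) -> str:
    """Query for the canonical sequence of a UniProtKB entry."""
    return f"uniprot:{accession}"


def isoform_query(accession: str, isoform: Optional[int] = None) -> str:
    """Query for one isoform (e.g. P00750-2) or, if none is given, all isoforms."""
    suffix = "*" if isoform is None else str(isoform)
    return f"isoform:{accession}-{suffix}"


def search_url(query: str) -> str:
    """Percent-encode the query and attach it to the (hypothetical) base URL."""
    return f"{UNIPARC_SEARCH_URL}?{urlencode({'query': query})}"


print(canonical_query("P00750"))
print(isoform_query("P00750", 2))
print(isoform_query("P00750"))
print(search_url(isoform_query("P00750", 2)))
```

Keeping query construction separate from URL construction means the same strings can also be pasted directly into the website's search box.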

Programmatic search for UniRef and UniParc identifiers in UniProtKB

The URLs to search for UniRef and UniParc identifiers in UniProtKB are going to change in the following way:

  • UniRef (cluster:*)
      Valid until release 14.6: e.g. cluster:UniRef50_Q8WZ42
      Valid from release 14.5: e.g. cluster:(UniRef50_Q8WZ42)
  • UniParc (sequence:*)
      Valid until release 14.6: e.g. sequence:UPI0000D7E631
      Valid from release 14.5: e.g. sequence:(UPI0000D7E631)

Please change your queries before release 14.6 by adding parentheses around the identifier.
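
If you keep stored query strings in scripts, a small rewrite step along the following lines might help with the change. This is only a sketch: the function name add_parentheses is hypothetical, and it assumes that an identifier contains no whitespace or parentheses.

```python
import re

# Matches 'cluster:' or 'sequence:' followed by an identifier that is
# not already wrapped in parentheses.
_OLD_STYLE = re.compile(r"\b(cluster|sequence):(?!\()([^\s()]+)")


def add_parentheses(query: str) -> str:
    """Rewrite cluster:ID and sequence:ID to cluster:(ID) and sequence:(ID)."""
    return _OLD_STYLE.sub(r"\1:(\2)", query)


print(add_parentheses("cluster:UniRef50_Q8WZ42"))
print(add_parentheses("sequence:UPI0000D7E631"))
print(add_parentheses("cluster:(UniRef50_Q8WZ42)"))
```

Queries that already use the new form are left unchanged, so the function can safely be applied more than once.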

The web interface for searching UniRef and UniParc identifiers in UniProtKB remains unchanged:

  1. Select Search in: Protein Knowledgebase (UniProtKB)
  2. Click on Fields » to open the query builder
  3. Select Field: UniRef ID (or UniParc ID)
  4. Type the identifier, e.g. UniRef50_Q8WZ42
  5. Click on Add & Search