UniProt release 12.6
Published December 4, 2007
Complete proteome for Arabidopsis thaliana in UniProtKB
Arabidopsis thaliana was the first plant to have its genome completely sequenced. A first round of annotation was performed in 2001 by the Arabidopsis Genome Initiative. The genome was later reannotated and is now maintained by The Arabidopsis Information Resource (TAIR), which assumes primary responsibility for Arabidopsis genome annotation.
As the genome sequencing was being completed, Swiss-Prot initiated the Plant Proteome Annotation Program (PPAP), whose main focus is the annotation of plant-specific proteins and protein families in Arabidopsis (and rice).
This ongoing program has so far produced more than 6'200 manually annotated Arabidopsis thaliana protein sequences in UniProtKB/Swiss-Prot. In addition, close to 44'000 Arabidopsis entries are available in UniProtKB/TrEMBL, some of which are redundant. As a result, the total number of Arabidopsis protein sequences in UniProtKB (roughly 50'000) is much higher than the current estimate of 27'029 protein-encoding genes (see the TAIR7 release of April 2007). To address this, a non-redundant set of Arabidopsis proteins, including nuclear, mitochondrial and chloroplastic proteins, has been created for this release, and the selected entries have been labelled with the keyword 'Complete proteome' to allow easy retrieval.
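As an illustration of keyword-based retrieval, the following minimal Python sketch filters entries in UniProtKB flat-file text by a keyword on their KW lines. It assumes the usual flat-file layout (a two-letter line code, keywords separated by semicolons and ended by a period, entries terminated by `//`). The function name `entries_with_keyword` and the sample entries and accession numbers are made up for the example; this is not an official UniProt tool.

```python
def entries_with_keyword(flat_text, keyword):
    """Return the primary accession of each entry whose KW lines list `keyword`."""
    hits = []
    # Entries in the flat file are terminated by a line containing only "//".
    for record in flat_text.split("\n//"):
        accession = None
        keywords = []
        for line in record.splitlines():
            code, value = line[:2], line[5:].strip()
            if code == "AC" and accession is None:
                # The first accession on the first AC line is the primary one.
                accession = value.split(";")[0].strip()
            elif code == "KW":
                # Keywords are separated by ";" and the list ends with ".".
                keywords.extend(
                    k.strip() for k in value.rstrip(".").split(";") if k.strip()
                )
        if accession and keyword in keywords:
            hits.append(accession)
    return hits


# Hypothetical sample entries (identifiers and accessions are invented).
SAMPLE = """\
ID   EXAMPLE1_ARATH          Reviewed;         100 AA.
AC   X00001;
KW   Chloroplast; Complete proteome; Transit peptide.
//
ID   EXAMPLE2_ARATH          Unreviewed;       120 AA.
AC   X00002; X00003;
KW   Membrane.
//
"""

print(entries_with_keyword(SAMPLE, "Complete proteome"))
```

Only the first sample entry carries the 'Complete proteome' keyword, so it is the only one selected. The same filter applied to a full Arabidopsis flat-file download would be expected to return the non-redundant set described above.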
Arabidopsis thaliana is the third 'green plant' (Viridiplantae) for which a complete non-redundant protein set has been created in UniProtKB. The other two are the unicellular green algae Ostreococcus tauri and Ostreococcus lucimarinus.
Changes concerning keywords (KW line)
- Interferon induction