UniProt release 2010_06
Published May 18, 2010
UniProt and Ensembl
The Ensembl project was launched in 2000 as a joint project between the EBI and the Wellcome Trust Sanger Institute, some years before the draft human genome was completed. Even at that early stage, it was clear that manual annotation of 3 billion base pairs of sequence could not give researchers timely access to the latest data. The goal of Ensembl was therefore to annotate the genome automatically, integrate this annotation with other available biological data and make all of this publicly available. Since the launch, many more genomes have been added and the range of available data has expanded to include comparative genomics, variation and regulatory data.
A collaboration between UniProt and Ensembl was initiated in 2008 to contribute towards the goal of having the complete human proteome available in UniProtKB/Swiss-Prot. A pipeline was established to import those Ensembl sequences not yet in UniProtKB, and the import is updated with each Ensembl release. A quality assurance feedback loop ensures that the Ensembl predictions benefit from the manual review in UniProtKB. Since then, the scope of Ensembl has been extended to include manual annotation by the Human And Vertebrate Analysis aNd Annotation (Havana) group at the Sanger Institute, which adds further value to the predictions.
Ensembl and UniProt are pleased to announce that this collaboration has now been extended to Mus musculus and Rattus norvegicus and will shortly be extended to Gallus gallus and Bos taurus. The provision of a complete set of protein sequences to users is a priority for the UniProt Consortium, and this collaboration contributes significantly to this effort.
Changes concerning the controlled vocabulary for PTMs
New terms for the feature key ‘Cross-link’ (‘CROSSLNK’ in the flat file):
- Glycyl serine ester (Gly-Ser) (interchain with S-...)
- Glycyl threonine ester (Gly-Thr) (interchain with G-...)