Release 14.2
Published September 23, 2008
Headlines
Additional bibliography information in UniProtKB
As a comprehensive, high-quality resource of protein sequence and functional information, UniProtKB strives to provide thorough literature citations associated with protein sequences and their characterization. Currently about two thirds of the UniProtKB PubMed citations are found in UniProtKB/Swiss-Prot, as a result of active integration in the course of manual curation.
To keep up with the explosive growth of literature and to give our users access to additional publications, we decided to integrate literature from other annotated databases into UniProtKB. For this purpose we selected five external databases: Entrez Gene (GeneRIFs), SGD, MGI, GAD and PDB, and extracted citations that were mapped to UniProtKB entries. This additional bibliography is available from the 'References' section by clicking on 'Additional computationally mapped references'.
This procedure added about 283,000 PubMed citations to nearly 110,000 UniProtKB entries. 85% of these references did not exist previously in UniProtKB.
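The mapping step and the "share of new references" figure can be illustrated with a minimal Python sketch. This is not the UniProt pipeline: the function name, the input shapes (entry accession to PubMed ID sets, and per-source lists of accession/PubMed ID pairs) and the placeholder identifiers are all assumptions made for illustration, as is the reading that a reference counts as new when its PubMed ID appears in no existing entry.

```python
from collections import defaultdict


def map_additional_references(existing, external_sources):
    """Collect externally mapped citations per entry (hypothetical helper).

    existing: dict of entry accession -> set of PubMed IDs already cited.
    external_sources: dict of source name -> iterable of
        (entry accession, PubMed ID) pairs already mapped to entries.
    Returns (mapped, fraction_new), where fraction_new is the share of
    distinct mapped PubMed IDs not cited by any existing entry.
    """
    mapped = defaultdict(set)
    for pairs in external_sources.values():
        for accession, pmid in pairs:
            mapped[accession].add(pmid)

    known_pmids = set().union(*existing.values())
    mapped_pmids = set().union(*mapped.values())
    new_pmids = mapped_pmids - known_pmids

    fraction_new = len(new_pmids) / len(mapped_pmids) if mapped_pmids else 0.0
    return dict(mapped), fraction_new


# Placeholder data: accessions and PubMed IDs are invented for the example.
existing = {"ENTRY_A": {100, 200}}
external_sources = {
    "GeneRIF": [("ENTRY_A", 200), ("ENTRY_A", 300)],
    "SGD": [("ENTRY_B", 400), ("ENTRY_B", 300)],
}
mapped, fraction_new = map_additional_references(existing, external_sources)
```

In this toy input, two of the three distinct mapped PubMed IDs are absent from the existing citations, so `fraction_new` plays the role of the 85% figure quoted above.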
In the future, we plan to apply this pipeline to more databases that could be used as sources of protein bibliography, including model organism databases, such as FlyBase and WormBase. We believe this additional protein bibliography information will allow our users to better explore the existing knowledge of their proteins of interest.



