Release 9.6
Published February 6, 2007
Headlines
One million comment lines in UniProtKB/Swiss-Prot!
Annotation is the focal point of our effort to maintain and develop UniProtKB/Swiss-Prot. Many of our manual annotation is found in the comment lines, which aim to provide a summary of what is known about a protein. There are 27 different types of comment line, which are arranged according to what we designate as 'topics'.
Recently, we reached a peak of 1 million CC topic lines. About 97 % of the UniProtKB/Swiss-Prot entries contains at least one CC topic line and, currently, there is an average of 4 different CC topic lines per entry.
Comment lines are mainly free text, but we have already set up a standardised format as well as the use of controlled vocabularies for several topics (ALTERNATIVE PRODUCTS, BIOPHYSICOCHEMICAL PROPERTIES, CATALYTIC ACTIVITY, DISEASE, INTERACTION, MASS SPECTROMETRY, PATHWAY, RNA EDITING, SIMILARITY, TOXIC DOSE...). Standardisation for two further topics - SUBCELLULAR LOCATION and CAUTION - are also on their way (more: Forthcoming changes)
The most represented CC topics in UniProtKB/Swiss-Prot are:
- SIMILARITY: describes the similaritie(s) (at the sequence or structure level) of a protein with other proteins or families of proteins. It can be found in 91 % of the entries. This topic can however be considered as an 'outsider' in this ranking as its content is mostly based on protein sequence analysis (family and domain scan process or similarity searches(BLAST)), and not really on a literature-based annotation.
- FUNCTION: gives a general description of the function(s) of a protein. It is found in 68 % of the entries. Note that over 40 % of this functional data has been experimentally proved.
- SUBCELLULAR LOCATION: describes the subcellular location of the mature protein; this information is found in 54 % of the entries (39 % experimentally proven).
- SUBUNIT: describes the quaternary structure of a protein and any kind of interactions with other proteins or protein complexes (found in 37 % of the entries).
- CATALYTIC ACTIVITY: describes the reaction(s) catalyzed by an enzyme (found in 35 % of the entries). There are at least as many CC CATALYTIC ACTIVITY line(s) as complete EC number(s) in the 'Protein name' section.
Such a distribution reflects the type of experimental biological data which is available for a protein sequence nowadays in the scientific literature.
The data found in UniProtKB/Swiss-Prot, are continuously updated and - since annotators are constantly improving their skills in literature-based information retrieval - the 'depth' of manual annotation is always increasing. This is highlighted by the fact that we have increased the average number of CC topics per entry from 3.5 to 4 since March 2004.
UniProtKB News
Cross-references to Cornea-2DPAGE
Cross-references have been added to the Human Cornea 2-DE database, a two-dimensional polyacrylamide gel electrophoresis federated database available at the Aarhus University (Denmark).
The Cornea-2DPAGE is available at http://www.cornea-proteomics.com/.
The format for the explicit links is:
| Data bank identifier | Cornea-2DPAGE |
|---|---|
| Primary identifier | The primary identifier consists of an UniProtKB/Swiss-Prot accession number. |
| Secondary identifier | The secondary identifier consists of the organism common name. |
| Example |
P31946: DR Cornea-2DPAGE; P31946; HUMAN. |
Cross-references to DOSAC-COBS-2DPAGE
Cross-references have been added to the DOSAC-COBS 2D Page, a two-dimensional polyacrylamide gel electrophoresis federated database available at the DOSAC and COBS genome and proteome laboratory (La Maddalena, Italy).
The DOSAC-COBS-2DPAGE is available at http://www.dosac.unipa.it/2d/.
The format for the explicit links is:
| Data bank identifier | DOSAC-COBS-2DPAGE |
|---|---|
| Primary identifier | The primary identifier consists of an UniProtKB/Swiss-Prot accession number. |
| Secondary identifier | The secondary identifier consists of the organism common name. |
| Example |
P15531: DR DOSAC-COBS-2DPAGE; P15531; HUMAN. |
Cross-references to REPRODUCTION-2DPAGE
Cross-references have been added to the REPRODUCTION-2DPAGE, a two-dimensional polyacrylamide gel electrophoresis database available at the laboratory of Reproductive Medicine, Nanjing Medical University, P. R. China.
The REPRODUCTION-2DPAGE is available at http://reprod.njmu.edu.cn/cgi-bin/2d/2d.cgi.
The format for the explicit links is:
| Data bank identifier | REPRODUCTION-2DPAGE |
|---|---|
| Primary identifier | The primary identifier consists of an UniProtKB/Swiss-Prot accession number. |
| Secondary identifier | The secondary identifier consists of the organism common name. |
| Example |
P32119: DR REPRODUCTION-2DPAGE; P32119; HUMAN. |



