Skip Header

UniProt release 9.1

Published November 14, 2006

Headlines

CD antigens: molecular markers of cell differentiation

The CD nomenclature was proposed and established in 1982 at the first International Workshop and Conference on Human Leukocyte Differentiation Antigens (HLDA). This system was intended for the classification of monoclonal antibodies (mAbs), generated in many laboratories around the world, against various cell surface molecules on leukocytes (white blood cells). The data were collated and analyzed by the statistical procedure of 'cluster analysis'. This analytical method identified clusters of antibodies with very similar patterns of binding to leukocytes at various stages of differentiation: hence the use of the abbreviation 'CD' for 'cluster of differentiation'. CD antibodies are used widely for research, differential diagnosis, and the monitoring and treatment of disease.

The HLDA workshops assign each CD on the basis of the reactivity of at least two mAbs to one human antigen; the provisional indicator 'w' (for example CDw293) is sometimes given to an imperfectly characterized cluster or to a cluster represented by only one mAb.

Gradually the use of the CD nomenclature has expanded to many other cell types such as endothelial and stromal cells. Therefore the 8th HLDA conference (HDLA8) decided in 2004 that the acronym HLDA would be replaced by HCDM for "Human Cell Differentiation Molecules".

All currently defined human CD antigens (a total of 361 in this release) are annotated and integrated in UniProtKB/Swiss-Prot. A CD antigen appears in an entry as a synonym of the protein name (e.g. CD305 antigen for Leukocyte-associated immunoglobulin-like receptor 1). The CD name is also propagated to all orthologous mammalian proteins, so that human CD antigens and their orthologs in other mammals can be retrieved easily.

UniProtKB News

Extension of the UniProtKB accession number format

Before release 9.1, UniProtKB accession numbers consisted of 6 alphanumerical characters in the following format:

1 2 3 4 5 6
[O,P,Q] [0-9] [A-Z, 0-9] [A-Z, 0-9] [A-Z, 0-9] [0-9]

Due to the large increase in the number of protein sequences in UniProtKB, we had to extend the existing accession number format by allowing the first character to be any of the 26 letters (instead of only O, P and Q). To avoid assigning accession numbers identical to those which have been used by the International Nucleotide Sequence Database, the extension in the first position goes along with a restriction in the third position which can only be a letter. The new format for UniProtKB accession numbers is therefore:

1 2 3 4 5 6
[A-N,R-Z] [0-9] [A-Z] [A-Z, 0-9] [A-Z, 0-9] [0-9]
[O,P,Q] [0-9] [A-Z, 0-9] [A-Z, 0-9] [A-Z, 0-9] [0-9]

Cross-references to DrugBank

Cross-references have been added to the DrugBank database. The DrugBank database is a unique bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. The database contains many drug entries including FDA-approved small molecule drugs, FDA-approved biotech (protein/peptide) drugs, nutraceuticals and experimental drugs. Additionally, protein (i.e. drug target) sequences are linked to these drug entries. Each DrugCard entry contains more than 80 data fields with half of the information being devoted to drug/chemical data and the other half devoted to drug target or protein data.

The DrugBank database is available at http://redpoll.pharmacy.ualberta.ca/drugbank/.

The format of the explicit links in the flat file is:

Resource abbreviation DrugBank
Resource identifier DrugBank accession number.
Optional information 1 Drug generic name.
Example
P08185:
DR   DrugBank; DB00240; Alclometasone.
DR   DrugBank; DB00394; Beclomethasone.
DR   DrugBank; DB01410; Ciclesonide.
DR   DrugBank; DB00663; Flumethasone Pivalate.
DR   DrugBank; DB00180; Flunisolide.
DR   DrugBank; DB00591; Fluocinolone Acetonide.
DR   DrugBank; DB01047; Fluocinonide.
DR   DrugBank; DB00324; Fluorometholone.
DR   DrugBank; DB00846; Flurandrenolide.
DR   DrugBank; DB00588; Fluticasone Propionate.
DR   DrugBank; DB00596; Halobetasol Propionate.
DR   DrugBank; DB00253; Medrysone.
DR   DrugBank; DB00648; Mitotane.
DR   DrugBank; DB01384; Paramethasone.
DR   DrugBank; DB00860; Prednisolone.
DR   DrugBank; DB00896; Rimexolone.
DR   DrugBank; DB00620; Triamcinolone.

Cross-references to euHCVdb

Cross-references have been added to the European Hepatitis C Virus database. The development of the European Hepatitis C Virus database (euHCVdb) started in 1999 as the French HCV Database. The euHCVdb is mainly oriented towards protein sequence, structure and function analyses and structural biology of HCV.

The European Hepatitis C Virus database is available at http://euhcvdb.ibcp.fr/euHCVdb/.

The format of the explicit links in the flat file is:

Resource abbreviation euHCVdb
Resource identifier EMBL Accession number.
Examples
P26664:
DR   euHCVdb; AF271632; -.
DR   euHCVdb; M62321; -.

P27953:
DR   euHCVdb; X53136; -.

Cross-references to dbSNP

Explicit links are present in the FT VARIANT lines of protein sequence entries of Hominidae to the Single Nucleotide Polymorphism database (dbSNP). We will prefix dbSNP identifiers in human FT VARIANT lines by "rs". NCBI/dbSNP has rs and ss numbers, but we only refer to SNPs with rs numbers.

Examples:

P08185: 
	FT   VARIANT     246    246       S -> A (in dbSNP:rs2228541).
	FT                                /FTId=VAR_024350.
P06307: 
	FT   VARIANT      32     32       G -> E (in dbSNP:rs11571848).
	FT                                /FTId=VAR_018818.
	FT   VARIANT      95     95       R -> W (in dbSNP:rs3774395).
	FT                                /FTId=VAR_024452.