Last modified September 9, 2013
This section points to information that relates to an entry but is held in data collections other than UniProtKB.
The databases to which UniProtKB is cross-referenced can be listed and searched in ‘Cross-referenced databases’. Each database is described by its name and abbreviation, with a link to its web server and, where available, literature references.
List of categories of the databases cross-referenced in UniProtKB
The databases are grouped into categories to make them easier to browse and to show how they relate to UniProtKB and to each other:
- 2D gel databases
- 3D structure databases
- Enzyme and pathway databases
- Family and domain databases
- Gene expression databases
- Genome annotation databases
- Organism-specific databases
- Phylogenomic databases
- Polymorphism databases
- Proteomic databases
- Protein-protein interaction databases
- Protein family/group databases
- PTM databases
- Sequence databases
Cross-references to ‘Sequence databases’
If a sequence submitted to EMBL/GenBank/DDBJ presents discrepancies with the canonical sequence, these discrepancies are documented in the ‘Sequence conflict’ subsection of the ‘Sequence annotation (Features)’ section. However, if the discrepancies are severe, the underlying cross-reference to EMBL/GenBank/DDBJ is tagged.
Four different tags are used to mark problematic sequences submitted to the EMBL/GenBank/DDBJ databases:
- ‘Frameshift’: discrepancies are due to the insertion or deletion of one or more nucleotides in the submitted nucleotide sequence.
- ‘Different initiation’: discrepancies are due to an erroneous initiation codon choice in the submitted sequence.
- ‘Different termination’: the termination codon of the submitted sequence differs from that of the displayed sequence.
- ‘Sequence problems’: discrepancies are due to an erroneous gene model prediction, erroneous ORF assignment, miscellaneous discrepancy, etc.
Examples: Q7XR80, P76164, Q8TBF5
The discrepancies are described in more detail in the ‘Sequence Caution’ section.
If no CoDing Sequence (CDS) has been annotated in a submitted nucleotide sequence (most frequently a genomic sequence), the corresponding cross-reference is tagged with the label ‘No translation available’.
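The tags described above can be read programmatically from an entry's cross-reference lines. The following Python sketch is illustrative only. It assumes the flat-file layout `DR   EMBL; accession; protein_id; status; molecule_type.` and assumes that the status codes `ALT_FRAME`, `ALT_INIT`, `ALT_TERM`, `ALT_SEQ` and `NOT_ANNOTATED_CDS` correspond to the five labels listed in this section. The function name `parse_embl_dr_line` and the accession numbers in the usage example are hypothetical.

```python
from typing import NamedTuple, Optional

# Labels come from the text above. The status codes used as keys are an
# assumption about how the tags are encoded in flat-file DR lines.
STATUS_LABELS = {
    "ALT_FRAME": "Frameshift",
    "ALT_INIT": "Different initiation",
    "ALT_TERM": "Different termination",
    "ALT_SEQ": "Sequence problems",
    "NOT_ANNOTATED_CDS": "No translation available",
}


class EmblXref(NamedTuple):
    accession: str
    protein_id: str
    status: str
    molecule_type: str
    label: Optional[str]  # None when the cross-reference is not tagged


def parse_embl_dr_line(line: str) -> Optional[EmblXref]:
    """Parse one 'DR   EMBL; ...' line; return None for any other line."""
    if not line.startswith("DR   EMBL;"):
        return None
    # Drop the 'DR   ' prefix and the final full stop, then split on ';'.
    body = line[5:].rstrip()
    if body.endswith("."):
        body = body[:-1]
    fields = [field.strip() for field in body.split(";")]
    if len(fields) != 5:
        return None
    _, accession, protein_id, status, molecule_type = fields
    return EmblXref(accession, protein_id, status, molecule_type,
                    STATUS_LABELS.get(status))


if __name__ == "__main__":
    # Hypothetical lines, not taken from real entries.
    sample = [
        "DR   EMBL; X00001; CAA00001.1; ALT_FRAME; mRNA.",
        "DR   EMBL; X00002; -; NOT_ANNOTATED_CDS; Genomic_DNA.",
        "DR   EMBL; X00003; CAA00003.1; -; Genomic_DNA.",
    ]
    for raw in sample:
        xref = parse_embl_dr_line(raw)
        print(xref.accession, xref.label or "untagged")
```

Untagged cross-references (status `-` under the assumed layout) yield a label of `None`, so a caller can filter for problematic submissions by keeping only records whose label is set.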