Cross-references section

Last modified August 23, 2017

This section points to information that is related to an entry but held in data collections other than UniProtKB.

The databases to which UniProtKB is cross-referenced can be listed and searched in ‘Cross-referenced databases’. Each database is described by its name and abbreviation and a link to its web server is provided, as well as literature references where available.

Additional and complementary information can be found in the subsection Web resources.

List of categories of the databases cross-referenced in UniProtKB

The databases are grouped into categories to make them easier to browse and to show how they relate to UniProtKB and to each other:

  • 2D gel databases
  • 3D structure databases
  • Chemistry databases
  • Enzyme and pathway databases
  • Family and domain databases
  • Gene expression databases
  • Genome annotation databases
  • Miscellaneous databases
  • Ontologies
  • Organism-specific databases
  • Phylogenomic databases
  • Polymorphism and mutation databases
  • Proteomic databases
  • Protein-protein interaction databases
  • Protein family/group databases
  • Protocols and materials databases
  • PTM databases
  • Sequence databases

Cross-references to ‘Sequence databases’

If a sequence submitted to EMBL/GenBank/DDBJ presents discrepancies with the canonical sequence, these are documented in the ‘Sequence conflict’ subsection of the ‘Sequence’ section. However, if the discrepancies are severe, the underlying cross-reference to EMBL/GenBank/DDBJ is tagged.

Four different tags are used to mark problematic sequences submitted to the EMBL/GenBank/DDBJ databases:

  • ‘Frameshift’: discrepancies are due to the insertion or deletion of one or more nucleotides in the submitted nucleotide sequence.
    Example: O14467
  • ‘Different initiation’: discrepancies are due to an erroneous initiation codon choice in the submitted sequence.
    Example: O14467
  • ‘Different termination’: the termination codon of the submitted sequence differs from that of the displayed sequence.
    Example: P59263
  • ‘Sequence problems’: discrepancies are due to an erroneous gene model prediction, an erroneous ORF assignment, or a miscellaneous discrepancy.
    Example: Q7XR80, P76164, Q8TBF5

The discrepancies are described in more detail in the ‘Sequence Caution’ section.

If no coding sequence (CDS) has been annotated in a submitted nucleotide sequence (most frequently a genomic sequence), the corresponding cross-reference is tagged with the label ‘No translation available’.
Example: B2RU33
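
The tags described above can also be read programmatically from the text (flat-file) format of an entry. The following is a minimal sketch, not an official parser. It assumes that EMBL cross-references appear on `DR` lines of the form `DR   EMBL; <nucleotide accession>; <protein id>; <status>; <molecule type>.` and that the status tokens `ALT_FRAME`, `ALT_INIT`, `ALT_TERM`, `ALT_SEQ` and `NOT_ANNOTATED_CDS` correspond to the five tags. The function name `embl_xref_tag` and the placeholder accessions in the usage lines are hypothetical.

```python
from typing import Optional

# Assumed correspondence between flat-file status tokens and the
# tags shown on the website; check it against the user manual.
STATUS_TO_TAG = {
    "ALT_FRAME": "Frameshift",
    "ALT_INIT": "Different initiation",
    "ALT_TERM": "Different termination",
    "ALT_SEQ": "Sequence problems",
    "NOT_ANNOTATED_CDS": "No translation available",
}


def embl_xref_tag(dr_line: str) -> Optional[str]:
    """Return the tag of an EMBL DR line, or None if the line is
    untagged, malformed, or refers to another database."""
    if not dr_line.startswith("DR   "):
        return None
    # Drop the line code and the trailing period, then split the fields.
    body = dr_line[5:].rstrip().rstrip(".")
    fields = [field.strip() for field in body.split(";")]
    if len(fields) < 4 or fields[0] != "EMBL":
        return None
    # The fourth field holds the status token ("-" when untagged).
    return STATUS_TO_TAG.get(fields[3])


# Placeholder lines for illustration only; the accessions are not real.
tagged = "DR   EMBL; NUC_ACC; PROT_ID; ALT_FRAME; Genomic_DNA."
untagged = "DR   EMBL; NUC_ACC; PROT_ID; -; mRNA."
print(embl_xref_tag(tagged))
print(embl_xref_tag(untagged))
```

A lookup table keeps the mapping in one place, so it can be corrected easily if the status tokens differ from the ones assumed here.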
