Skip Header

You are using a version of Internet Explorer that may not display all features of this website. Please upgrade to a modern browser.

UniProt release 2012_08

Published September 5, 2012


Prokaryotes do it too: CRISPR, an RNA-based adaptive immune system in UniProt

Like all other cellular organisms, bacteria and archaea are constantly bombarded by viruses, and unlike eukaryotes, many are also susceptible to infective plasmids. While we have known about defenses such as restriction-modification systems, blockage of absorption and/or DNA injection and abortive infection for some time, new ways in which bacteria and archaea defend themselves against these infective agents have been found more recently. One of these is the clustered regularly interspaced short palindromic repeat (CRISPR) sequences. CRISPR is an RNA-based adaptive immune system, which degrades invading genetic material. The system is mechanistically different from eukaryotic RNA interference (RNAi) and the proteins involved in prokaryotes are not homologous to those in eukaryotes (review).

CRISPRs are repetitive loci on the genome consisting of unique sequences 20-50 bases long (the spacer sequences) interspaced with repeated sequences of about the same length. Examination of the spacer sequences has shown that some are identical to viral and plasmid sequences; they are thought to serve as a “memory” of a previous infection. Bacteria and archaea can have from 0 to 18 CRISPR loci, with between 2 and 249 repeat-spacer units. While many pathogens have CRISPR loci, obligate parasites do not. The CRISPR loci are transcribed and processed to give short CRISPR-derived RNA (crRNA) complementary to a previously-encountered infective agent. It is this crRNA that is at the heart of adaptive prokaryotic immunity.

There are a large number of proteins associated with CRISPR loci, the operon-encoded CRISPR-associated or Cas proteins. The Cas proteins present in each locus have allowed the definition of 3 major CRISPR-Cas systems with further division into a number of subtypes. The number of subtypes will probably continue to increase as more prokaryotic genomes are fully sequenced. Many of these proteins are predicted to be nucleases, helicases and/or RNA binding proteins as is to be expected given the function of CRISPR.

There are 3 stages in CRISPR-Cas mediated immunity:

Stage 1, adaptation or acquisition, is the least well characterized. A short piece of DNA homologous to an invading agent is integrated into the 5’ end of the CRISPR loci. This requires the metal-dependent Cas1 endoribonuclease, the only Cas protein found in all organisms with CRISPR loci, although almost all organisms also encode Cas2, another metal-dependent endoribonuclease which is also thought to be involved in adaptation.

Stage 2, expression or crRNA biogenesis, requires transcription and processing of the CRISPR loci to produce the crRNA. Type I CRISPR systems use one of the related, metal-independent Cas6, Cas6e or Cas6f endoribonucleases to process the precursor, while type III systems use endogenous RNase III to generate the crRNA. It is not yet known which protein produces crRNA in type II systems.

Stage 3, interference, is the destruction of the target (be it virus or plasmid) and is performed by a complex of crRNA and proteins. While it is generally thought to recognize invading DNA, the type III-B CRISPR system of Pyrococcus furiosis cleaves target RNA.

While CRISPR-Cas systems can now be assumed to be involved in adaptive immunity, there are tantalizing hints that they may perform other functions as well. In Pseudomonas aeruginosa UCBPP-PA14, the type I CRISPR system does not confer resistance to phages DMS3 or MP22, but is required for DMS3-dependent inhibition of biofilm formation and possibly motility, while in Myxococcus xanthus, a CRISPR system is involved in the regulation of fruiting body development.

We have recently annotated and updated characterized Cas proteins in UniProtKB/Swiss-Prot, although the field moves so quickly that it is impossible to be fully up-to-date with all the latest research. All manually annotated CRISPR-associated protein entries can be retrieved from UniProtKB/Swiss-Prot using the query term ‘CRISPR’ in ‘Protein name’.

UniProtKB news

Cross-references to GenomeRNAi

Cross-references have been added to GenomeRNAi, a database containing phenotypes from RNA interference (RNAi) screens in Drosophila and Homo sapiens.

GenomeRNAi is available at

The format of the explicit links in the flat file is:

Resource abbreviation GenomeRNAi
Resource identifier GenomeRNAi identifier
Example Q9BXP5:
DR   GenomeRNAi; 51593; -.

Show all the entries having a cross-reference to GenomeRNAi

Cross-references to UniPathway

Cross-references have been added to UniPathway, a fully manually curated resource for the representation and annotation of metabolic pathways.

UniPathway provides explicit representations of enzyme-catalyzed and spontaneous chemical reactions, as well as a hierarchical representation of metabolic pathways. All of the pathway data in UniPathway has been extensively cross-linked to existing pathway resources such as KEGG and MetaCyc, as well as sequence resources such as UniProtKB, for which UniPathway provides a controlled vocabulary for pathway annotation.

The format of the explicit links in the flat file is:

Resource abbreviation UniPathway
Resource identifier UniPathway pathway ID (UPA)
Optional information UniPathway enzymatic reaction ID (UER)
Examples Q8LL69:
DR   UniPathway; UPA00842; -.
DR   UniPathway; UPA00842; UER00808.

Show all the entries having a cross-reference to UniPathway

Changes to keywords

New keywords: