SAAS (Statistical Automatic Annotation System)

Last modified October 16, 2015

UniProt’s Automatic Annotation pipeline enhances the unreviewed records in UniProtKB by enriching them with automatic classification and annotation.

The Statistical Automatic Annotation System (SAAS) is one of the contributors to this pipeline. It automatically generates rules for functional annotation from expertly annotated entries in UniProtKB/Swiss-Prot using the C4.5 decision tree algorithm. This machine learning algorithm finds the most concise rule for an annotation based on three properties: sequence length, InterPro group membership and taxonomy. SAAS employs a data exclusion set that filters out data not suitable for computational annotation (such as specific biophysical or chemical properties) and generates human-readable rules for each release.
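C4.5 chooses which attribute to split on by comparing gain ratios: the information gain of a split divided by its split information. The following is a minimal sketch of that criterion for categorical attributes only, not the SAAS implementation. The toy entries, the field names and the placeholder values such as IPR_A are assumptions made for illustration and are not real InterPro identifiers. A full C4.5 implementation also handles numeric attributes such as sequence length with thresholds and prunes the resulting tree.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by the split information."""
    total = len(rows)
    base = entropy([row[target] for row in rows])
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy([row[attribute] for row in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return (base - remainder) / split_info


# Hypothetical training entries (placeholder values, not real UniProtKB data).
entries = [
    {"interpro": "IPR_A", "taxon": "Bacteria", "annotation": "Cytoplasm"},
    {"interpro": "IPR_A", "taxon": "Eukaryota", "annotation": "Cytoplasm"},
    {"interpro": "IPR_B", "taxon": "Bacteria", "annotation": "Membrane"},
    {"interpro": "IPR_B", "taxon": "Eukaryota", "annotation": "Membrane"},
]

# Pick the attribute with the highest gain ratio as the first split.
best = max(["interpro", "taxon"], key=lambda a: gain_ratio(entries, a, "annotation"))
```

In this toy data the InterPro attribute separates the two annotations perfectly while the taxon attribute carries no information, so the InterPro attribute would be chosen for the first split.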

SAAS rules can annotate protein properties such as function, catalytic activity, pathway membership, and subcellular location, but protein names and feature predictions are currently excluded. Generating rules on the fly in this way allows rules to evolve along with the content of UniProtKB with little or no manual intervention. It also provides a constant supply of potential ‘seed rules’ that curators can develop further into UniRule rules.

SAAS-based evidence for UniProtKB annotation (example: Q87644)

UniProtKB entries contain evidence tags that describe the provenance of a given annotation and provide links to a reference where applicable. When an annotation is added to an entry based on a SAAS rule, the evidence tag indicates this.

Clicking on the tag shows a link to the relevant SAAS annotation rule.

Searching SAAS rules

The SAAS dataset is available from the UniProt website. To find rules of interest, click on the dropdown next to the search box and select ‘SAAS’, then enter a search keyword or rule ID. You can also use the advanced search to build your query.

Exploring the SAAS rule pages

Example

A SAAS rule page contains the unique SAAS ID, a link to the UniProtKB entries annotated by the rule and the full rule with its conditions and annotations. A rule consists of a set of conditions and corresponding annotations that apply to a protein entry if the conditions are true.

Conditions are listed on the left-hand side of the rule page and annotations are on the right-hand side. If a condition holds true, the corresponding annotation is applied. A SAAS rule only ever applies one annotation but can have multiple condition sets that lead to this annotation. Clicking on the conditions highlights the annotation and vice versa.
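The relationship between condition sets and the single annotation can be sketched as a small data structure. This is an illustrative model only: the class names, field names, rule ID and placeholder values below are hypothetical and do not reflect the actual SAAS rule format. It assumes that every condition within a set must hold, and that any one matching set is enough to apply the annotation.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical view of a protein entry with the three properties SAAS uses."""
    length: int
    interpro: frozenset  # InterPro groups the entry belongs to
    taxonomy: tuple      # lineage, from broad to specific


@dataclass
class ConditionSet:
    """All conditions in one set must hold for the set to match."""
    required_interpro: frozenset
    required_taxon: str
    min_length: int = 0
    max_length: int = 10**9  # assumed upper bound meaning "no limit"

    def matches(self, entry):
        return (
            self.required_interpro <= entry.interpro
            and self.required_taxon in entry.taxonomy
            and self.min_length <= entry.length <= self.max_length
        )


@dataclass
class Rule:
    """One annotation, reachable through any of several condition sets."""
    rule_id: str
    annotation: str
    condition_sets: list

    def apply(self, entry):
        # Any single matching condition set is enough to apply the annotation.
        if any(cs.matches(entry) for cs in self.condition_sets):
            return self.annotation
        return None


# Placeholder rule and entries (not real SAAS or UniProtKB data).
rule = Rule(
    rule_id="SAAS_EXAMPLE",
    annotation="Subcellular location: Cytoplasm",
    condition_sets=[
        ConditionSet(frozenset({"IPR_A"}), "Bacteria"),
        ConditionSet(frozenset({"IPR_A", "IPR_B"}), "Eukaryota", min_length=100),
    ],
)

bacterial = Entry(250, frozenset({"IPR_A"}), ("Bacteria", "Proteobacteria"))
short_eukaryotic = Entry(50, frozenset({"IPR_A", "IPR_B"}), ("Eukaryota", "Metazoa"))

matched = rule.apply(bacterial)           # first condition set matches
unmatched = rule.apply(short_eukaryotic)  # too short for the second condition set
```

Keeping the annotation on the rule rather than on each condition set mirrors the statement that a rule applies only one annotation, however many condition sets lead to it.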