Last modified October 12, 2015
UniProt’s Automatic Annotation pipeline enriches the unreviewed records in UniProtKB with automatic classification and annotation.
UniProt uses InterPro to classify sequences at superfamily, family and subfamily levels and to predict the occurrence of functional domains and important sites. InterPro integrates predictive models of protein function, so-called ‘signatures’, from a number of member databases. InterPro matches are automatically annotated to UniProtKB entries as database cross-references with every InterPro release.
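As an illustration of what these cross-references look like to a consumer of the data, the sketch below pulls InterPro matches out of an entry in the UniProtKB flat-file (text) format, where each cross-reference appears on a `DR` line. It is a minimal example, not part of UniProt's pipeline: the function name `interpro_xrefs` and the sample entry fragment are hypothetical and written by hand for this example.

```python
import re

# Matches flat-file cross-reference lines of the form:
#   DR   InterPro; IPR000001; Kringle.
# Assumption: InterPro accessions are "IPR" followed by six digits.
_INTERPRO_DR = re.compile(r"^DR\s+InterPro;\s*(IPR\d{6});\s*(.+?)\.\s*$")


def interpro_xrefs(entry_text):
    """Return (accession, name) pairs for each InterPro DR line in an entry."""
    xrefs = []
    for line in entry_text.splitlines():
        match = _INTERPRO_DR.match(line)
        if match:
            xrefs.append((match.group(1), match.group(2)))
    return xrefs


# Hand-written fragment for illustration only; not copied from a real entry.
sample = """\
ID   EXAMPLE_HUMAN           Unreviewed;       100 AA.
DR   InterPro; IPR000001; Kringle.
DR   Pfam; PF00051; Kringle; 1.
DR   InterPro; IPR013806; Kringle-like.
"""

print(interpro_xrefs(sample))
```

Only lines whose database field is `InterPro` are returned; cross-references to other databases, such as the Pfam line in the sample, are skipped.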
UniProt has developed two prediction systems, UniRule and the Statistical Automatic Annotation System (SAAS), to annotate UniProtKB/TrEMBL automatically in an efficient and scalable manner with a high degree of accuracy.
The rules that make up these two prediction systems can be browsed and queried in dedicated sections of the UniProt website.
We also use a suite of Sequence Analysis Methods (SAM) to enrich the unreviewed TrEMBL records in the UniProt Knowledgebase with extra sequence-specific information. Predictions of sequence features such as Signal, Transmembrane and Coil regions are generated using software from external providers.