
This subsection of the ‘Family and Domains’ section indicates the positions and types of repeated sequence motifs or repeated domains within the protein.

Repeats vary from short amino acid repetitions, such as the polyglutamine tracts of the Huntington disease gene product huntingtin, to large repetitions containing multiple domains, such as in the cytoskeletal protein titin. One likely reason for their evolutionary success is that repeat-containing proteins are relatively cheap to evolve. That is, large and thermodynamically stable proteins may arise by the simple expedient of intragenic duplications rather than the more complex processes of de novo alpha-helix and beta-sheet creation.

1. Annotation of specific repeated sequence motifs

Certain repeats may be restricted to, or typical of, a specific protein or protein family. When such repeats have a specific name, that name is applied to each of them.
Examples: Q03763, P20811

When no standard name exists, the 'Region' subsection is used to describe the region containing all the repeats and, when possible, the pattern of the repeat. The ‘Repeat’ subsection is then used simply to specify the number and position of each such repeat.
Example: Q96NY7

Conventions: We describe repeats as ‘half-length’ or ‘truncated’ when appropriate, and as ‘approximate’, ‘degenerate’ or ‘atypical’ when they deviate significantly from the consensus sequence. When repeats are extremely abundant, we omit the positions of the individual repeats.
Examples: P15497, P38479, P15305

2. Annotation of predicted repeats

A large number of repeated domains have been modelled by InterPro and by the REP program of Andrade and Bork. Repeats predicted using both of these resources are annotated in UniProtKB. When using REP, the e-value thresholds for reporting matches are initially set to their most conservative values, but may be relaxed to ensure that consistent numbers of repeats are annotated in orthologous proteins. The e-values may also be adjusted to ensure that the predicted number of repeats is consistent with the 3-dimensional structures the repeats may adopt. For instance, Kelch repeats form a propeller-like structure containing 5-7 tandem repeats. Armadillo (Arm) and HEAT repeats are very similar, and to date all known proteins possess only one of these two repeat types. Therefore, if REP detects both repeat types in a single protein, all repeats are annotated as the more common of the two types reported by REP.
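
The two adjustments described above can be illustrated with a minimal Python sketch. It is not REP's or UniProt's actual implementation: the function names, the tuple layouts, the example e-values and the list of cutoffs are all hypothetical, and the handling of a tie between Arm and HEAT counts is an assumption, since the text does not define it.

```python
from collections import Counter

ARM_HEAT = {"ARM", "HEAT"}  # hypothetical labels for the two repeat types


def relax_until_consistent(hits, cutoffs, min_repeats, max_repeats):
    """Return (cutoff, kept_hits) for the strictest cutoff that yields a
    repeat count within [min_repeats, max_repeats].

    hits    -- list of (start, end, e_value) tuples (hypothetical layout)
    cutoffs -- e-value cutoffs ordered from most to least conservative
    Returns (None, []) if no cutoff gives a consistent count.
    """
    for cutoff in cutoffs:
        kept = [hit for hit in hits if hit[2] <= cutoff]
        if min_repeats <= len(kept) <= max_repeats:
            return cutoff, kept
    return None, []


def unify_arm_heat(repeats):
    """Relabel Arm and HEAT predictions as the more common of the two.

    repeats -- list of (start, end, repeat_type) tuples (hypothetical layout)
    Other repeat types are left untouched.
    """
    counts = Counter(rtype for _, _, rtype in repeats if rtype in ARM_HEAT)
    if len(counts) < 2:
        return list(repeats)  # only one of the two types was predicted
    (top, top_n), (_, second_n) = counts.most_common(2)
    if top_n == second_n:
        # Assumption: a tie is not covered by the text, so nothing is changed.
        return list(repeats)
    return [(s, e, top if rtype in ARM_HEAT else rtype) for s, e, rtype in repeats]


# Made-up Kelch hits: only three pass the strictest cutoff, five pass the next.
kelch_hits = [(10, 55, 1e-8), (60, 105, 1e-7), (110, 155, 1e-6),
              (160, 205, 1e-3), (210, 255, 5e-3), (260, 305, 0.5)]
cutoff, kelch_repeats = relax_until_consistent(kelch_hits, [1e-5, 1e-2, 1.0], 5, 7)

# Made-up mixed prediction: two Arm hits and one HEAT hit.
mixed = [(10, 50, "ARM"), (55, 95, "HEAT"), (100, 140, "ARM")]
unified = unify_arm_heat(mixed)
```

In this made-up example the cutoff is relaxed once, so that five Kelch repeats are kept (within the 5-7 range of the propeller structure), and the single HEAT prediction is relabelled as Arm.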

See also: Evidence

UniProt is an ELIXIR core data resource
Main funding by: National Institutes of Health
