Protein

Ras-responsive element-binding protein 1

Gene

Rreb1

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Transcription factor that binds specifically to the RAS-responsive elements (RRE) of gene promoters. May be involved in Ras/Raf-mediated cell differentiation by enhancing calcitonin expression. Represses the angiotensinogen gene. Negatively regulates the transcriptional activity of AR. Potentiates the transcriptional activity of NEUROD1 (By similarity). Binds specifically to the allelic variant of the CDKN2A promoter present in BALB/c mice, which down-regulates CDKN2A expression in this strain and, as a consequence, raises its susceptibility to pristane-induced tumors. (By similarity; 1 publication)

Regions

Feature key  Position(s)  Length  Evidence                    Description
Zinc finger  66 – 88      23      PROSITE-ProRule annotation  C2H2-type 1
Zinc finger  97 – 119     23      PROSITE-ProRule annotation  C2H2-type 2
Zinc finger  125 – 147    23      PROSITE-ProRule annotation  C2H2-type 3
Zinc finger  206 – 228    23      PROSITE-ProRule annotation  C2H2-type 4
Zinc finger  233 – 256    24      PROSITE-ProRule annotation  C2H2-type 5
Zinc finger  314 – 336    23      PROSITE-ProRule annotation  C2H2-type 6
Zinc finger  641 – 663    23      PROSITE-ProRule annotation  C2H2-type 7
Zinc finger  669 – 691    23      PROSITE-ProRule annotation  C2H2-type 8
Zinc finger  697 – 720    24      PROSITE-ProRule annotation  C2H2-type 9
Zinc finger  751 – 782    32      PROSITE-ProRule annotation  C2H2-type 10
Zinc finger  788 – 813    26      PROSITE-ProRule annotation  C2H2-type 11
Zinc finger  1251 – 1273  23      PROSITE-ProRule annotation  C2H2-type 12
Zinc finger  1400 – 1422  23      PROSITE-ProRule annotation  C2H2-type 13
Zinc finger  1520 – 1542  23      PROSITE-ProRule annotation  C2H2-type 14
Zinc finger  1548 – 1570  23      PROSITE-ProRule annotation  C2H2-type 15
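
The zinc finger coordinates above are 1-based and inclusive, so each listed length should equal end - start + 1. The following Python sketch checks that for all 15 features and then looks for a Cys/His spacing in the first finger, using the first 100 residues of the canonical sequence from this entry. The regular expression is a simplified assumption made for illustration, not the PROSITE rule used for the annotation, and the helper names are hypothetical.

```python
import re

# (start, end, listed length) for the 15 C2H2-type zinc fingers; 1-based, inclusive.
ZINC_FINGERS = [
    (66, 88, 23), (97, 119, 23), (125, 147, 23), (206, 228, 23),
    (233, 256, 24), (314, 336, 23), (641, 663, 23), (669, 691, 23),
    (697, 720, 24), (751, 782, 32), (788, 813, 26), (1251, 1273, 23),
    (1400, 1422, 23), (1520, 1542, 23), (1548, 1570, 23),
]

# Residues 1-100 of the canonical sequence (isoform 1), copied from this entry.
N_TERMINUS = (
    "MTSNSPIGLEGSDLSSINTMMSAVMSVASVTENGGSPQGIKSPMKPPGPN"
    "RIGRRNQETKEEKSSYNCPLCEKICTTQHQLTMHIRQHNTDTGGADHACS"
)

# Assumption: a simplified C2H2 spacing (C-x2-4-C-x12-H-x3-5-H), for illustration only.
SIMPLE_C2H2 = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def residues(sequence: str, start: int, end: int) -> str:
    """Slice a sequence using 1-based inclusive coordinates."""
    return sequence[start - 1:end]


# Features whose listed length disagrees with their coordinates.
mismatches = [(s, e, n) for s, e, n in ZINC_FINGERS if feature_length(s, e) != n]

finger_1 = residues(N_TERMINUS, 66, 88)
match = SIMPLE_C2H2.search(finger_1)
# Convert the match offset back to a 1-based position in the full sequence.
first_cys = match.start() + 66 if match else None

print(mismatches)
print(finger_1, first_cys)
```

The same two helpers apply to any other positional feature in the entry, as long as the sequence slice covers the coordinates being checked.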

GO - Molecular function

GO - Biological process

Keywords - Molecular function

Activator, Repressor

Keywords - Biological process

Transcription, Transcription regulation

Keywords - Ligand

DNA-binding, Metal-binding, Zinc

Names & Taxonomy

Protein names
Recommended name:
Ras-responsive element-binding protein 1
Short name:
RREB-1
Alternative name(s):
RAS-responsive zinc finger transcription factor RREB
Gene names
Name: Rreb1
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 13

Organism-specific databases

MGI: MGI:2443664. Rreb1.

Subcellular location

GO - Cellular component

Keywords - Cellular component

Nucleus

PTM / Processing

Molecule processing

Feature key  Feature ID      Position(s)  Length  Description
Chain        PRO_0000295155  1 – 1700     1700    Ras-responsive element-binding protein 1

Amino acid modifications

Feature key       Position(s)  Length  Evidence          Description
Modified residue  36           1       By similarity     Phosphoserine
Modified residue  42           1       By similarity     Phosphoserine
Modified residue  161          1       Combined sources  Phosphoserine
Modified residue  175          1       By similarity     Phosphoserine
Modified residue  180          1       By similarity     Phosphoserine
Modified residue  229          1       By similarity     Phosphoserine
Cross-link        549          -       By similarity     Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)
Cross-link        613          -       By similarity     Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO1)
Cross-link        613          -       By similarity     Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)
Cross-link        883          -       By similarity     Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)
Modified residue  970          1       Combined sources  Phosphoserine
Modified residue  1125         1       By similarity     Phosphoserine
Modified residue  1137         1       Combined sources  Phosphoserine
Modified residue  1138         1       Combined sources  Phosphoserine
Modified residue  1172         1       By similarity     Phosphoserine
Modified residue  1179         1       Combined sources  Phosphoserine
Modified residue  1180         1       Combined sources  Phosphoserine
Modified residue  1230         1       By similarity     Phosphoserine
Modified residue  1450         1       Combined sources  Phosphoserine
Modified residue  1452         1       Combined sources  Phosphoserine
Modified residue  1593         1       Combined sources  Phosphoserine
Modified residue  1606         1       Combined sources  Phosphoserine
Modified residue  1667         1       By similarity     Phosphoserine
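
Every phosphoserine position listed above should point at a serine (S) in the canonical sequence. A small Python sketch of that check follows, restricted to the two sites that fall within the first 50 residues of the sequence given in this entry. The function name is hypothetical, and checking the remaining sites would require the full 1,700-residue sequence.

```python
# Residues 1-50 of the canonical sequence (isoform 1), copied from this entry.
N_TERMINUS = "MTSNSPIGLEGSDLSSINTMMSAVMSVASVTENGGSPQGIKSPMKPPGPN"

# Phosphoserine positions from the table that lie inside this 50-residue window.
PHOSPHOSERINES_IN_WINDOW = [36, 42]


def residue_at(sequence: str, position: int) -> str:
    """Return the residue at a 1-based position, as used in the feature tables."""
    if not 1 <= position <= len(sequence):
        raise IndexError(f"position {position} is outside 1..{len(sequence)}")
    return sequence[position - 1]


# Map each annotated position to the residue actually found there.
checked = {pos: residue_at(N_TERMINUS, pos) for pos in PHOSPHOSERINES_IN_WINDOW}
print(checked)
```

The explicit bounds check matters because Python would otherwise accept position 0 and silently return the last residue.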

Keywords - PTM

Isopeptide bond, Phosphoprotein, Ubl conjugation

Proteomic databases

MaxQB: Q3UH06.
PaxDb: Q3UH06.
PeptideAtlas: Q3UH06.
PRIDE: Q3UH06.

PTM databases

iPTMnet: Q3UH06.
PhosphoSitePlus: Q3UH06.

Expression

Tissue specificity

Expressed in splenic B-cells. (1 publication)

Gene expression databases

Bgee: ENSMUSG00000039087.
CleanEx: MM_RREB1.
ExpressionAtlas: Q3UH06. baseline and differential.
Genevisible: Q3UH06. MM.

Interaction

Subunit structure

Interacts with NEUROD1 and AR. (By similarity)

Protein-protein interaction databases

STRING: 10090.ENSMUSP00000049265.

Structure

3D structure databases

ProteinModelPortal: Q3UH06.
SMR: Q3UH06.

Family & Domains

Compositional bias

Feature key         Position(s)  Length  Description
Compositional bias  479 – 542    64      Pro-rich
Compositional bias  921 – 925    5       Poly-Ser
Compositional bias  1320 – 1325  6       Poly-Ala
Compositional bias  1328 – 1359  32      Glu-rich

Sequence similarities

Contains 15 C2H2-type zinc fingers. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat, Zinc-finger

Phylogenomic databases

eggNOG: KOG1721. Eukaryota.
COG5048. LUCA.
GeneTree: ENSGT00860000133800.
HOGENOM: HOG000154195.
HOVERGEN: HBG108418.
InParanoid: Q3UH06.
KO: K20210.
OMA: YRALRIH.
OrthoDB: EOG091G0AFV.
PhylomeDB: Q3UH06.
TreeFam: TF332503.

Family and domain databases

Gene3D: 3.30.160.60. 11 hits.
InterPro: IPR007087. Znf_C2H2.
IPR015880. Znf_C2H2-like.
IPR013087. Znf_C2H2/integrase_DNA-bd.
Pfam: PF13912. zf-C2H2_6. 2 hits.
SMART: SM00355. ZnF_C2H2. 15 hits.
PROSITE: PS00028. ZINC_FINGER_C2H2_1. 14 hits.
PS50157. ZINC_FINGER_C2H2_2. 14 hits.

Sequences (4)

Sequence status: Complete.

This entry describes 4 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q3UH06-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MTSNSPIGLE GSDLSSINTM MSAVMSVASV TENGGSPQGI KSPMKPPGPN
60 70 80 90 100
RIGRRNQETK EEKSSYNCPL CEKICTTQHQ LTMHIRQHNT DTGGADHACS
110 120 130 140 150
ICGKSLSSAS SLDRHMLVHS GERPYKCTVC GQSFTTNGNM HRHMKIHEKD
160 170 180 190 200
TNSTTAAAPP SPLKRRRLSS KRKLSHDAES EDPGPAKKMV EDGQSGDLDK
210 220 230 240 250
MSDEIFHCPV CFKEFVCKYE LETHMETHSD NPLRCDICCV TFRTHRGLLR
260 270 280 290 300
HNALVHKQLP RDAMGRPFIQ NNPSIPAGFH DLGFTDFSCR KFPRISQAWC
310 320 330 340 350
ETNLRRCISE QHRFVCDTCD KAFPMLSSLI LHRQSHIPAD QGREKLQTKT
360 370 380 390 400
LAAESLEQKA FLALLGLQHT KDVKPAPAEE LLPDDNQAIQ LQTLKYQLPQ
410 420 430 440 450
EPGCPTVLSV SPLDAASLGG SLTVLPATKE NMKHLSLQPF QKGFIIQPDS
460 470 480 490 500
SIVVKPISGE SAIELADIQQ ILKMAASAPP QISLPPLSKA PATPLQAIFK
510 520 530 540 550
HMPPLKPKPL VTPRTVVAAS TPPPLINAQQ ASPGCISPSL PPQSLKFLKG
560 570 580 590 600
SVEAVSNVHL LQSKSGIQPS TTTQLFLQQA GVELPGQPEM KTQLEQESII
610 620 630 640 650
EALLPLNMEA KIKQEITEGD LKAIMTGPSG KKTPAMRKVL YPCRFCNQVF
660 670 680 690 700
AFSGVLRAHV RSHLGISPYQ CNICDYIAAD KAALIRHIRT HSGERPYICK
710 720 730 740 750
ICHYPFTVKA NCERHLRKKH LKATRKDIEK NIEYVSSPTA ELVDAFCAPE
760 770 780 790 800
TVCRLCGEDL KHYRALRIHM RTHCSRGLGG CHKGRKPFEC KECNAPFVAK
810 820 830 840 850
RNCIHHILKQ HLHVPEKDIE SYVLATNSGL GPADTPTDAA SRGEEGSCVT
860 870 880 890 900
FAECKPLATF LEPQNGFLHS SPTQPLPSHI SVKLEPASSF AMDFNEPLDF
910 920 930 940 950
SQKGLALVQV KQENVSSLLT SSSSSALYDC SMEPIDLSIP KSVKKGDKDT
960 970 980 990 1000
VVPSDAKKPE PEAGQAEPLS PRPPPCPTLS VTVEPKGSLE TPTGTVVAVT
1010 1020 1030 1040 1050
TAAKLEPHTQ PLQGSVQLAV PIYSPALVSN TPLLGNSAAL LNNPALLRPL
1060 1070 1080 1090 1100
RPKPPLLLPK PSMTEELPPL ASIAQIISSV SSAPTLLKTK VADPGPSITS
1110 1120 1130 1140 1150
SNTVATDSPG SSIPKAAATP TDTTSSKESS EPPPAASSPE EALPTEQGPA
1160 1170 1180 1190 1200
ATSSSRKRGR KRGLRNRPLP NSSAVDLDSS GEFASIEKML ATTDTNKFSP
1210 1220 1230 1240 1250
FLQTAEDDTQ EEVAGAPADQ HGPADEEQGS PAEDRLLRAK RNSYANCLQK
1260 1270 1280 1290 1300
INCPHCPRVF PWASSLQRHM LTHTDSQSDT DTLTTPGEVL DLTAQAKEQP
1310 1320 1330 1340 1350
PAEGASEISP ASQDLAIKEA KAAAAPSEEE EEKETEENPE PEEECRVEES
1360 1370 1380 1390 1400
TGAADAPEED TASNQSLDLD FATKLMDFKL AESEAGSVDS QGPAQQEPKH
1410 1420 1430 1440 1450
ACDTCGKNFK FLGTLSRHKK AHSCQEPKEE EAAAPSLENE GVGRAVEGPS
1460 1470 1480 1490 1500
PSPEPEEKPA ESLAIDPTPG TREASVAKQN EETEGPTDGE GTAEKRGDGD
1510 1520 1530 1540 1550
KRPKTDSPKS MASKADKRKK VCSVCNKRFW SLQDLTRHMR SHTGERPYKC
1560 1570 1580 1590 1600
QTCERTFTLK HSLVRHQRIH QKARHSKHHG KDSDKDERAE EDSEDESTHS
1610 1620 1630 1640 1650
ATNPASENEA ESAPSTSNHV AVTRSRKESL STSGKECSPE ERAAAEQAAE
1660 1670 1680 1690 1700
PSAPKEQASP GETDPQSPAA IVQDLLELCG KRPAPILAAT DGASQLLGME
Length: 1,700
Mass (Da): 184,154
Last modified: July 10, 2007 - v2
Checksum: DAB9246A566B1BC9

Isoform 2 (identifier: Q3UH06-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-82: Missing.

Length: 1,618
Mass (Da): 175,402
Checksum: 52678CF395D69693

Isoform 3 (identifier: Q3UH06-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1274-1274: T → TGQKPFPCQKCDAFFSTKSNCERHQLRKHGVTTCSLRRNGLIPPKESDVGSHDST

Length: 1,754
Mass (Da): 190,167
Checksum: EE5F35416BEC03BC

Isoform 4 (identifier: Q3UH06-4)

The sequence of this isoform differs from the canonical sequence as follows:
     1275-1291: DSQSDTDTLTTPGEVLD → GKKALTAHQAVSLERKE
     1292-1700: Missing.

Length: 1,291
Mass (Da): 140,184
Checksum: 8AAF680E466B62B7
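
Each isoform is described as a set of edits to the canonical sequence, so its length follows from the canonical length of 1,700 and the coordinates of those edits. The Python sketch below reproduces the three listed lengths and applies the isoform 4 edits to residues 1251-1300 of the canonical sequence as given in this entry. The data layout and function names are assumptions made for illustration. Edits are applied from the C-terminal side first so that earlier coordinates stay valid.

```python
CANONICAL_LENGTH = 1700

# Each edit is (start, end, replacement), 1-based inclusive; "" means "Missing".
ISOFORM_VARIANTS = {
    "Q3UH06-2": [(1, 82, "")],
    "Q3UH06-3": [
        (1274, 1274, "TGQKPFPCQKCDAFFSTKSNCERHQLRKHGVTTCSLRRNGLIPPKESDVGSHDST"),
    ],
    "Q3UH06-4": [(1275, 1291, "GKKALTAHQAVSLERKE"), (1292, 1700, "")],
}

# Residues 1251-1300 of the canonical sequence, copied from this entry.
WINDOW_START = 1251
WINDOW = "INCPHCPRVFPWASSLQRHMLTHTDSQSDTDTLTTPGEVLDLTAQAKEQP"


def isoform_length(canonical_length, variants):
    """Canonical length adjusted by the net size change of every edit."""
    length = canonical_length
    for start, end, replacement in variants:
        length += len(replacement) - (end - start + 1)
    return length


def apply_variants(sequence, variants, offset=1):
    """Apply edits to a sequence whose first residue is at position `offset`."""
    # Work from the highest coordinate down so earlier positions are unaffected.
    for start, end, replacement in sorted(variants, reverse=True):
        i, j = start - offset, end - offset + 1
        sequence = sequence[:i] + replacement + sequence[j:]
    return sequence


lengths = {name: isoform_length(CANONICAL_LENGTH, v) for name, v in ISOFORM_VARIANTS.items()}
iso4_tail = apply_variants(WINDOW, ISOFORM_VARIANTS["Q3UH06-4"], offset=WINDOW_START)

print(lengths)
print(iso4_tail)
```

Because the window ends at residue 1300, the deletion of 1292-1700 simply truncates it. The printed tail therefore covers residues 1251-1291 of isoform 4.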

Experimental Info

Feature key        Position(s)  Length  Evidence  Description
Sequence conflict  834          1       Curated   D → G in BAE28051 (PubMed:16141072).
Sequence conflict  1113         1       Curated   I → T in AAH80680 (Ref. 4).
Sequence conflict  1178         1       Curated   D → G in BAE28051 (PubMed:16141072).

Alternative sequence

Feature key           Feature ID  Position(s)  Length  Evidence        Description
Alternative sequence  VSP_026765  1 – 82       82      1 publication   Missing in isoform 2.
Alternative sequence  VSP_026766  1274         1       1 publication   T → TGQKPFPCQKCDAFFSTKSNCERHQLRKHGVTTCSLRRNGLIPPKESDVGSHDST in isoform 3.
Alternative sequence  VSP_026767  1275 – 1291  17      2 publications  DSQSD…GEVLD → GKKALTAHQAVSLERKE in isoform 4.
Alternative sequence  VSP_026768  1292 – 1700  409     2 publications  Missing in isoform 4.

Sequence databases

EMBL / GenBank / DDBJ:
AK147653 mRNA. Translation: BAE28051.1.
AK154980 mRNA. Translation: BAE32969.1.
AK171375 mRNA. Translation: BAE42417.1.
CT010477 Genomic DNA. Translation: CAX15920.1.
CT010477 Genomic DNA. Translation: CAX15921.1.
BC080680 mRNA. Translation: AAH80680.1.
AY946044 mRNA. Translation: AAX83010.1.
CCDS: CCDS36634.1. [Q3UH06-4]
CCDS49238.1. [Q3UH06-3]
RefSeq: NP_001034277.1. NM_001039188.1. [Q3UH06-4]
NP_001171339.1. NM_001177868.1. [Q3UH06-4]
NP_001171340.1. NM_001177869.1. [Q3UH06-3]
NP_081106.1. NM_026830.2. [Q3UH06-4]
XP_006516812.1. XM_006516749.1. [Q3UH06-3]
XP_006516813.1. XM_006516750.2. [Q3UH06-3]
XP_006516814.1. XM_006516751.2. [Q3UH06-3]
XP_006516815.1. XM_006516752.2. [Q3UH06-3]
XP_006516816.1. XM_006516753.3. [Q3UH06-3]
XP_006516817.1. XM_006516754.3. [Q3UH06-3]
XP_006516819.1. XM_006516756.2. [Q3UH06-3]
XP_006516820.1. XM_006516757.3. [Q3UH06-3]
XP_006516822.1. XM_006516759.1. [Q3UH06-1]
UniGene: Mm.491109.

Genome annotation databases

Ensembl: ENSMUST00000037232; ENSMUSP00000049265; ENSMUSG00000039087. [Q3UH06-3]
ENSMUST00000110237; ENSMUSP00000105866; ENSMUSG00000039087. [Q3UH06-4]
ENSMUST00000110238; ENSMUSP00000105867; ENSMUSG00000039087. [Q3UH06-4]
ENSMUST00000128570; ENSMUSP00000115599; ENSMUSG00000039087. [Q3UH06-3]
ENSMUST00000149745; ENSMUSP00000121211; ENSMUSG00000039087. [Q3UH06-4]
GeneID: 68750.
KEGG: mmu:68750.
UCSC: uc007qcw.2. mouse. [Q3UH06-4]
uc007qda.2. mouse. [Q3UH06-1]
uc007qdb.2. mouse. [Q3UH06-3]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 6239.
MGI: MGI:2443664. Rreb1.

Miscellaneous databases

ChiTaRS: Rreb1. mouse.
PRO: Q3UH06.

Entry information

Entry name: RREB1_MOUSE
Accession: Primary (citable) accession number: Q3UH06
Secondary accession number(s): B8JJE2, B8JJE3, Q3TB97, Q4ZE88, Q66JZ8
Entry history
Integrated into UniProtKB/Swiss-Prot: July 10, 2007
Last sequence update: July 10, 2007
Last modified: November 30, 2016
This is version 106 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (also known as the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
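
The membership rules quoted above can be restated as two small predicates. This Python sketch is only a restatement of those thresholds, not the UniRef clustering procedure itself. It assumes that the identity and overlap fractions have already been computed by some alignment step that is not shown, and all function and parameter names are hypothetical.

```python
def in_uniref100_cluster(candidate: str, representative: str, min_fragment: int = 11) -> bool:
    """True for an identical sequence or a sub-fragment of at least `min_fragment` residues."""
    if candidate == representative:
        return True
    return len(candidate) >= min_fragment and candidate in representative


def meets_cluster_criteria(identity: float, overlap: float,
                           min_identity: float, min_overlap: float = 0.80) -> bool:
    """True when identity and overlap with the seed sequence both reach their thresholds.

    `identity` and `overlap` are fractions in [0, 1], assumed to be precomputed.
    """
    return identity >= min_identity and overlap >= min_overlap


# Illustrative values only; they are not taken from any real cluster.
seed = "MTSNSPIGLEGSDLSSINTM"
print(in_uniref100_cluster("MTSNSPIGLEGS", seed))          # 12-residue sub-fragment
print(in_uniref100_cluster("MTSNS", seed))                 # too short to qualify
print(meets_cluster_criteria(0.92, 0.85, min_identity=0.90))  # UniRef90-style check
print(meets_cluster_criteria(0.92, 0.70, min_identity=0.90))  # overlap below 80%
```

Keeping the identity threshold as a parameter lets the same predicate express both the 90% and the 50% levels.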