Protein

RuvB-like 2

Gene

RUVBL2

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Possesses single-stranded DNA-stimulated ATPase and ATP-dependent DNA helicase (5' to 3') activity; hexamerization is thought to be critical for ATP hydrolysis and adjacent subunits in the ring-like structure contribute to the ATPase activity.
Component of the NuA4 histone acetyltransferase complex which is involved in transcriptional activation of select genes principally by acetylation of nucleosomal histones H4 and H2A. This modification may both alter nucleosome - DNA interactions and promote interaction of the modified histones with other proteins which positively regulate transcription. This complex may be required for the activation of transcriptional programs associated with oncogene and proto-oncogene mediated growth induction, tumor suppressor mediated growth arrest and replicative senescence, apoptosis, and DNA repair. The NuA4 complex ATPase and helicase activities seem to be, at least in part, contributed by the association of RUVBL1 and RUVBL2 with EP400. NuA4 may also play a direct role in DNA repair when recruited to sites of DNA damage. Component of a SWR1-like complex that specifically mediates the removal of histone H2A.Z/H2AFZ from the nucleosome.
Proposed core component of the chromatin remodeling INO80 complex which is involved in transcriptional regulation, DNA replication and probably DNA repair.
Plays an essential role in oncogenic transformation by MYC and also modulates transcriptional activation by the LEF1/TCF1-CTNNB1 complex. May also inhibit the transcriptional activity of ATF2.
Involved in the endoplasmic reticulum (ER)-associated degradation (ERAD) pathway where it negatively regulates expression of ER stress response genes. (1 publication)

Catalytic activity

ATP + H2O = ADP + phosphate.

Regions

Feature key          Position(s)   Length   Description
Nucleotide binding   77 – 84       8        ATP
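
As an illustrative check (not part of the UniProt record), the short Python sketch below pulls residues 77 – 84 out of the canonical sequence and tests them against the commonly cited Walker A (P-loop) consensus G-x(4)-G-K-[ST]. Only residues 51 – 100 of isoform 1 are embedded; `WINDOW`, `WINDOW_START` and `slice_feature` are hypothetical names chosen for this example, and the consensus pattern is an assumption brought in from general knowledge of P-loop NTPases rather than from this entry.

```python
import re

# Residues 51-100 of canonical isoform 1 (Q9Y230-1), copied from this entry.
WINDOW = "AARRAAGVVLEMIREGKIAGRAVLIAGQPGTGKTAIAMGMAQALGPDTPF"
WINDOW_START = 51  # 1-based position of WINDOW[0] in the full sequence


def slice_feature(start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) from the window."""
    lo = start - WINDOW_START
    hi = end - WINDOW_START + 1
    if lo < 0 or hi > len(WINDOW):
        raise ValueError("feature lies outside the embedded window")
    return WINDOW[lo:hi]


site = slice_feature(77, 84)
print(site)  # GQPGTGKT

# Walker A consensus, assumed here as G-x(4)-G-K-[ST] (not stated in the entry)
print(bool(re.fullmatch(r"G.{4}GK[ST]", site)))  # True
```

The 1-based, inclusive coordinates used by the feature table map to a Python slice of `[start - 1 : end]` on the full sequence; the window offset only shifts that arithmetic.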

GO - Molecular function

  • ADP binding Source: Ensembl
  • ATP binding Source: UniProtKB-KW
  • ATP-dependent 5'-3' DNA helicase activity Source: InterPro
  • ATP-dependent DNA helicase activity Source: ProtInc
  • chromatin DNA binding Source: UniProtKB
  • DNA helicase activity Source: UniProtKB
  • identical protein binding Source: UniProtKB
  • RNA polymerase II core promoter sequence-specific DNA binding Source: UniProtKB
  • RNA polymerase II distal enhancer sequence-specific DNA binding Source: UniProtKB
  • unfolded protein binding Source: ProtInc

GO - Biological process

  • cellular response to estradiol stimulus Source: UniProtKB
  • cellular response to UV Source: UniProtKB
  • chromatin remodeling Source: UniProtKB
  • DNA recombination Source: ProtInc
  • DNA repair Source: ProtInc
  • establishment of protein localization to chromatin Source: UniProtKB
  • histone H2A acetylation Source: UniProtKB
  • histone H4 acetylation Source: UniProtKB
  • negative regulation of estrogen receptor binding Source: UniProtKB
  • positive regulation of histone acetylation Source: UniProtKB
  • positive regulation of telomerase RNA localization to Cajal body Source: BHF-UCL
  • positive regulation of transcription from RNA polymerase II promoter Source: UniProtKB
  • protein folding Source: ProtInc
  • regulation of growth Source: UniProtKB-KW
  • transcription, DNA-templated Source: UniProtKB-KW
  • transcriptional activation by promoter-enhancer looping Source: UniProtKB

Keywords - Molecular function

Activator, Chromatin regulator, Helicase, Hydrolase

Keywords - Biological process

DNA damage, DNA recombination, DNA repair, Growth regulation, Transcription, Transcription regulation

Keywords - Ligand

ATP-binding, Nucleotide-binding

Enzyme and pathway databases

Reactome: R-HSA-171319. Telomere Extension By Telomerase.
R-HSA-3214847. HATs acetylate histones.

Names & Taxonomy

Protein names
Recommended name:
RuvB-like 2 (EC:3.6.4.12)
Alternative name(s):
48 kDa TATA box-binding protein-interacting protein (short name: 48 kDa TBP-interacting protein)
51 kDa erythrocyte cytosolic protein (short name: ECP-51)
INO80 complex subunit J
Repressing pontin 52 (short name: Reptin 52)
TIP49b
TIP60-associated protein 54-beta (short name: TAP54-beta)
Gene names
Name: RUVBL2
Synonyms: INO80J, TIP48, TIP49B
ORF Names: CGI-46
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 19

Organism-specific databases

HGNC: HGNC:10475. RUVBL2.

Subcellular location

GO - Cellular component

  • cytoplasm Source: HPA
  • extracellular exosome Source: UniProtKB
  • Ino80 complex Source: UniProtKB
  • intracellular Source: LIFEdb
  • intracellular ribonucleoprotein complex Source: Ensembl
  • membrane Source: UniProtKB-SubCell
  • MLL1 complex Source: UniProtKB
  • NuA4 histone acetyltransferase complex Source: UniProtKB
  • nuclear euchromatin Source: UniProtKB
  • nuclear matrix Source: UniProtKB-SubCell
  • nucleoplasm Source: HPA
  • nucleus Source: UniProtKB
  • R2TP complex Source: UniProtKB
  • Swr1 complex Source: UniProtKB

Keywords - Cellular component

Cytoplasm, Membrane, Nucleus

Pathology & Biotech

Mutagenesis

Feature key   Position(s)   Length   Description
Mutagenesis   299 – 299     1        D → N: Abolishes ATPase activity. (1 publication)
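
The substitution can be expressed as a small, checked string edit. The following is a minimal Python sketch for illustration only: it embeds residues 251 – 300 of the canonical isoform from this entry, and the names `WINDOW`, `WINDOW_START` and `substitute` are hypothetical. It models the sequence change only and says nothing about activity, which the entry reports from experiment.

```python
# Residues 251-300 of canonical isoform 1 (Q9Y230-1), copied from this entry.
WINDOW = "NSRTQGFLALFSGDTGEIKSEVREQINAKVAEWREEGKAEIIPGVLFIDE"
WINDOW_START = 251  # 1-based position of WINDOW[0] in the full sequence


def substitute(pos: int, ref: str, alt: str) -> str:
    """Replace the residue at 1-based position pos after verifying the reference residue."""
    idx = pos - WINDOW_START
    if not 0 <= idx < len(WINDOW):
        raise ValueError("position lies outside the embedded window")
    if WINDOW[idx] != ref:
        raise ValueError(f"expected {ref} at {pos}, found {WINDOW[idx]}")
    return WINDOW[:idx] + alt + WINDOW[idx + 1:]


mutant = substitute(299, "D", "N")
print(WINDOW[-10:])  # IIPGVLFIDE (residues 291-300, wild type)
print(mutant[-10:])  # IIPGVLFINE (residues 291-300, D299N)
```

Checking the reference residue before editing guards against off-by-one errors when converting 1-based feature positions to 0-based string indices.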

Organism-specific databases

PharmGKB: PA34888.

Chemistry

ChEMBL: CHEMBL2062349.

Polymorphism and mutation databases

BioMuta: RUVBL2.
DMDM: 28201890.

PTM / Processing

Molecule processing

Feature key            Position(s)   Length   Description                                  Feature identifier
Initiator methionine                          Removed (combined sources; 2 publications)
Chain                  2 – 463       462      RuvB-like 2                                  PRO_0000165644

Amino acid modifications

Feature key        Position(s)   Length   Description
Modified residue   2 – 2         1        N-acetylalanine (combined sources; 1 publication)
Modified residue   437 – 437     1        Phosphoserine (combined sources)

Keywords - PTM

Acetylation, Phosphoprotein

Proteomic databases

EPD: Q9Y230.
MaxQB: Q9Y230.
PaxDb: Q9Y230.
PeptideAtlas: Q9Y230.
PRIDE: Q9Y230.

2D gel databases

REPRODUCTION-2DPAGE: IPI00009104.

PTM databases

iPTMnet: Q9Y230.
PhosphoSite: Q9Y230.
SwissPalm: Q9Y230.

Expression

Tissue specificity

Ubiquitously expressed. Highly expressed in testis and thymus.

Gene expression databases

Bgee: ENSG00000183207.
CleanEx: HS_RUVBL2.
ExpressionAtlas: Q9Y230. baseline and differential.
Genevisible: Q9Y230. HS.

Organism-specific databases

HPA: CAB012432.
HPA042880.

Interaction

Subunit structure

Forms homohexameric rings (Probable). Can form a dodecamer with RUVBL1 made of two stacked hexameric rings; however, even though RUVBL1 and RUVBL2 are present in equimolar ratio, the oligomeric status of each hexamer is not known. Oligomerization may regulate binding to nucleic acids and conversely, binding to nucleic acids may affect the dodecameric assembly.
Interacts with the transcriptional activation domain of MYC. Interacts with ATF2. Component of the RNA polymerase II holoenzyme complex. May also act to bridge the LEF1/TCF1-CTNNB1 complex and TBP.
Component of the NuA4 histone acetyltransferase complex which contains the catalytic subunit KAT5/TIP60 and the subunits EP400, TRRAP/PAF400, BRD8/SMAP, EPC1, DMAP1/DNMAP1, RUVBL1/TIP49, RUVBL2, ING3, actin, ACTL6A/BAF53A, MORF4L1/MRG15, MORF4L2/MRGX, MRGBP, YEATS4/GAS41, VPS72/YL1 and MEAF6. The NuA4 complex interacts with MYC and the adenovirus E1A protein. RUVBL2 interacts with EP400. Component of a NuA4-related complex which contains EP400, TRRAP/PAF400, SRCAP, BRD8/SMAP, EPC1, DMAP1/DNMAP1, RUVBL1/TIP49, RUVBL2, actin, ACTL6A/BAF53A, VPS72 and YEATS4/GAS41.
Interacts with NPAT. Component of the chromatin-remodeling INO80 complex; specifically part of a complex module associated with the helicase ATP-binding and the helicase C-terminal domain of INO80.
Component of some MLL1/MLL complex, at least composed of the core components KMT2A/MLL1, ASH2L, HCFC1/HCF1, WDR5 and RBBP5, as well as the facultative components BAP18, CHD8, E2F6, HSP70, INO80C, KANSL1, LAS1L, MAX, MCRS1, MGA, MYST1/MOF, PELP1, PHF20, PRP31, RING2, RUVB1/TIP49A, RUVB2/TIP49B, SENP3, TAF1, TAF4, TAF6, TAF7, TAF9 and TEX10.
Interacts with IGHMBP2. Interacts with TELO2. Interacts with HINT1. Component of a SWR1-like complex. Component of the R2TP complex composed at least of PIH1D1, RUVBL1, RUVBL2 and RPAP3 (PubMed:20864032). Interacts with ITFG1 (PubMed:25437307). (Curated; 18 publications)

Binary interactions

With      Entry    #Exp.   IntAct
itself             2       EBI-352939, EBI-352939
CCDC103   Q8IW40   3       EBI-352939, EBI-10261970
DPCD      Q9BVM2   4       EBI-352939, EBI-749988
LNX1      Q8TBB1   3       EBI-352939, EBI-739832
RUVBL1    Q9Y265   24      EBI-352939, EBI-353675
YY1       P25490   5       EBI-352939, EBI-765538

GO - Molecular function

  • identical protein binding Source: UniProtKB
  • unfolded protein binding Source: ProtInc

Protein-protein interaction databases

BioGrid: 116067. 246 interactions.
DIP: DIP-28153N.
IntAct: Q9Y230. 108 interactions.
MINT: MINT-1136527.
STRING: 9606.ENSP00000473172.

Structure

Secondary structure

Sequence range: 1 – 463. Evidence for all features: combined sources.

Feature key   Position(s)   Length
Turn          19 – 22       4
Turn          24 – 27       4
Beta strand   41 – 43       3
Beta strand   46 – 48       3
Helix         50 – 64       15
Beta strand   72 – 78       7
Helix         83 – 94       12
Beta strand   100 – 104     5
Helix         105 – 108     4
Beta strand   111 – 113     3
Helix         115 – 125     11
Beta strand   126 – 129     4
Beta strand   136 – 148     13
Beta strand   152 – 156     5
Beta strand   158 – 164     7
Beta strand   166 – 174     9
Helix         177 – 184     8
Beta strand   191 – 196     6
Turn          197 – 200     4
Beta strand   201 – 206     6
Beta strand   241 – 243     3
Helix         244 – 250     7
Helix         270 – 286     17
Beta strand   290 – 293     4
Beta strand   295 – 300     6
Helix         301 – 303     3
Helix         306 – 315     10
Beta strand   323 – 329     7
Beta strand   331 – 334     4
Beta strand   341 – 343     3
Helix         348 – 351     4
Beta strand   354 – 359     6
Helix         364 – 377     14
Helix         384 – 396     13
Helix         399 – 415     17
Beta strand   419 – 421     3
Helix         423 – 432     10
Helix         436 – 443     8

3D structure databases

Entry   Method   Resolution (Å)   Chain                     Positions
2CQA    NMR      -                A                         132-213
2XSZ    X-ray    3.00             D/E/F                     2-133
                                  D/E/F                     238-463
3UK6    X-ray    2.95             A/B/C/D/E/F/G/H/I/J/K/L   1-132
                                  A/B/C/D/E/F/G/H/I/J/K/L   239-463
ProteinModelPortal: Q9Y230.
SMR: Q9Y230. Positions 8-454.

Miscellaneous databases

EvolutionaryTrace: Q9Y230.

Family & Domains

Domain

The C-terminal domain is required for association with ATF2.

Sequence similarities

Belongs to the RuvB family. (Curated)

Phylogenomic databases

eggNOG: KOG2680. Eukaryota.
COG1224. LUCA.
GeneTree: ENSGT00550000075034.
HOGENOM: HOG000190885.
HOVERGEN: HBG054186.
InParanoid: Q9Y230.
KO: K11338.
OMA: YDAMGAQ.
OrthoDB: EOG091G07C9.
PhylomeDB: Q9Y230.
TreeFam: TF300469.

Family and domain databases

Gene3D: 3.40.50.300. 3 hits.
InterPro: IPR003593. AAA+_ATPase.
IPR027417. P-loop_NTPase.
IPR027238. RuvB-like.
IPR010339. TIP49_C.
PANTHER: PTHR11093. PTHR11093. 1 hit.
Pfam: PF06068. TIP49. 1 hit.
SMART: SM00382. AAA. 1 hit.
SUPFAM: SSF52540. SSF52540. 1 hit.

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q9Y230-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MATVTATTKV PEIRDVTRIE RIGAHSHIRG LGLDDALEPR QASQGMVGQL
        60         70         80         90        100
AARRAAGVVL EMIREGKIAG RAVLIAGQPG TGKTAIAMGM AQALGPDTPF
       110        120        130        140        150
TAIAGSEIFS LEMSKTEALT QAFRRSIGVR IKEETEIIEG EVVEIQIDRP
       160        170        180        190        200
ATGTGSKVGK LTLKTTEMET IYDLGTKMIE SLTKDKVQAG DVITIDKATG
       210        220        230        240        250
KISKLGRSFT RARDYDAMGS QTKFVQCPDG ELQKRKEVVH TVSLHEIDVI
       260        270        280        290        300
NSRTQGFLAL FSGDTGEIKS EVREQINAKV AEWREEGKAE IIPGVLFIDE
       310        320        330        340        350
VHMLDIESFS FLNRALESDM APVLIMATNR GITRIRGTSY QSPHGIPIDL
       360        370        380        390        400
LDRLLIVSTT PYSEKDTKQI LRIRCEEEDV EMSEDAYTVL TRIGLETSLR
       410        420        430        440        450
YAIQLITAAS LVCRKRKGTE VQVDDIKRVY SLFLDESRST QYMKEYQDAF
       460
LFNELKGETM DTS
Length: 463
Mass (Da): 51,157
Last modified: January 23, 2007 - v3
Checksum: 54C78E9C587D975A
Isoform 2 (identifier: Q9Y230-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-45: Missing.

Note: No experimental confirmation available.
Length: 418
Mass (Da): 46,306
Checksum: 64C304B804D86B2E
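
The relation between the two isoforms can be checked directly from the canonical sequence. The following is a minimal Python sketch, not part of the UniProt entry: the sequence is copied from isoform 1 as displayed in this entry, and the helper name `remove_range` is hypothetical. It only reproduces the "1-45: Missing" rule; the entry itself notes that isoform 2 has no experimental confirmation.

```python
# Canonical isoform 1 (Q9Y230-1), copied from this entry in rows of 50 residues.
CANONICAL = (
    "MATVTATTKVPEIRDVTRIERIGAHSHIRGLGLDDALEPRQASQGMVGQL"
    "AARRAAGVVLEMIREGKIAGRAVLIAGQPGTGKTAIAMGMAQALGPDTPF"
    "TAIAGSEIFSLEMSKTEALTQAFRRSIGVRIKEETEIIEGEVVEIQIDRP"
    "ATGTGSKVGKLTLKTTEMETIYDLGTKMIESLTKDKVQAGDVITIDKATG"
    "KISKLGRSFTRARDYDAMGSQTKFVQCPDGELQKRKEVVHTVSLHEIDVI"
    "NSRTQGFLALFSGDTGEIKSEVREQINAKVAEWREEGKAEIIPGVLFIDE"
    "VHMLDIESFSFLNRALESDMAPVLIMATNRGITRIRGTSYQSPHGIPIDL"
    "LDRLLIVSTTPYSEKDTKQILRIRCEEEDVEMSEDAYTVLTRIGLETSLR"
    "YAIQLITAASLVCRKRKGTEVQVDDIKRVYSLFLDESRSTQYMKEYQDAF"
    "LFNELKGETMDTS"
)


def remove_range(seq: str, start: int, end: int) -> str:
    """Delete residues start..end (1-based, inclusive) from seq."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range lies outside the sequence")
    return seq[: start - 1] + seq[end:]


# Isoform 2 (Q9Y230-2): residues 1-45 of the canonical sequence are missing.
isoform2 = remove_range(CANONICAL, 1, 45)

print(len(CANONICAL))  # 463
print(len(isoform2))   # 418
print(isoform2[:10])   # MVGQLAARRA
```

The derived length agrees with the 418 residues listed for isoform 2, and the shortened sequence begins at the methionine found at position 46 of the canonical sequence.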

Sequence caution

The sequence AAD34041 differs from that shown. Reason: Frameshift at position 401. (Curated)
The sequence AAH08355 differs from that shown. Reason: Frameshift at position 191. (Curated)

Experimental Info

Feature key         Position(s)   Length   Description
Sequence conflict   214 – 214     1        D → N in AAD34041 (PubMed:10810093). (Curated)
Sequence conflict   257 – 258     2        FL → YV AA sequence (PubMed:10882073). (Curated)

Alternative sequence

Feature key            Position(s)   Length   Description                             Feature identifier
Alternative sequence   1 – 45        45       Missing in isoform 2. (1 publication)   VSP_056584

Sequence databases

EMBL / GenBank / DDBJ:
Y18417 mRNA. Translation: CAB46270.1.
AB024301 mRNA. Translation: BAA76708.1.
AF155138 mRNA. Translation: AAD38073.1.
AF124607 mRNA. Translation: AAF87087.1.
AF151804 mRNA. Translation: AAD34041.1. Frameshift.
AL136743 mRNA. Translation: CAB66677.1.
AK057498 mRNA. Translation: BAG51921.1.
AK074542 mRNA. Translation: BAC11048.1.
CR533507 mRNA. Translation: CAG38538.1.
AC008687 Genomic DNA. No translation available.
CH471177 Genomic DNA. Translation: EAW52426.1.
CH471177 Genomic DNA. Translation: EAW52430.1.
BC000428 mRNA. Translation: AAH00428.1.
BC004531 mRNA. Translation: AAH04531.1.
BC008355 mRNA. Translation: AAH08355.1. Frameshift.
CCDS: CCDS42588.1. [Q9Y230-1]
PIR: T46313.
RefSeq: NP_001308120.1. NM_001321191.1. [Q9Y230-2]
NP_006657.1. NM_006666.2. [Q9Y230-1]
XP_011524632.1. XM_011526330.1. [Q9Y230-2]
UniGene: Hs.515846.

Genome annotation databases

Ensembl: ENST00000595090; ENSP00000473172; ENSG00000183207. [Q9Y230-1]
GeneID: 10856.
KEGG: hsa:10856.
UCSC: uc002plr.2. human. [Q9Y230-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Web resources

Atlas of Genetics and Cytogenetics in Oncology and Haematology

Protocols and materials databases

DNASU: 10856.

Organism-specific databases

CTD: 10856.
GeneCards: RUVBL2.
HGNC: HGNC:10475. RUVBL2.
HPA: CAB012432.
HPA042880.
MIM: 604788. gene.
neXtProt: NX_Q9Y230.
PharmGKB: PA34888.

Miscellaneous databases

ChiTaRS: RUVBL2. human.
EvolutionaryTrace: Q9Y230.
GeneWiki: RUVBL2.
GenomeRNAi: 10856.
PRO: Q9Y230.


Entry information

Entry name: RUVB2_HUMAN
Accession: Primary (citable) accession number: Q9Y230
Secondary accession number(s): B3KQ59, E7ETE5, Q6FIB9, Q6PK27, Q9Y361
Entry history
Integrated into UniProtKB/Swiss-Prot: February 1, 2003
Last sequence update: January 23, 2007
Last modified: September 7, 2016
This is version 189 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Human chromosome 19
    Human chromosome 19: entries, gene names and cross-references to MIM
  2. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  3. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  4. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.