Protein

Histone deacetylase 7

Gene

Hdac7

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Responsible for the deacetylation of lysine residues on the N-terminal part of the core histones (H2A, H2B, H3 and H4). Histone deacetylation gives a tag for epigenetic repression and plays an important role in transcriptional regulation, cell cycle progression and developmental events. Histone deacetylases act via the formation of large multiprotein complexes. Involved in muscle maturation by repressing transcription of myocyte enhancer factors such as MEF2A, MEF2B and MEF2C. During muscle differentiation, it shuttles into the cytoplasm, allowing the expression of myocyte enhancer factors. Positively regulates the transcriptional repressor activity of FOXP3 (By similarity). Evidence: by similarity; 1 publication.

Catalytic activity

Hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone.

Sites

Feature key | Position(s) | Description | Evidence | Length
Metal binding | 520 | Zinc | By similarity | 1
Metal binding | 522 | Zinc | By similarity | 1
Metal binding | 528 | Zinc | By similarity | 1
Metal binding | 605 | Zinc | By similarity | 1
Active site | 657 | | By similarity | 1
Site | 830 | Contributes to catalysis | By similarity | 1

GO - Molecular function

  • 14-3-3 protein binding Source: UniProtKB
  • activating transcription factor binding Source: MGI
  • chromatin binding Source: MGI
  • histone deacetylase activity Source: UniProtKB
  • metal ion binding Source: UniProtKB-KW
  • NAD-dependent histone deacetylase activity (H3-K14 specific) Source: UniProtKB-EC
  • protein kinase binding Source: MGI
  • protein kinase C binding Source: MGI
  • repressing transcription factor binding Source: MGI
  • transcription corepressor activity Source: MGI
  • transcription factor binding Source: UniProtKB

GO - Biological process

  • B cell activation Source: UniProtKB
  • B cell differentiation Source: UniProtKB
  • cell-cell junction assembly Source: MGI
  • chromatin organization Source: UniProtKB
  • inflammatory response Source: UniProtKB
  • negative regulation of interleukin-2 production Source: MGI
  • negative regulation of NIK/NF-kappaB signaling Source: MGI
  • negative regulation of osteoblast differentiation Source: MGI
  • negative regulation of striated muscle tissue development Source: UniProtKB
  • negative regulation of transcription, DNA-templated Source: UniProtKB
  • negative regulation of transcription from RNA polymerase II promoter Source: MGI
  • nervous system development Source: UniProtKB
  • positive regulation of cell migration involved in sprouting angiogenesis Source: MGI
  • transcription, DNA-templated Source: UniProtKB-KW
  • vasculogenesis Source: MGI

Keywords - Molecular function

Chromatin regulator, Hydrolase, Repressor

Keywords - Biological process

Transcription, Transcription regulation

Keywords - Ligand

Metal-binding, Zinc

Enzyme and pathway databases

Reactome: R-MMU-3108214. SUMOylation of DNA damage response and repair proteins.

Names & Taxonomy

Protein names
Recommended name: Histone deacetylase 7 (EC:3.5.1.98)
Short name: HD7
Alternative name(s): Histone deacetylase 7A
Short name: HD7a
Gene names
Name: Hdac7
Synonyms: Hdac7a
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 15

Organism-specific databases

MGI: MGI:1891835. Hdac7.

Subcellular location

  • Nucleus
  • Cytoplasm

  • Note: In the nucleus, it associates with distinct subnuclear dot-like structures. Shuttles between the nucleus and the cytoplasm. In muscle cells, it shuttles into the cytoplasm during myocyte differentiation. The export to cytoplasm depends on the interaction with the 14-3-3 protein YWHAE and is due to its phosphorylation.

GO - Cellular component

  • cytoplasm Source: UniProtKB
  • histone deacetylase complex Source: UniProtKB
  • nucleus Source: UniProtKB

Keywords - Cellular component

Cytoplasm, Nucleus

Pathology & Biotech

Mutagenesis

Feature key | Position(s) | Description | Evidence | Length
Mutagenesis | 178 | S → A: Strong reduction of CaMK1-dependent nuclear export. Reduces interaction with YWHAE. | 1 Publication | 1
Mutagenesis | 344 | S → A: Strong reduction of CaMK1-dependent nuclear export. Reduces interaction with YWHAE. | 1 Publication | 1
Mutagenesis | 479 | S → A: Strong reduction of CaMK1-dependent nuclear export. Reduces interaction with YWHAE. | 1 Publication | 1
Mutagenesis | 657 | H → A: Abolishes deacetylase activity, but not the interaction with HDAC2 and HDAC3. | 2 Publications | 1
Mutagenesis | 692 | D → A: Disrupts the dot-like nuclear pattern. | 1 Publication | 1
Mutagenesis | 694 | D → A: Disrupts the dot-like nuclear pattern. Abolishes deacetylase activity, but not the interaction with HDAC2 and HDAC3. | 1 Publication | 1
Mutagenesis | 717 | H → A: Abolishes deacetylase activity, but not the interaction with HDAC2 and HDAC3. | 1 Publication | 1
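
Each mutagenesis entry names a position in the canonical sequence, the wild-type residue and its replacement. The sketch below shows one way to confirm that the wild-type residue really sits at the stated position before applying the substitution. It is an illustration only: the ten-residue blocks are copied from the isoform 1 listing further down this entry, and the names `BLOCKS`, `residue_at` and `apply_substitution` are hypothetical helpers, not part of any UniProt tool.

```python
# Ten-residue blocks copied from the canonical (isoform 1) listing in this entry,
# keyed by the 1-based position of the first residue in each block.
BLOCKS = {
    171: "FPLRKTVSEP",
    341: "RTRSEPLPPS",
    471: "HRPLSRTQSS",
    651: "VRPPGHHADH",
    691: "VDWDVHHGNG",
    711: "VLYISLHRHD",
}

# (position, wild-type residue, replacement) as listed in the mutagenesis table.
MUTATIONS = [
    (178, "S", "A"), (344, "S", "A"), (479, "S", "A"),
    (657, "H", "A"), (692, "D", "A"), (694, "D", "A"), (717, "H", "A"),
]


def block_start(position):
    """Return the 1-based start of the ten-residue block holding `position`."""
    return (position - 1) // 10 * 10 + 1


def residue_at(position):
    """Return the residue at a 1-based position, using the stored blocks."""
    start = block_start(position)
    return BLOCKS[start][position - start]


def apply_substitution(position, wild_type, replacement):
    """Return the mutated block after checking the wild-type residue."""
    start = block_start(position)
    block = BLOCKS[start]
    offset = position - start
    if block[offset] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {block[offset]}")
    return block[:offset] + replacement + block[offset + 1:]


for position, wild_type, replacement in MUTATIONS:
    print(position, apply_substitution(position, wild_type, replacement))
```

Storing only the blocks that are needed keeps the example short; with the full sequence in hand, the same check reduces to `sequence[position - 1]`.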

PTM / Processing

Molecule processing

Feature key | Feature ID | Position(s) | Description | Length
Chain | PRO_0000114706 | 1 – 938 | Histone deacetylase 7 | 938

Amino acid modifications

Feature key | Position(s) | Description | Evidence | Length
Modified residue | 132 | Phosphoserine | By similarity | 1
Modified residue | 178 | Phosphoserine; by MARK2, MARK3 and PKD/PRKD1 | 1 Publication | 1
Modified residue | 204 | Phosphoserine; by PKD/PRKD2 | By similarity | 1
Modified residue | 344 | Phosphoserine; by PKD/PRKD1 | 1 Publication | 1
Modified residue | 350 | Phosphoserine | By similarity | 1
Modified residue | 398 | Phosphoserine | By similarity | 1
Modified residue | 479 | Phosphoserine; by PKD/PRKD1 | 1 Publication | 1
Modified residue | 480 | Phosphoserine | By similarity | 1
Modified residue | 582 | Phosphoserine | Combined sources | 1

Post-translational modification

May be phosphorylated by CaMK1. Phosphorylated by the PKC kinases PKN1 and PKN2, impairing nuclear import. Phosphorylation at Ser-178 by MARK2, MARK3 and PRKD1 promotes interaction with 14-3-3 proteins and export from the nucleus. Phosphorylation at Ser-178 is a prerequisite for phosphorylation at Ser-204 (By similarity).

Keywords - PTM

Phosphoprotein

Proteomic databases

PaxDb: Q8C2B3.
PRIDE: Q8C2B3.

PTM databases

iPTMnet: Q8C2B3.
PhosphoSitePlus: Q8C2B3.

Miscellaneous databases

PMAP-CutDB: Q8C2B3.

Expression

Tissue specificity

Highly expressed in heart and lung. Expressed at intermediate level in muscle. 2 Publications

Gene expression databases

Bgee: ENSMUSG00000022475.
CleanEx: MM_HDAC7.
ExpressionAtlas: Q8C2B3. baseline and differential.
Genevisible: Q8C2B3. MM.

Interaction

Subunit structure

Interacts with KDM5B (By similarity). Interacts with KAT5 and EDNRA. Interacts with HDAC1, HDAC2, HDAC3, HDAC4, HDAC5, NCOR1, NCOR2, SIN3A, SIN3B, RBBP4, RBBP7, MTA1L1, SAP30 and MBD3. Interacts with the 14-3-3 protein YWHAE, MEF2A, MEF2B and MEF2C. Interacts with ZMYND15. Interacts with PML (By similarity). Interacts with FOXP3. Evidence: by similarity; 6 publications.

Binary interactions

With | Entry | #Exp. | IntAct | Notes
Ywhae | P62259 | 6 | EBI-643830,EBI-356480 |

Protein-protein interaction databases

BioGrid: 207862. 5 interactors.
DIP: DIP-42594N.
IntAct: Q8C2B3. 4 interactors.
MINT: MINT-1551781.
STRING: 10090.ENSMUSP00000112110.

Structure

3D structure databases

ProteinModelPortal: Q8C2B3.
SMR: Q8C2B3.

Family & Domains

Region

Feature key | Position(s) | Description | Evidence | Length
Region | 1 – 121 | Interaction with MEF2C | | 121
Region | 2 – 254 | Transcription repression 1 | | 253
Region | 72 – 172 | Interaction with MEF2A | 1 Publication | 101
Region | 241 – 533 | Transcription repression 2 | | 293
Region | 505 – 852 | Histone deacetylase | | 348
Region | 864 – 938 | Interaction with SIN3A | | 75
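
Because several of these regions overlap, it can help to ask which regions contain a given residue, for example a phosphorylation or metal-binding site listed elsewhere in this entry. The following is a minimal sketch using the coordinates from the table above; `REGIONS`, `region_length` and `regions_containing` are illustrative names chosen here.

```python
# (first residue, last residue, description), 1-based and inclusive,
# copied from the Region table of this entry.
REGIONS = [
    (1, 121, "Interaction with MEF2C"),
    (2, 254, "Transcription repression 1"),
    (72, 172, "Interaction with MEF2A"),
    (241, 533, "Transcription repression 2"),
    (505, 852, "Histone deacetylase"),
    (864, 938, "Interaction with SIN3A"),
]


def region_length(first, last):
    """Length of an inclusive 1-based range, as shown in the Length column."""
    return last - first + 1


def regions_containing(position):
    """Return the descriptions of all regions that include `position`."""
    return [name for first, last, name in REGIONS if first <= position <= last]


print(regions_containing(178))  # phosphoserine site
print(regions_containing(520))  # zinc-binding site
print([region_length(first, last) for first, last, _ in REGIONS])
```

The inclusive convention explains why the region 2 – 254 has length 253 rather than 252.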

Motif

Feature key | Position(s) | Description | Evidence | Length
Motif | 904 – 938 | Nuclear export signal | By similarity | 35

Compositional bias

Feature key | Position(s) | Description | Length
Compositional bias | 220 – 226 | Poly-Ser | 7

Domain

The nuclear export sequence mediates the shuttling between the nucleus and the cytoplasm. By similarity

Sequence similarities

Keywords - Domain

Repeat

Phylogenomic databases

eggNOG: KOG1343. Eukaryota.
COG0123. LUCA.
GeneTree: ENSGT00530000062809.
HOGENOM: HOG000232065.
HOVERGEN: HBG057100.
InParanoid: Q8C2B3.
KO: K11408.
OMA: AFRIVVM.
OrthoDB: EOG091G0EQO.
PhylomeDB: Q8C2B3.
TreeFam: TF106173.

Family and domain databases

Gene3D: 3.40.800.20. 1 hit.
InterPro: IPR000286. His_deacetylse.
IPR023801. His_deacetylse_dom.
PANTHER: PTHR10625. PTHR10625. 2 hits.
Pfam: PF00850. Hist_deacetyl. 1 hit.
PRINTS: PR01270. HDASUPER.

Sequences (6)

Sequence status: Complete.

This entry describes 6 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q8C2B3-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MHSPGAGCPA LQPDTPGSQP QPMDLRVGQR PTVEPPPEPA LLTLQHPQRL
        60         70         80         90        100
HRHLFLAGLH QQQRSAEPMR LSMDPPMPEL QGGQQEQELR QLLNKDKSKR
       110        120        130        140        150
SAVASSVVKQ KLAEVILKKQ QAALERTVHP SSPSIPYRTL EPLDTEGAAR
       160        170        180        190        200
SVLSSFLPPV PSLPTEPPEH FPLRKTVSEP NLKLRYKPKK SLERRKNPLL
       210        220        230        240        250
RKESAPPSLR RRPAETLGDS SPSSSSTPAS GCSSPNDSEH GPNPALGSEA
       260        270        280        290        300
DGDRRTHSTL GPRGPVLGNP HAPLFLHHGL EPEAGGTLPS RLQPILLLDP
       310        320        330        340        350
SVSHAPLWTV PGLGPLPFHF AQPLLTTERL SGSGLHRPLN RTRSEPLPPS
       360        370        380        390        400
ATASPLLAPL QPRQDRLKPH VQLIKPAISP PQRPAKPSEK PRLRQIPSAE
       410        420        430        440        450
DLETDGGGVG PMANDGLEHR ESGRGPPEGR GSISLQQHQQ VPPWEQQHLA
       460        470        480        490        500
GRLSQGSPGD SVLIPLAQVG HRPLSRTQSS PAAPVSLLSP EPTCQTQVLN
       510        520        530        540        550
SSETPATGLV YDSVMLKHQC SCGDNSKHPE HAGRIQSIWS RLQERGLRSQ
       560        570        580        590        600
CECLRGRKAS LEELQSVHSE RHVLLYGTNP LSRLKLDNGK LTGLLAQRTF
       610        620        630        640        650
VMLPCGGVGV DTDTIWNELH SSNAARWAAG SVTDLAFKVA SRELKNGFAV
       660        670        680        690        700
VRPPGHHADH STAMGFCFFN SVAIACRQLQ QHGKASKILI VDWDVHHGNG
       710        720        730        740        750
TQQTFYQDPS VLYISLHRHD DGNFFPGSGA VDEVGTGSGE GFNVNVAWAG
       760        770        780        790        800
GLDPPMGDPE YLAAFRIVVM PIAREFAPDL VLVSAGFDAA EGHPAPLGGY
       810        820        830        840        850
HVSAKCFGYM TQQLMNLAGG AVVLALEGGH DLTAICDASE ACVAALLGNK
       860        870        880        890        900
VDPLSEESWK QKPNLSAIRS LEAVVRVHRK YWGCMQRLAS CPDSWLPRVP
       910        920        930
GADAEVEAVT ALASLSVGIL AEDRPSERLV EEEEPMNL
Length: 938
Mass (Da): 101,287
Last modified: May 16, 2003 - v2
Checksum: 8D4B455CE6F95483

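The canonical sequence above is printed in blocks of ten residues with position rulers between the rows, so it has to be joined into one string before 1-based feature positions can be looked up. A minimal sketch of that step follows, using only the first two rows of the listing as sample input; `parse_listing` is a hypothetical helper name and the approach (dropping all-numeric lines) is an assumption about this layout, not a general parser.

```python
def parse_listing(text):
    """Join a grouped sequence listing into one string, skipping ruler lines."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Ruler lines hold only numbers; blank lines hold nothing.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)


# First two rows (residues 1-100) of the isoform 1 listing in this entry.
LISTING = """\
        10         20         30         40         50
MHSPGAGCPA LQPDTPGSQP QPMDLRVGQR PTVEPPPEPA LLTLQHPQRL
        60         70         80         90        100
HRHLFLAGLH QQQRSAEPMR LSMDPPMPEL QGGQQEQELR QLLNKDKSKR
"""

sequence = parse_listing(LISTING)
print(len(sequence))     # number of residues parsed from the sample
print(sequence[23 - 1])  # residue 23, converting the 1-based position to an index
```

Residue 23 is a methionine, the first residue retained in the isoforms described below as missing positions 1-22.
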
Isoform 2 (identifier: Q8C2B3-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-22: Missing.
     249-249: E → EALLGQRLRLQETSLAPFALPTVSLLPAITLGLPAPAR
     376-382: Missing.

Length: 946
Mass (Da): 102,289
Checksum: 035DE255A4F44F7E

Isoform 3 (identifier: Q8C2B3-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1-22: Missing.
     249-249: E → EALLGQRLRLQETSLAPFALPTVSLLPAITLGLPAPAR

Length: 953
Mass (Da): 102,980
Checksum: 4D87C0E22274202C

Isoform 4 (identifier: Q8C2B3-4)

The sequence of this isoform differs from the canonical sequence as follows:
     138-161: Missing.
     249-249: E → EALLGQRLRLQETSLAPFALPTVSLLPAITLGLPAPAR
     376-382: Missing.

Length: 944
Mass (Da): 101,910
Checksum: 5DD4D8F2A30A8C16

Isoform 5 (identifier: Q8C2B3-5)

The sequence of this isoform differs from the canonical sequence as follows:
     1-22: Missing.

Length: 916
Mass (Da): 99,131
Checksum: 1586B7308A24964A

Isoform 6 (identifier: Q8C2B3-6)

The sequence of this isoform differs from the canonical sequence as follows:
     138-161: Missing.

Length: 914
Mass (Da): 98,752
Checksum: 23AD714981D53FC7

Experimental Info

Feature key | Position(s) | Description | Evidence | Length
Sequence conflict | 169 | E → G in BAC27161 (PubMed:16141072). | Curated | 1
Sequence conflict | 183 | K → M in BAC29493 (PubMed:16141072). | Curated | 1
Sequence conflict | 228 | P → T in BAC27161 (PubMed:16141072). | Curated | 1
Sequence conflict | 487 | L → M in AAF31419 (PubMed:10640276). | Curated | 1
Sequence conflict | 487 | L → M in BAC40598 (PubMed:16141072). | Curated | 1
Sequence conflict | 487 | L → M in BAC40666 (PubMed:16141072). | Curated | 1
Sequence conflict | 645 | K → R in BAC29493 (PubMed:16141072). | Curated | 1
Sequence conflict | 661 | S → P in BAC40598 (PubMed:16141072). | Curated | 1
Sequence conflict | 737 | G → A in AAF31419 (PubMed:10640276). | Curated | 1

Alternative sequence

Feature key | Feature ID | Position(s) | Description | Evidence | Length
Alternative sequence | VSP_007432 | 1 – 22 | Missing in isoform 2, isoform 3 and isoform 5. | 1 Publication | 22
Alternative sequence | VSP_007433 | 138 – 161 | Missing in isoform 4 and isoform 6. | 1 Publication | 24
Alternative sequence | VSP_007434 | 249 | E → EALLGQRLRLQETSLAPFALPTVSLLPAITLGLPAPAR in isoform 2, isoform 3 and isoform 4. | 1 Publication | 1
Alternative sequence | VSP_007435 | 376 – 382 | Missing in isoform 2 and isoform 4. | 1 Publication | 7
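
The isoform lengths reported in this entry follow from the canonical length and the alternative sequence features: each deleted range shortens the protein by its length, and the substitution at position 249 replaces one residue with a 38-residue segment. A small sketch of that arithmetic is shown below; the variant labels (`del_1_22` and so on) are made-up names for this example, and the isoform-to-variant mapping is read from the isoform descriptions above.

```python
CANONICAL_LENGTH = 938

# Replacement segment for position 249, copied from this entry.
INSERT_249 = "EALLGQRLRLQETSLAPFALPTVSLLPAITLGLPAPAR"

# Hypothetical labels -> (first position, last position, replacement length).
# A "Missing" range has a replacement length of 0.
VARIANTS = {
    "del_1_22": (1, 22, 0),
    "del_138_161": (138, 161, 0),
    "sub_249": (249, 249, len(INSERT_249)),
    "del_376_382": (376, 382, 0),
}

ISOFORMS = {
    "Isoform 1": [],
    "Isoform 2": ["del_1_22", "sub_249", "del_376_382"],
    "Isoform 3": ["del_1_22", "sub_249"],
    "Isoform 4": ["del_138_161", "sub_249", "del_376_382"],
    "Isoform 5": ["del_1_22"],
    "Isoform 6": ["del_138_161"],
}


def isoform_length(variant_names):
    """Canonical length adjusted by the net change of each variant."""
    length = CANONICAL_LENGTH
    for name in variant_names:
        first, last, replacement_length = VARIANTS[name]
        length += replacement_length - (last - first + 1)
    return length


for isoform, variant_names in ISOFORMS.items():
    print(isoform, isoform_length(variant_names))
```

The computed values agree with the Length fields listed for each isoform (938, 946, 953, 944, 916 and 914).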

Sequence databases

EMBL / GenBank / DDBJ:
AF207749 mRNA. Translation: AAF31419.1.
AK030863 mRNA. Translation: BAC27161.1.
AK036586 mRNA. Translation: BAC29493.1.
AK044287 mRNA. Translation: BAC31856.1.
AK088828 mRNA. Translation: BAC40598.1.
AK088945 mRNA. Translation: BAC40666.1.
BC057332 mRNA. Translation: AAH57332.1.
CCDS: CCDS37188.1. [Q8C2B3-1]
CCDS57004.1. [Q8C2B3-5]
CCDS57005.1. [Q8C2B3-2]
CCDS57006.1. [Q8C2B3-3]
CCDS57007.1. [Q8C2B3-4]
RefSeq: NP_001191204.1. NM_001204275.1. [Q8C2B3-3]
NP_001191205.1. NM_001204276.1. [Q8C2B3-2]
NP_001191206.1. NM_001204277.1. [Q8C2B3-4]
NP_001191207.1. NM_001204278.1. [Q8C2B3-5]
NP_062518.2. NM_019572.3. [Q8C2B3-1]
XP_006521268.1. XM_006521205.2.
XP_006521270.1. XM_006521207.3. [Q8C2B3-3]
XP_006521271.1. XM_006521208.3. [Q8C2B3-3]
XP_006521272.1. XM_006521209.2. [Q8C2B3-3]
XP_006521273.1. XM_006521210.2. [Q8C2B3-3]
UniGene: Mm.384027.

Genome annotation databases

Ensembl: ENSMUST00000079838; ENSMUSP00000078766; ENSMUSG00000022475. [Q8C2B3-4]
ENSMUST00000088402; ENSMUSP00000085744; ENSMUSG00000022475. [Q8C2B3-1]
ENSMUST00000116408; ENSMUSP00000112109; ENSMUSG00000022475. [Q8C2B3-5]
ENSMUST00000116409; ENSMUSP00000112110; ENSMUSG00000022475. [Q8C2B3-3]
ENSMUST00000118294; ENSMUSP00000113380; ENSMUSG00000022475. [Q8C2B3-2]
GeneID: 56233.
KEGG: mmu:56233.
UCSC: uc007xle.2. mouse. [Q8C2B3-1]
uc007xlf.2. mouse. [Q8C2B3-2]
uc007xlg.2. mouse. [Q8C2B3-4]
uc007xlh.2. mouse. [Q8C2B3-3]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 51564.
MGI: MGI:1891835. Hdac7.

Miscellaneous databases

ChiTaRS: Hdac7. mouse.
PMAP-CutDB: Q8C2B3.
PRO: Q8C2B3.

Entry information

Entry name: HDAC7_MOUSE
Accession: Primary (citable) accession number: Q8C2B3
Secondary accession number(s): Q8C2C9, Q8C8X4, Q8CB80, Q8CDA3, Q9JL72
Entry history
Integrated into UniProtKB/Swiss-Prot: May 16, 2003
Last sequence update: May 16, 2003
Last modified: November 2, 2016
This is version 145 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
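
Accessions such as Q8C2B3 and isoform identifiers such as Q8C2B3-4 appear throughout this entry, so a small validator can be handy when collecting them from text. The sketch below uses the accession pattern given in the UniProt help pages; the optional `-N` isoform suffix handling and the function name `split_identifier` are additions made for this example and should be treated as assumptions.

```python
import re

# UniProtKB accession pattern, followed by an optional isoform suffix ("-4").
# The suffix group is an assumption added for this example.
IDENTIFIER = re.compile(
    r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
    r"(?:-(\d+))?"
)


def split_identifier(identifier):
    """Return (accession, isoform number or None), or None if malformed."""
    match = IDENTIFIER.fullmatch(identifier)
    if match is None:
        return None
    accession = identifier.split("-")[0]
    isoform = match.group(1)
    return accession, (int(isoform) if isoform is not None else None)


for candidate in ["Q8C2B3", "Q8C2B3-4", "Q9JL72", "HDAC7_MOUSE"]:
    print(candidate, split_identifier(candidate))
```

Entry names such as HDAC7_MOUSE are a different kind of identifier and are rejected by this pattern on purpose.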

Miscellaneous

Miscellaneous

Its activity is inhibited by Trichostatin A (TSA), a known histone deacetylase inhibitor.

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
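
The identity and overlap thresholds quoted above can be written as a simple membership test. This is only a sketch: it assumes the identity and overlap of a candidate against the seed sequence have already been computed by some alignment step, and `THRESHOLDS` and `joins_cluster` are illustrative names, not part of any UniRef software.

```python
# (minimum sequence identity, minimum overlap with the seed), as fractions,
# taken from the UniRef90 and UniRef50 descriptions above.
THRESHOLDS = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def joins_cluster(level, identity, overlap):
    """Return True if a candidate meets both thresholds for the given level."""
    min_identity, min_overlap = THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


# Hypothetical candidate: 93% identical to the seed over 85% of its length.
print(joins_cluster("UniRef90", 0.93, 0.85))
# Hypothetical candidate: 93% identical but covering only 60% of the seed.
print(joins_cluster("UniRef90", 0.93, 0.60))
```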