Protein

SWI/SNF complex subunit SMARCC2

Gene

SMARCC2

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Involved in transcriptional activation and repression of select genes by chromatin remodeling (alteration of DNA-nucleosome topology). Can stimulate the ATPase activity of the catalytic subunit of these complexes. May be required for CoREST-dependent repression of neuronal-specific gene promoters in non-neuronal cells. Belongs to the neural progenitors-specific chromatin remodeling complex (npBAF complex) and the neuron-specific chromatin remodeling complex (nBAF complex). During neural development, a switch from a stem/progenitor to a post-mitotic chromatin remodeling mechanism occurs as neurons exit the cell cycle and become committed to their adult state. The transition from proliferating neural stem/progenitor cells to post-mitotic neurons requires a switch in subunit composition of the npBAF and nBAF complexes. As neural progenitors exit mitosis and differentiate into neurons, npBAF complexes, which contain ACTL6A/BAF53A and PHF10/BAF45A, are exchanged for homologous alternative ACTL6B/BAF53B and DPF1/BAF45B or DPF3/BAF45C subunits in neuron-specific complexes (nBAF). The npBAF complex is essential for the self-renewal/proliferative capacity of the multipotent neural stem cells. The nBAF complex, along with CREST, plays a role in regulating the activity of genes essential for dendrite growth (By similarity; 1 publication).

GO - Molecular function

  • chromatin binding Source: Ensembl
  • DNA binding Source: InterPro
  • transcription coactivator activity Source: BHF-UCL

Keywords - Molecular function

Chromatin regulator

Keywords - Biological process

Neurogenesis, Transcription, Transcription regulation

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000139613-MONOMER.
Reactome: R-HSA-3214858. RMTs methylate histone arginines.
SIGNOR: Q8TAQ2.

Names & Taxonomy

Protein names
Recommended name: SWI/SNF complex subunit SMARCC2
Alternative name(s):
  • BRG1-associated factor 170 (short name: BAF170)
  • SWI/SNF complex 170 kDa subunit
  • SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily C member 2
Gene names
Name: SMARCC2
Synonyms: BAF170
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 12

Organism-specific databases

HGNC: HGNC:11105. SMARCC2.

Subcellular location

GO - Cellular component

  • nBAF complex Source: UniProtKB
  • npBAF complex Source: UniProtKB
  • nuclear chromatin Source: UniProtKB
  • nucleoplasm Source: HPA
  • protein complex Source: UniProtKB
  • SWI/SNF complex Source: UniProtKB

Keywords - Cellular component

Nucleus

Pathology & Biotech

Organism-specific databases

DisGeNET: 6601.
OpenTargets: ENSG00000139613.
PharmGKB: PA35955.

Polymorphism and mutation databases

BioMuta: SMARCC2.
DMDM: 57012959.

PTM / Processing

Molecule processing

Feature key  Feature identifier  Position(s)  Length  Description
Chain        PRO_0000197117      1 – 1214     1214    SWI/SNF complex subunit SMARCC2

Amino acid modifications

Feature key       Position(s)  Length  Evidence          Description
Modified residue  283          1       Combined sources  Phosphoserine
Modified residue  286          1       Combined sources  Phosphoserine
Modified residue  302          1       Combined sources  Phosphoserine
Modified residue  304          1       Combined sources  Phosphoserine
Modified residue  306          1       Combined sources  Phosphoserine
Modified residue  326          1       Combined sources  N6-acetyllysine
Modified residue  347          1       Combined sources  Phosphoserine
Modified residue  387          1       By similarity     Phosphoserine
Modified residue  548          1       Combined sources  Phosphothreonine
Cross-link        566                  Combined sources  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)
Cross-link        704                  Combined sources  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)
Modified residue  813          1       Combined sources  Phosphoserine
Cross-link        848                  Combined sources  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)
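
As a quick consistency check on the modification table, the annotated positions can be compared with the canonical sequence. The sketch below is illustrative only: it copies residues 251-350 from the isoform 1 listing in this entry, and the names `SEGMENT`, `SEGMENT_START`, `EXPECTED` and `residue_at` are hypothetical helpers, not part of any UniProt tool.

```python
# Residues 251-350 of the canonical isoform, copied from this entry's sequence listing.
SEGMENT_START = 251
SEGMENT = (
    "WMNEEDYEVNDDKNPVSRRKKISAKTLTDEVNSPDSDRRDKKGGNYKKRK"
    "RSPSPSPTPEAKKKNAKKGPSTPYTKSKRGHREEEQEDLTKDMDEPSPVP"
)

# Position -> residue implied by the modification name
# (S for phosphoserine, K for N6-acetyllysine).
EXPECTED = {283: "S", 286: "S", 302: "S", 304: "S", 306: "S", 326: "K", 347: "S"}


def residue_at(position: int) -> str:
    """Return the residue at a 1-based position that lies inside SEGMENT."""
    return SEGMENT[position - SEGMENT_START]


# Collect any annotated position whose residue does not match the expected type.
mismatches = {p: residue_at(p) for p, aa in EXPECTED.items() if residue_at(p) != aa}
print(mismatches)
```

An empty dictionary means every listed position in this window carries the residue type that its modification name implies. Positions outside 251-350 (387, 548, 813 and the cross-links) would need the corresponding rows of the listing added to `SEGMENT`.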

Keywords - PTM

Acetylation, Isopeptide bond, Phosphoprotein, Ubl conjugation

Proteomic databases

EPD: Q8TAQ2.
MaxQB: Q8TAQ2.
PaxDb: Q8TAQ2.
PeptideAtlas: Q8TAQ2.
PRIDE: Q8TAQ2.

PTM databases

iPTMnet: Q8TAQ2.
PhosphoSitePlus: Q8TAQ2.

Expression

Tissue specificity

Ubiquitously expressed.

Gene expression databases

Bgee: ENSG00000139613.
CleanEx: HS_SMARCC2.
ExpressionAtlas: Q8TAQ2. baseline and differential.
Genevisible: Q8TAQ2. HS.

Organism-specific databases

HPA: CAB004321, HPA021213, HPA061788.

Interaction

Subunit structure

Component of a number of multiprotein chromatin-remodeling complexes: Swi/Snf-A (BAF), Swi/Snf-B (PBAF), Brm, Brg1(I) and Brg1(II). Each of the complexes contains a catalytic subunit (either SMARCA4 or SMARCA2), and at least SMARCE1, ACTL6A/BAF53A or ACTL6B/BAF53B, SMARCC1 and SMARCB1. Other subunits specific to each of the complexes may also be present. Component of the BAF complex, which includes at least actin (ACTB), ARID1A, ARID1B/BAF250, SMARCA2, SMARCA4/BRG1, ACTL6A/BAF53, ACTL6B/BAF53B, SMARCE1/BAF57, SMARCC1/BAF155, SMARCC2/BAF170, SMARCB1/SNF5/INI1, and one or more of SMARCD1/BAF60A, SMARCD2/BAF60B, or SMARCD3/BAF60C. In muscle cells, the BAF complex also contains DPF3. May also interact with the SIN3A histone deacetylase transcription repressor complex in conjunction with SMARCA2 and SMARCA4. Interacts with SMARD1. Component of the neural progenitors-specific chromatin remodeling complex (npBAF complex), composed of at least ARID1A/BAF250A or ARID1B/BAF250B, SMARCD1/BAF60A, SMARCD3/BAF60C, SMARCA2/BRM/BAF190B, SMARCA4/BRG1/BAF190A, SMARCB1/BAF47, SMARCC1/BAF155, SMARCE1/BAF57, SMARCC2/BAF170, PHF10/BAF45A, ACTL6A/BAF53A and actin. Component of the neuron-specific chromatin remodeling complex (nBAF complex), composed of at least ARID1A/BAF250A or ARID1B/BAF250B, SMARCD1/BAF60A, SMARCD3/BAF60C, SMARCA2/BRM/BAF190B, SMARCA4/BRG1/BAF190A, SMARCB1/BAF47, SMARCC1/BAF155, SMARCE1/BAF57, SMARCC2/BAF170, DPF1/BAF45B, DPF3/BAF45C, ACTL6B/BAF53B and actin (By similarity).

Binary interactions

With     Entry   #Exp.  IntAct                  Notes
TERF2IP  Q9NYB0  2      EBI-357418, EBI-750109

Protein-protein interaction databases

BioGrid: 112485. 143 interactors.
DIP: DIP-27611N.
IntAct: Q8TAQ2. 55 interactors.
MINT: MINT-1151033.
STRING: 9606.ENSP00000267064.

Structure

3D structure databases

ProteinModelPortal: Q8TAQ2.
SMR: Q8TAQ2.
ModBase: Search...
MobiDB: Search...

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Description  Evidence
Domain       424 – 521    98      SWIRM        PROSITE-ProRule annotation
Domain       596 – 647    52      SANT         PROSITE-ProRule annotation

Coiled coil

Feature key  Position(s)  Length  Evidence
Coiled coil  907 – 934    28      Sequence analysis

Compositional bias

Feature key         Position(s)  Length  Description
Compositional bias  186 – 189    4       Poly-Glu
Compositional bias  747 – 855    109     Glu-rich
Compositional bias  861 – 870    10      Poly-Ala
Compositional bias  956 – 960    5       Poly-Gln
Compositional bias  961 – 1213   253     Pro-rich
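
The Length column in these feature tables follows from the positions, which are 1-based and inclusive at both ends. A minimal sketch of that arithmetic, using the compositional bias rows as data (`feature_length` and `COMPOSITIONAL_BIAS` are names made up for this illustration):

```python
def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive start and end positions."""
    return end - start + 1


# (description, start, end, length as listed in the compositional bias table)
COMPOSITIONAL_BIAS = [
    ("Poly-Glu", 186, 189, 4),
    ("Glu-rich", 747, 855, 109),
    ("Poly-Ala", 861, 870, 10),
    ("Poly-Gln", 956, 960, 5),
    ("Pro-rich", 961, 1213, 253),
]

# Recompute every length and compare it with the listed value.
computed = {name: feature_length(s, e) for name, s, e, _ in COMPOSITIONAL_BIAS}
all_match = all(feature_length(s, e) == listed for _, s, e, listed in COMPOSITIONAL_BIAS)
print(computed, all_match)
```

The same function reproduces the domain lengths: `feature_length(424, 521)` gives 98 for SWIRM and `feature_length(596, 647)` gives 52 for SANT. When slicing a Python string that holds the full sequence, the equivalent slice would be `seq[start - 1:end]`.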

Sequence similarities

Belongs to the SMARCC family. (Curated)
Contains 1 SANT domain. (PROSITE-ProRule annotation)
Contains 1 SWIRM domain. (PROSITE-ProRule annotation)

Keywords - Domain

Coiled coil

Phylogenomic databases

eggNOG: KOG1279. Eukaryota.
        COG5259. LUCA.
GeneTree: ENSGT00390000018166.
HOGENOM: HOG000047736.
HOVERGEN: HBG054849.
InParanoid: Q8TAQ2.
KO: K11649.
PhylomeDB: Q8TAQ2.
TreeFam: TF314710.

Family and domain databases

Gene3D: 1.10.10.10. 1 hit.
        1.10.10.60. 1 hit.
InterPro: IPR001357. BRCT_dom.
          IPR000953. Chromo/shadow_dom.
          IPR009057. Homeodomain-like.
          IPR001005. SANT/Myb.
          IPR017884. SANT_dom.
          IPR030092. SMARCC2(BAF170).
          IPR032451. SMARCC_C.
          IPR032450. SMARCC_N.
          IPR007526. SWIRM.
          IPR032448. SWIRM-assoc.
          IPR011991. WHTH_DNA-bd_dom.
PANTHER: PTHR12802:SF38. PTHR12802:SF38. 1 hit.
Pfam: PF00249. Myb_DNA-binding. 1 hit.
      PF04433. SWIRM. 1 hit.
      PF16495. SWIRM-assoc_1. 1 hit.
      PF16496. SWIRM-assoc_2. 1 hit.
      PF16498. SWIRM-assoc_3. 1 hit.
SMART: SM00298. CHROMO. 1 hit.
       SM00717. SANT. 1 hit.
SUPFAM: SSF46689. SSF46689. 2 hits.
        SSF52113. SSF52113. 2 hits.
PROSITE: PS51293. SANT. 1 hit.
         PS50934. SWIRM. 1 hit.

Sequences (3)

Sequence status: Complete.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q8TAQ2-1)
Also known as: SMARCC2a

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MAVRKKDGGP NVKYYEAADT VTQFDNVRLW LGKNYKKYIQ AEPPTNKSLS
        60         70         80         90        100
SLVVQLLQFQ EEVFGKHVSN APLTKLPIKC FLDFKAGGSL CHILAAAYKF
       110        120        130        140        150
KSDQGWRRYD FQNPSRMDRN VEMFMTIEKS LVQNNCLSRP NIFLCPEIEP
       160        170        180        190        200
KLLGKLKDII KRHQGTVTED KNNASHVVYP VPGNLEEEEW VRPVMKRDKQ
       210        220        230        240        250
VLLHWGYYPD SYDTWIPASE IEASVEDAPT PEKPRKVHAK WILDTDTFNE
       260        270        280        290        300
WMNEEDYEVN DDKNPVSRRK KISAKTLTDE VNSPDSDRRD KKGGNYKKRK
       310        320        330        340        350
RSPSPSPTPE AKKKNAKKGP STPYTKSKRG HREEEQEDLT KDMDEPSPVP
       360        370        380        390        400
NVEEVTLPKT VNTKKDSESA PVKGGTMTDL DEQEDESMET TGKDEDENST
       410        420        430        440        450
GNKGEQTKNP DLHEDNVTEQ THHIIIPSYA AWFDYNSVHA IERRALPEFF
       460        470        480        490        500
NGKNKSKTPE IYLAYRNFMI DTYRLNPQEY LTSTACRRNL AGDVCAIMRV
       510        520        530        540        550
HAFLEQWGLI NYQVDAESRP TPMGPPPTSH FHVLADTPSG LVPLQPKTPQ
       560        570        580        590        600
QTSASQQMLN FPDKGKEKPT DMQNFGLRTD MYTKKNVPSK SKAAASATRE
       610        620        630        640        650
WTEQETLLLL EALEMYKDDW NKVSEHVGSR TQDECILHFL RLPIEDPYLE
       660        670        680        690        700
DSEASLGPLA YQPIPFSQSG NPVMSTVAFL ASVVDPRVAS AAAKSALEEF
       710        720        730        740        750
SKMKEEVPTA LVEAHVRKVE EAAKVTGKAD PAFGLESSGI AGTTSDEPER
       760        770        780        790        800
IEESGNDEAR VEGQATDEKK EPKEPREGGG AIEEEAKEKT SEAPKKDEEK
       810        820        830        840        850
GKEGDSEKES EKSDGDPIVD PEKEKEPKEG QEEVLKEVVE SEGERKTKVE
       860        870        880        890        900
RDIGEGNLST AAAAALAAAA VKAKHLAAVE ERKIKSLVAL LVETQMKKLE
       910        920        930        940        950
IKLRHFEELE TIMDREREAL EYQRQQLLAD RQAFHMEQLK YAEMRARQQH
       960        970        980        990       1000
FQQMHQQQQQ PPPALPPGSQ PIPPTGAAGP PAVHGLAVAP ASVVPAPAGS
      1010       1020       1030       1040       1050
GAPPGSLGPS EQIGQAGSTA GPQQQQPAGA PQPGAVPPGV PPPGPHGPSP
      1060       1070       1080       1090       1100
FPNQQTPPSM MPGAVPGSGH PGVAGNAPLG LPFGMPPPPP PPAPSIIPFG
      1110       1120       1130       1140       1150
SLADSISINL PAPPNLHGHH HHLPFAPGTL PPPNLPVSMA NPLHPNLPAT
      1160       1170       1180       1190       1200
TTMPSSLPLG PGLGSAAAQS PAIVAAVQGN LLPSASPLPD PGTPLPPDPT
      1210
APSPGTVTPV PPPQ
Length: 1,214
Mass (Da): 132,879
Last modified: June 1, 2002 - v1
Checksum: EEFC1042A9296320
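
The isoform 1 listing interleaves position rulers with residue blocks of ten, so it has to be cleaned before the sequence can be used programmatically. The following is a rough sketch of one way to do that, shown on the first two rows only; `parse_sequence_listing` is a hypothetical helper written for this example, and the ruler test (a line made up of digits and whitespace) is an assumption about this page layout rather than a documented format.

```python
import re

# First two rows of the isoform 1 listing, as they appear in this entry.
RAW = """\
        10         20         30         40         50
MAVRKKDGGP NVKYYEAADT VTQFDNVRLW LGKNYKKYIQ AEPPTNKSLS
60 70 80 90 100
SLVVQLLQFQ EEVFGKHVSN APLTKLPIKC FLDFKAGGSL CHILAAAYKF
"""


def parse_sequence_listing(text: str) -> str:
    """Drop ruler lines (digits and whitespace only) and join the residue blocks."""
    residues = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or re.fullmatch(r"[\d\s]+", stripped):
            continue  # blank line or position ruler
        residues.append(stripped.replace(" ", ""))
    return "".join(residues)


sequence = parse_sequence_listing(RAW)
print(len(sequence), sequence[:10])
```

Applied to the complete listing, the result should have the stated length of 1,214 residues, which makes a convenient sanity check after copying.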

Isoform 2 (identifier: Q8TAQ2-2)
Also known as: SMARCC2b

The sequence of this isoform differs from the canonical sequence as follows:
     550-550: Q → QGRQVDADTKAGRKGKELDDLVPETAKGKPEL
     1075-1189: Missing.

Length: 1,130
Mass (Da): 124,841
Checksum: 30107786CB04BC1E

Isoform 3 (identifier: Q8TAQ2-3)

The sequence of this isoform differs from the canonical sequence as follows:
     550-550: Q → QGRQVDADTKAGRKGKELDDLVPETAKGKPEL
     1075-1167: Missing.

Note: No experimental confirmation available.
Length: 1,152
Mass (Da): 126,924
Checksum: 70504763A7CA887C
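
The isoform lengths can be derived from the canonical length and the listed differences: the single Q at position 550 is replaced by a 32-residue stretch, and a range is removed. A small sketch of that bookkeeping (the function `isoform_length` and its argument layout are assumptions made for this example):

```python
CANONICAL_LENGTH = 1214
# Replacement for the single Q at position 550 in isoforms 2 and 3.
INSERT = "QGRQVDADTKAGRKGKELDDLVPETAKGKPEL"


def isoform_length(canonical_length, replacements, deletions):
    """
    Compute an isoform length from variant descriptions.

    replacements: list of (start, end, new_sequence), 1-based inclusive
    deletions:    list of (start, end), 1-based inclusive
    """
    length = canonical_length
    for start, end, new_seq in replacements:
        length += len(new_seq) - (end - start + 1)
    for start, end in deletions:
        length -= end - start + 1
    return length


isoform_2 = isoform_length(CANONICAL_LENGTH, [(550, 550, INSERT)], [(1075, 1189)])
isoform_3 = isoform_length(CANONICAL_LENGTH, [(550, 550, INSERT)], [(1075, 1167)])
print(isoform_2, isoform_3)  # 1130 1152
```

Both values agree with the lengths given for isoform 2 and isoform 3 in this entry.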

Sequence caution

The sequence BAD92243 differs from that shown. Reason: Erroneous initiation. Translation N-terminally shortened. (Curated)

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  311 – 316    6       AKKKNA → VKEEKC in AAC50694 (PubMed:8804307). Curated
Sequence conflict  498          1       M → S in AAC50694 (PubMed:8804307). Curated
Sequence conflict  587          1       V → A in AAC50694 (PubMed:8804307). Curated
Sequence conflict  876          1       L → F in AAP88926 (Ref. 2). Curated
Sequence conflict  876          1       L → F in AAH13045 (PubMed:15489334). Curated
Sequence conflict  942          1       A → P in AAC50694 (PubMed:8804307). Curated
Sequence conflict  1020         1       A → R in AAC50694 (PubMed:8804307). Curated
Sequence conflict  1117 – 1126  10      HGHHHHLPFA → MGSPPSPVR in AAC50694 (PubMed:8804307). Curated

Alternative sequence

Feature key           Feature identifier  Position(s)  Length  Description
Alternative sequence  VSP_012490          550          1       Q → QGRQVDADTKAGRKGKELDDLVPETAKGKPEL in isoform 2 and isoform 3. 3 Publications
Alternative sequence  VSP_012491          1075 – 1189  115     Missing in isoform 2. 2 Publications
Alternative sequence  VSP_044647          1075 – 1167  93      Missing in isoform 3. 1 Publication

Sequence databases

EMBL / GenBank / DDBJ:
U66616 mRNA. Translation: AAC50694.1.
BT009924 mRNA. Translation: AAP88926.1.
AB209006 mRNA. Translation: BAD92243.1. Different initiation.
AC073896 Genomic DNA. No translation available.
BC009067 mRNA. Translation: AAH09067.1.
BC013045 mRNA. Translation: AAH13045.1.
BC026222 mRNA. Translation: AAH26222.1.
CCDS: CCDS55835.1. [Q8TAQ2-3]
      CCDS8907.1. [Q8TAQ2-1]
      CCDS8908.1. [Q8TAQ2-2]
RefSeq: NP_001123892.1. NM_001130420.2. [Q8TAQ2-3]
        NP_001317217.1. NM_001330288.1.
        NP_003066.2. NM_003075.4. [Q8TAQ2-1]
        NP_620706.1. NM_139067.3. [Q8TAQ2-2]
UniGene: Hs.236030.
         Hs.632717.

Genome annotation databases

Ensembl: ENST00000267064; ENSP00000267064; ENSG00000139613. [Q8TAQ2-1]
         ENST00000347471; ENSP00000302919; ENSG00000139613. [Q8TAQ2-2]
         ENST00000394023; ENSP00000377591; ENSG00000139613. [Q8TAQ2-3]
GeneID: 6601.
KEGG: hsa:6601.
UCSC: uc001ska.4. human. [Q8TAQ2-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Protocols and materials databases

DNASU: 6601.
Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 6601.
DisGeNET: 6601.
GeneCards: SMARCC2.
HGNC: HGNC:11105. SMARCC2.
HPA: CAB004321, HPA021213, HPA061788.
MIM: 601734. gene.
neXtProt: NX_Q8TAQ2.
OpenTargets: ENSG00000139613.
PharmGKB: PA35955.
GenAtlas: Search...

Miscellaneous databases

ChiTaRS: SMARCC2. human.
GeneWiki: SMARCC2.
GenomeRNAi: 6601.
PRO: Q8TAQ2.
SOURCE: Search...

Family and domain databases

ProtoNet: Search...

Entry information

Entry name: SMRC2_HUMAN
Accession:
Primary (citable) accession number: Q8TAQ2
Secondary accession number(s): F8VTJ5, Q59GV3, Q92923, Q96E12, Q96GY4
Entry history:
Integrated into UniProtKB/Swiss-Prot: January 4, 2005
Last sequence update: June 1, 2002
Last modified: November 30, 2016
This is version 149 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 12
    Human chromosome 12: entries, gene names and cross-references to MIM
  2. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.