Protein

Androgen receptor

Gene

Ar

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Steroid hormone receptors are ligand-activated transcription factors that regulate eukaryotic gene expression and affect cellular proliferation and differentiation in target tissues. Transcription factor activity is modulated by bound coactivator and corepressor proteins. Transcription activation is down-regulated by NR0B2. Activated, but not phosphorylated, by HIPK3 and ZIPK/DAPK3 (By similarity).

Sites

Feature key | Position(s) | Description | Length
Binding site | 685 | Androgen | 1
Binding site | 732 | Androgen | 1
Binding site | 857 | Androgen | 1

Regions

Feature key | Position(s) | Description | Length
DNA binding | 538 – 611 | Nuclear receptor (PROSITE-ProRule annotation) | 74
Zinc finger | 539 – 559 | NR C4-type (PROSITE-ProRule annotation) | 21
Zinc finger | 575 – 599 | NR C4-type (PROSITE-ProRule annotation) | 25
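
The Length column in these feature tables follows from the inclusive, 1-based residue range. A minimal sketch of that arithmetic in Python, using the three ranges listed above (the function name `feature_length` and the labels in `REGIONS` are illustrative assumptions, not part of the entry):

```python
def feature_length(start: int, end: int) -> int:
    """Number of residues in an inclusive, 1-based range such as 538 - 611."""
    if start < 1 or end < start:
        raise ValueError(f"invalid range: {start}-{end}")
    return end - start + 1


# Ranges copied from the Regions table; the labels are illustrative only.
REGIONS = {
    "DNA binding": (538, 611),
    "Zinc finger 1": (539, 559),
    "Zinc finger 2": (575, 599),
}

lengths = {name: feature_length(s, e) for name, (s, e) in REGIONS.items()}
for name, length in lengths.items():
    print(f"{name}: {length}")
```

Both zinc fingers fall inside the DNA-binding range, which is why their lengths do not add up to the 74 residues reported for the DNA-binding feature.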

GO - Molecular function

GO - Biological process

  • activation of prostate induction by androgen receptor signaling pathway Source: MGI
  • androgen receptor signaling pathway Source: MGI
  • animal organ formation Source: MGI
  • cellular process Source: MGI
  • epithelial cell differentiation involved in prostate gland development Source: MGI
  • epithelial cell morphogenesis Source: MGI
  • fertilization Source: MGI
  • intracellular receptor signaling pathway Source: BHF-UCL
  • in utero embryonic development Source: MGI
  • lateral sprouting involved in mammary gland duct morphogenesis Source: MGI
  • Leydig cell differentiation Source: MGI
  • male genitalia morphogenesis Source: MGI
  • male gonad development Source: MGI
  • male somatic sex determination Source: MGI
  • mammary gland alveolus development Source: MGI
  • morphogenesis of an epithelial fold Source: MGI
  • multicellular organism growth Source: MGI
  • negative regulation of cell proliferation Source: MGI
  • negative regulation of epithelial cell proliferation Source: MGI
  • negative regulation of extrinsic apoptotic signaling pathway Source: BHF-UCL
  • negative regulation of integrin biosynthetic process Source: UniProtKB
  • positive regulation of cell differentiation Source: MGI
  • positive regulation of cell proliferation Source: UniProtKB
  • positive regulation of gene expression Source: MGI
  • positive regulation of insulin-like growth factor receptor signaling pathway Source: MGI
  • positive regulation of integrin biosynthetic process Source: MGI
  • positive regulation of intracellular estrogen receptor signaling pathway Source: MGI
  • positive regulation of MAPK cascade Source: MGI
  • positive regulation of NF-kappaB transcription factor activity Source: UniProtKB
  • positive regulation of phosphorylation Source: UniProtKB
  • positive regulation of transcription, DNA-templated Source: UniProtKB
  • positive regulation of transcription from RNA polymerase III promoter Source: MGI
  • positive regulation of transcription from RNA polymerase II promoter Source: UniProtKB
  • prostate gland epithelium morphogenesis Source: MGI
  • prostate gland growth Source: MGI
  • protein oligomerization Source: MGI
  • regulation of catalytic activity Source: MGI
  • regulation of developmental growth Source: MGI
  • regulation of establishment of protein localization to plasma membrane Source: UniProtKB
  • regulation of gene expression Source: MGI
  • regulation of prostatic bud formation Source: MGI
  • regulation of systemic arterial blood pressure Source: MGI
  • regulation of transcription, DNA-templated Source: MGI
  • regulation of transcription from RNA polymerase II promoter Source: MGI
  • reproductive structure development Source: MGI
  • reproductive system development Source: MGI
  • seminiferous tubule development Source: MGI
  • single fertilization Source: MGI
  • spermatogenesis Source: MGI
  • tertiary branching involved in mammary gland duct morphogenesis Source: MGI
  • transcription, DNA-templated Source: UniProtKB

Keywords - Molecular function

Receptor

Keywords - Biological process

Transcription, Transcription regulation

Keywords - Ligand

DNA-binding, Lipid-binding, Metal-binding, Steroid-binding, Zinc

Enzyme and pathway databases

Reactome: R-MMU-383280. Nuclear Receptor transcription pathway.
R-MMU-5625886. Activated PKN1 stimulates transcription of AR (androgen receptor) regulated genes KLK2 and KLK3.
R-MMU-5689880. Ub-specific processing proteases.

Names & Taxonomy

Protein names
Recommended name:
Androgen receptor
Alternative name(s):
Dihydrotestosterone receptor
Nuclear receptor subfamily 3 group C member 4
Gene names
Name: Ar
Synonyms: Nr3c4
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome X

Organism-specific databases

MGI: MGI:88064. Ar.

Subcellular location

  • Nucleus (By similarity)
  • Cytoplasm (By similarity)

  • Note: Predominantly cytoplasmic in unligated form but translocates to the nucleus upon ligand-binding. Can also translocate to the nucleus in unligated form in the presence of RACK1 (By similarity).

GO - Cellular component

  • cytoplasm Source: UniProtKB
  • nuclear chromatin Source: UniProtKB
  • nucleus Source: UniProtKB
  • plasma membrane Source: MGI
  • protein complex Source: MGI

Keywords - Cellular component

Cytoplasm, Nucleus

Pathology & Biotech

Chemistry databases

ChEMBL: CHEMBL3056.

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Length
Chain | 1 – 899 | Androgen receptor (PRO_0000053707) | 899

Amino acid modifications

Feature key | Position(s) | Description | Length
Modified residue | 61 | Phosphoserine; by CDK9 (By similarity) | 1
Modified residue | 75 | Phosphoserine (By similarity) | 1
Modified residue | 218 | Phosphotyrosine; by CSK (By similarity) | 1
Modified residue | 251 | Phosphoserine (By similarity) | 1
Modified residue | 262 | Phosphotyrosine; by CSK and TNK2 (By similarity) | 1
Modified residue | 302 | Phosphotyrosine; by CSK (By similarity) | 1
Modified residue | 341 | Phosphotyrosine; by CSK (By similarity) | 1
Modified residue | 352 | Phosphotyrosine; by CSK (By similarity) | 1
Modified residue | 357 | Phosphotyrosine; by CSK (By similarity) | 1
Modified residue | 358 | Phosphotyrosine; by CSK and TNK2 (By similarity) | 1
Cross-link | 381 | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO) (By similarity)
Modified residue | 388 | Phosphotyrosine; by CSK (By similarity) | 1
Cross-link | 500 | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO) (By similarity)
Modified residue | 514 | Phosphotyrosine; by CSK (By similarity) | 1
Modified residue | 531 | Phosphotyrosine; by CSK (By similarity) | 1
Modified residue | 630 | Phosphoserine (Combined sources) | 1
Cross-link | 825 | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in ubiquitin) (By similarity)
Cross-link | 827 | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in ubiquitin) (By similarity)
Modified residue | 895 | Phosphotyrosine; by CSK (By similarity) | 1

Post-translational modification

Phosphorylated in prostate cancer cells in response to several growth factors including EGF. Phosphorylation is induced by c-Src kinase (CSK). Tyr-514 is one of the major phosphorylation sites and an increase in phosphorylation and Src kinase activity is associated with prostate cancer progression (By similarity). Phosphorylation by TNK2 enhances the DNA-binding and transcriptional activity. Phosphorylation at Ser-61 by CDK9 regulates AR promoter selectivity and cell growth. Phosphorylation by PAK6 leads to AR-mediated transcription inhibition (By similarity).
Sumoylated on Lys-381 (major) and Lys-500 (By similarity). Ubiquitinated. Deubiquitinated by USP26 (By similarity). 'Lys-6' and 'Lys-27'-linked polyubiquitination by RNF6 modulates AR transcriptional activity and specificity (By similarity).
Palmitoylated by ZDHHC7 and ZDHHC21. Palmitoylation is required for plasma membrane targeting and for rapid intracellular signaling via ERK and AKT kinases and cAMP generation (By similarity).

Keywords - PTM

Isopeptide bond, Lipoprotein, Palmitate, Phosphoprotein, Ubl conjugation

Proteomic databases

PaxDb: P19091.
PRIDE: P19091.

PTM databases

iPTMnet: P19091.
PhosphoSitePlus: P19091.

Expression

Gene expression databases

Bgee: ENSMUSG00000046532.
CleanEx: MM_AR.
Genevisible: P19091. MM.

Interaction

Subunit structure

Binds DNA as a homodimer. Part of a ternary complex containing AR, EFCAB6/DJBP and PARK7. Interacts with HIPK3 and NR0B2 in the presence of androgen. The ligand binding domain interacts with KAT7/HBO1 in the presence of dihydrotestosterone. Interacts with EFCAB6/DJBP, PELP1, PQBP1, RANBP9, SPDEF, SRA1, TGFB1I1, ZNF318 and RREB1. The AR N-terminal poly-Gln region binds Ran resulting in enhancement of AR-mediated transactivation. Ran-binding decreases as the poly-Gln length increases. Interacts with ZMIZ1/ZIMP10 and ZMIZ2/ZMIP7 which both enhance its transactivation activity. Interacts with RBAK. Interacts via the ligand-binding domain with LXXLL and FXXLF motifs from NCOA1, NCOA2, NCOA3, NCOA4 and MAGEA11. Interacts with HIP1 (via coiled coil domain). Interacts with SLC30A9 and RAD54L2/ARIP4. Interacts with MACROD1. Interacts (via ligand-binding domain) with TRIM68. Interacts with TNK2. Interacts with USP26. Interacts with RNF6. Interacts (regulated by RNF6 probably through polyubiquitination) with RNF14; regulates AR transcriptional activity. Interacts with PRMT2 and TRIM24. Interacts with RACK1. Interacts with RANBP10; this interaction enhances hormone-induced AR transcriptional activity. Interacts with PRPF6 in a hormone-independent way; this interaction enhances hormone-induced AR transcriptional activity. Interacts with STK4/MST1. Interacts with ZIPK/DAPK3. Interacts with LPXN. Interacts with MAK. Part of a complex containing AR, MAK and NCOA3. Interacts with CRY1. Interacts with CCAR1 and GATA2 (By similarity).

Sites

Feature key | Position(s) | Description | Length
Site | 700 | Interaction with coactivator LXXL motif (By similarity) | 1
Site | 877 | Interaction with coactivator FXXLF motif (By similarity) | 1

Binary interactions

With | Entry | #Exp. | IntAct
Casp8 | O89110 | 2 | EBI-1776062, EBI-851690

GO - Molecular function

  • ATPase binding Source: MGI
  • beta-catenin binding Source: UniProtKB
  • enzyme binding Source: BHF-UCL
  • POU domain binding Source: UniProtKB
  • receptor binding Source: MGI
  • RNA polymerase II transcription factor binding Source: MGI
  • transcription factor binding Source: MGI

Protein-protein interaction databases

BioGrid: 198179. 21 interactors.
DIP: DIP-41803N.
IntAct: P19091. 8 interactors.
MINT: MINT-151935.
STRING: 10090.ENSMUSP00000052648.

Chemistry databases

BindingDB: P19091.

Structure

Secondary structure

Feature key | Position(s) | Description | Length
Helix | 652 – 659 | Combined sources | 8
Helix | 677 – 701 | Combined sources | 25
Helix | 705 – 707 | Combined sources | 3
Helix | 710 – 736 | Combined sources | 27
Beta strand | 740 – 745 | Combined sources | 6
Beta strand | 748 – 750 | Combined sources | 3
Helix | 752 – 757 | Combined sources | 6
Helix | 761 – 776 | Combined sources | 16
Helix | 781 – 792 | Combined sources | 12
Beta strand | 794 – 799 | Combined sources | 6
Helix | 804 – 823 | Combined sources | 20
Helix | 831 – 844 | Combined sources | 14
Helix | 846 – 862 | Combined sources | 17
Turn | 863 – 868 | Combined sources | 6
Helix | 873 – 887 | Combined sources | 15
Beta strand | 890 – 893 | Combined sources | 4

3D structure databases

PDB entry | Method | Resolution (Å) | Chain | Positions
2QPY | X-ray | 2.50 | A | 649-899

ProteinModelPortal: P19091.
SMR: P19091.

Miscellaneous databases

EvolutionaryTrace: P19091.

Family & Domains

Region

Feature key | Position(s) | Description | Length
Region | 1 – 537 | Modulating (By similarity) | 537
Region | 531 – 899 | Interaction with LPXN (By similarity) | 369
Region | 551 – 641 | Interaction with HIPK3 (By similarity) | 91
Region | 571 – 899 | Interaction with CCAR1 (By similarity) | 329
Region | 604 – 899 | Interaction with KAT7 (By similarity) | 296
Region | 670 – 899 | Ligand-binding | 230

Compositional bias

Feature key | Position(s) | Description | Length
Compositional bias | 63 – 67 | Poly-Arg | 5
Compositional bias | 174 – 193 | Poly-Gln | 20
Compositional bias | 367 – 373 | Poly-Pro | 7
Compositional bias | 391 – 397 | Poly-Ala | 7
Compositional bias | 441 – 447 | Poly-Gly | 7

Domain

Composed of three domains: a modulating N-terminal domain, a DNA-binding domain and a C-terminal ligand-binding domain. In the presence of bound steroid the ligand-binding domain interacts with the N-terminal modulating domain, and thereby activates AR transcription factor activity. Agonist binding is required for dimerization and binding to target DNA. The transcription factor activity of the complex formed by ligand-activated AR and DNA is modulated by interactions with coactivator and corepressor proteins. Interaction with RANBP9 is mediated by both the N-terminal domain and the DNA-binding domain. Interaction with EFCAB6/DJBP is mediated by the DNA-binding domain (By similarity).
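
The three-domain layout can be combined with the coordinates given elsewhere in this entry (modulating region 1 – 537, DNA binding 538 – 611, ligand-binding 670 – 899) to look up which domain a residue falls in. The following Python sketch is only an illustration under that assumption; the names `DOMAINS` and `domain_at` are hypothetical, and residues 612 – 669 are left unassigned because the entry does not attribute them to any of the three domains.

```python
from typing import Optional

# Inclusive, 1-based ranges taken from the feature tables of this entry.
DOMAINS = [
    ("modulating N-terminal domain", 1, 537),
    ("DNA-binding domain", 538, 611),
    ("ligand-binding domain", 670, 899),
]


def domain_at(position: int) -> Optional[str]:
    """Return the name of the domain containing `position`, or None."""
    for name, start, end in DOMAINS:
        if start <= position <= end:
            return name
    return None  # e.g. residues 612-669, not assigned to a listed domain


# The three androgen binding sites and the Tyr-514 phosphorylation site.
for pos in (685, 732, 857, 514):
    print(pos, domain_at(pos))
```

With these ranges, the three annotated androgen binding sites all resolve to the ligand-binding domain, while Tyr-514 lies in the modulating N-terminal domain.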

Sequence similarities

Contains 1 nuclear receptor DNA-binding domain (PROSITE-ProRule annotation).

Zinc finger

Feature key | Position(s) | Description | Length
Zinc finger | 539 – 559 | NR C4-type (PROSITE-ProRule annotation) | 21
Zinc finger | 575 – 599 | NR C4-type (PROSITE-ProRule annotation) | 25

Keywords - Domain

Zinc-finger

Phylogenomic databases

eggNOG: KOG3575. Eukaryota.
ENOG410XRZC. LUCA.
GeneTree: ENSGT00760000118887.
HOGENOM: HOG000254783.
HOVERGEN: HBG007583.
InParanoid: P19091.
KO: K08557.
OMA: MENYSGP.
OrthoDB: EOG091G032J.
PhylomeDB: P19091.
TreeFam: TF350286.

Family and domain databases

Gene3D: 1.10.565.10. 2 hits.
3.30.50.10. 1 hit.
InterPro: IPR001103. Andrgn_rcpt.
IPR000536. Nucl_hrmn_rcpt_lig-bd.
IPR001628. Znf_hrmn_rcpt.
IPR013088. Znf_NHR/GATA.
Pfam: PF02166. Androgen_recep. 1 hit.
PF00104. Hormone_recep. 1 hit.
PF00105. zf-C4. 1 hit.
PRINTS: PR00521. ANDROGENR.
PR00047. STROIDFINGER.
SMART: SM00430. HOLI. 1 hit.
SM00399. ZnF_C4. 1 hit.
SUPFAM: SSF48508. SSF48508. 1 hit.
PROSITE: PS00031. NUCLEAR_REC_DBD_1. 1 hit.
PS51030. NUCLEAR_REC_DBD_2. 1 hit.

Sequence

Sequence status: Complete.

P19091-1 [UniParc]

MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REAIQNPGPR HPEAANIAPP  50
GACLQQRQET SPRRRRRQQH TEDGSPQAHI RGPTGYLALE EEQQPSQQQA  100
ASEGHPESSC LPEPGAATAP GKGLPQQPPA PPDQDDSAAP STLSLLGPTF  150
PGLSSCSADI KDILNEAGTM QLLQQQQQQQ QHQQQHQQHQ QQQEVISEGS  200
SARAREATGA PSSSKDSYLG GNSTISDSAK ELCKAVSVSM GLGVEALEHL  250
SPGEQLRGDC MYASLLGGPP AVRPTPCAPL PECKGLPLDE GPGKSTEETA  300
EYSSFKGGYA KGLEGESLGC SGSSEAGSSG TLEIPSSLSL YKSGALDEAA  350
AYQNRDYYNF PLALSGPPHP PPPTHPHARI KLENPLDYGS AWAAAAAQCR  400
YGDLGSLHGG SVAGPSTGSP PATTSSSWHT LFTAEEGQLY GPGGGGGSSS  450
PSDAGPVAPY GYTRPPQGLT SQESDYSASE VWYPGGVVNR VPYPSPNCVK  500
SEMGPWMENY SGPYGDMRLD STRDHVLPID YYFPPQKTCL ICGDEASGCH  550
YGALTCGSCK VFFKRAAEGK QKYLCASRND CTIDKFRRKN CPSCRLRKCY  600
EAGMTLGARK LKKLGNLKLQ EEGENSNAGS PTEDPSQKMT VSHIEGYECQ  650
PIFLNVLEAI EPGVVCAGHD NNQPDSFAAL LSSLNELGER QLVHVVKWAK  700
ALPGFRNLHV DDQMAVIQYS WMGLMVFAMG WRSFTNVNSR MLYFAPDLVF  750
NEYRMHKSRM YSQCVRMRHL SQEFGWLQIT PQEFLCMKAL LLFSIIPVDG  800
LKNQKFFDEL RMNYIKELDR IIACKRKNPT SCSRRFYQLT KLLDSVQPIA  850
RELHQFTFDL LIKSHMVSVD FPEMMAEIIS VQVPKILSGK VKPIYFHTQ  899
Length: 899
Mass (Da): 98,194
Last modified: November 1, 1990 - v1
Checksum: FD9EE07C07F7A568
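
The sequence display above groups residues into blocks of ten and interleaves position numbers, so it cannot be passed to most tools as is. A minimal Python sketch of flattening such a display into a plain one-letter string follows; the helper name `flatten_sequence` is an assumption made for illustration, and only the first row of the sequence is used as sample input.

```python
import re


def flatten_sequence(display_text: str) -> str:
    """Drop position numbers, spaces and line breaks, keeping residue letters only."""
    return re.sub(r"[^A-Za-z]", "", display_text).upper()


# First row of the display, including its position ruler.
first_row = """
        10         20         30         40         50
MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REAIQNPGPR HPEAANIAPP
"""

sequence = flatten_sequence(first_row)
print(sequence)
print(len(sequence))
```

Applied to the full display, the resulting string length can be compared with the reported Length of 899 as a quick sanity check on the copy.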

Sequence databases

EMBL/GenBank/DDBJ:
S56585 mRNA. Translation: AAB19916.1.
X53779 mRNA. Translation: CAA37795.1.
M37890 mRNA. Translation: AAA37234.1.
X59592 mRNA. Translation: CAA42160.1.
CCDS: CCDS30294.1.
PIR: A35895.
RefSeq: NP_038504.1. NM_013476.4.
UniGene: Mm.39005.
Mm.394224.
Mm.439657.

Genome annotation databases

Ensembl: ENSMUST00000052837; ENSMUSP00000052648; ENSMUSG00000046532.
GeneID: 11835.
KEGG: mmu:11835.
UCSC: uc009tuv.1. mouse.

Cross-references

Protocols and materials databases

DNASU: 11835.

Organism-specific databases

CTD: 367.
MGI: MGI:88064. Ar.

Miscellaneous databases

EvolutionaryTrace: P19091.
PRO: P19091.

Entry information

Entry name: ANDR_MOUSE
Accession: Primary (citable) accession number: P19091
Entry history
Integrated into UniProtKB/Swiss-Prot: November 1, 1990
Last sequence update: November 1, 1990
Last modified: November 2, 2016
This is version 184 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. The hormone-receptor complex appears to recognize discrete DNA sequences upstream of transcriptional start sites.
Transcriptional activity is enhanced by binding to RANBP9.

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
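
The UniRef90 and UniRef50 descriptions above each reduce to two thresholds: a minimum sequence identity to the seed and a minimum overlap with it. The Python sketch below only encodes that membership test as stated here; it is not the clustering procedure UniRef actually runs, and the names `THRESHOLDS` and `meets_cluster_criteria` are hypothetical. Identity and overlap are assumed to be precomputed fractions between 0 and 1.

```python
# (minimum identity, minimum overlap) with the seed sequence, as fractions.
THRESHOLDS = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def meets_cluster_criteria(level: str, identity: float, overlap: float) -> bool:
    """Check a candidate against the stated identity and overlap cut-offs."""
    min_identity, min_overlap = THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


# Illustrative values only; they are not measurements for any real protein.
print(meets_cluster_criteria("UniRef90", identity=0.93, overlap=0.85))
print(meets_cluster_criteria("UniRef90", identity=0.93, overlap=0.70))
print(meets_cluster_criteria("UniRef50", identity=0.60, overlap=0.80))
```

A candidate that is highly identical but covers too little of the seed fails the test, which is the point of the separate overlap requirement.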