Protein

POU domain, class 5, transcription factor 1

Gene

Pou5f1

Organism
Mus musculus (Mouse)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Transcription factor that binds to the octamer motif (5'-ATTTGCAT-3'). Forms a trimeric complex with SOX2 on DNA and controls the expression of a number of genes involved in embryonic development, such as YES1, FGF4, UTF1 and ZFP206. Critical for early embryogenesis and for embryonic stem cell pluripotency. (2 publications)
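
As an illustration of what binding to the octamer motif means at the sequence level, the following minimal sketch scans a DNA string for 5'-ATTTGCAT-3' on both strands. The function names and the test sequence are hypothetical and are not part of this entry; real binding-site prediction would also have to account for degenerate sites and the SOX2 partner motif, which this sketch ignores.

```python
OCTAMER = "ATTTGCAT"  # motif quoted in the Function text, written 5'->3'

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_octamer_sites(dna: str, motif: str = OCTAMER):
    """Return sorted (0-based start, strand) pairs for exact motif matches.

    '+' means the motif is read on the given strand; '-' means its reverse
    complement is found, i.e. the motif lies on the opposite strand.
    """
    dna = dna.upper()
    sites = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = dna.find(pattern)
        while start != -1:
            sites.append((start, strand))
            start = dna.find(pattern, start + 1)  # allow overlapping hits
    return sorted(sites)


# Hypothetical test sequence, not taken from any real promoter.
example = "GGCATTTGCATAACCATGCAAATCC"
print(find_octamer_sites(example))
```

Only exact matches are reported, so the result should be read as a lower bound on candidate sites.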

Regions

Feature key    Position(s)    Length    Description    Evidence
DNA binding    223 – 282      60        Homeobox       PROSITE-ProRule annotation

GO - Molecular function

  1. chromatin binding Source: MGI
  2. chromatin DNA binding Source: MGI
  3. DNA binding Source: MGI
  4. miRNA binding Source: MGI
  5. poly(A) RNA binding Source: MGI
  6. protein heterodimerization activity Source: MGI
  7. RNA polymerase II core promoter proximal region sequence-specific DNA binding Source: NTNU_SB
  8. RNA polymerase II core promoter proximal region sequence-specific DNA binding transcription factor activity involved in positive regulation of transcription Source: NTNU_SB
  9. RNA polymerase II transcription coactivator activity Source: Ensembl
  10. sequence-specific DNA binding Source: MGI
  11. sequence-specific DNA binding RNA polymerase II transcription factor activity Source: BHF-UCL
  12. sequence-specific DNA binding transcription factor activity Source: MGI
  13. transcription corepressor activity Source: MGI
  14. transcription factor binding Source: UniProtKB
  15. transcription regulatory region DNA binding Source: UniProtKB
  16. ubiquitin protein ligase binding Source: BHF-UCL

GO - Biological process

  1. blastocyst development Source: BHF-UCL
  2. blastocyst growth Source: MGI
  3. BMP signaling pathway involved in heart induction Source: MGI
  4. cardiac cell fate determination Source: MGI
  5. cell fate commitment Source: MGI
  6. cell fate commitment involved in formation of primary germ layer Source: MGI
  7. ectodermal cell fate commitment Source: MGI
  8. endodermal cell fate commitment Source: MGI
  9. endodermal cell fate specification Source: MGI
  10. germ-line stem cell maintenance Source: MGI
  11. mesodermal cell fate commitment Source: MGI
  12. mRNA transcription from RNA polymerase II promoter Source: BHF-UCL
  13. negative regulation of calcium ion-dependent exocytosis Source: MGI
  14. negative regulation of cell differentiation Source: MGI
  15. negative regulation of gene silencing by miRNA Source: MGI
  16. negative regulation of protein kinase B signaling Source: MGI
  17. negative regulation of transcription, DNA-templated Source: MGI
  18. negative regulation of transcription from RNA polymerase II promoter Source: MGI
  19. positive regulation of catenin import into nucleus Source: MGI
  20. positive regulation of protein kinase B signaling Source: MGI
  21. positive regulation of SMAD protein import into nucleus Source: MGI
  22. positive regulation of transcription, DNA-templated Source: MGI
  23. positive regulation of transcription from RNA polymerase II promoter Source: NTNU_SB
  24. regulation of asymmetric cell division Source: BHF-UCL
  25. regulation of gene expression Source: MGI
  26. regulation of heart induction by regulation of canonical Wnt signaling pathway Source: MGI
  27. regulation of methylation-dependent chromatin silencing Source: MGI
  28. regulation of transcription, DNA-templated Source: MGI
  29. response to cytokine Source: Ensembl
  30. response to organic substance Source: MGI
  31. response to retinoic acid Source: MGI
  32. somatic stem cell maintenance Source: MGI
  33. stem cell differentiation Source: MGI
  34. stem cell maintenance Source: MGI
  35. transcription from RNA polymerase II promoter Source: BHF-UCL
  36. trophectodermal cell differentiation Source: MGI

Keywords - Molecular function

Developmental protein

Keywords - Biological process

Transcription, Transcription regulation

Keywords - Ligand

DNA-binding

Enzyme and pathway databases

Reactome: REACT_273385. Transcriptional regulation of pluripotent stem cells.

Names & Taxonomy

Protein names
Recommended name:
POU domain, class 5, transcription factor 1
Alternative name(s):
NF-A3
Octamer-binding protein 3 (short name: Oct-3)
Octamer-binding protein 4 (short name: Oct-4)
Octamer-binding transcription factor 3 (short name: OTF-3)
Gene names
Name: Pou5f1
Synonyms: Oct-3, Oct-4, Otf-3, Otf3
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes: UP000000589. Component: Chromosome 17

Organism-specific databases

MGI: MGI:101893. Pou5f1.

Subcellular location

  1. Cytoplasm (by similarity)
  2. Nucleus (PROSITE-ProRule annotation; 2 publications)

Note: Expressed in a diffuse and slightly punctate pattern.

GO - Cellular component

  1. cytoplasm Source: MGI
  2. cytosol Source: MGI
  3. nuclear transcription factor complex Source: MGI
  4. nucleolus Source: MGI
  5. nucleoplasm Source: BHF-UCL
  6. nucleus Source: BHF-UCL
  7. transcriptional repressor complex Source: MGI
  8. transcription factor complex Source: MGI

Keywords - Cellular component

Cytoplasm, Nucleus

Pathology & Biotech

Biotechnological use

POU5F1/OCT4, SOX2, MYC/c-Myc and KLF4 are the four Yamanaka factors. When combined, these factors are sufficient to reprogram differentiated cells to an embryonic-like state designated iPS (induced pluripotent stem) cells. iPS cells exhibit the morphology and growth properties of ES cells and express ES cell marker genes. (1 publication)

Mutagenesis

Feature key    Position(s)    Length    Description
Mutagenesis    118            1         K → R: Absence of sumoylation. Enhanced protein degradation. Reduced self-renewal ability in ES cells. 70% lower expression of YES1. Reduced DNA binding. No change in nuclear localization. Absence of sumoylation; when associated with R-215 and R-244. (3 publications)
Mutagenesis    120            1         E → A: Absence of sumoylation. Enhanced protein degradation. Reduced self-renewal ability in ES cells. 55% lower expression of YES1. (1 publication)
Mutagenesis    215            1         K → R: No change in sumoylation; when associated with R-244. Loss of sumoylation. No change in nuclear localization; when associated with R-118 and R-244. (1 publication)
Mutagenesis    244            1         K → R: No change in sumoylation. No change in sumoylation; when associated with R-215. Loss of sumoylation; when associated with R-118 and R-215. No change in nuclear localization; when associated with R-118 and R-215. (1 publication)
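
The substitutions in the mutagenesis table can be reproduced in silico. The sketch below is only an illustration: it builds the mutant sequences and checks that the stated wild-type residue really sits at the stated 1-based position. The helper name `apply_substitutions` and the `K118R`-style shorthand are assumptions introduced here, not notation used by this entry.

```python
# Canonical sequence P20263-1, copied from the Sequence section of this entry.
SEQUENCE = "".join("""
MAGHLASDFA FSPPPGGGDG SAGLEPGWVD PRTWLSFQGP PGGPGIGPGS
EVLGISPCPP AYEFCGGMAY CGPQVGLGLV PQVGVETLQP EGQAGARVES
NSEGTSSEPC ADRPNAVKLE KVEPTPEESQ DMKALQKELE QFAKLLKQKR
ITLGYTQADV GLTLGVLFGK VFSQTTICRF EALQLSLKNM CKLRPLLEKW
VEEADNNENL QEICKSETLV QARKRKRTSI ENRVRWSLET MFLKCPKPSL
QQITHIANQL GLEKDVVRVW FCNRRQKGKR SSIEYSQREE YEATGTPFPG
GAVSFPLPPG PHFGTPGYGS PHFTTLYSVP FPEGEAFPSV PVTALGSPMH
SN
""".split())


def apply_substitutions(sequence, *substitutions):
    """Apply point substitutions such as 'K118R' (1-based positions).

    Raises ValueError if the stated wild-type residue is not found at the
    stated position, which guards against off-by-one numbering mistakes.
    """
    residues = list(sequence)
    for code in substitutions:
        wild_type, position, mutant = code[0], int(code[1:-1]), code[-1]
        found = residues[position - 1]
        if found != wild_type:
            raise ValueError(
                f"{code}: expected {wild_type} at {position}, found {found}"
            )
        residues[position - 1] = mutant
    return "".join(residues)


single = apply_substitutions(SEQUENCE, "K118R")
triple = apply_substitutions(SEQUENCE, "K118R", "K215R", "K244R")
print(single[115:122])  # residues 116-122, spanning the substituted site
print(sum(a != b for a, b in zip(SEQUENCE, triple)))  # changed residues
```

The wild-type check matters more than the substitution itself: a numbering slip would otherwise yield a plausible-looking but wrong construct.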

PTM / Processing

Molecule processing

Feature key    Position(s)    Length    Description                                    Feature identifier
Chain          1 – 352        352       POU domain, class 5, transcription factor 1    PRO_0000100749

Amino acid modifications

Feature key         Position(s)    Length    Description                                                            Evidence
Modified residue    106            1         Phosphoserine; by MAPK                                                 By similarity
Cross-link          118            1         Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO)
Modified residue    228            1         Phosphothreonine                                                       By similarity
Modified residue    229            1         Phosphoserine                                                          By similarity
Modified residue    282            1         Phosphoserine                                                          By similarity
Modified residue    347            1         Phosphoserine                                                          By similarity

Post-translational modification

Sumoylation enhances protein stability, DNA binding and transactivation activity. Sumoylation is required for enhanced YES1 expression. (3 publications)
Ubiquitinated; undergoes 'Lys-63'-linked polyubiquitination by WWP2, leading to proteasomal degradation. (1 publication)
ERK1/2-mediated phosphorylation at Ser-106 promotes nuclear exclusion and proteasomal degradation. Phosphorylation at Thr-228 and Ser-229 decreases DNA binding and alters the ability to activate transcription (by similarity).

Keywords - PTM

Isopeptide bond, Phosphoprotein, Ubl conjugation

Proteomic databases

PRIDE: P20263.

PTM databases

PhosphoSite: P20263.

Expression

Tissue specificity

Expressed in the totipotent and pluripotent stem cells of the pregastrulation embryo. Also expressed in primordial germ cells and in the female germ line. Absent from adult tissues. (2 publications)

Developmental stage

Down-regulated during differentiation to endoderm and mesoderm. (1 publication)

Induction

Repressed by retinoic acid (RA). (2 publications)

Gene expression databases

Bgee: P20263.
CleanEx: MM_POU5F1.
ExpressionAtlas: P20263. baseline and differential.
Genevestigator: P20263.

Interaction

Subunit structure

Interacts with PKM. Interacts with WWP2 (by similarity). Interacts with UBE2I and ZSCAN10. (By similarity; 2 publications)

Binary interactions

With     Entry     #Exp.    IntAct
Cdk1     P11440    4        EBI-1606219, EBI-846949
Ewsr1    Q61545    13       EBI-1606219, EBI-1606991
Mta2     Q9R190    6        EBI-1606219, EBI-904134
Nanog    Q80Z64    4        EBI-1606219, EBI-2312517
Nr0b1    Q61066    5        EBI-1606219, EBI-2312665
Parp1    P11103    2        EBI-1606219, EBI-642213
Pkm      P52480    6        EBI-1606219, EBI-647785
Smad3    Q8BUN5    13       EBI-1606219, EBI-2337983
Sox2     P48432    4        EBI-1606219, EBI-2313612
Wdr5     P61965    7        EBI-1606219, EBI-1247084

Protein-protein interaction databases

BioGrid: 202313. 303 interactions.
DIP: DIP-29931N.
IntAct: P20263. 138 interactions.
STRING: 10090.ENSMUSP00000025271.

Structure

Secondary structure

Feature key    Position(s)    Length    Evidence
Helix          132 – 152      21        Combined sources
Helix          157 – 168      12        Combined sources
Helix          174 – 181      8         Combined sources
Helix          187 – 204      18        Combined sources
Helix          208 – 212      5         Combined sources
Beta strand    223 – 225      3         Combined sources
Helix          232 – 239      8         Combined sources
Turn           240 – 244      5         Combined sources
Helix          250 – 259      10        Combined sources
Helix          264 – 279      16        Combined sources

3D structure databases

Entry    Method    Resolution (Å)    Chain    Positions
1OCP     NMR       -                 A        217-282
3L1P     X-ray     2.80              A/B      131-282
ProteinModelPortal: P20263.
SMR: P20263. Positions 131-282.

Miscellaneous databases

EvolutionaryTrace: P20263.

Family & Domains

Domains and Repeats

Feature key    Position(s)    Length    Description     Evidence
Domain         131 – 205      75        POU-specific    PROSITE-ProRule annotation

Domain

The POU-specific domain mediates interaction with PKM (by similarity).

Sequence similarities

Contains 1 homeobox DNA-binding domain (PROSITE-ProRule annotation).
Contains 1 POU-specific domain (PROSITE-ProRule annotation).
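
The feature coordinates in this entry are 1-based and inclusive, so a feature's length is end - start + 1. The following sketch, which is only an illustration, recomputes the lengths listed in the tables and reports which annotated domain, if any, fully contains each secondary structure element. The names `span_length` and `containing_domain` are hypothetical helpers introduced here.

```python
# Inclusive, 1-based coordinates copied from the feature tables in this entry.
DOMAINS = {"POU-specific": (131, 205), "Homeobox": (223, 282)}
ELEMENTS = [
    ("Helix", 132, 152), ("Helix", 157, 168), ("Helix", 174, 181),
    ("Helix", 187, 204), ("Helix", 208, 212), ("Beta strand", 223, 225),
    ("Helix", 232, 239), ("Turn", 240, 244), ("Helix", 250, 259),
    ("Helix", 264, 279),
]


def span_length(start, end):
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def containing_domain(start, end, domains=DOMAINS):
    """Name of the domain that fully contains [start, end], or None."""
    for name, (domain_start, domain_end) in domains.items():
        if domain_start <= start and end <= domain_end:
            return name
    return None


for kind, start, end in ELEMENTS:
    domain = containing_domain(start, end) or "outside both domains"
    print(f"{kind:<12}{start:>4}-{end:<4} length {span_length(start, end):>2}  {domain}")
```

Under these coordinates the helix at 208 – 212 falls between the two annotated domains rather than inside either one.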

Keywords - Domain

Homeobox

Phylogenomic databases

eggNOG: NOG329627.
GeneTree: ENSGT00760000118935.
HOGENOM: HOG000089941.
HOVERGEN: HBG057998.
InParanoid: P20263.
KO: K09367.
OMA: LENMFLQ.
OrthoDB: EOG7DJSMG.
PhylomeDB: P20263.
TreeFam: TF316413.

Family and domain databases

Gene3D: 1.10.10.60. 1 hit.
    1.10.260.40. 1 hit.
InterPro: IPR017970. Homeobox_CS.
    IPR001356. Homeobox_dom.
    IPR009057. Homeodomain-like.
    IPR010982. Lambda_DNA-bd_dom.
    IPR013847. POU.
    IPR015585. POU_dom_5.
    IPR000327. POU_specific.
PANTHER: PTHR11636:SF15. PTHR11636:SF15. 1 hit.
Pfam: PF00046. Homeobox. 1 hit.
    PF00157. Pou. 1 hit.
PRINTS: PR00028. POUDOMAIN.
SMART: SM00389. HOX. 1 hit.
    SM00352. POU. 1 hit.
SUPFAM: SSF46689. SSF46689. 1 hit.
    SSF47413. SSF47413. 1 hit.
PROSITE: PS00027. HOMEOBOX_1. 1 hit.
    PS50071. HOMEOBOX_2. 1 hit.
    PS00035. POU_1. 1 hit.
    PS00465. POU_2. 1 hit.
    PS51179. POU_3. 1 hit.

Sequence

Sequence status: Complete.

P20263-1

MAGHLASDFA FSPPPGGGDG SAGLEPGWVD PRTWLSFQGP PGGPGIGPGS   50
EVLGISPCPP AYEFCGGMAY CGPQVGLGLV PQVGVETLQP EGQAGARVES  100
NSEGTSSEPC ADRPNAVKLE KVEPTPEESQ DMKALQKELE QFAKLLKQKR  150
ITLGYTQADV GLTLGVLFGK VFSQTTICRF EALQLSLKNM CKLRPLLEKW  200
VEEADNNENL QEICKSETLV QARKRKRTSI ENRVRWSLET MFLKCPKPSL  250
QQITHIANQL GLEKDVVRVW FCNRRQKGKR SSIEYSQREE YEATGTPFPG  300
GAVSFPLPPG PHFGTPGYGS PHFTTLYSVP FPEGEAFPSV PVTALGSPMH  350
SN                                                      352
Length: 352
Mass (Da): 38,216
Last modified: February 1, 1991 - v1
Checksum: 757E41DF52286714
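
The stated length and the residue identities implied by the feature annotations can be cross-checked against the sequence itself. This minimal sketch strips position counters and spacing from a grouped sequence block and compares selected positions with the residues named elsewhere in this entry. The function names are hypothetical; digits and whitespace are ignored wherever they occur, so the layout of the counters does not matter.

```python
import re

# Grouped sequence block; the numbers are position counters, not sequence.
RAW_BLOCK = """
MAGHLASDFA FSPPPGGGDG SAGLEPGWVD PRTWLSFQGP PGGPGIGPGS   50
EVLGISPCPP AYEFCGGMAY CGPQVGLGLV PQVGVETLQP EGQAGARVES  100
NSEGTSSEPC ADRPNAVKLE KVEPTPEESQ DMKALQKELE QFAKLLKQKR  150
ITLGYTQADV GLTLGVLFGK VFSQTTICRF EALQLSLKNM CKLRPLLEKW  200
VEEADNNENL QEICKSETLV QARKRKRTSI ENRVRWSLET MFLKCPKPSL  250
QQITHIANQL GLEKDVVRVW FCNRRQKGKR SSIEYSQREE YEATGTPFPG  300
GAVSFPLPPG PHFGTPGYGS PHFTTLYSVP FPEGEAFPSV PVTALGSPMH  350
SN                                                      352
"""

# 1-based position -> residue implied by the feature tables of this entry.
ANNOTATED = {
    29: "V", 31: "P",                                   # sequence conflicts
    106: "S", 228: "T", 229: "S", 282: "S", 347: "S",   # phosphorylation
    118: "K", 120: "E", 215: "K", 244: "K",             # mutagenesis sites
}


def parse_sequence_block(text):
    """Keep residue letters only; drop counters, spaces and line breaks."""
    return "".join(re.findall(r"[A-Za-z]", text)).upper()


def mismatches(sequence, annotated):
    """Return {position: (expected, found)} for annotations that disagree."""
    return {
        pos: (want, sequence[pos - 1])
        for pos, want in annotated.items()
        if sequence[pos - 1] != want
    }


sequence = parse_sequence_block(RAW_BLOCK)
print(len(sequence))                    # compare with the stated Length
print(mismatches(sequence, ANNOTATED))  # empty dict means all sites agree
```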

Experimental Info

Feature key          Position(s)    Length    Description                                 Evidence
Sequence conflict    1 – 28         28        Missing in CAA36682 (PubMed:1690859).       Curated
Sequence conflict    29             1         V → M in CAA36682 (PubMed:1690859).         Curated
Sequence conflict    31             1         P → S in AAA39844 (PubMed:1915274).         Curated
Sequence conflict    31             1         P → S in AAB19896 (PubMed:1915274).         Curated

Sequence databases

EMBL / GenBank / DDBJ:
    M34381 mRNA. Translation: AAA39844.1. Sequence problems.
    X52437 mRNA. Translation: CAA36682.1.
    S58426, S58422, S58423, S58424, S58425 Genomic DNA. Translation: AAB19896.1.
    BC068268 mRNA. Translation: AAH68268.1.
CCDS: CCDS37600.1.
PIR: A34672.
    S17313.
RefSeq: NP_038661.2. NM_013633.3.
UniGene: Mm.17031.

Genome annotation databases

Ensembl: ENSMUST00000025271; ENSMUSP00000025271; ENSMUSG00000024406.
GeneID: 18999.
KEGG: mmu:18999.
UCSC: uc008chu.2. mouse.

Cross-references

Organism-specific databases

CTD: 5460.

Miscellaneous databases

NextBio: 295410.
PRO: P20263.

Publications

  1. "A POU-domain transcription factor in early stem cells and germ cells of the mammalian embryo."
    Rosner M.H., Vigano M.A., Ozato K., Timmons P.M., Poirier F., Rigby P.W.J., Staudt L.
    Nature 345:686-692(1990)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], TISSUE SPECIFICITY, DEVELOPMENTAL STAGE.
    Tissue: Embryonic carcinoma.
  2. "New type of POU domain in germ line-specific protein Oct-4."
    Schoeler H.R., Ruppert S., Suzuki N., Chowdhury K., Gruss P.
    Nature 344:435-439(1990)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], TISSUE SPECIFICITY, INDUCTION.
    Tissue: Embryonic carcinoma.
  3. "A novel octamer binding transcription factor is differentially expressed in mouse embryonic cells."
    Okamoto K., Okazawa H., Okuda A., Sakai M., Muramatsu M., Hamada H.
    Cell 60:461-472(1990)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
  4. "The oct3 gene, a gene for an embryonic transcription factor, is controlled by a retinoic acid repressible enhancer."
    Okazawa H., Okamoto K., Ishino F., Ishino-Kaneko T., Takeda S., Toyoda Y., Muramatsu M., Hamada H.
    EMBO J. 10:2997-3005(1991)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA], SEQUENCE REVISION, INDUCTION.
  5. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Strain: C57BL/6J.
    Tissue: Embryo.
  6. Cited for: SUMOYLATION AT LYS-118, MUTAGENESIS OF LYS-118.
  7. "Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors."
    Takahashi K., Yamanaka S.
    Cell 126:663-676(2006)
    Cited for: BIOTECHNOLOGY.
  8. "Post-translational modification of POU domain transcription factor Oct-4 by SUMO-1."
    Zhang Z., Liao B., Xu M., Jin Y.
    FASEB J. 21:3042-3051(2007)
    Cited for: SUMOYLATION AT LYS-118, MUTAGENESIS OF LYS-118; GLU-120; LYS-215 AND LYS-244, FUNCTION, SUBCELLULAR LOCATION, INTERACTION WITH UBE2I.
  9. "Sumoylation of Oct4 enhances its stability, DNA binding, and transactivation."
    Wei F., Schoeler H.R., Atchison M.L.
    J. Biol. Chem. 282:21551-21560(2007)
    Cited for: SUMOYLATION AT LYS-118, MUTAGENESIS OF LYS-118, FUNCTION, SUBCELLULAR LOCATION.
  10. "Zfp206, Oct4, and Sox2 are integrated components of a transcriptional regulatory network in embryonic stem cells."
    Yu H.B., Kunarso G., Hong F.H., Stanton L.W.
    J. Biol. Chem. 284:31327-31335(2009)
    Cited for: INTERACTION WITH ZSCAN10.
  11. "Wwp2 mediates Oct4 ubiquitination and its own auto-ubiquitination in a dosage-dependent manner."
    Liao B., Jin Y.
    Cell Res. 20:332-344(2010)
    Cited for: UBIQUITINATION.
  12. "Secondary structure of the oct-3 POU homeodomain as determined by 1H-15N NMR spectroscopy."
    Morita E.H., Shirakawa M., Hayashi F., Imagawa M., Kyogoku Y.
    FEBS Lett. 321:107-110(1993)
    Cited for: STRUCTURE BY NMR OF POU DOMAIN.

Entry information

Entry name: PO5F1_MOUSE
Accession: Primary (citable) accession number: P20263
Secondary accession number(s): Q63843
Entry history
Integrated into UniProtKB/Swiss-Prot: February 1, 1991
Last sequence update: February 1, 1991
Last modified: April 29, 2015
This is version 154 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
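
The identity and overlap thresholds quoted above can be read as a simple membership test. The sketch below is an assumption-laden illustration, not the UniRef pipeline: it takes two rows of an already computed pairwise alignment, defines identity as matches over aligned (non-gap) columns and overlap as aligned columns over the seed length, and applies the thresholds. The exact definitions used by UniRef may differ, and the function names are hypothetical.

```python
def identity_and_overlap(aligned_member, aligned_seed):
    """Compute (identity, overlap) from two rows of a pairwise alignment.

    '-' marks a gap. Identity is matches / aligned columns; overlap is
    aligned columns / ungapped seed length. Both definitions are
    assumptions made for this sketch.
    """
    if len(aligned_member) != len(aligned_seed):
        raise ValueError("alignment rows must have equal length")
    paired = [
        (a, b) for a, b in zip(aligned_member, aligned_seed)
        if a != "-" and b != "-"
    ]
    matches = sum(a == b for a, b in paired)
    seed_length = sum(b != "-" for b in aligned_seed)
    identity = matches / len(paired) if paired else 0.0
    overlap = len(paired) / seed_length if seed_length else 0.0
    return identity, overlap


def joins_cluster(identity, overlap, min_identity, min_overlap=0.80):
    """True if a sequence meets both thresholds relative to the seed."""
    return identity >= min_identity and overlap >= min_overlap


seed = "MAGHLASDFA"  # first ten residues of this entry, used as a toy seed
for member in ("MAGHLASDFA", "MAGHLASDFS", "MAGHL-----"):
    identity, overlap = identity_and_overlap(member, seed)
    print(member, identity, overlap, joins_cluster(identity, overlap, 0.90))
```

The third toy member is fully identical over the aligned region yet fails the test, which shows why the overlap criterion is stated alongside identity.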