Protein

Transcription factor Sp3

Gene

Sp3

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Transcription factor that can act as an activator or repressor depending on isoform and/or post-translational modifications. Binds to GT-box and GC-box promoter elements. Competes with SP1 for GC-box promoters. Weak activator of transcription, but can activate a number of genes involved in different processes such as cell-cycle regulation, hormone induction and housekeeping (by similarity; 1 publication).

Regions

Feature key    Position(s)    Length    Description
Zinc finger    623 – 647      25        C2H2-type 1 (PROSITE-ProRule annotation)
Zinc finger    653 – 677      25        C2H2-type 2 (PROSITE-ProRule annotation)
Zinc finger    683 – 705      23        C2H2-type 3 (PROSITE-ProRule annotation)

GO - Molecular function

  • chromatin binding Source: MGI
  • core promoter proximal region sequence-specific DNA binding Source: MGI
  • DNA binding Source: MGI
  • double-stranded DNA binding Source: MGI
  • metal ion binding Source: UniProtKB-KW
  • RNA polymerase II core promoter sequence-specific DNA binding Source: MGI
  • RNA polymerase II regulatory region sequence-specific DNA binding Source: MGI
  • transcriptional repressor activity, RNA polymerase II core promoter proximal region sequence-specific binding Source: UniProtKB
  • transcription factor activity, sequence-specific DNA binding Source: MGI

GO - Biological process

  • B cell differentiation Source: MGI
  • definitive hemopoiesis Source: MGI
  • embryonic camera-type eye morphogenesis Source: MGI
  • embryonic placenta development Source: MGI
  • embryonic process involved in female pregnancy Source: MGI
  • embryonic skeletal system development Source: MGI
  • enucleate erythrocyte differentiation Source: MGI
  • erythrocyte differentiation Source: MGI
  • granulocyte differentiation Source: MGI
  • in utero embryonic development Source: MGI
  • liver development Source: MGI
  • lung development Source: MGI
  • megakaryocyte differentiation Source: MGI
  • monocyte differentiation Source: MGI
  • natural killer cell differentiation Source: MGI
  • negative regulation of transcription, DNA-templated Source: UniProtKB
  • negative regulation of transcription from RNA polymerase II promoter Source: GOC
  • ossification Source: MGI
  • positive regulation of transcription, DNA-templated Source: MGI
  • positive regulation of transcription from RNA polymerase II promoter Source: MGI
  • regulation of transcription, DNA-templated Source: MGI
  • T cell differentiation Source: MGI
  • transcription, DNA-templated Source: UniProtKB-KW
  • trophectodermal cell differentiation Source: MGI

Keywords - Molecular function

Activator

Keywords - Biological process

Transcription, Transcription regulation

Keywords - Ligand

DNA-binding, Metal-binding, Zinc

Enzyme and pathway databases

Reactome: R-MMU-3232118. SUMOylation of transcription factors.

Names & Taxonomy

Protein names
Recommended name: Transcription factor Sp3
Gene names
Name: Sp3
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 2

Organism-specific databases

MGI: MGI:1277166. Sp3.

Subcellular location

  • Nucleus
  • Nucleus > PML body (By similarity)

  • Note: Localizes to the nuclear periphery and in nuclear dots when sumoylated. Some localization in PML nuclear bodies (By similarity).

GO - Cellular component

  • cytoplasm Source: MGI
  • Golgi apparatus Source: MGI
  • nucleoplasm Source: MGI
  • nucleus Source: MGI
  • plasma membrane Source: MGI
  • PML body Source: UniProtKB-SubCell
  • transcriptional repressor complex Source: UniProtKB

Keywords - Cellular component

Nucleus

PTM / Processing

Molecule processing

Feature key    Position(s)    Length    Description                 Feature identifier
Chain          1 – 783        783       Transcription factor Sp3    PRO_0000047142

Amino acid modifications

Feature key         Position(s)    Length    Description
Modified residue    72 – 72        1         Phosphoserine (Combined sources)
Cross-link          122 – 122      1         Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO) (By similarity)
Modified residue    553 – 553      1         N6-acetyllysine; alternate (1 publication)
Cross-link          553 – 553      1         Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO); alternate (By similarity)
Cross-link          553 – 553      1         Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2); alternate (By similarity)
Modified residue    565 – 565      1         Phosphoserine (Combined sources)
Modified residue    568 – 568      1         Phosphoserine (Combined sources)
Modified residue    648 – 648      1         Phosphoserine (By similarity)
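
The modification positions listed above refer to the canonical isoform 1 sequence, so each one can be checked against the residue found at that position. The following is a minimal sketch of such a consistency check, not part of the UniProt entry itself. The names `FRAGMENTS`, `SITES`, `residue_at` and `check_sites` are hypothetical, and the sequence fragments are copied from the isoform 1 sequence given later in this entry.

```python
# Fragments of the canonical isoform 1 sequence, copied from this entry and
# keyed by the 1-based position of their first residue (illustrative subset).
FRAGMENTS = {
    51: "QDTQPSPLALLAATCSKIGPPSPGDDDEEAAVAAAAGVPAAAAGATGDLA",
    101: "SAQLGGAPNRWEVLSATPTTIKDEAGNLVQIPGAATSSGQYVLPLQNLQN",
    551: "RIKEEEPDPEEWQLSGDSTLNTNDLTHLRVQVVDEEGDQQHQEGKRLRRV",
    601: "ACTCPNCKEGGGRGTNLGKKKQHICHIPGCGKVYGKTSHLRAHLRWHSGE",
}

# (position, expected one-letter residue, annotation) taken from the
# modified-residue and cross-link rows of this entry.
SITES = [
    (72, "S", "Phosphoserine"),
    (122, "K", "SUMO cross-link"),
    (553, "K", "N6-acetyllysine / SUMO cross-link"),
    (565, "S", "Phosphoserine"),
    (568, "S", "Phosphoserine"),
    (648, "S", "Phosphoserine"),
]


def residue_at(position):
    """Return the residue at a 1-based position, if a fragment covers it."""
    for start, fragment in FRAGMENTS.items():
        if start <= position < start + len(fragment):
            return fragment[position - start]
    raise KeyError(f"position {position} is not covered by any fragment")


def check_sites(sites=SITES):
    """Map each annotated position to True if the expected residue is there."""
    return {pos: residue_at(pos) == expected for pos, expected, _ in sites}


print(check_sites())
# {72: True, 122: True, 553: True, 565: True, 568: True, 648: True}
```

Keying the fragments by their start position keeps the arithmetic in 1-based sequence coordinates, which is the convention this entry uses for all positional information.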

Post-translational modification

Acetylated by histone acetyltransferase p300 and deacetylated by HDACs. Acetylation/deacetylation states regulate transcriptional activity; acetylation appears to activate transcription. Alternate sumoylation and acetylation at Lys-553 also control transcriptional activity (By similarity).
Sumoylated on all isoforms. Sumoylated on 2 sites in longer isoforms, with Lys-553 being the major site. Sumoylation at this site promotes localization to the nuclear periphery, nuclear dots and PML nuclear bodies. Sumoylation on Lys-553 represses the transactivation activity, except for the largest isoform, where it has little effect on transactivation (By similarity).

Keywords - PTM

Acetylation, Isopeptide bond, Phosphoprotein, Ubl conjugation

Proteomic databases

EPD: O70494.
MaxQB: O70494.
PaxDb: O70494.
PRIDE: O70494.

PTM databases

iPTMnet: O70494.
PhosphoSite: O70494.

Expression

Gene expression databases

Bgee: O70494.
CleanEx: MM_SP3.
Genevisible: O70494. MM.

Interaction

Subunit structure

Interacts with HLTF; the interaction may be required for basal transcriptional activity of HLTF. Interacts with HDAC1; the interaction deacetylates SP3 and regulates its transcriptional activity. Interacts with HDAC2 (preferably the CK2-phosphorylated form); the interaction deacetylates SP3 and regulates its transcriptional activity. Ceramides can also regulate acetylation/deacetylation events by altering the interaction of HDAC with SP3. Interacts with MEIS2 isoform Meis2D and PBX1 isoform PBX1a (By similarity).

Protein-protein interaction databases

BioGrid: 203418. 8 interactions.
IntAct: O70494. 1 interaction.
MINT: MINT-1605936.
STRING: 10090.ENSMUSP00000099750.

Structure

3D structure databases

ProteinModelPortal: O70494.
SMR: O70494. Positions 575-709.

Family & Domains

Region

Feature key    Position(s)    Length    Description
Region         140 – 239      100       Transactivation domain (Gln-rich) (By similarity)
Region         352 – 501      150       Transactivation domain (Gln-rich) (By similarity)
Region         536 – 622      87        Repressor domain (By similarity)

Compositional bias

Feature key           Position(s)    Length    Description
Compositional bias    20 – 28        9         Poly-Gly
Compositional bias    32 – 40        9         Poly-Gln
Compositional bias    45 – 102       58        Ala-rich
Compositional bias    336 – 339      4         Poly-Ser

Sequence similarities

Contains 3 C2H2-type zinc fingers (PROSITE-ProRule annotation).
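
The three C2H2-type zinc fingers can be located with a simple pattern scan. The sketch below is illustrative only and is not how the PROSITE-ProRule annotation was produced. It assumes the classic C2H2 consensus `C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H` (recalled here as the PROSITE PS00028 pattern, which should be verified against PROSITE before relying on it) and applies it to residues 601-710 of isoform 1, copied from this entry. The names `C2H2`, `SEGMENT` and `find_fingers` are hypothetical.

```python
import re

# C2H2 consensus C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H written as a
# regular expression. Assumption: this matches the PROSITE PS00028 pattern.
C2H2 = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")

# Residues 601-710 of isoform 1, copied from the sequence in this entry.
SEGMENT_START = 601
SEGMENT = (
    "ACTCPNCKEGGGRGTNLGKKKQHICHIPGCGKVYGKTSHLRAHLRWHSGE"
    "RPFICNWMFCGKRFTRSDELQRHRRTHTGEKKFVCPECSKRFMRSDHLAK"
    "HIKTHQNKKV"
)


def find_fingers(segment=SEGMENT, offset=SEGMENT_START):
    """Return 1-based inclusive (start, end) spans of C2H2 pattern matches."""
    return [
        (m.start() + offset, m.end() + offset - 1)
        for m in C2H2.finditer(segment)
    ]


print(find_fingers())
# [(625, 647), (655, 677), (685, 705)]
```

Each match begins at the first zinc-coordinating cysteine, so the reported starts fall two residues inside the annotated ranges (623, 653 and 683), while the ends (647, 677 and 705) coincide with them.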

Keywords - Domain

Repeat, Zinc-finger

Phylogenomic databases

eggNOG: KOG1721. Eukaryota.
    COG5048. LUCA.
GeneTree: ENSGT00760000118984.
HOGENOM: HOG000234295.
HOVERGEN: HBG008933.
InParanoid: O70494.
KO: K09193.
OMA: PQIQSTD.
OrthoDB: EOG76HQ15.
PhylomeDB: O70494.
TreeFam: TF350150.

Family and domain databases

Gene3D: 3.30.160.60. 3 hits.
InterPro: IPR030450. Sp1_fam.
    IPR007087. Znf_C2H2.
    IPR015880. Znf_C2H2-like.
    IPR013087. Znf_C2H2/integrase_DNA-bd.
PANTHER: PTHR23235. PTHR23235. 1 hit.
Pfam: PF00096. zf-C2H2. 1 hit.
SMART: SM00355. ZnF_C2H2. 3 hits.
PROSITE: PS00028. ZINC_FINGER_C2H2_1. 3 hits.
    PS50157. ZINC_FINGER_C2H2_2. 3 hits.

Sequences (3)

Sequence status: Complete.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: O70494-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MTAPEKPVKQ EEMAALDVDG GGGGGGHGEY LQQQQQQQQQ HGNGAAAAAA
        60         70         80         90        100
QDTQPSPLAL LAATCSKIGP PSPGDDDEEA AVAAAAGVPA AAAGATGDLA
       110        120        130        140        150
SAQLGGAPNR WEVLSATPTT IKDEAGNLVQ IPGAATSSGQ YVLPLQNLQN
       160        170        180        190        200
QQIFSVAPGS DSSNGTVSNV QYQVIPQIQS TDAQQVQIGF TGSSDNGGIN
       210        220        230        240        250
QENSQIQIIP GSNQTLLASG TPPANIQNLI PQTGQVQVQG VAIGGSSFPG
       260        270        280        290        300
QTQVVANVPL GLPGNITFVP INSVDLDSLG LSGSSQTMTA GINADGHLIN
       310        320        330        340        350
TGQAMDSSDN SERTGERVSP DVNETNADTD LFVPTSSSSQ LPVTIDSTGI
       360        370        380        390        400
LQQNTNSLTT TSGQVHSSDL QGNYIQSPVS EETQAQNIQV STAQPVVQHL
       410        420        430        440        450
QLQDSQQPTS QAQIVQGITP QTIHGVQASG QNISQQALQN LQLQLNPGTF
       460        470        480        490        500
LIQAQTVTPS GQITWQTFQV QGVQNLQNLQ IQNTAAQQIT LTPVQTLTLG
       510        520        530        540        550
QVAAGGALTS TPVSLSTGQL PNLQTVTVNS IDSTGIQLHP GENADSPADI
       560        570        580        590        600
RIKEEEPDPE EWQLSGDSTL NTNDLTHLRV QVVDEEGDQQ HQEGKRLRRV
       610        620        630        640        650
ACTCPNCKEG GGRGTNLGKK KQHICHIPGC GKVYGKTSHL RAHLRWHSGE
       660        670        680        690        700
RPFICNWMFC GKRFTRSDEL QRHRRTHTGE KKFVCPECSK RFMRSDHLAK
       710        720        730        740        750
HIKTHQNKKV IHSSSTVLAS VEAGRDDALI TAGGTTLILA NIQQGSVSGI
       760        770        780
GTVNTSATSN QDILTNTEIP LQLVTVSGNE TME
Length: 783
Mass (Da): 82,362
Last modified: January 10, 2006 - v2
Checksum: E45A00D566454D61

Isoform 2 (identifier: O70494-2)

The sequence of this isoform differs from the canonical sequence as follows:
     52-95: Missing.

Length: 739
Mass (Da): 78,364
Checksum: 22CEB08B93A699B0

Isoform 3 (identifier: O70494-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1-64: Missing.
     65-94: CSKIGPPSPGDDDEEAAVAAAAGVPAAAAG → MKEDRAAIAGRRRRGRRSCAHDEKTAADKR

Note: No experimental confirmation available.
Length: 719
Mass (Da): 76,729
Checksum: 8B16F3B85E282654
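
The isoform descriptions above are edits against the canonical sequence, so the isoform sequences and their lengths can be derived mechanically. The following is a minimal sketch of that derivation under stated assumptions: `apply_variants` is a hypothetical helper, residues 1-100 are copied from the isoform 1 sequence in this entry, and residues 101-783 are stood in for by a placeholder because neither isoform differs in that region.

```python
def apply_variants(canonical, variants):
    """Apply (start, end, replacement) edits to a sequence.

    Coordinates are 1-based and inclusive, as in this entry.
    An empty replacement string means the range is 'Missing'.
    """
    seq = canonical
    # Apply the C-terminal-most edit first so that the coordinates of
    # the remaining edits still refer to the canonical numbering.
    for start, end, replacement in sorted(variants, reverse=True):
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq


# Residues 1-100 of isoform 1, copied from the sequence in this entry.
N_TERM_1_100 = (
    "MTAPEKPVKQEEMAALDVDGGGGGGGHGEYLQQQQQQQQQHGNGAAAAAA"
    "QDTQPSPLALLAATCSKIGPPSPGDDDEEAAVAAAAGVPAAAAGATGDLA"
)
# Placeholder (assumption): residues 101-783 are replaced by 'X' here,
# since only their count matters for the length check.
CANONICAL = N_TERM_1_100 + "X" * 683

ISOFORM_2 = [(52, 95, "")]
ISOFORM_3 = [(1, 64, ""), (65, 94, "MKEDRAAIAGRRRRGRRSCAHDEKTAADKR")]

iso2 = apply_variants(CANONICAL, ISOFORM_2)
iso3 = apply_variants(CANONICAL, ISOFORM_3)

print(len(CANONICAL), len(iso2), len(iso3))  # 783 739 719
print(iso3[:36])  # MKEDRAAIAGRRRRGRRSCAHDEKTAADKRATGDLA
```

The derived lengths agree with the values stated for each isoform: removing 44 residues gives 739, and removing 64 residues while substituting 30 residues for 30 gives 719.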

Sequence caution

The sequence AAH27797.2 differs from that shown. Reason: Erroneous initiation. (Curated)
The sequence BAE21310.1 differs from that shown. Reason: Erroneous initiation. (Curated)

Experimental Info

Feature key          Position(s)    Length    Description
Sequence conflict    231 – 233      3         PQT → TRP in AAH27797 (PubMed:15489334). (Curated)

Alternative sequence

Feature key             Position(s)    Length    Feature identifier    Description
Alternative sequence    1 – 64         64        VSP_016782            Missing in isoform 3. (1 publication)
Alternative sequence    52 – 95        44        VSP_016783            Missing in isoform 2. (1 publication)
Alternative sequence    65 – 94        30        VSP_016784            CSKIG…AAAAG → MKEDRAAIAGRRRRGRRSCAHDEKTAADKR in isoform 3. (1 publication)

Sequence databases

EMBL / GenBank / DDBJ:
AK004607 mRNA. Translation: BAC25090.1.
AK132702 mRNA. Translation: BAE21310.1. Different initiation.
AL844840 Genomic DNA. Translation: CAM21417.1.
BC027797 mRNA. Translation: AAH27797.2. Different initiation.
BC079874 mRNA. Translation: AAH79874.1.
AF062567 mRNA. Translation: AAC16322.1.
CCDS: CCDS16122.1. [O70494-1]
RefSeq: NP_001018052.1. NM_001018042.3. [O70494-1]
    NP_001091895.1. NM_001098425.1.
UniGene: Mm.124328.
    Mm.446209.
    Mm.465887.

Genome annotation databases

Ensembl: ENSMUST00000066003; ENSMUSP00000065807; ENSMUSG00000027109. [O70494-2]
    ENSMUST00000102689; ENSMUSP00000099750; ENSMUSG00000027109. [O70494-1]
GeneID: 20687.
KEGG: mmu:20687.
UCSC: uc008kca.2. mouse. [O70494-1]
    uc008kcb.2. mouse. [O70494-2]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 6670.
MGI: MGI:1277166. Sp3.

Miscellaneous databases

NextBio: 299211.
PRO: O70494.

Publications

  1. "The transcriptional landscape of the mammalian genome."
    Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N., Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K., Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J., Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R., Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T., Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A., Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B., Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M., Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S., Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E., Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D., Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M., Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H., Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V., Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S., Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H., Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N., Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F., Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G., Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z., Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C., Schoenbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y., Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S., Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K., Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R., van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H., Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M., Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C., Mattick J.S., Hume D.A., Kai C., Sasaki D., Tomaru Y., Fukuda S., Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K., Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M., Kondo S., Konno H., Nakano K., Ninomiya N., Nishio T., Okada M., Plessy C., Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A., Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y.
    Science 309:1559-1563(2005)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1 AND 3).
    Strain: C57BL/6J.
    Tissue: Lung and Testis.
  2. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: C57BL/6J.
  3. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
    Strain: C57BL/6J and FVB/N.
    Tissue: Brain and Mammary tumor.
  4. "Sp family transcription factors regulate expression of rat D2 dopamine receptor gene."
    Yajima S., Lee S.H., Minowa T., Mouradian M.M.
    DNA Cell Biol. 17:471-479(1998)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 15-783 (ISOFORM 2).
    Tissue: Neuroblastoma.
  5. "Sp3 is involved in the regulation of SOCS3 gene expression."
    Ehlting C., Haussinger D., Bode J.G.
    Biochem. J. 387:737-745(2005)
    Cited for: ACETYLATION AT LYS-553, FUNCTION.
  6. Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-72; SER-565 AND SER-568, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Brain, Kidney, Lung and Spleen.

Entry information

Entry name: SP3_MOUSE
Accession: Primary (citable) accession number: O70494
Secondary accession number(s): A2AQK9, Q68FF2, Q8CF64, Q8K378
Entry history
Integrated into UniProtKB/Swiss-Prot: May 9, 2003
Last sequence update: January 10, 2006
Last modified: March 16, 2016
This is version 142 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.