Protein

Sp110 nuclear body protein

Gene

SP110

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Transcription factor. May be a nuclear hormone receptor coactivator. Enhances transcription of genes with retinoic acid response elements (RARE).

Regions

Feature key   Position(s)   Description   Evidence                     Length
Zinc finger   534 – 580     PHD-type      PROSITE-ProRule annotation   47

GO - Molecular function

  • DNA binding Source: ProtInc
  • signal transducer activity Source: ProtInc
  • zinc ion binding Source: InterPro

Keywords - Biological process

Host-virus interaction, Transcription, Transcription regulation

Keywords - Ligand

DNA-binding, Metal-binding, Zinc

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000135899-MONOMER.

Names & Taxonomy

Protein names
Recommended name:
Sp110 nuclear body protein
Alternative name(s):
Interferon-induced protein 41/75
Speckled 110 kDa
Transcriptional coactivator Sp110
Gene names
Name: SP110
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 2

Organism-specific databases

HGNC: HGNC:5401. SP110.

Subcellular location

Keywords - Cellular component

Nucleus

Pathology & Biotech

Involvement in disease

Hepatic venoocclusive disease with immunodeficiency (VODI) (1 publication)
The disease is caused by mutations affecting the gene represented in this entry.
Disease description: Autosomal recessive primary immunodeficiency associated with hepatic vascular occlusion and fibrosis. The immunodeficiency is characterized by severe hypogammaglobulinemia, combined T and B-cell immunodeficiency, absent lymph node germinal centers, and absent tissue plasma cells.
See also OMIM:235550

Organism-specific databases

DisGeNET: 3431.
MalaCards: SP110.
MIM: 235550. phenotype.
OpenTargets: ENSG00000135899.
  ENSG00000280755.
Orphanet: 79124. Hepatic veno-occlusive disease - immunodeficiency.
PharmGKB: PA35104.

Polymorphism and mutation databases

BioMuta: SP110.
DMDM: 313104323.

PTM / Processing

Molecule processing

Feature key   Position(s)   Description                  Identifier       Length
Chain         1 – 689       Sp110 nuclear body protein   PRO_0000074101   689

Amino acid modifications

Feature key        Position(s)   Description     Evidence           Length
Modified residue   175           Phosphoserine   By similarity      1
Modified residue   177           Phosphoserine   By similarity      1
Modified residue   244           Phosphoserine   Combined sources   1
Modified residue   256           Phosphoserine   Combined sources   1
Modified residue   380           Phosphoserine   Combined sources   1

Post-translational modification

Phosphorylated (isoform 2).

Keywords - PTM

Phosphoprotein

Proteomic databases

MaxQB: Q9HB58.
PaxDb: Q9HB58.
PeptideAtlas: Q9HB58.
PRIDE: Q9HB58.

PTM databases

iPTMnet: Q9HB58.
PhosphoSitePlus: Q9HB58.

Expression

Tissue specificity

Highly expressed in peripheral blood leukocytes and spleen. Detected at intermediate levels in thymus, prostate, testis, ovary, small intestine and colon, and at low levels in heart, brain, placenta, lung, liver, skeletal muscle, kidney and pancreas.

Induction

By IFNG/IFN-gamma and all-trans retinoic acid (ATRA).

Gene expression databases

Bgee: ENSG00000135899.
CleanEx: HS_SP110.
ExpressionAtlas: Q9HB58. baseline and differential.
Genevisible: Q9HB58. HS.

Organism-specific databases

HPA: HPA047036.

Interaction

Subunit structure

Isoform 3 interacts with HCV core protein (1 publication).

Protein-protein interaction databases

BioGrid: 109657. 22 interactors.
IntAct: Q9HB58. 18 interactors.
MINT: MINT-1401836.
STRING: 9606.ENSP00000258381.

Structure

3D structure databases

ProteinModelPortal: Q9HB58.
SMR: Q9HB58.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Description   Evidence                     Length
Domain        1 – 108       HSR           PROSITE-ProRule annotation   108
Domain        454 – 535     SAND          PROSITE-ProRule annotation   82
Domain        581 – 676     Bromo         -                            96

Region

Feature key   Position(s)   Description                            Evidence            Length
Region        525 – 529     Nuclear hormone receptor interaction   Sequence analysis   5

Motif

Feature key   Position(s)   Description                   Evidence            Length
Motif         281 – 294     Nuclear localization signal   Sequence analysis   14
Motif         428 – 444     Nuclear localization signal   Sequence analysis   17

Sequence similarities

Contains 1 bromo domain. (Curated)
Contains 1 HSR domain. (PROSITE-ProRule annotation)
Contains 1 PHD-type zinc finger. (PROSITE-ProRule annotation)
Contains 1 SAND domain. (PROSITE-ProRule annotation)

Zinc finger

Feature key   Position(s)   Description   Evidence                     Length
Zinc finger   534 – 580     PHD-type      PROSITE-ProRule annotation   47

Keywords - Domain

Bromodomain, Zinc-finger

Phylogenomic databases

eggNOG: KOG2177. Eukaryota.
  ENOG4111G04. LUCA.
GeneTree: ENSGT00510000046835.
HOGENOM: HOG000089984.
HOVERGEN: HBG006294.
InParanoid: Q9HB58.
OMA: HCSKLPV.
OrthoDB: EOG091G0KIE.
PhylomeDB: Q9HB58.
TreeFam: TF335091.

Family and domain databases

Gene3D: 1.20.920.10. 1 hit.
  3.10.390.10. 1 hit.
  3.30.40.10. 1 hit.
InterPro: IPR001487. Bromodomain.
  IPR004865. HSR_dom.
  IPR000770. SAND_dom.
  IPR010919. SAND_dom-like.
  IPR019786. Zinc_finger_PHD-type_CS.
  IPR011011. Znf_FYVE_PHD.
  IPR001965. Znf_PHD.
  IPR019787. Znf_PHD-finger.
  IPR013083. Znf_RING/FYVE/PHD.
Pfam: PF03172. HSR. 1 hit.
  PF01342. SAND. 1 hit.
SMART: SM00297. BROMO. 1 hit.
  SM00249. PHD. 1 hit.
  SM00258. SAND. 1 hit.
SUPFAM: SSF47370. SSF47370. 1 hit.
  SSF57903. SSF57903. 1 hit.
  SSF63763. SSF63763. 1 hit.
PROSITE: PS51414. HSR. 1 hit.
  PS50864. SAND. 1 hit.
  PS01359. ZF_PHD_1. 1 hit.
  PS50016. ZF_PHD_2. 1 hit.

Sequences (7)

Sequence status: Complete.

This entry describes 7 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q9HB58-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MFTMTRAMEE ALFQHFMHQK LGIAYAIHKP FPFFEGLLDN SIITKRMYME
        60         70         80         90        100
SLEACRNLIP VSRVVHNILT QLERTFNLSL LVTLFSQINL REYPNLVTIY
       110        120        130        140        150
RSFKRVGASY EWQSRDTPIL LEAPTGLAEG SSLHTPLALP PPQPPQPSCS
       160        170        180        190        200
PCAPRVSEPG TSSQQSDEIL SESPSPSDPV LPLPALIQEG RSTSVTNDKL
       210        220        230        240        250
TSKMNAEEDS EEMPSLLTST VQVASDNLIP QIRDKEDPQE MPHSPLGSMP
       260        270        280        290        300
EIRDNSPEPN DPEEPQEVSS TPSDKKGKKR KRCIWSTPKR RHKKKSLPGG
       310        320        330        340        350
TASSRHGIQK KLKRVDQVPQ KKDDSTCNST VETRAQKART ECARKSRSEE
       360        370        380        390        400
IIDGTSEMNE GKRSQKTPST PRRVTQGAAS PGHGIQEKLQ VVDKVTQRKD
       410        420        430        440        450
DSTWNSEVMM RVQKARTKCA RKSRLKEKKK EKDICSSSKR RFQKNIHRRG
       460        470        480        490        500
KPKSDTVDFH CSKLPVTCGE AKGILYKKKM KHGSSVKCIR NEDGTWLTPN
       510        520        530        540        550
EFEVEGKGRN AKNWKRNIRC EGMTLGELLK RKNSDECEVC CQGGQLLCCG
       560        570        580        590        600
TCPRVFHEDC HIPPVEAKRM LWSCTFCRMK RSSGSQQCHH VSKTLERQMQ
       610        620        630        640        650
PQDQLIRDYG EPFQEAMWLD LVKERLITEM YTVAWFVRDM RLMFRNHKTF
       660        670        680
YKASDFGQVG LDLEAEFEKD LKDVLGFHEA NDGGFWTLP
Length: 689
Mass (Da): 78,396
Last modified: November 30, 2010 - v5
Checksum: 31552E1A4C498EF7

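The layout used for the canonical sequence (ten residues per group, five groups per row, with a ruler giving the position of the last residue of each complete group) can be reproduced with a few lines of Python. This is a minimal sketch: the function name `format_sequence` and its parameters are illustrative choices and not part of any UniProt tool.

```python
def format_sequence(seq, group=10, per_line=5):
    """Lay out seq in space-separated groups with a position ruler above each row."""
    lines = []
    row_len = group * per_line
    for row_start in range(0, len(seq), row_len):
        row = seq[row_start:row_start + row_len]
        chunks = [row[i:i + group] for i in range(0, len(row), group)]
        labels = []
        for k, chunk in enumerate(chunks):
            if len(chunk) == group:
                # 1-based position of the last residue in this complete group
                end_pos = row_start + (k + 1) * group
                labels.append(str(end_pos).rjust(group))
            else:
                # A trailing partial group gets no ruler label
                labels.append(" " * len(chunk))
        lines.append(" ".join(labels).rstrip())
        lines.append(" ".join(chunks))
    return "\n".join(lines)


if __name__ == "__main__":
    # First 25 residues of the canonical sequence shown in this entry
    print(format_sequence("MFTMTRAMEEALFQHFMHQKLGIAY"))
```

Passing the full 689-residue sequence should give 14 rows, the last one carrying ruler labels only up to 680 because the final group has nine residues.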
Isoform 2 (identifier: Q9HB58-2)
Also known as: IFI75, 75

The sequence of this isoform differs from the canonical sequence as follows:
     1-203: Missing.
     606-611: IRDYGE → NVSSSS
     612-689: Missing.

Length: 408
Mass (Da): 46,285
Checksum: C9AFB49FD426F9F5

Isoform 3 (identifier: Q9HB58-3)
Also known as: Sp110b

The sequence of this isoform differs from the canonical sequence as follows:
     531-549: RKNSDECEVCCQGGQLLCC → SGLLLCPPRINLKRELNSK
     550-689: Missing.

Length: 549
Mass (Da): 61,898
Checksum: A074ED9825817FC3

Isoform 4 (identifier: Q9HB58-4)
Also known as: IFI41, 41

The sequence of this isoform differs from the canonical sequence as follows:
     1-251: Missing.
     252-275: IRDNSPEPNDPEEPQEVSSTPSDK → MASSGVKNTPRWRRKAPHGRERKE
     300-349: Missing.
     531-549: RKNSDECEVCCQGGQLLCC → SGLLLCPPRINLKRELNSK
     550-689: Missing.

Length: 248
Mass (Da): 28,510
Checksum: 54B0C13AA73E51E1

Isoform 5 (identifier: Q9HB58-5)

The sequence of this isoform differs from the canonical sequence as follows:
     300-303: GTAS → AL
     531-549: RKNSDECEVCCQGGQLLCC → SGLLLCPPRINLKRELNSK
     550-689: Missing.

Length: 547
Mass (Da): 61,766
Checksum: FA0D7C52A2025D87

Isoform 6 (identifier: Q9HB58-6)

The sequence of this isoform differs from the canonical sequence as follows:
     605-605: L → LKCEFLLLKAYCHPQSSFFTGIPFN

Length: 713
Mass (Da): 81,169
Checksum: 146D7A3D0EEB99EA

Isoform 7 (identifier: Q9HB58-7)

The sequence of this isoform differs from the canonical sequence as follows:
     1-1: M → MGRGFRM
     531-549: RKNSDECEVCCQGGQLLCC → SGLLLCPPRINLKRELNSK
     550-689: Missing.

Note: No experimental confirmation available.
Length: 555
Mass (Da): 62,603
Checksum: 44866BA4C9BBA7D0
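The isoform lengths listed above follow from the canonical length (689) and the per-isoform differences. The Python sketch below applies such differences to a sequence and checks the length arithmetic. The names `apply_edits`, `edited_length` and `ISOFORM_EDITS` are hypothetical and chosen for illustration. The coordinates and replacement strings are copied from this entry, and positions are treated as 1-based and inclusive.

```python
def apply_edits(canonical, edits):
    """Apply (start, end, replacement) edits, 1-based inclusive, to a sequence."""
    out = canonical
    # Work from the C-terminal end so that earlier coordinates stay valid.
    for start, end, replacement in sorted(edits, reverse=True):
        out = out[:start - 1] + replacement + out[end:]
    return out


def edited_length(canonical_length, edits):
    """Length after the edits, without needing the sequence itself."""
    return canonical_length + sum(
        len(replacement) - (end - start + 1) for start, end, replacement in edits
    )


ALT_C_TERM = (531, 549, "SGLLLCPPRINLKRELNSK")  # shared by isoforms 3, 4, 5 and 7

ISOFORM_EDITS = {
    "Q9HB58-2": [(1, 203, ""), (606, 611, "NVSSSS"), (612, 689, "")],
    "Q9HB58-3": [ALT_C_TERM, (550, 689, "")],
    "Q9HB58-4": [(1, 251, ""), (252, 275, "MASSGVKNTPRWRRKAPHGRERKE"),
                 (300, 349, ""), ALT_C_TERM, (550, 689, "")],
    "Q9HB58-5": [(300, 303, "AL"), ALT_C_TERM, (550, 689, "")],
    "Q9HB58-6": [(605, 605, "LKCEFLLLKAYCHPQSSFFTGIPFN")],
    "Q9HB58-7": [(1, 1, "MGRGFRM"), ALT_C_TERM, (550, 689, "")],
}

if __name__ == "__main__":
    for isoform, edits in ISOFORM_EDITS.items():
        print(isoform, edited_length(689, edits))
```

The computed values can be compared with the Length fields above (408, 549, 248, 547, 713 and 555). `apply_edits` does the same on an actual sequence string, for example the canonical sequence of isoform 1.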

Sequence caution

The sequence AAF99318 differs from that shown. Reason: Frameshift at several positions. (Curated)
Isoform 3: The sequence AAF99318 differs from that shown. Reason: Frameshift at position 534. (Curated)
The sequence AAG09826 differs from that shown. Reason: Frameshift at positions 141 and 143. (Curated)
The sequence AK026488 differs from that shown. Reason: Frameshift at positions 296, 542 and 567. (Curated)

Experimental Info

Feature key         Position(s)   Description                            Evidence   Length
Sequence conflict   167           D → T in AAF99318 (PubMed:10913195).   Curated    1
Sequence conflict   167           D → T in AAG09826 (PubMed:10913195).   Curated    1
Sequence conflict   464           L → S in AAA18806 (PubMed:7693701).    Curated    1
Sequence conflict   570           M → I in AK026488 (PubMed:14702039).   Curated    1

Natural variant

Feature key       Identifier   Position(s)   Length   dbSNP         Publications   Description
Natural variant   VAR_036029   8             1        rs200067258   1              M → T in a breast cancer sample; somatic mutation.
Natural variant   VAR_027170   112           1        rs1129411     2              W → R.
Natural variant   VAR_027171   128           1        rs11556887    1              A → V.
Natural variant   VAR_047051   173           1        rs41552315    -              S → L.
Natural variant   VAR_027172   206           1        rs28930679    1              A → V.
Natural variant   VAR_027173   207           1        rs9061        3              E → K.
Natural variant   VAR_047052   210           1        rs1063154     -              S → A.
Natural variant   VAR_027174   212           1        rs1047254     1              E → G.
Natural variant   VAR_027175   249           1        rs3769838     1              M → V.
Natural variant   VAR_027176   267           1        rs1129425     1              E → G.
Natural variant   VAR_027177   299           1        rs1365776     6              G → R.
Natural variant   VAR_027178   367           1        rs59573011    1              T → M.
Natural variant   VAR_027179   425           1        rs3948464     3              L → S. Polymorphism; may be associated with increased susceptibility to tuberculosis.
Natural variant   VAR_027180   523           1        rs1135791     2              M → T.
Natural variant   VAR_027181   579           1        rs3948463     1              M → I.
Natural variant   VAR_036030   683           1        -             1              G → S in a breast cancer sample; somatic mutation.

Alternative sequence

Feature key            Identifier   Position(s)   Length   Publications   Description
Alternative sequence   VSP_005991   1 – 251       251      1              Missing in isoform 4.
Alternative sequence   VSP_005992   1 – 203       203      1              Missing in isoform 2.
Alternative sequence   VSP_046079   1             1        1              M → MGRGFRM in isoform 7.
Alternative sequence   VSP_005994   252 – 275     24       1              IRDNS…TPSDK → MASSGVKNTPRWRRKAPHGRERKE in isoform 4.
Alternative sequence   VSP_005995   300 – 349     50       1              Missing in isoform 4.
Alternative sequence   VSP_005996   300 – 303     4        1              GTAS → AL in isoform 5.
Alternative sequence   VSP_005997   531 – 549     19       4              RKNSD…QLLCC → SGLLLCPPRINLKRELNSK in isoform 3, isoform 4, isoform 5 and isoform 7.
Alternative sequence   VSP_006000   550 – 689     140      4              Missing in isoform 3, isoform 4, isoform 5 and isoform 7.
Alternative sequence   VSP_035593   605           1        1              L → LKCEFLLLKAYCHPQSSFFTGIPFN in isoform 6.
Alternative sequence   VSP_006001   606 – 611     6        1              IRDYGE → NVSSSS in isoform 2.
Alternative sequence   VSP_006002   612 – 689     78       1              Missing in isoform 2.
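Because all positions in this entry refer to the canonical sequence, the alternative-sequence ranges can be compared directly with the domain and zinc-finger coordinates to see which annotated features each isoform touches. The Python sketch below does this with a plain interval-overlap test. The names `FEATURES`, `ALTERED` and `affected_features` are hypothetical, and the coordinates are copied from this entry. An overlap only means that the coordinates intersect; it does not by itself establish loss of function.

```python
# Feature coordinates (1-based, inclusive) as annotated on the canonical sequence.
FEATURES = {
    "HSR domain": (1, 108),
    "SAND domain": (454, 535),
    "PHD-type zinc finger": (534, 580),
    "Bromo domain": (581, 676),
}

# Ranges that are missing, replaced or extended in each isoform.
ALTERED = {
    "isoform 2": [(1, 203), (606, 611), (612, 689)],
    "isoform 3": [(531, 549), (550, 689)],
    "isoform 4": [(1, 251), (252, 275), (300, 349), (531, 549), (550, 689)],
    "isoform 5": [(300, 303), (531, 549), (550, 689)],
    "isoform 6": [(605, 605)],
    "isoform 7": [(1, 1), (531, 549), (550, 689)],
}


def affected_features(altered_ranges, features=FEATURES):
    """Return names of features whose range intersects any altered range."""
    hits = []
    for name, (f_start, f_end) in features.items():
        # Two closed intervals overlap when each starts before the other ends.
        if any(a_start <= f_end and f_start <= a_end
               for a_start, a_end in altered_ranges):
            hits.append(name)
    return hits


if __name__ == "__main__":
    for isoform, ranges in ALTERED.items():
        print(isoform, "->", ", ".join(affected_features(ranges)))
```

For isoform 3, for example, the altered ranges intersect the SAND domain, the PHD-type zinc finger and the Bromo domain, while the HSR domain is untouched.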

Sequence databases

EMBL/GenBank/DDBJ:
  L22342 mRNA. Translation: AAA18806.1.
  L22343 mRNA. Translation: AAD13402.1.
  AF280094 mRNA. Translation: AAF99318.1. Frameshift.
  AF280095 mRNA. Translation: AAG09826.1. Frameshift.
  AK026488 mRNA. No translation available.
  AK301097 mRNA. Translation: BAG62696.1.
  AC009950 Genomic DNA. Translation: AAX93281.1.
  CH471063 Genomic DNA. Translation: EAW70915.1.
  BC019059 mRNA. Translation: AAH19059.1.
CCDS: CCDS2474.1. [Q9HB58-1]
  CCDS2475.1. [Q9HB58-6]
  CCDS2476.1. [Q9HB58-3]
  CCDS54435.1. [Q9HB58-7]
PIR: A49515.
RefSeq: NP_001171944.1. NM_001185015.1. [Q9HB58-7]
  NP_004500.3. NM_004509.3.
  NP_536349.2. NM_080424.2.
UniGene: Hs.145150.

Genome annotation databases

Ensembl: ENST00000258381; ENSP00000258381; ENSG00000135899. [Q9HB58-6]
  ENST00000258382; ENSP00000258382; ENSG00000135899. [Q9HB58-3]
  ENST00000358662; ENSP00000351488; ENSG00000135899. [Q9HB58-1]
GeneID: 3431.
KEGG: hsa:3431.
UCSC: uc002vqg.5. human. [Q9HB58-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Web resources

SP110base

SP110 mutation db

Protocols and materials databases

DNASU: 3431.

Organism-specific databases

CTD: 3431.
DisGeNET: 3431.
GeneCards: SP110.
GeneReviews: SP110.
HGNC: HGNC:5401. SP110.
HPA: HPA047036.
MalaCards: SP110.
MIM: 235550. phenotype.
  604457. gene.
neXtProt: NX_Q9HB58.
OpenTargets: ENSG00000135899.
  ENSG00000280755.
Orphanet: 79124. Hepatic veno-occlusive disease - immunodeficiency.
PharmGKB: PA35104.

Miscellaneous databases

ChiTaRS: SP110. human.
GeneWiki: SP110.
GenomeRNAi: 3431.
PRO: Q9HB58.

Entry information

Entry name: SP110_HUMAN
Accession: Primary (citable) accession number: Q9HB58
Secondary accession number(s): B4DVI4, F5H1M1, Q14976, Q14977, Q53TG2, Q8WUZ6, Q9HCT8
Entry history
Integrated into UniProtKB/Swiss-Prot: August 2, 2002
Last sequence update: November 30, 2010
Last modified: November 2, 2016
This is version 157 of the entry and version 5 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 2
    Human chromosome 2: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.