Protein

Superkiller viralicidic activity 2-like 2

Gene

SKIV2L2

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

May be involved in pre-mRNA splicing. Associated with the RNA exosome complex and involved in the 3'-processing of the 7S pre-RNA to the mature 5.8S rRNA. (1 publication)

Catalytic activity

ATP + H2O = ADP + phosphate.

Regions

Feature key | Position(s) | Length | Description
Nucleotide binding | 161 – 168 | 8 | ATP (PROSITE-ProRule annotation)

GO - Molecular function

  • ATP binding Source: UniProtKB-KW
  • poly(A) RNA binding Source: UniProtKB
  • RNA helicase activity Source: InterPro

GO - Biological process

  • maturation of 5.8S rRNA Source: UniProtKB
  • mRNA splicing, via spliceosome Source: UniProtKB
  • RNA catabolic process Source: InterPro
  • rRNA processing Source: Reactome

Keywords - Molecular function

Helicase, Hydrolase

Keywords - Biological process

mRNA processing, mRNA splicing, rRNA processing

Keywords - Ligand

ATP-binding, Nucleotide-binding

Enzyme and pathway databases

Reactome: R-HSA-6791226. Major pathway of rRNA processing in the nucleolus.

Names & Taxonomy

Protein names
Recommended name:
Superkiller viralicidic activity 2-like 2 (EC:3.6.4.13)
Alternative name(s):
ATP-dependent RNA helicase DOB1 (1 publication)
ATP-dependent RNA helicase SKIV2L2
TRAMP-like complex helicase
Gene names
Name: SKIV2L2
Synonyms: DOB1, KIAA0052, Mtr4
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 5

Organism-specific databases

HGNC: HGNC:18734. SKIV2L2.

Subcellular location

GO - Cellular component

  • catalytic step 2 spliceosome Source: UniProtKB
  • nucleolus Source: UniProtKB-SubCell
  • nucleoplasm Source: Reactome
  • nucleus Source: UniProtKB

Keywords - Cellular component

Nucleus, Spliceosome

Pathology & Biotech

Organism-specific databases

PharmGKB: PA134901921.

Polymorphism and mutation databases

BioMuta: SKIV2L2.
DMDM: 71153172.

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Feature identifier
Initiator methionine | | | Removed (combined sources; 1 publication) |
Chain | 2 – 1042 | 1041 | Superkiller viralicidic activity 2-like 2 | PRO_0000102094

Amino acid modifications

Feature key | Position(s) | Length | Description
Modified residue | 2 | 1 | N-acetylalanine (combined sources; 1 publication)
Modified residue | 51 | 1 | N6-acetyllysine (combined sources)
Modified residue | 78 | 1 | N6-acetyllysine (by similarity)
Cross-link | 684 | 1 | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) (combined sources)

Keywords - PTM

Acetylation, Isopeptide bond, Ubl conjugation

Proteomic databases

EPD: P42285.
MaxQB: P42285.
PaxDb: P42285.
PeptideAtlas: P42285.
PRIDE: P42285.

2D gel databases

SWISS-2DPAGE: P42285.

PTM databases

iPTMnet: P42285.
PhosphoSite: P42285.
SwissPalm: P42285.

Expression

Gene expression databases

Bgee: P42285.
CleanEx: HS_SKIV2L2.
ExpressionAtlas: P42285. baseline and differential.
Genevisible: P42285. HS.

Organism-specific databases

HPA: HPA037379.

Interaction

Subunit structure

Component of a TRAMP-like complex, an ATP-dependent exosome regulatory complex consisting of a helicase (SKIV2L2/MTR4), an oligoadenylate polymerase (PAPD5 or PAPD7), and a substrate-specific RNA-binding factor (ZCCHC7 or ZCCHC8). Several TRAMP-like complexes exist with specific compositions and are associated with nuclear or nucleolar RNA exosomes. Identified in the spliceosome C complex. Interacts with isoform 1 of NVL in an ATP-dependent manner. (3 publications)

Binary interactions

With | Entry | #Exp. | IntAct
MPHOSPH6 | Q99547 | 2 | EBI-347612, EBI-373187

Protein-protein interaction databases

BioGrid: 117064. 101 interactions.
IntAct: P42285. 41 interactions.
MINT: MINT-3015504.
STRING: 9606.ENSP00000230640.

Structure

3D structure databases

ProteinModelPortal: P42285.
SMR: P42285. Positions 112-1040.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Length | Description
Domain | 148 – 304 | 157 | Helicase ATP-binding (PROSITE-ProRule annotation)
Domain | 405 – 577 | 173 | Helicase C-terminal (PROSITE-ProRule annotation)
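
The Length column in the feature tables follows from the 1-based, inclusive position ranges. The following is a minimal sketch of that arithmetic; the function name `feature_length` and the dictionary are illustrative and not part of any UniProt tool, and the numbers are copied from the tables in this entry.

```python
def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive start and end positions."""
    if start < 1 or end < start:
        raise ValueError("positions must satisfy 1 <= start <= end")
    return end - start + 1


# (start, end, listed length) as given in the feature tables of this entry.
listed = {
    "Chain": (2, 1042, 1041),
    "Nucleotide binding": (161, 168, 8),
    "Helicase ATP-binding domain": (148, 304, 157),
    "Helicase C-terminal domain": (405, 577, 173),
    "DEIH box motif": (252, 255, 4),
}

# Each listed length should equal end - start + 1.
for name, (start, end, length) in listed.items():
    assert feature_length(start, end) == length, name
```

Because both endpoints are included, a single-residue feature such as the modified residue at position 51 has length 1.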

Motif

Feature key | Position(s) | Length | Description
Motif | 252 – 255 | 4 | DEIH box
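
As a quick consistency check, the motif coordinates can be mapped back onto the displayed sequence. The sketch below assumes a hypothetical helper, `slice_feature`, and uses only residues 251 to 300 copied from the sequence shown in this entry; it is an illustration of 1-based coordinate handling, not a UniProt API.

```python
# Residues 251-300 of the displayed sequence (the block that ends at position 300).
SEGMENT_START = 251
SEGMENT = "FDEIHYMRDSERGVVWEETIILLPDNVHYVFLSATIPNARQFAEWICHLH"


def slice_feature(segment: str, segment_start: int, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) from a segment
    whose first residue sits at position segment_start."""
    segment_end = segment_start + len(segment) - 1
    if start > end or start < segment_start or end > segment_end:
        raise ValueError("feature lies outside the segment")
    offset = start - segment_start
    return segment[offset:offset + (end - start + 1)]


motif = slice_feature(SEGMENT, SEGMENT_START, 252, 255)
print(motif)  # DEIH
```

The slice at positions 252 to 255 reads DEIH, which agrees with the motif description.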

Sequence similarities

Belongs to the helicase family. SKI2 subfamily. (Curated)
Contains 1 helicase ATP-binding domain. (PROSITE-ProRule annotation)
Contains 1 helicase C-terminal domain. (PROSITE-ProRule annotation)

Phylogenomic databases

eggNOG: KOG0948. Eukaryota.
    COG4581. LUCA.
GeneTree: ENSGT00820000127042.
HOGENOM: HOG000163047.
HOVERGEN: HBG104255.
InParanoid: P42285.
KO: K12598.
OMA: HDVSYPE.
OrthoDB: EOG7XSTCX.
PhylomeDB: P42285.
TreeFam: TF300597.

Family and domain databases

Gene3D: 3.40.50.300. 3 hits.
InterPro: IPR011545. DEAD/DEAH_box_helicase_dom.
    IPR014001. Helicase_ATP-bd.
    IPR001650. Helicase_C.
    IPR027417. P-loop_NTPase.
    IPR025696. rRNA_proc-arch_dom.
    IPR016438. Ski2.
    IPR012961. Ski2_C.
Pfam: PF00270. DEAD. 1 hit.
    PF08148. DSHCT. 1 hit.
    PF00271. Helicase_C. 1 hit.
    PF13234. rRNA_proc-arch. 1 hit.
PIRSF: PIRSF005198. Antiviral_helicase_SKI2. 1 hit.
SMART: SM00487. DEXDc. 1 hit.
    SM01142. DSHCT. 1 hit.
    SM00490. HELICc. 1 hit.
SUPFAM: SSF52540. SSF52540. 3 hits.
PROSITE: PS51192. HELICASE_ATP_BIND_1. 1 hit.
    PS51194. HELICASE_CTER. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

P42285-1 [UniParc]

        10         20         30         40         50
MADAFGDELF SVFEGDSTTA AGTKKDKEKD KGKWKGPPGS ADKAGKRFDG
        60         70         80         90        100
KLQSESTNNG KNKRDVDFEG TDEPIFGKKP RIEESITEDL SLADLMPRVK
       110        120        130        140        150
VQSVETVEGC THEVALPAEE DYLPLKPRVG KAAKEYPFIL DAFQREAIQC
       160        170        180        190        200
VDNNQSVLVS AHTSAGKTVC AEYAIALALR EKQRVIFTSP IKALSNQKYR
       210        220        230        240        250
EMYEEFQDVG LMTGDVTINP TASCLVMTTE ILRSMLYRGS EVMREVAWVI
       260        270        280        290        300
FDEIHYMRDS ERGVVWEETI ILLPDNVHYV FLSATIPNAR QFAEWICHLH
       310        320        330        340        350
KQPCHVIYTD YRPTPLQHYI FPAGGDGLHL VVDENGDFRE DNFNTAMQVL
       360        370        380        390        400
RDAGDLAKGD QKGRKGGTKG PSNVFKIVKM IMERNFQPVI IFSFSKKDCE
       410        420        430        440        450
AYALQMTKLD FNTDEEKKMV EEVFSNAIDC LSDEDKKLPQ VEHVLPLLKR
       460        470        480        490        500
GIGIHHGGLL PILKETIEIL FSEGLIKALF ATETFAMGIN MPARTVLFTN
       510        520        530        540        550
ARKFDGKDFR WISSGEYIQM SGRAGRRGMD DRGIVILMVD EKMSPTIGKQ
       560        570        580        590        600
LLKGSADPLN SAFHLTYNMV LNLLRVEEIN PEYMLEKSFY QFQHYRAIPG
       610        620        630        640        650
VVEKVKNSEE QYNKIVIPNE ESVVIYYKIR QQLAKLGKEI EEYIHKPKYC
       660        670        680        690        700
LPFLQPGRLV KVKNEGDDFG WGVVVNFSKK SNVKPNSGEL DPLYVVEVLL
       710        720        730        740        750
RCSKESLKNS ATEAAKPAKP DEKGEMQVVP VLVHLLSAIS SVRLYIPKDL
       760        770        780        790        800
RPVDNRQSVL KSIQEVQKRF PDGIPLLDPI DDMGIQDQGL KKVIQKVEAF
       810        820        830        840        850
EHRMYSHPLH NDPNLETVYT LCEKKAQIAI DIKSAKRELK KARTVLQMDE
       860        870        880        890        900
LKCRKRVLRR LGFATSSDVI EMKGRVACEI SSADELLLTE MMFNGLFNDL
       910        920        930        940        950
SAEQATALLS CFVFQENSSE MPKLTEQLAG PLRQMQECAK RIAKVSAEAK
       960        970        980        990       1000
LEIDEETYLS SFKPHLMDVV YTWATGATFA HICKMTDVFE GSIIRCMRRL
      1010       1020       1030       1040
EELLRQMCQA AKAIGNTELE NKFAEGITKI KRDIVFAASL YL
Length: 1,042
Mass (Da): 117,805
Last modified: July 19, 2005 - v3
Checksum: 49F47BEC753FBEE7
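
The blocked display above interleaves position rulers with ten-residue groups, so it has to be flattened before positions can be looked up programmatically. The following is a minimal sketch of one way to do that; `parse_blocked_sequence` is a hypothetical helper, and only the first 100 residues of the displayed sequence are included to keep the example short.

```python
def parse_blocked_sequence(text: str) -> str:
    """Flatten a blocked sequence display into one string.

    Lines made up only of numbers are treated as position rulers and skipped;
    whitespace between ten-residue groups is removed.
    """
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or all(token.isdigit() for token in tokens):
            continue  # blank line or position ruler
        residues.append("".join(tokens))
    return "".join(residues)


# First two rows (residues 1-100) copied from the sequence display in this entry.
DISPLAY = """\
        10         20         30         40         50
MADAFGDELF SVFEGDSTTA AGTKKDKEKD KGKWKGPPGS ADKAGKRFDG
        60         70         80         90        100
KLQSESTNNG KNKRDVDFEG TDEPIFGKKP RIEESITEDL SLADLMPRVK
"""

sequence = parse_blocked_sequence(DISPLAY)
print(len(sequence))  # 100
# Positions are 1-based in the entry, so subtract 1 to index the string.
print(sequence[2 - 1], sequence[51 - 1], sequence[78 - 1])  # A K K
```

The three residues printed at the end correspond to the annotated modification sites in this entry: N-acetylalanine at position 2 and N6-acetyllysine at positions 51 and 78.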

Sequence caution

The sequence AAH65258.1 differs from that shown. Reason: Erroneous initiation. Translation N-terminally shortened. (Curated)
The sequence BAA06124.2 differs from that shown. Reason: Erroneous initiation. Translation N-terminally shortened. (Curated)

Experimental Info

Feature key | Position(s) | Length | Description
Sequence conflict | 342 | 1 | N → K in AAH14669 (PubMed:15489334). (Curated)
Sequence conflict | 722 | 1 | E → G in CAE45877 (PubMed:17974005). (Curated)
Sequence conflict | 900 | 1 | L → I in AAH28604 (PubMed:15489334). (Curated)
Sequence conflict | 927 | 1 | Q → R in CAE45877 (PubMed:17974005). (Curated)

Natural variant

Feature key | Position(s) | Length | Description | Feature identifier
Natural variant | 346 | 1 | A → P. Corresponds to variant rs35643285 [dbSNP, Ensembl]. | VAR_049343
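
A substitution such as the one listed above can be applied to the reference sequence after confirming that the reference residue matches. This is a minimal sketch under stated assumptions: `apply_substitution` is a hypothetical helper, and the segment holds only residues 301 to 350 copied from the displayed sequence.

```python
# Residues 301-350 of the displayed sequence (the block that ends at position 350).
SEGMENT_START = 301
SEGMENT = "KQPCHVIYTDYRPTPLQHYIFPAGGDGLHLVVDENGDFREDNFNTAMQVL"


def apply_substitution(segment: str, segment_start: int,
                       position: int, ref: str, alt: str) -> str:
    """Replace the residue at a 1-based position, checking the reference residue first."""
    index = position - segment_start
    if not 0 <= index < len(segment):
        raise ValueError("position lies outside the segment")
    if segment[index] != ref:
        raise ValueError(f"expected {ref} at {position}, found {segment[index]}")
    return segment[:index] + alt + segment[index + 1:]


# Natural variant VAR_049343: A -> P at position 346.
variant = apply_substitution(SEGMENT, SEGMENT_START, 346, "A", "P")
print(variant[-10:])  # DNFNTPMQVL
```

The reference check is the useful part of the design: it fails loudly if the coordinates and the sequence version do not agree, which matters here because several deposited translations differ from the displayed sequence.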

Sequence databases

EMBL/GenBank/DDBJ: D29641 mRNA. Translation: BAA06124.2. Different initiation.
    BX640789 mRNA. Translation: CAE45877.1.
    BC014669 mRNA. Translation: AAH14669.2.
    BC028604 mRNA. Translation: AAH28604.3.
    BC031779 mRNA. Translation: AAH31779.1.
    BC065258 mRNA. Translation: AAH65258.1. Different initiation.
    BC104996 mRNA. Translation: AAI04997.1.
    BC113509 mRNA. Translation: AAI13510.1.
CCDS: CCDS3967.1.
RefSeq: NP_056175.3. NM_015360.4.
UniGene: Hs.274531.

Genome annotation databases

Ensembl: ENST00000230640; ENSP00000230640; ENSG00000039123.
GeneID: 23517.
KEGG: hsa:23517.
UCSC: uc003jpy.5. human.

Keywords - Coding sequence diversity

Polymorphism

Cross-references

Protocols and materials databases

DNASU: 23517.

Organism-specific databases

CTD: 23517.
GeneCards: SKIV2L2.
HGNC: HGNC:18734. SKIV2L2.
HPA: HPA037379.
neXtProt: NX_P42285.
PharmGKB: PA134901921.

Miscellaneous databases

ChiTaRS: SKIV2L2. human.
GeneWiki: SKIV2L2.
GenomeRNAi: 23517.
PRO: P42285.

Publications

  1. "Prediction of the coding sequences of unidentified human genes. II. The coding sequences of 40 new genes (KIAA0041-KIAA0080) deduced by analysis of cDNA clones from human cell line KG-1."
    Nomura N., Nagase T., Miyajima N., Sazuka T., Tanaka A., Sato S., Seki N., Kawarabayasi Y., Ishikawa K., Tabata S.
    DNA Res. 1:223-229(1994)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Tissue: Myelomonocyte.
  2. Ohara O., Nagase T., Kikuno R., Nomura N.
    Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases
    Cited for: SEQUENCE REVISION TO C-TERMINUS.
  3. "The AAA-ATPase NVL2 is a component of pre-ribosomal particles that interacts with the DExD/H-box RNA helicase DOB1."
    Nagahama M., Yamazoe T., Hara Y., Tani K., Tsuji A., Tagaya M.
    Biochem. Biophys. Res. Commun. 346:1075-1082(2006)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], SUBCELLULAR LOCATION, INTERACTION WITH NVL.
    Tissue: Kidney.
  4. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Tissue: Endometrium.
  5. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Tissue: Hippocampus, Natural killer cell, Prostate and Retina.
  6. Bienvenut W.V., Lilla S., von Kriegsheim A., Lempens A., Kolch W.
    Submitted (DEC-2008) to UniProtKB
    Cited for: PROTEIN SEQUENCE OF 2-24; 82-98; 135-145; 185-192; 340-351; 385-396; 409-418; 438-449; 451-464; 495-502; 533-542; 554-604; 681-701; 724-743; 844-852; 861-873; 924-933; 985-995 AND 1013-1022, CLEAVAGE OF INITIATOR METHIONINE, ACETYLATION AT ALA-2, IDENTIFICATION BY MASS SPECTROMETRY.
    Tissue: Ovarian carcinoma.
  7. "AU binding proteins recruit the exosome to degrade ARE-containing mRNAs."
    Chen C.-Y., Gherzi R., Ong S.-E., Chan E.L., Raijmakers R., Pruijn G.J.M., Stoecklin G., Moroni C., Mann M., Karin M.
    Cell 107:451-464(2001)
    Cited for: ASSOCIATION WITH THE RNA EXOSOME COMPLEX, IDENTIFICATION BY MASS SPECTROMETRY.
  8. Cited for: SUBCELLULAR LOCATION [LARGE SCALE ANALYSIS].
    Tissue: Cervix carcinoma.
  9. "Purification and characterization of native spliceosomes suitable for three-dimensional structural analysis."
    Jurica M.S., Licklider L.J., Gygi S.P., Grigorieff N., Moore M.J.
    RNA 8:426-439(2002)
    Cited for: IDENTIFICATION BY MASS SPECTROMETRY, IDENTIFICATION IN THE SPLICEOSOMAL C COMPLEX.
  10. "C1D and hMtr4p associate with the human exosome subunit PM/Scl-100 and are involved in pre-rRNA processing."
    Schilders G., van Dijk E., Pruijn G.J.M.
    Nucleic Acids Res. 35:2564-2572(2007)
    Cited for: FUNCTION, INTERACTION WITH MPHOSPH6.
  11. "Lys-N and trypsin cover complementary parts of the phosphoproteome in a refined SCX-based approach."
    Gauci S., Helbig A.O., Slijper M., Krijgsveld J., Heck A.J., Mohammed S.
    Anal. Chem. 81:4493-4501(2009)
    Cited for: ACETYLATION [LARGE SCALE ANALYSIS] AT ALA-2, CLEAVAGE OF INITIATOR METHIONINE [LARGE SCALE ANALYSIS], IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  12. "Lysine acetylation targets protein complexes and co-regulates major cellular functions."
    Choudhary C., Kumar C., Gnad F., Nielsen M.L., Rehman M., Walther T.C., Olsen J.V., Mann M.
    Science 325:834-840(2009)
    Cited for: ACETYLATION [LARGE SCALE ANALYSIS] AT LYS-51, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  13. Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  14. Cited for: IDENTIFICATION IN A TRAMP-LIKE COMPLEX, SUBUNIT, SUBCELLULAR LOCATION.
  15. "Comparison of the yeast and human nuclear exosome complexes."
    Sloan K.E., Schneider C., Watkins N.J.
    Biochem. Soc. Trans. 40:850-855(2012)
    Cited for: REVIEW ON RNA EXOSOMES.
  16. "Comparative large-scale characterisation of plant vs. mammal proteins reveals similar and idiosyncratic N-alpha acetylation features."
    Bienvenut W.V., Sumpton D., Martinez A., Lilla S., Espagne C., Meinnel T., Giglione C.
    Mol. Cell. Proteomics 11:M111.015131-M111.015131(2012)
    Cited for: ACETYLATION [LARGE SCALE ANALYSIS] AT ALA-2, CLEAVAGE OF INITIATOR METHIONINE [LARGE SCALE ANALYSIS], IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  17. "SUMO-2 orchestrates chromatin modifiers in response to DNA damage."
    Hendriks I.A., Treffers L.W., Verlaan-de Vries M., Olsen J.V., Vertegaal A.C.
    Cell Rep. 10:1778-1791(2015)
    Cited for: SUMOYLATION [LARGE SCALE ANALYSIS] AT LYS-684, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].

Entry information

Entry name: SK2L2_HUMAN
Accession: Primary (citable) accession number: P42285
Secondary accession number(s): Q2M386, Q6MZZ8, Q6P170, Q8N5R0, Q8TAG2
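
The primary and secondary accession numbers above all follow the UniProtKB accession format. The sketch below checks them against a regular expression; the pattern is reproduced here on the assumption that it matches the one given in the UniProt accession-number help documentation, and `is_accession` is a hypothetical helper name.

```python
import re

# Assumed UniProtKB accession pattern (6 or 10 characters); verify against the
# UniProt help pages before relying on it.
ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_accession(text: str) -> bool:
    """True if the whole string has the shape of a UniProtKB accession number."""
    return ACCESSION.fullmatch(text) is not None


# Accessions listed for this entry.
accessions = ["P42285", "Q2M386", "Q6MZZ8", "Q6P170", "Q8N5R0", "Q8TAG2"]
print(all(is_accession(a) for a in accessions))  # True
print(is_accession("SK2L2_HUMAN"))  # False: this is an entry name, not an accession
```

A shape check like this only confirms the format; it does not confirm that an accession exists or still points to this entry.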
Entry history
Integrated into UniProtKB/Swiss-Prot: November 1, 1995
Last sequence update: July 19, 2005
Last modified: June 8, 2016
This is version 171 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Human chromosome 5
    Human chromosome 5: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.