Protein

Dual specificity tyrosine-phosphorylation-regulated kinase mbk-2

Gene

mbk-2

Organism
Caenorhabditis elegans
Status
Reviewed; annotation score: 5 out of 5; experimental evidence at protein level

Function

Required for the oocyte-to-zygote transition, in which it phosphorylates oocyte proteins, including mei-1, oma-1, oma-2, mex-5, and mex-6, modifying their activity and/or stability following meiosis (PubMed:16289132, PubMed:16338136, PubMed:17869113, PubMed:18854162, PubMed:18199581). Functions both in spindle positioning and in the posterior localization of cytoplasmic determinants, including pie-1, pos-1, and pgl-1, in early embryos (PubMed:14697358). Involved in the asymmetric distribution of plk-1 at the 2-cell embryonic stage (PubMed:18199581). 6 Publications

Catalytic activity

ATP + a protein = ADP + a phosphoprotein. 3 Publications

Cofactor

Mg2+. 3 Publications

Enzyme regulation

Activated during oocyte maturation by phosphorylation on Ser-362 by cdk-1. The pseudotyrosine phosphatases egg-4 and egg-5 sequester activated mbk-2 until the meiotic divisions and inhibit mbk-2 kinase activity directly, using a mixed-inhibition mechanism that does not involve tyrosine dephosphorylation. 1 Publication

Kinetics

The presence of egg-4 in the assay increases the KM of mbk-2 for mei-1. 1 Publication
  1. KM=0.3 µM for mei-1 (Isoform a). 1 Publication
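The effect of a mixed inhibitor on the apparent KM can be illustrated with the textbook Michaelis-Menten rate law for mixed inhibition. The sketch below is only an illustration: the KM of 0.3 µM is taken from this entry, while `vmax`, the inhibitor concentration, and the two inhibition constants `ki` and `ki_prime` are hypothetical values chosen so that the apparent KM rises, as reported for egg-4. It is not a model fitted to the published data.

```python
# Textbook mixed-inhibition rate law (illustrative sketch, not fitted to data).
#   v = Vmax * [S] / (alpha * KM + alpha' * [S])
#   alpha  = 1 + [I] / Ki    (inhibitor binding to free enzyme)
#   alpha' = 1 + [I] / Ki'   (inhibitor binding to the enzyme-substrate complex)

KM_MEI1_UM = 0.3  # KM of mbk-2 for mei-1 (Isoform a), from this entry, in µM


def mixed_inhibition_rate(s, vmax, km, i=0.0, ki=float("inf"), ki_prime=float("inf")):
    """Reaction rate at substrate concentration s with inhibitor concentration i."""
    alpha = 1.0 + i / ki
    alpha_prime = 1.0 + i / ki_prime
    return vmax * s / (alpha * km + alpha_prime * s)


def apparent_km(km, i, ki, ki_prime):
    """Apparent KM under mixed inhibition: KM * alpha / alpha'."""
    return km * (1.0 + i / ki) / (1.0 + i / ki_prime)


# Hypothetical inhibitor parameters (assumptions, in µM); Ki < Ki' raises the apparent KM.
HYPOTHETICAL_I, HYPOTHETICAL_KI, HYPOTHETICAL_KI_PRIME = 1.0, 1.0, 4.0

km_with_inhibitor = apparent_km(
    KM_MEI1_UM, HYPOTHETICAL_I, HYPOTHETICAL_KI, HYPOTHETICAL_KI_PRIME
)
print(f"apparent KM with inhibitor: {km_with_inhibitor:.2f} µM")
```

With these assumed constants the apparent KM is larger than 0.3 µM and the maximal rate is reduced, which is the qualitative signature of mixed inhibition.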

    Sites

    Feature key    Position(s)   Length   Description
    Binding site   490           1        ATP (PROSITE-ProRule annotation; 3 Publications)
    Active site    587           1        Proton acceptor (PROSITE-ProRule annotation; By similarity)

    Regions

    Feature key          Position(s)   Length   Description
    Nucleotide binding   467 – 475     9        ATP (PROSITE-ProRule annotation; By similarity)

    GO - Molecular function

    • ATP binding Source: WormBase
    • protein kinase activity Source: WormBase
    • protein serine/threonine/tyrosine kinase activity Source: UniProtKB-EC
    • protein serine/threonine kinase activity Source: WormBase
    • protein tyrosine kinase activity Source: WormBase

    GO - Biological process

    • asymmetric protein localization involved in cell fate determination Source: WormBase
    • embryo development ending in birth or egg hatching Source: WormBase
    • embryonic pattern specification Source: WormBase
    • microtubule-based process Source: WormBase
    • mitotic cytokinesis Source: WormBase
    • peptidyl-serine phosphorylation Source: WormBase
    • peptidyl-threonine phosphorylation Source: WormBase
    • peptidyl-tyrosine phosphorylation Source: WormBase
    • P granule disassembly Source: WormBase
    • positive regulation of proteasomal ubiquitin-dependent protein catabolic process Source: WormBase
    • positive regulation of protein catabolic process Source: WormBase
    • protein autophosphorylation Source: WormBase
    • protein phosphorylation Source: WormBase

    Keywords

    Molecular function: Developmental protein, Kinase, Serine/threonine-protein kinase, Transferase, Tyrosine-protein kinase
    Ligand: ATP-binding, Magnesium, Nucleotide-binding

    Enzyme and pathway databases

    Reactome: R-CEL-6804756. Regulation of TP53 Activity through Phosphorylation.
    SignaLink: Q9XTF3.

    Names & Taxonomy

    Protein names
    Recommended name:
    Dual specificity tyrosine-phosphorylation-regulated kinase mbk-2 (By similarity) (EC:2.7.12.1; 3 Publications)
    Alternative name(s):
    Dual specificity Yak1-related kinase mbk-2 (By similarity)
    Minibrain Kinase 2 (1 Publication)
    Gene names
    Name: mbk-2
    ORF Names: F49E11.1
    Organism: Caenorhabditis elegans
    Taxonomic identifier: 6239 [NCBI]
    Taxonomic lineage: Eukaryota > Metazoa > Ecdysozoa > Nematoda > Chromadorea > Rhabditida > Rhabditoidea > Rhabditidae > Peloderinae > Caenorhabditis
    Proteomes
    • UP000001940 Component: Chromosome IV

    Organism-specific databases

    WormBase: F49E11.1a; CE05897; WBGene00003150; mbk-2.
    F49E11.1b; CE19878; WBGene00003150; mbk-2.
    F49E11.1c; CE23751; WBGene00003150; mbk-2.
    F49E11.1d; CE39735; WBGene00003150; mbk-2.
    F49E11.1e; CE39938; WBGene00003150; mbk-2.
    F49E11.1f; CE47098; WBGene00003150; mbk-2.
    F49E11.1g; CE47427; WBGene00003150; mbk-2.
    F49E11.1h; CE47790; WBGene00003150; mbk-2.

    Subcellular location

    GO - Cellular component

    • cell cortex Source: WormBase
    • centrosome Source: WormBase
    • chromosome Source: WormBase
    • condensed chromosome Source: WormBase
    • cytoplasm Source: WormBase
    • mitotic spindle Source: WormBase
    • P granule Source: WormBase

    Keywords - Cellular component

    Cytoplasm

    Pathology & Biotech

    Disruption phenotype

    Maternal-effect embryonic lethality due to defects in spindle positioning and cytokinesis in the early embryo (PubMed:12618396, PubMed:14697358). Microtubules are fragmented and disordered (PubMed:14697358). Abolishes phosphorylation of RNA-binding protein oma-1 in embryos (PubMed:16289132). RNAi-mediated knockdown causes loss of plk-1 asymmetric localization in 2-cell stage embryos without affecting mex-5 polarization (PubMed:18199581). 4 Publications

    Mutagenesis

    Feature key   Position(s)   Length   Description
    Mutagenesis   362           1        S → A: Loss of phosphorylation by cdk-1. Loss of kinase activity. 1 Publication
    Mutagenesis   362           1        S → E: Constitutively active. 1 Publication
    Mutagenesis   490           1        K → R: Loss of autophosphorylation activity. Slight loss of binding to egg-4 and egg-5. 3 Publications
    Mutagenesis   619           1        Y → F: Reduced binding to egg-4 and egg-5 and translocation to the cytoplasm; when associated with F-621. 1 Publication
    Mutagenesis   621           1        Y → F: Reduced binding to egg-4 and egg-5 and translocation to the cytoplasm; when associated with F-619. 1 Publication
    Mutagenesis   764           1        T → A: No loss of phosphorylation by cdk-1. 1 Publication
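    Because every position in this entry is a 1-based coordinate on the canonical isoform b sequence, the wild-type residues named in the mutagenesis records can be checked against that sequence. The following is a minimal sketch: the fragments are ten-residue blocks copied from the canonical sequence listed under Sequences, and the helper names `residue_at` and `FRAGMENTS` are hypothetical, introduced only for this illustration.

    ```python
    # Ten-residue blocks of the canonical isoform b sequence, keyed by the
    # 1-based position of their first residue (copied from this entry).
    FRAGMENTS = {
        361: "KSPSNESLSR",
        481: "HKYQQYVALK",
        581: "NRLIHCDLKP",
        611: "SCFDDQRIYT",
        621: "YIQSRFYRAP",
        761: "TRMTPAQALK",
    }

    # (position, wild-type residue) pairs from the Sites and Mutagenesis tables.
    ANNOTATED_RESIDUES = [
        (362, "S"),  # phosphorylated by cdk-1
        (490, "K"),  # ATP binding site
        (587, "D"),  # active site, proton acceptor (Asp is an inference from the sequence)
        (619, "Y"),
        (621, "Y"),
        (764, "T"),
    ]


    def residue_at(position):
        """Return the residue at a 1-based position covered by FRAGMENTS."""
        for start, fragment in FRAGMENTS.items():
            if start <= position < start + len(fragment):
                return fragment[position - start]
        raise KeyError(f"position {position} is not covered by the stored fragments")


    for position, expected in ANNOTATED_RESIDUES:
        status = "ok" if residue_at(position) == expected else "MISMATCH"
        print(f"{expected}{position}: {status}")
    ```

    The same lookup against the full 817-residue sequence would work identically; the fragments are used here only to keep the example short.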

    PTM / Processing

    Molecule processing

    Feature key   Identifier       Position(s)   Length   Description
    Chain         PRO_0000390717   1 – 817       817      Dual specificity tyrosine-phosphorylation-regulated kinase mbk-2

    Amino acid modifications

    Feature key        Position(s)   Length   Description
    Modified residue   362           1        Phosphoserine; by cdk-1. 1 Publication

    Post-translational modification

    Autophosphorylated. 1 Publication

    Keywords - PTM

    Phosphoprotein

    Proteomic databases

    EPD: Q9XTF3.
    PaxDb: Q9XTF3.
    PeptideAtlas: Q9XTF3.

    PTM databases

    iPTMnet: Q9XTF3.

    Expression

    Tissue specificity

    In L1 larvae, expressed widely in the nervous system, including head neurons and the ventral nerve cord. In adult animals, continues to be expressed in the nervous system and is also expressed in body wall muscle. 1 Publication

    Developmental stage

    Expressed both maternally and zygotically. 1 Publication

    Gene expression databases

    Bgee: WBGene00003150.
    ExpressionAtlas: Q9XTF3. baseline.

    Interaction

    Subunit structure

    Interacts with egg-3, egg-4 and egg-5. 2 Publications

    Protein-protein interaction databases

    BioGrid: 43342. 9 interactors.
    IntAct: Q9XTF3. 11 interactors.
    STRING: 6239.F49E11.1b.

    Structure

    3D structure databases

    ProteinModelPortal: Q9XTF3.
    SMR: Q9XTF3.

    Family & Domains

    Domains and Repeats

    Feature key   Position(s)   Length   Description
    Domain        461 – 774     314      Protein kinase (PROSITE-ProRule annotation)

    Compositional bias

    Feature key          Position(s)   Length   Description
    Compositional bias   194 – 258     65       Gln-rich (Sequence analysis)
    Compositional bias   307 – 391     85       Ser-rich (Sequence analysis)

    Sequence similarities

    Phylogenomic databases

    eggNOG: KOG0667. Eukaryota.
    ENOG410XPET. LUCA.
    GeneTree: ENSGT00760000119032.
    InParanoid: Q9XTF3.
    KO: K18669.
    OMA: NTASTHD.
    OrthoDB: EOG091G0Q46.
    PhylomeDB: Q9XTF3.

    Family and domain databases

    InterPro:
    IPR011009. Kinase-like_dom.
    IPR000719. Prot_kinase_dom.
    IPR017441. Protein_kinase_ATP_BS.
    IPR008271. Ser/Thr_kinase_AS.
    Pfam:
    PF00069. Pkinase. 1 hit.
    SMART:
    SM00220. S_TKc. 1 hit.
    SUPFAM: SSF56112. SSF56112. 2 hits.
    PROSITE:
    PS00107. PROTEIN_KINASE_ATP. 1 hit.
    PS50011. PROTEIN_KINASE_DOM. 1 hit.
    PS00108. PROTEIN_KINASE_ST. 1 hit.

    Sequences (8)

    Sequence status: Complete.

    This entry describes 8 isoforms produced by alternative splicing.

    Isoform b (identifier: Q9XTF3-1). 1 Publication

    This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

            10         20         30         40         50
    MAALASFTRN SRSYGQQPID VTQQGQRDRS VMSLDAQGRS VSHECPTSTT
    60 70 80 90 100
    LVRQLYLPQI PQSASFAAAP TSFSGASSSS SNHHHPVYHS QNSLPPNLLG
    110 120 130 140 150
    SSQNSASSNS LVQGHRNPAL GSGNTLTRSY HQPSSTNSST NNLYGPLGTI
    160 170 180 190 200
    SRDLKQSIRD ISPPVINSSA NPHLVNYVQT SSFDNGSYEF PSGQAQQQRR
    210 220 230 240 250
    LGGSQQHLAP LQQTASSLYS NPQSSSSQLL GQQQAVRPNY AYQQSLPRQQ
    260 270 280 290 300
    HINSHQTQAF FGTVRGPTNS TNIVTPLRAS KTMIDVLAPV RDTVAAQATG
    310 320 330 340 350
    LPNVGTSSSN GSSNSSSGVG SGGSGSLMTQ SIGGPNKHLS ASHSTLNTAS
    360 370 380 390 400
    THDMMHSKIP KSPSNESLSR SHTSSSGGSQ GGHNSNSGSN SGFRPEDAVQ
    410 420 430 440 450
    TFGAKLVPFE KNEIYNYTRV FFVGSHAKKQ AGVIGGANNG GYDDENGSYQ
    460 470 480 490 500
    LVVHDHIAYR YEVLKVIGKG SFGQVIKAFD HKYQQYVALK LVRNEKRFHR
    510 520 530 540 550
    QADEEIRILD HLRRQDSDGT HNIIHMLDYF NFRNHKCITF ELLSINLYEL
    560 570 580 590 600
    IKRNKFQGFS LMLVRKFAYS MLLCLDLLQK NRLIHCDLKP ENVLLKQQGR
    610 620 630 640 650
    SGIKVIDFGS SCFDDQRIYT YIQSRFYRAP EVILGTKYGM PIDMWSLGCI
    660 670 680 690 700
    LAELLTGYPL LPGEDENDQL ALIIELLGMP PPKSLETAKR ARTFITSKGY
    710 720 730 740 750
    PRYCTATSMP DGSVVLAGAR SKRGKMRGPP ASRSWSTALK NMGDELFVDF
    760 770 780 790 800
    LKRCLDWDPE TRMTPAQALK HKWLRRRLPN PPRDGLESMG GLADHEKKTE
    810
    TLPNIDSNAN ILMRKKF
    Note: No experimental confirmation available. Curated
    Length: 817
    Mass (Da): 89,885
    Last modified: November 1, 1999 - v1
    Checksum: 74903C85CB68C428

    Isoform a (identifier: Q9XTF3-2). 1 Publication

    The sequence of this isoform differs from the canonical sequence as follows:
         1-294: Missing.
         295-310: AAQATGLPNVGTSSSN → MTLFEPSTSGNRMGYR
         797-817: KKTETLPNIDSNANILMRKKF → VCFIIF

    Note: No experimental confirmation available.
    Length: 508
    Mass (Da): 56,841
    Checksum: 3EC0F264B291AC2C

    Isoform c (identifier: Q9XTF3-3). 2 Publications

    The sequence of this isoform differs from the canonical sequence as follows:
         797-817: Missing.

    Length: 796
    Mass (Da): 87,441
    Checksum: FE8365EC2FDE8D5A

    Isoform d (identifier: Q9XTF3-4). 1 Publication

    The sequence of this isoform differs from the canonical sequence as follows:
         1-327: Missing.
         797-817: KKTETLPNIDSNANILMRKKF → VCFIIF

    Note: No experimental confirmation available. Curated
    Length: 475
    Mass (Da): 53,646
    Checksum: FBDA51EA9638F7F0

    Isoform e (identifier: Q9XTF3-5). 1 Publication

    The sequence of this isoform differs from the canonical sequence as follows:
         1-301: Missing.
         302-310: PNVGTSSSN → MYSSLFELR
         797-817: KKTETLPNIDSNANILMRKKF → VCFIIF

    Note: No experimental confirmation available. Curated
    Length: 501
    Mass (Da): 56,139
    Checksum: 3BA892EA8A8B8777

    Isoform f (identifier: Q9XTF3-6)

    The sequence of this isoform differs from the canonical sequence as follows:
         1-297: Missing.
         298-310: ATGLPNVGTSSSN → MFAIPFHRFYSDE
         797-802: KKTETL → VCFIIF
         803-817: Missing.

    Note: No experimental confirmation available.
    Length: 505
    Mass (Da): 56,654
    Checksum: C07154636CC01114

    Isoform g (identifier: Q9XTF3-7)

    The sequence of this isoform differs from the canonical sequence as follows:
         148-164: Missing.

    Note: No experimental confirmation available.
    Length: 800
    Mass (Da): 88,020
    Checksum: 111F8423FBE140FC

    Isoform h (identifier: Q9XTF3-8)

    The sequence of this isoform differs from the canonical sequence as follows:
         1-282: Missing.

    Note: No experimental confirmation available.
    Length: 535
    Mass (Da): 59,500
    Checksum: F73E4415D844DE58

    Alternative sequence

    Feature key            Identifier   Position(s)   Length   Description
    Alternative sequence   VSP_053181   1 – 327       327      Missing in isoform d. 1 Publication
    Alternative sequence   VSP_053182   1 – 301       301      Missing in isoform e. 1 Publication
    Alternative sequence   VSP_053625   1 – 297       297      Missing in isoform f. Curated
    Alternative sequence   VSP_053183   1 – 294       294      Missing in isoform a. 1 Publication
    Alternative sequence   VSP_053626   1 – 282       282      Missing in isoform h. Curated
    Alternative sequence   VSP_053627   148 – 164     17       Missing in isoform g. Curated
    Alternative sequence   VSP_053184   295 – 310     16       AAQAT…TSSSN → MTLFEPSTSGNRMGYR in isoform a. 1 Publication
    Alternative sequence   VSP_053628   298 – 310     13       ATGLP…TSSSN → MFAIPFHRFYSDE in isoform f. Curated
    Alternative sequence   VSP_053185   302 – 310     9        PNVGTSSSN → MYSSLFELR in isoform e. 1 Publication
    Alternative sequence   VSP_053186   797 – 817     21       KKTET…MRKKF → VCFIIF in isoform a, isoform d and isoform e. 1 Publication
    Alternative sequence   VSP_053187   797 – 817     21       Missing in isoform c. 2 Publications
    Alternative sequence   VSP_053629   797 – 802     6        KKTETL → VCFIIF in isoform f. Curated
    Alternative sequence   VSP_053630   803 – 817     15       Missing in isoform f. Curated
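    The isoform lengths reported in this entry follow arithmetically from the canonical length (817) and the listed deletions and replacements. The sketch below recomputes them as a consistency check; the data structure and the function name `isoform_length` are hypothetical, and the edits are transcribed from the per-isoform descriptions in this entry ("Missing" is represented as an empty replacement).

    ```python
    CANONICAL_LENGTH = 817  # isoform b

    # Each edit is (start, end, replacement) in 1-based canonical coordinates;
    # an empty replacement string stands for "Missing".
    ISOFORM_EDITS = {
        "a": [(1, 294, ""), (295, 310, "MTLFEPSTSGNRMGYR"), (797, 817, "VCFIIF")],
        "c": [(797, 817, "")],
        "d": [(1, 327, ""), (797, 817, "VCFIIF")],
        "e": [(1, 301, ""), (302, 310, "MYSSLFELR"), (797, 817, "VCFIIF")],
        "f": [(1, 297, ""), (298, 310, "MFAIPFHRFYSDE"), (797, 802, "VCFIIF"), (803, 817, "")],
        "g": [(148, 164, "")],
        "h": [(1, 282, "")],
    }

    # Lengths as reported for each isoform in this entry.
    REPORTED_LENGTHS = {"a": 508, "c": 796, "d": 475, "e": 501, "f": 505, "g": 800, "h": 535}


    def isoform_length(edits, canonical_length=CANONICAL_LENGTH):
        """Length after applying non-overlapping edits to the canonical sequence."""
        length = canonical_length
        for start, end, replacement in edits:
            removed = end - start + 1
            length += len(replacement) - removed
        return length


    for name, edits in ISOFORM_EDITS.items():
        computed = isoform_length(edits)
        agrees = computed == REPORTED_LENGTHS[name]
        print(f"isoform {name}: computed {computed}, reported {REPORTED_LENGTHS[name]}, agrees={agrees}")
    ```

    Only lengths are checked here; reconstructing the isoform sequences themselves would apply the same edits to the full canonical sequence, working from the highest position downward so earlier coordinates stay valid.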

    Sequence databases

    EMBL / GenBank / DDBJ:
    AY090019 mRNA. Translation: AAM09088.1.
    Z70308 Genomic DNA. Translation: CAA94352.1.
    Z70308, Z81121, Z81146 Genomic DNA. Translation: CAA94353.1.
    Z70308, Z81121, Z81146 Genomic DNA. Translation: CAB54254.2.
    Z70308 Genomic DNA. Translation: CAJ76942.1.
    Z70308, Z81121 Genomic DNA. Translation: CAJ80814.1.
    Z70308, Z81121, Z81146 Genomic DNA. Translation: CCG28133.1.
    Z70308, Z81121 Genomic DNA. Translation: CCG28134.1.
    Z70308, Z81121 Genomic DNA. Translation: CCM09396.1.
    PIR: T22440.
    T22442.
    RefSeq: NP_001023207.1. NM_001028036.2. [Q9XTF3-1]
    NP_001023208.1. NM_001028037.2. [Q9XTF3-3]
    NP_001040950.1. NM_001047485.3. [Q9XTF3-4]
    NP_001040951.1. NM_001047486.2. [Q9XTF3-5]
    NP_001255694.1. NM_001268765.1. [Q9XTF3-7]
    NP_001255695.1. NM_001268766.1. [Q9XTF3-6]
    NP_001263802.1. NM_001276873.1. [Q9XTF3-8]
    NP_502492.2. NM_070091.6. [Q9XTF3-2]
    UniGene: Cel.39641.
    Cel.7508.

    Genome annotation databases

    EnsemblMetazoa: F49E11.1b; F49E11.1b; WBGene00003150. [Q9XTF3-1]
    GeneID: 178250.
    KEGG: cel:CELE_F49E11.1.
    UCSC: F49E11.1c. c. elegans.

    Keywords - Coding sequence diversity

    Alternative splicing

    Cross-references

    Organism-specific databases

    CTD: 178250.

    Miscellaneous databases

    PRO: PR:Q9XTF3.

    Entry information

    Entry name: MBK2_CAEEL
    Accession: Primary (citable) accession number: Q9XTF3
    Secondary accession number(s): H9G2V8, H9G2V9, J7SF89, Q20604, Q27GP3, Q2EEP1, Q9TVF4
    Entry history: Integrated into UniProtKB/Swiss-Prot: January 19, 2010
    Last sequence update: November 1, 1999
    Last modified: March 15, 2017
    This is version 139 of the entry and version 1 of the sequence.
    Entry status: Reviewed (UniProtKB/Swiss-Prot)
    Annotation program: Caenorhabditis annotation project

    Miscellaneous

    Keywords - Technical term

    Complete proteome, Reference proteome

    Documents

    1. Caenorhabditis elegans
      Caenorhabditis elegans: entries, gene names and cross-references to WormBase
    2. SIMILARITY comments
      Index of protein domains and families

    Similar proteins

    Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
    100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
    90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
    50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.