Protein

Cytoplasmic polyadenylation element-binding protein 3

Gene

Cpeb3

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Sequence-specific RNA-binding protein which acts as a translational repressor in the basal unstimulated state but, following neuronal stimulation, acts as a translational activator (PubMed:17024188, PubMed:26074072). In contrast to CPEB1, does not bind to the cytoplasmic polyadenylation element (CPE), a uridine-rich sequence element within the mRNA 3'-UTR, but binds to a U-rich loop within a stem-loop structure (PubMed:17024188). Required for the consolidation and maintenance of hippocampal-based long-term memory (PubMed:26074003). In the basal state, binds to the mRNA 3'-UTR of the glutamate receptors GRIA1 and GRIA2 and negatively regulates their translation (PubMed:17024188, PubMed:22153079). Also represses the translation of DLG4, GRIN1, GRIN2A and GRIN2B (PubMed:24155305). When activated, acts as a translational activator of GRIA1 and GRIA2 (PubMed:22153079, PubMed:26074003). In the basal state, suppresses SUMO2 translation but activates it following neuronal stimulation (PubMed:26074071). Binds to the 3'-UTR of TRPV1 mRNA and represses TRPV1 translation, which is required to maintain normal thermoception (PubMed:26915043). Binds actin mRNA, leading to actin translational repression in the basal state and to translational activation following neuronal stimulation (PubMed:26074072). Negatively regulates target mRNA levels by binding to TOB1, which recruits CNOT7/CAF1 to a ternary complex; this leads to target mRNA deadenylation and decay (By similarity). In addition to its role in translation, binds to and inhibits the transcriptional activation activity of STAT5B without affecting its dimerization or DNA-binding activity. This, in turn, represses transcription of the STAT5B target gene EGFR, which has been shown to play a role in enhancing learning and memory performance (By similarity). In contrast to CPEB1, CPEB2 and CPEB4, not required for cell cycle progression (By similarity).

Sites

Feature key | Position(s) | Description | Length
Site | 462 | Required for RNA-binding activity (By similarity) | 1
Site | 506 | Required for RNA-binding activity (By similarity) | 1

GO - Molecular function

  • mRNA 3'-UTR AU-rich region binding Source: UniProtKB
  • mRNA 3'-UTR binding Source: UniProtKB
  • nucleotide binding Source: InterPro
  • poly(A) RNA binding Source: MGI
  • ribosome binding Source: GO_Central
  • RNA binding Source: MGI
  • RNA stem-loop binding Source: UniProtKB
  • translation factor activity, RNA binding Source: UniProtKB
  • translation repressor activity, nucleic acid binding Source: UniProtKB

GO - Biological process

  • 3'-UTR-mediated mRNA destabilization Source: UniProtKB
  • cellular response to amino acid stimulus Source: UniProtKB
  • long-term memory Source: UniProtKB
  • negative regulation of cytoplasmic translational elongation Source: UniProtKB
  • negative regulation of transcription from RNA polymerase II promoter Source: UniProtKB
  • negative regulation of translation Source: UniProtKB
  • positive regulation of dendritic spine development Source: UniProtKB
  • positive regulation of long-term synaptic potentiation Source: UniProtKB
  • positive regulation of mRNA polyadenylation Source: UniProtKB
  • positive regulation of nuclear-transcribed mRNA catabolic process, deadenylation-dependent decay Source: UniProtKB
  • positive regulation of nuclear-transcribed mRNA poly(A) tail shortening Source: UniProtKB
  • positive regulation of translation Source: UniProtKB
  • regulation of dendritic spine development Source: UniProtKB
  • regulation of synaptic plasticity Source: UniProtKB
  • thermoception Source: UniProtKB

Keywords - Molecular function

Activator, Repressor

Keywords - Biological process

Translation regulation

Keywords - Ligand

RNA-binding

Names & Taxonomy

Protein names
Recommended name: Cytoplasmic polyadenylation element-binding protein 3
Short name: CPE-BP3
Short name: CPE-binding protein 3
Short name: mCPEB-3
Gene names
Name: Cpeb3
Synonyms: Kiaa0940
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 19

Organism-specific databases

MGI: MGI:2443075. Cpeb3.

Subcellular location

GO - Cellular component

  • apical dendrite Source: UniProtKB
  • CCR4-NOT complex Source: UniProtKB
  • cell junction Source: UniProtKB-KW
  • cytoplasm Source: UniProtKB
  • dendrite Source: UniProtKB
  • messenger ribonucleoprotein complex Source: GO_Central
  • neuron projection Source: UniProtKB
  • nucleus Source: UniProtKB
  • postsynaptic density Source: UniProtKB-SubCell
  • postsynaptic membrane Source: UniProtKB-KW
  • synapse Source: UniProtKB

Keywords - Cellular component

Amyloid, Cell junction, Cell membrane, Cell projection, Cytoplasm, Membrane, Nucleus, Postsynaptic cell membrane, Synapse

Pathology & Biotech

Disruption phenotype

Enhanced thermal sensitivity and increased levels of Trpv1 in lumbar sciatic nerves and spinal cord (PubMed:26915043). Elevated short-term fear response, enhanced long-term spatial memory, dendritic spine enlargement and elevated levels of Dlg4/Psd95, Gria1 and NMDA receptor subunits Grin1, Grin2a and Grin2b (PubMed:26074071). Conditional knockout in the adult forebrain results in viable mice with no gross neurological defects which display normal locomotor, exploratory and anxiety behaviors but have impaired long-term memory, impaired long-term synaptic plasticity and increased levels of Gria1 and Gria2 (PubMed:26074003).

Mutagenesis

Feature key | Position(s) | Description | Length
Mutagenesis | 419 | S → A: Abolishes phosphorylation by PKA. (1 publication) | 1
Mutagenesis | 420 | S → A: Reduces phosphorylation by PKA. (1 publication) | 1
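
The two substitutions above can be checked against the canonical sequence given later in this entry. The following is a minimal Python sketch, assuming 1-based residue numbering as used throughout the entry; the helper name `apply_substitution` and the choice to work on the 401-450 segment rather than the full sequence are illustrative assumptions, not part of any UniProt tooling.

```python
import re

# Residues 401-450 of the canonical sequence (isoform 1), copied from this entry.
SEGMENT_START = 401
SEGMENT = "TDNIMALNTRSYGRRRGRSSLFPFEDAFLDDSHGDQALSSGLSSPTRCQN"


def apply_substitution(segment, segment_start, mutation):
    """Apply a point substitution such as 'S419A' to a sequence segment.

    Positions are 1-based and refer to the full canonical sequence;
    segment_start is the position of the first residue in `segment`.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    index = position - segment_start
    if not 0 <= index < len(segment):
        raise ValueError(f"position {position} lies outside the segment")
    if segment[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at {position}, found {segment[index]}"
        )
    return segment[:index] + mutant + segment[index + 1:]


mutant_419 = apply_substitution(SEGMENT, SEGMENT_START, "S419A")
mutant_420 = apply_substitution(SEGMENT, SEGMENT_START, "S420A")
print(mutant_419[:20])
print(mutant_420[:20])
```

The wild-type check fails loudly if a position does not hold the expected residue, which is a cheap guard against off-by-one errors when mapping feature positions onto a sequence.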

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Length
Chain (PRO_0000269262) | 1 – 716 | Cytoplasmic polyadenylation element-binding protein 3 | 716

Amino acid modifications

Feature key | Position(s) | Description | Length
Modified residue | 194 | Phosphoserine (Combined sources) | 1
Modified residue | 197 | Phosphoserine (Combined sources) | 1
Modified residue | 291 | Phosphoserine (1 publication) | 1
Modified residue | 309 | Asymmetric dimethylarginine (Combined sources) | 1
Modified residue | 419 | Phosphoserine (1 publication) | 1
Modified residue | 420 | Phosphoserine (1 publication) | 1

Post-translational modification

Activated by NEURL1-mediated monoubiquitination, resulting in the growth of new dendritic spines and increased levels of GRIA1 and GRIA2. NEURL1-mediated monoubiquitination facilitates synaptic plasticity and hippocampal-dependent memory storage. (1 publication)
Under basal unstimulated conditions, when CPEB3 is mainly unaggregated, it is sumoylated and acts as a translational repressor. Following neuronal stimulation, it becomes desumoylated and aggregated, which is required for the translation of mRNA targets and for dendritic filopodia formation. (1 publication)
Following neuronal stimulation, cleaved by CAPN2, which abolishes its translational repressor activity, leading to translation of CPEB3 target mRNAs. (1 publication)
Phosphorylation is enhanced by neuronal stimulation. (1 publication)

Sites

Feature key | Position(s) | Description | Length
Site | 459 – 460 | Cleavage; by CAPN2 (1 publication) | 2

Keywords - PTM

Methylation, Phosphoprotein, Ubl conjugation

Proteomic databases

PaxDb: Q7TN99.
PeptideAtlas: Q7TN99.
PRIDE: Q7TN99.

PTM databases

iPTMnet: Q7TN99.
PhosphoSitePlus: Q7TN99.

Expression

Tissue specificity

Highly expressed in brain (at protein level) (PubMed:17024188). In brain, expressed in the hippocampus, granule cells and interneurons of the cerebellum, and mitral cells of the olfactory bulb (at protein level) (PubMed:17024188). Detected in the spinal cord and in peripheral dorsal root ganglia (at protein level) (PubMed:26915043). In the retina, strongly expressed in the retinal ganglion layer and, to a lesser extent, in the inner margin of the inner nuclear layer with expression also detected in the inner and outer plexiform layers (at protein level) (PubMed:20003455). Highly expressed in brain and heart, less in liver, kidney, embryo, skeletal muscle, lung and ovary (PubMed:12871996). Weakly expressed in granular cells of dentate gyrus and the pyramidal cells of CA3 and CA1 of the hippocampus (PubMed:12871996).

Developmental stage

In the retina, expression increases throughout postnatal development and remains high in the adult (at protein level). (1 publication)

Induction

Up-regulated following synaptic activity (at protein level) (PubMed:26074003, PubMed:26074072). Up-regulated in granular cells of the dentate gyrus and the pyramidal cells of CA1 and CA3 after kainate-induced seizures (PubMed:12871996).

Gene expression databases

Bgee: ENSMUSG00000039652.
ExpressionAtlas: Q7TN99. baseline and differential.
Genevisible: Q7TN99. MM.

Interaction

Subunit structure

Following synaptic activity, aggregates to form amyloid-like oligomers (PubMed:26074003, PubMed:26074072). Aggregation requires an intact actin cytoskeleton (PubMed:26074072). Interacts with STAT5B; this inhibits STAT5B-mediated transcriptional activation (PubMed:20639532). Interacts with E3 ubiquitin-protein ligase NEURL1; this leads to monoubiquitination and activation of CPEB3 (PubMed:22153079). Interacts with CAPN2; this leads to cleavage of CPEB3 (PubMed:22711986). Interacts (via C-terminal RNA-binding region) with TOB1; TOB1 also binds CNOT7/CAF1 and recruits it to CPEB3 to form a ternary complex (By similarity). Interacts with SUMO-conjugating enzyme UBC9 (PubMed:26074071). Interacts with IPO5; the interaction is enhanced in a RAN-regulated manner following neuronal stimulation and mediates CPEB3 nuclear import (By similarity). Interacts with exportin XPO1/CRM1 (By similarity).

Binary interactions

With | Entry | #Exp. | IntAct
Neurl1 | Q923S6 | 9 | EBI-5376779, EBI-5376802
Ubc | P62991 | 2 | EBI-5376779, EBI-413074

Protein-protein interaction databases

BioGrid: 229024. 1 interactor.
IntAct: Q7TN99. 2 interactors.
STRING: 10090.ENSMUSP00000078690.

Structure

3D structure databases

ProteinModelPortal: Q7TN99.
SMR: Q7TN99.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Length
Domain | 459 – 550 | RRM 1 (PROSITE-ProRule annotation) | 92
Domain | 567 – 649 | RRM 2 (PROSITE-ProRule annotation) | 83

Compositional bias

Feature key | Position(s) | Description | Length
Compositional bias | 13 – 31 | Gln-rich | 19
Compositional bias | 32 – 202 | Pro-rich | 171
Compositional bias | 226 – 236 | Poly-Ala | 11
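
Feature positions in this entry are 1-based and inclusive, so a feature's length is `end - start + 1`. The sketch below, in Python, recomputes the lengths listed above and checks where the two RNA-binding sites (462 and 506) fall relative to the annotated features. The `Feature` type and variable names are illustrative assumptions, not a UniProt data model.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    key: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    description: str

    @property
    def length(self) -> int:
        # Inclusive coordinates: both endpoints count.
        return self.end - self.start + 1

    def contains(self, position: int) -> bool:
        return self.start <= position <= self.end


# Coordinates copied from the domain and compositional bias tables.
FEATURES = [
    Feature("Domain", 459, 550, "RRM 1"),
    Feature("Domain", 567, 649, "RRM 2"),
    Feature("Compositional bias", 13, 31, "Gln-rich"),
    Feature("Compositional bias", 32, 202, "Pro-rich"),
    Feature("Compositional bias", 226, 236, "Poly-Ala"),
]

# Sites annotated as required for RNA-binding activity.
RNA_BINDING_SITES = [462, 506]

for feature in FEATURES:
    print(f"{feature.description}: {feature.start}-{feature.end} (length {feature.length})")

site_locations = {
    site: [f.description for f in FEATURES if f.contains(site)]
    for site in RNA_BINDING_SITES
}
print(site_locations)
```

With these coordinates, both RNA-binding sites map to RRM 1.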

Domain

The N-terminal Gln-rich region is required for the formation of amyloid-like oligomers and for the stability of long-term potentiation and spatial memory. (1 publication)

Sequence similarities

Belongs to the RRM CPEB family. (Curated)
Contains 2 RRM (RNA recognition motif) domains. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat

Phylogenomic databases

eggNOG: KOG0129. Eukaryota.
eggNOG: ENOG410Y1XZ. LUCA.
GeneTree: ENSGT00390000012886.
HOGENOM: HOG000290660.
HOVERGEN: HBG058010.
InParanoid: Q7TN99.
KO: K02602.
PhylomeDB: Q7TN99.
TreeFam: TF317658.

Family and domain databases

Gene3D: 3.30.70.330. 2 hits.
InterPro: IPR032296. CEBP_ZZ.
InterPro: IPR012677. Nucleotide-bd_a/b_plait.
InterPro: IPR000504. RRM_dom.
Pfam: PF16366. CEBP_ZZ. 1 hit.
Pfam: PF16367. RRM_7. 1 hit.
SMART: SM00360. RRM. 2 hits.
SUPFAM: SSF54928. SSF54928. 1 hit.
PROSITE: PS50102. RRM. 2 hits.

Sequences (6)

Sequence status: Complete.

This entry describes 6 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q7TN99-1)
Also known as: mCPEB-3a

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MQDDLLMDKS KTQPQSQQQQ RQQQQQQQQL QPEPGAAEAP STPLSSEIPK
        60         70         80         90        100
PEDSSAVPAL SPASAPPAPN GPDKMQMESP LLPGLSFHQP PQQPPPPQEP
       110        120        130        140        150
TAPGASLSPS FGSTWSTGTT NAVEDSFFQG ITPVNGTMLF QNFPHHVNPV
       160        170        180        190        200
FGGTFSPQIG LAQTQHHQQP PPPAPQPPQP AQPPQAQPSQ QRRSPASPSQ
       210        220        230        240        250
APYAQRSAAA YGHQPIMTSK PSSSSAVAAA AAAAAASSAS SSWNTHQSVN
       260        270        280        290        300
AAWSAPSNPW GGLQAGRDPR RAVGVGVGVG VGVPSPLNPI SPLKKPFSSN
       310        320        330        340        350
VIAPPKFPRA APLTSKSWME DNAFRTDNGN NLLPFQDRSR PYDTFNLHSL
       360        370        380        390        400
ENSLMDMIRT DHEPLKGKHY PNSGPPMSFA DIMWRNHFAG RMGINFHHPG
       410        420        430        440        450
TDNIMALNTR SYGRRRGRSS LFPFEDAFLD DSHGDQALSS GLSSPTRCQN
       460        470        480        490        500
GERVERYSRK VFVGGLPPDI DEDEITASFR RFGPLVVDWP HKAESKSYFP
       510        520        530        540        550
PKGYAFLLFQ EESSVQALID ACLEEDGKLY LCVSSPTIKD KPVQIRPWNL
       560        570        580        590        600
SDSDFVMDGS QPLDPRKTIF VGGVPRPLRA VELAMIMDRL YGGVCYAGID
       610        620        630        640        650
TDPELKYPKG AGRVAFSNQQ SYIAAISARF VQLQHNDIDK RVEVKPYVLD
       660        670        680        690        700
DQMCDECQGT RCGGKFAPFF CANVTCLQYY CEYCWASIHS RAGREFHKPL
       710
VKEGGDRPRH VPFRWS
Length: 716
Mass (Da): 78,335
Last modified: October 1, 2003 - v1
Checksum: 1FA1D8125FD69819

Isoform 2 (identifier: Q7TN99-2)
Also known as: mCPEB-3b

The sequence of this isoform differs from the canonical sequence as follows:
     409-416: Missing.

Length: 708
Mass (Da): 77,302
Checksum: 44921082F8A1AEAC

Isoform 3 (identifier: Q7TN99-3)
Also known as: mCPEB-3c

The sequence of this isoform differs from the canonical sequence as follows:
     367-389: Missing.

Length: 693
Mass (Da): 75,692
Checksum: 8DF1697A5BAEF6AF

Isoform 4 (identifier: Q7TN99-4)
Also known as: mCPEB-3d

The sequence of this isoform differs from the canonical sequence as follows:
     367-389: Missing.
     409-416: Missing.

Length: 685
Mass (Da): 74,659
Checksum: 088EA7D992502AFD

Isoform 5 (identifier: Q7TN99-5)

The sequence of this isoform differs from the canonical sequence as follows:
     367-389: Missing.
     581-584: VELA → GEWK
     585-716: Missing.

Length: 561
Mass (Da): 60,702
Checksum: DB0FAD814A67D91E

Isoform 6 (identifier: Q7TN99-6)

The sequence of this isoform differs from the canonical sequence as follows:
     1-216: Missing.
     367-389: Missing.

Length: 477
Mass (Da): 52,690
Checksum: 3B60E9638C701797

Sequence caution

The sequence BAC41458 differs from that shown. Reason: Erroneous initiation. (Curated)

Experimental Info

Feature key | Position(s) | Description | Length
Sequence conflict | 372 | N → P in BAE27791 (PubMed:16141072). (Curated) | 1
Sequence conflict | 372 | N → P in BAC41458 (Ref. 3). (Curated) | 1

Alternative sequence

Feature key | Position(s) | Description | Length
Alternative sequence (VSP_058565) | 1 – 216 | Missing in isoform 6. (1 publication) | 216
Alternative sequence (VSP_022036) | 367 – 389 | Missing in isoform 3, isoform 4, isoform 5 and isoform 6. (3 publications) | 23
Alternative sequence (VSP_022037) | 409 – 416 | Missing in isoform 2 and isoform 4. (1 publication) | 8
Alternative sequence (VSP_022038) | 581 – 584 | VELA → GEWK in isoform 5. (1 publication) | 4
Alternative sequence (VSP_022039) | 585 – 716 | Missing in isoform 5. (1 publication) | 132
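
The isoform lengths given in this entry follow from applying these variations to the 716-residue canonical sequence. Here is a minimal Python sketch of that bookkeeping, assuming 1-based inclusive coordinates as used in the entry. The function name `apply_variants` and the `(start, end, replacement)` tuple layout are illustrative assumptions, and a placeholder string stands in for the real sequence so that only lengths are computed.

```python
def apply_variants(canonical, variants):
    """Apply (start, end, replacement) edits, 1-based inclusive, to a sequence.

    An empty replacement means the range is missing in the isoform.
    """
    seq = canonical
    # Work from the C-terminal end backwards so that earlier coordinates
    # still refer to the canonical numbering when they are applied.
    for start, end, replacement in sorted(variants, reverse=True):
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq


# Variations per isoform, copied from the alternative sequence table.
ISOFORM_VARIANTS = {
    2: [(409, 416, "")],
    3: [(367, 389, "")],
    4: [(367, 389, ""), (409, 416, "")],
    5: [(367, 389, ""), (581, 584, "GEWK"), (585, 716, "")],
    6: [(1, 216, ""), (367, 389, "")],
}

# Placeholder with the canonical length (716). Substitute the real
# isoform 1 sequence to obtain actual isoform sequences, not just lengths.
placeholder = "X" * 716

isoform_lengths = {
    number: len(apply_variants(placeholder, variants))
    for number, variants in ISOFORM_VARIANTS.items()
}
print(isoform_lengths)
```

The computed lengths (708, 693, 685, 561 and 477) agree with the values listed for isoforms 2 to 6.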

Sequence databases

EMBL/GenBank/DDBJ: AY313774 mRNA. Translation: AAQ20843.1.
EMBL/GenBank/DDBJ: AK147243 mRNA. Translation: BAE27791.1.
EMBL/GenBank/DDBJ: AK161513 mRNA. Translation: BAE36436.1.
EMBL/GenBank/DDBJ: AB093274 mRNA. Translation: BAC41458.1. Different initiation.
EMBL/GenBank/DDBJ: BC128377 mRNA. Translation: AAI28378.1.
CCDS: CCDS29775.1. [Q7TN99-1]
CCDS: CCDS70946.1. [Q7TN99-4]
CCDS: CCDS70947.1. [Q7TN99-3]
RefSeq: NP_001277755.1. NM_001290826.1.
RefSeq: NP_001277756.1. NM_001290827.1.
RefSeq: NP_001277757.1. NM_001290828.1. [Q7TN99-3]
RefSeq: NP_001277758.1. NM_001290829.1.
RefSeq: NP_938042.2. NM_198300.3.
RefSeq: XP_006526892.1. XM_006526829.3.
RefSeq: XP_006526893.1. XM_006526830.3.
RefSeq: XP_006526894.1. XM_006526831.3.
RefSeq: XP_006526895.1. XM_006526832.3.
RefSeq: XP_006526896.1. XM_006526833.3.
RefSeq: XP_006526897.1. XM_006526834.2.
RefSeq: XP_006526902.1. XM_006526839.3.
RefSeq: XP_011245510.1. XM_011247208.1. [Q7TN99-3]
UniGene: Mm.391176.

Genome annotation databases

Ensembl: ENSMUST00000126188; ENSMUSP00000120416; ENSMUSG00000039652. [Q7TN99-3]
Ensembl: ENSMUST00000126781; ENSMUSP00000122442; ENSMUSG00000039652. [Q7TN99-5]
GeneID: 208922.
KEGG: mmu:208922.
UCSC: uc008hhz.2. mouse. [Q7TN99-1]
UCSC: uc008hia.2. mouse. [Q7TN99-3]
UCSC: uc008hic.2. mouse. [Q7TN99-5]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 22849.
MGI: MGI:2443075. Cpeb3.

Miscellaneous databases

ChiTaRS: Cpeb3. mouse.
PRO: Q7TN99.

Entry information

Entry name: CPEB3_MOUSE
Accession: Primary (citable) accession number: Q7TN99
Secondary accession number(s): A1A562, Q3TT89, Q3UHR6, Q8CHC2
Entry history
Integrated into UniProtKB/Swiss-Prot: December 12, 2006
Last sequence update: October 1, 2003
Last modified: November 30, 2016
This is version 106 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.