Protein

1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1

Gene

Plcg1

Organism
Mus musculus (Mouse)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Mediates the production of the second messenger molecules diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). Plays an important role in the regulation of intracellular signaling cascades. Becomes activated in response to ligand-mediated activation of receptor-type tyrosine kinases, such as PDGFRA, PDGFRB, FGFR1, FGFR2, FGFR3 and FGFR4. Plays a role in actin reorganization and cell migration. [1 Publication]

Catalytic activity

1-phosphatidyl-1D-myo-inositol 4,5-bisphosphate + H2O = 1D-myo-inositol 1,4,5-trisphosphate + diacylglycerol.

Cofactor

Enzyme regulation

Activated by phosphorylation on tyrosine residues.

Sites

Feature key | Position(s) | Description | Evidence | Length
Active site | 335 | - | PROSITE-ProRule annotation | 1
Active site | 380 | - | PROSITE-ProRule annotation | 1

Regions

Feature key | Position(s) | Description | Evidence | Length
Calcium binding | 165 – 176 | - | PROSITE-ProRule annotation | 12

GO - Molecular function

GO - Biological process

Keywords - Molecular function

Hydrolase, Transducer

Keywords - Biological process

Lipid degradation, Lipid metabolism

Keywords - Ligand

Calcium, Metal-binding

Enzyme and pathway databases

BRENDA: 3.1.4.11. 3474.
Reactome:
  R-MMU-1169408. ISG15 antiviral mechanism.
  R-MMU-170968. Frs2-mediated activation.
  R-MMU-1855204. Synthesis of IP3 and IP4 in the cytosol.
  R-MMU-186763. Downstream signal transduction.
  R-MMU-202433. Generation of second messenger molecules.
  R-MMU-212718. EGFR interacts with phospholipase C-gamma.
  R-MMU-2424491. DAP12 signaling.
  R-MMU-2871796. FCERI mediated MAPK activation.
  R-MMU-2871809. FCERI mediated Ca+2 mobilization.
  R-MMU-5218921. VEGFR2 mediated cell proliferation.
  R-MMU-5621480. Dectin-2 family.
  R-MMU-5654219. Phospholipase C-mediated cascade: FGFR1.
  R-MMU-5654221. Phospholipase C-mediated cascade, FGFR2.
  R-MMU-5654227. Phospholipase C-mediated cascade, FGFR3.
  R-MMU-5654228. Phospholipase C-mediated cascade, FGFR4.
  R-MMU-8853659. RET signaling.
  R-MMU-983695. Antigen activates B Cell Receptor (BCR) leading to generation of second messengers.

Names & Taxonomy

Protein names
Recommended name:
1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1 (EC:3.1.4.11)
Alternative name(s):
Phosphoinositide phospholipase C-gamma-1
Phospholipase C-gamma-1
Short name:
PLC-gamma-1
Gene names
Name: Plcg1
Synonyms: Plcg-1
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 2

Organism-specific databases

MGI: MGI:97615. Plcg1.

Subcellular location

GO - Cellular component

Keywords - Cellular component

Cell projection

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Evidence | Length
Initiator methionine | - | Removed | By similarity | -
Chain | 2 – 1302 | PRO_0000088499; 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1 | - | 1301

Amino acid modifications

Feature key | Position(s) | Description | Evidence | Length
Modified residue | 2 | N-acetylalanine | By similarity | 1
Modified residue | 506 | Phosphotyrosine | Combined sources | 1
Modified residue | 771 | Phosphotyrosine; by SYK | Combined sources | 1
Modified residue | 775 | Phosphotyrosine | Combined sources | 1
Modified residue | 783 | Phosphotyrosine; by ITK, SYK and TXK | By similarity | 1
Modified residue | 977 | Phosphotyrosine | Combined sources | 1
Modified residue | 1221 | Phosphoserine | By similarity | 1
Modified residue | 1227 | Phosphoserine | By similarity | 1
Modified residue | 1233 | Phosphoserine | By similarity | 1
Modified residue | 1248 | Phosphoserine | By similarity | 1
Modified residue | 1253 | Phosphotyrosine | By similarity | 1
Modified residue | 1263 | Phosphoserine | By similarity | 1
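
The positions in the modification table are 1-based residue numbers, so they can be checked directly against the sequence. As a small consistency check, the sketch below takes residues 751 to 800, copied from the Sequence section of this entry, and looks up the three phosphotyrosine sites annotated in that window (771, 775 and 783). The names `SEGMENT`, `SEGMENT_START` and `residue_at` are illustrative only and are not part of any UniProt tool.

```python
# Residues 751-800 of Q62077, copied from the Sequence section of this entry.
SEGMENT_START = 751
SEGMENT = (
    "KLRYPINEEA"
    "LEKIGTAEPD"
    "YGALYEGRNP"
    "GFYVEANPMP"
    "TFKCAVKALF"
)


def residue_at(position: int) -> str:
    """Return the one-letter residue at a 1-based sequence position."""
    index = position - SEGMENT_START
    if not 0 <= index < len(SEGMENT):
        raise IndexError(f"position {position} is outside the segment")
    return SEGMENT[index]


# Phosphotyrosine positions from the table that fall inside this window.
PHOSPHOTYROSINE_SITES = [771, 775, 783]
site_residues = {pos: residue_at(pos) for pos in PHOSPHOTYROSINE_SITES}
print(site_residues)  # each annotated site should map to "Y"
```

The only subtlety is the offset: a table position is converted to a string index by subtracting the 1-based start of the window.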

Post-translational modification

Tyrosine phosphorylated in response to signaling via activated FLT3, KIT and PDGFRA (By similarity). Tyrosine phosphorylated by activated FGFR1, FGFR2, FGFR3 and FGFR4. Tyrosine phosphorylated by activated FLT1 and KDR. Tyrosine phosphorylated by activated PDGFRB. The receptor-mediated activation of PLCG1 involves its phosphorylation by tyrosine kinases in response to ligation of a variety of growth factor receptors and immune system receptors. For instance, SYK phosphorylates and activates PLCG1 in response to ligation of the B-cell receptor. Phosphorylated by ITK and TXK on Tyr-783 upon TCR activation in T-cells. May be dephosphorylated by PTPRJ (By similarity). [By similarity]
Ubiquitinated by CBLB in activated T-cells. [1 Publication]

Keywords - PTM

Acetylation, Phosphoprotein, Ubl conjugation

Proteomic databases

EPD: Q62077.
MaxQB: Q62077.
PaxDb: Q62077.
PRIDE: Q62077.

PTM databases

iPTMnet: Q62077.
PhosphoSitePlus: Q62077.

Expression

Gene expression databases

Bgee: ENSMUSG00000016933.
CleanEx: MM_PLCG1.
ExpressionAtlas: Q62077. baseline and differential.

Interaction

Subunit structure

Interacts (via SH2 domain) with FGFR1, FGFR2, FGFR3 and FGFR4 (phosphorylated). Interacts with RALGPS1. Interacts (via SH2 domains) with VIL1 (phosphorylated at C-terminus tyrosine phosphorylation sites). Interacts (via SH2 domain) with RET (By similarity). Interacts with AGAP2 via its SH3 domain. Interacts with LAT (phosphorylated) upon TCR activation. Interacts (via SH3 domain) with the Pro-rich domain of TNK1. Associates with BLNK, VAV1, GRB2 and NCK1 in a B-cell antigen receptor-dependent fashion. Interacts with CBLB in activated T-cells; this interaction inhibits phosphorylation. Interacts with SHB. Interacts (via SH3 domain) with the Arg/Gly-rich-flanked Pro-rich domains of KHDRBS1/SAM68. This interaction is selectively regulated by arginine methylation of KHDRBS1/SAM68. Interacts with INPP5D/SHIP1, THEMIS and CLNK. Interacts with FLT4 and KIT. Interacts with AXL (By similarity). Interacts with SYK; activates PLCG1 (By similarity). Interacts with FLT1 (tyrosine-phosphorylated). Interacts (via SH2 domain) with PDGFRA and PDGFRB (tyrosine phosphorylated). Interacts with PIP5K1C. Interacts with NTRK1 and NTRK2 (phosphorylated upon ligand-binding). Interacts with TESPA1 (By similarity). Interacts with GRB2, LAT and THEMIS upon TCR activation in thymocytes; the association is weaker in the absence of TESPA1. [By similarity] [14 Publications]

Binary interactions

With | Entry | #Exp. | IntAct | Notes
Epor | P14753 | 4 | EBI-300133,EBI-617901 | -
Grb2 | Q60631 | 2 | EBI-300133,EBI-1688 | -
Lat | O54957 | 2 | EBI-300133,EBI-6390034 | -
Lepr | P48356 | 2 | EBI-300133,EBI-2257257 | -

GO - Molecular function

  • glutamate receptor binding Source: MGI
  • neurotrophin TRKA receptor binding Source: MGI
  • protein kinase binding Source: MGI
  • receptor tyrosine kinase binding Source: UniProtKB

Protein-protein interaction databases

BioGrid: 202238. 10 interactors.
DIP: DIP-29284N.
IntAct: Q62077. 19 interactors.
MINT: MINT-124146.
STRING: 10090.ENSMUSP00000099404.

Structure

3D structure databases

ProteinModelPortal: Q62077.
SMR: Q62077.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Evidence | Length
Domain | 27 – 142 | PH 1 | PROSITE-ProRule annotation | 116
Domain | 152 – 187 | EF-hand | PROSITE-ProRule annotation | 36
Domain | 320 – 464 | PI-PLC X-box | PROSITE-ProRule annotation | 145
Domain | 489 – 523 | PH 2; first part | PROSITE-ProRule annotation | 35
Domain | 550 – 657 | SH2 1 | PROSITE-ProRule annotation | 108
Domain | 668 – 756 | SH2 2 | PROSITE-ProRule annotation | 89
Domain | 791 – 851 | SH3 | PROSITE-ProRule annotation | 61
Domain | 895 – 931 | PH 2; second part | PROSITE-ProRule annotation | 37
Domain | 953 – 1070 | PI-PLC Y-box | PROSITE-ProRule annotation | 118
Domain | 1075 – 1177 | C2 | PROSITE-ProRule annotation | 103
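
The Length column follows from the inclusive, 1-based start and end positions: length = end - start + 1. The sketch below re-derives the lengths for the ten domain rows and reports any row whose listed length disagrees. The tuple layout and the helper name `span_length` are choices made for this example.

```python
# (description, start, end, listed_length) copied from the Domains and Repeats table.
DOMAINS = [
    ("PH 1", 27, 142, 116),
    ("EF-hand", 152, 187, 36),
    ("PI-PLC X-box", 320, 464, 145),
    ("PH 2; first part", 489, 523, 35),
    ("SH2 1", 550, 657, 108),
    ("SH2 2", 668, 756, 89),
    ("SH3", 791, 851, 61),
    ("PH 2; second part", 895, 931, 37),
    ("PI-PLC Y-box", 953, 1070, 118),
    ("C2", 1075, 1177, 103),
]


def span_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based residue range."""
    return end - start + 1


# Rows whose listed length does not match the positions.
mismatches = [
    name for name, start, end, listed in DOMAINS
    if span_length(start, end) != listed
]

# Domains are listed in sequence order; check that no two ranges overlap.
non_overlapping = all(
    DOMAINS[i][2] < DOMAINS[i + 1][1] for i in range(len(DOMAINS) - 1)
)

print(mismatches, non_overlapping)
```

The two parts of the second PH domain (489 – 523 and 895 – 931) are separate rows, so the check treats them as independent ranges on either side of the SH2 and SH3 domains.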

Domain

The SH3 domain mediates interaction with RALGPS1 (By similarity). The SH3 domain also mediates interaction with CLNK. [By similarity] [1 Publication]

Sequence similarities

Contains 1 C2 domain. [PROSITE-ProRule annotation]
Contains 1 EF-hand domain. [PROSITE-ProRule annotation]
Contains 2 PH domains. [PROSITE-ProRule annotation]
Contains 1 PI-PLC X-box domain. [PROSITE-ProRule annotation]
Contains 1 PI-PLC Y-box domain. [PROSITE-ProRule annotation]
Contains 2 SH2 domains. [PROSITE-ProRule annotation]
Contains 1 SH3 domain. [PROSITE-ProRule annotation]

Keywords - Domain

Repeat, SH2 domain, SH3 domain

Phylogenomic databases

eggNOG:
  KOG1264. Eukaryota.
  ENOG410XPXE. LUCA.
GeneTree: ENSGT00730000110782.
HOGENOM: HOG000230864.
HOVERGEN: HBG053611.
InParanoid: Q62077.
KO: K01116.
OMA: TMDLPFL.
OrthoDB: EOG091G07R3.
PhylomeDB: Q62077.
TreeFam: TF313216.

Family and domain databases

Gene3D:
  1.10.238.10. 1 hit.
  2.30.29.30. 3 hits.
  2.60.40.150. 1 hit.
  3.20.20.190. 2 hits.
  3.30.505.10. 2 hits.
InterPro:
  IPR000008. C2_dom.
  IPR011992. EF-hand-dom_pair.
  IPR018247. EF_Hand_1_Ca_BS.
  IPR002048. EF_hand_dom.
  IPR011993. PH_dom-like.
  IPR001849. PH_domain.
  IPR001192. PI-PLC_fam.
  IPR016279. PLC-gamma.
  IPR028380. PLC-gamma1.
  IPR017946. PLC-like_Pdiesterase_TIM-brl.
  IPR000909. PLipase_C_PInositol-sp_X_dom.
  IPR001711. PLipase_C_Pinositol-sp_Y.
  IPR000980. SH2.
  IPR001452. SH3_domain.
PANTHER:
  PTHR10336. PTHR10336. 2 hits.
  PTHR10336:SF52. PTHR10336:SF52. 2 hits.
Pfam:
  PF00168. C2. 1 hit.
  PF00388. PI-PLC-X. 1 hit.
  PF00387. PI-PLC-Y. 1 hit.
  PF00017. SH2. 2 hits.
  PF00018. SH3_1. 1 hit.
PIRSF: PIRSF000952. PLC-gamma. 1 hit.
PRINTS:
  PR00390. PHPHLIPASEC.
  PR00401. SH2DOMAIN.
  PR00452. SH3DOMAIN.
SMART:
  SM00239. C2. 1 hit.
  SM00233. PH. 3 hits.
  SM00148. PLCXc. 1 hit.
  SM00149. PLCYc. 1 hit.
  SM00252. SH2. 2 hits.
  SM00326. SH3. 1 hit.
SUPFAM:
  SSF47473. SSF47473. 1 hit.
  SSF49562. SSF49562. 1 hit.
  SSF50044. SSF50044. 1 hit.
  SSF50729. SSF50729. 1 hit.
  SSF51695. SSF51695. 2 hits.
  SSF55550. SSF55550. 2 hits.
PROSITE:
  PS50004. C2. 1 hit.
  PS00018. EF_HAND_1. 1 hit.
  PS50222. EF_HAND_2. 1 hit.
  PS50003. PH_DOMAIN. 2 hits.
  PS50007. PIPLC_X_DOMAIN. 1 hit.
  PS50008. PIPLC_Y_DOMAIN. 1 hit.
  PS50001. SH2. 2 hits.
  PS50002. SH3. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

Q62077-1 [UniParc]

        10         20         30         40         50
MAGVATPCAN GCGPGAPSEA EVLHLCRSLE VGTVMTLFYS KKSQRPERKT
        60         70         80         90        100
FQVKLETRQI TWSRGADKIE GSIDIREIKE IRPGKTSRDF DRYQEDPAFR
       110        120        130        140        150
PDQSHCFVIL YGMEFRLKTL SLQATSEDEV NMWIKGLTWL MEDTLQAATP
       160        170        180        190        200
LQIERWLRKQ FYSVDRNRED RISAKDLKNM LSQVNYRVPN MRFLRERLTD
       210        220        230        240        250
LEQRSGDITY GQFAQLYRSL MYSAQKTMDL PFLETNALRT GERPEHCQVS
       260        270        280        290        300
LSEFQQFLLE YQGELWAVDR LQVQEFMLSF LRDPLREIEE PYFFLDELVT
       310        320        330        340        350
FLFSKENSVW NSQLDAVCPD TMNNPLSHYW ISSSHNTYLT GDQFSSESSL
       360        370        380        390        400
EAYARCLRMG CRCIELDCWD GPDGMPVIYH GHTLTTKIKF SDVLHTIKEH
       410        420        430        440        450
AFVASEYPVI LSIEDHCSIA QQRNMAQHFR KVLGDTLLTK PVDIAADGLP
       460        470        480        490        500
SPNQLRRKIL IKHKKLAEGS AYEEVPTSVM YSENDISNSI KNGILYLEDP
       510        520        530        540        550
VNHEWYPHYF VLTSSKIYYS EETSSDQGNE DEEEPKEASS STELHSSEKW
       560        570        580        590        600
FHGKLGAGRD GRHIAERLLT EYCIETGAPD GSFLVRESET FVGDYTLSFW
       610        620        630        640        650
RNGKVQHCRI HSRQDAGTPK FFLTDNLVFD SLYDLITHYQ QVPLRCNEFE
       660        670        680        690        700
MRLSEPVPQT NAHESKEWYH ASLTRAQAEH MLMRVPRDGA FLVRKRNEPN
       710        720        730        740        750
SYAISFRAEG KIKHCRVQQE GQTVMLGNSE FDSLVDLISY YEKHPLYRKM
       760        770        780        790        800
KLRYPINEEA LEKIGTAEPD YGALYEGRNP GFYVEANPMP TFKCAVKALF
       810        820        830        840        850
DYKAQREDEL TFTKSAIIQN VEKQDGGWWR GDYGGKKQLW FPSNYVEEMI
       860        870        880        890        900
NPAVLEPERE HLDENSPLGD LLRGVLDVPA CQIAIRPEGK NNRLFVFSIS
       910        920        930        940        950
MPSVAQWSLD VAADSQEELQ DWVKKIREVA QTADARLTEG KMMERRKKIA
       960        970        980        990       1000
LELSELVVYC RPVPFDEEKI GTERACYRDM SSFPETKAEK YVNKAKGKKF
      1010       1020       1030       1040       1050
LQYNRLQLSR IYPKGQRLDS SNYDPLPMWI CGSQLVALNF QTPDKPMQMN
      1060       1070       1080       1090       1100
QALFMAGGHC GYVLQPSTMR DEAFDPFDKS SLRGLEPCVI CIEVLGARHL
      1110       1120       1130       1140       1150
PKNGRGIVCP FVEIEVAGAE YDSTKQKTEF VVDNGLNPVW PAKPFHFQIS
      1160       1170       1180       1190       1200
NPEFAFLRFV VYEEDMFSDQ NFLAQATFPV KGLKTGYRAV PLKNNYSEDL
      1210       1220       1230       1240       1250
ELASLLIKID IFPAKENGDL SPFSGISLRE RASDASSQLF HVRAREGSFE
      1260       1270       1280       1290       1300
ARYQQPFEDF RISQEHLADH FDSRERSTSD GPSSATNLIE DPLHDKLWKC
SL
Length: 1,302
Mass (Da): 149,668
Last modified: January 24, 2006 - v2
Checksum: 5D123C508D425EB2
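
The sequence display above interleaves position rulers with rows of residues in blocks of ten. To work with it programmatically, the rulers and spacing have to be stripped. The following is a minimal sketch, assuming that any line made up only of digit tokens is a ruler and that every other non-empty line holds residues; the function name `parse_sequence_block` is hypothetical. Only the first 100 residues are included to keep the example short.

```python
def parse_sequence_block(text: str) -> str:
    """Join residue rows into one string, skipping ruler and blank lines."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and ruler lines such as "10 20 30 40 50".
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)


# First two rows of the display, copied from this entry.
EXAMPLE = """\
        10         20         30         40         50
MAGVATPCAN GCGPGAPSEA EVLHLCRSLE VGTVMTLFYS KKSQRPERKT
        60         70         80         90        100
FQVKLETRQI TWSRGADKIE GSIDIREIKE IRPGKTSRDF DRYQEDPAFR
"""

sequence = parse_sequence_block(EXAMPLE)
print(len(sequence), sequence[:10])
```

Because the parser keys on token content rather than column alignment, it accepts rulers whether or not they are aligned with the residue blocks.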

Experimental Info

Feature key | Position(s) | Description | Evidence | Length
Sequence conflict | 966 | D → A in CAA64639 (PubMed:8687404). | Curated | 1
Sequence conflict | 984 | P → R in CAA64639 (PubMed:8687404). | Curated | 1
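
A sequence conflict row reads as "reference residue → residue reported in the other record". As an illustration, the sketch below applies the two listed substitutions to residues 951 to 1000 of the displayed sequence to show what that stretch would look like in CAA64639 according to this table. It first checks that the reference residue at each position matches the table. `apply_substitutions` and the constant names are hypothetical helpers for this example, and the result is derived only from the two rows above, not taken from the CAA64639 record itself.

```python
# Residues 951-1000 of the displayed sequence (Q62077-1), copied from this entry.
SEGMENT_START = 951
SEGMENT = (
    "LELSELVVYC"
    "RPVPFDEEKI"
    "GTERACYRDM"
    "SSFPETKAEK"
    "YVNKAKGKKF"
)

# (position, reference residue, conflicting residue) from the table above.
CONFLICTS = [(966, "D", "A"), (984, "P", "R")]


def apply_substitutions(segment, start, substitutions):
    """Apply single-residue substitutions given as 1-based positions."""
    residues = list(segment)
    for position, expected, replacement in substitutions:
        index = position - start
        if residues[index] != expected:
            raise ValueError(
                f"expected {expected} at {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


variant = apply_substitutions(SEGMENT, SEGMENT_START, CONFLICTS)
print(variant)
```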

Sequence databases

EMBL / GenBank / DDBJ:
  BC065091 mRNA. Translation: AAH65091.1.
  X95346 mRNA. Translation: CAA64639.1.
CCDS: CCDS16996.1.
RefSeq: NP_067255.2. NM_021280.3.
UniGene: Mm.44463.

Genome annotation databases

Ensembl: ENSMUST00000103115; ENSMUSP00000099404; ENSMUSG00000016933.
GeneID: 18803.
KEGG: mmu:18803.
UCSC: uc008nra.1. mouse.

Cross-references

Organism-specific databases

CTD: 5335.
MGI: MGI:97615. Plcg1.

Miscellaneous databases

PRO: Q62077.

Entry information

Entry name: PLCG1_MOUSE
Accession: Primary (citable) accession number: Q62077
Secondary accession number(s): Q6P1G1
Entry history
Integrated into UniProtKB/Swiss-Prot: July 15, 1998
Last sequence update: January 24, 2006
Last modified: November 2, 2016
This is version 166 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
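
To make the two criteria concrete, the sketch below evaluates a pre-computed pairwise alignment against an identity threshold and the 80% overlap threshold. It is an illustration only and not the procedure UniRef uses to build clusters. The definitions are assumptions made for this example: identity is the fraction of matching residues among columns where both rows have a residue, and overlap is the number of such columns divided by the length of the seed sequence. The function names and the toy alignments are hypothetical.

```python
def identity_and_overlap(aligned_seed: str, aligned_member: str):
    """Return (identity, overlap) for two rows of a pairwise alignment.

    Gaps are written as '-'. Both definitions are assumptions of this sketch.
    """
    if len(aligned_seed) != len(aligned_member):
        raise ValueError("alignment rows must have equal length")
    seed_length = sum(1 for residue in aligned_seed if residue != "-")
    columns = [
        (a, b) for a, b in zip(aligned_seed, aligned_member)
        if a != "-" and b != "-"
    ]
    if not columns or seed_length == 0:
        return 0.0, 0.0
    matches = sum(1 for a, b in columns if a == b)
    return matches / len(columns), len(columns) / seed_length


def meets_threshold(aligned_seed, aligned_member, min_identity, min_overlap=0.8):
    """Check both the identity and the overlap criterion."""
    identity, overlap = identity_and_overlap(aligned_seed, aligned_member)
    return identity >= min_identity and overlap >= min_overlap


# Toy alignments: one mismatch in ten columns, then a short fragment.
near_identical = meets_threshold("MAGVATPCAN", "MAGVSTPCAN", min_identity=0.9)
short_fragment = meets_threshold("MAGVATPCAN", "MAGV------", min_identity=0.9)
print(near_identical, short_fragment)
```

The second case shows why the overlap criterion matters: a fragment can be 100% identical over the aligned columns and still cover too little of the seed sequence to qualify.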