Protein

Poly(ADP-ribose) glycohydrolase

Gene

PARG

Organism
Homo sapiens (Human)
Status
Reviewed; annotation score: 5 out of 5; experimental evidence at protein level

Function

Poly(ADP-ribose) synthesized after DNA damage is only present transiently and is rapidly degraded by poly(ADP-ribose) glycohydrolase (PubMed:23102699). PARG acts as both an endo- and exoglycosidase, releasing PAR chains of different lengths as well as ADP-ribose monomers (PubMed:23102699). Required for retinoic acid-dependent gene transactivation, probably by dePARsylating histone demethylase KDM4D, allowing chromatin derepression at RAR-dependent gene promoters (PubMed:23102699). Involved in the synthesis of ATP in the nucleus, together with PARP1, NMNAT1 and NUDT5 (PubMed:27257257). Nuclear ATP generation is required for extensive, energy-consuming chromatin remodeling events (PubMed:27257257). 2 Publications

Catalytic activity

Hydrolyzes poly(ADP-D-ribose) at glycosidic (1''-2') linkage of ribose-ribose bond to produce free ADP-D-ribose.

Sites

Feature key   Position(s)  Description                    Length
Active site   737          1 Publication                  1
Binding site  740          Substrate                      1
Binding site  754          Substrate; via amide nitrogen  1
Active site   755          1 Publication                  1
Active site   756          1 Publication                  1
Binding site  795          Substrate (By similarity)      1
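All positions in the Sites table are 1-based and refer to the canonical isoform 1 sequence. The following is a minimal sketch, not part of the entry, of how those positions map onto residues. It embeds only residues 701-800 copied from the Sequences section; the names `SEGMENT`, `SEGMENT_START` and `residue_at` are illustrative assumptions.

```python
# Residues 701-800 of the canonical sequence (isoform 1), copied from the
# Sequences section of this entry. Names below are illustrative only.
SEGMENT_START = 701  # 1-based position of the first residue in SEGMENT
SEGMENT = (
    "SLEDFPEWERCEKPLTRLHVTYEGTIEENGQGMLQVDFANRFVGGGVTSA"
    "GLVQEEIRFLINPELIISRLFTEVLDHNECLIITGTEQYSEYTGYAETYR"
)


def residue_at(position: int) -> str:
    """Return the one-letter residue at a 1-based canonical position."""
    index = position - SEGMENT_START
    if not 0 <= index < len(SEGMENT):
        raise ValueError(f"position {position} is outside 701-800")
    return SEGMENT[index]


# Positions and feature keys as listed in the Sites table.
SITES = {
    737: "Active site",
    740: "Binding site",
    754: "Binding site",
    755: "Active site",
    756: "Active site",
    795: "Binding site",
}

for position, feature in SITES.items():
    print(f"{feature} {position}: {residue_at(position)}")
```

With this segment the three active-site positions resolve to D737, E755 and E756, and the binding-site positions to N740, Q754 and Y795.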

GO - Molecular function

  • poly(ADP-ribose) glycohydrolase activity Source: UniProtKB

GO - Biological process

  • ATP generation from poly-ADP-D-ribose Source: UniProtKB
  • carbohydrate metabolic process Source: InterPro

Keywords - Molecular function

Hydrolase

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000138964-MONOMER.
        ZFISH:HS05225-MONOMER.
BRENDA: 3.2.1.143. 2681.
Reactome: R-HSA-110362. POLB-Dependent Long Patch Base Excision Repair.

Names & Taxonomy

Protein names
Recommended name:
Poly(ADP-ribose) glycohydrolase (EC:3.2.1.143)
Gene names
Name: PARG
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 10

Organism-specific databases

HGNC: HGNC:8605. PARG.

Subcellular location

Isoform 1:
Isoform 2:

Keywords - Cellular component

Cytoplasm, Mitochondrion, Nucleus

Pathology & Biotech

Mutagenesis

Feature key  Position(s)  Description                                                                   Length
Mutagenesis  12           K → A: Abolishes nuclear targeting; when associated with G-13. 1 Publication  1
Mutagenesis  13           R → G: Abolishes nuclear targeting; when associated with A-12. 1 Publication  1
Mutagenesis  36           R → A: No effect. 1 Publication                                               1
Mutagenesis  37           R → G: No effect. 1 Publication                                               1
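The mutagenesis rows use the "wild type → mutant" convention at a 1-based position. As a hedged illustration of that notation only (not of the experimental method), the sketch below applies the K12A/R13G double substitution to residues 1-50 of the canonical sequence and checks the wild-type residue first. `N_TERM` and `mutate` are names introduced here for the example.

```python
# Residues 1-50 of the canonical sequence, copied from the Sequences section.
N_TERM = "MNAGPGCEPCTKRPRWGAATTSPAASDARSFPSRQRRVLDPKDAHVQFRV"


def mutate(seq: str, substitutions) -> str:
    """Apply (wild_type, position, mutant) substitutions, positions 1-based.

    Raises ValueError if the residue found at a position does not match the
    stated wild-type residue, which catches off-by-one indexing mistakes.
    """
    residues = list(seq)
    for wild_type, position, mutant in substitutions:
        found = residues[position - 1]
        if found != wild_type:
            raise ValueError(
                f"expected {wild_type} at {position}, found {found}"
            )
        residues[position - 1] = mutant
    return "".join(residues)


# The two substitutions reported to abolish nuclear targeting in combination.
double_mutant = mutate(N_TERM, [("K", 12, "A"), ("R", 13, "G")])

# Show residues 10-16 before and after the substitutions.
print(N_TERM[9:16], "->", double_mutant[9:16])
```

Residues 10-16 change from CTKRPRW to CTAGPRW; the R36A and R37G rows can be checked the same way.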

Organism-specific databases

DisGeNET: 8505.
OpenTargets: ENSG00000227345.
PharmGKB: PA32940.

Chemistry databases

ChEMBL: CHEMBL1795143.

Polymorphism and mutation databases

BioMuta: PARG.
DMDM: 56417893.

PTM / Processing

Molecule processing

Feature key  Identifier      Position(s)  Description                      Length
Chain        PRO_0000066602  1 – 976      Poly(ADP-ribose) glycohydrolase  976

Amino acid modifications

Feature key       Position(s)  Description                                         Length
Modified residue  22           Phosphoserine (Combined sources)                    1
Modified residue  68           Phosphoserine (Combined sources)                    1
Modified residue  133          Phosphoserine (Combined sources)                    1
Modified residue  137          Phosphoserine (Combined sources)                    1
Modified residue  139          Phosphothreonine (Combined sources)                 1
Modified residue  197          Phosphoserine (Combined sources, 1 Publication)     1
Modified residue  199          Phosphothreonine (Combined sources, 1 Publication)  1
Modified residue  261          Phosphoserine (Combined sources)                    1
Modified residue  264          Phosphoserine (Combined sources)                    1
Modified residue  286          Phosphoserine (Combined sources)                    1
Modified residue  291          Phosphoserine (Combined sources)                    1
Modified residue  298          Phosphoserine (1 Publication)                       1
Modified residue  302          Phosphoserine (Combined sources)                    1
Modified residue  316          Phosphoserine (Combined sources)                    1
Modified residue  340          N6-acetyllysine (By similarity)                     1
Modified residue  448          Phosphoserine (Combined sources)                    1
Isoform 2 (identifier: Q86W56-2)
Modified residue  1            N-acetylmethionine (Combined sources)               1

Keywords - PTM

Acetylation, Phosphoprotein

Proteomic databases

EPD: Q86W56.
MaxQB: Q86W56.
PaxDb: Q86W56.
PeptideAtlas: Q86W56.
PRIDE: Q86W56.

PTM databases

iPTMnet: Q86W56.
PhosphoSitePlus: Q86W56.

Miscellaneous databases

PMAP-CutDB: Q86W56.

Expression

Tissue specificity

Ubiquitously expressed. 1 Publication

Gene expression databases

Bgee: ENSG00000227345.
CleanEx: HS_PARG.
ExpressionAtlas: Q86W56. baseline and differential.
Genevisible: Q86W56. HS.

Organism-specific databases

HPA: HPA021819.
     HPA053007.

Interaction

Subunit structure

Interacts with PCNA (PubMed:21398629). Interacts with NUDT5 (PubMed:27257257). 2 Publications

Protein-protein interaction databases

BioGrid: 114077. 7 interactors.
IntAct: Q86W56. 2 interactors.
STRING: 9606.ENSP00000384408.

Chemistry databases

BindingDB: Q86W56.

Structure

Secondary structure

Feature key  Position(s)  Description       Length
Beta strand  452 – 456    Combined sources  5
Helix        458 – 460    Combined sources  3
Turn         464 – 467    Combined sources  4
Beta strand  480 – 482    Combined sources  3
Helix        486 – 488    Combined sources  3
Beta strand  497 – 501    Combined sources  5
Beta strand  520 – 522    Combined sources  3
Beta strand  532 – 534    Combined sources  3
Helix        535 – 543    Combined sources  9
Helix        550 – 559    Combined sources  10
Helix        562 – 564    Combined sources  3
Turn         565 – 567    Combined sources  3
Helix        571 – 579    Combined sources  9
Helix        583 – 591    Combined sources  9
Helix        593 – 602    Combined sources  10
Helix        604 – 607    Combined sources  4
Beta strand  621 – 626    Combined sources  6
Helix        627 – 638    Combined sources  12
Turn         652 – 655    Combined sources  4
Helix        662 – 665    Combined sources  4
Helix        672 – 688    Combined sources  17
Beta strand  694 – 701    Combined sources  8
Helix        708 – 710    Combined sources  3
Beta strand  718 – 724    Combined sources  7
Helix        726 – 729    Combined sources  4
Turn         730 – 732    Combined sources  3
Beta strand  733 – 739    Combined sources  7
Turn         743 – 750    Combined sources  8
Helix        754 – 761    Combined sources  8
Helix        763 – 767    Combined sources  5
Helix        768 – 771    Combined sources  4
Beta strand  779 – 784    Combined sources  6
Beta strand  790 – 793    Combined sources  4
Helix        796 – 798    Combined sources  3
Beta strand  800 – 804    Combined sources  5
Beta strand  815 – 818    Combined sources  4
Beta strand  820 – 825    Combined sources  6
Helix        832 – 836    Combined sources  5
Helix        838 – 852    Combined sources  15
Helix        859 – 861    Combined sources  3
Beta strand  865 – 868    Combined sources  4
Helix        873 – 875    Combined sources  3
Helix        879 – 892    Combined sources  14
Beta strand  897 – 900    Combined sources  4
Helix        905 – 920    Combined sources  16
Helix        925 – 939    Combined sources  15
Turn         940 – 942    Combined sources  3
Beta strand  945 – 947    Combined sources  3
Helix        952 – 961    Combined sources  10

3D structure databases

PDB entry  Method  Resolution (Å)  Chain  Positions
4A0D       X-ray   1.75            A      448-976
4B1G       X-ray   1.83            A      448-976
4B1H       X-ray   2.00            A      448-976
4B1I       X-ray   2.14            A      448-976
4B1J       X-ray   2.08            A      448-976
5A7R       X-ray   1.95            A      448-976
5LHB       X-ray   2.23            A      448-976

ProteinModelPortal: Q86W56.
SMR: Q86W56.

Family & Domains

Region

Feature key  Position(s)  Description        Length
Region       1 – 456      A-domain           456
Region       610 – 795    Catalytic          186
Region       726 – 727    Substrate binding  2
Region       869 – 874    Substrate binding  6

Motif

Feature key  Position(s)  Description                                        Length
Motif        10 – 16      Nuclear localization signal (1 Publication)        7
Motif        76 – 83      PIP-box (PCNA interacting peptide)                 8

Domain

The PIP-box mediates interaction with PCNA and localization to replication foci.
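Motif ranges are 1-based and inclusive, so a range such as 76 – 83 spans eight residues. The sketch below, offered as an illustration rather than anything taken from the entry, extracts both motifs from residues 1-100 of the canonical sequence and counts how many motif residues fall inside the 1-82 segment that the entry lists as missing in isoform 2. The helper names are assumptions made for the example, and the overlap count says nothing by itself about whether a truncated isoform keeps the corresponding function.

```python
# Residues 1-100 of the canonical sequence, copied from the Sequences section.
N_TERM = (
    "MNAGPGCEPCTKRPRWGAATTSPAASDARSFPSRQRRVLDPKDAHVQFRV"
    "PPSSPACVPGRAGQHRGSATSLVFKQKTITSWMDTKGIKTAESESLDSKE"
)


def slice_1based(seq: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive coordinates."""
    return seq[start - 1:end]


def overlap(a, b) -> int:
    """Number of positions shared by two inclusive (start, end) ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


MOTIFS = {
    "Nuclear localization signal": (10, 16),
    "PIP-box": (76, 83),
}
ISOFORM2_MISSING = (1, 82)  # "1-82: Missing." in the isoform 2 description

for name, (start, end) in MOTIFS.items():
    residues = slice_1based(N_TERM, start, end)
    lost = overlap((start, end), ISOFORM2_MISSING)
    print(f"{name} {start}-{end}: {residues} ({lost} of {len(residues)} "
          f"residues lie within 1-82)")
```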

Sequence similarities

Phylogenomic databases

eggNOG: KOG2064. Eukaryota.
        ENOG410XT3Y. LUCA.
GeneTree: ENSGT00390000003652.
HOVERGEN: HBG053510.
InParanoid: Q86W56.
KO: K07759.
OMA: NAGPGCE.
OrthoDB: EOG091G01FM.
PhylomeDB: Q86W56.
TreeFam: TF323527.

Family and domain databases

InterPro: IPR007724. Poly_GlycHdrlase.
PANTHER: PTHR12837. PTHR12837. 1 hit.
Pfam: PF05028. PARG_cat. 1 hit.

Sequences (5)

Sequence status: Complete.

This entry describes 5 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q86W56-1)
Also known as: hPARG111

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MNAGPGCEPC TKRPRWGAAT TSPAASDARS FPSRQRRVLD PKDAHVQFRV
        60         70         80         90        100
PPSSPACVPG RAGQHRGSAT SLVFKQKTIT SWMDTKGIKT AESESLDSKE
       110        120        130        140        150
NNNTRIESMM SSVQKDNFYQ HNVEKLENVS QLSLDKSPTE KSTQYLNQHQ
       160        170        180        190        200
TAAMCKWQNE GKHTEQLLES EPQTVTLVPE QFSNANIDRS PQNDDHSDTD
       210        220        230        240        250
SEENRDNQQF LTTVKLANAK QTTEDEQARE AKSHQKCSKS CDPGEDCASC
       260        270        280        290        300
QQDEIDVVPE SPLSDVGSED VGTGPKNDNK LTRQESCLGN SPPFEKESEP
       310        320        330        340        350
ESPMDVDNSK NSCQDSEADE ETSPGFDEQE DGSSSQTANK PSRFQARDAD
       360        370        380        390        400
IEFRKRYSTK GGEVRLHFQF EGGESRTGMN DLNAKLPGNI SSLNVECRNS
       410        420        430        440        450
KQHGKKDSKI TDHFMRLPKA EDRRKEQWET KHQRTERKIP KYVPPHLSPD
       460        470        480        490        500
KKWLGTPIEE MRRMPRCGIR LPLLRPSANH TVTIRVDLLR AGEVPKPFPT
       510        520        530        540        550
HYKDLWDNKH VKMPCSEQNL YPVEDENGER TAGSRWELIQ TALLNKFTRP
       560        570        580        590        600
QNLKDAILKY NVAYSKKWDF TALIDFWDKV LEEAEAQHLY QSILPDMVKI
       610        620        630        640        650
ALCLPNICTQ PIPLLKQKMN HSITMSQEQI ASLLANAFFC TFPRRNAKMK
       660        670        680        690        700
SEYSSYPDIN FNRLFEGRSS RKPEKLKTLF CYFRRVTEKK PTGLVTFTRQ
       710        720        730        740        750
SLEDFPEWER CEKPLTRLHV TYEGTIEENG QGMLQVDFAN RFVGGGVTSA
       760        770        780        790        800
GLVQEEIRFL INPELIISRL FTEVLDHNEC LIITGTEQYS EYTGYAETYR
       810        820        830        840        850
WSRSHEDGSE RDDWQRRCTE IVAIDALHFR RYLDQFVPEK MRRELNKAYC
       860        870        880        890        900
GFLRPGVSSE NLSAVATGNW GCGAFGGDAR LKALIQILAA AAAERDVVYF
       910        920        930        940        950
TFGDSELMRD IYSMHIFLTE RKLTVGDVYK LLLRYYNEEC RNCSTPGPDI
       960        970
KLYPFIYHAV ESCAETADHS GQRTGT
Length: 976
Mass (Da): 111,110
Last modified: June 1, 2003 - v1
Checksum: D6646353C6D0180E

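The sequence is displayed in blocks of ten residues under a position ruler, so it has to be flattened before positions can be looked up programmatically. A minimal sketch of that step, assuming ruler lines contain only digits and whitespace; `parse_sequence_block` is a name made up for this example, and the sample reproduces only the first 100 residues.

```python
def parse_sequence_block(text: str) -> str:
    """Flatten a ruler-annotated sequence display into one residue string.

    Lines made up solely of numbers are treated as position rulers and
    skipped; the spaces between ten-residue blocks are removed.
    """
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or all(token.isdigit() for token in tokens):
            continue  # blank line or ruler line
        residues.append("".join(tokens))
    return "".join(residues)


# First two rows of the canonical sequence display. The parser accepts both
# an aligned ruler (first row) and a collapsed one (second row).
sample = """\
        10         20         30         40         50
MNAGPGCEPC TKRPRWGAAT TSPAASDARS FPSRQRRVLD PKDAHVQFRV
60 70 80 90 100
PPSSPACVPG RAGQHRGSAT SLVFKQKTIT SWMDTKGIKT AESESLDSKE
"""

sequence = parse_sequence_block(sample)
print(len(sequence))   # 100
print(sequence[:10])   # MNAGPGCEPC
```

Applied to the full display, the resulting string should have the stated length of 976, which is a quick sanity check that no residues were lost in extraction.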
Isoform 2 (identifier: Q86W56-2)
Also known as: hPARG102

The sequence of this isoform differs from the canonical sequence as follows:
     1-82: Missing.

Length: 894
Mass (Da): 102,312
Checksum: 666B77E891E4A500

Isoform 3 (identifier: Q86W56-3)
Also known as: hPARG99

The sequence of this isoform differs from the canonical sequence as follows:
     1-108: Missing.

Length: 868
Mass (Da): 99,432
Checksum: 85BCE7201ABABA5F

Isoform 4 (identifier: Q86W56-4)
Also known as: hPARG60

The sequence of this isoform differs from the canonical sequence as follows:
     1-15: MNAGPGCEPCTKRPR → MVQAGAEKDAQSISL
     16-423: Missing.
     485-526: RVDLLRAGEVPKPFPTHYKDLWDNKHVKMPCSEQNLYPVEDE → W

Note: Catalytically inactive.
Length: 527
Mass (Da): 60,873
Checksum: 0FBB26152C56D33D

Isoform 5 (identifier: Q86W56-5)
Also known as: hPARG55

The sequence of this isoform differs from the canonical sequence as follows:
     1-460: Missing.
     485-526: RVDLLRAGEVPKPFPTHYKDLWDNKHVKMPCSEQNLYPVEDE → W

Note: Catalytically inactive.
Length: 475
Mass (Da): 54,794
Checksum: 5443EFFA21AA84CE
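Each isoform is described as a list of edits against the 976-residue canonical sequence, and the stated lengths follow from those edits by simple arithmetic. The sketch below works this through under the assumption that every edit is an inclusive range replaced by a (possibly empty) string; `ISOFORM_EDITS` and `isoform_length` are names introduced for illustration.

```python
CANONICAL_LENGTH = 976

# (start, end, replacement) with 1-based inclusive coordinates on the
# canonical sequence; an empty replacement means "Missing".
ISOFORM_EDITS = {
    "Isoform 2": [(1, 82, "")],
    "Isoform 3": [(1, 108, "")],
    "Isoform 4": [
        (1, 15, "MVQAGAEKDAQSISL"),
        (16, 423, ""),
        (485, 526, "W"),
    ],
    "Isoform 5": [(1, 460, ""), (485, 526, "W")],
}


def isoform_length(edits, canonical_length=CANONICAL_LENGTH) -> int:
    """Length after replacing each inclusive range with its replacement."""
    length = canonical_length
    for start, end, replacement in edits:
        removed = end - start + 1
        length += len(replacement) - removed
    return length


for name, edits in ISOFORM_EDITS.items():
    print(name, isoform_length(edits))
```

The results (894, 868, 527 and 475) agree with the lengths listed for isoforms 2 to 5. For isoform 4, for example, the 15-residue swap is length-neutral, the 16-423 deletion removes 408 residues, and replacing 485-526 with W removes a net 41, giving 976 - 408 - 41 = 527.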

Experimental Info

Feature key        Position(s)  Description                                         Length
Sequence conflict  13 – 14      RP → AT in AAB61614 (PubMed:10449915). Curated      2
Sequence conflict  61           R → Q in AAB61614 (PubMed:10449915). Curated        1
Sequence conflict  127          E → V in AAB61614 (PubMed:10449915). Curated        1
Sequence conflict  138          P → L in AAB61614 (PubMed:10449915). Curated        1
Sequence conflict  227          Q → H in AAB61614 (PubMed:10449915). Curated        1
Sequence conflict  242          D → H in AAB61614 (PubMed:10449915). Curated        1
Sequence conflict  260          E → K in AAB61614 (PubMed:10449915). Curated        1
Sequence conflict  275          P → S in AAB61614 (PubMed:10449915). Curated        1
Sequence conflict  282          T → I in AAB61614 (PubMed:10449915). Curated        1
Sequence conflict  414          F → L in AAB61614 (PubMed:10449915). Curated        1
Sequence conflict  466          R → Q in AFM56043 (PubMed:22433848). Curated        1
Sequence conflict  484          I → V in AFM56043 (PubMed:22433848). Curated        1
Sequence conflict  814 – 815    WQ → CE in AAB61614 (PubMed:10449915). Curated      2
Sequence conflict  817          R → H in AAH52966 (PubMed:15489334). Curated        1

Alternative sequence

Feature key           Identifier  Position(s)  Description                                                        Length
Alternative sequence  VSP_044674  1 – 460      Missing in isoform 5. 1 Publication                                460
Alternative sequence  VSP_011770  1 – 108      Missing in isoform 3. 2 Publications                               108
Alternative sequence  VSP_011769  1 – 82       Missing in isoform 2. 2 Publications                               82
Alternative sequence  VSP_044675  1 – 15       MNAGP…TKRPR → MVQAGAEKDAQSISL in isoform 4. 1 Publication          15
Alternative sequence  VSP_044676  16 – 423     Missing in isoform 4. 1 Publication                                408
Alternative sequence  VSP_044677  485 – 526    RVDLL…PVEDE → W in isoform 4 and isoform 5. 2 Publications         42

Sequence databases

EMBL / GenBank / DDBJ:
AY258587 mRNA. Translation: AAP83314.1.
AY575848 mRNA. Translation: AAT66421.1.
AY575849 mRNA. Translation: AAT66422.1.
AF005043 mRNA. Translation: AAB61614.1.
EF382674 mRNA. Translation: ABR10027.1.
JQ890226 mRNA. Translation: AFM56043.1.
AK295786 mRNA. Translation: BAG58607.1.
AK302560 mRNA. Translation: BAG63826.1.
AK314909 mRNA. Translation: BAG37421.1.
BC050560 mRNA. Translation: AAH50560.1.
BC052966 mRNA. Translation: AAH52966.1.
CCDS: CCDS73130.1. [Q86W56-1]
RefSeq: NP_001289706.1. NM_001302777.1.
        NP_001290415.1. NM_001303486.1. [Q86W56-2]
        NP_001290416.1. NM_001303487.1. [Q86W56-3]
        NP_001311310.1. NM_001324381.1. [Q86W56-2]
        NP_003622.2. NM_003631.3. [Q86W56-1]
UniGene: Hs.10136.
         Hs.535298.
         Hs.732225.

Genome annotation databases

Ensembl: ENST00000402038; ENSP00000384408; ENSG00000227345. [Q86W56-1]
         ENST00000616448; ENSP00000484285; ENSG00000227345. [Q86W56-1]
GeneID: 670.
        8505.
KEGG: hsa:8505.
UCSC: uc057tfe.1. human. [Q86W56-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 670.
     8505.
DisGeNET: 8505.
GeneCards: PARG.
H-InvDB: HIX0127080.
HGNC: HGNC:8605. PARG.
HPA: HPA021819.
     HPA053007.
MIM: 603501. gene.
neXtProt: NX_Q86W56.
OpenTargets: ENSG00000227345.
PharmGKB: PA32940.

Miscellaneous databases

ChiTaRS: PARG. human.
GeneWiki: PARG.
PMAP-CutDB: Q86W56.
PRO: Q86W56.

Entry information

Entry name: PARG_HUMAN
Accession: Primary (citable) accession number: Q86W56
Secondary accession number(s): A5YBK3, B2RC24, B4DIU5, B4DYR4, I6RUV3, Q6E4P6, Q6E4P7, Q7Z742, Q9Y4W7
Entry history
Integrated into UniProtKB/Swiss-Prot: October 25, 2004
Last sequence update: June 1, 2003
Last modified: November 30, 2016
This is version 126 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Glycosyl hydrolases
    Classification of glycosyl hydrolase families and list of entries
  2. Human chromosome 10
    Human chromosome 10: entries, gene names and cross-references to MIM
  3. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  4. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
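The UniRef90 and UniRef50 rules above each combine an identity threshold with an 80% overlap requirement against the seed sequence. As a rough illustration of that two-part test only (this is not UniRef's clustering procedure, and the identity and overlap values below are hypothetical), a membership check could look like this:

```python
def qualifies_for_cluster(identity: float, overlap: float,
                          min_identity: float,
                          min_overlap: float = 0.80) -> bool:
    """Apply the two thresholds quoted in the text to one candidate.

    identity and overlap are fractions in [0, 1], both measured against the
    seed (longest) sequence; how they are computed is outside this sketch.
    """
    return identity >= min_identity and overlap >= min_overlap


# Hypothetical alignment results for candidates compared with one seed.
print(qualifies_for_cluster(0.93, 0.85, min_identity=0.90))  # 90% rule
print(qualifies_for_cluster(0.62, 0.85, min_identity=0.90))  # 90% rule
print(qualifies_for_cluster(0.62, 0.85, min_identity=0.50))  # 50% rule
print(qualifies_for_cluster(0.93, 0.70, min_identity=0.50))  # overlap too low
```

Both conditions must hold, so a highly identical but short fragment (last call) is rejected at either level.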