Protein

Hydroxymethylglutaryl-CoA synthase, mitochondrial

Gene

HMGCS2

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

This enzyme condenses acetyl-CoA with acetoacetyl-CoA to form HMG-CoA, which is the substrate for HMG-CoA reductase.

Catalytic activity

Acetyl-CoA + H2O + acetoacetyl-CoA = (S)-3-hydroxy-3-methylglutaryl-CoA + CoA. (PROSITE-ProRule annotation)

Pathway: (R)-mevalonate biosynthesis

This protein is involved in step 2 of the subpathway that synthesizes (R)-mevalonate from acetyl-CoA.
Proteins known to be involved in the 3 steps of the subpathway in this organism are:
  1. no protein annotated in this organism
  2. Hydroxymethylglutaryl-CoA synthase, cytoplasmic (HMGCS1), Hydroxymethylglutaryl-CoA synthase, mitochondrial (HMGCS2)
  3. 3-hydroxy-3-methylglutaryl coenzyme A reductase, 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGCR), 3-hydroxy-3-methylglutaryl-coenzyme A reductase (HMGCR), 3-hydroxy-3-methylglutaryl coenzyme A reductase
This subpathway is part of the pathway (R)-mevalonate biosynthesis, which is itself part of Metabolic intermediate biosynthesis.

Sites

Feature key | Position(s) | Description | Length
Binding site | 80 | Substrate | 1
Active site | 132 | Proton donor/acceptor (1 publication) | 1
Active site | 166 | Acyl-thioester intermediate (PROSITE-ProRule annotation; 1 publication) | 1
Binding site | 204 | Substrate | 1
Binding site | 258 | Substrate | 1
Active site | 301 | Proton donor/acceptor (1 publication) | 1
Binding site | 380 | Substrate | 1

Keywords - Molecular function

Transferase

Keywords - Biological process

Cholesterol biosynthesis, Cholesterol metabolism, Lipid biosynthesis, Lipid metabolism, Steroid biosynthesis, Steroid metabolism, Sterol biosynthesis, Sterol metabolism

Enzyme and pathway databases

BioCyc: MetaCyc:HS05836-MONOMER.
BioCyc: ZFISH:HS05836-MONOMER.
BRENDA: 2.3.3.10. 2681.
Reactome: R-HSA-1989781. PPARA activates gene expression.
Reactome: R-HSA-77111. Synthesis of Ketone Bodies.
UniPathway: UPA00058; UER00102.

Chemistry databases

SwissLipids: SLP:000001249. [P54868-1]

Names & Taxonomy

Protein names
Recommended name: Hydroxymethylglutaryl-CoA synthase, mitochondrial (EC:2.3.3.10)
Short name: HMG-CoA synthase
Alternative name(s): 3-hydroxy-3-methylglutaryl coenzyme A synthase
Gene names
Name: HMGCS2
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 1

Organism-specific databases

HGNC: HGNC:5008. HMGCS2.

Subcellular location

Keywords - Cellular component

Mitochondrion

Pathology & Biotech

Involvement in disease

3-hydroxy-3-methylglutaryl-CoA synthase-2 deficiency (HMGCS2D) (3 publications)
The disease is caused by mutations affecting the gene represented in this entry.
Disease description: A metabolic disorder characterized by severe hypoketotic hypoglycemia, encephalopathy, and hepatomegaly.
See also OMIM:605911
Feature key | Position(s) | Description | Length
Natural variant (VAR_032757) | 54 | V → M in HMGCS2D. 1 publication. Corresponds to variant rs28937320 (dbSNP, Ensembl). | 1
Natural variant (VAR_032758) | 167 | Y → C in HMGCS2D. 1 publication. Corresponds to variant rs137852640 (dbSNP, Ensembl). | 1
Natural variant (VAR_032711) | 174 | F → L in HMGCS2D; reduced peptide level; no enzymatic activity. 1 publication. Corresponds to variant rs137852636 (dbSNP, Ensembl). | 1
Natural variant (VAR_032759) | 212 | G → R in HMGCS2D. 1 publication. Corresponds to variant rs137852638 (dbSNP, Ensembl). | 1
Natural variant (VAR_032760) | 500 | R → H in HMGCS2D. 1 publication. Corresponds to variant rs137852639 (dbSNP, Ensembl). | 1

Keywords - Disease

Disease mutation

Organism-specific databases

DisGeNET: 3158.
MalaCards: HMGCS2.
MIM: 605911. phenotype.
OpenTargets: ENSG00000134240.
Orphanet: 35701. 3-hydroxy-3-methylglutaryl-CoA synthase deficiency.
PharmGKB: PA29338.

Polymorphism and mutation databases

BioMuta: HMGCS2.
DMDM: 1708234.

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Length
Transit peptide | 1 – 37 | Mitochondrion (Curated) | 37
Chain (PRO_0000013483) | 38 – 508 | Hydroxymethylglutaryl-CoA synthase, mitochondrial | 471

Amino acid modifications

Feature key | Position(s) | Description | Length
Modified residue | 52 | N6-succinyllysine (By similarity) | 1
Modified residue | 83 | N6-acetyllysine; alternate (By similarity) | 1
Modified residue | 83 | N6-succinyllysine; alternate (By similarity) | 1
Modified residue | 221 | N6-succinyllysine (By similarity) | 1
Modified residue | 243 | N6-acetyllysine (By similarity) | 1
Modified residue | 256 | N6-acetyllysine; alternate (By similarity) | 1
Modified residue | 256 | N6-succinyllysine; alternate (By similarity) | 1
Modified residue | 306 | N6-acetyllysine (By similarity) | 1
Modified residue | 310 | N6-acetyllysine; alternate (By similarity) | 1
Modified residue | 310 | N6-succinyllysine; alternate (By similarity) | 1
Modified residue | 333 | N6-succinyllysine (By similarity) | 1
Modified residue | 342 | N6-acetyllysine; alternate (By similarity) | 1
Modified residue | 342 | N6-succinyllysine; alternate (By similarity) | 1
Modified residue | 350 | N6-acetyllysine; alternate (By similarity) | 1
Modified residue | 350 | N6-succinyllysine; alternate (By similarity) | 1
Modified residue | 354 | N6-acetyllysine; alternate (By similarity) | 1
Modified residue | 354 | N6-succinyllysine; alternate (By similarity) | 1
Modified residue | 358 | N6-acetyllysine; alternate (By similarity) | 1
Modified residue | 358 | N6-succinyllysine; alternate (By similarity) | 1
Modified residue | 433 | Phosphoserine (Combined sources) | 1
Modified residue | 437 | N6-acetyllysine (By similarity) | 1
Modified residue | 440 | Phosphoserine (Combined sources) | 1
Modified residue | 447 | N6-acetyllysine; alternate (By similarity) | 1
Modified residue | 447 | N6-succinyllysine; alternate (By similarity) | 1
Modified residue | 456 | Phosphoserine (By similarity) | 1
Modified residue | 473 | N6-acetyllysine; alternate (By similarity) | 1
Modified residue | 473 | N6-succinyllysine; alternate (By similarity) | 1
Modified residue | 477 | Phosphoserine (By similarity) | 1

Post-translational modification

Succinylated. Desuccinylated by SIRT5. Succinylation, at least at Lys-83 and Lys-310, inhibits the enzymatic activity. (By similarity)

Keywords - PTM

Acetylation, Phosphoprotein

Proteomic databases

MaxQB: P54868.
PaxDb: P54868.
PeptideAtlas: P54868.
PRIDE: P54868.

2D gel databases

REPRODUCTION-2DPAGE: IPI00008934.

PTM databases

iPTMnet: P54868.
PhosphoSitePlus: P54868.
SwissPalm: P54868.

Expression

Tissue specificity

Expression in liver is 200-fold higher than in any other tissue. Low expression in colon, kidney, testis, and pancreas. Very low expression in heart and skeletal muscle. Not detected in brain. The relative expression of isoform 3 (at mRNA level) is highest in heart (70%) and skeletal muscle (60%). (3 publications)

Gene expression databases

Bgee: ENSG00000134240.
CleanEx: HS_HMGCS2.
ExpressionAtlas: P54868. baseline and differential.
Genevisible: P54868. HS.

Organism-specific databases

HPA: CAB032906.
HPA: HPA027423.
HPA: HPA027442.

Interaction

Subunit structure

Homodimer. (1 publication)

Protein-protein interaction databases

BioGrid: 109401. 5 interactors.
IntAct: P54868. 1 interactor.
STRING: 9606.ENSP00000358414.

Structure

Secondary structure

Feature key | Position(s) | Description | Length
Beta strand | 55 – 62 | Combined sources | 8
Beta strand | 65 – 69 | Combined sources | 5
Helix | 70 – 76 | Combined sources | 7
Helix | 83 – 88 | Combined sources | 6
Beta strand | 92 – 94 | Combined sources | 3
Helix | 102 – 117 | Combined sources | 16
Helix | 121 – 123 | Combined sources | 3
Beta strand | 124 – 130 | Combined sources | 7
Beta strand | 137 – 139 | Combined sources | 3
Helix | 141 – 145 | Combined sources | 5
Helix | 146 – 148 | Combined sources | 3
Helix | 150 – 152 | Combined sources | 3
Beta strand | 161 – 164 | Combined sources | 4
Helix | 165 – 167 | Combined sources | 3
Helix | 168 – 180 | Combined sources | 13
Beta strand | 189 – 198 | Combined sources | 10
Helix | 206 – 208 | Combined sources | 3
Beta strand | 210 – 221 | Combined sources | 12
Beta strand | 223 – 226 | Combined sources | 4
Beta strand | 232 – 235 | Combined sources | 4
Beta strand | 240 – 242 | Combined sources | 3
Helix | 255 – 283 | Combined sources | 29
Helix | 292 – 294 | Combined sources | 3
Beta strand | 296 – 300 | Combined sources | 5
Helix | 305 – 322 | Combined sources | 18
Helix | 325 – 331 | Combined sources | 7
Helix | 333 – 338 | Combined sources | 6
Helix | 343 – 347 | Combined sources | 5
Helix | 350 – 367 | Combined sources | 18
Helix | 369 – 372 | Combined sources | 4
Helix | 373 – 378 | Combined sources | 6
Helix | 382 – 384 | Combined sources | 3
Helix | 385 – 396 | Combined sources | 12
Helix | 399 – 402 | Combined sources | 4
Beta strand | 406 – 413 | Combined sources | 8
Turn | 414 – 416 | Combined sources | 3
Beta strand | 417 – 425 | Combined sources | 9
Helix | 434 – 440 | Combined sources | 7
Turn | 441 – 444 | Combined sources | 4
Helix | 445 – 450 | Combined sources | 6
Beta strand | 452 – 455 | Combined sources | 4
Helix | 457 – 470 | Combined sources | 14
Helix | 482 – 484 | Combined sources | 3
Beta strand | 490 – 495 | Combined sources | 6
Beta strand | 501 – 505 | Combined sources | 5

3D structure databases

PDB entry | Method | Resolution (Å) | Chain | Positions
2WYA | X-ray | 1.70 | A/B/C/D | 51-508
ProteinModelPortal: P54868.
SMR: P54868.

Miscellaneous databases

EvolutionaryTrace: P54868.

Family & Domains

Sequence similarities

Belongs to the HMG-CoA synthase family. (Curated)

Keywords - Domain

Transit peptide

Phylogenomic databases

eggNOG: KOG1393. Eukaryota.
eggNOG: COG3425. LUCA.
GeneTree: ENSGT00390000006096.
HOGENOM: HOG000012351.
HOVERGEN: HBG051912.
InParanoid: P54868.
KO: K01641.
OMA: QWKQAGS.
OrthoDB: EOG091G0791.
PhylomeDB: P54868.
TreeFam: TF105361.

Family and domain databases

Gene3D: 3.40.47.10. 1 hit.
InterPro: IPR000590. HMG_CoA_synt_AS.
InterPro: IPR013746. HMG_CoA_synt_C_dom.
InterPro: IPR013528. HMG_CoA_synth_N.
InterPro: IPR010122. HMG_CoA_synthase_euk.
InterPro: IPR016039. Thiolase-like.
Pfam: PF08540. HMG_CoA_synt_C. 1 hit.
Pfam: PF01154. HMG_CoA_synt_N. 1 hit.
SUPFAM: SSF53901. SSF53901. 3 hits.
TIGRFAMs: TIGR01833. HMG-CoA-S_euk. 1 hit.
PROSITE: PS01226. HMG_COA_SYNTHASE. 1 hit.

Sequences (3)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: P54868-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MQRLLTPVKR ILQLTRAVQE TSLTPARLLP VAHQRFSTAS AVPLAKTDTW
        60         70         80         90        100
PKDVGILALE VYFPAQYVDQ TDLEKYNNVE AGKYTVGLGQ TRMGFCSVQE
       110        120        130        140        150
DINSLCLTVV QRLMERIQLP WDSVGRLEVG TETIIDKSKA VKTVLMELFQ
       160        170        180        190        200
DSGNTDIEGI DTTNACYGGT ASLFNAANWM ESSSWDGRYA MVVCGDIAVY
       210        220        230        240        250
PSGNARPTGG AGAVAMLIGP KAPLALERGL RGTHMENVYD FYKPNLASEY
       260        270        280        290        300
PIVDGKLSIQ CYLRALDRCY TSYRKKIQNQ WKQAGSDRPF TLDDLQYMIF
       310        320        330        340        350
HTPFCKMVQK SLARLMFNDF LSASSDTQTS LYKGLEAFGG LKLEDTYTNK
       360        370        380        390        400
DLDKALLKAS QDMFDKKTKA SLYLSTHNGN MYTSSLYGCL ASLLSHHSAQ
       410        420        430        440        450
ELAGSRIGAF SYGSGLAASF FSFRVSQDAA PGSPLDKLVS STSDLPKRLA
       460        470        480        490        500
SRKCVSPEEF TEIMNQREQF YHKVNFSPPG DTNSLFPGTW YLERVDEQHR

RKYARRPV
Length: 508
Mass (Da): 56,635
Last modified: October 1, 1996 - v1
Checksum: BD362D631F7C3C80
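Because all positional information in this entry refers to the canonical sequence, the displayed block can be used to cross-check the annotated positions. The following is a minimal sketch, not a UniProt tool: the helper names (`parse_displayed_sequence`, `residue_at`, `mismatches`, `mature_chain`) are hypothetical, and the expected residues are taken from the natural variant, sequence conflict, and transit peptide annotations of this entry.

```python
# Canonical sequence (isoform 1, P54868-1) as displayed in this entry.
# Spaces and line breaks are display formatting only.
DISPLAYED = """
MQRLLTPVKR ILQLTRAVQE TSLTPARLLP VAHQRFSTAS AVPLAKTDTW
PKDVGILALE VYFPAQYVDQ TDLEKYNNVE AGKYTVGLGQ TRMGFCSVQE
DINSLCLTVV QRLMERIQLP WDSVGRLEVG TETIIDKSKA VKTVLMELFQ
DSGNTDIEGI DTTNACYGGT ASLFNAANWM ESSSWDGRYA MVVCGDIAVY
PSGNARPTGG AGAVAMLIGP KAPLALERGL RGTHMENVYD FYKPNLASEY
PIVDGKLSIQ CYLRALDRCY TSYRKKIQNQ WKQAGSDRPF TLDDLQYMIF
HTPFCKMVQK SLARLMFNDF LSASSDTQTS LYKGLEAFGG LKLEDTYTNK
DLDKALLKAS QDMFDKKTKA SLYLSTHNGN MYTSSLYGCL ASLLSHHSAQ
ELAGSRIGAF SYGSGLAASF FSFRVSQDAA PGSPLDKLVS STSDLPKRLA
SRKCVSPEEF TEIMNQREQF YHKVNFSPPG DTNSLFPGTW YLERVDEQHR
RKYARRPV
"""


def parse_displayed_sequence(block: str) -> str:
    """Remove all whitespace from a grouped sequence display."""
    return "".join(block.split())


def residue_at(seq: str, position: int) -> str:
    """Return the residue at a 1-based position, as used in the entry."""
    return seq[position - 1]


def mismatches(seq: str, expected: dict) -> dict:
    """Map position -> (expected, found) for every disagreement."""
    return {
        pos: (aa, residue_at(seq, pos))
        for pos, aa in expected.items()
        if residue_at(seq, pos) != aa
    }


def mature_chain(seq: str, transit_end: int = 37) -> str:
    """Drop the transit peptide (residues 1 to transit_end)."""
    return seq[transit_end:]


SEQ = parse_displayed_sequence(DISPLAYED)

# Reference residues listed in the variant and sequence conflict tables.
EXPECTED = {54: "V", 167: "Y", 174: "F", 212: "G", 500: "R", 234: "H", 385: "S"}

if __name__ == "__main__":
    print(len(SEQ), mismatches(SEQ, EXPECTED), len(mature_chain(SEQ)))
```

An empty result from `mismatches` means every annotated reference residue agrees with the displayed sequence; the mature chain length should agree with the Chain feature (38 – 508).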
Isoform 2 (identifier: P54868-2)

The sequence of this isoform differs from the canonical sequence as follows:
     187-228: Missing.

Note: No experimental confirmation available.
Length: 466
Mass (Da): 52,482
Checksum: 9D3C527559B74618
Isoform 3 (identifier: P54868-3)
Also known as: HMGCS2delta4

The sequence of this isoform differs from the canonical sequence as follows:
     229-283: Missing.

Length: 453
Mass (Da): 50,050
Checksum: 80C1D1EBB7403EC0

Experimental Info

Feature key | Position(s) | Description | Length
Sequence conflict | 234 | H → Y in CAG33131 (Ref. 3) (Curated) | 1
Sequence conflict | 385 | S → T in CAG33131 (Ref. 3) (Curated) | 1

Alternative sequence

Feature key | Position(s) | Description | Length
Alternative sequence (VSP_042892) | 187 – 228 | Missing in isoform 2. 1 publication | 42
Alternative sequence (VSP_047445) | 229 – 283 | Missing in isoform 3. 1 publication | 55
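The isoform lengths follow from removing the listed spans from the 508-residue canonical sequence. A minimal sketch of that arithmetic is given below; the function names (`span_length`, `isoform_length`, `delete_span`) are hypothetical helpers introduced for illustration, and positions are treated as 1-based and inclusive, as in the feature tables.

```python
CANONICAL_LENGTH = 508

# Missing spans per isoform, taken from the Alternative sequence table
# (1-based, inclusive positions on the canonical sequence).
MISSING = {
    "P54868-2": (187, 228),
    "P54868-3": (229, 283),
}


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive 1-based span."""
    return end - start + 1


def isoform_length(canonical_length: int, start: int, end: int) -> int:
    """Length remaining after one span is removed."""
    return canonical_length - span_length(start, end)


def delete_span(seq: str, start: int, end: int) -> str:
    """Remove an inclusive 1-based span from a sequence string."""
    return seq[: start - 1] + seq[end:]


if __name__ == "__main__":
    for isoform, (start, end) in MISSING.items():
        print(isoform, isoform_length(CANONICAL_LENGTH, start, end))
```

The same `delete_span` helper could be applied to the canonical sequence string to obtain an isoform sequence, assuming each isoform differs only by the single deletion listed here.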

Sequence databases

EMBL / GenBank / DDBJ:
X83618 mRNA. Translation: CAA58593.1.
U81859, U81851, U81852, U81853, U81854, U81855, U81856, U81857, U81858 Genomic DNA. Translation: AAB72036.1.
CR456850 mRNA. Translation: CAG33131.1.
AK303777 mRNA. Translation: BAH14049.1.
AL589734 Genomic DNA. Translation: CAI22408.1.
CH471122 Genomic DNA. Translation: EAW56709.1.
BC044217 mRNA. Translation: AAH44217.1.
U12788 mRNA. Translation: AAA92673.1.
U12789 mRNA. Translation: AAA92674.1.
GU433940 mRNA. Translation: ADD21696.1.
CCDS: CCDS53353.1. [P54868-2]
CCDS: CCDS905.1. [P54868-1]
PIR: S71623.
RefSeq: NP_001159579.1. NM_001166107.1. [P54868-2]
RefSeq: NP_005509.1. NM_005518.3. [P54868-1]
RefSeq: XP_011539615.1. XM_011541313.1. [P54868-3]
UniGene: Hs.59889.

Genome annotation databases

Ensembl: ENST00000369406; ENSP00000358414; ENSG00000134240. [P54868-1]
Ensembl: ENST00000544913; ENSP00000439495; ENSG00000134240. [P54868-2]
GeneID: 3158.
KEGG: hsa:3158.
UCSC: uc001eid.4. human. [P54868-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 3158.
DisGeNET: 3158.
GeneCards: HMGCS2.
H-InvDB: HIX0160007.
HGNC: HGNC:5008. HMGCS2.
HPA: CAB032906.
HPA: HPA027423.
HPA: HPA027442.
MalaCards: HMGCS2.
MIM: 600234. gene.
MIM: 605911. phenotype.
neXtProt: NX_P54868.
OpenTargets: ENSG00000134240.
Orphanet: 35701. 3-hydroxy-3-methylglutaryl-CoA synthase deficiency.
PharmGKB: PA29338.

Miscellaneous databases

EvolutionaryTrace: P54868.
GenomeRNAi: 3158.
PRO: P54868.

Entry information

Entry name: HMCS2_HUMAN
Accession: Primary (citable) accession number: P54868
Secondary accession number(s): B7Z8R3, D3Y5K6, Q5SZU2, Q6IBF4
Entry history
Integrated into UniProtKB/Swiss-Prot: October 1, 1996
Last sequence update: October 1, 1996
Last modified: November 30, 2016
This is version 162 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 1
    Human chromosome 1: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PATHWAY comments
    Index of metabolic and biosynthesis pathways
  6. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  7. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
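
The membership criteria quoted above for UniRef90 and UniRef50 can be written as a simple threshold test. This is only an illustrative sketch of the two stated cut-offs: `UniRefLevel` and `joins_cluster` are hypothetical names, identity and overlap are assumed to be precomputed fractions relative to the seed sequence, and the actual clustering procedure used to build UniRef is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UniRefLevel:
    """Identity and overlap cut-offs for one clustering level."""
    name: str
    min_identity: float  # fraction of identical residues vs. the seed
    min_overlap: float   # fraction of the seed covered by the alignment


# Thresholds as stated in the descriptions above.
UNIREF90 = UniRefLevel("UniRef90", min_identity=0.90, min_overlap=0.80)
UNIREF50 = UniRefLevel("UniRef50", min_identity=0.50, min_overlap=0.80)


def joins_cluster(identity: float, overlap: float, level: UniRefLevel) -> bool:
    """True if a sequence meets both cut-offs relative to the seed."""
    return identity >= level.min_identity and overlap >= level.min_overlap


if __name__ == "__main__":
    # A hypothetical sequence with 60% identity and 90% overlap to a seed.
    print(joins_cluster(0.60, 0.90, UNIREF90))
    print(joins_cluster(0.60, 0.90, UNIREF50))
```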