Protein

Neuropathy target esterase

Gene

Pnpla6

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at transcript level

Function

Phospholipase B that deacylates intracellular phosphatidylcholine (PtdCho), generating glycerophosphocholine (GroPtdCho). This deacylation occurs at both the sn-2 and sn-1 positions of PtdCho. Specific chemical modification of the enzyme by certain organophosphorus (OP) compounds leads to distal axonopathy. (2 publications)

Catalytic activity

2-lysophosphatidylcholine + H2O = glycerophosphocholine + a carboxylate.

Enzyme regulation

Inhibited by a series of OPs such as mipafox (MPX), phenyl saligenin phosphate (PSP), phenyl dipentyl phosphinate (PDPP), diisopropyl fluorophosphate and paraoxon. (1 publication)

Sites

Feature key  Position(s)  Length  Description
Active site  994 – 994    1       (By similarity)

Regions

Feature key         Position(s)  Length  Description
Nucleotide binding  179 – 306    128     cNMP 1
Nucleotide binding  492 – 614    123     cNMP 2
Nucleotide binding  610 – 730    121     cNMP 3
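
The Length column in these feature tables is the inclusive span, end - start + 1. A minimal Python sketch, using only the three cNMP regions listed above, checks that arithmetic and reports any regions whose coordinates overlap. The `Feature` type and the helper names are illustrative assumptions, not part of any UniProt tooling.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """Hypothetical container for one feature row (illustration only)."""
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def length(f: Feature) -> int:
    """Inclusive length of a feature on the canonical sequence."""
    return f.end - f.start + 1


def overlaps(a: Feature, b: Feature) -> bool:
    """True if two inclusive ranges share at least one residue."""
    return a.start <= b.end and b.start <= a.end


# Coordinates copied from the Regions table of this entry.
regions = [
    Feature("cNMP 1", 179, 306),
    Feature("cNMP 2", 492, 614),
    Feature("cNMP 3", 610, 730),
]

lengths = {f.name: length(f) for f in regions}
overlapping = [
    (a.name, b.name)
    for i, a in enumerate(regions)
    for b in regions[i + 1:]
    if overlaps(a, b)
]

print(lengths)
print(overlapping)
```

With the coordinates as listed, the computed lengths agree with the table, and cNMP 2 and cNMP 3 share residues 610 to 614.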

GO - Molecular function

  • carboxylic ester hydrolase activity (Source: MGI)
  • lysophospholipase activity (Source: GO_Central)

GO - Biological process

  • angiogenesis (Source: MGI)
  • lipid catabolic process (Source: UniProtKB-KW)
  • organ morphogenesis (Source: MGI)
  • phosphatidylcholine metabolic process (Source: InterPro)

Keywords - Molecular function

Hydrolase

Keywords - Biological process

Lipid degradation, Lipid metabolism

Names & Taxonomy

Protein names
Recommended name:
Neuropathy target esterase (EC:3.1.1.5)
Alternative name(s):
Patatin-like phospholipase domain-containing protein 6
Gene names
Name: Pnpla6
Synonyms: Nte
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes: UP000000589. Component: Chromosome 8

Organism-specific databases

MGI: MGI:1354723. Pnpla6.

Subcellular location

Topology

Feature key         Position(s)  Length  Description
Topological domain  1 – 43       43      Lumenal (Sequence analysis)
Transmembrane       44 – 64      21      Helical (Sequence analysis)
Topological domain  65 – 1355    1291    Cytoplasmic (Sequence analysis)

Keywords - Cellular component

Endoplasmic reticulum, Membrane

PTM / Processing

Molecule processing

Feature key  Position(s)  Length  Feature identifier  Description
Chain        1 – 1355     1355    PRO_0000292200      Neuropathy target esterase

Amino acid modifications

Feature key       Position(s)  Length  Description
Glycosylation     9 – 9        1       N-linked (GlcNAc...) (Sequence analysis)
Modified residue  338 – 338    1       Phosphoserine (By similarity)
Modified residue  345 – 345    1       Phosphothreonine (By similarity)
Modified residue  346 – 346    1       Phosphoserine (By similarity)
Modified residue  405 – 405    1       Phosphoserine (By similarity)

Post-translational modification

Glycosylated. (By similarity)

Keywords - PTM

Glycoprotein, Phosphoprotein

Proteomic databases

MaxQB: Q3TRM4.
PaxDb: Q3TRM4.
PRIDE: Q3TRM4.

PTM databases

PhosphoSite: Q3TRM4.

Expression

Tissue specificity

Expressed ubiquitously in the brain of young mice. In adults, expression is most prominent in Purkinje cells, granule cells and pyramidal neurons of the hippocampus, and in some large neurons of the medulla oblongata, nucleus dentatus and pons. (1 publication)

Developmental stage

During development, expressed in the embryonic respiratory system and in various epithelial structures, and strongly in the spinal ganglia. (1 publication)

Gene expression databases

Bgee: Q3TRM4.
CleanEx: MM_PNPLA6.
Genevisible: Q3TRM4. MM.

Interaction

Protein-protein interaction databases

BioGrid: 206099. 1 interaction.
STRING: 10090.ENSMUSP00000004681.

Structure

3D structure databases

ProteinModelPortal: Q3TRM4.
SMR: Q3TRM4. Positions 517-704.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Description
Domain       961 – 1127   167     Patatin

Motif

Feature key  Position(s)  Length  Description
Motif        992 – 996    5       GXSXG
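
The GXSXG motif position can be checked against the canonical sequence given in this entry. The sketch below is a minimal illustration, assuming X stands for any residue. It scans residues 951-1000 of isoform 1 (copied from the sequence block of this entry) with a regular expression and reports hits in 1-based sequence coordinates. The function and constant names are hypothetical.

```python
import re

# Residues 951-1000 of the canonical sequence (isoform 1), copied from this entry.
SEGMENT_START = 951
SEGMENT = "ARVLTGNTIALVLGGGGARGCSHIGVLKALEEAGVPVDLVGGTSIGSFIG"

# GXSXG with X as any residue; the lookahead allows overlapping hits to be found.
GXSXG = re.compile(r"(?=(G.S.G))")


def find_motif(segment: str, offset: int) -> list[tuple[int, int, str]]:
    """Return (start, end, matched residues) in 1-based sequence coordinates."""
    hits = []
    for m in GXSXG.finditer(segment):
        start = offset + m.start()
        hits.append((start, start + 4, m.group(1)))
    return hits


hits = find_motif(SEGMENT, SEGMENT_START)
print(hits)
```

In this segment the only hit spans residues 992 to 996, and its central serine falls at position 994, the residue annotated as the active site.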

Sequence similarities

Belongs to the NTE family. (Curated)
Contains 3 cyclic nucleotide-binding domains. (PROSITE-ProRule annotation)
Contains 1 patatin domain. (Curated)

Keywords - Domain

Repeat, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: COG0664.
GeneTree: ENSGT00390000002533.
HOGENOM: HOG000016081.
HOVERGEN: HBG053067.
InParanoid: Q3TRM4.
KO: K14676.
OMA: QQLQGPF.
OrthoDB: EOG7QRQT1.
PhylomeDB: Q3TRM4.
TreeFam: TF300519.

Family and domain databases

Gene3D: 2.60.120.10. 3 hits.
InterPro: IPR016035. Acyl_Trfase/lysoPLipase.
  IPR018490. cNMP-bd-like.
  IPR000595. cNMP-bd_dom.
  IPR001423. LysoPLipase_patatin_CS.
  IPR002641. Patatin/PLipase_A2-rel.
  IPR014710. RmlC-like_jellyroll.
Pfam: PF00027. cNMP_binding. 3 hits.
  PF01734. Patatin. 1 hit.
SMART: SM00100. cNMP. 3 hits.
SUPFAM: SSF51206. SSF51206. 3 hits.
  SSF52151. SSF52151. 1 hit.
PROSITE: PS50042. CNMP_BINDING_3. 3 hits.
  PS01237. UPF0028. 1 hit.

Sequences (4)

Sequence status: Complete.

This entry describes 4 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q3TRM4-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MGTPSHELNT TSSGAEVIQK TLEEGLGRRI CVAQPVPFVP QVLGVMIGAG
60 70 80 90 100
VAVLVTAVLI LLVVRRLRVQ KTPAPEGPRY RFRKRDKVLF YGRKIMRKVS
110 120 130 140 150
QSTSSLVDTS VSTTSRPRMK KKLKMLNIAK KILRIQKETP TLQRKEPPPS
160 170 180 190 200
VLEADLTEGD LANSHLPSEV LYMLKNVRVL GHFEKPLFLE LCRHMVFQRL
210 220 230 240 250
GQGDYVFRPG QPDASIYVVQ DGLLELCLPG PDGKECVVKE VVPGDSVNSL
260 270 280 290 300
LSILDVITGH QHPQRTVSAR AARDSTVLRL PVEAFSAVFT KYPESLVRVV
310 320 330 340 350
QIIMVRLQRV TFLALHNYLG LTNELFSHEI QPLRLFPSPG LPTRTSPVRG
360 370 380 390 400
SKRVVSTSGT EDTSKETSGR PLDSIGAPLP GPAGDPVKPT SLEAPPAPLL
410 420 430 440 450
SRCISMPVDI SGLQGGPRSD FDMAYERGRI SVSLQEEASG GPQTASPREL
460 470 480 490 500
REQPAGACEY SYCEDESATG GCPFGPYQGR QTSSIFEAAK RELAKLMRIE
510 520 530 540 550
DPSLLNSRVL LHHAKAGTII ARQGDQDVSL HFVLWGCLHV YQRMIDKAEE
560 570 580 590 600
VCLFVAQPGE LVGQLAVLTG EPLIFTLRAQ RDCTFLRISK SHFYEIMRAQ
610 620 630 640 650
PSVVLSAAHT VAARMSPFVR QMDFAIDWTA VEAGRALYRQ GDRSDCTYIV
660 670 680 690 700
LNGRLRSVIQ RGSGKKELVG EYGRGDLIGV VEALTRQPRA TTVHAVRDTE
710 720 730 740 750
LAKLPEGTLG HIKRRYPQVV TRLIHLLSQK ILGNLQQLQG PFPGSGLSVP
760 770 780 790 800
QHSELTNPAS NLSTVAILPV CAEVPMMAFT LELQHALQAI GPTLLLNSDV
810 820 830 840 850
IRALLGASAL DSIQEFRLSG WLAQQEDAHR IVLYQTDTSL TPWTVRCLRQ
860 870 880 890 900
ADCILIVGLG DQEPTVGQLE QMLENTAVRA LKQLVLLHRE EGPGPTRTVE
910 920 930 940 950
WLNMRSWCSG HLHLRCPRRL FSRRSPAKLH ELYEKVFSRR ADRHSDFSRL
960 970 980 990 1000
ARVLTGNTIA LVLGGGGARG CSHIGVLKAL EEAGVPVDLV GGTSIGSFIG
1010 1020 1030 1040 1050
ALYAEERSAS RTKQRAREWA KSMTSVLEPV LDLTYPVTSM FTGSAFNRSI
1060 1070 1080 1090 1100
HRVFQDKQIE DLWLPYFNVT TDITASAMRV HKDGSLWRYV RASMTLSGYL
1110 1120 1130 1140 1150
PPLCDPKDGH LLMDGGYINN LPADIARSMG AKTVIAIDVG SQDETDLSTY
1160 1170 1180 1190 1200
GDSLSGWWLL WKRLNPWADK VKVPDMAEIQ SRLAYVSCVR QLEVVKSSSY
1210 1220 1230 1240 1250
CEYLRPSIDC FKTMDFGKFD QIYDVGYQYG KAVFGGWTRG EVIEKMLTDR
1260 1270 1280 1290 1300
RSTDLNESRR ADILAFPSSG FTDLAEIVSR IEPPTSYVSD GCADGEESDC
1310 1320 1330 1340 1350
LTEYEEDAGP DCSRDEGGSP EGASPSTASE VEEEKSTLRQ RRFLPQETPS

SVADA
Length: 1,355
Mass (Da): 149,537
Last modified: June 26, 2007 - v2
Checksum: 813263C6A82083ED

Isoform 2 (identifier: Q3TRM4-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-41: MGTPSHELNTTSSGAEVIQKTLEEGLGRRICVAQPVPFVPQ → MEAPLQTGM

Note: No experimental confirmation available.
Length: 1,323
Mass (Da): 146,136
Checksum: B27BC1F45801A2A4

Isoform 3 (identifier: Q3TRM4-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1-41: MGTPSHELNTTSSGAEVIQKTLEEGLGRRICVAQPVPFVPQ → MEAPLQTGM
     448-448: R → RTPTQ

Note: No experimental confirmation available.
Length: 1,327
Mass (Da): 146,563
Checksum: 61DC2256B2C9C4F8

Isoform 4 (identifier: Q3TRM4-4)

The sequence of this isoform differs from the canonical sequence as follows:
     1123-1169: ADIARSMGAK...LWKRLNPWAD → GKWLPTHICM...RGHTQLCTEL
     1170-1355: Missing.

Note: No experimental confirmation available.
Length: 1,170
Mass (Da): 129,347
Checksum: 34CC6F6A8FEE4B6D
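
The isoform lengths listed above follow from the canonical length and the described differences: each replaced range removes (end - start + 1) residues and adds the length of its replacement. The sketch below reproduces that arithmetic as a rough check. The data structure and function name are illustrative assumptions, and the 48-residue replacement length for isoform 4 is counted from the full replacement sequence given in the Alternative sequence table of this entry.

```python
CANONICAL_LENGTH = 1355  # length of isoform 1 (Q3TRM4-1)

# Each edit is (start, end, replacement_length); positions are 1-based and
# inclusive on the canonical sequence. A replacement length of 0 means the
# range is missing from the isoform.
ISOFORM_EDITS = {
    "Q3TRM4-2": [(1, 41, 9)],                          # 1-41 -> MEAPLQTGM
    "Q3TRM4-3": [(1, 41, 9), (448, 448, 5)],           # plus R -> RTPTQ
    "Q3TRM4-4": [(1123, 1169, 48), (1170, 1355, 0)],   # replaced, then missing
}


def isoform_length(canonical_length: int, edits) -> int:
    """Apply each edit's net change in residue count to the canonical length."""
    total = canonical_length
    for start, end, replacement_length in edits:
        total += replacement_length - (end - start + 1)
    return total


lengths = {
    isoform: isoform_length(CANONICAL_LENGTH, edits)
    for isoform, edits in ISOFORM_EDITS.items()
}
print(lengths)
```

The computed values agree with the lengths reported for isoforms 2, 3 and 4 in this entry.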

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  240 – 240    1       E → K in AAD51700 (PubMed:10640712). (Curated)

Alternative sequence

Feature key           Position(s)   Length  Feature identifier  Description
Alternative sequence  1 – 41        41      VSP_026390          MGTPS…PFVPQ → MEAPLQTGM in isoform 2 and isoform 3. (2 publications)
Alternative sequence  448 – 448     1       VSP_026391          R → RTPTQ in isoform 3. (1 publication)
Alternative sequence  1123 – 1169   47      VSP_026392          ADIAR…NPWAD → GKWLPTHICMDTYHQTHAHTDFCTCRLEGTGLYEWRSSRGHTQLCTEL in isoform 4. (1 publication)
Alternative sequence  1170 – 1355   186     VSP_026393          Missing in isoform 4. (1 publication)

Sequence databases

EMBL / GenBank / DDBJ:
  AF173829 mRNA. Translation: AAD51700.1.
  AK162641 mRNA. Translation: BAE37004.1.
  AC170806 Genomic DNA. No translation available.
  BC054789 mRNA. Translation: AAH54789.1.
  BC056999 mRNA. Translation: AAH56999.1.
CCDS: CCDS40206.1. [Q3TRM4-3]
RefSeq: NP_001116290.2. NM_001122818.2.
  NP_056616.2. NM_015801.2. [Q3TRM4-3]
  XP_006508887.1. XM_006508824.2. [Q3TRM4-1]
UniGene: Mm.23085.

Genome annotation databases

Ensembl: ENSMUST00000004681; ENSMUSP00000004681; ENSMUSG00000004565. [Q3TRM4-3]
  ENSMUST00000111070; ENSMUSP00000106699; ENSMUSG00000004565. [Q3TRM4-3]
GeneID: 50767.
KEGG: mmu:50767.
UCSC: uc009krr.1. mouse. [Q3TRM4-3]
  uc009krs.1. mouse. [Q3TRM4-2]
  uc009krt.1. mouse. [Q3TRM4-4]
  uc009kru.1. mouse. [Q3TRM4-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Chemistry

ChEMBL: CHEMBL3259506.

Organism-specific databases

CTD: 10908.
MGI: MGI:1354723. Pnpla6.

Miscellaneous databases

NextBio: 307675.
PRO: Q3TRM4.

Publications

  1. "Cloning and expression of the murine sws/NTE gene."
    Moser M., Stempfl T., Li Y., Glynn P., Buttner R., Kretzschmar D.
    Mech. Dev. 90:279-282(2000)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 3), TISSUE SPECIFICITY, DEVELOPMENTAL STAGE.
    Strain: BALB/c.
  2. "The transcriptional landscape of the mammalian genome."
    Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N., Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K., Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J.
    , Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R., Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T., Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A., Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B., Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M., Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S., Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E., Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D., Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M., Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H., Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V., Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S., Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H., Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N., Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F., Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G., Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z., Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C., Schoenbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y., Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S., Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K., Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R., van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H., Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M., Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C., Mattick J.S., Hume D.A., Kai C., Sasaki D., Tomaru Y., Fukuda S., Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K., Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M., Kondo S., Konno H., Nakano K., 
Ninomiya N., Nishio T., Okada M., Plessy C., Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A., Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y.
    Science 309:1559-1563(2005)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 4).
    Strain: C57BL/6J.
    Tissue: Bone.
  3. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: C57BL/6J.
  4. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).
    Strain: C57BL/6.
    Tissue: Brain.
  5. "Loss of neuropathy target esterase in mice links organophosphate exposure to hyperactivity."
    Winrow C.J., Hemming M.L., Allen D.M., Quistad G.B., Casida J.E., Barlow C.
    Nat. Genet. 33:477-485(2003)
    Cited for: FUNCTION.
  6. "Phospholipase B activity and organophosphorus compound toxicity in cultured neural cells."
    Read D.J., Langford L., Barbour H.R., Forshaw P.J., Glynn P.
    Toxicol. Appl. Pharmacol. 219:190-195(2007)
    Cited for: FUNCTION, ENZYME REGULATION, SUBCELLULAR LOCATION.

Entry information

Entry name: PLPL6_MOUSE
Accession: Primary (citable) accession number: Q3TRM4
Secondary accession number(s): Q7TQD6, Q9R114
Entry history
Integrated into UniProtKB/Swiss-Prot: June 26, 2007
Last sequence update: June 26, 2007
Last modified: July 22, 2015
This is version 87 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.