Protein

Phosphatidylinositol 3,4,5-trisphosphate-dependent Rac exchanger 2 protein

Gene

PREX2

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at transcript level.

Function

Functions as a RAC1 guanine nucleotide exchange factor (GEF), activating Rac proteins by exchanging bound GDP for free GTP. Its activity is synergistically stimulated by phosphatidylinositol 3,4,5-trisphosphate and the beta-gamma subunits of heterotrimeric G proteins. Mediates the activation of RAC1 in a PI3K-dependent manner. May be an important mediator of Rac signaling, acting directly downstream of both G protein-coupled receptors and phosphoinositide 3-kinase. (3 publications)

GO - Molecular function

  • GTPase activator activity Source: MGI
  • Rac guanyl-nucleotide exchange factor activity Source: MGI

Keywords - Molecular function

Guanine-nucleotide releasing factor

Names & Taxonomy

Protein names
Recommended name: Phosphatidylinositol 3,4,5-trisphosphate-dependent Rac exchanger 2 protein
Short name: P-Rex2
Short name: PtdIns(3,4,5)-dependent Rac exchanger 2
Alternative name(s): DEP domain-containing protein 2
Gene names
Name: PREX2
Synonyms: DEPDC2
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 8

Organism-specific databases

HGNC: HGNC:22950. PREX2.

Pathology & Biotech

Organism-specific databases

DisGeNET: 80243.
OpenTargets: ENSG00000046889.
PharmGKB: PA164725103.

Polymorphism and mutation databases

BioMuta: PREX2.
DMDM: 74758897.

PTM / Processing

Molecule processing

Feature key  Identifier      Position(s)  Length  Description
Chain        PRO_0000286795  1 – 1606     1606    Phosphatidylinositol 3,4,5-trisphosphate-dependent Rac exchanger 2 protein

Proteomic databases

EPD: Q70Z35.
MaxQB: Q70Z35.
PaxDb: Q70Z35.
PeptideAtlas: Q70Z35.
PRIDE: Q70Z35.

PTM databases

iPTMnet: Q70Z35.
PhosphoSitePlus: Q70Z35.

Expression

Tissue specificity

Isoform 1 is highly expressed in skeletal muscle, heart and placenta, and is absent from peripheral blood leukocytes. Isoform 2 is expressed in skeletal muscle, kidney, small intestine, and placenta. Isoform 3 is expressed in the heart. (1 publication)

Gene expression databases

Bgee: ENSG00000046889.
CleanEx: HS_PREX2.
Genevisible: Q70Z35. HS.

Organism-specific databases

HPA: HPA015234.

Interaction

Subunit structure

Interacts with RAC1.

Protein-protein interaction databases

BioGrid: 123199. 2 interactors.
IntAct: Q70Z35. 4 interactors.
STRING: 9606.ENSP00000288368.

Structure

3D structure databases

ProteinModelPortal: Q70Z35.
SMR: Q70Z35.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Description
Domain       23 – 214     192     DH (PROSITE-ProRule annotation)
Domain       245 – 361    117     PH (PROSITE-ProRule annotation)
Domain       390 – 464    75      DEP 1 (PROSITE-ProRule annotation)
Domain       491 – 566    76      DEP 2 (PROSITE-ProRule annotation)
Domain       592 – 671    80      PDZ 1 (PROSITE-ProRule annotation)
Domain       677 – 754    78      PDZ 2 (PROSITE-ProRule annotation)
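
The positions in the domain table are 1-based and inclusive, so each length is end minus start plus one. The following is a minimal Python sketch of that arithmetic and of the matching 0-based slice; the helper names `feature_length` and `to_slice` are illustrative assumptions and not part of any UniProt tool. The 50-residue string is the first row of the canonical sequence given in this entry.

```python
# Domain coordinates copied from the table above (1-based, inclusive).
DOMAINS = [
    ("DH", 23, 214),
    ("PH", 245, 361),
    ("DEP 1", 390, 464),
    ("DEP 2", 491, 566),
    ("PDZ 1", 592, 671),
    ("PDZ 2", 677, 754),
]

# Residues 1-50 of the canonical sequence (isoform 1), as listed in this entry.
N_TERM = "MSEDSRGDSRAESAKDLEKQLRLRVCVLSELQKTERDYVGTLEFLVSAFL"


def feature_length(start, end):
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def to_slice(start, end):
    """Convert 1-based inclusive coordinates to a 0-based Python slice."""
    return slice(start - 1, end)


for name, start, end in DOMAINS:
    print(name, feature_length(start, end))

# First eight residues of the DH domain (positions 23-30).
print(N_TERM[to_slice(23, 30)])
```

The computed lengths can be compared with the Length column of the table as a quick consistency check.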

Domain

The PH domain confers substrate specificity and recognition; it is able to discriminate between RAC1, RHOA, and CDC42.
The DH domain alone was unable to confer substrate specificity and recognition.

Sequence similarities

Contains 2 DEP domains. (PROSITE-ProRule annotation)
Contains 1 DH (DBL-homology) domain. (PROSITE-ProRule annotation)
Contains 2 PDZ (DHR) domains. (PROSITE-ProRule annotation)
Contains 1 PH domain. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat

Phylogenomic databases

eggNOG: KOG3519. Eukaryota.
        KOG4428. Eukaryota.
        ENOG410XRJD. LUCA.
GeneTree: ENSGT00760000118925.
HOVERGEN: HBG053677.
InParanoid: Q70Z35.
KO: K17588.
OMA: ESMEGYY.
OrthoDB: EOG091G00MJ.
PhylomeDB: Q70Z35.
TreeFam: TF328639.

Family and domain databases

Gene3D: 1.10.10.10. 2 hits.
        1.20.900.10. 1 hit.
        2.30.29.30. 1 hit.
        2.30.42.10. 2 hits.
InterPro: IPR000591. DEP_dom.
          IPR000219. DH-domain.
          IPR001331. GDS_CDC24_CS.
          IPR001478. PDZ.
          IPR011993. PH_dom-like.
          IPR001849. PH_domain.
          IPR011991. WHTH_DNA-bd_dom.
Pfam: PF00610. DEP. 2 hits.
      PF00169. PH. 1 hit.
      PF00621. RhoGEF. 1 hit.
SMART: SM00049. DEP. 2 hits.
       SM00228. PDZ. 2 hits.
       SM00233. PH. 1 hit.
       SM00325. RhoGEF. 1 hit.
SUPFAM: SSF46785. SSF46785. 2 hits.
        SSF48065. SSF48065. 1 hit.
        SSF50156. SSF50156. 2 hits.
        SSF50729. SSF50729. 1 hit.
PROSITE: PS50186. DEP. 2 hits.
         PS00741. DH_1. 1 hit.
         PS50010. DH_2. 1 hit.
         PS50106. PDZ. 2 hits.
         PS50003. PH_DOMAIN. 1 hit.

Sequences (4)

Sequence status: Complete.

This entry describes 4 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q70Z35-1)
Also known as: P-Rex2

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MSEDSRGDSR AESAKDLEKQ LRLRVCVLSE LQKTERDYVG TLEFLVSAFL
        60         70         80         90        100
HRMNQCAASK VDKNVTEETV KMLFSNIEDI LAVHKEFLKV VEECLHPEPN
       110        120        130        140        150
AQQEVGTCFL HFKDKFRIYD EYCSNHEKAQ KLLLELNKIR TIRTFLLNCM
       160        170        180        190        200
LLGGRKNTDV PLEGYLVTPI QRICKYPLIL KELLKRTPRK HSDYAAVMEA
       210        220        230        240        250
LQAMKAVCSN INEAKRQMEK LEVLEEWQSH IEGWEGSNIT DTCTEMLMCG
       260        270        280        290        300
VLLKISSGNI QERVFFLFDN LLVYCKRKHR RLKNSKASTD GHRYLFRGRI
       310        320        330        340        350
NTEVMEVENV DDGTADFHSS GHIVVNGWKI HNTAKNKWFV CMAKTPEEKH
       360        370        380        390        400
EWFEAILKER ERRKGLKLGM EQDTWVMISE QGEKLYKMMC RQGNLIKDRK
       410        420        430        440        450
RKLTTFPKCF LGSEFVSWLL EIGEIHRPEE GVHLGQALLE NGIIHHVTDK
       460        470        480        490        500
HQFKPEQMLY RFRYDDGTFY PRNEMQDVIS KGVRLYCRLH SLFTPVIRDK
       510        520        530        540        550
DYHLRTYKSV VMANKLIDWL IAQGDCRTRE EAMIFGVGLC DNGFMHHVLE
       560        570        580        590        600
KSEFKDEPLL FRFFSDEEME GSNMKHRLMK HDLKVVENVI AKSLLIKSNE
       610        620        630        640        650
GSYGFGLEDK NKVPIIKLVE KGSNAEMAGM EVGKKIFAIN GDLVFMRPFN
       660        670        680        690        700
EVDCFLKSCL NSRKPLRVLV STKPRETVKI PDSADGLGFQ IRGFGPSVVH
       710        720        730        740        750
AVGRGTVAAA AGLHPGQCII KVNGINVSKE THASVIAHVT ACRKYRRPTK
       760        770        780        790        800
QDSIQWVYNS IESAQEDLQK SHSKPPGDEA GDAFDCKVEE VIDKFNTMAI
       810        820        830        840        850
IDGKKEHVSL TVDNVHLEYG VVYEYDSTAG IKCNVVEKMI EPKGFFSLTA
       860        870        880        890        900
KILEALAKSD EHFVQNCTSL NSLNEVIPTD LQSKFSALCS ERIEHLCQRI
       910        920        930        940        950
SSYKKFSRVL KNRAWPTFKQ AKSKISPLHS SDFCPTNCHV NVMEVSYPKT
       960        970        980        990       1000
STSLGSAFGV QLDSRKHNSH DKENKSSEQG KLSPMVYIQH TITTMAAPSG
      1010       1020       1030       1040       1050
LSLGQQDGHG LRYLLKEEDL ETQDIYQKLL GKLQTALKEV EMCVCQIDDL
      1060       1070       1080       1090       1100
LSSITYSPKL ERKTSEGIIP TDSDNEKGER NSKRVCFNVA GDEQEDSGHD
      1110       1120       1130       1140       1150
TISNRDSYSD CNSNRNSIAS FTSICSSQCS SYFHSDEMDS GDELPLSVRI
      1160       1170       1180       1190       1200
SHDKQDKIHS CLEHLFSQVD SITNLLKGQA VVRAFDQTKY LTPGRGLQEF
      1210       1220       1230       1240       1250
QQEMEPKLSC PKRLRLHIKQ DPWNLPSSVR TLAQNIRKFV EEVKCRLLLA
      1260       1270       1280       1290       1300
LLEYSDSETQ LRRDMVFCQT LVATVCAFSE QLMAALNQMF DNSKENEMET
      1310       1320       1330       1340       1350
WEASRRWLDQ IANAGVLFHF QSLLSPNLTD EQAMLEDTLV ALFDLEKVSF
      1360       1370       1380       1390       1400
YFKPSEEEPL VANVPLTYQA EGSRQALKVY FYIDSYHFEQ LPQRLKNGGG
      1410       1420       1430       1440       1450
FKIHPVLFAQ ALESMEGYYY RDNVSVEEFQ AQINAASLEK VKQYNQKLRA
      1460       1470       1480       1490       1500
FYLDKSNSPP NSTSKAAYVD KLMRPLNALD ELYRLVASFI RSKRTAACAN
      1510       1520       1530       1540       1550
TACSASGVGL LSVSSELCNR LGACHIIMCS SGVHRCTLSV TLEQAIILAR
      1560       1570       1580       1590       1600
SHGLPPRYIM QATDVMRKQG ARVQNTAKNL GVRDRTPQSA PRLYKLCEPP

PPAGEE
Length: 1,606
Mass (Da): 182,622
Last modified: July 5, 2004 - v1
Checksum: D9E938D65CE2C7DD

Isoform 2 (identifier: Q70Z35-2)
Also known as: P-Rexw2A

The sequence of this isoform differs from the canonical sequence as follows:
     905-905: K → KVLSLQILEALAKSDEHFVQNCTSLNSLNEVIPTDLQSKFSALCSERIEHLCQRISSYKKVSVLLQ
     961-978: Missing.
     1016-1060: Missing.
     1108-1108: Y → YF
     1141-1141: G → GG
     1170-1198: DSITNLLKGQAVVRAFDQTKYLTPGRGLQ → QS
     1242-1255: EVKCRLLLALLEYS → T
     1315-1363: GVLFHFQSLLSPNLTDEQAMLEDTLVALFDLEKVSFYFKPSEEEPLVAN → NIILD
     1410-1410: Q → QHVEKLSYKVACKICLLVYT
     1450-1471: AFYLDKSNSPPNSTSKAAYVDK → YVTIQLKNQ
     1535-1535: R → RR
     1570-1577: GARVQNTA → VGLMQTWE
     1578-1606: Missing.

Length: 1,504
Mass (Da): 171,373
Checksum: 7395DCD2CED55256

Isoform 3 (identifier: Q70Z35-3)

The sequence of this isoform differs from the canonical sequence as follows:
     906-979: FSRVLKNRAW...HDKENKSSEQ → VQASERFYNF...EMLLAERAPV
     980-1606: Missing.

Length: 979
Mass (Da): 112,091
Checksum: 98B38EB98ED26657

Isoform 4 (identifier: Q70Z35-4)

The sequence of this isoform differs from the canonical sequence as follows:
     48-112: Missing.
     1049-1049: D → E
     1050-1606: Missing.

Note: No experimental confirmation available.
Length: 984
Mass (Da): 112,404
Checksum: 4BBF022C4AAF66B5
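
The isoform lengths listed above follow from the canonical length (1,606) and the per-isoform differences: each edit removes the residues in its position range and adds the length of its replacement. The following Python sketch re-derives the three lengths from the edit lists given in this entry; the function names `isoform_length` and `apply_edits` and the tuple layout are illustrative assumptions, not a UniProt API.

```python
CANONICAL_LENGTH = 1606  # isoform 1 (Q70Z35-1)

# (start, end, replacement); positions are 1-based inclusive on the
# canonical sequence, and "" stands for "Missing".
ISOFORM_EDITS = {
    "Q70Z35-2": [
        (905, 905, "KVLSLQILEALAKSDEHFVQ" "NCTSLNSLNEVIPTDLQSKF"
                   "SALCSERIEHLCQRISSYKK" "VSVLLQ"),
        (961, 978, ""),
        (1016, 1060, ""),
        (1108, 1108, "YF"),
        (1141, 1141, "GG"),
        (1170, 1198, "QS"),
        (1242, 1255, "T"),
        (1315, 1363, "NIILD"),
        (1410, 1410, "QHVEKLSYKVACKICLLVYT"),
        (1450, 1471, "YVTIQLKNQ"),
        (1535, 1535, "RR"),
        (1570, 1577, "VGLMQTWE"),
        (1578, 1606, ""),
    ],
    "Q70Z35-3": [
        (906, 979, "VQASERFYNFTARHAVWEHS" "FDLHSVSSTFPVPVTMEFLL"
                   "LPPPLLGISQDGRQHCIPED" "LPSQEMLLAERAPV"),
        (980, 1606, ""),
    ],
    "Q70Z35-4": [
        (48, 112, ""),
        (1049, 1049, "E"),
        (1050, 1606, ""),
    ],
}


def isoform_length(canonical_length, edits):
    """Length after applying edits: add replacement, subtract replaced span."""
    length = canonical_length
    for start, end, replacement in edits:
        length += len(replacement) - (end - start + 1)
    return length


def apply_edits(sequence, edits):
    """Apply edits to a sequence string, right to left so that the
    canonical coordinates of the remaining edits stay valid."""
    for start, end, replacement in sorted(edits, reverse=True):
        sequence = sequence[:start - 1] + replacement + sequence[end:]
    return sequence


for identifier, edits in ISOFORM_EDITS.items():
    print(identifier, isoform_length(CANONICAL_LENGTH, edits))
# Q70Z35-2 1504
# Q70Z35-3 979
# Q70Z35-4 984
```

Passing the full canonical sequence to `apply_edits` would, under the same assumptions, yield the isoform sequence itself rather than only its length.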

Sequence caution

The sequence AK024079 differs from that shown. Reason: Frameshift at position 1134. (Curated)
The sequence BAB14375 differs from that shown. Reason: Erroneous initiation. Translation N-terminally extended. (Curated)
The sequence BAG57581 differs from that shown. Reason: Erroneous initiation. Translation N-terminally extended. (Curated)

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  158          1       T → I in AAS82571 (Ref. 2). Curated.
Sequence conflict  158          1       T → I in AAS82572 (Ref. 2). Curated.
Sequence conflict  1076         1       E → V in AAS82571 (Ref. 2). Curated.
Sequence conflict  1076         1       E → V in AK024079 (PubMed:14702039). Curated.
Sequence conflict  1268         1       C → R in AAS82571 (Ref. 2). Curated.

Natural variant

Feature key      Identifier  Position(s)  Length  Description
Natural variant  VAR_032163  312          1       D → N. Corresponds to variant rs11784582 (dbSNP, Ensembl).
Natural variant  VAR_035973  537          1       V → I in a colorectal cancer sample; somatic mutation (1 publication). Corresponds to variant rs147538692 (dbSNP, Ensembl).
Natural variant  VAR_035974  1571         1       A → E in a colorectal cancer sample; somatic mutation (1 publication).
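
A variant such as 312 D → N refers to a 1-based position on the canonical sequence, so it can be checked against the sequence listing before it is applied. The following Python sketch does this for residues 301 to 350 of the canonical sequence as given in this entry; the function name `apply_substitution` and the fragment-offset convention are illustrative assumptions.

```python
# Residues 301-350 of the canonical sequence (isoform 1), from this entry.
FRAGMENT_START = 301
FRAGMENT = "NTEVMEVENVDDGTADFHSSGHIVVNGWKIHNTAKNKWFVCMAKTPEEKH"


def apply_substitution(fragment, fragment_start, position, ref, alt):
    """Apply a single-residue substitution given a 1-based sequence position.

    Raises ValueError if the position lies outside the fragment or if the
    reference residue does not match what the fragment holds there.
    """
    index = position - fragment_start
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} is outside the fragment")
    if fragment[index] != ref:
        raise ValueError(
            f"expected {ref} at {position}, found {fragment[index]}"
        )
    return fragment[:index] + alt + fragment[index + 1:]


# VAR_032163: 312 D -> N
variant = apply_substitution(FRAGMENT, FRAGMENT_START, 312, "D", "N")
print(FRAGMENT[10:13], "->", variant[10:13])  # residues 311-313
```

The reference-residue check is the useful part of the sketch: it catches a coordinate that was taken from an isoform other than the canonical one.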

Alternative sequence

Feature key           Identifier  Position(s)  Length  Description
Alternative sequence  VSP_055612  48 – 112     65      Missing in isoform 4. (1 publication)
Alternative sequence  VSP_025149  905          1       K → KVLSLQILEALAKSDEHFVQNCTSLNSLNEVIPTDLQSKFSALCSERIEHLCQRISSYKKVSVLLQ in isoform 2. (Curated)
Alternative sequence  VSP_025150  906 – 979    74      FSRVL…KSSEQ → VQASERFYNFTARHAVWEHSFDLHSVSSTFPVPVTMEFLLLPPPLLGISQDGRQHCIPEDLPSQEMLLAERAPV in isoform 3. (2 publications)
Alternative sequence  VSP_025151  961 – 978    18      Missing in isoform 2. (Curated)
Alternative sequence  VSP_025152  980 – 1606   627     Missing in isoform 3. (2 publications)
Alternative sequence  VSP_025153  1016 – 1060  45      Missing in isoform 2. (Curated)
Alternative sequence  VSP_055613  1049         1       D → E in isoform 4. (1 publication)
Alternative sequence  VSP_055614  1050 – 1606  557     Missing in isoform 4. (1 publication)
Alternative sequence  VSP_025154  1108         1       Y → YF in isoform 2. (Curated)
Alternative sequence  VSP_025155  1141         1       G → GG in isoform 2. (Curated)
Alternative sequence  VSP_025156  1170 – 1198  29      DSITN…GRGLQ → QS in isoform 2. (Curated)
Alternative sequence  VSP_025157  1242 – 1255  14      EVKCR…LLEYS → T in isoform 2. (Curated)
Alternative sequence  VSP_025158  1315 – 1363  49      GVLFH…PLVAN → NIILD in isoform 2. (Curated)
Alternative sequence  VSP_025159  1410         1       Q → QHVEKLSYKVACKICLLVYT in isoform 2. (Curated)
Alternative sequence  VSP_025160  1450 – 1471  22      AFYLD…AYVDK → YVTIQLKNQ in isoform 2. (Curated)
Alternative sequence  VSP_025161  1535         1       R → RR in isoform 2. (Curated)
Alternative sequence  VSP_025162  1570 – 1577  8       GARVQNTA → VGLMQTWE in isoform 2. (Curated)
Alternative sequence  VSP_025163  1578 – 1606  29      Missing in isoform 2. (Curated)

Sequence databases

EMBL / GenBank / DDBJ:
AJ437636 mRNA. Translation: CAD26885.2.
AY508996 mRNA. Translation: AAS82571.1.
AY508997 mRNA. Translation: AAS82572.1.
AK023049 mRNA. Translation: BAB14375.1. Different initiation.
AK024079 mRNA. No translation available.
AK294299 mRNA. Translation: BAG57581.1. Different initiation.
AC011853 Genomic DNA. No translation available.
AC103783 Genomic DNA. No translation available.
AC104416 Genomic DNA. No translation available.
BK005160 mRNA. Translation: DAA05333.1.
BK005161 mRNA. Translation: DAA05334.1.
CCDS: CCDS6201.1. [Q70Z35-1]
RefSeq: NP_079146.2. NM_024870.3. [Q70Z35-1]
        NP_079446.3. NM_025170.5. [Q70Z35-3]
UniGene: Hs.169943.
         Hs.591867.
         Hs.731160.
         Hs.732080.

Genome annotation databases

Ensembl: ENST00000288368; ENSP00000288368; ENSG00000046889. [Q70Z35-1]
GeneID: 80243.
KEGG: hsa:80243.
UCSC: uc003xxv.2. human. [Q70Z35-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Protocols and materials databases

DNASU: 80243.

Organism-specific databases

CTD: 80243.
DisGeNET: 80243.
GeneCards: PREX2.
HGNC: HGNC:22950. PREX2.
HPA: HPA015234.
MIM: 612139. gene.
neXtProt: NX_Q70Z35.
OpenTargets: ENSG00000046889.
PharmGKB: PA164725103.

Miscellaneous databases

GeneWiki: PREX2.
GenomeRNAi: 80243.
PRO: Q70Z35.

Entry information

Entry name: PREX2_HUMAN
Accession
Primary (citable) accession number: Q70Z35
Secondary accession number(s): B4DFX0, Q32KL0, Q32KL1, Q6R7Q3, Q6R7Q4, Q9H805, Q9H961
Entry history
Integrated into UniProtKB/Swiss-Prot: May 15, 2007
Last sequence update: July 5, 2004
Last modified: November 2, 2016
This is version 117 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 8
    Human chromosome 8: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.