Protein

Leucine-rich repeat-containing G-protein coupled receptor 5

Gene

LGR5

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Receptor for R-spondins that potentiates the canonical Wnt signaling pathway and acts as a stem cell marker of the intestinal epithelium and the hair follicle. Upon binding to R-spondins (RSPO1, RSPO2, RSPO3 or RSPO4), associates with phosphorylated LRP6 and frizzled receptors that are activated by extracellular Wnt receptors, triggering the canonical Wnt signaling pathway to increase expression of target genes. In contrast to classical G-protein coupled receptors, does not activate heterotrimeric G-proteins to transduce the signal. Involved in the development and/or maintenance of the adult intestinal stem cells during postembryonic development. 5 Publications

GO - Molecular function

  • G-protein coupled receptor activity Source: ProtInc
  • transmembrane signaling receptor activity Source: UniProtKB

Keywords - Molecular function

G-protein coupled receptor, Receptor, Transducer

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000139292-MONOMER.
Reactome: R-HSA-4641263. Regulation of FZD by ubiquitination.
SIGNOR: O75473.

Names & Taxonomy

Protein names
Recommended name:
Leucine-rich repeat-containing G-protein coupled receptor 5
Alternative name(s):
G-protein coupled receptor 49
G-protein coupled receptor 67
G-protein coupled receptor HG38
Gene names
Name: LGR5
Synonyms: GPR49, GPR67
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 12

Organism-specific databases

HGNC: HGNC:4504. LGR5.

Subcellular location

Topology

Feature key | Position(s) | Description | Length
Topological domain | 22 – 561 | Extracellular (Sequence analysis) | 540
Transmembrane | 562 – 582 | Helical; Name=1 (Sequence analysis) | 21
Topological domain | 583 – 593 | Cytoplasmic (Sequence analysis) | 11
Transmembrane | 594 – 614 | Helical; Name=2 (Sequence analysis) | 21
Topological domain | 615 – 638 | Extracellular (Sequence analysis) | 24
Transmembrane | 639 – 659 | Helical; Name=3 (Sequence analysis) | 21
Topological domain | 660 – 682 | Cytoplasmic (Sequence analysis) | 23
Transmembrane | 683 – 703 | Helical; Name=4 (Sequence analysis) | 21
Topological domain | 704 – 722 | Extracellular (Sequence analysis) | 19
Transmembrane | 723 – 743 | Helical; Name=5 (Sequence analysis) | 21
Topological domain | 744 – 767 | Cytoplasmic (Sequence analysis) | 24
Transmembrane | 768 – 788 | Helical; Name=6 (Sequence analysis) | 21
Topological domain | 789 – 802 | Extracellular (Sequence analysis) | 14
Transmembrane | 803 – 823 | Helical; Name=7 (Sequence analysis) | 21
Topological domain | 824 – 907 | Cytoplasmic (Sequence analysis) | 84
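
The topology annotation can be sanity-checked programmatically. The following is a minimal sketch, not part of the entry: it transcribes the positions from the topology table by hand, and the names `SEGMENTS`, `segment_length`, `is_contiguous` and `sides_alternate` are illustrative assumptions. It checks that the segments tile the mature chain (residues 22 to 907) without gaps, that there are seven transmembrane helices, and that the non-membrane segments alternate between the extracellular and cytoplasmic sides.

```python
# Topology segments transcribed by hand from the entry: (kind, start, end).
# Positions are 1-based and inclusive, as in the feature table.
SEGMENTS = [
    ("extracellular", 22, 561),
    ("transmembrane", 562, 582),
    ("cytoplasmic", 583, 593),
    ("transmembrane", 594, 614),
    ("extracellular", 615, 638),
    ("transmembrane", 639, 659),
    ("cytoplasmic", 660, 682),
    ("transmembrane", 683, 703),
    ("extracellular", 704, 722),
    ("transmembrane", 723, 743),
    ("cytoplasmic", 744, 767),
    ("transmembrane", 768, 788),
    ("extracellular", 789, 802),
    ("transmembrane", 803, 823),
    ("cytoplasmic", 824, 907),
]


def segment_length(start, end):
    """Length of an inclusive 1-based range."""
    return end - start + 1


def is_contiguous(segments):
    """True if each segment starts right after the previous one ends."""
    return all(
        prev[2] + 1 == nxt[1] for prev, nxt in zip(segments, segments[1:])
    )


def sides_alternate(segments):
    """True if consecutive non-membrane segments lie on opposite sides."""
    sides = [kind for kind, _, _ in segments if kind != "transmembrane"]
    return all(a != b for a, b in zip(sides, sides[1:]))


if __name__ == "__main__":
    total = sum(segment_length(s, e) for _, s, e in SEGMENTS)
    helices = sum(1 for kind, _, _ in SEGMENTS if kind == "transmembrane")
    print("contiguous:", is_contiguous(SEGMENTS))
    print("residues covered:", total)
    print("transmembrane helices:", helices)
    print("sides alternate:", sides_alternate(SEGMENTS))
```

The covered length equals the length of the processed chain listed under Molecule processing, which is what one would expect if the signal peptide (1 to 21) is excluded from the topology.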

GO - Cellular component

  • integral component of plasma membrane Source: UniProtKB
  • plasma membrane Source: Reactome
  • trans-Golgi network membrane Source: UniProtKB

Keywords - Cellular component

Cell membrane, Golgi apparatus, Membrane

Pathology & Biotech

Mutagenesis

Feature key | Position(s) | Description | Length
Mutagenesis | 146 | D → F: Abolishes activation of Wnt signaling. 1 Publication | 1
Mutagenesis | 170 | D → F: Abolishes activation of Wnt signaling. 1 Publication | 1
Mutagenesis | 190 | A → D: Abolishes activation of Wnt signaling. 1 Publication | 1
Mutagenesis | 861 | S → A: Impaired internalization to the trans-Golgi network; when associated with A-864. 1 Publication | 1
Mutagenesis | 864 | S → A: Impaired internalization to the trans-Golgi network; when associated with A-861. 1 Publication | 1

Organism-specific databases

DisGeNET: 8549.
OpenTargets: ENSG00000139292.
PharmGKB: PA28894.

Chemistry databases

GuidetoPHARMACOLOGY: 148.

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Length
Signal peptide | 1 – 21 | Sequence analysis | 21
Chain (PRO_0000012794) | 22 – 907 | Leucine-rich repeat-containing G-protein coupled receptor 5 | 886

Amino acid modifications

Feature key | Position(s) | Description | Length
Disulfide bond | 34 ↔ 40 | |
Disulfide bond | 38 ↔ 52 | |
Glycosylation | 63 | N-linked (GlcNAc...) (Sequence analysis) | 1
Glycosylation | 77 | N-linked (GlcNAc...) (Sequence analysis) | 1
Glycosylation | 208 | N-linked (GlcNAc...) (1 Publication) | 1
Disulfide bond | 348 ↔ 373 | |
Disulfide bond | 479 ↔ 541 | |
Glycosylation | 500 | N-linked (GlcNAc...) (Sequence analysis) | 1
Disulfide bond | 637 ↔ 712 | PROSITE-ProRule annotation |
Glycosylation | 792 | N-linked (GlcNAc...) (Sequence analysis) | 1
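
As an illustration of how the annotated N-linked glycosylation sites relate to the sequence, the sketch below checks them against the usual N-X-S/T sequon (X being any residue except proline). The sequon rule is general background knowledge and is not stated in this entry. The three-residue windows and the fragment of residues 51 to 100 were copied by hand from the canonical sequence shown in this entry, and the names `ANNOTATED_SITES`, `matches_sequon` and `find_sequons` are hypothetical.

```python
import re

# Three-residue windows starting at each annotated N-glycosylation site,
# copied by hand from the canonical sequence (isoform 1) in this entry.
ANNOTATED_SITES = {63: "NLS", 77: "NIS", 208: "NLS", 500: "NSS", 792: "NLT"}

# Residues 51-100 of the canonical sequence, also copied from this entry.
FRAGMENT_51_100 = "DCSDLGLSELPSNLSVFTSYLDLSMNNISQLLPNPLPSLRFLEELRLAGN"

# Assumed rule: Asn, then any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"N[^P][ST]")


def matches_sequon(window):
    """True if a three-residue window is an N-X-S/T sequon."""
    return SEQUON.fullmatch(window) is not None


def find_sequons(sequence, first_position=1):
    """Return 1-based positions of every sequon Asn in `sequence`.

    A lookahead is used so that overlapping candidates are not skipped.
    `first_position` is the residue number of sequence[0].
    """
    return [
        m.start() + first_position
        for m in re.finditer(r"(?=N[^P][ST])", sequence)
    ]


if __name__ == "__main__":
    for position, window in sorted(ANNOTATED_SITES.items()):
        print(position, window, matches_sequon(window))
    print(find_sequons(FRAGMENT_51_100, first_position=51))
```

A sequon match only indicates a potential site. Most of the sites above are annotated by sequence analysis, and only position 208 is backed by a publication in this entry.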

Keywords - PTM

Disulfide bond, Glycoprotein

Proteomic databases

MaxQB: O75473.
PaxDb: O75473.
PeptideAtlas: O75473.
PRIDE: O75473.

PTM databases

iPTMnet: O75473.
PhosphoSitePlus: O75473.

Expression

Tissue specificity

Expressed in skeletal muscle, placenta, spinal cord, and various regions of the brain. Expressed at the base of crypts in colonic and small-intestinal mucosa stem cells. In premalignant cancer, expression is not restricted to the crypt base. Overexpressed in cancers of the ovary, colon and liver. 3 Publications

Gene expression databases

Bgee: ENSG00000139292.
CleanEx: HS_LGR5.
ExpressionAtlas: O75473. baseline and differential.
Genevisible: O75473. HS.

Organism-specific databases

HPA: HPA012530.

Interaction

Subunit structure

Identified in a complex composed of RNF43, LGR5 and RSPO1. 2 Publications

Protein-protein interaction databases

BioGrid: 114119. 4 interactors.
STRING: 9606.ENSP00000266674.

Structure

Secondary structure

Feature key | Position(s) | Description | Length
Beta strand | 39 – 42 | Combined sources | 4
Turn | 44 – 46 | Combined sources | 3
Beta strand | 48 – 51 | Combined sources | 4
Beta strand | 68 – 72 | Combined sources | 5
Helix | 86 – 88 | Combined sources | 3
Beta strand | 94 – 96 | Combined sources | 3
Turn | 107 – 112 | Combined sources | 6
Beta strand | 118 – 120 | Combined sources | 3
Turn | 133 – 136 | Combined sources | 4
Beta strand | 142 – 144 | Combined sources | 3
Turn | 155 – 160 | Combined sources | 6
Beta strand | 166 – 168 | Combined sources | 3
Helix | 181 – 183 | Combined sources | 3
Beta strand | 190 – 192 | Combined sources | 3
Turn | 203 – 208 | Combined sources | 6
Beta strand | 214 – 216 | Combined sources | 3
Turn | 227 – 232 | Combined sources | 6
Beta strand | 238 – 240 | Combined sources | 3
Helix | 251 – 255 | Combined sources | 5
Beta strand | 261 – 263 | Combined sources | 3
Turn | 274 – 279 | Combined sources | 6
Beta strand | 285 – 287 | Combined sources | 3
Turn | 298 – 303 | Combined sources | 6
Beta strand | 309 – 314 | Combined sources | 6
Beta strand | 331 – 338 | Combined sources | 8
Helix | 347 – 350 | Combined sources | 4
Beta strand | 356 – 358 | Combined sources | 3
Beta strand | 378 – 380 | Combined sources | 3
Turn | 391 – 396 | Combined sources | 6
Beta strand | 402 – 404 | Combined sources | 3
Turn | 415 – 420 | Combined sources | 6
Beta strand | 426 – 428 | Combined sources | 3
Beta strand | 446 – 449 | Combined sources | 4
Turn | 462 – 464 | Combined sources | 3
Beta strand | 470 – 472 | Combined sources | 3
Helix | 476 – 482 | Combined sources | 7
Helix | 523 – 533 | Combined sources | 11
Beta strand | 534 – 536 | Combined sources | 3
Beta strand | 540 – 542 | Combined sources | 3

3D structure databases

PDB entry | Method | Resolution (Å) | Chain | Positions
4BSR | X-ray | 3.20 | A/B | 22-543
4BSS | X-ray | 3.20 | A/B/E/F | 22-543
4BST | X-ray | 4.30 | A/B | 22-543
4BSU | X-ray | 3.20 | A/B/E/F | 22-543
4KNG | X-ray | 2.50 | A/B | 32-557
4UFR | X-ray | 2.20 | A/C | 32-486, 538-544
4UFS | X-ray | 4.80 | A | 32-486, 538-544
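
When choosing among these structures for a residue of interest, both coverage and resolution matter. The sketch below is one possible way to pick a structure, under the assumption that the lowest resolution value (in Å) among entries covering the position is preferred. The PDB identifiers, resolutions and position ranges are transcribed from the table in this entry; `STRUCTURES` and `best_structure_covering` are hypothetical names, and ties are broken alphabetically by PDB identifier.

```python
# PDB entries for LGR5 transcribed by hand from the entry:
# identifier -> (resolution in angstroms, list of inclusive position ranges).
STRUCTURES = {
    "4BSR": (3.20, [(22, 543)]),
    "4BSS": (3.20, [(22, 543)]),
    "4BST": (4.30, [(22, 543)]),
    "4BSU": (3.20, [(22, 543)]),
    "4KNG": (2.50, [(32, 557)]),
    "4UFR": (2.20, [(32, 486), (538, 544)]),
    "4UFS": (4.80, [(32, 486), (538, 544)]),
}


def best_structure_covering(position):
    """Return the PDB identifier with the finest resolution that covers
    `position`, or None if no listed structure covers it."""
    candidates = [
        (resolution, pdb_id)
        for pdb_id, (resolution, ranges) in STRUCTURES.items()
        if any(start <= position <= end for start, end in ranges)
    ]
    # min() on (resolution, id) tuples prefers the smallest resolution
    # value, then the alphabetically first identifier on ties.
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    for pos in (25, 100, 500, 600):
        print(pos, best_structure_covering(pos))
```

All listed structures cover only the extracellular region, so positions in the transmembrane and cytoplasmic parts return no match.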
ProteinModelPortal: O75473.
SMR: O75473.
ModBase: Search...
MobiDB: Search...

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Length
Domain | 25 – 66 | LRRNT | 42
Repeat | 67 – 90 | LRR 1 | 24
Repeat | 91 – 112 | LRR 2 | 22
Repeat | 115 – 136 | LRR 3 | 22
Repeat | 139 – 160 | LRR 4 | 22
Repeat | 163 – 184 | LRR 5 | 22
Repeat | 187 – 208 | LRR 6 | 22
Repeat | 211 – 232 | LRR 7 | 22
Repeat | 235 – 256 | LRR 8 | 22
Repeat | 258 – 279 | LRR 9 | 22
Repeat | 282 – 303 | LRR 10 | 22
Repeat | 306 – 328 | LRR 11 | 23
Repeat | 329 – 350 | LRR 12 | 22
Repeat | 353 – 374 | LRR 13 | 22
Repeat | 375 – 396 | LRR 14 | 22
Repeat | 399 – 420 | LRR 15 | 22
Repeat | 423 – 446 | LRR 16 | 24

Sequence similarities

Belongs to the G-protein coupled receptor 1 family. (PROSITE-ProRule annotation)
Contains 16 LRR (leucine-rich) repeats. (Curated)
Contains 1 LRRNT domain. (Curated)

Keywords - Domain

Leucine-rich repeat, Repeat, Signal, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: KOG0619. Eukaryota.
KOG2087. Eukaryota.
COG4886. LUCA.
GeneTree: ENSGT00760000119088.
HOGENOM: HOG000231829.
HOVERGEN: HBG031675.
InParanoid: O75473.
KO: K04308.
OMA: LENIWDC.
OrthoDB: EOG091G0QA0.
PhylomeDB: O75473.
TreeFam: TF316814.

Family and domain databases

Gene3D: 3.80.10.10. 3 hits.
InterPro: IPR000276. GPCR_Rhodpsn.
IPR017452. GPCR_Rhodpsn_7TM.
IPR002131. Gphrmn_rcpt_fam.
IPR032675. L_dom-like.
IPR001611. Leu-rich_rpt.
IPR003591. Leu-rich_rpt_typical-subtyp.
IPR000372. LRRNT.
Pfam: PF00560. LRR_1. 1 hit.
PF13855. LRR_8. 4 hits.
PF01462. LRRNT. 1 hit.
PRINTS: PR00373. GLYCHORMONER.
PR00237. GPCRRHODOPSN.
SMART: SM00369. LRR_TYP. 15 hits.
SM00013. LRRNT. 1 hit.
SUPFAM: SSF52058. SSF52058. 1 hit.
PROSITE: PS50262. G_PROTEIN_RECEP_F1_2. 1 hit.
PS51450. LRR. 15 hits.

Sequences (3)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: O75473-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MDTSRLGVLL SLPVLLQLAT GGSSPRSGVL LRGCPTHCHC EPDGRMLLRV
        60         70         80         90        100
DCSDLGLSEL PSNLSVFTSY LDLSMNNISQ LLPNPLPSLR FLEELRLAGN
       110        120        130        140        150
ALTYIPKGAF TGLYSLKVLM LQNNQLRHVP TEALQNLRSL QSLRLDANHI
       160        170        180        190        200
SYVPPSCFSG LHSLRHLWLD DNALTEIPVQ AFRSLSALQA MTLALNKIHH
       210        220        230        240        250
IPDYAFGNLS SLVVLHLHNN RIHSLGKKCF DGLHSLETLD LNYNNLDEFP
       260        270        280        290        300
TAIRTLSNLK ELGFHSNNIR SIPEKAFVGN PSLITIHFYD NPIQFVGRSA
       310        320        330        340        350
FQHLPELRTL TLNGASQITE FPDLTGTANL ESLTLTGAQI SSLPQTVCNQ
       360        370        380        390        400
LPNLQVLDLS YNLLEDLPSF SVCQKLQKID LRHNEIYEIK VDTFQQLLSL
       410        420        430        440        450
RSLNLAWNKI AIIHPNAFST LPSLIKLDLS SNLLSSFPIT GLHGLTHLKL
       460        470        480        490        500
TGNHALQSLI SSENFPELKV IEMPYAYQCC AFGVCENAYK ISNQWNKGDN
       510        520        530        540        550
SSMDDLHKKD AGMFQAQDER DLEDFLLDFE EDLKALHSVQ CSPSPGPFKP
       560        570        580        590        600
CEHLLDGWLI RIGVWTIAVL ALTCNALVTS TVFRSPLYIS PIKLLIGVIA
       610        620        630        640        650
AVNMLTGVSS AVLAGVDAFT FGSFARHGAW WENGVGCHVI GFLSIFASES
       660        670        680        690        700
SVFLLTLAAL ERGFSVKYSA KFETKAPFSS LKVIILLCAL LALTMAAVPL
       710        720        730        740        750
LGGSKYGASP LCLPLPFGEP STMGYMVALI LLNSLCFLMM TIAYTKLYCN
       760        770        780        790        800
LDKGDLENIW DCSMVKHIAL LLFTNCILNC PVAFLSFSSL INLTFISPEV
       810        820        830        840        850
IKFILLVVVP LPACLNPLLY ILFNPHFKED LVSLRKQTYV WTRSKHPSLM
       860        870        880        890        900
SINSDDVEKQ SCDSTQALVT FTSSSITYDL PPSSVPSPAY PVTESCHLSS
VAFVPCL
Length: 907
Mass (Da): 99,998
Last modified: November 1, 1998 - v1
Checksum: 822D5C5E6F0D9092

Isoform 2 (identifier: O75473-2)

The sequence of this isoform differs from the canonical sequence as follows:
     263-286: Missing.

Length: 883
Mass (Da): 97,404
Checksum: 50B293FCFCAED440

Isoform 3 (identifier: O75473-3)

The sequence of this isoform differs from the canonical sequence as follows:
     143-214: Missing.

Length: 835
Mass (Da): 92,006
Checksum: 352096C4523C1E2F

Experimental Info

Feature key | Position(s) | Description | Length
Sequence conflict | 90 | R → H in AAC77911 (PubMed:9849958). Curated | 1
Sequence conflict | 212 | L → W in AAC77911 (PubMed:9849958). Curated | 1

Natural variant

Feature key | Position(s) | Description | Length
Natural variant (VAR_049411) | 383 | H → R. Corresponds to variant rs12303775 (dbSNP, Ensembl). | 1
Natural variant (VAR_049412) | 666 | V → A. 1 Publication. Corresponds to variant rs17109924 (dbSNP, Ensembl). | 1

Alternative sequence

Feature key | Position(s) | Description | Length
Alternative sequence (VSP_054782) | 143 – 214 | Missing in isoform 3. 1 Publication | 72
Alternative sequence (VSP_037746) | 263 – 286 | Missing in isoform 2. 1 Publication | 24
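
Because all positional information in the entry refers to the canonical sequence, feature positions have to be shifted (or dropped) when working with isoforms 2 and 3. The sketch below is an illustration only: it uses the "Missing" ranges and the canonical length of 907 given in this entry, and the names `MISSING`, `isoform_length` and `to_isoform_position` are assumptions made for the example.

```python
CANONICAL_LENGTH = 907

# Residue ranges (1-based, inclusive) missing from each isoform relative
# to the canonical sequence, as listed in the entry.
MISSING = {
    "O75473-2": (263, 286),
    "O75473-3": (143, 214),
}


def isoform_length(isoform):
    """Length of an isoform after removing its missing range."""
    start, end = MISSING[isoform]
    return CANONICAL_LENGTH - (end - start + 1)


def to_isoform_position(position, isoform):
    """Map a canonical position to isoform coordinates.

    Returns None when the residue lies inside the missing range and is
    therefore absent from the isoform.
    """
    start, end = MISSING[isoform]
    if position < start:
        return position
    if position <= end:
        return None
    return position - (end - start + 1)


if __name__ == "__main__":
    for isoform in MISSING:
        print(isoform, isoform_length(isoform))
    # Glycosylation site 208 falls inside the range missing from isoform 3.
    print(to_isoform_position(208, "O75473-3"))
    # Natural variant position 383 shifts by 24 residues in isoform 2.
    print(to_isoform_position(383, "O75473-2"))
```

The computed lengths agree with the isoform lengths listed in the Sequences section (883 and 835). Under this mapping, the mutagenesis positions 146, 170 and 190 also fall in the region absent from isoform 3.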

Sequence databases

EMBL / GenBank / DDBJ:
AF062006 mRNA. Translation: AAC28019.1.
AF061444 mRNA. Translation: AAC77911.1.
FN820440 mRNA. Translation: CBL95002.2.
AC078860 Genomic DNA. No translation available.
AC090116 Genomic DNA. No translation available.
BC096324 mRNA. Translation: AAH96324.1.
BC096325 mRNA. Translation: AAH96325.1.
BC096326 mRNA. Translation: AAH96326.1.
BC099650 mRNA. Translation: AAH99650.1.
CCDS: CCDS61194.1. [O75473-2]
CCDS61195.1. [O75473-3]
CCDS9000.1. [O75473-1]
PIR: JE0176.
RefSeq: NP_001264155.1. NM_001277226.1. [O75473-2]
NP_001264156.1. NM_001277227.1. [O75473-3]
NP_003658.1. NM_003667.3. [O75473-1]
UniGene: Hs.658889.

Genome annotation databases

Ensembl: ENST00000266674; ENSP00000266674; ENSG00000139292. [O75473-1]
ENST00000536515; ENSP00000443033; ENSG00000139292. [O75473-3]
ENST00000540815; ENSP00000441035; ENSG00000139292. [O75473-2]
GeneID: 8549.
KEGG: hsa:8549.
UCSC: uc001swl.5. human. [O75473-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Protein family/group databases

GPCRDB: Search...

Protocols and materials databases

Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 8549.
DisGeNET: 8549.
GeneCards: LGR5.
HGNC: HGNC:4504. LGR5.
HPA: HPA012530.
MIM: 606667. gene.
neXtProt: NX_O75473.
OpenTargets: ENSG00000139292.
PharmGKB: PA28894.
GenAtlas: Search...

Miscellaneous databases

GeneWiki: LGR5.
GenomeRNAi: 8549.
PRO: O75473.
SOURCE: Search...

Family and domain databases

ProtoNet: Search...

Entry information

Entry name: LGR5_HUMAN
Accession: Primary (citable) accession number: O75473
Secondary accession number(s): D8MCT0, Q4VAM0, Q4VAM2, Q9UP75
Entry history
Integrated into UniProtKB/Swiss-Prot: June 20, 2002
Last sequence update: November 1, 1998
Last modified: November 2, 2016
This is version 152 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Miscellaneous

LGR5 is used as a marker of adult tissue stem cells in the intestine, stomach, hair follicle, and mammary epithelium. 1 Publication

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. 7-transmembrane G-linked receptors
    List of 7-transmembrane G-linked receptor entries
  2. Human chromosome 12
    Human chromosome 12: entries, gene names and cross-references to MIM
  3. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  4. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  5. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  6. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  7. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.