Protein

CD166 antigen

Gene

Alcam

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Cell adhesion molecule that mediates heterotypic cell-cell contacts via its interaction with CD6, as well as homotypic cell-cell contacts. Promotes T-cell activation and proliferation via its interactions with CD6 (By similarity). Contributes to the formation and maturation of the immunological synapse via its interactions with CD6 (By similarity). Mediates homotypic interactions with cells that express ALCAM (PubMed:24740813). Mediates attachment of dendritic cells onto endothelial cells via homotypic interaction. Inhibits endothelial cell migration and promotes endothelial tube formation via homotypic interactions (PubMed:23169771). Required for normal organization of the lymph vessel network (PubMed:23169771). Required for normal hematopoietic stem cell engraftment in the bone marrow (PubMed:24740813). Plays a role in hematopoiesis; required for normal numbers of hematopoietic stem cells in bone marrow (PubMed:25730656). Promotes in vitro osteoblast proliferation and differentiation (PubMed:25730656). Promotes neurite extension, axon growth and axon guidance; axons grow preferentially on surfaces that contain ALCAM (By similarity). Mediates outgrowth and pathfinding for retinal ganglion cell axons (PubMed:15345243).

Keywords - Biological process

Adaptive immunity, Cell adhesion, Immunity

Names & Taxonomy

Protein names
Recommended name:
CD166 antigen
Alternative name(s):
Activated leukocyte cell adhesion molecule
BEN
Protein DM-GRASP
CD_antigen: CD166
Gene names
Name: Alcam
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 16

Organism-specific databases

MGI: MGI:1313266. Alcam.

Subcellular location

Topology

Feature key          Position(s)   Length   Description     Evidence
Topological domain   28 – 527      500      Extracellular   Sequence analysis
Transmembrane        528 – 549     22       Helical         Sequence analysis
Topological domain   550 – 583     34       Cytoplasmic     Sequence analysis
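
The positions in the topology table are 1-based and inclusive, so each length is end - start + 1. The following is a minimal Python sketch of that arithmetic, assuming the signal peptide (1 – 27) listed under Molecule processing and the total length of 583 stated under Sequence; the names `FEATURES`, `feature_length` and `tiles_sequence` are hypothetical and not part of any UniProt tool.

```python
# Features as (name, start, end), using 1-based inclusive coordinates.
# The signal peptide row comes from the "Molecule processing" section.
FEATURES = [
    ("Signal peptide", 1, 27),
    ("Extracellular", 28, 527),
    ("Transmembrane (helical)", 528, 549),
    ("Cytoplasmic", 550, 583),
]
SEQUENCE_LENGTH = 583  # from the "Sequence" section


def feature_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def tiles_sequence(features, total_length):
    """True if the features cover 1..total_length with no gap or overlap."""
    expected_start = 1
    for _name, start, end in sorted(features, key=lambda f: f[1]):
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


if __name__ == "__main__":
    for name, start, end in FEATURES:
        print(f"{name}: {start}-{end}, length {feature_length(start, end)}")
    print("Covers whole sequence:", tiles_sequence(FEATURES, SEQUENCE_LENGTH))
```

Under these assumptions the four segments account for every residue of the precursor exactly once.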

GO - Cellular component

  • axon Source: MGI
  • dendrite Source: UniProtKB-SubCell
  • external side of plasma membrane Source: MGI
  • extracellular exosome Source: MGI
  • focal adhesion Source: MGI
  • immunological synapse Source: UniProtKB
  • integral component of plasma membrane Source: UniProtKB
  • intrinsic component of plasma membrane Source: UniProtKB
  • neuronal cell body Source: MGI
  • T cell receptor complex Source: Ensembl

Keywords - Cellular component

Cell membrane, Cell projection, Membrane

Pathology & Biotech

Disruption phenotype

Mice are born at the expected Mendelian rate, are viable and fertile, and display no obvious external phenotype. Unlike wild-type mice, which have tightly fasciculated and smooth nerve bundles, mutant mice have more loosely bundled nerves with many single axons extending out of the main nerve. Eyes from mutant mice display a variable degree of retinal dysplasia (PubMed:15345243). In addition, lymph nodes from mutant mice display reduced weight and cellularity, but appear otherwise normal (PubMed:23169771). Mutant mice have only half of the normal number of hematopoietic stem cells in their bone marrow (PubMed:24740813, PubMed:25730656). Survival of lethally irradiated mice that receive bone marrow from mutant mice is impaired, due to impaired homotypic cell-cell attachment, impaired engraftment and proliferation of mutant hematopoietic stem cells (PubMed:24740813). Mutant mice are larger and heavier than wild-type and have increased bone mineral density (PubMed:25730656). Mutant spleen has an altered leukocyte composition, with reduced numbers of CD4+ and CD8+ T-cells, B-cells, dendritic cells, neutrophils and macrophages, but no change in the total leukocyte number. Their lungs display reduced numbers of lymph vessel and blood vessel endothelial cells, but no difference in lung weight. Lymph vessels in mesentery and diaphragm are more densely interconnected and show a decreased level of hierarchical vascular organization in mutant mice (PubMed:23169771).

PTM / Processing

Molecule processing

Feature key      Position(s)   Length   Description     Feature identifier   Evidence
Signal peptide   1 – 27        27       -               -                    Sequence analysis
Chain            28 – 583      556      CD166 antigen   PRO_0000014660       -

Amino acid modifications

Feature key      Position(s)   Length   Description                      Evidence
Disulfide bond   43 ↔ 113      -        -                                PROSITE-ProRule annotation
Glycosylation    95            1        N-linked (GlcNAc...)             1 Publication
Disulfide bond   157 ↔ 220     -        -                                PROSITE-ProRule annotation
Glycosylation    167           1        N-linked (GlcNAc...)             1 Publication
Glycosylation    265           1        N-linked (GlcNAc...)             Sequence analysis
Disulfide bond   270 ↔ 313     -        -                                PROSITE-ProRule annotation
Glycosylation    306           1        N-linked (GlcNAc...)             Sequence analysis
Disulfide bond   354 ↔ 392     -        -                                PROSITE-ProRule annotation
Glycosylation    361           1        N-linked (GlcNAc...)             Sequence analysis
Disulfide bond   435 ↔ 485     -        -                                PROSITE-ProRule annotation
Glycosylation    457           1        N-linked (GlcNAc...)             1 Publication
Glycosylation    466           1        N-linked (GlcNAc...); atypical   1 Publication
Glycosylation    480           1        N-linked (GlcNAc...)             Sequence analysis
Glycosylation    496           1        N-linked (GlcNAc...); atypical   1 Publication
Glycosylation    499           1        N-linked (GlcNAc...)             1 Publication
Glycosylation    518           1        N-linked (GlcNAc...); atypical   1 Publication
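
One plausible reading of the "atypical" label is that the site does not sit in the canonical N-linked sequon N-X-S/T (X being any residue except proline). The Python sketch below tests that reading. The tripeptides are read off the sequence shown under Sequence, starting at each annotated asparagine; the names `SITE_TRIPEPTIDES` and `classify_sequon` are hypothetical, and the classification rule is an assumption rather than UniProt's documented procedure.

```python
# Residues at positions n, n+1, n+2 for each annotated glycosylation site,
# read from the sequence displayed in the "Sequence" section.
SITE_TRIPEPTIDES = {
    95: "NYT",
    167: "NIT",
    265: "NIT",
    306: "NAT",
    361: "NAT",
    457: "NQT",
    466: "NGR",
    480: "NVT",
    496: "NSL",
    499: "NVS",
    518: "NRE",
}


def classify_sequon(tripeptide):
    """Return 'typical' for N-X-S/T with X != P, otherwise 'atypical'."""
    if len(tripeptide) != 3 or tripeptide[0] != "N":
        raise ValueError(f"expected a tripeptide starting with N, got {tripeptide!r}")
    middle, third = tripeptide[1], tripeptide[2]
    if middle != "P" and third in ("S", "T"):
        return "typical"
    return "atypical"


if __name__ == "__main__":
    for position, tripeptide in sorted(SITE_TRIPEPTIDES.items()):
        print(position, tripeptide, classify_sequon(tripeptide))
```

Under this rule the sites that fall outside the canonical sequon are 466, 496 and 518, the same three that the table marks as atypical.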

Post-translational modification

Glycosylated (By similarity).

Keywords - PTM

Disulfide bond, Glycoprotein

Proteomic databases

EPD: Q61490.
MaxQB: Q61490.
PaxDb: Q61490.
PRIDE: Q61490.

PTM databases

iPTMnet: Q61490.
PhosphoSite: Q61490.

Expression

Tissue specificity

Detected on brain motor neurons, in differentiating retinal ganglion cells and in adult retina (PubMed:15345243). Detected on leukocytes and on lymphatic endothelial cells (PubMed:23169771). Detected in spleen B cells and T-cells (at protein level) (PubMed:9209500). Detected in adult brain and embryonic spinal cord (PubMed:15345243). Expressed at high levels in the brain and lung, and at lower levels in the liver and kidney, as well as by activated leukocytes (PubMed:9209500).

Gene expression databases

Bgee: Q61490.
CleanEx: MM_ALCAM.
ExpressionAtlas: Q61490. baseline and differential.
Genevisible: Q61490. MM.

Interaction

Subunit structure

Homodimer (By similarity). Interacts (via extracellular domain) with CD6 (via extracellular domain) (PubMed:9209500, PubMed:16914752). Homodimerization and interaction with CD6 involve the same region and cannot occur simultaneously. The affinity for CD6 is much higher than the affinity for self-association. Interacts (via glycosylated extracellular domain) with LGALS1 and LGALS3. Interaction with LGALS1 or LGALS3 inhibits interaction with CD6.

Protein-protein interaction databases

IntAct: Q61490. 1 interaction.
MINT: MINT-4997271.
STRING: 10090.ENSMUSP00000023312.

Structure

3D structure databases

ProteinModelPortal: Q61490.
SMR: Q61490. Positions 28-378.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Length   Description
Domain        28 – 120      93       Ig-like V-type 1
Domain        125 – 234     110      Ig-like V-type 2
Domain        245 – 328     84       Ig-like C2-type 1
Domain        333 – 409     77       Ig-like C2-type 2
Domain        416 – 501     86       Ig-like C2-type 3
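
Comparing the domain boundaries with the disulfide bonds listed under Amino acid modifications suggests that each bond links two cysteines inside a single Ig-like domain. The short Python sketch below checks this by interval lookup; `DOMAINS`, `DISULFIDES`, `domain_of` and `bonds_by_domain` are hypothetical names, and the coordinates are copied from the two tables in this entry.

```python
# Domain boundaries (1-based, inclusive) from "Domains and Repeats".
DOMAINS = [
    ("Ig-like V-type 1", 28, 120),
    ("Ig-like V-type 2", 125, 234),
    ("Ig-like C2-type 1", 245, 328),
    ("Ig-like C2-type 2", 333, 409),
    ("Ig-like C2-type 3", 416, 501),
]

# Cysteine pairs from "Amino acid modifications".
DISULFIDES = [(43, 113), (157, 220), (270, 313), (354, 392), (435, 485)]


def domain_of(position):
    """Return the name of the domain containing a position, or None."""
    for name, start, end in DOMAINS:
        if start <= position <= end:
            return name
    return None


def bonds_by_domain():
    """Map each bond to its domain, or to None if the two ends differ."""
    result = {}
    for first, second in DISULFIDES:
        first_domain = domain_of(first)
        second_domain = domain_of(second)
        result[(first, second)] = first_domain if first_domain == second_domain else None
    return result


if __name__ == "__main__":
    for bond, domain in bonds_by_domain().items():
        print(f"{bond[0]} <-> {bond[1]}: {domain}")
```

With the coordinates given here, the five bonds map one-to-one onto the five domains in order.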

Domain

The CD6 binding site is located in the N-terminal Ig-like domain.

Sequence similarities

Keywords - Domain

Immunoglobulin domain, Repeat, Signal, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: ENOG410IFQ2. Eukaryota.
eggNOG: ENOG410ZWU9. LUCA.
GeneTree: ENSGT00530000063457.
HOGENOM: HOG000070101.
HOVERGEN: HBG050847.
InParanoid: Q61490.
KO: K06547.
OMA: IQWTITG.
OrthoDB: EOG74J97B.
PhylomeDB: Q61490.
TreeFam: TF321859.

Family and domain databases

Gene3D: 2.60.40.10. 5 hits.
InterPro: IPR013162. CD80_C2-set.
InterPro: IPR007110. Ig-like_dom.
InterPro: IPR013783. Ig-like_fold.
InterPro: IPR003599. Ig_sub.
Pfam: PF08205. C2-set_2. 1 hit.
SMART: SM00409. IG. 3 hits.
SUPFAM: SSF48726. SSF48726. 5 hits.
PROSITE: PS50835. IG_LIKE. 4 hits.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

Q61490-1

        10         20         30         40         50
MASKVSPSCR LVFCLLISAA VLRPGLGWYT VNSAYGDTIV MPCRLDVPQN
        60         70         80         90        100
LMFGKWKYEK PDGSPVFIAF RSSTKKSVQY DDVPEYKDRL SLSENYTLSI
       110        120        130        140        150
ANAKISDEKR FVCMLVTEDN VFEAPTLVKV FKQPSKPEIV NKAPFLETDQ
       160        170        180        190        200
LKKLGDCISR DSYPDGNITW YRNGKVLQPV EGEVAILFKK EIDPGTQLYT
       210        220        230        240        250
VTSSLEYKTT RSDIQMPFTC SVTYYGPSGQ KTIYSEQEIF DIYYPTEQVT
       260        270        280        290        300
IQVLPPKNAI KEGDNITLQC LGNGNPPPEE FMFYLPGQPE GIRSSNTYTL
       310        320        330        340        350
TDVRRNATGD YKCSLIDKRN MAASTTITVH YLDLSLNPSG EVTKQIGDTL
       360        370        380        390        400
PVSCTISASR NATVVWMKDN IRLRSSPSFS SLHYQDAGNY VCETALQEVE
       410        420        430        440        450
GLKKRESLTL IVEGKPQIKM TKKTDPSGLS KTIICHVEGF PKPAIHWTIT
       460        470        480        490        500
GSGSVINQTE ESPYINGRYY SKIIISPEEN VTLTCTAENQ LERTVNSLNV
       510        520        530        540        550
SAISIPEHDE ADDISDENRE KVNDQAKLIV GIVVGLLLAA LVAGVVYWLY
       560        570        580
MKKSKTASKH VNKDLGNMEE NKKLEENNHK TEA
Length: 583
Mass (Da): 65,092
Last modified: May 24, 2004 - v3
Checksum: 570AFA8FCAF888F8

Experimental Info

Feature key         Position(s)   Length   Description                                    Evidence
Sequence conflict   8             1        S → C in BAC27159 (PubMed:16141072)            Curated
Sequence conflict   227 – 232     6        PSGQKT → AAGIPA in AAA37528 (PubMed:8089660)   Curated
Sequence conflict   339           1        S → R (PubMed:9209500)                         Curated
Sequence conflict   339           1        S → R (PubMed:8089660)                         Curated
Sequence conflict   451           1        G → S in BAC27159 (PubMed:16141072)            Curated
Sequence conflict   454           1        S → F in AAA37528 (PubMed:8089660)             Curated
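
Each conflict row names the residue(s) present in the displayed sequence and the alternative reported by another submission. The Python sketch below parses the displayed sequence, confirms its length, and checks that the reference residues sit at the stated 1-based positions before applying a substitution. The helper names (`region`, `apply_conflict`) are hypothetical; the sequence text is copied from the block above with its spacing left in place and stripped in code.

```python
# Sequence as displayed in this entry; whitespace is removed below.
RAW_SEQUENCE = """
MASKVSPSCR LVFCLLISAA VLRPGLGWYT VNSAYGDTIV MPCRLDVPQN
LMFGKWKYEK PDGSPVFIAF RSSTKKSVQY DDVPEYKDRL SLSENYTLSI
ANAKISDEKR FVCMLVTEDN VFEAPTLVKV FKQPSKPEIV NKAPFLETDQ
LKKLGDCISR DSYPDGNITW YRNGKVLQPV EGEVAILFKK EIDPGTQLYT
VTSSLEYKTT RSDIQMPFTC SVTYYGPSGQ KTIYSEQEIF DIYYPTEQVT
IQVLPPKNAI KEGDNITLQC LGNGNPPPEE FMFYLPGQPE GIRSSNTYTL
TDVRRNATGD YKCSLIDKRN MAASTTITVH YLDLSLNPSG EVTKQIGDTL
PVSCTISASR NATVVWMKDN IRLRSSPSFS SLHYQDAGNY VCETALQEVE
GLKKRESLTL IVEGKPQIKM TKKTDPSGLS KTIICHVEGF PKPAIHWTIT
GSGSVINQTE ESPYINGRYY SKIIISPEEN VTLTCTAENQ LERTVNSLNV
SAISIPEHDE ADDISDENRE KVNDQAKLIV GIVVGLLLAA LVAGVVYWLY
MKKSKTASKH VNKDLGNMEE NKKLEENNHK TEA
"""
SEQUENCE = "".join(RAW_SEQUENCE.split())

# (start, end, reference residues, alternative residues), 1-based inclusive.
CONFLICTS = [
    (8, 8, "S", "C"),
    (227, 232, "PSGQKT", "AAGIPA"),
    (339, 339, "S", "R"),
    (451, 451, "G", "S"),
    (454, 454, "S", "F"),
]


def region(sequence, start, end):
    """Residues from start to end, both 1-based and inclusive."""
    return sequence[start - 1:end]


def apply_conflict(sequence, start, end, reference, alternative):
    """Return the sequence with the reference residues replaced."""
    if region(sequence, start, end) != reference:
        raise ValueError(f"reference mismatch at {start}-{end}")
    return sequence[:start - 1] + alternative + sequence[end:]


if __name__ == "__main__":
    print("Length:", len(SEQUENCE))
    for start, end, reference, alternative in CONFLICTS:
        variant = apply_conflict(SEQUENCE, start, end, reference, alternative)
        print(start, end, reference, "->", region(variant, start, end))
```

The same `region` helper can pull out any feature in this entry, for example `region(SEQUENCE, 528, 549)` for the transmembrane helix.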

Sequence databases

EMBL / GenBank / DDBJ:
U95030 mRNA. Translation: AAC06342.1.
AK030851 mRNA. Translation: BAC27159.1.
AK031391 mRNA. Translation: BAC27382.1.
BC027280 mRNA. Translation: AAH27280.1.
L25274 mRNA. Translation: AAA37528.1.
CCDS: CCDS37356.1.
RefSeq: NP_033785.1. NM_009655.2.
UniGene: Mm.288282.

Genome annotation databases

Ensembl: ENSMUST00000023312; ENSMUSP00000023312; ENSMUSG00000022636.
GeneID: 11658.
KEGG: mmu:11658.
UCSC: uc007zli.2. mouse.

Cross-references

Organism-specific databases

CTD: 214.
MGI: MGI:1313266. Alcam.

Miscellaneous databases

ChiTaRS: Alcam. mouse.
NextBio: 279279.
PRO: Q61490.

Publications

  1. "Characterization of mouse ALCAM (CD166): the CD6 binding domain is conserved in different homologs and mediates cross-species binding."
    Bowen M.A., Bajorath J., D'Egidio M., Whitney G.S., Palmer D., Kobarg J., Starling G.C., Siadak A.W., Aruffo A.
    Eur. J. Immunol. 27:1469-1478(1997)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], INTERACTION WITH CD6, SUBCELLULAR LOCATION, DOMAIN, TISSUE SPECIFICITY.
    Strain: NFS.
  2. "The transcriptional landscape of the mammalian genome."
    Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N., Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K., Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J., Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R., Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T., Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A., Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B., Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M., Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S., Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E., Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D., Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M., Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H., Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V., Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S., Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H., Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N., Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F., Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G., Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z., Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C., Schoenbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y., Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S., Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K., Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R., van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H., Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M., Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C., Mattick J.S., Hume D.A., Kai C., Sasaki D., Tomaru Y., Fukuda S., Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K., Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M., Kondo S., Konno H., Nakano K., Ninomiya N., Nishio T., Okada M., Plessy C., Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A., Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y.
    Science 309:1559-1563(2005)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Strain: C57BL/6J.
    Tissue: Thymus.
  3. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
  4. "The molecular cloning and characterization of potential chick DM-GRASP homologs in zebrafish and mouse."
    Kanki J.P., Chang S., Kuwada J.Y.
    J. Neurobiol. 25:831-845(1994)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 227-583.
    Strain: BALB/cJ.
    Tissue: Brain.
  5. "Axon fasciculation defects and retinal dysplasias in mice lacking the immunoglobulin superfamily adhesion molecule BEN/ALCAM/SC1."
    Weiner J.A., Koo S.J., Nicolas S., Fraboulet S., Pfaff S.L., Pourquie O., Sanes J.R.
    Mol. Cell. Neurosci. 27:59-69(2004)
    Cited for: FUNCTION, DISRUPTION PHENOTYPE, SUBCELLULAR LOCATION, TISSUE SPECIFICITY.
  6. "CD6 regulates T-cell responses through activation-dependent recruitment of the positive regulator SLP-76."
    Hassan N.J., Simmonds S.J., Clarkson N.G., Hanrahan S., Puklavec M.J., Bomb M., Barclay A.N., Brown M.H.
    Mol. Cell. Biol. 26:6727-6738(2006)
    Cited for: INTERACTION WITH CD6, SUBCELLULAR LOCATION.
  7. "The mouse C2C12 myoblast cell surface N-linked glycoproteome: identification, glycosite occupancy, and membrane orientation."
    Gundry R.L., Raginski K., Tarasova Y., Tchernyshyov I., Bausch-Fluck D., Elliott S.T., Boheler K.R., Van Eyk J.E., Wollscheid B.
    Mol. Cell. Proteomics 8:2555-2569(2009)
    Cited for: GLYCOSYLATION [LARGE SCALE ANALYSIS] AT ASN-95; ASN-167; ASN-457; ASN-466; ASN-496; ASN-499 AND ASN-518.
    Tissue: Myoblast.
  8. Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Brain, Kidney, Liver, Lung, Pancreas, Spleen and Testis.
  9. "Novel role for ALCAM in lymphatic network formation and function."
    Iolyeva M., Karaman S., Willrodt A.H., Weingartner S., Vigl B., Halin C.
    FASEB J. 27:978-990(2013)
    Cited for: FUNCTION, DISRUPTION PHENOTYPE, SUBCELLULAR LOCATION, TISSUE SPECIFICITY.
  10. Cited for: FUNCTION, DISRUPTION PHENOTYPE.
  11. "Activated leukocyte cell adhesion molecule (ALCAM or CD166) modulates bone phenotype and hematopoiesis."
    Hooker R.A., Chitteti B.R., Egan P.H., Cheng Y.H., Himes E.R., Meijome T., Srour E.F., Fuchs R.K., Kacena M.A.
    J. Musculoskelet. Neuronal Interact. 15:83-94(2015)
    Cited for: FUNCTION, DISRUPTION PHENOTYPE.

Entry information

Entry name: CD166_MOUSE
Accession: Primary (citable) accession number: Q61490
Secondary accession number(s): O70136, Q8CDA5, Q8R2T0
Entry history
Integrated into UniProtKB/Swiss-Prot: November 1, 1997
Last sequence update: May 24, 2004
Last modified: May 11, 2016
This is version 134 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
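
Several of the identifiers in this entry (accession, entry name, taxonomic identifier, gene name, sequence version) also appear in the header line of a UniProtKB FASTA export. The Python sketch below parses such a header. The example header is assembled by hand from the fields of this entry and is an assumption about what an export would look like, not a captured download; `HEADER_RE` and `parse_header` are hypothetical names. The PE value of 1 is assumed to correspond to "Experimental evidence at protein level".

```python
import re

# Pattern for the header layout:
# >db|Accession|EntryName ProteinName OS=Organism OX=TaxId [GN=Gene] PE=n SV=n
HEADER_RE = re.compile(
    r"^>(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry_name>\S+)\s+"
    r"(?P<protein_name>.+?)\s+OS=(?P<organism>.+?)\s+OX=(?P<taxon_id>\d+)"
    r"(?:\s+GN=(?P<gene>\S+))?\s+PE=(?P<pe>\d)\s+SV=(?P<sv>\d+)$"
)

# Illustrative header built from this entry's fields (an assumption).
EXAMPLE_HEADER = (
    ">sp|Q61490|CD166_MOUSE CD166 antigen "
    "OS=Mus musculus OX=10090 GN=Alcam PE=1 SV=3"
)


def parse_header(line):
    """Split a FASTA header into its named fields; raise on a bad layout."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("header does not follow the expected layout")
    fields = match.groupdict()
    for key in ("taxon_id", "pe", "sv"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    for key, value in parse_header(EXAMPLE_HEADER).items():
        print(f"{key}: {value}")
```

In this layout `sp` marks a reviewed (Swiss-Prot) entry, and `SV=3` lines up with "version 3 of the sequence" in the entry history.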

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
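
The UniRef90 and UniRef50 descriptions above amount to two thresholds measured against a cluster's seed sequence: a minimum identity and a minimum overlap. The Python sketch below encodes only that membership test. It is not the clustering procedure UniRef uses, the function and constant names are hypothetical, and the identity and overlap fractions are assumed to have been computed elsewhere by an alignment tool.

```python
# Minimum fractional identity to the seed sequence, per cluster level.
MIN_IDENTITY = {"UniRef90": 0.90, "UniRef50": 0.50}
# Minimum fractional overlap with the seed sequence (same for both levels).
MIN_OVERLAP = 0.80


def meets_cluster_criteria(identity, overlap, level):
    """True if a sequence satisfies the stated thresholds for the level.

    identity and overlap are fractions in [0, 1] relative to the seed
    (longest) sequence of the cluster.
    """
    if level not in MIN_IDENTITY:
        raise ValueError(f"unknown level: {level}")
    return identity >= MIN_IDENTITY[level] and overlap >= MIN_OVERLAP


if __name__ == "__main__":
    # A hypothetical sequence that is 93% identical to the seed over 85% of it.
    print(meets_cluster_criteria(0.93, 0.85, "UniRef90"))
    # High identity but too little overlap fails at both levels.
    print(meets_cluster_criteria(0.99, 0.60, "UniRef50"))
```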