Protein

Protein teashirt

Gene

tsh

Organism
Drosophila melanogaster (Fruit fly)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Homeotic protein that acts downstream of Arm in the Wg cascade during embryogenesis to determine segment identity throughout the entire trunk. Acts cooperatively with other trunk homeotic proteins to repress head homeotic genes and therefore repress head segmental identity. Necessary, in combination with Scr, for the formation of the prothoracic segment. Promotes eye development in the dorsal region of the eye disk and suppresses eye development in the ventral region in combination with Wg-signaling and several early dorso-ventral eye patterning genes. Required for proper development of proximal leg segments. Has differential functions along the dorso-ventral axis of the antennal and leg disks. May play a role in wing hinge development. Possible involvement in chromatin structure for modulation of transcription. Binds DNA and can act as both a transcriptional repressor and activator. Positively regulates its own expression as well as that of Dll. Negatively regulates the expression of mod. Required for Wg-mediated transcriptional repression of Ubx in the midgut. Also represses transcription of lab in the midgut and is necessary for the proper formation of anterior and central midgut structures. Tiptop (tio) and teashirt (tsh) have, on the whole, common activities. Tio and tsh repress each other's expression and tsh has a crucial role for trunk patterning that is in part masked by ectopic expression of tiptop. Both genes share a common activity required for the activation of Ser and svb and the maintenance of en and wg. (8 publications)

Regions

Feature key    Position(s)    Length    Description
Zinc finger    354 – 378      25        C2H2-type 1 (PROSITE-ProRule annotation)
Zinc finger    466 – 490      25        C2H2-type 2 (PROSITE-ProRule annotation)
Zinc finger    533 – 557      25        C2H2-type 3 (PROSITE-ProRule annotation)
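
The three annotated zinc-finger ranges can be checked against the canonical sequence given in this entry. The Python sketch below is only an illustration: the three 25-residue strings were sliced by hand from that sequence at the listed positions, and the regular expression is a simplified Cys2-His2 spacing (Cys, 2 to 4 residues, Cys, 12 residues, His, 3 to 5 residues, His) assumed for this example. It is not the PROSITE rule that produced the annotation.

```python
import re

# Residues at the annotated positions, copied from the sequence in this entry.
ZINC_FINGERS = {
    "C2H2-type 1 (354-378)": "FRCVWCKQSFPTLEALTTHMKDSKH",
    "C2H2-type 2 (466-490)": "LKCMRCGESFRSLGEMTKHMQETQH",
    "C2H2-type 3 (533-557)": "LTCKVCDKAFNSLGDLSNHMAKNNH",
}

# Simplified C2H2 spacing: an assumption for illustration, not the PROSITE rule.
C2H2_LIKE = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def looks_like_c2h2(region: str) -> bool:
    """Return True if the region contains the simplified Cys2-His2 spacing."""
    return C2H2_LIKE.search(region) is not None


for name, region in ZINC_FINGERS.items():
    print(name, len(region), looks_like_c2h2(region))
```

Each region is 25 residues long, matching the Length column, and each contains two cysteines followed by two histidines at the assumed spacing.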

GO - Molecular function

  • DNA binding Source: UniProtKB
  • metal ion binding Source: UniProtKB-KW
  • transcriptional repressor activity, RNA polymerase II core promoter proximal region sequence-specific binding Source: FlyBase

GO - Biological process

  • compound eye development Source: FlyBase
  • dorsal/ventral pattern formation, imaginal disc Source: UniProtKB
  • epidermis morphogenesis Source: FlyBase
  • eye-antennal disc development Source: UniProtKB
  • head involution Source: FlyBase
  • imaginal disc-derived leg morphogenesis Source: FlyBase
  • imaginal disc-derived wing morphogenesis Source: FlyBase
  • leg disc proximal/distal pattern formation Source: UniProtKB
  • Malpighian tubule stellate cell differentiation Source: FlyBase
  • midgut development Source: UniProtKB
  • negative regulation of salivary gland boundary specification Source: FlyBase
  • negative regulation of transcription, DNA-templated Source: UniProtKB
  • negative regulation of transcription from RNA polymerase II promoter Source: UniProtKB
  • regulation of glucose metabolic process Source: FlyBase
  • regulation of transcription from RNA polymerase II promoter Source: FlyBase
  • salivary gland development Source: FlyBase
  • specification of segmental identity, abdomen Source: FlyBase
  • specification of segmental identity, head Source: FlyBase
  • specification of segmental identity, thorax Source: FlyBase
  • specification of segmental identity, trunk Source: UniProtKB
  • transcription, DNA-templated Source: UniProtKB-KW
  • wing disc proximal/distal pattern formation Source: UniProtKB
  • Wnt signaling pathway Source: UniProtKB

Keywords - Molecular function

Activator, Developmental protein, Repressor

Keywords - Biological process

Transcription, Transcription regulation, Wnt signaling pathway

Keywords - Ligand

DNA-binding, Metal-binding, Zinc

Enzyme and pathway databases

SignaLink: P22265.

Names & Taxonomy

Protein names
Recommended name: Protein teashirt
Gene names
Name: tsh
ORF Names: CG1374
Organism: Drosophila melanogaster (Fruit fly)
Taxonomic identifier: 7227 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Ecdysozoa > Arthropoda > Hexapoda > Insecta > Pterygota > Neoptera > Endopterygota > Diptera > Brachycera > Muscomorpha > Ephydroidea > Drosophilidae > Drosophila > Sophophora
Proteomes
  • UP000000803 Component: Chromosome 2L

Organism-specific databases

FlyBase: FBgn0003866. tsh.

Subcellular location

  • Nucleus
  • Cytoplasm

  • Note: Initially localized in the cytoplasm soon after the blastoderm stage, and becomes nuclear by stage 9.

GO - Cellular component

  • cytoplasm Source: UniProtKB
  • nucleus Source: UniProtKB

Keywords - Cellular component

Cytoplasm, Nucleus

PTM / Processing

Molecule processing

Feature key    Position(s)    Length    Description         Feature identifier
Chain          1 – 954        954       Protein teashirt    PRO_0000047061

Amino acid modifications

Feature key         Position(s)    Length    Description
Modified residue    750 – 750      1         Phosphoserine (1 publication)
Modified residue    758 – 758      1         Phosphoserine (1 publication)
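
Feature positions in this entry are 1-based, so a quick lookup can confirm that both modified positions are serines. The sketch below is a minimal Python illustration; the helper name `residue_at` is hypothetical, and the 20-residue fragment (positions 741 to 760) was copied by hand from the sequence given in this entry.

```python
def residue_at(fragment: str, fragment_start: int, position: int) -> str:
    """Return the residue at a 1-based sequence position.

    `fragment` is a slice of the full sequence whose first residue sits at
    the 1-based position `fragment_start`.
    """
    index = position - fragment_start
    if not 0 <= index < len(fragment):
        raise IndexError(f"position {position} is outside the fragment")
    return fragment[index]


# Residues 741-760, copied from the sequence in this entry.
FRAGMENT_741_760 = "YYQHYRYTSSERSGSECSAE"

for position in (750, 758):
    # Both annotated phosphosites are serine (S) in this fragment.
    print(position, residue_at(FRAGMENT_741_760, 741, position))
```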

Keywords - PTM

Phosphoprotein

Proteomic databases

PaxDb: P22265.
PRIDE: P22265.

PTM databases

iPTMnet: P22265.

Expression

Tissue specificity

Shows a dynamic expression pattern during embryogenesis. Expressed in the embryonic trunk region (PS 3-13) with expression strongest in the thoracic segments. Expressed in a small group of cells corresponding to the anal tuft from stage 14. Strongly expressed in the embryonic ventral nerve cord. Also expressed in the proximal domain of the leg imaginal disk and in the region of the wing disk that will give rise to the proximal wing hinge. Expressed at high levels in the anterior and central embryonic midgut mesoderm and in the embryonic midgut endoderm. Expressed at a low level in more posterior visceral mesoderm of the gut. From stage 12 onwards, tsh and tio are colocalized in some cells of the CNS, trunk epidermis, hindgut and Malpighian tubules. (6 publications)

Developmental stage

Expressed throughout embryonic, larval and adult development. Not maternally expressed.

Gene expression databases

Bgee: P22265.
Genevisible: P22265. DM.

Interaction

Subunit structure

Binds arm.

Protein-protein interaction databases

BioGrid: 61373. 14 interactions.
IntAct: P22265. 1 interaction.
MINT: MINT-787554.
STRING: 7227.FBpp0085261.

Structure

3D structure databases

ProteinModelPortal: P22265.
SMR: P22265. Positions 466-567.

Family & Domains

Compositional bias

Feature key           Position(s)    Length    Description
Compositional bias    104 – 107      4         Poly-Ala
Compositional bias    115 – 122      8         Poly-Ala
Compositional bias    175 – 180      6         Poly-Glu
Compositional bias    401 – 407      7         Poly-Pro
Compositional bias    831 – 835      5         Poly-Asn
Compositional bias    919 – 924      6         Poly-Ala

Sequence similarities

Contains 3 C2H2-type zinc fingers (PROSITE-ProRule annotation).

Keywords - Domain

Repeat, Zinc-finger

Phylogenomic databases

eggNOG: ENOG410IJ7P. Eukaryota.
  ENOG4110HP7. LUCA.
GeneTree: ENSGT00730000113156.
InParanoid: P22265.
KO: K09236.
OMA: ERCPSHD.
OrthoDB: EOG74N5G4.
PhylomeDB: P22265.

Family and domain databases

InterPro: IPR027008. Teashirt_fam.
  IPR026807. Tio/Tsh.
  IPR007087. Znf_C2H2.
  IPR015880. Znf_C2H2-like.
PANTHER: PTHR12487. PTHR12487. 2 hits.
  PTHR12487:SF7. PTHR12487:SF7. 2 hits.
Pfam: PF12756. zf-C2H2_2. 1 hit.
  PF13912. zf-C2H2_6. 1 hit.
SMART: SM00355. ZnF_C2H2. 3 hits.
PROSITE: PS00028. ZINC_FINGER_C2H2_1. 3 hits.
  PS50157. ZINC_FINGER_C2H2_2. 3 hits.

Sequence

Sequence status: Complete.

P22265-1 [UniParc]

        10         20         30         40         50
MLHEALMLEI YRQALNAGAL PTARPRSTES ANSSERCPSH DSNSSEHGGG
60 70 80 90 100
AGSGGVGHRL DAAALSTGVM PGEGPTTLHS SFPAVPQSLP SQPPSMEAYL
110 120 130 140 150
HMVAAAAQQY GFPLAAAAAA GAGPRLPLPL ANEAAAPFKL PPQASPTASS
160 170 180 190 200
NNSEALDFRT NLYGRAESAE PPASEGEEEE FDDGANNPLD LSVGTRKRGH
210 220 230 240 250
ESEPQLGHIQ VKKMFKSDSP PANSVASPSA SQLLPGVNPY LAAVAAANIF
260 270 280 290 300
RAGQFPDWNS KNDLVVDPLE KMSDIVKGGA SGMGTKEKMH SSKATTPQAA
310 320 330 340 350
SQPPKSPVQP TPNQNSESGG GSGGGAAGSG AVTKARHNIW QSHWQNKGVA
360 370 380 390 400
SSVFRCVWCK QSFPTLEALT THMKDSKHCG VNVPPFGNLP SNNPQPQHHH
410 420 430 440 450
PTPPPPPQNH NLRKHSSGSA SNHSPSANVK NAFQYRGDPP TPLPRKLVRG
460 470 480 490 500
QNVWLGKGVE QAMQILKCMR CGESFRSLGE MTKHMQETQH YTNILSQEQS
510 520 530 540 550
ISIKSGNANA NSDAKESHNS LSSEESRTLS AVLTCKVCDK AFNSLGDLSN
560 570 580 590 600
HMAKNNHYAE PLLQSAGARK RPAPKKREKS LPVRKLLEMK GGSGTTQEDH
610 620 630 640 650
SNEKTSVQGK PGLGPGGGDK NDAALFAERM RQYITGVKAP EEIAKVAAAQ
660 670 680 690 700
LLAKNKSPEL VEQKNGGSAK AAGASSVLSA IEQMFTTSFD TPPRHASLPA
710 720 730 740 750
SSPSNSSTKN TSPVASSILK RLGIDETVDY NKPLIDTNDP YYQHYRYTSS
760 770 780 790 800
ERSGSECSAE ARPRLDAPTP EKQQQGGGHD EESSKPAIKQ EREAESKPVK
810 820 830 840 850
MEIKSEFVDE PNEAEETSKM EAAVVNGSAT NNNNNIVERS SPKTPSSAAS
860 870 880 890 900
PQTRLLPPRS PAESQRSVTP KSPASSHKSY DGSSEGTKKF PSDSLNALSS
910 920 930 940 950
MFDSLGSSGA GANSRAKLAA AAAAGGSESP ENLTAGGNSL AALRQFCVKK

EKTA
Length: 954
Mass (Da): 100,651
Last modified: August 16, 2005 - v2
Checksum: 33FB3677DC8C4B33
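
The sequence display above mixes position rulers with residues in groups of ten, so it has to be flattened before any downstream use. The following Python sketch shows one way to do that; the function name `parse_sequence_display` is hypothetical, and only the first two rows of the display are embedded here to keep the example short.

```python
def parse_sequence_display(text: str) -> str:
    """Join the residue groups of a numbered sequence display into one string."""
    residues = []
    for line in text.splitlines():
        for token in line.split():
            if token.isdigit():
                # Ruler entries such as 10, 20, ... carry no residues.
                continue
            if not token.isalpha():
                raise ValueError(f"unexpected token in sequence display: {token!r}")
            residues.append(token.upper())
    return "".join(residues)


# First two rows of the display in this entry (positions 1-100).
DISPLAY = """\
        10         20         30         40         50
MLHEALMLEI YRQALNAGAL PTARPRSTES ANSSERCPSH DSNSSEHGGG
60 70 80 90 100
AGSGGVGHRL DAAALSTGVM PGEGPTTLHS SFPAVPQSLP SQPPSMEAYL
"""

sequence = parse_sequence_display(DISPLAY)
print(len(sequence))
print(sequence[:20])
```

Applied to the full display, the flattened string should have the stated length of 954 residues, which gives a simple check that no row was lost in copying.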

Sequence caution

The sequence AAA28983.1 differs from that shown. Reason: Frameshift at position 920 (curated).
The sequence AAT94430.1 differs from that shown. Reason: Frameshift at position 258 (curated).

Experimental Info

Feature key          Position(s)    Length    Description
Sequence conflict    668 – 673      6         SAKAAG → QQRLR in AAA28983 (PubMed:1846092). Curated.
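
To see what the conflicting submission looks like at this site, the substitution can be applied to the canonical residues. The Python sketch below is illustrative only: `apply_conflict` is a hypothetical helper, and the fragment (positions 661 to 680) was copied by hand from the sequence given in this entry. The helper first checks that the canonical residues at the stated range really are the expected ones, which guards against off-by-one errors in 1-based coordinates.

```python
def apply_conflict(fragment: str, fragment_start: int, start: int, end: int,
                   expected: str, replacement: str) -> str:
    """Replace the 1-based inclusive range start..end with `replacement`.

    Raises ValueError if the canonical residues in that range are not `expected`.
    """
    lo = start - fragment_start
    hi = end - fragment_start + 1
    if lo < 0 or hi > len(fragment):
        raise IndexError("range falls outside the fragment")
    if fragment[lo:hi] != expected:
        raise ValueError(f"expected {expected!r}, found {fragment[lo:hi]!r}")
    return fragment[:lo] + replacement + fragment[hi:]


# Residues 661-680, copied from the sequence in this entry.
FRAGMENT_661_680 = "VEQKNGGSAKAAGASSVLSA"

variant = apply_conflict(FRAGMENT_661_680, 661, 668, 673, "SAKAAG", "QQRLR")
print(variant)
```

The replacement is one residue shorter than the canonical segment, so every position after 673 shifts by one in the variant.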

Sequence databases

EMBL / GenBank / DDBJ:
M57496 mRNA. Translation: AAA28983.1. Frameshift.
AE014134 Genomic DNA. Translation: AAF57236.2.
BT015201 mRNA. Translation: AAT94430.1. Frameshift.
BT030762 mRNA. Translation: ABV82144.1.
PIR: A38437.
RefSeq: NP_523615.2. NM_078891.3.
UniGene: Dm.6988.

Genome annotation databases

EnsemblMetazoa: FBtr0085906; FBpp0085261; FBgn0003866.
GeneID: 35430.
KEGG: dme:Dmel_CG1374.

Cross-references

Organism-specific databases

CTD: 35430.
FlyBase: FBgn0003866. tsh.

Miscellaneous databases

GenomeRNAi: 35430.
PRO: P22265.

Publications

  1. "The gene teashirt is required for the development of Drosophila embryonic trunk segments and encodes a protein with widely spaced zinc finger motifs."
    Fasano L., Roeder L., Core N., Alexandre E., Vola C., Jacq B., Kerridge S.
    Cell 64:63-79(1991) [PubMed] [Europe PMC] [Abstract]
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], FUNCTION.
  2. "The genome sequence of Drosophila melanogaster."
    Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G., Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D., Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M., Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J., Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., Kimmel B.E., Kodira C.D., Kraft C.L., Kravitz S., Kulp D., Lai Z., Lasko P., Lei Y., Levitsky A.A., Li J.H., Li Z., Liang Y., Lin X., Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H., Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.J., Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S.C., Zhu X., Smith H.O., Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.
    Science 287:2185-2195(2000) [PubMed] [Europe PMC] [Abstract]
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: Berkeley.
  3. Cited for: GENOME REANNOTATION.
    Strain: Berkeley.
  4. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Strain: Berkeley.
    Tissue: Embryo.
  5. "The role of the teashirt gene in trunk segmental identity in Drosophila."
    Roeder L., Vola C., Kerridge S.
    Development 115:1017-1033(1992) [PubMed] [Europe PMC] [Abstract]
    Cited for: FUNCTION, TISSUE SPECIFICITY.
  6. "Homeotic complex and teashirt genes co-operate to establish trunk segmental identities in Drosophila."
    de Zulueta P., Alexandre E., Jacq B., Kerridge S.
    Development 120:2287-2296(1994) [PubMed] [Europe PMC] [Abstract]
    Cited for: FUNCTION.
  7. "Role of the teashirt gene in Drosophila midgut morphogenesis: secreted proteins mediate the action of homeotic genes."
    Mathies L.D., Kerridge S., Scott M.P.
    Development 120:2799-2809(1994) [PubMed] [Europe PMC] [Abstract]
    Cited for: FUNCTION, TISSUE SPECIFICITY.
  8. "The Drosophila teashirt homeotic protein is a DNA-binding protein and modulo, a HOM-C regulated modifier of variegation, is a likely candidate for being a direct target gene."
    Alexandre E., Graba Y., Fasano L., Gallet A., Perrin L., De Zulueta P., Pradel J., Kerridge S., Jacq B.
    Mech. Dev. 59:191-204(1996) [PubMed] [Europe PMC] [Abstract]
    Cited for: DNA-BINDING, SUBCELLULAR LOCATION, TISSUE SPECIFICITY.
  9. "Trunk-specific modulation of wingless signalling in Drosophila by teashirt binding to armadillo."
    Gallet A., Erkner A., Charroux B., Fasano L., Kerridge S.
    Curr. Biol. 8:893-902(1998) [PubMed] [Europe PMC] [Abstract]
    Cited for: FUNCTION, INTERACTION WITH ARM, SUBCELLULAR LOCATION.
  10. "Proximal distal axis formation in the Drosophila leg: distinct functions of teashirt and homothorax in the proximal leg."
    Wu J., Cohen S.M.
    Mech. Dev. 94:47-56(2000) [PubMed] [Europe PMC] [Abstract]
    Cited for: FUNCTION, TISSUE SPECIFICITY.
  11. "Identification of a regulatory allele of teashirt (tsh) in Drosophila melanogaster that affects wing hinge development. An adult-specific tsh enhancer in Drosophila."
    Soanes K.H., MacKay J.O., Core N., Heslip T., Kerridge S., Bell J.B.
    Mech. Dev. 105:145-151(2001) [PubMed] [Europe PMC] [Abstract]
    Cited for: POSSIBLE FUNCTION, TISSUE SPECIFICITY.
  12. "Dorso-ventral asymmetric functions of teashirt in Drosophila eye development depend on spatial cues provided by early DV patterning genes."
    Singh A., Kango-Singh M., Choi K.W., Sun Y.H.
    Mech. Dev. 121:365-370(2004) [PubMed] [Europe PMC] [Abstract]
    Cited for: FUNCTION.
  13. "A critical role of teashirt for patterning the ventral epidermis is masked by ectopic expression of tiptop, a paralog of teashirt in Drosophila."
    Laugier E., Yang Z., Fasano L., Kerridge S., Vola C.
    Dev. Biol. 283:446-458(2005) [PubMed] [Europe PMC] [Abstract]
    Cited for: FUNCTION, TISSUE SPECIFICITY.
  14. "Phosphoproteome analysis of Drosophila melanogaster embryos."
    Zhai B., Villen J., Beausoleil S.A., Mintseris J., Gygi S.P.
    J. Proteome Res. 7:1675-1682(2008) [PubMed] [Europe PMC] [Abstract]
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-750 AND SER-758, IDENTIFICATION BY MASS SPECTROMETRY.
    Tissue: Embryo.

Entry information

Entry name: TSH_DROME
Accession: Primary (citable) accession number: P22265
Secondary accession number(s): A8E6H0, Q6AWP7, Q7KRS3, Q9V9Q0
Entry history
Integrated into UniProtKB/Swiss-Prot: August 1, 1991
Last sequence update: August 16, 2005
Last modified: June 8, 2016
This is version 147 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Drosophila annotation project

Miscellaneous

The tsh and tio gene pair seems to have arisen from a recent duplication event; tsh has the dominant role compared to tio.

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Drosophila
    Drosophila: entries, gene names and cross-references to FlyBase
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
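
The two thresholds in the UniRef90 and UniRef50 descriptions (sequence identity and overlap with the seed) can be made concrete with a toy check. The Python sketch below is not how UniRef is actually built: it assumes the two sequences have already been aligned (with '-' for gaps), defines identity as matches over aligned columns and overlap as aligned columns over seed length, and uses hypothetical function names. Real clustering tools may define both quantities differently.

```python
def identity_and_overlap(aligned_query: str, aligned_seed: str) -> tuple[float, float]:
    """Compute (identity, overlap) for two rows of a pairwise alignment.

    Assumed definitions, for illustration only:
      identity = identical columns / columns where both rows have a residue
      overlap  = columns where both rows have a residue / seed length
    """
    if len(aligned_query) != len(aligned_seed):
        raise ValueError("alignment rows must have equal length")
    paired = [(q, s) for q, s in zip(aligned_query, aligned_seed)
              if q != "-" and s != "-"]
    seed_length = sum(1 for s in aligned_seed if s != "-")
    identity = sum(q == s for q, s in paired) / len(paired) if paired else 0.0
    overlap = len(paired) / seed_length if seed_length else 0.0
    return identity, overlap


def joins_cluster(aligned_query: str, aligned_seed: str,
                  min_identity: float = 0.9, min_overlap: float = 0.8) -> bool:
    """Return True if the query meets both thresholds against the seed."""
    identity, overlap = identity_and_overlap(aligned_query, aligned_seed)
    return identity >= min_identity and overlap >= min_overlap


# Made-up ten-residue seed and a query missing the last residue.
print(identity_and_overlap("ACDEFGHIK-", "ACDEFGHIKL"))
print(joins_cluster("ACDEFGHIK-", "ACDEFGHIKL"))
```

Passing `min_identity=0.5` gives the corresponding toy check for the 50% level.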