Protein

Homeobox protein Nkx-3.1

Gene

NKX3-1

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Transcription factor that preferentially binds the consensus sequence 5'-TAAGT[AG]-3' and can behave as a transcriptional repressor. Plays an important role in normal prostate development, regulating proliferation of the glandular epithelium and the formation of ducts in the prostate. Acts as a tumor suppressor controlling prostate carcinogenesis, as shown by its ability to inhibit the proliferation and invasion activities of PC-3 prostate cancer cells. (1 publication)
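The consensus 5'-TAAGT[AG]-3' maps directly onto a regular expression. The following is a minimal sketch of scanning a DNA string for candidate sites. The function names, the decision to scan both strands, and the test sequence are illustrative assumptions, not part of the entry.

```python
import re

# Consensus from the entry: 5'-TAAGT[AG]-3'.
# A lookahead is used so that overlapping sites are all reported.
MOTIF = re.compile(r"(?=(TAAGT[AG]))")


def reverse_complement(dna: str) -> str:
    """Reverse-complement an upper-case DNA string (non-ACGT symbols are kept as-is)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(dna: str):
    """Return (0-based forward-strand start, strand, site) for each consensus match.

    Scanning the reverse strand as well is an assumption of this sketch.
    """
    dna = dna.upper()
    hits = [(m.start(), "+", m.group(1)) for m in MOTIF.finditer(dna)]
    for m in MOTIF.finditer(reverse_complement(dna)):
        # Convert the reverse-strand start back to forward-strand coordinates.
        start = len(dna) - m.start() - len(m.group(1))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Hypothetical sequence, chosen only to show one hit on each strand.
print(find_sites("GGTAAGTACCCTACTTAGG"))
```

A regular-expression match only flags sequence-level candidates; it says nothing about whether a site is bound in vivo.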

Regions

Feature key    Position(s)    Length    Description
DNA binding    124 – 183      60        Homeobox (PROSITE-ProRule annotation)

GO - Molecular function

  1. androgen receptor activity Source: UniProtKB
  2. core promoter binding Source: UniProtKB
  3. estrogen receptor activity Source: UniProtKB
  4. estrogen receptor binding Source: UniProtKB
  5. histone deacetylase binding Source: UniProtKB
  6. protein kinase activator activity Source: UniProtKB
  7. protein self-association Source: UniProtKB
  8. RNA polymerase II core promoter sequence-specific DNA binding transcription factor activity Source: Ensembl
  9. sequence-specific DNA binding Source: UniProtKB
  10. sequence-specific DNA binding transcription factor activity Source: UniProtKB
  11. transcription factor binding Source: UniProtKB
  12. transcription regulatory region DNA binding Source: UniProtKB
  13. transcription regulatory region sequence-specific DNA binding Source: BHF-UCL

GO - Biological process

  1. activation of cysteine-type endopeptidase activity involved in apoptotic process Source: UniProtKB
  2. androgen receptor signaling pathway Source: UniProtKB
  3. branching involved in prostate gland morphogenesis Source: UniProtKB
  4. branching morphogenesis of an epithelial tube Source: UniProtKB
  5. cellular response to drug Source: UniProtKB
  6. cellular response to hypoxia Source: UniProtKB
  7. cellular response to interleukin-1 Source: UniProtKB
  8. cellular response to steroid hormone stimulus Source: UniProtKB
  9. cellular response to tumor necrosis factor Source: UniProtKB
  10. dorsal aorta development Source: Ensembl
  11. epithelial cell proliferation involved in salivary gland morphogenesis Source: UniProtKB
  12. heart development Source: Ensembl
  13. male gonad development Source: Ensembl
  14. metanephros development Source: Ensembl
  15. mitotic cell cycle arrest Source: UniProtKB
  16. multicellular organismal development Source: ProtInc
  17. negative regulation of cell proliferation Source: UniProtKB
  18. negative regulation of epithelial cell proliferation Source: UniProtKB
  19. negative regulation of epithelial cell proliferation involved in prostate gland development Source: UniProtKB
  20. negative regulation of estrogen receptor binding Source: UniProtKB
  21. negative regulation of gene expression Source: UniProtKB
  22. negative regulation of insulin-like growth factor receptor signaling pathway Source: UniProtKB
  23. negative regulation of mitotic cell cycle Source: UniProtKB
  24. negative regulation of transcription, DNA-templated Source: UniProtKB
  25. pharyngeal system development Source: Ensembl
  26. positive regulation of androgen secretion Source: UniProtKB
  27. positive regulation of apoptotic signaling pathway Source: UniProtKB
  28. positive regulation of cell death Source: UniProtKB
  29. positive regulation of cell division Source: BHF-UCL
  30. positive regulation of cell proliferation Source: UniProtKB
  31. positive regulation of cysteine-type endopeptidase activity involved in apoptotic process Source: UniProtKB
  32. positive regulation of gene expression Source: UniProtKB
  33. positive regulation of intrinsic apoptotic signaling pathway Source: UniProtKB
  34. positive regulation of mitotic cell cycle Source: BHF-UCL
  35. positive regulation of phosphatidylinositol 3-kinase signaling Source: UniProtKB
  36. positive regulation of protein kinase activity Source: GOC
  37. positive regulation of protein phosphorylation Source: UniProtKB
  38. positive regulation of response to DNA damage stimulus Source: UniProtKB
  39. positive regulation of transcription, DNA-templated Source: UniProtKB
  40. positive regulation of transcription from RNA polymerase II promoter Source: UniProtKB
  41. protein kinase B signaling Source: UniProtKB
  42. regulation of transcription, DNA-templated Source: UniProtKB
  43. response to testosterone Source: UniProtKB
  44. salivary gland development Source: UniProtKB
  45. somitogenesis Source: Ensembl
  46. steroid hormone mediated signaling pathway Source: GOC

Keywords - Molecular function

Repressor

Keywords - Biological process

Transcription, Transcription regulation

Keywords - Ligand

DNA-binding

Enzyme and pathway databases

SignaLink: Q99801.

Names & Taxonomy

Protein names
Recommended name: Homeobox protein Nkx-3.1
Alternative name(s): Homeobox protein NK-3 homolog A
Gene names
Name: NKX3-1
Synonyms: NKX3.1, NKX3A
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes: UP000005640. Component: Chromosome 8

Organism-specific databases

HGNC: HGNC:7838. NKX3-1.

Subcellular location

Nucleus (PROSITE-ProRule annotation; 1 publication)

GO - Cellular component

  1. intracellular Source: UniProtKB
  2. nucleus Source: UniProtKB

Keywords - Cellular component

Nucleus

Pathology & Biotech

Keywords - Disease

Tumor suppressor

Organism-specific databases

PharmGKB: PA31645.

PTM / Processing

Molecule processing

Feature key    Position(s)    Length    Description                 Feature identifier
Chain          1 – 234        234       Homeobox protein Nkx-3.1    PRO_0000048945

Post-translational modification

Ubiquitinated by TOPORS; monoubiquitinated at several residues and also polyubiquitinated on single residues. (1 publication)

Keywords - PTM

Ubl conjugation

Proteomic databases

MaxQB: Q99801.
PaxDb: Q99801.
PRIDE: Q99801.

PTM databases

PhosphoSite: Q99801.

Expression

Tissue specificity

Highly expressed in the prostate and, at a lower level, in the testis. (2 publications)

Induction

By androgens and, in the LNCaP cell line, by estrogens. Androgenic control may be lost in prostate cancer cells during tumor progression from an androgen-dependent to an androgen-independent phase. (3 publications)

Gene expression databases

Bgee: Q99801.
CleanEx: HS_NKX3-1.
Genevestigator: Q99801.

Organism-specific databases

HPA: HPA025693.

Interaction

Subunit structure

Interacts with serum response factor (SRF) (by similarity). Interacts with SPDEF. Interacts with WDR77. Interacts with TOPORS, which polyubiquitinates NKX3-1 and induces its proteasomal degradation. (3 publications)

Binary interactions

With    Entry     #Exp.    IntAct
TOP1    P11387    6        EBI-1385894, EBI-876302

Protein-protein interaction databases

BioGrid: 110888. 34 interactions.
IntAct: Q99801. 1 interaction.
STRING: 9606.ENSP00000370253.

Structure

Secondary structure

Feature key    Position(s)    Length    Description
Helix          133 – 145      13        Combined sources
Helix          151 – 160      10        Combined sources
Helix          165 – 178      14        Combined sources
Beta strand    182 – 185      4         Combined sources

3D structure databases

PDBe / RCSB PDB / PDBj:
Entry    Method    Resolution (Å)    Chain    Positions
2L9R     NMR       -                 A        132-189
DisProt: DP00683.
ProteinModelPortal: Q99801.
SMR: Q99801. Positions 134-189.

Family & Domains

Sequence similarities

Belongs to the NK-3 homeobox family. (Curated)
Contains 1 homeobox DNA-binding domain. (PROSITE-ProRule annotation)

Keywords - Domain

Homeobox

Phylogenomic databases

eggNOG: NOG238786.
GeneTree: ENSGT00760000118779.
HOGENOM: HOG000231923.
HOVERGEN: HBG006689.
InParanoid: Q99801.
KO: K09348.
OMA: GAQRQGG.
OrthoDB: EOG7QZG9Z.
PhylomeDB: Q99801.
TreeFam: TF315720.

Family and domain databases

Gene3D: 1.10.10.60. 1 hit.
InterPro: IPR017970. Homeobox_CS.
    IPR001356. Homeobox_dom.
    IPR020479. Homeobox_metazoa.
    IPR009057. Homeodomain-like.
Pfam: PF00046. Homeobox. 1 hit.
PRINTS: PR00024. HOMEOBOX.
SMART: SM00389. HOX. 1 hit.
SUPFAM: SSF46689. SSF46689. 1 hit.
PROSITE: PS00027. HOMEOBOX_1. 1 hit.
    PS50071. HOMEOBOX_2. 1 hit.

Sequences (5)

Sequence status: Complete.

This entry describes 5 isoforms produced by alternative splicing.

Note: Additional isoforms seem to exist.

Isoform 1 (identifier: Q99801-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MLRVPEPRPG EAKAEGAAPP TPSKPLTSFL IQDILRDGAQ RQGGRTSSQR
        60         70         80         90        100
QRDPEPEPEP EPEGGRSRAG AQNDQLSTGP RAAPEEAETL AETEPERHLG
       110        120        130        140        150
SYLLDSENTS GALPRLPQTP KQPQKRSRAA FSHTQVIELE RKFSHQKYLS
       160        170        180        190        200
APERAHLAKN LKLTETQVKI WFQNRRYKTK RKQLSSELGD LEKHSSLPAL
       210        220        230
KEEAFSRASL VSVYNSYPYY PYLYCVGSWS PAFW
Length: 234
Mass (Da): 26,350
Last modified: September 26, 2001 - v2
Checksum: C99A0943E15B2A55
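
Because all positional information in the entry refers to this canonical sequence, it can help to load it into a string and slice it with the entry's 1-based, inclusive coordinates. The sketch below is illustrative only; the helper name `region` is an assumption, not something the entry defines.

```python
# Canonical isoform 1 sequence, copied from the entry in blocks of 10 residues.
RAW = """
MLRVPEPRPG EAKAEGAAPP TPSKPLTSFL IQDILRDGAQ RQGGRTSSQR
QRDPEPEPEP EPEGGRSRAG AQNDQLSTGP RAAPEEAETL AETEPERHLG
SYLLDSENTS GALPRLPQTP KQPQKRSRAA FSHTQVIELE RKFSHQKYLS
APERAHLAKN LKLTETQVKI WFQNRRYKTK RKQLSSELGD LEKHSSLPAL
KEEAFSRASL VSVYNSYPYY PYLYCVGSWS PAFW
"""

# Drop the spaces and line breaks used for display.
SEQ = "".join(RAW.split())


def region(seq: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive positions."""
    return seq[start - 1:end]


# Homeobox DNA-binding region as listed in the entry (124 - 183).
homeobox = region(SEQ, 124, 183)

print(len(SEQ))       # 234, matching the stated length
print(len(homeobox))  # 60, matching the stated feature length
```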

Isoform 2 (identifier: Q99801-2)

Also known as: V2

The sequence of this isoform differs from the canonical sequence as follows:
     8-56: Missing.

Length: 185
Mass (Da): 21,099
Checksum: 5A3A3233AA1C05EB

Isoform 3 (identifier: Q99801-3)

Also known as: V4

The sequence of this isoform differs from the canonical sequence as follows:
     13-87: Missing.

Length: 159
Mass (Da): 18,423
Checksum: AEE13D060B421ED8

Isoform 4 (identifier: Q99801-4)

Also known as: V3

The sequence of this isoform differs from the canonical sequence as follows:
     15-83: Missing.

Length: 165
Mass (Da): 19,049
Checksum: 2B8B0D19FBD83A8D

Isoform 5 (identifier: Q99801-5)

Also known as: V1

The sequence of this isoform differs from the canonical sequence as follows:
     40-83: Missing.

Length: 190
Mass (Da): 21,626
Checksum: 985A34212DCEDA27

Experimental Info

Feature key          Position(s)    Length    Description
Sequence conflict    8 – 8          1         R → W in AAG39737 (PubMed:11137288). Curated
Sequence conflict    85 – 85        1         E → D in AAB38747 (PubMed:9537602). Curated
Sequence conflict    135 – 135      1         Q → R in AAG39738 (PubMed:11137288). Curated
Sequence conflict    196 – 196      1         S → F in AAB38747 (PubMed:9537602). Curated
Sequence conflict    224 – 224      1         Y → H in AAB38747 (PubMed:9537602). Curated
Sequence conflict    234 – 234      1         W → G in AAB68662 (PubMed:9226374). Curated

Natural variant

Feature key        Position(s)    Length    Description                                                                 Feature identifier
Natural variant    52 – 52        1         R → C. Corresponds to variant rs2228013 (dbSNP, Ensembl). 1 publication    VAR_011612
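
Each sequence conflict and the natural variant is a single-residue substitution at a 1-based position of the canonical sequence. As a hedged sketch, the function below applies one substitution and first checks that the reference residue matches. The function name and the tuple layout are assumptions made for illustration.

```python
# Canonical isoform 1 sequence from the entry (display whitespace removed below).
RAW = """
MLRVPEPRPG EAKAEGAAPP TPSKPLTSFL IQDILRDGAQ RQGGRTSSQR
QRDPEPEPEP EPEGGRSRAG AQNDQLSTGP RAAPEEAETL AETEPERHLG
SYLLDSENTS GALPRLPQTP KQPQKRSRAA FSHTQVIELE RKFSHQKYLS
APERAHLAKN LKLTETQVKI WFQNRRYKTK RKQLSSELGD LEKHSSLPAL
KEEAFSRASL VSVYNSYPYY PYLYCVGSWS PAFW
"""
SEQ = "".join(RAW.split())

# (position, reference residue, alternative residue), as listed in the entry.
SUBSTITUTIONS = [
    (8, "R", "W"),    # conflict in AAG39737
    (85, "E", "D"),   # conflict in AAB38747
    (135, "Q", "R"),  # conflict in AAG39738
    (196, "S", "F"),  # conflict in AAB38747
    (224, "Y", "H"),  # conflict in AAB38747
    (234, "W", "G"),  # conflict in AAB68662
    (52, "R", "C"),   # natural variant VAR_011612
]


def apply_substitution(seq: str, pos: int, ref: str, alt: str) -> str:
    """Replace the residue at 1-based position pos, verifying the reference first."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


# Every listed reference residue should agree with the canonical sequence.
for pos, ref, alt in SUBSTITUTIONS:
    variant = apply_substitution(SEQ, pos, ref, alt)
    print(pos, ref, "->", variant[pos - 1])
```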

Alternative sequence

Feature key             Position(s)    Length    Description                             Feature identifier
Alternative sequence    8 – 56         49        Missing in isoform 2. 1 publication    VSP_002230
Alternative sequence    13 – 87        75        Missing in isoform 3. 1 publication    VSP_002231
Alternative sequence    15 – 83        69        Missing in isoform 4. 1 publication    VSP_002232
Alternative sequence    40 – 83        44        Missing in isoform 5. 1 publication    VSP_002233
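
Each alternative isoform is described as the canonical sequence with one contiguous range missing, so the isoform sequences can be derived by deletion. The sketch below assumes 1-based, inclusive ranges as in the table above. The dictionary layout and function name are illustrative, not taken from the entry.

```python
# Canonical isoform 1 sequence from the entry (display whitespace removed below).
RAW = """
MLRVPEPRPG EAKAEGAAPP TPSKPLTSFL IQDILRDGAQ RQGGRTSSQR
QRDPEPEPEP EPEGGRSRAG AQNDQLSTGP RAAPEEAETL AETEPERHLG
SYLLDSENTS GALPRLPQTP KQPQKRSRAA FSHTQVIELE RKFSHQKYLS
APERAHLAKN LKLTETQVKI WFQNRRYKTK RKQLSSELGD LEKHSSLPAL
KEEAFSRASL VSVYNSYPYY PYLYCVGSWS PAFW
"""
SEQ = "".join(RAW.split())

# Missing ranges per isoform, as listed in the entry.
MISSING = {
    "Q99801-2": (8, 56),
    "Q99801-3": (13, 87),
    "Q99801-4": (15, 83),
    "Q99801-5": (40, 83),
}


def delete_range(seq: str, start: int, end: int) -> str:
    """Remove residues start..end (1-based, inclusive) from seq."""
    return seq[:start - 1] + seq[end:]


ISOFORMS = {name: delete_range(SEQ, s, e) for name, (s, e) in MISSING.items()}

for name, iso in ISOFORMS.items():
    print(name, len(iso))
```

The resulting lengths (185, 159, 165 and 190) agree with the lengths stated for isoforms 2 to 5.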

Sequence databases

EMBL / GenBank / DDBJ:
U91540 mRNA. Translation: AAB68662.1.
U80669 mRNA. Translation: AAB38747.1.
AF247704 mRNA. Translation: AAG09781.1.
AF249669 mRNA. Translation: AAG39735.1.
AF249670 mRNA. Translation: AAG39736.1.
AF249671 mRNA. Translation: AAG39737.1.
AF249672 mRNA. Translation: AAG39738.1.
BC074863 mRNA. Translation: AAH74863.1.
BC074864 mRNA. Translation: AAH74864.1.
CCDS: CCDS59095.1. [Q99801-3]
    CCDS6042.1. [Q99801-1]
RefSeq: NP_001243268.1. NM_001256339.1. [Q99801-3]
    NP_006158.2. NM_006167.3. [Q99801-1]
UniGene: Hs.55999.

Genome annotation databases

Ensembl: ENST00000380871; ENSP00000370253; ENSG00000167034. [Q99801-1]
    ENST00000523261; ENSP00000429729; ENSG00000167034. [Q99801-3]
GeneID: 4824.
KEGG: hsa:4824.
UCSC: uc011kzx.2. human. [Q99801-1]
    uc031tao.1. human. [Q99801-3]

Polymorphism databases

DMDM: 17377578.

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Web resources

Atlas of Genetics and Cytogenetics in Oncology and Haematology

Protocols and materials databases

DNASU: 4824.

Organism-specific databases

CTD: 4824.
GeneCards: GC08M023536.
HGNC: HGNC:7838. NKX3-1.
HPA: HPA025693.
MIM: 602041. gene.
neXtProt: NX_Q99801.
PharmGKB: PA31645.

Miscellaneous databases

ChiTaRS: NKX3-1. human.
GeneWiki: NKX3-1.
GenomeRNAi: 4824.
NextBio: 18580.
PRO: Q99801.

Publications

  1. "A novel human prostate-specific, androgen-regulated homeobox gene (NKX3.1) that maps to 8p21, a region frequently deleted in prostate cancer."
    He W.-W., Sciavolino P.J., Wing J., Augustus M., Hudson P., Meissner P.S., Curtis R.T., Shell B.K., Bostwick D.G., Tindall D.J., Gelmann E.P., Abate-Shen C., Carter K.C.
    Genomics 43:69-77(1997)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1), TISSUE SPECIFICITY, INDUCTION BY ANDROGENS.
    Tissue: Prostate.
  2. "Isolation and androgen regulation of the human homeobox cDNA, NKX3.1."
    Prescott J.L., Blok L., Tindall D.J.
    Prostate 35:71-80(1998)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1), TISSUE SPECIFICITY, INDUCTION BY ANDROGENS.
  3. "Full-length cDNA sequence and genomic organization of human NKX3A --alternative mRNA forms and regulation by both androgens and estrogens."
    Korkmaz K.S., Korkmaz C.G., Ragnhildstveit E., Kizildag S., Pretlow T.G., Saatcioglu F.
    Gene 260:25-36(2000)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS 1; 2; 3; 4 AND 5), SUBCELLULAR LOCATION, INDUCTION BY ESTROGENS.
    Tissue: Prostate.
  4. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
    Tissue: Lung.
  5. "NKX-3.1 interacts with prostate-derived Ets factor and regulates the activity of the PSA promoter."
    Chen H., Nandi A.K., Li X., Bieberich C.J.
    Cancer Res. 62:338-340(2002)
    Cited for: INTERACTION WITH SPDEF.
  6. "Purification and identification of a novel complex which is involved in androgen receptor-dependent transcription."
    Hosohata K., Li P., Hosohata Y., Qin J., Roeder R.G., Wang Z.
    Mol. Cell. Biol. 23:7019-7029(2003)
    Cited for: INTERACTION WITH WDR77.
  7. "Ubiquitination by TOPORS regulates the prostate tumor suppressor NKX3.1."
    Guan B., Pungaliya P., Li X., Uquillas C., Mutton L.N., Rubin E.H., Bieberich C.J.
    J. Biol. Chem. 283:4834-4840(2008)
    Cited for: INTERACTION WITH TOPORS, UBIQUITINATION BY TOPORS.
  8. "Gene expression profiles in the PC-3 human prostate cancer cells induced by NKX3.1."
    Zhang P., Liu W., Zhang J., Guan H., Chen W., Cui X., Liu Q., Jiang A.
    Mol. Biol. Rep. 37:1505-1512(2010)
    Cited for: FUNCTION AS TUMOR SUPPRESSOR.
  9. "Solution NMR structure of homeobox domain of homeobox protein NKX-3.1 from Homo sapiens, Northeast structural genomics consortium target HR6470A."
    Northeast structural genomics consortium (NESG)
    Submitted (MAR-2011) to the PDB data bank
    Cited for: STRUCTURE BY NMR OF 132-189.
  10. "Coding region of NKX3.1, a prostate-specific homeobox gene on 8p21, is not mutated in human prostate cancers."
    Voeller H.J., Augustus M., Madike V., Bova G.S., Carter K.C., Gelmann E.P.
    Cancer Res. 57:4455-4459(1997)
    Cited for: VARIANT CYS-52.
  11. Erratum
    Voeller H.J., Augustus M., Madike V., Bova G.S., Carter K.C., Gelmann E.P.
    Cancer Res. 57:5613-5613(1997)

Entry information

Entry name: NKX31_HUMAN
Accession: Primary (citable) accession number: Q99801
Secondary accession number(s): O15465, Q9H2P4, Q9H2P5, Q9H2P6, Q9H2P7, Q9HBG0
Entry history
Integrated into UniProtKB/Swiss-Prot: July 15, 1998
Last sequence update: September 26, 2001
Last modified: January 7, 2015
This is version 138 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 8
    Human chromosome 8: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
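
The UniRef90 and UniRef50 criteria above reduce to two thresholds: a minimum sequence identity and a minimum overlap with the longest (seed) sequence. The sketch below only encodes that membership test. It assumes identity and overlap have already been computed elsewhere as fractions between 0 and 1, and the names used are hypothetical. It is not UniRef's clustering procedure.

```python
# (minimum identity, minimum overlap) per cluster level, as described above.
# UniRef100 is omitted because it merges identical sequences and sub-fragments
# rather than applying an identity cutoff.
THRESHOLDS = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def joins_cluster(level: str, identity: float, overlap: float) -> bool:
    """Return True if a sequence meets the stated criteria relative to the seed.

    identity and overlap are fractions (0..1) measured against the longest
    sequence of the cluster; how they are computed is outside this sketch.
    """
    min_identity, min_overlap = THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


print(joins_cluster("UniRef90", identity=0.93, overlap=0.85))  # True
print(joins_cluster("UniRef90", identity=0.93, overlap=0.70))  # False
```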