Protein

Serine/threonine-protein phosphatase 6 regulatory ankyrin repeat subunit C

Gene

ANKRD52

Organism
Homo sapiens (Human)
Status
Reviewed; annotation score: 3 out of 5; experimental evidence at protein level

Function

Putative regulatory subunit of protein phosphatase 6 (PP6) that may be involved in the recognition of phosphoprotein substrates.

Names & Taxonomy

Protein names
Recommended name:
Serine/threonine-protein phosphatase 6 regulatory ankyrin repeat subunit C
Short name:
PP6-ARS-C
Short name:
Serine/threonine-protein phosphatase 6 regulatory subunit ARS-C
Alternative name(s):
Ankyrin repeat domain-containing protein 52
Gene names
Name: ANKRD52
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 12

Organism-specific databases

HGNC: HGNC:26614. ANKRD52.

Pathology & Biotech

Organism-specific databases

PharmGKB: PA143485302.

Polymorphism and mutation databases

BioMuta: ANKRD52.
DMDM: 296439443.

PTM / Processing

Molecule processing

Feature key   Position(s)   Length   Description                                                                  Feature identifier
Chain         1 – 1076      1076     Serine/threonine-protein phosphatase 6 regulatory ankyrin repeat subunit C   PRO_0000244587

Amino acid modifications

Feature key        Position(s)   Length   Description
Modified residue   1028 – 1028   1        Phosphoserine (combined sources)
Modified residue   1075 – 1075   1        Phosphoserine (by similarity)
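
The two phosphoserine positions can be cross-checked against the sequence listed in this entry. The following is a minimal Python sketch, not part of the UniProt record: it copies residues 1001-1076 from the sequence listing and looks up positions 1028 and 1075. The names C_TERM, C_TERM_START and residue_at are hypothetical helpers introduced only for illustration.

```python
# Residues 1001-1076, copied from the last two sequence rows of this entry.
C_TERM_START = 1001
C_TERM = (
    "CAPNKDVADCLALILSTMKPFPPKDAVSPFSFSLLKNCSIAAAKTVGGCG"
    "ALPHGASCPYSQERPGAIGLDGCYSE"
)


def residue_at(position, segment=C_TERM, segment_start=C_TERM_START):
    """Return the one-letter residue at a 1-based sequence position.

    `segment` holds only part of the protein, so the position is shifted
    by `segment_start` before indexing.
    """
    index = position - segment_start
    if not 0 <= index < len(segment):
        raise ValueError(f"position {position} is outside the stored segment")
    return segment[index]


if __name__ == "__main__":
    for site in (1028, 1075):
        print(site, residue_at(site))
```

Both lookups should return S (serine), which agrees with the phosphoserine annotations. Storing only the C-terminal segment keeps the sketch short; the same function works on the full sequence with segment_start=1.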

Keywords - PTM

Phosphoprotein

Proteomic databases

EPD: Q8NB46.
MaxQB: Q8NB46.
PaxDb: Q8NB46.
PeptideAtlas: Q8NB46.
PRIDE: Q8NB46.

PTM databases

iPTMnet: Q8NB46.
PhosphoSite: Q8NB46.

Expression

Gene expression databases

Bgee: Q8NB46.
CleanEx: HS_ANKRD52.
Genevisible: Q8NB46. HS.

Organism-specific databases

HPA: HPA043815.

Interaction

Subunit structure

The protein phosphatase 6 (PP6) holoenzyme is proposed to be a heterotrimeric complex formed by the catalytic subunit, a SAPS domain-containing subunit (PP6R) and an ankyrin repeat domain-containing regulatory subunit (ARS). Interacts with PPP6R1 (1 publication).

Binary interactions

With     Entry    #Exp.   IntAct
PPP6R1   Q9UPN7   4       EBI-1996119, EBI-359745

Protein-protein interaction databases

BioGrid: 129540. 35 interactions.
IntAct: Q8NB46. 17 interactions.
MINT: MINT-8287684.
STRING: 9606.ENSP00000267116.

Structure

3D structure databases

ProteinModelPortal: Q8NB46.
SMR: Q8NB46. Positions 9-1037.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Length   Description
Repeat        7 – 36        30       ANK 1
Repeat        40 – 69       30       ANK 2
Repeat        73 – 102      30       ANK 3
Repeat        106 – 135     30       ANK 4
Repeat        139 – 168     30       ANK 5
Repeat        172 – 201     30       ANK 6
Repeat        205 – 234     30       ANK 7
Repeat        238 – 267     30       ANK 8
Repeat        271 – 301     31       ANK 9
Repeat        305 – 334     30       ANK 10
Repeat        338 – 367     30       ANK 11
Repeat        371 – 400     30       ANK 12
Repeat        422 – 451     30       ANK 13
Repeat        455 – 484     30       ANK 14
Repeat        488 – 545     58       ANK 15
Repeat        549 – 579     31       ANK 16
Repeat        584 – 613     30       ANK 17
Repeat        617 – 646     30       ANK 18
Repeat        651 – 680     30       ANK 19
Repeat        687 – 716     30       ANK 20
Repeat        720 – 749     30       ANK 21
Repeat        753 – 782     30       ANK 22
Repeat        790 – 819     30       ANK 23
Repeat        822 – 852     31       ANK 24
Repeat        857 – 886     30       ANK 25
Repeat        890 – 920     31       ANK 26
Repeat        924 – 953     30       ANK 27
Repeat        960 – 989     30       ANK 28
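
The Length column follows from the 1-based, inclusive coordinates (length = end - start + 1). As a rough illustration, the Python sketch below recomputes the lengths from the positions listed above and reports the largest stretch of residues between consecutive repeats. It is not part of the UniProt record; ANK_REPEATS, span_length, lengths and gaps are hypothetical names chosen for this example.

```python
# ANK repeat coordinates (1-based, inclusive), copied from the repeat list above.
ANK_REPEATS = [
    (7, 36), (40, 69), (73, 102), (106, 135), (139, 168), (172, 201),
    (205, 234), (238, 267), (271, 301), (305, 334), (338, 367), (371, 400),
    (422, 451), (455, 484), (488, 545), (549, 579), (584, 613), (617, 646),
    (651, 680), (687, 716), (720, 749), (753, 782), (790, 819), (822, 852),
    (857, 886), (890, 920), (924, 953), (960, 989),
]


def span_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


# Length of each repeat, in table order.
lengths = [span_length(start, end) for start, end in ANK_REPEATS]

# Number of residues lying strictly between each pair of consecutive repeats.
gaps = [
    nxt_start - cur_end - 1
    for (_, cur_end), (nxt_start, _) in zip(ANK_REPEATS, ANK_REPEATS[1:])
]

if __name__ == "__main__":
    print("repeats:", len(ANK_REPEATS))
    print("residues inside repeats:", sum(lengths))
    print("longest gap between repeats:", max(gaps))
```

With the coordinates as listed, the longest gap falls between ANK 12 (ending at 400) and ANK 13 (starting at 422).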

Sequence similarities

Contains 28 ANK repeats. (PROSITE-ProRule annotation)

Keywords - Domain

ANK repeat, Repeat

Phylogenomic databases

eggNOG: KOG0504. Eukaryota.
eggNOG: COG0666. LUCA.
GeneTree: ENSGT00840000129677.
HOGENOM: HOG000033959.
HOVERGEN: HBG067697.
InParanoid: Q8NB46.
KO: K15504.
OMA: KNCGIAA.
OrthoDB: EOG7QG436.
PhylomeDB: Q8NB46.
TreeFam: TF312824.

Family and domain databases

Gene3D: 1.25.40.20. 4 hits.
InterPro: IPR002110. Ankyrin_rpt.
InterPro: IPR020683. Ankyrin_rpt-contain_dom.
Pfam: PF00023. Ank. 3 hits.
Pfam: PF12796. Ank_2. 9 hits.
PRINTS: PR01415. ANKYRIN.
SMART: SM00248. ANK. 28 hits.
SUPFAM: SSF48403. SSF48403. 3 hits.
PROSITE: PS50297. ANK_REP_REGION. 1 hit.
PROSITE: PS50088. ANK_REPEAT. 21 hits.

Sequence

Sequence status: Complete.

Q8NB46-1 [UniParc]

        10         20         30         40         50
MGILSITDQP PLVQAIFSRD VEEVRSLLSQ KENINVLDQE RRTPLHAAAY
        60         70         80         90        100
VGDVPILQLL LMSGANVNAK DTLWLTPLHR AAASRNEKVL GLLLAHSADV
       110        120        130        140        150
NARDKLWQTP LHVAAANRAT KCAEALAPLL SSLNVADRSG RSALHHAVHS
       160        170        180        190        200
GHLETVNLLL NKGASLNVCD KKERQPLHWA AFLGHLEVLK LLVARGADLG
       210        220        230        240        250
CKDRKGYGLL HTAAASGQIE VVKYLLRMGA EIDEPNAFGN TALHIACYLG
       260        270        280        290        300
QDAVAIELVN AGANVNQPND KGFTPLHVAA VSTNGALCLE LLVNNGADVN
       310        320        330        340        350
YQSKEGKSPL HMAAIHGRFT RSQILIQNGS EIDCADKFGN TPLHVAARYG
       360        370        380        390        400
HELLISTLMT NGADTARRGI HDMFPLHLAV LFGFSDCCRK LLSSGQLYSI
       410        420        430        440        450
VSSLSNEHVL SAGFDINTPD NLGRTCLHAA ASGGNVECLN LLLSSGADLR
       460        470        480        490        500
RRDKFGRTPL HYAAANGSYQ CAVTLVTAGA GVNEADCKGC SPLHYAAASD
       510        520        530        540        550
TYRRAEPHTP SSHDAEEDEP LKESRRKEAF FCLEFLLDNG ADPSLRDRQG
       560        570        580        590        600
YTAVHYAAAY GNRQNLELLL EMSFNCLEDV ESTIPVSPLH LAAYNGHCEA
       610        620        630        640        650
LKTLAETLVN LDVRDHKGRT ALFLATERGS TECVEVLTAH GASALIKERK
       660        670        680        690        700
RKWTPLHAAA ASGHTDSLHL LIDSGERADI TDVMDAYGQT PLMLAIMNGH
       710        720        730        740        750
VDCVHLLLEK GSTADAADLR GRTALHRGAV TGCEDCLAAL LDHDAFVLCR
       760        770        780        790        800
DFKGRTPIHL ASACGHTAVL RTLLQAALST DPLDAGVDYS GYSPMHWASY
       810        820        830        840        850
TGHEDCLELL LEHSPFSYLE GNPFTPLHCA VINNQDSTTE MLLGALGAKI
       860        870        880        890        900
VNSRDAKGRT PLHAAAFADN VSGLRMLLQH QAEVNATDHT GRTALMTAAE
       910        920        930        940        950
NGQTAAVEFL LYRGKADLTV LDENKNTALH LACSKGHEKC ALMILAETQD
       960        970        980        990       1000
LGLINATNSA LQMPLHIAAR NGLASVVQAL LSHGATVLAV DEEGHTPALA
      1010       1020       1030       1040       1050
CAPNKDVADC LALILSTMKP FPPKDAVSPF SFSLLKNCSI AAAKTVGGCG
      1060       1070
ALPHGASCPY SQERPGAIGL DGCYSE
Length: 1,076
Mass (Da): 115,077
Last modified: May 18, 2010 - v3
Checksum: 8CEE6E151956EAFE

Experimental Info

Feature key         Position(s)   Length   Description
Sequence conflict   686 – 686     1        A → V in AK091555 (PubMed:14702039). Curated.
Sequence conflict   856 – 856     1        A → T in AK091555 (PubMed:14702039). Curated.

Sequence databases

EMBL / GenBank / DDBJ:
AB190990 mRNA. Translation: BAG16262.1.
AC073896 Genomic DNA. No translation available.
CH471054 Genomic DNA. Translation: EAW96919.1.
AK091555 mRNA. No translation available.
CCDS: CCDS44920.1.
RefSeq: NP_775866.2. NM_173595.3.
UniGene: Hs.524506.

Genome annotation databases

Ensembl: ENST00000267116; ENSP00000267116; ENSG00000139645.
GeneID: 283373.
KEGG: hsa:283373.
UCSC: uc001skm.5. human.

Cross-references

Organism-specific databases

CTD: 283373.
GeneCards: ANKRD52.
H-InvDB: HIX0010731.
HGNC: HGNC:26614. ANKRD52.
HPA: HPA043815.
neXtProt: NX_Q8NB46.
PharmGKB: PA143485302.

Miscellaneous databases

ChiTaRS: ANKRD52. human.
GenomeRNAi: 283373.
PRO: Q8NB46.

Publications

  1. "Identification of up-regulated genes in gastric cancer."
    Jinawath N., Furukawa Y., Nakamura Y.
    Submitted (SEP-2004) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
  2. "The finished DNA sequence of human chromosome 12."
    Scherer S.E., Muzny D.M., Buhay C.J., Chen R., Cree A., Ding Y., Dugan-Rocha S., Gill R., Gunaratne P., Harris R.A., Hawes A.C., Hernandez J., Hodgson A.V., Hume J., Jackson A., Khan Z.M., Kovar-Smith C., Lewis L.R., Lozado R.J., Metzker M.L., Milosavljevic A., Miner G.R., Montgomery K.T., Morgan M.B., Nazareth L.V., Scott G., Sodergren E., Song X.-Z., Steffen D., Lovering R.C., Wheeler D.A., Worley K.C., Yuan Y., Zhang Z., Adams C.Q., Ansari-Lari M.A., Ayele M., Brown M.J., Chen G., Chen Z., Clerc-Blankenburg K.P., Davis C., Delgado O., Dinh H.H., Draper H., Gonzalez-Garay M.L., Havlak P., Jackson L.R., Jacob L.S., Kelly S.H., Li L., Li Z., Liu J., Liu W., Lu J., Maheshwari M., Nguyen B.-V., Okwuonu G.O., Pasternak S., Perez L.M., Plopper F.J.H., Santibanez J., Shen H., Tabor P.E., Verduzco D., Waldron L., Wang Q., Williams G.A., Zhang J., Zhou J., Allen C.C., Amin A.G., Anyalebechi V., Bailey M., Barbaria J.A., Bimage K.E., Bryant N.P., Burch P.E., Burkett C.E., Burrell K.L., Calderon E., Cardenas V., Carter K., Casias K., Cavazos I., Cavazos S.R., Ceasar H., Chacko J., Chan S.N., Chavez D., Christopoulos C., Chu J., Cockrell R., Cox C.D., Dang M., Dathorne S.R., David R., Davis C.M., Davy-Carroll L., Deshazo D.R., Donlin J.E., D'Souza L., Eaves K.A., Egan A., Emery-Cohen A.J., Escotto M., Flagg N., Forbes L.D., Gabisi A.M., Garza M., Hamilton C., Henderson N., Hernandez O., Hines S., Hogues M.E., Huang M., Idlebird D.G., Johnson R., Jolivet A., Jones S., Kagan R., King L.M., Leal B., Lebow H., Lee S., LeVan J.M., Lewis L.C., London P., Lorensuhewa L.M., Loulseged H., Lovett D.A., Lucier A., Lucier R.L., Ma J., Madu R.C., Mapua P., Martindale A.D., Martinez E., Massey E., Mawhiney S., Meador M.G., Mendez S., Mercado C., Mercado I.C., Merritt C.E., Miner Z.L., Minja E., Mitchell T., Mohabbat F., Mohabbat K., Montgomery B., Moore N., Morris S., Munidasa M., Ngo R.N., Nguyen N.B., Nickerson E., Nwaokelemeh O.O., Nwokenkwo S., Obregon M., Oguh M., Oragunye N., Oviedo R.J., Parish B.J., Parker D.N., Parrish J., Parks K.L., Paul H.A., Payton B.A., Perez A., Perrin W., Pickens A., Primus E.L., Pu L.-L., Puazo M., Quiles M.M., Quiroz J.B., Rabata D., Reeves K., Ruiz S.J., Shao H., Sisson I., Sonaike T., Sorelle R.P., Sutton A.E., Svatek A.F., Svetz L.A., Tamerisa K.S., Taylor T.R., Teague B., Thomas N., Thorn R.D., Trejos Z.Y., Trevino B.K., Ukegbu O.N., Urban J.B., Vasquez L.I., Vera V.A., Villasana D.M., Wang L., Ward-Moore S., Warren J.T., Wei X., White F., Williamson A.L., Wleczyk R., Wooden H.S., Wooden S.H., Yen J., Yoon L., Yoon V., Zorrilla S.E., Nelson D., Kucherlapati R., Weinstock G., Gibbs R.A.
    Nature 440:346-351(2006)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  3. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  4. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
    Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
    Nat. Genet. 36:40-45(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] OF 510-1076.
    Tissue: Brain.
  5. "A probability-based approach for high-throughput protein phosphorylation analysis and site localization."
    Beausoleil S.A., Villen J., Gerber S.A., Rush J., Gygi S.P.
    Nat. Biotechnol. 24:1285-1292(2006)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-1028, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Cervix carcinoma.
  6. "Protein phosphatase 6 regulatory subunits composed of ankyrin repeat domains."
    Stefansson B., Ohama T., Daugherty A.E., Brautigan D.L.
    Biochemistry 47:1442-1451(2008)
    Cited for: INTERACTION WITH PPP6R1.
  7. Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Cervix carcinoma.
  8. "Lys-N and trypsin cover complementary parts of the phosphoproteome in a refined SCX-based approach."
    Gauci S., Helbig A.O., Slijper M., Krijgsveld J., Heck A.J., Mohammed S.
    Anal. Chem. 81:4493-4501(2009)
    Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  9. "Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis."
    Olsen J.V., Vermeulen M., Santamaria A., Kumar C., Miller M.L., Jensen L.J., Gnad F., Cox J., Jensen T.S., Nigg E.A., Brunak S., Mann M.
    Sci. Signal. 3:RA3-RA3(2010)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-1028, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Cervix carcinoma.
  10. "Toward a comprehensive characterization of a human cancer cell phosphoproteome."
    Zhou H., Di Palma S., Preisinger C., Peng M., Polat A.N., Heck A.J., Mohammed S.
    J. Proteome Res. 12:260-271(2013)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-1028, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Cervix carcinoma and Erythroleukemia.

Entry information

Entry name: ANR52_HUMAN
Accession: Primary (citable) accession number: Q8NB46
Secondary accession number(s): A6NE79, B1Q2K2
Entry history
Integrated into UniProtKB/Swiss-Prot: June 27, 2006
Last sequence update: May 18, 2010
Last modified: July 6, 2016
This is version 111 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 12
    Human chromosome 12: entries, gene names and cross-references to MIM
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (also known as the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
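
The UniRef90 and UniRef50 criteria above amount to two thresholds applied together: a minimum sequence identity and a minimum overlap with the seed sequence. The Python sketch below only illustrates that rule and is not the UniRef clustering pipeline. ClusterLevel and joins_cluster are hypothetical names, and the sketch assumes that identity and overlap have already been computed as fractions between 0 and 1 by some alignment step that is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterLevel:
    """Thresholds for one UniRef level, taken from the descriptions above."""
    name: str
    min_identity: float  # fraction of identical residues vs. the seed
    min_overlap: float   # assumed: fraction of overlap with the seed sequence


UNIREF90 = ClusterLevel("UniRef90", min_identity=0.90, min_overlap=0.80)
UNIREF50 = ClusterLevel("UniRef50", min_identity=0.50, min_overlap=0.80)


def joins_cluster(identity, overlap, level):
    """Return True when a sequence meets both thresholds for the given level."""
    return identity >= level.min_identity and overlap >= level.min_overlap


if __name__ == "__main__":
    # Hypothetical candidate: 93% identical to the seed, 85% overlap.
    print(joins_cluster(0.93, 0.85, UNIREF90))
    # Same identity but only 70% overlap: fails the overlap requirement.
    print(joins_cluster(0.93, 0.70, UNIREF90))
```

The point of the two-part test is that high identity over a short stretch is not enough: a sequence must also overlap most of the seed to be grouped with it.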