Protein

Adenylate cyclase type 5

Gene

Adcy5

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

This is a membrane-bound, calcium-inhibitable adenylyl cyclase. (Curated)

Catalytic activity

ATP = 3',5'-cyclic AMP + diphosphate. (Curated)
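
As a quick sanity check, the reaction above can be verified to balance element by element. The sketch below is illustrative only: the molecular formulas are not part of this entry and are assumed here to be the neutral, fully protonated forms (charge states at physiological pH differ), and the helper name `is_balanced` is hypothetical.

```python
from collections import Counter

# Neutral molecular formulas (assumption; not given in the entry).
ATP = Counter({"C": 10, "H": 16, "N": 5, "O": 13, "P": 3})
CAMP = Counter({"C": 10, "H": 12, "N": 5, "O": 6, "P": 1})  # 3',5'-cyclic AMP
DIPHOSPHATE = Counter({"H": 4, "O": 7, "P": 2})             # written as H4P2O7


def is_balanced(substrates, products):
    """Return True if both sides contain the same atoms in the same counts."""
    left = sum(substrates, Counter())
    right = sum(products, Counter())
    return left == right


print(is_balanced([ATP], [CAMP, DIPHOSPHATE]))
```

The check only compares atom counts; it says nothing about charge balance or reaction direction.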

Cofactor

Mg2+ (By similarity)
Note: Binds 2 magnesium ions per subunit. (By similarity)

Enzyme regulation

Inhibited by calcium in the submicromolar concentration range. Phosphorylation by RAF1 results in its activation. (By similarity)

Sites

Feature key     Position(s)   Length   Description
Metal binding   475 – 475     1        Magnesium 1 (PROSITE-ProRule annotation)
Metal binding   475 – 475     1        Magnesium 2 (PROSITE-ProRule annotation)
Metal binding   476 – 476     1        Magnesium 2; via carbonyl oxygen (PROSITE-ProRule annotation)

GO - Molecular function

  • adenylate cyclase activity Source: MGI
  • adenylate cyclase binding Source: BHF-UCL
  • ATP binding Source: UniProtKB-KW
  • metal ion binding Source: UniProtKB-KW
  • protein heterodimerization activity Source: BHF-UCL

GO - Biological process

  • adenosine receptor signaling pathway Source: MGI
  • adenylate cyclase-activating dopamine receptor signaling pathway Source: MGI
  • adenylate cyclase-activating G-protein coupled receptor signaling pathway Source: MGI
  • adenylate cyclase-inhibiting dopamine receptor signaling pathway Source: MGI
  • cAMP biosynthetic process Source: MGI
  • intracellular signal transduction Source: InterPro
  • locomotory behavior Source: MGI
  • neuromuscular process controlling balance Source: MGI

Keywords - Molecular function

Lyase

Keywords - Biological process

cAMP biosynthesis

Keywords - Ligand

ATP-binding, Magnesium, Metal-binding, Nucleotide-binding

Enzyme and pathway databases

BRENDA: 4.6.1.1. 3474.
Reactome:
    REACT_272374. Adrenaline,noradrenaline inhibits insulin secretion.
    REACT_278699. Hedgehog 'off' state.
    REACT_286837. Vasopressin regulates renal water homeostasis via Aquaporins.
    REACT_295338. PKA activation in glucagon signalling.
    REACT_296380. G alpha (z) signalling events.
    REACT_297425. Adenylate cyclase activating pathway.
    REACT_297430. Glucagon-like Peptide-1 (GLP1) regulates insulin secretion.
    REACT_309686. Adenylate cyclase inhibitory pathway.
    REACT_310977. PKA activation.
    REACT_313192. G alpha (s) signalling events.
    REACT_331048. G alpha (i) signalling events.
    REACT_345203. Glucagon signaling in metabolic regulation.

Names & Taxonomy

Protein names
Recommended name:
Adenylate cyclase type 5 (EC:4.6.1.1)
Alternative name(s):
ATP pyrophosphate-lyase 5
Adenylate cyclase type V
Adenylyl cyclase 5
Gene names
Name: Adcy5 (Imported)
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes: UP000000589. Component: Chromosome 16

Organism-specific databases

MGI: MGI:99673. Adcy5.

Subcellular location

Topology

Feature key          Position(s)     Length   Description
Topological domain   1 – 196         196      Cytoplasmic (Sequence Analysis)
Transmembrane        197 – 217       21       Helical (Sequence Analysis)
Transmembrane        243 – 263       21       Helical (Sequence Analysis)
Transmembrane        269 – 289       21       Helical (Sequence Analysis)
Transmembrane        300 – 320       21       Helical (Sequence Analysis)
Transmembrane        326 – 346       21       Helical (Sequence Analysis)
Transmembrane        375 – 395       21       Helical (Sequence Analysis)
Topological domain   396 – 763       368      Cytoplasmic (Sequence Analysis)
Transmembrane        764 – 784       21       Helical (Sequence Analysis)
Transmembrane        790 – 810       21       Helical (Sequence Analysis)
Transmembrane        837 – 857       21       Helical (Sequence Analysis)
Topological domain   858 – 910       53       Extracellular (Sequence Analysis)
Transmembrane        911 – 931       21       Helical (Sequence Analysis)
Transmembrane        936 – 956       21       Helical (Sequence Analysis)
Transmembrane        985 – 1005      21       Helical (Sequence Analysis)
Topological domain   1006 – 1262     257      Cytoplasmic (Sequence Analysis)
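
The Length column in the topology table follows from the positions, which are 1-based and inclusive. The following sketch recomputes the lengths, counts the predicted transmembrane helices, and lists the stretches between annotated segments that carry no topology annotation. The function names `segment_length` and `unannotated_gaps` are illustrative, not part of any UniProt tooling.

```python
# (feature key, start, end) copied from the Topology table above.
SEGMENTS = [
    ("Topological domain", 1, 196),
    ("Transmembrane", 197, 217),
    ("Transmembrane", 243, 263),
    ("Transmembrane", 269, 289),
    ("Transmembrane", 300, 320),
    ("Transmembrane", 326, 346),
    ("Transmembrane", 375, 395),
    ("Topological domain", 396, 763),
    ("Transmembrane", 764, 784),
    ("Transmembrane", 790, 810),
    ("Transmembrane", 837, 857),
    ("Topological domain", 858, 910),
    ("Transmembrane", 911, 931),
    ("Transmembrane", 936, 956),
    ("Transmembrane", 985, 1005),
    ("Topological domain", 1006, 1262),
]


def segment_length(start, end):
    """Length of a 1-based, inclusive residue range."""
    return end - start + 1


def unannotated_gaps(segments):
    """Return (start, end) ranges lying between consecutive annotated segments."""
    ordered = sorted(segments, key=lambda s: s[1])
    gaps = []
    for (_, _, prev_end), (_, next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


tm_helices = [s for s in SEGMENTS if s[0] == "Transmembrane"]
print(len(tm_helices))
print(unannotated_gaps(SEGMENTS))
```

The gaps are the short loops between helices for which the table gives no explicit topological domain.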

GO - Cellular component

  • integral component of membrane Source: UniProtKB-KW
  • intracellular Source: GOC
  • membrane Source: MGI
  • plasma membrane Source: BHF-UCL
  • primary cilium Source: UniProtKB

Keywords - Cellular component

Cell projection, Cilium, Membrane

PTM / Processing

Molecule processing

Feature key   Position(s)   Length   Description                Feature identifier
Chain         1 – 1262      1262     Adenylate cyclase type 5   PRO_0000195695

Amino acid modifications

Feature key        Position(s)     Length   Description
Glycosylation      239 – 239       1        N-linked (GlcNAc...) (Sequence Analysis)
Modified residue   667 – 667       1        Phosphoserine (By similarity)
Modified residue   755 – 755       1        Phosphoserine (By similarity)
Glycosylation      834 – 834       1        N-linked (GlcNAc...) (Sequence Analysis)
Glycosylation      871 – 871       1        N-linked (GlcNAc...) (Sequence Analysis)
Glycosylation      888 – 888       1        N-linked (GlcNAc...) (Sequence Analysis)
Glycosylation      973 – 973       1        N-linked (GlcNAc...) (Sequence Analysis)
Modified residue   1012 – 1012     1        Phosphothreonine (By similarity)
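
The five N-linked glycosylation sites are predictions from sequence analysis. As general background that this entry does not state, such predictions usually rest on the N-X-S/T sequon (asparagine, then any residue except proline, then serine or threonine). The sketch below checks that each listed position starts such a sequon, using 10-residue blocks copied from the canonical sequence given further down in this entry. The names `residue_map` and `is_sequon` are illustrative.

```python
import re

# 10-residue blocks from the canonical sequence; each key is the
# 1-based position of the block's first residue.
BLOCKS = {
    231: "YQRYFFRLNQ",
    241: "SSLTMLMAVL",
    831: "SKKNSTLVGV",
    871: "NITVNQVNAC",
    881: "HVMESAFNYS",
    971: "SNNGTSQCPE",
}
SITES = [239, 834, 871, 888, 973]  # annotated N-linked glycosylation positions


def residue_map(blocks):
    """Map each 1-based position covered by the blocks to its residue."""
    return {
        start + offset: residue
        for start, block in blocks.items()
        for offset, residue in enumerate(block)
    }


def is_sequon(residues, position):
    """True if position starts N, then not-P, then S or T (assumed sequon rule)."""
    tripeptide = "".join(residues.get(position + k, "-") for k in range(3))
    return re.fullmatch(r"N[^P-][ST]", tripeptide) is not None


residues = residue_map(BLOCKS)
for site in SITES:
    print(site, is_sequon(residues, site))
```

A sequon match is necessary for this kind of prediction but does not show that a site is actually glycosylated.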

Post-translational modification

Phosphorylated by RAF1. (By similarity)

Keywords - PTM

Glycoprotein, Phosphoprotein

Proteomic databases

MaxQB: P84309.
PaxDb: P84309.
PRIDE: P84309.

PTM databases

PhosphoSite: P84309.

Expression

Gene expression databases

Bgee: P84309.
CleanEx: MM_ADCY5.
Genevisible: P84309. MM.

Interaction

Subunit structure

Part of a complex containing AKAP5, ADCY6, PDE4C and PKD2. Interacts with RAF1 (By similarity).

Protein-protein interaction databases

STRING: 10090.ENSMUSP00000110563.

Structure

3D structure databases

ProteinModelPortal: P84309.
SMR: P84309. Positions 455-644.

Family & Domains

Domains and Repeats

Feature key   Position(s)     Length   Description
Domain        470 – 597       128      Guanylate cyclase 1 (PROSITE-ProRule annotation)
Domain        1072 – 1211     140      Guanylate cyclase 2 (PROSITE-ProRule annotation)

Coiled coil

Feature key   Position(s)     Length   Description
Coiled coil   1019 – 1045     27       (Sequence Analysis)

Compositional bias

Feature key          Position(s)   Length   Description
Compositional bias   64 – 67       4        Poly-Gln
Compositional bias   141 – 147     7        Poly-Ala

Sequence similarities

Belongs to the adenylyl cyclase class-4/guanylyl cyclase family. (PROSITE-ProRule annotation)
Contains 2 guanylate cyclase domains. (PROSITE-ProRule annotation)

Keywords - Domain

Coiled coil, Repeat, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: COG2114.
GeneTree: ENSGT00760000119042.
HOGENOM: HOG000006941.
HOVERGEN: HBG050458.
InParanoid: P84309.
KO: K08045.
OMA: QGFCGSP.
OrthoDB: EOG7B8S30.
PhylomeDB: P84309.
TreeFam: TF313845.

Family and domain databases

Gene3D: 3.30.70.1230. 2 hits.
InterPro:
    IPR001054. A/G_cyclase.
    IPR018297. A/G_cyclase_CS.
    IPR030672. Adcy.
    IPR009398. Adcy_conserved_dom.
    IPR029787. Nucleotide_cyclase.
Pfam:
    PF06327. DUF1053. 1 hit.
    PF00211. Guanylate_cyc. 2 hits.
PIRSF: PIRSF039050. Ade_cyc. 1 hit.
SMART: SM00044. CYCc. 2 hits.
SUPFAM: SSF55073. SSF55073. 2 hits.
PROSITE:
    PS00452. GUANYLATE_CYCLASE_1. 2 hits.
    PS50125. GUANYLATE_CYCLASE_2. 2 hits.

Sequences (2)

Sequence status: Complete.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: P84309-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MSGSKSVSPP GYAAQTAASP APRGGPEHRA AWGEADSRAN GYPHAPGGST
60 70 80 90 100
RGSTKRSGGA VTPQQQQRLA SRWRGGDDDE DPPLSGDDPL AGGFGFSFRS
110 120 130 140 150
KSAWQERGGD DGGRGSRRQR RGAAGGGSTR APPAGGSGSS AAAAAAAGGT
160 170 180 190 200
EVRPRSVELG LEERRGKGRA AEELEPGTGI VEDGDGSEDG GSSVASGSGT
210 220 230 240 250
GAVLSLGACC LALLQIFRSK KFPSDKLERL YQRYFFRLNQ SSLTMLMAVL
260 270 280 290 300
VLVCLVMLAF HAARPPLQIA YLAVLAAAVG VILIMAVLCN RAAFHQDHMG
310 320 330 340 350
LACYALIAVV LAVQVVGLLL PQPRSASEGI WWTVFFIYTI YTLLPVRMRA
360 370 380 390 400
AVLSGVLLSA LHLAISLHTN SQDQFLLKQL VSNVLIFSCT NIVGVCTHYP
410 420 430 440 450
AEVSQRQAFQ ETRECIQARL HSQRENQQQE RLLLSVLPRH VAMEMKADIN
460 470 480 490 500
AKQEDMMFHK IYIQKHDNVS ILFADIEGFT SLASQCTAQE LVMTLNELFA
510 520 530 540 550
RFDKLAAENH CLRIKILGDC YYCVSGLPEA RADHAHCCVE MGMDMIEAIS
560 570 580 590 600
LVREVTGVNV NMRVGIHSGR VHCGVLGLRK WQFDVWSNDV TLANHMEAGG
610 620 630 640 650
KAGRIHITKA TLNYLNGDYE VEPGCGGDRN AYLKEHSIET FLILSCTQKR
660 670 680 690 700
KEEKAMIAKM NRQRTNSIGH NPPHWGAERP FYNHLGGNQV SKEMKRMGFE
710 720 730 740 750
DPKDKNAQES ANPEDEVDEF LGRAIDARSI DRLRSEHVRK FLLTFREPDL
760 770 780 790 800
EKKYSKQVDD RFGAYVACAS LVFLFICFVQ ITIVPHSLFM LSFYLSCFLL
810 820 830 840 850
LALVVFVSVI YACVKLFPTP LQTLSRKIVR SKKNSTLVGV FTITLVFLSA
860 870 880 890 900
FVNMFMCNSK NLVGCLAEEH NITVNQVNAC HVMESAFNYS LGDEQGFCGS
910 920 930 940 950
PQPNCNFPEY FTYSVLLSLL ACSVFLQISC IGKLVLMLAI EFIYVLIVEV
960 970 980 990 1000
PGVTLFDNAD LLVTANAIDF SNNGTSQCPE HATKVALKVV TPIIISVFVL
1010 1020 1030 1040 1050
ALYLHAQQVE STARLDFLWK LQATEEKEEM EELQAYNRRL LHNILPKDVA
1060 1070 1080 1090 1100
AHFLARERRN DELYYQSCEC VAVMFASIAN FSEFYVELEA NNEGVECLRL
1110 1120 1130 1140 1150
LNEIIADFDE IISEDRFRQL EKIKTIGSTY MAASGLNDST YDKAGKTHIK
1160 1170 1180 1190 1200
AIADFAMKLM DQMKYINEHS FNNFQMKIGL NIGPVVAGVI GARKPQYDIW
1210 1220 1230 1240 1250
GNTVNVASRM DSTGVPDRIQ VTTDMYQVLA ANTYQLECRG VVKVKGKGEM
1260
MTYFLNGGPP LS
Length: 1,262
Mass (Da): 139,122
Last modified: January 9, 2007 - v2
Checksum: 342966E7CA67E28D

Isoform 2 (identifier: P84309-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1259-1262: PPLS → LGHDGVVGKL...GSEQKKIFIK

Note: No experimental confirmation available.
Length: 1,348
Mass (Da): 148,383
Checksum: B4997E3A2BC14305

Experimental Info

Feature key         Position(s)   Length   Description
Sequence conflict   25 – 25       1        G → S in AAH90846 (PubMed:15489334). (Curated)
Sequence conflict   686 – 686     1        G → D in BAE28048 (PubMed:16141072). (Curated)
Sequence conflict   924 – 924     1        V → M in AAH90846 (PubMed:15489334). (Curated)

Alternative sequence

Feature key            Position(s)     Length   Feature identifier
Alternative sequence   1259 – 1262     4        VSP_022224
Description: PPLS → LGHDGVVGKLKAGLGVSMEL KGLLFHCGEVTPPHNVWGTG TGRRVACAILSPHLHAQRQC PVRETGLLTREARGHQARSS GSEQKKIFIK in isoform 2. (1 Publication)
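
The isoform 2 length can be reproduced from the alternative-sequence feature: residues 1259-1262 (PPLS) of the canonical sequence are replaced by a 90-residue stretch. The sketch below applies that replacement and checks the resulting length. To keep it short, residues 1-1250 are stood in for by a placeholder string, which is an assumption of this example and not real sequence. Only the final 12 residues and the replacement are copied from the entry. The function name `apply_replacement` is illustrative.

```python
def apply_replacement(sequence, start, end, original, replacement):
    """Replace residues start..end (1-based, inclusive) with `replacement`.

    Raises ValueError if the residues found there differ from `original`.
    """
    found = sequence[start - 1:end]
    if found != original:
        raise ValueError(f"expected {original!r} at {start}-{end}, found {found!r}")
    return sequence[:start - 1] + replacement + sequence[end:]


# Residues 1251-1262 of isoform 1, copied from the entry.
CANONICAL_TAIL = "MTYFLNGGPPLS"
# Placeholder for residues 1-1250 (assumption; keeps the example short).
canonical = "X" * 1250 + CANONICAL_TAIL

# Replacement sequence for isoform 2, copied from the entry.
REPLACEMENT = (
    "LGHDGVVGKLKAGLGVSMEL"
    "KGLLFHCGEVTPPHNVWGTG"
    "TGRRVACAILSPHLHAQRQC"
    "PVRETGLLTREARGHQARSS"
    "GSEQKKIFIK"
)

isoform2 = apply_replacement(canonical, 1259, 1262, "PPLS", REPLACEMENT)
print(len(canonical), len(isoform2))
```

The lengths agree with the values listed for the two isoforms: 1,262 and 1,262 - 4 + 90 = 1,348.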

Sequence databases

EMBL / GenBank / DDBJ:
    AK147649 mRNA. Translation: BAE28048.1.
    AK160942 mRNA. Translation: BAE36104.1.
    BC035550 mRNA. No translation available.
    BC090846 mRNA. Translation: AAH90846.1.
CCDS: CCDS37322.1. [P84309-1]
RefSeq: NP_001012783.3. NM_001012765.4. [P84309-1]
UniGene: Mm.41137.

Genome annotation databases

Ensembl: ENSMUST00000114913; ENSMUSP00000110563; ENSMUSG00000022840. [P84309-1]
GeneID: 224129.
KEGG: mmu:224129.
UCSC: uc007zbj.1. mouse. [P84309-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 111.
MGI: MGI:99673. Adcy5.

Miscellaneous databases

NextBio: 377103.
PRO: P84309.

Publications

  1. "The transcriptional landscape of the mammalian genome."
    Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N., Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K., Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J., Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R., Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T., Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A., Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B., Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M., Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S., Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E., Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D., Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M., Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H., Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V., Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S., Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H., Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N., Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F., Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G., Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z., Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C., Schoenbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y., Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S., Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K., Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R., van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H., Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M., Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C., Mattick J.S., Hume D.A., Kai C., Sasaki D., Tomaru Y., Fukuda S., Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K., Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M., Kondo S., Konno H., Nakano K., Ninomiya N., Nishio T., Okada M., Plessy C., Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A., Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y.
    Science 309:1559-1563(2005)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
    Strain: C57BL/6J.
    Tissue: Head.
  2. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1 AND 2).
    Strain: C57BL/6.
    Tissue: Brain and Retina.
  3. "Polycystin-2 and phosphodiesterase 4C are components of a ciliary A-kinase anchoring protein complex that is disrupted in cystic kidney diseases."
    Choi Y.H., Suzuki A., Hajarnis S., Ma Z., Chapin H.C., Caplan M.J., Pontoglio M., Somlo S., Igarashi P.
    Proc. Natl. Acad. Sci. U.S.A. 108:10679-10684(2011)
    Cited for: SUBCELLULAR LOCATION, INTERACTION WITH AKAP5; ADCY6; PDE4C AND PKD2.

Entry information

Entry name: ADCY5_MOUSE
Accession: Primary (citable) accession number: P84309
Secondary accession number(s): Q3TU67, Q3UH09, Q5BL06
Entry history:
Integrated into UniProtKB/Swiss-Prot: December 21, 2004
Last sequence update: January 9, 2007
Last modified: June 24, 2015
This is version 110 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
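
The UniRef90 and UniRef50 membership criteria above reduce to two thresholds: a minimum sequence identity and a minimum overlap with the seed sequence. The sketch below encodes only those stated thresholds. It assumes identity and overlap have already been computed by some alignment step and are passed in as fractions between 0 and 1. The name `joins_cluster` is hypothetical, and this is not a description of how UniRef is actually computed.

```python
# (minimum identity, minimum overlap with the seed), as fractions.
THRESHOLDS = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def joins_cluster(level, identity, overlap):
    """True if a sequence meets the stated thresholds for the given UniRef level.

    `identity` and `overlap` are measured against the cluster's seed sequence
    and are assumed to be precomputed.
    """
    min_identity, min_overlap = THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


print(joins_cluster("UniRef90", identity=0.93, overlap=0.85))
print(joins_cluster("UniRef90", identity=0.93, overlap=0.70))
print(joins_cluster("UniRef50", identity=0.60, overlap=0.80))
```

UniRef100 is left out because it is defined by identical sequences and sub-fragments, not by an identity threshold.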