Protein

Serine protease HTRA1

Gene

Htra1

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Serine protease with a variety of targets, including extracellular matrix proteins such as fibronectin. HTRA1-generated fibronectin fragments further induce synovial cells to up-regulate MMP1 and MMP3 production. May also degrade proteoglycans, such as aggrecan, decorin and fibromodulin. Through cleavage of proteoglycans, may release soluble FGF-glycosaminoglycan complexes that promote the range and intensity of FGF signals in the extracellular space. Regulates the availability of insulin-like growth factors (IGFs) by cleaving IGF-binding proteins. Inhibits signaling mediated by TGF-beta family members. This activity requires the integrity of the catalytic site, but it is unclear whether it leads to the proteolytic degradation of TGF-beta proteins themselves (PubMed:18551132) or not (PubMed:14973287). By acting on TGF-beta signaling, may regulate many physiological processes, including retinal angiogenesis and neuronal survival and maturation during development. Intracellularly, degrades TSC2, leading to the activation of TSC2 downstream targets. (4 publications)

Sites

Feature key   Position(s)   Length   Description                        Evidence
Site          169 – 169     1        Involved in trimer stabilization   By similarity
Site          171 – 171     1        Involved in trimer stabilization   By similarity
Active site   220 – 220     1        Charge relay system                By similarity
Active site   250 – 250     1        Charge relay system                By similarity
Site          278 – 278     1        Involved in trimer stabilization   By similarity
Active site   328 – 328     1        Charge relay system                By similarity

GO - Molecular function

  • serine-type endopeptidase activity Source: InterPro
  • serine-type peptidase activity Source: UniProtKB

GO - Biological process

  • chorionic trophoblast cell differentiation Source: MGI
  • dentinogenesis Source: Ensembl
  • negative regulation of BMP signaling pathway Source: MGI
  • negative regulation of defense response to virus Source: Ensembl
  • negative regulation of transforming growth factor beta receptor signaling pathway Source: MGI
  • placenta development Source: MGI
  • positive regulation of epithelial cell proliferation Source: Ensembl
  • proteolysis Source: UniProtKB
  • regulation of cell growth Source: InterPro

Keywords - Molecular function

Hydrolase, Protease, Serine protease

Keywords - Ligand

Growth factor binding

Protein family/group databases

MEROPS: S01.277.

Names & Taxonomy

Protein names
Recommended name:
Serine protease HTRA1 (EC:3.4.21.-)
Alternative name(s):
High-temperature requirement A serine peptidase 1
Serine protease 11
Gene names
Name: Htra1
Synonyms: Htra, Prss11
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 7

Organism-specific databases

MGI: MGI:1929076. Htra1.

Subcellular location

  • Cell membrane (By similarity)
  • Secreted (By similarity)
  • Cytoplasm, cytosol (By similarity)

  • Note: Predominantly secreted. Also found associated with the plasma membrane. (By similarity)

Keywords - Cellular component

Cell membrane, Cytoplasm, Membrane, Secreted

Pathology & Biotech

Disruption phenotype

Mutant mice exhibit reduced retinal capillary density, compared with wild-type animals, in all 3 retinal layers: the nerve fiber layer and the inner and outer plexiform layers. (1 publication)

Mutagenesis

Feature key   Position(s)   Length   Description                                                     Evidence
Mutagenesis   171 – 171     1        F → D: Loss of efficient trimer formation.                      1 publication
Mutagenesis   328 – 328     1        S → A: Loss of enzymatic activity. No effect on BMP4-binding.   1 publication

PTM / Processing

Molecule processing

Feature key      Position(s)   Length   Description             Feature identifier   Evidence
Signal peptide   1 – 22        22       -                       -                    Sequence analysis
Chain            23 – 480      458      Serine protease HTRA1   PRO_0000026944       -

Proteomic databases

MaxQB: Q9R118.
PaxDb: Q9R118.
PRIDE: Q9R118.

PTM databases

PhosphoSite: Q9R118.

Expression

Tissue specificity

In the brain, mainly expressed in cortical areas both in glial cells and neurons (at protein level). In bones, deposited in the matrix, with higher level in newly formed bone compared to fully calcified bone (at protein level). Also expressed in the tendons (at protein level). In the articular cartilage, detected only in the deepest zone of the joint cartilage. Not detected in the chondrocytes of the growth plate (at protein level). In an experimental arthritis model, at early disease stages, up-regulated in articular chondrocytes in the deep layers of the cartilage (at protein level). As arthritis progresses, chondrocyte expression expands toward the surface. (3 publications)

Developmental stage

First detected at 10.5 dpc. At 11.5 dpc, in the developing heart, expressed in the atrioventricular endocardial cushion and the outflow tract (at protein level). At 14.5 dpc, strong expression in the outflow tracts, including valves. In the developing skeleton, expressed at 12.5 dpc in the vertebral column and limbs. At 14.5 dpc, expressed in rudiments of tendons and ligaments along the vertebrae, as well as in mesenchymal cells surrounding precartilage condensations. Not detected in precartilage condensations, nor in chondrocytes, but strongly expressed in ossification centers. At 17.5 dpc, in the hind limb, significant expression persists in tendons and ligaments, but expression in the forming joints is reduced. At this stage, weakly detected in the thin layer of articular surfaces. Postnatally, in long bones, expressed by terminally differentiated hypertrophic chondrocytes that are committed to degeneration and eventually replaced by bone, as well as by osteoblasts at late differentiation stages and by mature osteocytes. In the developing brain, expressed in specific regions of the neuroepithelium in the forebrain and hindbrain adjacent to the forming choroid plexus. From 17.5 dpc until birth, expressed in neurogenic areas including ventricular zones (at protein level). At 12.5 and 14.5 dpc, expressed in Muellerian duct cells and in the surrounding mesenchyme in both male and female gonads. In the lung, detected in the mesenchymal cells. Expressed at 12.5 dpc in abdominal skin, both in epidermis and dermis. Also expressed in the epithelium of developing whiskers at 14.5 dpc. At later stages, localized in the basal layer of epidermis and in the invading epidermal cells that formed the whisker rudiments (at protein level). Nine days after birth, detected in the whisker outer root sheath (at protein level). (3 publications)

Gene expression databases

Bgee: Q9R118.
CleanEx: MM_HTRA1.
Genevisible: Q9R118. MM.

Interaction

Subunit structure

Forms homotrimers. In the presence of substrate, may form higher-order multimers in a PDZ-independent manner (By similarity). Interacts with TGF-beta family members, including BMP4, TGFB1, TGFB2, activin A and GDF5. (By similarity; 1 publication)

Protein-protein interaction databases

BioGrid: 207847. 5 interactions.
STRING: 10090.ENSMUSP00000006367.

Structure

3D structure databases

ProteinModelPortal: Q9R118.
SMR: Q9R118. Positions 37-154, 160-370, 380-480.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Length   Description        Evidence
Domain        33 – 100      68       IGFBP N-terminal   PROSITE-ProRule annotation
Domain        98 – 157      60       Kazal-like         PROSITE-ProRule annotation
Domain        365 – 467     103      PDZ                PROSITE-ProRule annotation

Region

Feature key   Position(s)   Length   Description
Region        204 – 364     161      Serine protease
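
The Length values in the Domains and Region tables are consistent with 1-based, inclusive coordinates, that is, length = end - start + 1. The following is a minimal sketch that recomputes those lengths and reports how many residues two features share. The `Feature` class and the `overlap` helper are illustrative names, not part of any UniProt tooling.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A sequence feature with 1-based, inclusive coordinates (assumed convention)."""
    key: str
    start: int
    end: int
    description: str

    @property
    def length(self) -> int:
        # Inclusive coordinates: both end points count as residues.
        return self.end - self.start + 1


# Coordinates copied from the Domains and Region tables of this entry.
FEATURES = [
    Feature("Domain", 33, 100, "IGFBP N-terminal"),
    Feature("Domain", 98, 157, "Kazal-like"),
    Feature("Domain", 365, 467, "PDZ"),
    Feature("Region", 204, 364, "Serine protease"),
]


def overlap(a: Feature, b: Feature) -> int:
    """Number of residues shared by two features (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


for f in FEATURES:
    print(f"{f.key:<7}{f.start:>4} - {f.end:<4} length {f.length:>3}  {f.description}")

print("IGFBP N-terminal / Kazal-like overlap:", overlap(FEATURES[0], FEATURES[1]))
```

With the listed coordinates, the annotated IGFBP N-terminal and Kazal-like domains share residues 98 to 100, while the serine protease region ends one residue before the PDZ domain begins.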

Domain

The IGFBP N-terminal domain mediates interaction with TSC2 substrate. (By similarity)

Sequence similarities

Belongs to the peptidase S1C family. (Curated)
Contains 1 IGFBP N-terminal domain. (PROSITE-ProRule annotation)
Contains 1 Kazal-like domain. (PROSITE-ProRule annotation)
Contains 1 PDZ (DHR) domain. (PROSITE-ProRule annotation)

Keywords - Domain

Signal

Phylogenomic databases

eggNOG: KOG1320. Eukaryota.
  COG0265. LUCA.
GeneTree: ENSGT00510000046315.
HOGENOM: HOG000223641.
HOVERGEN: HBG052044.
InParanoid: Q9R118.
KO: K08784.
OMA: GLCVCAS.
OrthoDB: EOG7V1FR7.
TreeFam: TF323480.

Family and domain databases

Gene3D: 2.30.42.10. 1 hit.
InterPro: IPR009030. Growth_fac_rcpt_.
  IPR000867. IGFBP-like.
  IPR002350. Kazal_dom.
  IPR001478. PDZ.
  IPR009003. Peptidase_S1_PA.
  IPR001940. Peptidase_S1C.
Pfam: PF00219. IGFBP. 1 hit.
  PF07648. Kazal_2. 1 hit.
  PF00595. PDZ. 1 hit.
PRINTS: PR00834. PROTEASES2C.
SMART: SM00121. IB. 1 hit.
  SM00280. KAZAL. 1 hit.
  SM00228. PDZ. 1 hit.
SUPFAM: SSF50156. SSF50156. 1 hit.
  SSF50494. SSF50494. 1 hit.
  SSF57184. SSF57184. 1 hit.
PROSITE: PS51323. IGFBP_N_2. 1 hit.
  PS51465. KAZAL_2. 1 hit.
  PS50106. PDZ. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

Q9R118-1 [UniParc]

        10         20         30         40         50
MQSLRTTLLS LLLLLLAAPS LALPSGTGRS APAATVCPEH CDPTRCAPPP
        60         70         80         90        100
TDCEGGRVRD ACGCCEVCGA LEGAACGLQE GPCGEGLQCV VPFGVPASAT
       110        120        130        140        150
VRRRAQAGLC VCASSEPVCG SDAKTYTNLC QLRAASRRSE KLRQPPVIVL
       160        170        180        190        200
QRGACGQGQE DPNSLRHKYN FIADVVEKIA PAVVHIELYR KLPFSKREVP
       210        220        230        240        250
VASGSGFIVS EDGLIVTNAH VVTNKNRVKV ELKNGATYEA KIKDVDEKAD
       260        270        280        290        300
IALIKIDHQG KLPVLLLGRS SELRPGEFVV AIGSPFSLQN TVTTGIVSTT
       310        320        330        340        350
QRGGKELGLR NSDMDYIQTD AIINYGNSGG PLVNLDGEVI GINTLKVTAG
       360        370        380        390        400
ISFAIPSDKI KKFLTESHDR QAKGKAVTKK KYIGIRMMSL TSSKAKELKD
       410        420        430        440        450
RHRDFPDVLS GAYIIEVIPD TPAEAGGLKE NDVIISINGQ SVVTANDVSD
       460        470        480
VIKKENTLNM VVRRGNEDIV ITVIPEEIDP
Length: 480
Mass (Da): 51,214
Last modified: July 27, 2011 - v2
Checksum: 92BDDA85CF5B12B7
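
The annotated positions elsewhere in this entry can be checked directly against the displayed sequence. The following is a minimal sketch, assuming 1-based positions as used in the feature tables. `ROWS`, `residue` and `region` are illustrative names introduced here. The rows are copied from the sequence display with the position rulers omitted.

```python
# Sequence rows copied from this entry (position rulers omitted).
ROWS = (
    "MQSLRTTLLS LLLLLLAAPS LALPSGTGRS APAATVCPEH CDPTRCAPPP",
    "TDCEGGRVRD ACGCCEVCGA LEGAACGLQE GPCGEGLQCV VPFGVPASAT",
    "VRRRAQAGLC VCASSEPVCG SDAKTYTNLC QLRAASRRSE KLRQPPVIVL",
    "QRGACGQGQE DPNSLRHKYN FIADVVEKIA PAVVHIELYR KLPFSKREVP",
    "VASGSGFIVS EDGLIVTNAH VVTNKNRVKV ELKNGATYEA KIKDVDEKAD",
    "IALIKIDHQG KLPVLLLGRS SELRPGEFVV AIGSPFSLQN TVTTGIVSTT",
    "QRGGKELGLR NSDMDYIQTD AIINYGNSGG PLVNLDGEVI GINTLKVTAG",
    "ISFAIPSDKI KKFLTESHDR QAKGKAVTKK KYIGIRMMSL TSSKAKELKD",
    "RHRDFPDVLS GAYIIEVIPD TPAEAGGLKE NDVIISINGQ SVVTANDVSD",
    "VIKKENTLNM VVRRGNEDIV ITVIPEEIDP",
)

# Join the rows and drop the spaces that group residues in tens.
SEQ = "".join(ROWS).replace(" ", "")


def residue(pos: int) -> str:
    """Return the one-letter residue at a 1-based position (hypothetical helper)."""
    return SEQ[pos - 1]


def region(start: int, end: int) -> str:
    """Return residues start..end, 1-based and inclusive (hypothetical helper)."""
    return SEQ[start - 1:end]


print("length:", len(SEQ))

# Positions listed in the Sites table: trimer stabilization and charge relay system.
for pos in (169, 171, 220, 250, 278, 328):
    print(pos, residue(pos))

# Molecule processing: signal peptide 1-22, mature chain 23-480.
print("signal peptide:", region(1, 22))
print("mature chain length:", len(region(23, 480)))
```

Read this way, the sequence has His, Asp and Ser at positions 220, 250 and 328, and Phe and Ser at the two mutagenesis positions 171 and 328, matching the F → D and S → A substitutions listed in the Mutagenesis table.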

Experimental Info

Feature key         Position(s)   Length   Description                                Evidence
Sequence conflict   91 – 91       1        V → L in AAD49422 (PubMed:14973287).       Curated
Sequence conflict   143 – 143     1        R → P in AAD49422 (PubMed:14973287).       Curated
Sequence conflict   179 – 179     1        I → F in AAD49422 (PubMed:14973287).       Curated
Sequence conflict   182 – 182     1        A → D in AAD49422 (PubMed:14973287).       Curated
Sequence conflict   185 – 186     2        HI → KH in AAD49422 (PubMed:14973287).     Curated
Sequence conflict   241 – 241     1        K → I in AAD49422 (PubMed:14973287).       Curated
Sequence conflict   259 – 259     1        Q → K in BAC41168 (PubMed:16141072).       Curated
Sequence conflict   366 – 366     1        E → Q in AAH13516 (PubMed:15489334).       Curated
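
Each sequence conflict can be read as "replace the reference residues at these positions with the residues reported in the other record". The following is a minimal sketch that applies the three AAD49422 conflicts falling between residues 171 and 190 to that window of the displayed sequence. `apply_conflicts` is a hypothetical helper, and it only handles substitutions that keep the length unchanged, as all conflicts in this entry do.

```python
WINDOW_START = 171
# Residues 171-190 of the displayed sequence.
WINDOW = "FIADVVEKIAPAVVHIELYR"

# (start, end, change) copied from the sequence conflict rows for AAD49422.
CONFLICTS = [
    (179, 179, "I → F"),
    (182, 182, "A → D"),
    (185, 186, "HI → KH"),
]


def apply_conflicts(window: str, window_start: int, conflicts) -> str:
    """Apply equal-length substitutions given in 1-based, inclusive coordinates."""
    residues = list(window)
    for start, end, change in conflicts:
        ref, alt = (part.strip() for part in change.split("→"))
        lo = start - window_start
        hi = end - window_start + 1
        observed = "".join(residues[lo:hi])
        # Guard against off-by-one errors: the reference residues must match.
        if observed != ref:
            raise ValueError(f"expected {ref} at {start}-{end}, found {observed}")
        if len(alt) != len(ref):
            raise ValueError("only equal-length substitutions are supported")
        residues[lo:hi] = alt
    return "".join(residues)


variant = apply_conflicts(WINDOW, WINDOW_START, CONFLICTS)
print(WINDOW)
print(variant)
```

Checking the reference residues before substituting is a cheap way to confirm that the positions in the table and the displayed sequence use the same numbering.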

Sequence databases

EMBL / GenBank / DDBJ:
AF172994 mRNA. Translation: AAD49422.1.
AF179369 mRNA. Translation: AAD52682.1.
CH466531 Genomic DNA. Translation: EDL17689.1.
BC013516 mRNA. Translation: AAH13516.1.
AK090320 mRNA. Translation: BAC41168.1.
AK090321 mRNA. Translation: BAC41169.1.
CCDS: CCDS21908.1.
RefSeq: NP_062510.2. NM_019564.3.
UniGene: Mm.30156.

Genome annotation databases

Ensembl: ENSMUST00000006367; ENSMUSP00000006367; ENSMUSG00000006205.
GeneID: 56213.
KEGG: mmu:56213.
UCSC: uc009kau.2. mouse.

Cross-references

Organism-specific databases

CTD: 5654.
MGI: MGI:1929076. Htra1.

Miscellaneous databases

ChiTaRS: Htra1. mouse.
NextBio: 312058.
PRO: Q9R118.

Publications

  1. Cited for: NUCLEOTIDE SEQUENCE [MRNA], FUNCTION, INTERACTION WITH BMP4; TGFB2; TGFB1; ACTIVIN A AND GDF5, TISSUE SPECIFICITY, DEVELOPMENTAL STAGE, MUTAGENESIS OF SER-328.
    Strain: ICR.
    Tissue: Brain.
  2. "Mouse insulin-like growth factor binding protein 5-directed endopeptidase: structural assessment, evolutionary analysis, ovarian expression, hormonal regulation and cellular localization."
    Hourvitz A., Hennebold J.D., King G., Negishi H., Erickson G.F., Roby J.A., Mayo K.E., Adashi E.Y.
    Submitted (AUG-1999) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
    Strain: C57BL/6J.
    Tissue: Ovary.
  3. Mural R.J., Adams M.D., Myers E.W., Smith H.O., Venter J.C.
    Submitted (JUL-2005) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  4. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
  5. "The transcriptional landscape of the mammalian genome."
    Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N., Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K., Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J., Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R., Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T., Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A., Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B., Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M., Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S., Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E., Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D., Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M., Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H., Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V., Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S., Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H., Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N., Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F., Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G., Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z., Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C., Schoenbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y., Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S., Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K., Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R., van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H., Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M., Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C., Mattick J.S., Hume D.A., Kai C., Sasaki D., Tomaru Y., Fukuda S., Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K., Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M., Kondo S., Konno H., Nakano K., Ninomiya N., Nishio T., Okada M., Plessy C., Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A., Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y.
    Science 309:1559-1563(2005)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] OF 73-480.
    Strain: C57BL/6J.
  6. "Binding of proteins to the PDZ domain regulates proteolytic activity of HtrA1 serine protease."
    Murwantoko I., Yano M., Ueta Y., Murasaki A., Kanda H., Oka C., Kawaichi M.
    Biochem. J. 381:895-904(2004)
    Cited for: MUTAGENESIS OF PHE-171.
  7. "Expression of mouse HtrA1 serine protease in normal bone and cartilage and its upregulation in joint cartilage damaged by experimental arthritis."
    Tsuchiya A., Yano M., Tocharus J., Kojima H., Fukumoto M., Kawaichi M., Oka C.
    Bone 37:323-336(2005)
    Cited for: FUNCTION, TISSUE SPECIFICITY, DEVELOPMENTAL STAGE.
  8. "HtrA1-dependent proteolysis of TGF-beta controls both neuronal maturation and developmental survival."
    Launay S., Maubert E., Lebeurrier N., Tennstaedt A., Campioni M., Docagne F., Gabriel C., Dauphinot L., Potier M.C., Ehrmann M., Baldi A., Vivien D.
    Cell Death Differ. 15:1408-1416(2008)
    Cited for: FUNCTION, TISSUE SPECIFICITY, DEVELOPMENTAL STAGE.
  9. "High temperature requirement factor A1 (HTRA1) gene regulates angiogenesis through transforming growth factor-beta family member growth differentiation factor 6."
    Zhang L., Lim S.L., Du H., Zhang M., Kozak I., Hannum G., Wang X., Ouyang H., Hughes G., Zhao L., Zhu X., Lee C., Su Z., Zhou X., Shaw R., Geum D., Wei X., Zhu J., Ideker T., Oka C., Wang N., Yang Z., Shaw P.X., Zhang K.
    J. Biol. Chem. 287:1520-1526(2012)
    Cited for: DISRUPTION PHENOTYPE, FUNCTION.

Entry information

Entry name: HTRA1_MOUSE
Accession: Primary (citable) accession number: Q9R118
Secondary accession number(s): Q8BN04, Q8BN05, Q91WS3, Q9QZK6
Entry history
Integrated into UniProtKB/Swiss-Prot: October 18, 2001
Last sequence update: July 27, 2011
Last modified: February 17, 2016
This is version 129 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. Peptidase families
    Classification of peptidase families and list of entries
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (also known as the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.