Protein

Cytokine-inducible SH2-containing protein

Gene

CISH

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

SOCS family proteins form part of a classical negative feedback system that regulates cytokine signal transduction. CIS is involved in the negative regulation of cytokines that signal through the JAK-STAT5 pathway, such as erythropoietin, prolactin and interleukin 3 (IL3) receptor. Inhibits STAT5 trans-activation by suppressing its tyrosine phosphorylation. May be a substrate-recognition component of a SCF-like ECS (Elongin BC-CUL2/5-SOCS-box protein) E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins (By similarity).

Pathway

GO - Molecular function

  1. protein kinase inhibitor activity Source: GO_Central

GO - Biological process

  1. cytokine-mediated signaling pathway Source: GO_Central
  2. intracellular signal transduction Source: UniProtKB
  3. JAK-STAT cascade involved in growth hormone signaling pathway Source: Reactome
  4. negative regulation of insulin receptor signaling pathway Source: GO_Central
  5. negative regulation of JAK-STAT cascade Source: GO_Central
  6. negative regulation of protein kinase activity Source: GO_Central
  7. protein kinase C-activating G-protein coupled receptor signaling pathway Source: Ensembl
  8. protein ubiquitination Source: UniProtKB-UniPathway
  9. regulation of cell growth Source: UniProtKB

Keywords - Molecular function

Signal transduction inhibitor

Keywords - Biological process

Growth regulation, Ubl conjugation pathway

Enzyme and pathway databases

Reactome: REACT_111133. Growth hormone receptor signaling.
SignaLink: Q9NSE2.
UniPathway: UPA00143.

Names & Taxonomy

Protein names
Recommended name:
Cytokine-inducible SH2-containing protein
Short name:
CIS
Alternative name(s):
CIS-1
Protein G18
Suppressor of cytokine signaling
Short name:
SOCS
Gene names
Name: CISH
Synonyms: G18
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes: UP000005640. Component: Chromosome 3

Organism-specific databases

HGNC: HGNC:1984. CISH.

Subcellular location

GO - Cellular component

  1. cytoplasm Source: GO_Central
  2. cytosol Source: Reactome
  3. plasma membrane Source: Ensembl

Pathology & Biotech

Organism-specific databases

MIM: 607948. phenotype.
MIM: 611162. phenotype.
MIM: 614383. phenotype.
PharmGKB: PA26521.

PTM / Processing

Molecule processing

Feature key  Position(s)  Length  Description                                Feature identifier
Chain        1 – 258      258     Cytokine-inducible SH2-containing protein  PRO_0000181231

Post-translational modification

Association with EPOR may target the protein for proteolysis by the ubiquitin-dependent proteasome pathway. CIS is mainly monoubiquitinated (37 kDa form) but may also exist in a polyubiquitinated form (45 kDa). 1 Publication
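
As a rough plausibility check on the band sizes quoted above, the following sketch adds ubiquitin moieties to the unmodified chain mass listed in this entry (28,663 Da). The ubiquitin mass (about 8,565 Da) and the loss of one water per isopeptide bond are assumptions taken from general biochemistry, not from this entry, and the helper name `conjugate_mass_kda` is hypothetical. Apparent masses on a gel rarely match theoretical values exactly.

```python
# Theoretical mass of CIS carrying n ubiquitin moieties (a rough sketch).
CIS_MASS_DA = 28_663.0        # canonical isoform mass as listed in this entry
UBIQUITIN_MASS_DA = 8_565.0   # ASSUMED average mass of free ubiquitin (76 residues)
WATER_MASS_DA = 18.015        # one water is lost per isopeptide bond formed


def conjugate_mass_kda(n_ubiquitin: int) -> float:
    """Return the theoretical mass in kDa of CIS with n ubiquitins attached."""
    if n_ubiquitin < 0:
        raise ValueError("n_ubiquitin must be non-negative")
    added = n_ubiquitin * (UBIQUITIN_MASS_DA - WATER_MASS_DA)
    return (CIS_MASS_DA + added) / 1000.0


for n in range(3):
    print(f"{n} ubiquitin(s): {conjugate_mass_kda(n):.1f} kDa")
```

Under these assumptions one ubiquitin gives roughly 37 kDa, in line with the monoubiquitinated form, and two give roughly 46 kDa, close to the 45 kDa form.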

Keywords - PTM

Ubl conjugation

Proteomic databases

MaxQB: Q9NSE2.
PaxDb: Q9NSE2.
PRIDE: Q9NSE2.

PTM databases

PhosphoSite: Q9NSE2.

Expression

Tissue specificity

Expressed in various epithelial tissues. Abundantly expressed in liver and kidney, and to a lesser extent in lung. The tissue distribution of isoforms 1 and 1B is distinct. 1 Publication

Induction

By a subset of cytokines including EPO/erythropoietin.

Gene expression databases

Bgee: Q9NSE2.
CleanEx: HS_CISH.
Genevestigator: Q9NSE2.

Organism-specific databases

HPA: CAB034039.
HPA: HPA040812.

Interaction

Subunit structure

Stably associated with the tyrosine-phosphorylated IL3 receptor beta chain and tyrosine-phosphorylated EPO receptor (EPOR).

Binary interactions

With   Entry   #Exp.  IntAct
EGFR   P00533  3      EBI-617866, EBI-297353
ERBB2  P04626  4      EBI-617866, EBI-641062

Protein-protein interaction databases

BioGrid: 107574. 23 interactions.
IntAct: Q9NSE2. 9 interactions.
MINT: MINT-1191268.
STRING: 9606.ENSP00000409346.

Structure

3D structure databases

ProteinModelPortal: Q9NSE2.
SMR: Q9NSE2. Positions 61-258.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Description
Domain       82 – 163     82      SH2 (PROSITE-ProRule annotation)
Domain       209 – 257    49      SOCS box (PROSITE-ProRule annotation)
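
The positions in the table above are 1-based and inclusive, which is why a domain spanning 82 to 163 has length 82 rather than 81. A minimal sketch of that convention follows; the `Feature` class is a hypothetical helper for illustration and not part of any UniProt tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A sequence feature with 1-based, inclusive coordinates (hypothetical helper)."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive coordinates: both end points count.
        return self.end - self.start + 1

    def extract(self, sequence: str) -> str:
        """Slice the feature out of a sequence (Python slices are 0-based, end-exclusive)."""
        if self.start < 1 or self.end > len(sequence):
            raise ValueError(f"{self.name} lies outside the sequence")
        return sequence[self.start - 1:self.end]


SH2 = Feature("SH2", 82, 163)
SOCS_BOX = Feature("SOCS box", 209, 257)

for feature in (SH2, SOCS_BOX):
    print(feature.name, feature.length)

# Toy demonstration on the first ten residues of the canonical sequence.
print(Feature("first seven residues", 1, 7).extract("MVLCVQGPRP"))
```

The printed lengths (82 and 49) match the Length column, and the toy slice returns the seven N-terminal residues that differ between isoforms.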

Sequence similarities

Contains 1 SH2 domain. PROSITE-ProRule annotation
Contains 1 SOCS box domain. PROSITE-ProRule annotation

Keywords - Domain

SH2 domain

Phylogenomic databases

eggNOG: NOG301272.
GeneTree: ENSGT00760000119136.
HOGENOM: HOG000236320.
HOVERGEN: HBG002457.
InParanoid: Q9NSE2.
KO: K04701.
OMA: CRLVINR.
OrthoDB: EOG71RXKX.
PhylomeDB: Q9NSE2.
TreeFam: TF321368.

Family and domain databases

Gene3D: 3.30.505.10. 2 hits.
InterPro: IPR028425. CIS.
InterPro: IPR000980. SH2.
InterPro: IPR028413. SOCS.
InterPro: IPR001496. SOCS_C.
PANTHER: PTHR10385. 1 hit.
PANTHER: PTHR10385:SF7. 1 hit.
Pfam: PF00017. SH2. 1 hit.
Pfam: PF07525. SOCS_box. 1 hit.
PRINTS: PR00401. SH2DOMAIN.
SMART: SM00252. SH2. 1 hit.
SMART: SM00253. SOCS. 1 hit.
SMART: SM00969. SOCS_box. 1 hit.
SUPFAM: SSF55550. 1 hit.
PROSITE: PS50001. SH2. 1 hit.
PROSITE: PS50225. SOCS. 1 hit.

Sequences (3)

Sequence status: Complete.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q9NSE2-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MVLCVQGPRP LLAVERTGQR PLWAPSLELP KPVMQPLPAG AFLEEVAEGT
        60         70         80         90        100
PAQTESEPKV LDPEEDLLCI AKTFSYLRES GWYWGSITAS EARQHLQKMP
       110        120        130        140        150
EGTFLVRDST HPSYLFTLSV KTTRGPTNVR IEYADSSFRL DSNCLSRPRI
       160        170        180        190        200
LAFPDVVSLV QHYVASCTAD TRSDSPDPAP TPALPMPKED APSDPALPAP
       210        220        230        240        250
PPATAVHLKL VQPFVRRSSA RSLQHLCRLV INRLVADVDC LPLPRRMADY
LRQYPFQL
Length: 258
Mass (Da): 28,663
Last modified: September 30, 2000 - v1
Checksum: 0BA97804E3B0F608

Isoform 1B (identifier: Q9NSE2-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-7: MVLCVQG → MGPPLTAHPLPSR

Length: 264
Mass (Da): 29,288
Checksum: 0C1C5791EAFE56DC

Isoform 1C (identifier: Q9NSE2-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1-7: MVLCVQG → MYLEHTSHCPHHDDDTAMDTPLPR

Length: 275
Mass (Da): 30,734
Checksum: 6F9A0020F2D27A19

Sequence caution

The sequence AAD28471.2 differs from that shown. Reason: Erroneous initiation. Translation N-terminally extended. Curated

Polymorphism

Note: CISH polymorphisms are involved in susceptibility to malaria [MIM:611162].
Note: Genetic variations in CISH are involved in susceptibility to tuberculosis [MIM:607948].
Note: Genetic variations in CISH are associated with susceptibility to bacterial invasion of the blood and define the bacteremia susceptibility locus 2 (BACTS2) [MIM:614383].

Alternative sequence

Feature key           Position(s)  Length  Description                                                      Feature identifier
Alternative sequence  1 – 7        7       MVLCVQG → MGPPLTAHPLPSR in isoform 1B. 1 Publication             VSP_006194
Alternative sequence  1 – 7        7       MVLCVQG → MYLEHTSHCPHHDDDTAMDTPLPR in isoform 1C. 1 Publication  VSP_006195
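
The isoform lengths and masses listed in this entry can be cross-checked by applying the two N-terminal substitutions to the canonical sequence. The sketch below does that in plain Python. The average residue masses are standard textbook values brought in as an assumption (they do not appear in this entry), and the function names are hypothetical, so the computed masses should be read as approximate.

```python
# Canonical isoform 1 sequence, copied from this entry in blocks of ten residues.
CANONICAL = "".join("""
MVLCVQGPRP LLAVERTGQR PLWAPSLELP KPVMQPLPAG AFLEEVAEGT
PAQTESEPKV LDPEEDLLCI AKTFSYLRES GWYWGSITAS EARQHLQKMP
EGTFLVRDST HPSYLFTLSV KTTRGPTNVR IEYADSSFRL DSNCLSRPRI
LAFPDVVSLV QHYVASCTAD TRSDSPDPAP TPALPMPKED APSDPALPAP
PPATAVHLKL VQPFVRRSSA RSLQHLCRLV INRLVADVDC LPLPRRMADY
LRQYPFQL
""".split())

# ASSUMED average residue masses in Da (standard values, not taken from this entry).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water accounts for the free N- and C-termini

# Isoform name -> (first, last, original, replacement); coordinates are 1-based, inclusive.
VARIANTS = {
    "1B": (1, 7, "MVLCVQG", "MGPPLTAHPLPSR"),
    "1C": (1, 7, "MVLCVQG", "MYLEHTSHCPHHDDDTAMDTPLPR"),
}


def apply_variant(seq: str, first: int, last: int, original: str, replacement: str) -> str:
    """Replace residues first..last (1-based, inclusive) after checking the original segment."""
    if seq[first - 1:last] != original:
        raise ValueError("canonical sequence does not match the expected segment")
    return seq[:first - 1] + replacement + seq[last:]


def average_mass(seq: str) -> float:
    """Sum average residue masses and add one water for the chain termini."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


isoforms = {"1": CANONICAL}
for name, variant in VARIANTS.items():
    isoforms[name] = apply_variant(CANONICAL, *variant)

for name, seq in isoforms.items():
    print(f"Isoform {name}: length {len(seq)}, mass about {average_mass(seq):.0f} Da")
```

The lengths come out as 258, 264 and 275, matching the values listed for isoforms 1, 1B and 1C. The masses should land close to the listed figures, with small differences possible depending on the residue mass table used.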

Sequence databases

EMBL / GenBank / DDBJ:
D83532 mRNA. Translation: BAA92328.1.
AF035947 mRNA. Translation: AAF97410.1.
AF132297 mRNA. Translation: AAD28471.2. Different initiation.
AK313850 mRNA. Translation: BAG36578.1.
AC096920 Genomic DNA. No translation available.
CH471055 Genomic DNA. Translation: EAW65128.1.
CH471055 Genomic DNA. Translation: EAW65129.1.
BC031590 mRNA. Translation: AAH31590.1.
BC064354 mRNA. Translation: AAH64354.1.
CCDS: CCDS2831.1. [Q9NSE2-1]
CCDS: CCDS46834.1. [Q9NSE2-3]
PIR: JC7512.
RefSeq: NP_037456.5. NM_013324.5. [Q9NSE2-3]
RefSeq: NP_659508.1. NM_145071.2. [Q9NSE2-1]
RefSeq: XP_005264903.1. XM_005264846.2. [Q9NSE2-3]
UniGene: Hs.655334.

Genome annotation databases

Ensembl: ENST00000348721; ENSP00000294173; ENSG00000114737. [Q9NSE2-1]
Ensembl: ENST00000443053; ENSP00000409346; ENSG00000114737. [Q9NSE2-3]
GeneID: 1154.
KEGG: hsa:1154.
UCSC: uc003dax.3. human. [Q9NSE2-1]
UCSC: uc010hlq.3. human.

Polymorphism databases

DMDM: 13124022.

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Protocols and materials databases

DNASU: 1154.

Organism-specific databases

CTD: 1154.
GeneCards: GC03M050618.
HGNC: HGNC:1984. CISH.
HPA: CAB034039.
HPA: HPA040812.
MIM: 602441. gene.
MIM: 607948. phenotype.
MIM: 611162. phenotype.
MIM: 614383. phenotype.
neXtProt: NX_Q9NSE2.
PharmGKB: PA26521.

Miscellaneous databases

ChiTaRS: CISH. human.
GeneWiki: CISH.
GenomeRNAi: 1154.
NextBio: 4792.
PRO: Q9NSE2.

Publications

  1. "Molecular cloning of CISH, chromosome assignment to 3p21.3, and analysis of expression in fetal and adult tissues."
    Uchida K., Yoshimura A., Inazawa J., Yanagisawa K., Osada H., Masuda A., Saito T., Takahashi T., Miyajima A., Takahashi K.
    Cytogenet. Cell Genet. 78:209-212(1997)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1).
    Tissue: Fetal lung.
  2. "Cloning and characterization of CIS 1b (cytokine inducible SH2-containing protein 1b), an alternative splicing form of CIS 1 gene."
    Jiang C., Yu L., Zhao Y., Zhang M., Liu Q., Mao N., Geng Z., Zhao S.
    DNA Seq. 11:149-154(2000)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1B).
    Tissue: Placenta.
  3. "The human G18 is an ortholog of the rodent gene CIS-2 and is located in 3p21.3 and homozygously deleted in lung cancer."
    Wei M.-H., Minna J.D., Lerman M.I.
    Submitted (FEB-2000) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1C).
  4. "cDNA representational difference analysis of human neutrophils stimulated by GM-CSF."
    Yousefi S., Cooper P.R., Mueck B., Potter S.L., Jarai G.
    Biochem. Biophys. Res. Commun. 277:401-409(2000)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1), TISSUE SPECIFICITY.
  5. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
    Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
    Nat. Genet. 36:40-45(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
    Tissue: Placenta.
  6. "The DNA sequence, annotation and analysis of human chromosome 3."
    Muzny D.M., Scherer S.E., Kaul R., Wang J., Yu J., Sudbrak R., Buhay C.J., Chen R., Cree A., Ding Y., Dugan-Rocha S., Gill R., Gunaratne P., Harris R.A., Hawes A.C., Hernandez J., Hodgson A.V., Hume J., Jackson A., Khan Z.M., Kovar-Smith C., Lewis L.R., Lozado R.J., Metzker M.L., Milosavljevic A., Miner G.R., Morgan M.B., Nazareth L.V., Scott G., Sodergren E., Song X.-Z., Steffen D., Wei S., Wheeler D.A., Wright M.W., Worley K.C., Yuan Y., Zhang Z., Adams C.Q., Ansari-Lari M.A., Ayele M., Brown M.J., Chen G., Chen Z., Clendenning J., Clerc-Blankenburg K.P., Chen R., Chen Z., Davis C., Delgado O., Dinh H.H., Dong W., Draper H., Ernst S., Fu G., Gonzalez-Garay M.L., Garcia D.K., Gillett W., Gu J., Hao B., Haugen E., Havlak P., He X., Hennig S., Hu S., Huang W., Jackson L.R., Jacob L.S., Kelly S.H., Kube M., Levy R., Li Z., Liu B., Liu J., Liu W., Lu J., Maheshwari M., Nguyen B.-V., Okwuonu G.O., Palmeiri A., Pasternak S., Perez L.M., Phelps K.A., Plopper F.J., Qiang B., Raymond C., Rodriguez R., Saenphimmachak C., Santibanez J., Shen H., Shen Y., Subramanian S., Tabor P.E., Verduzco D., Waldron L., Wang J., Wang J., Wang Q., Williams G.A., Wong G.K.-S., Yao Z., Zhang J., Zhang X., Zhao G., Zhou J., Zhou Y., Nelson D., Lehrach H., Reinhardt R., Naylor S.L., Yang H., Olson M., Weinstock G., Gibbs R.A.
    Nature 440:1194-1198(2006)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  7. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  8. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
    Tissue: Colon and Skin.
  9. "Proteasomes regulate erythropoietin receptor and signal transducer and activator of transcription 5 (STAT5) activation. Possible involvement of the ubiquitinated CIS protein."
    Verdier F., Chretien S., Muller O., Varlet P., Yoshimura A., Gisselbrecht S., Lacombe C., Mayeux P.
    J. Biol. Chem. 273:28185-28190(1998)
    Cited for: INTERACTION WITH EPOR, UBIQUITINATION.
  10. Cited for: POLYMORPHISM, INVOLVEMENT IN SUSCEPTIBILITY TO TUBERCULOSIS; MALARIA AND BACTS2.

Entry information

Entry name: CISH_HUMAN
Accession: Primary (citable) accession number: Q9NSE2
Secondary accession number(s): B2R9N1, G5E9R1, Q9NS38, Q9Y5R1
Entry history
Integrated into UniProtKB/Swiss-Prot: February 20, 2001
Last sequence update: September 30, 2000
Last modified: February 3, 2015
This is version 126 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 3
    Human chromosome 3: entries, gene names and cross-references to MIM
  2. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  3. PATHWAY comments
    Index of metabolic and biosynthesis pathways
  4. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
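
The UniRef90 and UniRef50 criteria described above reduce to two thresholds per level: a minimum sequence identity and a minimum overlap with the seed sequence. The sketch below encodes only those thresholds. The function name `meets_uniref_threshold` is hypothetical, the identity and overlap values are assumed to be computed elsewhere from an alignment, and the actual UniRef pipeline relies on dedicated clustering software that this example does not attempt to reproduce.

```python
# Minimum identity to the seed sequence for each clustering level described above.
MIN_IDENTITY = {90: 0.90, 50: 0.50}
MIN_OVERLAP = 0.80  # required overlap with the seed (longest) sequence


def meets_uniref_threshold(identity: float, overlap: float, level: int) -> bool:
    """Return True if a sequence satisfies the stated criteria for joining a cluster.

    identity and overlap are fractions between 0 and 1, measured against the seed sequence.
    """
    if level not in MIN_IDENTITY:
        raise ValueError("level must be 90 or 50")
    if not (0.0 <= identity <= 1.0 and 0.0 <= overlap <= 1.0):
        raise ValueError("identity and overlap must be fractions between 0 and 1")
    return identity >= MIN_IDENTITY[level] and overlap >= MIN_OVERLAP


# A sequence 93% identical to the seed over 85% of its length qualifies for UniRef90.
print(meets_uniref_threshold(0.93, 0.85, level=90))
# The same identity with only 70% overlap does not.
print(meets_uniref_threshold(0.93, 0.70, level=90))
```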