Protein

Stimulator of interferon genes protein

Gene

Tmem173

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Facilitator of innate immune signaling that acts as a sensor of cytosolic DNA from bacteria and viruses and promotes the production of type I interferon (IFN-alpha and IFN-beta). The innate immune response is triggered by non-CpG double-stranded DNA from viruses and bacteria delivered to the cytoplasm. Acts by recognizing and binding cyclic di-GMP (c-di-GMP), a second messenger produced by bacteria, and cyclic GMP-AMP (cGAMP), a messenger produced in response to DNA virus in the cytosol: upon binding of c-di-GMP or cGAMP, autoinhibition is alleviated and TMEM173/STING is able to activate both NF-kappa-B and IRF3 transcription pathways to induce expression of type I interferon and exert a potent anti-viral state. May be involved in translocon function, the translocon possibly being able to influence the induction of type I interferons. May be involved in transduction of apoptotic signals via its association with the major histocompatibility complex class II (MHC-II). Mediates death signaling via activation of the extracellular signal-regulated kinase (ERK) pathway. Exhibits 2',3' phosphodiester linkage-specific ligand recognition. Can bind both 2'-3' linked cGAMP and 3'-3' linked cGAMP but is preferentially activated by 2'-3' linked cGAMP (PubMed:26300263). (10 Publications)

Sites

Feature key   Position(s)  Length  Description  Evidence
Binding site  262          1       c-di-GMP     Combined sources, 1 Publication

GO - Molecular function

  • cyclic-di-GMP binding Source: UniProtKB
  • cyclic-GMP-AMP binding Source: UniProtKB
  • protein homodimerization activity Source: UniProtKB
  • protein kinase binding Source: MGI
  • transcription factor binding Source: MGI
  • ubiquitin protein ligase binding Source: UniProtKB

GO - Biological process

  • activation of innate immune response Source: UniProtKB
  • apoptotic process Source: UniProtKB-KW
  • cellular response to exogenous dsRNA Source: MGI
  • cellular response to interferon-beta Source: MGI
  • cellular response to organic cyclic compound Source: UniProtKB
  • defense response to virus Source: UniProtKB
  • innate immune response Source: UniProtKB
  • interferon-beta production Source: UniProtKB
  • positive regulation of defense response to virus by host Source: MGI
  • positive regulation of protein binding Source: MGI
  • positive regulation of protein import into nucleus, translocation Source: MGI
  • positive regulation of transcription factor import into nucleus Source: MGI
  • positive regulation of transcription from RNA polymerase II promoter Source: MGI
  • positive regulation of type I interferon production Source: UniProtKB

Keywords - Biological process

Apoptosis, Immunity, Innate immunity

Keywords - Ligand

Nucleotide-binding

Enzyme and pathway databases

Reactome: R-MMU-1834941. STING mediated induction of host immune responses.
R-MMU-3134975. Regulation of innate immune responses to cytosolic DNA.
R-MMU-3249367. STAT6-mediated induction of chemokines.
R-MMU-3270619. IRF3-mediated induction of type I IFN.
R-MMU-6798695. Neutrophil degranulation.

Names & Taxonomy

Protein names
Recommended name: Stimulator of interferon genes protein (2 Publications)
Short name: mSTING (2 Publications)
Alternative name(s):
Endoplasmic reticulum interferon stimulator (1 Publication)
Short name: ERIS (1 Publication)
Mediator of IRF3 activation
Short name: MMITA
Transmembrane protein 173
Gene names
Name: Tmem173
Synonyms: Eris, Mita (1 Publication), Mpys (1 Publication), Sting (2 Publications)
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 18

Organism-specific databases

MGI: MGI:1919762. Tmem173.

Subcellular location

Topology

Feature key         Position(s)  Length  Description      Evidence
Topological domain  1 – 20       20      Cytoplasmic      Sequence analysis
Transmembrane       21 – 41      21      Helical; Name=1  Sequence analysis
Topological domain  42 – 46      5       Extracellular    Sequence analysis
Transmembrane       47 – 67      21      Helical; Name=2  Sequence analysis
Topological domain  68 – 86      19      Cytoplasmic      Sequence analysis
Transmembrane       87 – 106     20      Helical; Name=3  Sequence analysis
Topological domain  107 – 114    8       Extracellular    Sequence analysis
Transmembrane       115 – 135    21      Helical; Name=4  Sequence analysis
Topological domain  136 – 378    243     Cytoplasmic      Sequence analysis
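
The topology rows above should tile the 378-residue chain without gaps or overlaps, and each stated length should equal end minus start plus one. The following is a minimal sketch of that consistency check; the names `TOPOLOGY`, `tiles_chain` and `segment_at` are illustrative and not part of the entry.

```python
# Topology rows copied from the table: (description, start, end, stated_length).
TOPOLOGY = [
    ("Cytoplasmic", 1, 20, 20),
    ("Helical; Name=1", 21, 41, 21),
    ("Extracellular", 42, 46, 5),
    ("Helical; Name=2", 47, 67, 21),
    ("Cytoplasmic", 68, 86, 19),
    ("Helical; Name=3", 87, 106, 20),
    ("Extracellular", 107, 114, 8),
    ("Helical; Name=4", 115, 135, 21),
    ("Cytoplasmic", 136, 378, 243),
]
CHAIN_LENGTH = 378  # length of the canonical isoform


def tiles_chain(segments, chain_length):
    """True if segments cover 1..chain_length in order, with no gaps or
    overlaps, and every stated length equals end - start + 1."""
    next_start = 1
    for _, start, end, stated in segments:
        if start != next_start or end - start + 1 != stated:
            return False
        next_start = end + 1
    return next_start == chain_length + 1


def segment_at(position, segments=TOPOLOGY):
    """Return the description of the segment containing a 1-based position."""
    for description, start, end, _ in segments:
        if start <= position <= end:
            return description
    raise ValueError(f"position {position} is outside the chain")


if __name__ == "__main__":
    print(tiles_chain(TOPOLOGY, CHAIN_LENGTH))
    print(segment_at(150))
```

Positions are 1-based and inclusive, matching the numbering used throughout the entry.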

GO - Cellular component

  • endoplasmic reticulum Source: MGI
  • endoplasmic reticulum membrane Source: UniProtKB
  • Golgi apparatus Source: MGI
  • integral component of membrane Source: UniProtKB-KW
  • mitochondrial outer membrane Source: MGI
  • perinuclear region of cytoplasm Source: UniProtKB
  • peroxisome Source: MGI
  • plasma membrane Source: UniProtKB-SubCell

Keywords - Cellular component

Cell membrane, Cytoplasm, Endoplasmic reticulum, Membrane, Mitochondrion, Mitochondrion outer membrane

Pathology & Biotech

Disruption phenotype

Defects in innate immunity. Death within 7 days of herpes simplex virus 1 (HSV-1) infection. In addition, mice show a remarkable reduction in cytotoxic T-cell responses after plasmid DNA vaccination. Cells fail to induce type I interferon production in response to dsDNA and to infection with herpes simplex virus 1 (HSV-1) and L. monocytogenes, which deliver DNA to the host cytosol. (1 Publication)

Mutagenesis

Feature key  Position(s)  Length  Description
Mutagenesis  161          1       S → A: Decrease in cGAMP-binding. (1 Publication)
Mutagenesis  229          1       I → A, G or T: Strongly decreases affinity for the synthetic compound 5,6-dimethylxanthenone 4-acetic acid (DMXAA). (1 Publication)
Mutagenesis  239          1       Y → S: Strong decrease in cGAMP-binding. (1 Publication)
Mutagenesis  241          1       N → A: Strong decrease in cGAMP-binding. (1 Publication)

PTM / Processing

Molecule processing

Feature key  Identifier      Position(s)  Length  Description
Chain        PRO_0000271117  1 – 378      378     Stimulator of interferon genes protein

Amino acid modifications

Feature key       Position(s)  Length  Description
Cross-link        150          1       Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in ubiquitin)
Modified residue  357          1       Phosphoserine; by TBK1 (By similarity)

Post-translational modification

Phosphorylated on Ser-357 by TBK1, leading to activation and production of IFN-beta (By similarity). Phosphorylated on tyrosine residues upon MHC-II aggregation. (By similarity, 1 Publication)
Ubiquitinated. 'Lys-63'-linked ubiquitination mediated by TRIM56 at Lys-150 promotes homodimerization and recruitment of the antiviral kinase TBK1 and subsequent production of IFN-beta. 'Lys-48'-linked polyubiquitination at Lys-150 occurring after viral infection is mediated by RNF5 and leads to proteasomal degradation (By similarity).

Keywords - PTM

Isopeptide bond, Phosphoprotein, Ubl conjugation

Proteomic databases

EPD: Q3TBT3.
MaxQB: Q3TBT3.
PaxDb: Q3TBT3.
PeptideAtlas: Q3TBT3.
PRIDE: Q3TBT3.

PTM databases

PhosphoSitePlus: Q3TBT3.
SwissPalm: Q3TBT3.

Expression

Tissue specificity

Present in spleen and thymus tissue. Also present in dendritic cells (at protein level). (1 Publication)

Developmental stage

Expressed throughout the B-cell lineage prior to the plasma cell stage, with highest levels in mature B-cells. Weakly expressed in pre-B cells, immature B-cells, and memory B-cells. Not detected in plasma cells. (1 Publication)

Gene expression databases

Bgee: ENSMUSG00000024349.
CleanEx: MM_TMEM173.
Genevisible: Q3TBT3. MM.

Interaction

Subunit structure

Homodimer; 'Lys-63'-linked ubiquitination at Lys-150 is required for homodimerization (PubMed:18559423). Interacts with TBK1; when homodimer, leading to subsequent production of IFN-beta (By similarity). Interacts with DDX58/RIG-I, MAVS and SSR2. Interacts with RNF5. Associates with the MHC-II complex. Interacts with IFIT1 and IFIT2 (By similarity). (1 Publication)

Binary interactions

With   Entry   #Exp.  IntAct
Ddx41  Q91VN6  4      EBI-3862093,EBI-2551902
Tbk1   Q9WUN2  3      EBI-3862093,EBI-764193

Protein-protein interaction databases

BioGrid: 215410. 2 interactors.
DIP: DIP-59959N.
IntAct: Q3TBT3. 103 interactors.
STRING: 10090.ENSMUSP00000111393.

Structure

Secondary structure

Feature key  Position(s)  Length  Evidence
Helix        156 – 164    9       Combined sources
Helix        167 – 170    4       Combined sources
Turn         171 – 173    3       Combined sources
Helix        174 – 184    11      Combined sources
Turn         185 – 189    5       Combined sources
Helix        192 – 194    3       Combined sources
Beta strand  195 – 202    8       Combined sources
Helix        211 – 213    3       Combined sources
Beta strand  218 – 223    6       Combined sources
Beta strand  227 – 231    5       Combined sources
Beta strand  234 – 239    6       Combined sources
Beta strand  242 – 248    7       Combined sources
Beta strand  251 – 260    10      Combined sources
Helix        262 – 272    11      Combined sources
Turn         274 – 276    3       Combined sources
Helix        280 – 298    19      Combined sources
Helix        304 – 307    4       Combined sources
Beta strand  308 – 313    6       Combined sources
Beta strand  317 – 319    3       Combined sources
Helix        324 – 333    10      Combined sources

3D structure databases

PDB entry  Method  Resolution (Å)  Chain  Positions
4JC5       X-ray   2.75            A/B    149-348
4KBY       X-ray   2.36            A/B    138-344
4KC0       X-ray   2.20            A/B    138-344
4LOJ       X-ray   1.77            A/B    154-340
4LOK       X-ray   2.07            A/B    154-340
4LOL       X-ray   2.43            A/B    154-340
4YP1       X-ray   2.65            A/B    138-344
ProteinModelPortal: Q3TBT3.
SMR: Q3TBT3.
ModBase: Search...
MobiDB: Search...
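
The PDB rows listed above each cover only part of the chain, so a common question is which structures include a given annotated residue. The sketch below, using illustrative names (`PDB_ENTRIES`, `covering`, `best_resolution`) that are not part of the entry, looks up coverage by position and picks the entry with the lowest resolution value.

```python
# (PDB entry, resolution in angstroms, first residue, last residue),
# copied from the 3D structure table for this entry.
PDB_ENTRIES = [
    ("4JC5", 2.75, 149, 348),
    ("4KBY", 2.36, 138, 344),
    ("4KC0", 2.20, 138, 344),
    ("4LOJ", 1.77, 154, 340),
    ("4LOK", 2.07, 154, 340),
    ("4LOL", 2.43, 154, 340),
    ("4YP1", 2.65, 138, 344),
]


def covering(position, entries=PDB_ENTRIES):
    """Return the IDs of entries whose residue range includes a 1-based position."""
    return [pdb_id for pdb_id, _, start, end in entries if start <= position <= end]


def best_resolution(entries=PDB_ENTRIES):
    """Return the ID of the entry with the smallest resolution value."""
    return min(entries, key=lambda entry: entry[1])[0]


if __name__ == "__main__":
    print(covering(262))       # annotated c-di-GMP binding site
    print(covering(357))       # annotated phosphoserine
    print(best_resolution())
```

Under these ranges the binding site at position 262 lies inside every listed structure, while Ser-357 in the C-terminal tail lies outside all of them.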

Family & Domains

Region

Feature key  Position(s)  Length  Description                    Evidence
Region       152 – 339    188     c-di-GMP-binding domain (CBD)  1 Publication
Region       161 – 166    6       c-di-GMP binding               Combined sources, 1 Publication
Region       237 – 240    4       c-di-GMP binding               Combined sources, 1 Publication
Region       339 – 378    40      C-terminal tail (CTT)          By similarity

Domain

The c-di-GMP-binding domain (CBD) forms a homodimer via hydrophobic interactions and binds both the cyclic diguanylate monophosphate (c-di-GMP) and the cyclic GMP-AMP (cGAMP) messengers. In the absence of c-di-GMP or cGAMP, the protein is autoinhibited by an intramolecular interaction between the CBD and the C-terminal tail (CTT). Binding of c-di-GMP or cGAMP to the CBD releases the autoinhibition by displacing the CTT, leading to activation of both the NF-kappa-B and IRF3 transcription pathways, which induce expression of type I interferon. The N-terminal part of the CBD region was initially thought to contain a fifth transmembrane region (TM5) but is part of the folded, soluble CBD (By similarity).

Sequence similarities

Belongs to the TMEM173 family. (Curated)

Keywords - Domain

Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: ENOG410IH2R. Eukaryota.
ENOG4111M85. LUCA.
GeneTree: ENSGT00390000008582.
HOVERGEN: HBG094065.
InParanoid: Q3TBT3.
KO: K12654.
OMA: TWMLALL.
OrthoDB: EOG091G08B2.
PhylomeDB: Q3TBT3.
TreeFam: TF324444.

Family and domain databases

CDD: cd12146. STING_C. 1 hit.
InterPro: IPR029158. STING.
IPR033952. STING_C.
Pfam: PF15009. TMEM173. 1 hit.

Sequences (3)

Sequence status: Complete.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q3TBT3-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MPYSNLHPAI PRPRGHRSKY VALIFLVASL MILWVAKDPP NHTLKYLALH
        60         70         80         90        100
LASHELGLLL KNLCCLAEEL CHVQSRYQGS YWKAVRACLG CPIHCMAMIL
       110        120        130        140        150
LSSYFYFLQN TADIYLSWMF GLLVLYKSLS MLLGLQSLTP AEVSAVCEEK
       160        170        180        190        200
KLNVAHGLAW SYYIGYLRLI LPGLQARIRM FNQLHNNMLS GAGSRRLYIL
       210        220        230        240        250
FPLDCGVPDN LSVVDPNIRF RDMLPQQNID RAGIKNRVYS NSVYEILENG
       260        270        280        290        300
QPAGVCILEY ATPLQTLFAM SQDAKAGFSR EDRLEQAKLF CRTLEEILED
       310        320        330        340        350
VPESRNNCRL IVYQEPTDGN SFSLSQEVLR HIRQEEKEEV TMNAPMTSVA
       360        370
PPPSVLSQEP RLLISGMDQP LPLRTDLI
Length: 378
Mass (Da): 42,830
Last modified: November 25, 2008 - v2
Checksum: 656ED19097ACE4C8
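
Because all positional annotations in the entry refer to the canonical sequence, the residues named in the mutagenesis and modification features can be checked directly against it. The following is a minimal sketch of that cross-check; `CANONICAL`, `ANNOTATED` and `residue_at` are illustrative names, and the sequence is copied from the block above.

```python
# Canonical isoform (Q3TBT3-1); whitespace between the 10-residue groups is removed.
CANONICAL = "".join("""
MPYSNLHPAI PRPRGHRSKY VALIFLVASL MILWVAKDPP NHTLKYLALH
LASHELGLLL KNLCCLAEEL CHVQSRYQGS YWKAVRACLG CPIHCMAMIL
LSSYFYFLQN TADIYLSWMF GLLVLYKSLS MLLGLQSLTP AEVSAVCEEK
KLNVAHGLAW SYYIGYLRLI LPGLQARIRM FNQLHNNMLS GAGSRRLYIL
FPLDCGVPDN LSVVDPNIRF RDMLPQQNID RAGIKNRVYS NSVYEILENG
QPAGVCILEY ATPLQTLFAM SQDAKAGFSR EDRLEQAKLF CRTLEEILED
VPESRNNCRL IVYQEPTDGN SFSLSQEVLR HIRQEEKEEV TMNAPMTSVA
PPPSVLSQEP RLLISGMDQP LPLRTDLI
""".split())

# 1-based position -> residue named by the entry's annotations.
ANNOTATED = {
    150: "K",  # Lys-150 ubiquitination cross-link
    161: "S",  # mutagenesis S -> A
    229: "I",  # mutagenesis I -> A, G or T
    239: "Y",  # mutagenesis Y -> S
    241: "N",  # mutagenesis N -> A
    357: "S",  # Ser-357 phosphorylation
}


def residue_at(sequence, position):
    """Return the residue at a 1-based position."""
    return sequence[position - 1]


# Collect any annotation whose residue does not match the sequence.
mismatches = {
    pos: (expected, residue_at(CANONICAL, pos))
    for pos, expected in ANNOTATED.items()
    if residue_at(CANONICAL, pos) != expected
}

if __name__ == "__main__":
    print(len(CANONICAL), mismatches)
```

An empty `mismatches` dictionary means every annotated residue agrees with the sequence as transcribed here.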
Isoform 2 (identifier: Q3TBT3-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-1: M → MIVESFGASGNPVGPCHFWSLYGVLLGVHWSVLHLGTFRGIRSAGLWLLM

Note: No experimental confirmation available.
Length: 427
Mass (Da): 48,151
Checksum: CC362C808C035BE9

Isoform 3 (identifier: Q3TBT3-3)

The sequence of this isoform differs from the canonical sequence as follows:
     76-116: Missing.

Note: No experimental confirmation available.
Length: 337
Mass (Da): 38,036
Checksum: 9F25E302E7E0FCE8

Sequence caution

The sequence AAH27757 differs from that shown. Reason: Erroneous initiation. Translation N-terminally shortened. (Curated)
The sequence BAC37010 differs from that shown. Reason: Erroneous termination at position 203. Translated as Leu. (Curated)
The sequence BAE42563 differs from that shown. Reason: Frameshift at position 377. (Curated)

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  11           1       P → Q in BAE27042 (PubMed:16141072). (Curated)
Sequence conflict  39           1       P → S in BAB27972 (PubMed:16141072). (Curated)
Sequence conflict  98           1       M → V in BAE42563 (PubMed:16141072). (Curated)
Sequence conflict  111          1       T → N in BAC37010 (PubMed:16141072). (Curated)
Sequence conflict  210          1       N → D in BAE34068 (PubMed:16141072). (Curated)
Sequence conflict  210          1       N → D in BAE42310 (PubMed:16141072). (Curated)
Sequence conflict  210          1       N → D in BAE42224 (PubMed:16141072). (Curated)
Sequence conflict  210          1       N → D in BAE32222 (PubMed:16141072). (Curated)
Sequence conflict  210          1       N → D in BAE34517 (PubMed:16141072). (Curated)
Sequence conflict  315          1       E → K in BAC37010 (PubMed:16141072). (Curated)

Alternative sequence

Feature key           Identifier  Position(s)  Length  Description
Alternative sequence  VSP_022284  1            1       M → MIVESFGASGNPVGPCHFWSLYGVLLGVHWSVLHLGTFRGIRSAGLWLLM in isoform 2. (1 Publication)
Alternative sequence  VSP_022285  76 – 116     41      Missing in isoform 3. (1 Publication)
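
The two alternative-sequence features can be applied to the canonical sequence to derive isoforms 2 and 3 and to confirm the lengths the entry reports (427 and 337). This is a minimal sketch under the assumption that each feature is a simple replacement of a 1-based inclusive range; `replace_range` is an illustrative helper, not part of the entry.

```python
# Canonical isoform (Q3TBT3-1), copied from the entry with group spacing removed.
CANONICAL = "".join("""
MPYSNLHPAI PRPRGHRSKY VALIFLVASL MILWVAKDPP NHTLKYLALH
LASHELGLLL KNLCCLAEEL CHVQSRYQGS YWKAVRACLG CPIHCMAMIL
LSSYFYFLQN TADIYLSWMF GLLVLYKSLS MLLGLQSLTP AEVSAVCEEK
KLNVAHGLAW SYYIGYLRLI LPGLQARIRM FNQLHNNMLS GAGSRRLYIL
FPLDCGVPDN LSVVDPNIRF RDMLPQQNID RAGIKNRVYS NSVYEILENG
QPAGVCILEY ATPLQTLFAM SQDAKAGFSR EDRLEQAKLF CRTLEEILED
VPESRNNCRL IVYQEPTDGN SFSLSQEVLR HIRQEEKEEV TMNAPMTSVA
PPPSVLSQEP RLLISGMDQP LPLRTDLI
""".split())

# Replacement for residue 1 in isoform 2 (feature VSP_022284).
ISOFORM2_NTERM = "MIVESFGASGNPVGPCHFWSLYGVLLGVHWSVLHLGTFRGIRSAGLWLLM"


def replace_range(sequence, start, end, replacement=""):
    """Replace residues start..end (1-based, inclusive) with `replacement`.
    An empty replacement deletes the range."""
    return sequence[:start - 1] + replacement + sequence[end:]


# Isoform 2: residue 1 (M) is replaced by a 50-residue N-terminal extension.
isoform2 = replace_range(CANONICAL, 1, 1, ISOFORM2_NTERM)
# Isoform 3: residues 76-116 are missing (feature VSP_022285).
isoform3 = replace_range(CANONICAL, 76, 116)

if __name__ == "__main__":
    print(len(CANONICAL), len(isoform2), len(isoform3))
```

The arithmetic matches the entry: 378 - 1 + 50 = 427 for isoform 2 and 378 - 41 = 337 for isoform 3.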

Sequence databases

EMBL / GenBank / DDBJ:
FJ222242 mRNA. Translation: ACI46649.1.
DQ910493 mRNA. Translation: ABI78935.1.
AK012006 mRNA. Translation: BAB27972.1.
AK077788 mRNA. Translation: BAC37010.1. Sequence problems.
AK089405 mRNA. Translation: BAC40870.1.
AK146284 mRNA. Translation: BAE27042.1.
AK153868 mRNA. Translation: BAE32222.1.
AK157370 mRNA. Translation: BAE34068.1.
AK158458 mRNA. Translation: BAE34517.1.
AK170724 mRNA. Translation: BAE41981.1.
AK171065 mRNA. Translation: BAE42224.1.
AK171203 mRNA. Translation: BAE42310.1.
AK171612 mRNA. Translation: BAE42563.1. Frameshift.
CH466557 Genomic DNA. Translation: EDK97142.1.
BC027757 mRNA. Translation: AAH27757.1. Different initiation.
BC046640 mRNA. Translation: AAH46640.1.
CCDS: CCDS50253.1. [Q3TBT3-1]
RefSeq: NP_001276520.1. NM_001289591.1. [Q3TBT3-2]
NP_001276521.1. NM_001289592.1. [Q3TBT3-3]
NP_082537.1. NM_028261.1. [Q3TBT3-1]
UniGene: Mm.45995.

Genome annotation databases

Ensembl: ENSMUST00000115728; ENSMUSP00000111393; ENSMUSG00000024349. [Q3TBT3-1]
GeneID: 72512.
KEGG: mmu:72512.
UCSC: uc008emt.3. mouse. [Q3TBT3-1]
uc008emu.3. mouse. [Q3TBT3-3]
uc008emv.3. mouse. [Q3TBT3-2]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Protocols and materials databases

Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 340061.
MGI: MGI:1919762. Tmem173.

Miscellaneous databases

PRO: Q3TBT3.
SOURCE: Search...

Family and domain databases

ProtoNet: Search...

Entry information

Entry name: STING_MOUSE
Accession: Primary (citable) accession number: Q3TBT3
Secondary accession number(s): A7YGY9, Q3TAV5, Q3TYP5, Q3TZY8, Q3UJW3, Q8C227, Q8C5Q3, Q8K393, Q9CZY7
Entry history
Integrated into UniProtKB/Swiss-Prot: January 9, 2007
Last sequence update: November 25, 2008
Last modified: November 30, 2016
This is version 100 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Was named MPYS because the protein sequence begins with the residues Met-Pro-Tyr-Ser. (1 Publication)
Unlike human and rat TMEM173/STING, mouse TMEM173/STING not only mediates responses to cyclic nucleotide signaling molecules but is also strongly activated by antiviral and anticancer molecules, such as 5,6-dimethylxanthenone 4-acetic acid (DMXAA) and 10-carboxymethyl-9-acridanone (CMA). (1 Publication)

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.