Protein: Granzyme B
Gene: GZMB
Organism: Homo sapiens (Human)
Status: Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

This enzyme is necessary for target cell lysis in cell-mediated immune responses. It cleaves after Asp. It appears to be linked to an activation cascade of caspases (aspartate-specific cysteine proteases) responsible for apoptosis execution. It cleaves caspase-3, -7, -9 and -10 to give rise to active enzymes mediating apoptosis.

Catalytic activity

Preferential cleavage: -Asp-|-Xaa- >> -Asn-|-Xaa- > -Met-|-Xaa-, -Ser-|-Xaa-. (1 publication)
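
As a rough illustration of the P1 preference stated above, the following sketch scans a peptide for candidate scissile bonds and orders them by the residue in P1. The function name `candidate_cleavage_sites` and the integer ranks are assumptions made for this example; the entry gives only the ordering Asp >> Asn > Met, Ser, and real substrate recognition involves more than the P1 residue.

```python
# Rank of each P1 residue, following the stated preference
# Asp >> Asn > Met, Ser. The integer values are an assumption made for
# illustration; only their relative order comes from the entry.
P1_RANK = {"D": 0, "N": 1, "M": 2, "S": 2}


def candidate_cleavage_sites(peptide):
    """Return (p1_position, p1_residue, rank) for each candidate bond.

    Positions are 1-based. The last residue is skipped because no
    following residue (Xaa) exists to form a scissile bond with it.
    """
    sites = []
    for index, residue in enumerate(peptide[:-1]):
        residue = residue.upper()
        rank = P1_RANK.get(residue)
        if rank is not None:
            sites.append((index + 1, residue, rank))
    # Most preferred P1 residue first, then by position in the peptide.
    return sorted(sites, key=lambda site: (site[2], site[0]))


if __name__ == "__main__":
    # Arbitrary peptide, made up for the example.
    for position, residue, rank in candidate_cleavage_sites("AIETDSGMK"):
        print(position, residue, rank)
```

A simple scan like this over-predicts heavily, so it is best read as a way to enumerate positions worth checking rather than as a cleavage predictor.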

Enzyme regulation

Inactivated by the serine protease inhibitor diisopropylfluorophosphate. (1 publication)

Sites

Feature key  Position(s)  Length  Description
Active site  64 – 64      1       Charge relay system
Active site  108 – 108    1       Charge relay system
Active site  203 – 203    1       Charge relay system
Site         228 – 228    1       Mediates preference for Asp-containing substrates (by similarity)
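
The annotated positions can be checked directly against the precursor sequence given in the Sequence section of this entry. The sketch below looks up the residue at each 1-based position; the names `GZMB`, `SITES` and `residues_at` are illustrative, not part of any UniProt tooling.

```python
# Precursor sequence as displayed in this entry (247 residues).
GZMB = (
    "MQPILLLLAFLLLPRADAGEIIGGHEAKPHSRPYMAYLMIWDQKSLKRCG"
    "GFLIRDDFVLTAAHCWGSSINVTLGAHNIKEQEPTQQFIPVKRPIPHPAY"
    "NPKNFSNDIMLLQLERKAKRTRAVQPLRLPSNKAQVKPGQTCSVAGWGQT"
    "APLGKHSHTLQEVKMTVQEDRKCESDLRHYYDSTIELCVGDPEIKKTSFK"
    "GDSGGPLVCNKVAQGIVSYGRNNGMPPRACTKVSSFVHWIKKTMKRY"
)

# 1-based positions taken from the Sites table.
SITES = {
    64: "active site (charge relay system)",
    108: "active site (charge relay system)",
    203: "active site (charge relay system)",
    228: "site (Asp preference)",
}


def residues_at(sequence, positions):
    """Map each 1-based position to the residue found there."""
    return {position: sequence[position - 1] for position in positions}


if __name__ == "__main__":
    for position, residue in residues_at(GZMB, SITES).items():
        print(position, residue, SITES[position])
```

The three active-site positions hold His, Asp and Ser, the arrangement expected for a serine protease charge relay system, and position 228 holds Arg.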

GO - Molecular function

  • serine-type endopeptidase activity Source: UniProtKB
  • serine-type peptidase activity Source: ProtInc

GO - Biological process

Complete GO annotation...

Keywords - Molecular function

Hydrolase, Protease, Serine protease

Keywords - Biological process

Apoptosis, Cytolysis

Enzyme and pathway databases

Reactome: REACT_163910. NOTCH2 intracellular domain regulates transcription.
Reactome: REACT_701. Activation, myristolyation of BID and translocation to mitochondria.

Protein family/group databases

MEROPS: S01.010.

Names & Taxonomy

Protein names
Recommended name: Granzyme B (EC:3.4.21.79)
Alternative name(s):
  • C11
  • CTLA-1
  • Cathepsin G-like 1 (short name: CTSGL1)
  • Cytotoxic T-lymphocyte proteinase 2 (short name: Lymphocyte protease)
  • Fragmentin-2
  • Granzyme-2
  • Human lymphocyte protein (short name: HLP)
  • SECT
  • T-cell serine protease 1-3E
Gene names
Name: GZMB
Synonyms: CGL1, CSPB, CTLA1, GRB
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes: UP000005640. Component: Chromosome 14.

Organism-specific databases

HGNC: HGNC:4709. GZMB.

Subcellular location

  • Cytoplasmic granule (1 publication)

  • Note: Cytoplasmic granules of cytolytic T-lymphocytes and natural killer cells.

GO - Cellular component

  • cytoplasm Source: HPA
  • cytosol Source: Reactome
  • immunological synapse Source: UniProtKB
  • intracellular membrane-bounded organelle Source: HPA
  • membrane Source: UniProtKB
  • nucleus Source: UniProtKB
Complete GO annotation...

Pathology & Biotech

Polymorphism and mutation databases

BioMuta: GZMB.
DMDM: 317373361.

PTM / Processing

Molecule processing

Feature key     Position(s)  Length  Description                          Feature identifier
Signal peptide  1 – 18       18      (1 publication)
Propeptide      19 – 20      2       Activation peptide (4 publications)  PRO_0000027399
Chain           21 – 247     227     Granzyme B                           PRO_0000027400
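
The processing table translates into simple slices of the precursor sequence. The sketch below cuts the displayed sequence into the three annotated regions using the 1-based, inclusive coordinates from the table; `REGIONS` and `split_precursor` are names chosen for this example.

```python
# Precursor sequence as displayed in this entry (247 residues).
GZMB = (
    "MQPILLLLAFLLLPRADAGEIIGGHEAKPHSRPYMAYLMIWDQKSLKRCG"
    "GFLIRDDFVLTAAHCWGSSINVTLGAHNIKEQEPTQQFIPVKRPIPHPAY"
    "NPKNFSNDIMLLQLERKAKRTRAVQPLRLPSNKAQVKPGQTCSVAGWGQT"
    "APLGKHSHTLQEVKMTVQEDRKCESDLRHYYDSTIELCVGDPEIKKTSFK"
    "GDSGGPLVCNKVAQGIVSYGRNNGMPPRACTKVSSFVHWIKKTMKRY"
)

# 1-based, inclusive coordinates from the Molecule processing table.
REGIONS = {
    "signal peptide": (1, 18),
    "propeptide": (19, 20),
    "chain": (21, 247),
}


def split_precursor(sequence, regions):
    """Return the subsequence covered by each annotated region."""
    return {
        name: sequence[start - 1:end]
        for name, (start, end) in regions.items()
    }


if __name__ == "__main__":
    for name, fragment in split_precursor(GZMB, REGIONS).items():
        print(f"{name}: {len(fragment)} residues, starts {fragment[:4]}")
```

The fragment lengths come out as 18, 2 and 227, matching the Length column, and the two-residue activation peptide is Gly-Glu.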

Amino acid modifications

Feature key     Position(s)  Length  Description
Disulfide bond  49 ↔ 65
Glycosylation   71 – 71      1       N-linked (GlcNAc...)
Glycosylation   104 – 104    1       N-linked (GlcNAc...) (sequence analysis)
Disulfide bond  142 ↔ 209
Disulfide bond  173 ↔ 188
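
These annotations can be cross-checked against the sequence. The sketch below confirms that both ends of each disulfide bond are cysteines and searches for N-glycosylation sequons. The N-X-S/T pattern (X not Pro) is the general consensus rule for N-linked glycosylation, not something stated in this entry, and the helper names are illustrative.

```python
import re

# Precursor sequence as displayed in this entry (247 residues).
GZMB = (
    "MQPILLLLAFLLLPRADAGEIIGGHEAKPHSRPYMAYLMIWDQKSLKRCG"
    "GFLIRDDFVLTAAHCWGSSINVTLGAHNIKEQEPTQQFIPVKRPIPHPAY"
    "NPKNFSNDIMLLQLERKAKRTRAVQPLRLPSNKAQVKPGQTCSVAGWGQT"
    "APLGKHSHTLQEVKMTVQEDRKCESDLRHYYDSTIELCVGDPEIKKTSFK"
    "GDSGGPLVCNKVAQGIVSYGRNNGMPPRACTKVSSFVHWIKKTMKRY"
)

# 1-based residue pairs from the Amino acid modifications table.
DISULFIDE_BONDS = [(49, 65), (142, 209), (173, 188)]


def bonds_join_cysteines(sequence, bonds):
    """True if both residues of every listed bond are cysteines."""
    return all(
        sequence[first - 1] == "C" and sequence[second - 1] == "C"
        for first, second in bonds
    )


def n_glycosylation_sequons(sequence):
    """Return 1-based positions of Asn in an N-X-S/T motif (X not Pro).

    The lookahead keeps the match one residue wide, so overlapping
    motifs are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", sequence)]


if __name__ == "__main__":
    print(bonds_join_cysteines(GZMB, DISULFIDE_BONDS))
    print(n_glycosylation_sequons(GZMB))
```

On this sequence the sequon search returns positions 71 and 104, the same two asparagines listed in the table.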

Keywords - PTM

Disulfide bond, Glycoprotein, Zymogen

Proteomic databases

PaxDb: P10144.
PRIDE: P10144.

PTM databases

PhosphoSite: P10144.

Miscellaneous databases

PMAP-CutDB: P10144.

Expression

Induction

By staphylococcal enterotoxin A (SEA) in peripheral blood leukocytes.

Gene expression databases

Bgee: P10144.
CleanEx: HS_GZMB.
ExpressionAtlas: P10144. baseline and differential.
Genevisible: P10144. HS.

Organism-specific databases

HPA: CAB000376.
HPA: HPA003418.

Interaction

Binary interactions

With  Entry   #Exp.  IntAct
PRF1  P14222  3      EBI-2505785, EBI-724466
SRGN  P10124  2      EBI-2505785, EBI-744915

Protein-protein interaction databases

BioGrid: 109257. 30 interactions.
IntAct: P10144. 6 interactions.
MINT: MINT-4528791.
STRING: 9606.ENSP00000216341.

Structure

Secondary structure

Feature key  Position(s)  Length  Description
Beta strand  35 – 41      7       Combined sources
Beta strand  46 – 55      10      Combined sources
Beta strand  58 – 61      4       Combined sources
Helix        63 – 65      3       Combined sources
Beta strand  68 – 75      8       Combined sources
Turn         79 – 82      4       Combined sources
Beta strand  87 – 96      10      Combined sources
Turn         102 – 104    3       Combined sources
Beta strand  110 – 116    7       Combined sources
Beta strand  141 – 148    8       Combined sources
Beta strand  150 – 154    5       Combined sources
Beta strand  161 – 167    7       Combined sources
Helix        170 – 176    7       Combined sources
Turn         177 – 180    4       Combined sources
Turn         183 – 185    3       Combined sources
Beta strand  186 – 190    5       Combined sources
Beta strand  205 – 209    5       Combined sources
Beta strand  212 – 220    9       Combined sources
Beta strand  228 – 232    5       Combined sources
Helix        233 – 236    4       Combined sources
Helix        237 – 245    9       Combined sources

3D structure databases

Entry  Method  Resolution (Å)  Chain  Positions
1FQ3   X-ray   3.10            A/B    21-247
1IAU   X-ray   2.00            A      21-247

ProteinModelPortal: P10144.
SMR: P10144. Positions 21-246.
ModBase: Search...
MobiDB: Search...

Miscellaneous databases

EvolutionaryTrace: P10144.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Description
Domain       21 – 245     225     Peptidase S1 (PROSITE-ProRule annotation)

Sequence similarities

Belongs to the peptidase S1 family. Granzyme subfamily. (PROSITE-ProRule annotation)
Contains 1 peptidase S1 domain. (PROSITE-ProRule annotation)

Keywords - Domain

Signal

Phylogenomic databases

eggNOG: COG5640.
GeneTree: ENSGT00760000118895.
HOGENOM: HOG000251820.
HOVERGEN: HBG013304.
InParanoid: P10144.
KO: K01353.
PhylomeDB: P10144.
TreeFam: TF333630.

Family and domain databases

InterPro: IPR001254. Peptidase_S1.
InterPro: IPR018114. Peptidase_S1_AS.
InterPro: IPR001314. Peptidase_S1A.
InterPro: IPR009003. Trypsin-like_Pept_dom.
Pfam: PF00089. Trypsin. 1 hit.
PRINTS: PR00722. CHYMOTRYPSIN.
SMART: SM00020. Tryp_SPc. 1 hit.
SUPFAM: SSF50494. SSF50494. 1 hit.
PROSITE: PS50240. TRYPSIN_DOM. 1 hit.
PROSITE: PS00134. TRYPSIN_HIS. 1 hit.
PROSITE: PS00135. TRYPSIN_SER. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

P10144-1 [UniParc]

        10         20         30         40         50
MQPILLLLAF LLLPRADAGE IIGGHEAKPH SRPYMAYLMI WDQKSLKRCG
        60         70         80         90        100
GFLIRDDFVL TAAHCWGSSI NVTLGAHNIK EQEPTQQFIP VKRPIPHPAY
       110        120        130        140        150
NPKNFSNDIM LLQLERKAKR TRAVQPLRLP SNKAQVKPGQ TCSVAGWGQT
       160        170        180        190        200
APLGKHSHTL QEVKMTVQED RKCESDLRHY YDSTIELCVG DPEIKKTSFK
       210        220        230        240
GDSGGPLVCN KVAQGIVSYG RNNGMPPRAC TKVSSFVHWI KKTMKRY
Length: 247
Mass (Da): 27,716
Last modified: January 11, 2011 - v2
Checksum: C652271918EF24F9
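
The grouped display above, with ruler lines and blocks of ten residues, has to be flattened before the sequence can be used programmatically. A minimal sketch, assuming the ruler lines contain only digits and spaces; `parse_displayed_sequence` is a name chosen for this example.

```python
# The sequence block as displayed in this entry, ruler lines included.
DISPLAYED = """
        10         20         30         40         50
MQPILLLLAF LLLPRADAGE IIGGHEAKPH SRPYMAYLMI WDQKSLKRCG
        60         70         80         90        100
GFLIRDDFVL TAAHCWGSSI NVTLGAHNIK EQEPTQQFIP VKRPIPHPAY
       110        120        130        140        150
NPKNFSNDIM LLQLERKAKR TRAVQPLRLP SNKAQVKPGQ TCSVAGWGQT
       160        170        180        190        200
APLGKHSHTL QEVKMTVQED RKCESDLRHY YDSTIELCVG DPEIKKTSFK
       210        220        230        240
GDSGGPLVCN KVAQGIVSYG RNNGMPPRAC TKVSSFVHWI KKTMKRY
"""


def parse_displayed_sequence(text):
    """Keep only letters, dropping ruler digits, spaces and newlines."""
    return "".join(ch for ch in text if ch.isalpha()).upper()


if __name__ == "__main__":
    sequence = parse_displayed_sequence(DISPLAYED)
    # The stated length of 247 serves as a sanity check on the parse.
    print(len(sequence), sequence[:10], sequence[-7:])
```

Comparing the parsed length with the stated Length value is a cheap way to catch a truncated or mis-copied block.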

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  32 – 33      2       RP → PR in AA sequence (PubMed:8258716). Curated.
Sequence conflict  72 – 72      1       V → G in AAA52118 (PubMed:3261871). Curated.
Sequence conflict  212 – 212    1       V → C in AAB59528 (PubMed:2788607). Curated.

Natural variant

Feature key      Position(s)  Length  Description             dbSNP       Feature identifier
Natural variant  55 – 55      1       R → Q (7 publications)  rs8192917   VAR_018371
Natural variant  94 – 94      1       P → A (2 publications)  rs11539752  VAR_047409
Natural variant  247 – 247    1       Y → H (2 publications)  rs2236338   VAR_018381
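
Each natural variant is a single-residue substitution, so it can be applied to the displayed sequence with a check that the reference residue matches. The sketch below is illustrative; `VARIANTS` and `apply_variant` are hypothetical names, and positions are 1-based as in the table.

```python
# Precursor sequence as displayed in this entry (247 residues).
GZMB = (
    "MQPILLLLAFLLLPRADAGEIIGGHEAKPHSRPYMAYLMIWDQKSLKRCG"
    "GFLIRDDFVLTAAHCWGSSINVTLGAHNIKEQEPTQQFIPVKRPIPHPAY"
    "NPKNFSNDIMLLQLERKAKRTRAVQPLRLPSNKAQVKPGQTCSVAGWGQT"
    "APLGKHSHTLQEVKMTVQEDRKCESDLRHYYDSTIELCVGDPEIKKTSFK"
    "GDSGGPLVCNKVAQGIVSYGRNNGMPPRACTKVSSFVHWIKKTMKRY"
)

# (position, reference residue, variant residue) from the table.
VARIANTS = [(55, "R", "Q"), (94, "P", "A"), (247, "Y", "H")]


def apply_variant(sequence, position, reference, variant):
    """Return a copy of the sequence carrying one substitution.

    Raises ValueError if the residue at the 1-based position is not
    the expected reference residue.
    """
    found = sequence[position - 1]
    if found != reference:
        raise ValueError(
            f"expected {reference} at {position}, found {found}"
        )
    return sequence[:position - 1] + variant + sequence[position:]


if __name__ == "__main__":
    for position, reference, variant in VARIANTS:
        mutated = apply_variant(GZMB, position, reference, variant)
        print(f"{reference}{position}{variant}", mutated[position - 1])
```

Checking the reference residue first guards against off-by-one errors when coordinates are copied between the precursor and the mature chain.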

Sequence databases

EMBL / GenBank / DDBJ:
M17016 mRNA. Translation: AAA36627.1.
J03189 mRNA. Translation: AAA36603.1.
J04071 mRNA. Translation: AAA52118.1.
J03072 Genomic DNA. Translation: AAB59528.1.
M38193 Genomic DNA. Translation: AAA67124.1.
M28879 Genomic DNA. Translation: AAA75490.1.
AL136018 Genomic DNA. No translation available.
BC030195 mRNA. Translation: AAH30195.1.
CCDS: CCDS9633.1.
PIR: A61021.
RefSeq: NP_004122.2. NM_004131.4.
UniGene: Hs.1051.

Genome annotation databases

Ensembl: ENST00000216341; ENSP00000216341; ENSG00000100453.
GeneID: 3002.
KEGG: hsa:3002.
UCSC: uc001wps.2. human.

Keywords - Coding sequence diversity

Polymorphism

Cross-references

Chemistry

BindingDB: P10144.
ChEMBL: CHEMBL2316.

Protocols and materials databases

DNASU: 3002.
Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 3002.
GeneCards: GC14M025100.
H-InvDB: HIX0011578.
HGNC: HGNC:4709. GZMB.
HPA: CAB000376.
HPA: HPA003418.
MIM: 123910. gene.
neXtProt: NX_P10144.
GenAtlas: Search...

Miscellaneous databases

ChiTaRS: GZMB. human.
EvolutionaryTrace: P10144.
GeneWiki: GZMB.
GenomeRNAi: 3002.
NextBio: 11904.
PMAP-CutDB: P10144.
PRO: P10144.
SOURCE: Search...

Family and domain databases

ProtoNet: Search...

Publications

  1. "Induction of mRNA for a serine protease and a beta-thromboglobulin-like protein in mitogen-stimulated human leukocytes."
    Schmid J., Weissmann C.
    J. Immunol. 139:250-256(1987)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], VARIANT GLN-55.
  2. "Structure and differential mechanisms of regulation of expression of a serine esterase gene in activated human T lymphocytes."
    Caputo A., Fahey D., Lloyd C., Vozab R., McCairns E., Rowe P.B.
    J. Biol. Chem. 263:6363-6369(1988)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], VARIANT ALA-94.
  3. "Molecular cloning of an inducible serine esterase gene from human cytotoxic lymphocytes."
    Trapani J.A., Klein J.L., White P.C., Dupont B.
    Proc. Natl. Acad. Sci. U.S.A. 85:6924-6928(1988)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], VARIANT GLN-55.
  4. "Genomic organization and chromosomal assignment for a serine protease gene (CSPB) expressed by human cytotoxic lymphocytes."
    Klein J.L., Shows T.B., Dupont B., Trapani J.A.
    Genomics 5:110-117(1989)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA], VARIANT GLN-55.
  5. "Nucleotide sequence and genomic organization of a human T lymphocyte serine protease gene."
    Caputo A., Sauer D.E., Rowe P.B.
    J. Immunol. 145:737-744(1990)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA], VARIANT GLN-55.
  6. "Structural organization of the hCTLA-1 gene encoding human granzyme B."
    Haddad P., Clement M.-V., Bernard O., Larsen C.-J., Degos L., Sasportes M., Mathieu-Mahul D.
    Gene 87:265-271(1990)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA], VARIANTS GLN-55 AND ALA-94.
  7. "The DNA sequence and analysis of human chromosome 14."
    Heilig R., Eckenberg R., Petit J.-L., Fonknechten N., Da Silva C., Cattolico L., Levy M., Barbe V., De Berardinis V., Ureta-Vidal A., Pelletier E., Vico V., Anthouard V., Rowen L., Madan A., Qin S., Sun H., Du H., Pepin K., Artiguenave F., Robert C., Cruaud C., Bruels T., Jaillon O., Friedlander L., Samson G., Brottier P., Cure S., Segurens B., Aniere F., Samain S., Crespeau H., Abbasi N., Aiach N., Boscus D., Dickhoff R., Dors M., Dubois I., Friedman C., Gouyvenoux M., James R., Madan A., Mairey-Estrada B., Mangenot S., Martins N., Menard M., Oztas S., Ratcliffe A., Shaffer T., Trask B., Vacherie B., Bellemere C., Belser C., Besnard-Gonnet M., Bartol-Mavel D., Boutard M., Briez-Silla S., Combette S., Dufosse-Laurent V., Ferron C., Lechaplais C., Louesse C., Muselet D., Magdelenat G., Pateau E., Petit E., Sirvain-Trukniewicz P., Trybou A., Vega-Czarny N., Bataille E., Bluet E., Bordelais I., Dubois M., Dumont C., Guerin T., Haffray S., Hammadi R., Muanga J., Pellouin V., Robert D., Wunderle E., Gauguet G., Roy A., Sainte-Marthe L., Verdier J., Verdier-Discala C., Hillier L.W., Fulton L., McPherson J., Matsuda F., Wilson R., Scarpelli C., Gyapay G., Wincker P., Saurin W., Quetier F., Waterston R., Hood L., Weissenbach J.
    Nature 421:601-607(2003)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  8. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA], VARIANTS GLN-55 AND HIS-247.
    Tissue: Pancreas.
  9. "Isolation of a cDNA clone encoding a novel form of granzyme B from human NK cells and mapping to chromosome 14."
    Dahl C.A., Bach F.H., Chan W., Huebner K., Russo G., Croce C.M., Herfurth T., Cairns J.S.
    Hum. Genet. 84:465-470(1990)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 1-23.
  10. "Characterization of three serine esterases isolated from human IL-2 activated killer cells."
    Hameed A., Lowrey D.M., Lichtenheld M., Podack E.R.
    J. Immunol. 141:3142-3147(1988)
    Cited for: PROTEIN SEQUENCE OF 21-40, CHARACTERIZATION.
  11. "Characterization of granzymes A and B isolated from granules of cloned human cytotoxic T lymphocytes."
    Kraehenbuhl O., Rey C., Jenne D.E., Lanzavecchia A., Groscurth P., Carrel S., Tschopp J.
    J. Immunol. 141:3471-3477(1988)
    Cited for: PROTEIN SEQUENCE OF 21-40, CHARACTERIZATION.
  12. "Human granzyme B degrades aggrecan proteoglycan in matrix synthesized by chondrocytes."
    Froelich C.J., Zhang X., Turbov J., Hudig D., Winkler U., Hanna W.L.
    J. Immunol. 151:7161-7171(1993)
    Cited for: PROTEIN SEQUENCE OF 21-39, CATALYTIC ACTIVITY, ENZYME REGULATION, SUBCELLULAR LOCATION.
    Tissue: Lymphocyte.
  13. "Human cytotoxic lymphocyte granzyme B. Its purification from granules and the characterization of substrate and inhibitor specificity."
    Poe M., Blake J.T., Boulton D.A., Gammon M., Sigal N.H., Wu J.K., Zweerink H.J.
    J. Biol. Chem. 266:98-103(1991)
    Cited for: PROTEIN SEQUENCE OF 21-38.
  14. "Signal peptide prediction based on analysis of experimentally verified cleavage sites."
    Zhang Z., Henzel W.J.
    Protein Sci. 13:2819-2824(2004)
    Cited for: PROTEIN SEQUENCE OF 19-33.
  15. "Crystal structure of the caspase activator human granzyme B, a proteinase highly specific for an Asp-P1 residue."
    Estebanez-Perpina E., Fuentes-Prior P., Belorgey D., Braun M., Kiefersauer R., Maskos K., Huber R., Rubin H., Bode W.
    Biol. Chem. 381:1203-1214(2000)
    Cited for: X-RAY CRYSTALLOGRAPHY (3.1 ANGSTROMS) OF 21-247.
  16. "The three-dimensional structure of human granzyme B compared to caspase-3, key mediators of cell death with cleavage specificity for aspartic acid in P1."
    Rotonda J., Garcia-Calvo M., Bull H.G., Geissler W.M., McKeever B.M., Willoughby C.A., Thornberry N.A., Becker J.W.
    Chem. Biol. 8:357-368(2001)
    Cited for: X-RAY CRYSTALLOGRAPHY (2.0 ANGSTROMS) OF 21-247.
  17. "Catalog of 680 variations among eight cytochrome p450 (CYP) genes, nine esterase genes, and two other genes in the Japanese population."
    Saito S., Iida A., Sekine A., Kawauchi S., Higuchi S., Ogawa C., Nakamura Y.
    J. Hum. Genet. 48:249-270(2003)
    Cited for: VARIANTS GLN-55 AND HIS-247.

Entry information

Entry name: GRAB_HUMAN
Accession: Primary (citable) accession number: P10144
Secondary accession number(s): Q8N1D2, Q9UCC1
Entry history
Integrated into UniProtKB/Swiss-Prot: July 1, 1989
Last sequence update: January 11, 2011
Last modified: June 24, 2015
This is version 174 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Human chromosome 14
    Human chromosome 14: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. Peptidase families
    Classification of peptidase families and list of entries
  7. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
  • 100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
  • 90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
  • 50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.