Protein

Hepatocyte growth factor

Gene

Hgf

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Potent mitogen for mature parenchymal hepatocyte cells, seems to be a hepatotrophic factor, and acts as a growth factor for a broad spectrum of tissues and cell types. Activating ligand for the receptor tyrosine kinase MET by binding to it and promoting its dimerization. (1 publication)

Keywords - Molecular function

Growth factor, Serine protease homolog

Enzyme and pathway databases

Reactome: REACT_276590. Interleukin-7 signaling.
          REACT_307071. Platelet degranulation.

Protein family/group databases

MEROPS: S01.982.

Names & Taxonomy

Protein names
Recommended name: Hepatocyte growth factor
Alternative name(s):
Hepatopoietin-A
Scatter factor (Short name: SF)
Cleaved into the following 2 chains: Hepatocyte growth factor alpha chain, Hepatocyte growth factor beta chain
Gene names
Name: Hgf
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes: UP000000589. Component: Chromosome 5

Organism-specific databases

MGI: MGI:96079. Hgf.

PTM / Processing

Molecule processing

Feature key      Position(s)  Length  Description                            Feature identifier
Signal peptide   1 – 32       32      (By similarity)
Chain            33 – 495     463     Hepatocyte growth factor alpha chain   PRO_0000028093
Chain            496 – 728    233     Hepatocyte growth factor beta chain    PRO_0000028094
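
The positions in this table are 1-based and inclusive, so a feature's length is its end position minus its start position plus one. The following is a minimal Python sketch of that arithmetic, and of how such a range maps onto a 0-based string slice. The dictionary and helper names are illustrative assumptions and are not part of any UniProt tool.

```python
# Feature coordinates copied from the "Molecule processing" table
# (1-based, inclusive). FEATURES and the helper names are hypothetical.
FEATURES = {
    "signal peptide": (1, 32),
    "alpha chain": (33, 495),
    "beta chain": (496, 728),
}


def feature_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive range."""
    return end - start + 1


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive range out of a (0-based) Python string."""
    return sequence[start - 1:end]


lengths = {name: feature_length(*pos) for name, pos in FEATURES.items()}
print(lengths)
```

The three lengths add up to 728, the length of the canonical sequence, which is a quick consistency check that the features tile the precursor without gaps or overlaps.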

Amino acid modifications

Feature key        Position(s)  Length  Description
Modified residue   33 – 33      1       Pyrrolidone carboxylic acid (By similarity)
Disulfide bond     71 ↔ 97
Disulfide bond     75 ↔ 85
Disulfide bond     129 ↔ 207
Disulfide bond     150 ↔ 190
Disulfide bond     178 ↔ 202
Disulfide bond     212 ↔ 289            (By similarity)
Disulfide bond     233 ↔ 272            (By similarity)
Disulfide bond     261 ↔ 284            (By similarity)
Glycosylation      295 – 295    1       N-linked (GlcNAc...) (Sequence Analysis)
Disulfide bond     306 ↔ 384            (By similarity)
Disulfide bond     327 ↔ 366            (By similarity)
Disulfide bond     355 ↔ 378            (By similarity)
Disulfide bond     392 ↔ 470            (By similarity)
Glycosylation      403 – 403    1       N-linked (GlcNAc...) (Sequence Analysis)
Disulfide bond     413 ↔ 453            (By similarity)
Disulfide bond     441 ↔ 465            (By similarity)
Disulfide bond     488 ↔ 607            Interchain (between alpha and beta chains) (PROSITE-ProRule annotation)
Disulfide bond     520 ↔ 536            (By similarity)
Glycosylation      569 – 569    1       N-linked (GlcNAc...) (Sequence Analysis)
Disulfide bond     615 ↔ 682            (By similarity)
Disulfide bond     645 ↔ 661            (By similarity)
Glycosylation      656 – 656    1       N-linked (GlcNAc...) (Sequence Analysis)
Disulfide bond     672 ↔ 700            (By similarity)
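
The N-linked glycosylation sites are annotated by sequence analysis, that is, they are predicted rather than observed. The entry does not say which method was used. As an illustration only, the sketch below scans for the commonly used N-X-[S/T] sequon (X is any residue except proline) in short windows taken from the canonical sequence of this entry. The function name and the choice of windows are assumptions made for the example.

```python
import re

# N-X-[S/T] with X != P. The lookahead means overlapping sequons are reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(fragment: str, first_position: int) -> list[int]:
    """Return 1-based entry positions of sequon asparagines.

    first_position is the entry position of fragment[0].
    """
    return [first_position + m.start() for m in SEQUON.finditer(fragment)]


# Ten-residue windows copied from the canonical sequence (isoform Long),
# keyed by the position of their first residue.
windows = {
    291: "HSAVNETDVP",
    401: "MGNLSKTRSG",
    651: "GKVTLNESEL",
}
sites = [pos for start, frag in windows.items() for pos in find_sequons(frag, start)]
print(sites)  # -> [295, 403, 656]
```

A sequon match is necessary but not sufficient for glycosylation, which is consistent with these sites carrying a predicted rather than experimental evidence tag.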

Keywords - PTM

Disulfide bond, Glycoprotein, Pyrrolidone carboxylic acid

Proteomic databases

MaxQB: Q08048.
PRIDE: Q08048.

PTM databases

PhosphoSite: Q08048.

Expression

Gene expression databases

Bgee: Q08048.
CleanEx: MM_HGF.
Genevestigator: Q08048.

Interaction

Subunit structure

Dimer of an alpha chain and a beta chain linked by a disulfide bond. Interacts with SRPX2; the interaction increases HGF mitogenic activity.

Protein-protein interaction databases

DIP: DIP-46456N.

Structure

Secondary structure

Feature key   Position(s)  Length  Description
Helix         40 – 42      3       Combined sources
Beta strand   43 – 52      10      Combined sources
Beta strand   55 – 57      3       Combined sources
Beta strand   61 – 64      4       Combined sources
Helix         68 – 77      10      Combined sources
Turn          78 – 80      3       Combined sources
Beta strand   87 – 91      5       Combined sources
Turn          92 – 95      4       Combined sources
Beta strand   96 – 101     6       Combined sources
Beta strand   106 – 108    3       Combined sources
Beta strand   110 – 122    13      Combined sources
Helix         123 – 125    3       Combined sources
Beta strand   129 – 133    5       Combined sources
Beta strand   149 – 151    3       Combined sources
Beta strand   157 – 159    3       Combined sources
Turn          165 – 167    3       Combined sources
Beta strand   188 – 194    7       Combined sources
Beta strand   198 – 201    4       Combined sources
Turn          207 – 209    3       Combined sources
Beta strand   240 – 242    3       Combined sources
Helix         248 – 250    3       Combined sources
Turn          252 – 255    4       Combined sources
Beta strand   271 – 275    5       Combined sources
Beta strand   280 – 283    4       Combined sources

3D structure databases

Entry  Method  Resolution (Å)  Chain            Positions
2QJ4   X-ray   2.50            A/B              29-210
3HMR   X-ray   2.00            A                31-127
4IUA   X-ray   3.05            A/B/C/D/E/F/G/H  31-290
ProteinModelPortal: Q08048.
SMR: Q08048. Positions 37-725.

Miscellaneous databases

EvolutionaryTrace: Q08048.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Description
Domain       38 – 124     87      PAN (PROSITE-ProRule annotation)
Domain       129 – 207    79      Kringle 1 (PROSITE-ProRule annotation)
Domain       212 – 289    78      Kringle 2 (PROSITE-ProRule annotation)
Domain       306 – 384    79      Kringle 3 (PROSITE-ProRule annotation)
Domain       392 – 470    79      Kringle 4 (PROSITE-ProRule annotation)
Domain       496 – 724    229     Peptidase S1 (PROSITE-ProRule annotation)
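
The domain boundaries listed here can be cross-checked against the disulfide bonds annotated under "Amino acid modifications": a bond belongs to a domain when both cysteines fall inside its range. The following Python sketch does that interval test with coordinates copied from this entry. The function and variable names are illustrative assumptions.

```python
# Domain ranges and disulfide pairs copied from this entry
# (1-based, inclusive). All names below are hypothetical.
DOMAINS = {
    "PAN": (38, 124),
    "Kringle 1": (129, 207),
    "Kringle 2": (212, 289),
    "Kringle 3": (306, 384),
    "Kringle 4": (392, 470),
    "Peptidase S1": (496, 724),
}

DISULFIDES = [
    (71, 97), (75, 85), (129, 207), (150, 190), (178, 202),
    (212, 289), (233, 272), (261, 284), (306, 384), (327, 366),
    (355, 378), (392, 470), (413, 453), (441, 465), (488, 607),
    (520, 536), (615, 682), (645, 661), (672, 700),
]


def assign_bonds(domains, bonds):
    """Group bonds by the domain that contains both cysteines."""
    assigned = {name: [] for name in domains}
    unassigned = []
    for a, b in bonds:
        for name, (start, end) in domains.items():
            if start <= a and b <= end:
                assigned[name].append((a, b))
                break
        else:
            # No single domain contains both ends of this bond.
            unassigned.append((a, b))
    return assigned, unassigned


assigned, unassigned = assign_bonds(DOMAINS, DISULFIDES)
for name, bonds in assigned.items():
    print(name, len(bonds))
print("outside a single domain:", unassigned)
```

With these coordinates each kringle domain holds three bonds, and the only bond not contained in a single domain is 488 ↔ 607, the one annotated as interchain between the alpha and beta chains.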

Sequence similarities

Belongs to the peptidase S1 family. Plasminogen subfamily. (PROSITE-ProRule annotation)
Contains 4 kringle domains. (PROSITE-ProRule annotation)
Contains 1 PAN domain. (PROSITE-ProRule annotation)
Contains 1 peptidase S1 domain. (PROSITE-ProRule annotation)

Keywords - Domain

Kringle, Repeat, Signal

Phylogenomic databases

eggNOG: COG5640.
GeneTree: ENSGT00760000119133.
HOGENOM: HOG000112892.
HOVERGEN: HBG004381.
InParanoid: Q08048.
KO: K05460.
OMA: GSESPWC.
OrthoDB: EOG75B84T.
PhylomeDB: Q08048.
TreeFam: TF329901.

Family and domain databases

InterPro: IPR027284. Hepatocyte_GF.
          IPR024174. HGF-like.
          IPR000001. Kringle.
          IPR013806. Kringle-like.
          IPR018056. Kringle_CS.
          IPR003014. PAN-1_domain.
          IPR003609. Pan_app.
          IPR001254. Peptidase_S1.
          IPR001314. Peptidase_S1A.
          IPR009003. Trypsin-like_Pept_dom.
Pfam: PF00051. Kringle. 4 hits.
      PF00024. PAN_1. 1 hit.
      PF00089. Trypsin. 1 hit.
PIRSF: PIRSF500183. Hepatocyte_GF. 1 hit.
       PIRSF001152. HGF_MST1. 1 hit.
PRINTS: PR00722. CHYMOTRYPSIN.
SMART: SM00130. KR. 4 hits.
       SM00473. PAN_AP. 1 hit.
       SM00020. Tryp_SPc. 1 hit.
SUPFAM: SSF50494. SSF50494. 1 hit.
        SSF57440. SSF57440. 4 hits.
PROSITE: PS00021. KRINGLE_1. 4 hits.
         PS50070. KRINGLE_2. 4 hits.
         PS50948. PAN. 1 hit.
         PS50240. TRYPSIN_DOM. 1 hit.

Sequences (3)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 3 isoforms produced by alternative splicing.

Isoform Long (identifier: Q08048-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.


        10         20         30         40         50
MMWGTKLLPV LLLQHVLLHL LLLHVAIPYA EGQKKRRNTL HEFKKSAKTT
        60         70         80         90        100
LTKEDPLLKI KTKKVNSADE CANRCIRNRG FTFTCKAFVF DKSRKRCYWY
       110        120        130        140        150
PFNSMSSGVK KGFGHEFDLY ENKDYIRNCI IGKGGSYKGT VSITKSGIKC
       160        170        180        190        200
QPWNSMIPHE HSFLPSSYRG KDLQENYCRN PRGEEGGPWC FTSNPEVRYE
       210        220        230        240        250
VCDIPQCSEV ECMTCNGESY RGPMDHTESG KTCQRWDQQT PHRHKFLPER
       260        270        280        290        300
YPDKGFDDNY CRNPDGKPRP WCYTLDPDTP WEYCAIKTCA HSAVNETDVP
       310        320        330        340        350
METTECIQGQ GEGYRGTSNT IWNGIPCQRW DSQYPHKHDI TPENFKCKDL
       360        370        380        390        400
RENYCRNPDG AESPWCFTTD PNIRVGYCSQ IPKCDVSSGQ DCYRGNGKNY
       410        420        430        440        450
MGNLSKTRSG LTCSMWDKNM EDLHRHIFWE PDASKLNKNY CRNPDDDAHG
       460        470        480        490        500
PWCYTGNPLI PWDYCPISRC EGDTTPTIVN LDHPVISCAK TKQLRVVNGI
       510        520        530        540        550
PTQTTVGWMV SLKYRNKHIC GGSLIKESWV LTARQCFPAR NKDLKDYEAW
       560        570        580        590        600
LGIHDVHERG EEKRKQILNI SQLVYGPEGS DLVLLKLARP AILDNFVSTI
       610        620        630        640        650
DLPSYGCTIP EKTTCSIYGW GYTGLINADG LLRVAHLYIM GNEKCSQHHQ
       660        670        680        690        700
GKVTLNESEL CAGAEKIGSG PCEGDYGGPL ICEQHKMRMV LGVIVPGRGC
       710        720
AIPNRPGIFV RVAYYAKWIH KVILTYKL
Length: 728
Mass (Da): 82,945
Last modified: November 1, 1995 - v1
Checksum: A0381FC497534328

Isoform Short (identifier: Q08048-2)

The sequence of this isoform differs from the canonical sequence as follows:
     163-167: Missing.

Length: 723
Mass (Da): 82,413
Checksum: 7BC19A16C399EDD2

Isoform NK1 (identifier: Q08048-3)

The sequence of this isoform differs from the canonical sequence as follows:
     210-211: VE → GK
     212-728: Missing.

Length: 211
Mass (Da): 24,308
Checksum: 758BD0687A835F48
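
The isoform descriptions above are edit lists against the canonical sequence: each line names a 1-based inclusive range that is either replaced or missing. The following Python sketch shows one way to apply such edits. The function name and the placeholder sequence are assumptions made for the example; only the coordinates, the VE → GK substitution and the expected lengths come from this entry.

```python
def apply_edits(canonical: str, edits: list[tuple[int, int, str]]) -> str:
    """Apply (start, end, replacement) edits to a sequence.

    Coordinates are 1-based and inclusive and refer to the canonical
    sequence; use "" as the replacement for a "Missing" range.
    """
    result = canonical
    # Work from the C-terminal end so earlier coordinates stay valid.
    for start, end, replacement in sorted(edits, reverse=True):
        result = result[:start - 1] + replacement + result[end:]
    return result


# Edit lists transcribed from the isoform descriptions.
ISOFORM_EDITS = {
    "Short": [(163, 167, "")],
    "NK1": [(210, 211, "GK"), (212, 728, "")],
}

# Placeholder standing in for the 728-residue canonical sequence; only
# residues 210-211 (VE) are taken from the entry, the rest is filler.
placeholder = "A" * 209 + "VE" + "A" * 517

short = apply_edits(placeholder, ISOFORM_EDITS["Short"])
nk1 = apply_edits(placeholder, ISOFORM_EDITS["NK1"])
print(len(short), len(nk1), nk1[-2:])  # -> 723 211 GK
```

The resulting lengths, 723 and 211, agree with the lengths reported for isoforms Short and NK1. Substituting the real canonical sequence for the placeholder should yield the isoform sequences themselves.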

Sequence caution

The sequence CAA51054.1 differs from that shown. Reason: Erroneous initiation. (Curated)

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  189 – 189    1       W → K AA sequence (PubMed:2144630). (Curated)
Sequence conflict  344 – 344    1       N → K in AAB31855 (PubMed:8081873). (Curated)
Sequence conflict  378 – 378    1       C → E AA sequence (PubMed:2144630). (Curated)
Sequence conflict  380 – 380    1       Q → E AA sequence (PubMed:2144630). (Curated)
Sequence conflict  479 – 479    1       V → L in AAB31855 (PubMed:8081873). (Curated)
Sequence conflict  513 – 513    1       K → L AA sequence (PubMed:2142751). (Curated)
Sequence conflict  518 – 518    1       H → T AA sequence (PubMed:1831975). (Curated)
Sequence conflict  564 – 564    1       R → H in CAA51054 (PubMed:8241272). (Curated)

Alternative sequence

Feature key           Position(s)  Length  Description                                Feature identifier
Alternative sequence  163 – 167    5       Missing in isoform Short. (1 publication)  VSP_005408
Alternative sequence  210 – 211    2       VE → GK in isoform NK1. (1 publication)    VSP_044345
Alternative sequence  212 – 728    517     Missing in isoform NK1. (1 publication)    VSP_044346

Sequence databases

EMBL / GenBank / DDBJ:
D10212 mRNA. Translation: BAA01064.1.
D10213 mRNA. Translation: BAA01065.1.
S71816 mRNA. Translation: AAB31855.1.
X72307 mRNA. Translation: CAA51054.1. Different initiation.
AF042856 mRNA. Translation: AAC40051.1.
X84046 mRNA. Translation: CAA58865.1.
CH466586 Genomic DNA. Translation: EDL03238.1.
BC119228 mRNA. Translation: AAI19229.1.
X81630 Genomic DNA. Translation: CAA57286.1.
CCDS: CCDS19097.1. [Q08048-1]
PIR: JC2117. A60185.
RefSeq: NP_001276387.1. NM_001289458.1. [Q08048-1]
        NP_001276388.1. NM_001289459.1. [Q08048-1]
        NP_001276389.1. NM_001289460.1. [Q08048-3]
        NP_001276390.1. NM_001289461.1. [Q08048-2]
        NP_034557.3. NM_010427.5. [Q08048-1]
        XP_006535691.1. XM_006535628.2. [Q08048-1]
UniGene: Mm.267078.

Genome annotation databases

Ensembl: ENSMUST00000030683; ENSMUSP00000030683; ENSMUSG00000028864. [Q08048-1]
GeneID: 15234.
KEGG: mmu:15234.
UCSC: uc008wne.1. mouse.
      uc008wnf.1. mouse. [Q08048-1]
      uc008wnj.1. mouse. [Q08048-2]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 3082.
MGI: MGI:96079. Hgf.

Miscellaneous databases

EvolutionaryTrace: Q08048.
NextBio: 287831.
PRO: Q08048.

Publications

  1. "Identification of mouse mammary fibroblast-derived mammary growth factor as hepatocyte growth factor."
    Sasaki M., Nishio M., Sasaki T., Enami J.
    Biochem. Biophys. Res. Commun. 199:772-779(1994)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS LONG AND SHORT), PROTEIN SEQUENCE OF 496-504.
    Tissue: Mammary fibroblast.
  2. "Structure, genetic mapping, and expression of the mouse Hgf/scatter factor gene."
    Lee C.C., Kozak C.A., Yamada K.M.
    Cell Adhes. Commun. 1:101-111(1993)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
    Tissue: Liver.
  3. "Molecular cloning and characterization of cDNA encoding mouse hepatocyte growth factor."
    Liu Y., Michalopoulos G.K., Zarnegar R.
    Biochim. Biophys. Acta 1216:299-303(1993)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
    Tissue: Liver.
  4. "NK1, a natural splice variant of hepatocyte growth factor/scatter factor, is a partial agonist in vivo."
    Jakubczak J.L., LaRochelle W.J., Merlino G.
    Mol. Cell. Biol. 18:1275-1283(1998)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM NK1).
    Tissue: Lung.
  5. Sharpe M.J.S., Lane K., Gherardi E.
    Submitted (JAN-1999) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM LONG).
    Strain: Swiss.
    Tissue: Fibroblast.
  6. Mural R.J., Adams M.D., Myers E.W., Smith H.O., Venter J.C.
    Submitted (SEP-2005) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  7. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM LONG).
    Tissue: Brain.
  8. "Characterisation of the scatter factor/hepatocyte growth factor gene promoter: positive and negative regulatory elements direct gene expression to mesenchymal cells."
    Plaschke-Schluetter A., Behrens J., Gherardi E., Birchmeier W.
    J. Biol. Chem. 270:830-836(1995)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 1-30.
    Strain: 129.
  9. "Purified scatter factor stimulates epithelial and vascular endothelial cell migration."
    Rosen E.M., Meromsky L., Setter E., Vinter D.W., Goldberg I.D.
    Proc. Soc. Exp. Biol. Med. 195:34-43(1990)
    Cited for: PROTEIN SEQUENCE OF 184-197; 357-367; 375-383 AND 653-666.
  10. Cited for: PROTEIN SEQUENCE OF 496-519.
  11. "Purification and characterization of biologically active scatter factor from ras-transformed NIH 3T3 conditioned medium."
    Coffer A., Fellows J., Young S., Pappin D., Rahman D.
    Biochem. J. 278:35-41(1991)
    Cited for: PROTEIN SEQUENCE OF 496-519.
  12. "A mechanistic basis for converting a receptor tyrosine kinase agonist to an antagonist."
    Tolbert W.D., Daugherty J., Gao C., Xie Q., Miranti C., Gherardi E., Woude G.V., Xu H.E.
    Proc. Natl. Acad. Sci. U.S.A. 104:14592-14597(2007)
    Cited for: X-RAY CRYSTALLOGRAPHY (2.5 ANGSTROMS) OF 29-210, DISULFIDE BONDS.
  13. Cited for: X-RAY CRYSTALLOGRAPHY (2.0 ANGSTROMS) OF 31-127, DISULFIDE BONDS, FUNCTION.

Entry information

Entry name: HGF_MOUSE
Accession: Primary (citable) accession number: Q08048
Secondary accession number(s): O55027, Q53WS5, Q61662, Q64007, Q6LBE6
Entry history:
Integrated into UniProtKB/Swiss-Prot: November 1, 1995
Last sequence update: November 1, 1995
Last modified: April 1, 2015
This is version 148 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Caution

Has lost two of the three essential catalytic residues and so probably has no enzymatic activity. (Curated)

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  3. Peptidase families
    Classification of peptidase families and list of entries
  4. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
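
The UniRef90 and UniRef50 criteria combine two thresholds, sequence identity and overlap with the seed. As a rough illustration, the sketch below evaluates both for a pair of already aligned sequences. It is not the UniRef clustering procedure. The function names are hypothetical, and the definitions used (identity over columns where both sequences have a residue, overlap as the fraction of seed residues covered by those columns) are assumptions for the example.

```python
def identity_and_overlap(seed_aln: str, member_aln: str) -> tuple[float, float]:
    """Compute (identity, overlap) from two gapped, equal-length strings.

    Assumed definitions: identity = matches / columns where both have a
    residue; overlap = those columns / number of residues in the seed.
    """
    if len(seed_aln) != len(member_aln):
        raise ValueError("aligned strings must have equal length")
    aligned = [(s, m) for s, m in zip(seed_aln, member_aln)
               if s != "-" and m != "-"]
    matches = sum(1 for s, m in aligned if s == m)
    seed_length = sum(1 for s in seed_aln if s != "-")
    identity = matches / len(aligned) if aligned else 0.0
    overlap = len(aligned) / seed_length if seed_length else 0.0
    return identity, overlap


def joins_cluster(seed_aln: str, member_aln: str,
                  min_identity: float, min_overlap: float = 0.80) -> bool:
    """True when both the identity and the overlap thresholds are met."""
    identity, overlap = identity_and_overlap(seed_aln, member_aln)
    return identity >= min_identity and overlap >= min_overlap


# Toy alignments, not real protein data.
print(joins_cluster("ACDEFGHIKL", "ACDEFGHI--", 0.90))  # identical, 80% overlap
print(joins_cluster("ACDEFGHIKL", "ACDEWWHIKL", 0.90))  # 80% identity
print(joins_cluster("ACDEFGHIKL", "ACDEWWHIKL", 0.50))
```

The second toy pair fails the 90% threshold but passes at 50%, which mirrors how a sequence can sit outside a UniRef90 cluster while still belonging to the corresponding UniRef50 cluster.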