Protein

Very-long-chain 3-oxoacyl-CoA reductase

Gene

HSD17B12

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Catalyzes the second of the four reactions of the long-chain fatty acid elongation cycle. This endoplasmic reticulum-bound enzymatic process adds two carbons per cycle to the chain of long- and very-long-chain fatty acids (VLCFAs). The enzyme has 3-ketoacyl-CoA reductase activity, reducing 3-ketoacyl-CoA to 3-hydroxyacyl-CoA within each cycle of fatty acid elongation. It may thereby participate in the production of VLCFAs of different chain lengths, which are involved in multiple biological processes as precursors of membrane lipids and lipid mediators. May also catalyze the transformation of estrone (E1) into estradiol (E2) and play a role in estrogen formation. (2 publications)

Catalytic activity

A very-long-chain (3R)-3-hydroxyacyl-CoA + NADP+ = a very-long-chain 3-oxoacyl-CoA + NADPH. (1 publication)
17-beta-estradiol + NAD(P)+ = estrone + NAD(P)H. (1 publication)

Kinetics

  1. KM = 3.5 µM for estrone (1 publication)
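
To make the KM value concrete, here is a minimal sketch that assumes simple Michaelis-Menten kinetics, which the entry itself does not state. It reports the reaction rate as a fraction of Vmax at a given estrone concentration. The function name and the example concentrations are illustrative only, and Vmax is left out because the entry does not give it.

```python
KM_ESTRONE_UM = 3.5  # KM for estrone taken from the entry, in micromolar


def fraction_of_vmax(substrate_um: float, km_um: float = KM_ESTRONE_UM) -> float:
    """Return v/Vmax under the Michaelis-Menten assumption: [S] / (KM + [S])."""
    if substrate_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return substrate_um / (km_um + substrate_um)


if __name__ == "__main__":
    # Hypothetical estrone concentrations, chosen only for illustration.
    for conc in (0.35, 3.5, 35.0):
        print(f"[estrone] = {conc:g} uM -> v/Vmax = {fraction_of_vmax(conc):.2f}")
```

Under this assumption the rate reaches half of Vmax when the estrone concentration equals KM, which is the usual reading of a reported KM.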

    Pathway: fatty acid biosynthesis

    This protein is involved in the pathway fatty acid biosynthesis, which is part of Lipid metabolism. (1 publication)

    Pathway: estrogen biosynthesis

    This protein is involved in the pathway estrogen biosynthesis, which is part of Steroid biosynthesis. (1 publication)

    Sites

    Feature key    Position(s)  Length  Description
    Binding site   189          1       Substrate (by similarity)
    Active site    202          1       Proton acceptor (PROSITE-ProRule annotation)

    Regions

    Feature key         Position(s)  Length  Description
    Nucleotide binding  50 – 79      30      NADP (by similarity)

    Keywords - Molecular function

    Oxidoreductase

    Keywords - Biological process

    Lipid biosynthesis, Lipid metabolism, Steroid biosynthesis

    Keywords - Ligand

    NADP

    Enzyme and pathway databases

    BRENDA: 1.1.1.62. 2681.
    Reactome: REACT_11059. Androgen biosynthesis.
              REACT_380. Synthesis of very long-chain fatty acyl-CoAs.
    SABIO-RK: Q53GQ0.
    UniPathway: UPA00094.
                UPA00769.

    Names & Taxonomy

    Protein names
    Recommended name:
    Very-long-chain 3-oxoacyl-CoA reductase (Curated) (EC 1.1.1.330, 1 publication)
    Alternative name(s):
    17-beta-hydroxysteroid dehydrogenase 12 (1 publication)
    Short name: 17-beta-HSD 12 (1 publication)
    3-ketoacyl-CoA reductase (1 publication)
    Short name: KAR (1 publication)
    Estradiol 17-beta-dehydrogenase 12 (1 publication) (EC 1.1.1.62, 1 publication)
    Short chain dehydrogenase/reductase family 12C member 1
    Gene names
    Name: HSD17B12 (Imported)
    Synonyms: SDR12C1
    Organism: Homo sapiens (Human)
    Taxonomic identifier: 9606 [NCBI]
    Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
    Proteomes: UP000005640. Component: Chromosome 11

    Organism-specific databases

    HGNC: HGNC:18646. HSD17B12.

    Subcellular location

    Topology

    Feature key    Position(s)  Length  Description
    Transmembrane  4 – 24       21      Helical (sequence analysis)
    Transmembrane  182 – 202    21      Helical (sequence analysis)
    Transmembrane  271 – 291    21      Helical (sequence analysis)

    Keywords - Cellular component

    Endoplasmic reticulum, Membrane

    Pathology & Biotech

    Mutagenesis

    Feature key  Position(s)  Length  Description
    Mutagenesis  196          1       V → W: No effect. (1 publication)
    Mutagenesis  234          1       F → A: Allows the conversion of androstenedione to testosterone. (1 publication)
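
The two mutagenesis positions can be checked against the canonical sequence with a short sketch like the one below. It assumes 1-based positions on isoform 1, as the entry states for all positional information. The helper name `substitute` is hypothetical and the sequence is copied from the Sequences section of this entry.

```python
# Canonical isoform 1 sequence (Q53GQ0-1), copied from the Sequences section.
CANONICAL = (
    "MESALPAAGFLYWVGAGTVAYLALRISYSLFTALRVWGVGNEAGVGPGLG"
    "EWAVVTGSTDGIGKSYAEELAKHGMKVVLISRSKDKLDQVSSEIKEKFKV"
    "ETRTIAVDFASEDIYDKIKTGLAGLEIGILVNNVGMSYEYPEYFLDVPDL"
    "DNVIKKMININILSVCKMTQLVLPGMVERSKGAILNISSGSGMLPVPLLT"
    "IYSATKTFVDFFSQCLHEEYRSKGVFVQSVLPYFVATKLAKIRKPTLDKP"
    "SPETFVKSAIKTVGLQSRTNGYLIHALMGSIISNLPSWIYLKIVMNMNKS"
    "TRAHYLKKTKKN"
)


def substitute(sequence: str, position: int, ref: str, alt: str) -> str:
    """Apply a single-residue substitution at a 1-based position.

    The reference residue is checked first so that an off-by-one error
    in the position is reported instead of silently applied.
    """
    found = sequence[position - 1]
    if found != ref:
        raise ValueError(f"expected {ref} at position {position}, found {found}")
    return sequence[: position - 1] + alt + sequence[position:]


if __name__ == "__main__":
    v196w = substitute(CANONICAL, 196, "V", "W")
    f234a = substitute(CANONICAL, 234, "F", "A")
    # Print the ten-residue windows 191-200 and 231-240 around each change.
    print(v196w[190:200], f234a[230:240])
```

Checking the reference residue before substituting is a cheap guard when coordinates are copied by hand from a feature table.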

    Organism-specific databases

    PharmGKB: PA38618.

    Polymorphism and mutation databases

    BioMuta: HSD17B12.
    DMDM: 158931120.

    PTM / Processing

    Molecule processing

    Feature key  Position(s)  Length  Description                              Feature identifier
    Chain        1 – 312      312     Very-long-chain 3-oxoacyl-CoA reductase  PRO_0000248368

    Proteomic databases

    MaxQB: Q53GQ0.
    PaxDb: Q53GQ0.
    PRIDE: Q53GQ0.

    PTM databases

    PhosphoSite: Q53GQ0.

    Expression

    Tissue specificity

    Expressed in most tissues tested. Highly expressed in the ovary and mammary gland. Expressed in platelets. (3 publications)

    Gene expression databases

    Bgee: Q53GQ0.
    CleanEx: HS_HSD17B12.
    ExpressionAtlas: Q53GQ0. baseline and differential.
    Genevisible: Q53GQ0. HS.

    Organism-specific databases

    HPA: HPA016427.

    Interaction

    Subunit structure

    Interacts with ELOVL1 and LASS2. (1 publication)

    Binary interactions

    With    Entry     #Exp.  IntAct
    UBQLN1  Q9UMX0    3      EBI-2963255, EBI-741480
    UBQLN1  Q9UMX0-2  3      EBI-2963255, EBI-10173939

    Protein-protein interaction databases

    BioGrid: 119328. 13 interactions.
    IntAct: Q53GQ0. 8 interactions.
    STRING: 9606.ENSP00000278353.

    Structure

    3D structure databases

    ProteinModelPortal: Q53GQ0.
    SMR: Q53GQ0. Positions 53-280.

    Family & Domains

    Motif

    Feature key  Position(s)  Length  Description
    Motif        308 – 312    5       Di-lysine motif

    Domain

    The di-lysine motif confers endoplasmic reticulum localization for type I membrane proteins. (By similarity)

    Sequence similarities

    Keywords - Domain

    Transmembrane, Transmembrane helix

    Phylogenomic databases

    eggNOG: COG0300.
    GeneTree: ENSGT00390000010069.
    HOGENOM: HOG000039237.
    HOVERGEN: HBG005478.
    InParanoid: Q53GQ0.
    KO: K10251.
    OMA: TERSEMS.
    PhylomeDB: Q53GQ0.
    TreeFam: TF314591.

    Family and domain databases

    Gene3D: 3.40.50.720. 1 hit.
    InterPro: IPR002198. DH_sc/Rdtase_SDR.
              IPR002347. Glc/ribitol_DH.
              IPR016040. NAD(P)-bd_dom.
              IPR020904. Sc_DH/Rdtase_CS.
    Pfam: PF00106. adh_short. 1 hit.
    PIRSF: PIRSF000126. 11-beta-HSD1. 1 hit.
    PRINTS: PR00081. GDHRDH.
            PR00080. SDRFAMILY.
    PROSITE: PS00061. ADH_SHORT. 1 hit.

    Sequences (2)

    Sequence status: Complete.

    This entry describes 2 isoforms produced by alternative splicing.

    Isoform 1 (identifier: Q53GQ0-1)

    This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

            10         20         30         40         50
    MESALPAAGF LYWVGAGTVA YLALRISYSL FTALRVWGVG NEAGVGPGLG
            60         70         80         90        100
    EWAVVTGSTD GIGKSYAEEL AKHGMKVVLI SRSKDKLDQV SSEIKEKFKV
           110        120        130        140        150
    ETRTIAVDFA SEDIYDKIKT GLAGLEIGIL VNNVGMSYEY PEYFLDVPDL
           160        170        180        190        200
    DNVIKKMINI NILSVCKMTQ LVLPGMVERS KGAILNISSG SGMLPVPLLT
           210        220        230        240        250
    IYSATKTFVD FFSQCLHEEY RSKGVFVQSV LPYFVATKLA KIRKPTLDKP
           260        270        280        290        300
    SPETFVKSAI KTVGLQSRTN GYLIHALMGS IISNLPSWIY LKIVMNMNKS
           310
    TRAHYLKKTK KN
    Length: 312
    Mass (Da): 34,324
    Last modified: October 2, 2007 - v2
    Checksum: 8518336D7F514E50
    Isoform 2 (identifier: Q53GQ0-2)

    The sequence of this isoform differs from the canonical sequence as follows:
         95-98: KEKF → SNYT
         99-312: Missing.

    Note: No experimental confirmation available.
    Length: 98
    Mass (Da): 10,342
    Checksum: 2EEE844E7FDC97A1
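
The description of isoform 2 can be turned into a small sketch that rebuilds it from the canonical sequence. It assumes the positions are 1-based and inclusive, as elsewhere in the entry. The function name `derive_isoform_2` is hypothetical, and the result is only a consistency check against the stated length of 98, not an experimentally confirmed sequence.

```python
# Canonical isoform 1 sequence (Q53GQ0-1), copied from the Sequences section.
CANONICAL = (
    "MESALPAAGFLYWVGAGTVAYLALRISYSLFTALRVWGVGNEAGVGPGLG"
    "EWAVVTGSTDGIGKSYAEELAKHGMKVVLISRSKDKLDQVSSEIKEKFKV"
    "ETRTIAVDFASEDIYDKIKTGLAGLEIGILVNNVGMSYEYPEYFLDVPDL"
    "DNVIKKMININILSVCKMTQLVLPGMVERSKGAILNISSGSGMLPVPLLT"
    "IYSATKTFVDFFSQCLHEEYRSKGVFVQSVLPYFVATKLAKIRKPTLDKP"
    "SPETFVKSAIKTVGLQSRTNGYLIHALMGSIISNLPSWIYLKIVMNMNKS"
    "TRAHYLKKTKKN"
)


def derive_isoform_2(canonical: str) -> str:
    """Apply the two alternative-sequence features listed for isoform 2."""
    # Positions 95-98 (1-based, inclusive) map to the Python slice [94:98].
    if canonical[94:98] != "KEKF":
        raise ValueError("positions 95-98 of the canonical sequence are not KEKF")
    replaced = canonical[:94] + "SNYT" + canonical[98:]  # 95-98: KEKF -> SNYT
    return replaced[:98]  # 99-312: missing


if __name__ == "__main__":
    isoform_2 = derive_isoform_2(CANONICAL)
    print(len(CANONICAL), len(isoform_2))
```

The derived length can be compared with the length of 98 listed for isoform 2 as a quick check that both features were applied with the right coordinates.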

    Sequence caution

    The sequence AK027882 differs from that shown. Reason: Frameshift at position 92. (Curated)

    Natural variant

    Feature key      Position(s)  Length  Description                                                                  Feature identifier
    Natural variant  280          1       S → L. Corresponds to variant rs11555762 [dbSNP | Ensembl]. (3 publications)  VAR_027277

    Alternative sequence

    Feature key           Position(s)  Length  Description                                Feature identifier
    Alternative sequence  95 – 98      4       KEKF → SNYT in isoform 2. (1 publication)  VSP_056380
    Alternative sequence  99 – 312     214     Missing in isoform 2. (1 publication)      VSP_056381

    Sequence databases

    EMBL / GenBank / DDBJ:
    AF078850 mRNA. Translation: AAD44482.1.
    AK027882 mRNA. No translation available.
    AK074952 mRNA. Translation: BAG52039.1.
    AK075216 mRNA. Translation: BAG52086.1.
    AK222881 mRNA. Translation: BAD96601.1.
    AK292625 mRNA. Translation: BAF85314.1.
    AC023085 Genomic DNA. No translation available.
    AC068205 Genomic DNA. No translation available.
    AC087521 Genomic DNA. No translation available.
    CH471064 Genomic DNA. Translation: EAW68082.1.
    CH471064 Genomic DNA. Translation: EAW68087.1.
    CH471064 Genomic DNA. Translation: EAW68088.1.
    BC012043 mRNA. Translation: AAH12043.1.
    BC012536 mRNA. Translation: AAH12536.1.
    CCDS: CCDS7905.1. [Q53GQ0-1]
    RefSeq: NP_057226.1. NM_016142.2. [Q53GQ0-1]
    UniGene: Hs.132513.

    Genome annotation databases

    Ensembl: ENST00000278353; ENSP00000278353; ENSG00000149084.
             ENST00000395700; ENSP00000379052; ENSG00000149084. [Q53GQ0-2]
    GeneID: 51144.
    KEGG: hsa:51144.
    UCSC: uc001mxq.4. human. [Q53GQ0-1]

    Keywords - Coding sequence diversity

    Alternative splicing, Polymorphism

    Cross-references

    Chemistry

    ChEMBL: CHEMBL5998.

    Protocols and materials databases

    DNASU: 51144.

    Organism-specific databases

    CTD: 51144.
    GeneCards: GC11P043577.
    HGNC: HGNC:18646. HSD17B12.
    HPA: HPA016427.
    MIM: 609574. gene.
    neXtProt: NX_Q53GQ0.
    PharmGKB: PA38618.

    Miscellaneous databases

    ChiTaRS: HSD17B12. human.
    GeneWiki: HSD17B12.
    GenomeRNAi: 51144.
    NextBio: 54008.
    PRO: Q53GQ0.

    Publications

    1. "Human steroid dehydrogenase homologue, complete cds."
      Liu T., Zhang J., Fu G., Zhang Q., Ye M., Zhou J., Wu J., Shen Y., Yu M., Chen S., Mao M., Chen Z.
      Submitted (JUL-1998) to the EMBL/GenBank/DDBJ databases
      Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1).
    2. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
      Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
      Nat. Genet. 36:40-45(2004) [PubMed] [Europe PMC] [Abstract]
      Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1), VARIANT LEU-280.
      Tissue: Liver, Placenta, Thymus and Thyroid.
    3. Suzuki Y., Sugano S., Totoki Y., Toyoda A., Takeda T., Sakaki Y., Tanaka A., Yokoyama S.
      Submitted (APR-2005) to the EMBL/GenBank/DDBJ databases
      Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1), VARIANT LEU-280.
      Tissue: Liver.
    4. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    5. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA], VARIANT LEU-280.
    6. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
      The MGC Project Team
      Genome Res. 14:2121-2127(2004) [PubMed] [Europe PMC] [Abstract]
      Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1 AND 2).
      Tissue: Brain and Pancreas.
    7. Bienvenut W.V.
      Submitted (JUN-2005) to UniProtKB
      Cited for: PROTEIN SEQUENCE OF 26-35; 65-72; 104-117; 157-179 AND 207-221, IDENTIFICATION BY MASS SPECTROMETRY.
      Tissue: B-cell lymphoma.
    8. "Identification of two mammalian reductases involved in the two-carbon fatty acyl elongation cascade."
      Moon Y.-A., Horton J.D.
      J. Biol. Chem. 278:7335-7343(2003) [PubMed] [Europe PMC] [Abstract]
      Cited for: FUNCTION, CATALYTIC ACTIVITY, PATHWAY, SUBCELLULAR LOCATION, TISSUE SPECIFICITY.
    9. "Platelets express steroidogenic 17beta-hydroxysteroid dehydrogenases. Distinct profiles predict the essential thrombocythemic phenotype."
      Gnatenko D.V., Cupit L.D., Huang E.C., Dhundale A., Perrotta P.L., Bahou W.F.
      Thromb. Haemost. 94:412-421(2005) [PubMed] [Europe PMC] [Abstract]
      Cited for: TISSUE SPECIFICITY.
    10. "Characterization of type 12 17beta-hydroxysteroid dehydrogenase, an isoform of type 3 17beta-hydroxysteroid dehydrogenase responsible for estradiol formation in women."
      Luu-The V., Tremblay P., Labrie F.
      Mol. Endocrinol. 20:437-443(2006) [PubMed] [Europe PMC] [Abstract]
      Cited for: FUNCTION, CATALYTIC ACTIVITY, PATHWAY, BIOPHYSICOCHEMICAL PROPERTIES, TISSUE SPECIFICITY, MUTAGENESIS OF VAL-196 AND PHE-234.
    11. Cited for: INTERACTION WITH ELOVL1 AND LASS2.
    12. Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    13. "An enzyme assisted RP-RPLC approach for in-depth analysis of human liver phosphoproteome."
      Bian Y., Song C., Cheng K., Dong M., Wang F., Huang J., Sun D., Wang L., Ye M., Zou H.
      J. Proteomics 96:253-262(2014) [PubMed] [Europe PMC] [Abstract]
      Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
      Tissue: Liver.

    Entry information

    Entry name: DHB12_HUMAN
    Accession: Primary (citable) accession number: Q53GQ0
    Secondary accession number(s): A8K9B0, D3DR23, Q96EA9, Q96JU2, Q9Y6G8
    Entry history
    Integrated into UniProtKB/Swiss-Prot: September 5, 2006
    Last sequence update: October 2, 2007
    Last modified: July 22, 2015
    This is version 107 of the entry and version 2 of the sequence.
    Entry status: Reviewed (UniProtKB/Swiss-Prot)
    Annotation program: Chordata Protein Annotation Program
    Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

    Miscellaneous

    Keywords - Technical term

    Complete proteome, Direct protein sequencing, Reference proteome

    Documents

    1. Human chromosome 11
      Human chromosome 11: entries, gene names and cross-references to MIM
    2. Human entries with polymorphisms or disease mutations
      List of human entries with polymorphisms or disease mutations
    3. Human polymorphisms and disease mutations
      Index of human polymorphisms and disease mutations
    4. MIM cross-references
      Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
    5. PATHWAY comments
      Index of metabolic and biosynthesis pathways
    6. SIMILARITY comments
      Index of protein domains and families

    Similar proteins

    Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
    100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
    90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
    50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.