Protein

Beta-1,4-galactosyltransferase 7

Gene

B4GALT7

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Required for the biosynthesis of the tetrasaccharide linkage region of proteoglycans, especially for small proteoglycans in skin fibroblasts. (1 publication)

Catalytic activity

UDP-alpha-D-galactose + O-beta-D-xylosyl-[protein] = UDP + 4-beta-D-galactosyl-O-beta-D-xylosyl-[protein]. (1 publication)

Cofactor

Mn2+ (1 publication)

Pathway

Sites

| Feature key | Position(s) | Length | Description |
| --- | --- | --- | --- |
| Metal binding | 165 | 1 | Manganese |
| Binding site | 194 | 1 | UDP-alpha-D-galactose (1 publication) |
| Binding site | 224 | 1 | UDP-alpha-D-galactose (1 publication) |
| Metal binding | 257 | 1 | Manganese; via tele nitrogen |
| Binding site | 266 | 1 | UDP-alpha-D-galactose (1 publication) |
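
The site positions listed here can be cross-checked against the canonical sequence given in the Sequence section of this entry. The Python sketch below is illustrative only: `SEQ`, `SITES` and `residue_at` are hypothetical names chosen for this example and are not part of any UniProt tool. It assumes that feature positions are 1-based, so position p corresponds to string index p - 1.

```python
# Canonical sequence Q9UBV7-1, copied from the Sequence section of this entry.
SEQ = (
    "MFPSRRKAAQLPWEDGRSGLLSGGLPRKCSVFHLFVACLSLGFFSLLWLQ"
    "LSCSGDVARAVRGQGQETSGPPRACPPEPPPEHWEEDASWGPHRLAVLVP"
    "FRERFEELLVFVPHMRRFLSRKKIRHHIYVLNQVDHFRFNRAALINVGFL"
    "ESSNSTDYIAMHDVDLLPLNEELDYGFPEAGPFHVASPELHPLYHYKTYV"
    "GGILLLSKQHYRLCNGMSNRFWGWGREDDEFYRRIKGAGLQLFRPSGITT"
    "GYKTFRHLHDPAWRKRDQKRIAAQKQEQFKVDREGGLNTVKYHVASRTAL"
    "SVGGAPCTVLNIMLDCDKTATPWCTFS"
)

# Annotated sites from this entry: 1-based position -> description.
SITES = {
    165: "Metal binding (manganese)",
    194: "Binding site (UDP-alpha-D-galactose)",
    224: "Binding site (UDP-alpha-D-galactose)",
    257: "Metal binding (manganese; via tele nitrogen)",
    266: "Binding site (UDP-alpha-D-galactose)",
}


def residue_at(seq: str, position: int) -> str:
    """Return the one-letter residue at a 1-based sequence position."""
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside 1..{len(seq)}")
    return seq[position - 1]


if __name__ == "__main__":
    for position, description in sorted(SITES.items()):
        print(f"{residue_at(SEQ, position)}{position}\t{description}")
```

With the sequence as listed, the two manganese ligands come out as D165 and H257. The histidine at 257 agrees with the "via tele nitrogen" note, since the tele nitrogen is an imidazole ring atom.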

GO - Molecular function

  1. beta-N-acetylglucosaminylglycopeptide beta-1,4-galactosyltransferase activity Source: UniProtKB
  2. galactosyltransferase activity Source: UniProtKB
  3. manganese ion binding Source: UniProtKB
  4. xylosylprotein 4-beta-galactosyltransferase activity Source: UniProtKB

GO - Biological process

  1. carbohydrate metabolic process Source: Reactome
  2. cellular protein modification process Source: ProtInc
  3. chondroitin sulfate metabolic process Source: Reactome
  4. extracellular fibril organization Source: UniProtKB
  5. glycosaminoglycan biosynthetic process Source: UniProtKB
  6. glycosaminoglycan metabolic process Source: Reactome
  7. negative regulation of fibroblast proliferation Source: UniProtKB
  8. pathogenesis Source: Reactome
  9. protein N-linked glycosylation Source: UniProtKB
  10. proteoglycan metabolic process Source: UniProtKB
  11. small molecule metabolic process Source: Reactome

Keywords - Molecular function

Glycosyltransferase, Transferase

Keywords - Ligand

Manganese, Metal-binding

Enzyme and pathway databases

BioCyc: MetaCyc:HS00459-MONOMER.
BRENDA: 2.4.1.133. 2681.
Reactome: REACT_121408. A tetrasaccharide linker sequence is required for GAG synthesis.
Reactome: REACT_268749. Defective B4GALT7 causes EDS, progeroid type.
UniPathway: UPA00378.

Protein family/group databases

CAZy: GT7. Glycosyltransferase Family 7.

Names & Taxonomy

Protein names
Recommended name:
Beta-1,4-galactosyltransferase 7 (EC:2.4.1.-)
Short name:
Beta-1,4-GalTase 7
Short name:
Beta4Gal-T7
Short name:
b4Gal-T7
Alternative name(s):
UDP-Gal:beta-GlcNAc beta-1,4-galactosyltransferase 7
UDP-galactose:beta-N-acetylglucosamine beta-1,4-galactosyltransferase 7
Including the following 1 domains:
Xylosylprotein 4-beta-galactosyltransferase (EC:2.4.1.133)
Alternative name(s):
Proteoglycan UDP-galactose:beta-xylose beta1,4-galactosyltransferase I
UDP-galactose:beta-xylose beta-1,4-galactosyltransferase
XGPT
XGalT-1
Xylosylprotein beta-1,4-galactosyltransferase
Gene names
Name: B4GALT7
Synonyms: XGALT1
ORF Names: UNQ748/PRO1478
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes: UP000005640. Component: Chromosome 5

Organism-specific databases

HGNC: HGNC:930. B4GALT7.

Subcellular location

Topology

| Feature key | Position(s) | Length | Description |
| --- | --- | --- | --- |
| Topological domain | 1 – 30 | 30 | Cytoplasmic (Sequence Analysis) |
| Transmembrane | 31 – 51 | 21 | Helical; Signal-anchor for type II membrane protein (Sequence Analysis) |
| Topological domain | 52 – 327 | 276 | Lumenal (Sequence Analysis) |
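
The three topology segments should tile the full 327-residue chain, and each length follows from end - start + 1. The sketch below checks both properties and looks up which segment contains a given residue. `Segment`, `tiles_chain` and `region_of` are hypothetical names introduced for illustration; coordinates are taken from this entry and treated as 1-based and inclusive.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Topology as annotated in this entry.
TOPOLOGY = [
    Segment("Cytoplasmic", 1, 30),
    Segment("Helical transmembrane (signal-anchor)", 31, 51),
    Segment("Lumenal", 52, 327),
]


def tiles_chain(segments: List[Segment], chain_length: int) -> bool:
    """True if the segments cover 1..chain_length with no gaps or overlaps."""
    expected_start = 1
    for segment in sorted(segments, key=lambda s: s.start):
        if segment.start != expected_start:
            return False
        expected_start = segment.end + 1
    return expected_start == chain_length + 1


def region_of(position: int, segments: List[Segment]) -> str:
    """Return the name of the segment that contains a 1-based position."""
    for segment in segments:
        if segment.start <= position <= segment.end:
            return segment.name
    raise ValueError(f"position {position} is not covered by any segment")


if __name__ == "__main__":
    for segment in TOPOLOGY:
        print(segment.name, segment.length)
    print("tiles chain:", tiles_chain(TOPOLOGY, 327))
    print("residue 165:", region_of(165, TOPOLOGY))
```

All annotated metal-binding and substrate-binding positions (165 to 266) fall in the lumenal segment, as expected for a type II Golgi membrane protein whose catalytic domain faces the lumen.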

GO - Cellular component

  1. Golgi apparatus Source: UniProtKB
  2. Golgi cisterna membrane Source: UniProtKB-SubCell
  3. Golgi membrane Source: Reactome
  4. integral component of membrane Source: UniProtKB

Keywords - Cellular component

Golgi apparatus, Membrane

Pathology & Biotech

Involvement in disease

Ehlers-Danlos syndrome, progeroid type, 1 (EDSP1) (1 publication)

The disease is caused by mutations affecting the gene represented in this entry.

Disease description: A variant form of Ehlers-Danlos syndrome characterized by progeroid facies, mild mental retardation, short stature, skin hyperextensibility, moderate skin fragility, and joint hypermobility principally in digits.

See also OMIM:130070

| Feature key | Position(s) | Length | Description | Feature identifier |
| --- | --- | --- | --- | --- |
| Natural variant | 186 | 1 | A → D in EDSP1. (1 publication) | VAR_010293 |
| Natural variant | 206 | 1 | L → P in EDSP1. (1 publication) | VAR_010294 |
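
The two disease variants appear in this entry in two notations: "A → D" at position 186 in the feature table and "ASP-186" in the publication list. A small sketch, assuming single-residue substitutions only, shows one way to convert the feature-table form into the compact and three-letter forms commonly used elsewhere. `describe_variant` and `THREE_LETTER` are hypothetical names, and the three-letter output is only HGVS-style, not a validated HGVS string.

```python
import re

# One-letter to three-letter amino acid codes (standard 20 residues).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Accepts "A → D" as shown in the feature table, or the ASCII form "A -> D".
CHANGE_RE = re.compile(r"^\s*([A-Z])\s*(?:→|->)\s*([A-Z])\s*$")


def describe_variant(position: int, change: str) -> dict:
    """Convert a feature-table substitution into compact and three-letter forms."""
    match = CHANGE_RE.match(change)
    if match is None:
        raise ValueError(f"unrecognised substitution: {change!r}")
    ref, alt = match.groups()
    if ref not in THREE_LETTER or alt not in THREE_LETTER:
        raise ValueError(f"non-standard residue in: {change!r}")
    return {
        "short": f"{ref}{position}{alt}",
        "three_letter": f"p.{THREE_LETTER[ref]}{position}{THREE_LETTER[alt]}",
    }


if __name__ == "__main__":
    for position, change in [(186, "A → D"), (206, "L → P")]:
        print(describe_variant(position, change))
```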

Keywords - Disease

Disease mutation, Ehlers-Danlos syndrome

Organism-specific databases

MIM: 130070. phenotype.
Orphanet: 75496. Ehlers-Danlos syndrome, progeroid type.
Orphanet: 294049. Reunion Island's Larsen syndrome.
PharmGKB: PA25229.

PTM / Processing

Molecule processing

| Feature key | Position(s) | Length | Description | Feature identifier |
| --- | --- | --- | --- | --- |
| Chain | 1 – 327 | 327 | Beta-1,4-galactosyltransferase 7 | PRO_0000080550 |

Amino acid modifications

| Feature key | Position(s) | Length | Description |
| --- | --- | --- | --- |
| Glycosylation | 154 | 1 | N-linked (GlcNAc...) (Sequence Analysis) |
| Disulfide bond | 316 ↔ 324 | | |

Keywords - PTM

Disulfide bond, Glycoprotein

Proteomic databases

MaxQB: Q9UBV7.
PaxDb: Q9UBV7.
PRIDE: Q9UBV7.

PTM databases

PhosphoSite: Q9UBV7.

Expression

Tissue specificity

High expression in heart, pancreas and liver, medium in placenta and kidney, low in brain, skeletal muscle and lung.

Gene expression databases

Bgee: Q9UBV7.
CleanEx: HS_B4GALT7.
ExpressionAtlas: Q9UBV7. baseline and differential.
Genevestigator: Q9UBV7.

Organism-specific databases

HPA: HPA042330.

Interaction

Protein-protein interaction databases

BioGrid: 116441. 7 interactions.
STRING: 9606.ENSP00000029410.

Structure

Secondary structure

| Feature key | Position(s) | Length | Evidence |
| --- | --- | --- | --- |
| Helix | 82 – 84 | 3 | Combined sources |
| Helix | 88 – 90 | 3 | Combined sources |
| Beta strand | 95 – 103 | 9 | Combined sources |
| Helix | 105 – 121 | 17 | Combined sources |
| Beta strand | 126 – 133 | 8 | Combined sources |
| Beta strand | 135 – 137 | 3 | Combined sources |
| Helix | 141 – 151 | 11 | Combined sources |
| Beta strand | 158 – 162 | 5 | Combined sources |
| Beta strand | 166 – 168 | 3 | Combined sources |
| Beta strand | 183 – 186 | 4 | Combined sources |
| Turn | 188 – 190 | 3 | Combined sources |
| Beta strand | 191 – 193 | 3 | Combined sources |
| Beta strand | 202 – 207 | 6 | Combined sources |
| Helix | 208 – 213 | 6 | Combined sources |
| Beta strand | 223 – 226 | 4 | Combined sources |
| Helix | 227 – 237 | 11 | Combined sources |
| Helix | 251 – 253 | 3 | Combined sources |
| Beta strand | 254 – 257 | 4 | Combined sources |
| Turn | 261 – 263 | 3 | Combined sources |
| Helix | 274 – 277 | 4 | Combined sources |
| Turn | 287 – 289 | 3 | Combined sources |
| Beta strand | 292 – 302 | 11 | Combined sources |
| Beta strand | 305 – 314 | 10 | Combined sources |
| Turn | 318 – 320 | 3 | Combined sources |
| Helix | 322 – 325 | 4 | Combined sources |
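
For a quick summary of the secondary-structure annotation, the listed ranges can be tallied per element type. The sketch below is a plain count over the coordinates in this entry; `ELEMENTS` and `residues_per_type` are hypothetical names, and the result describes only the annotated elements, not the full fold.

```python
from collections import Counter

# (type, start, end) for each annotated element, 1-based inclusive coordinates.
ELEMENTS = [
    ("Helix", 82, 84), ("Helix", 88, 90), ("Beta strand", 95, 103),
    ("Helix", 105, 121), ("Beta strand", 126, 133), ("Beta strand", 135, 137),
    ("Helix", 141, 151), ("Beta strand", 158, 162), ("Beta strand", 166, 168),
    ("Beta strand", 183, 186), ("Turn", 188, 190), ("Beta strand", 191, 193),
    ("Beta strand", 202, 207), ("Helix", 208, 213), ("Beta strand", 223, 226),
    ("Helix", 227, 237), ("Helix", 251, 253), ("Beta strand", 254, 257),
    ("Turn", 261, 263), ("Helix", 274, 277), ("Turn", 287, 289),
    ("Beta strand", 292, 302), ("Beta strand", 305, 314), ("Turn", 318, 320),
    ("Helix", 322, 325),
]


def residues_per_type(elements):
    """Sum the number of residues annotated for each element type."""
    totals = Counter()
    for kind, start, end in elements:
        totals[kind] += end - start + 1
    return totals


if __name__ == "__main__":
    for kind, count in residues_per_type(ELEMENTS).most_common():
        print(f"{kind}: {count} residues")
```

On these coordinates the tally is 70 strand, 62 helix and 12 turn residues, all within the 81-327 range covered by the crystal structures.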

3D structure databases

| Entry | Method | Resolution (Å) | Chain | Positions |
| --- | --- | --- | --- | --- |
| 4IRP | X-ray | 2.10 | A/B | 81-327 |
| 4IRQ | X-ray | 2.30 | A/B/C/D | 81-327 |

ProteinModelPortal: Q9UBV7.
SMR: Q9UBV7. Positions 81-327.

Family & Domains

Region

| Feature key | Position(s) | Length | Description |
| --- | --- | --- | --- |
| Region | 100 – 104 | 5 | UDP-alpha-D-galactose binding |
| Region | 139 – 141 | 3 | UDP-alpha-D-galactose binding (By similarity) |
| Region | 164 – 165 | 2 | UDP-alpha-D-galactose binding |
| Region | 226 – 229 | 4 | N-acetyl-D-glucosamine binding (By similarity) |
| Region | 257 – 259 | 3 | UDP-alpha-D-galactose binding |

Sequence similarities

Belongs to the glycosyltransferase 7 family. (Curated)

Keywords - Domain

Signal-anchor, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: NOG305756.
GeneTree: ENSGT00760000119140.
HOGENOM: HOG000286021.
HOVERGEN: HBG050654.
InParanoid: Q9UBV7.
KO: K00733.
OMA: FRFNRAS.
OrthoDB: EOG7C5M8P.
PhylomeDB: Q9UBV7.
TreeFam: TF312834.

Family and domain databases

Gene3D: 3.90.550.10. 1 hit.
InterPro: IPR003859. Galactosyl_T.
InterPro: IPR027791. Galactosyl_T_C.
InterPro: IPR027995. Galactosyl_T_N.
InterPro: IPR029044. Nucleotide-diphossugar_trans.
PANTHER: PTHR19300. PTHR19300. 1 hit.
Pfam: PF02709. Glyco_transf_7C. 1 hit.
Pfam: PF13733. Glyco_transf_7N. 1 hit.
PRINTS: PR02050. B14GALTRFASE.
SUPFAM: SSF53448. SSF53448. 1 hit.

Sequence

Sequence status: Complete.

Q9UBV7-1 [UniParc]

        10         20         30         40         50
MFPSRRKAAQ LPWEDGRSGL LSGGLPRKCS VFHLFVACLS LGFFSLLWLQ
        60         70         80         90        100
LSCSGDVARA VRGQGQETSG PPRACPPEPP PEHWEEDASW GPHRLAVLVP
       110        120        130        140        150
FRERFEELLV FVPHMRRFLS RKKIRHHIYV LNQVDHFRFN RAALINVGFL
       160        170        180        190        200
ESSNSTDYIA MHDVDLLPLN EELDYGFPEA GPFHVASPEL HPLYHYKTYV
       210        220        230        240        250
GGILLLSKQH YRLCNGMSNR FWGWGREDDE FYRRIKGAGL QLFRPSGITT
       260        270        280        290        300
GYKTFRHLHD PAWRKRDQKR IAAQKQEQFK VDREGGLNTV KYHVASRTAL
       310        320
SVGGAPCTVL NIMLDCDKTA TPWCTFS
Length: 327
Mass (Da): 37,406
Last modified: April 30, 2000 - v1
Checksum: 2EDF51A2F8143135
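
The displayed sequence block mixes residues with position rulers and spacing, so it has to be cleaned before use. The sketch below strips everything except letters, then recomputes the length and an average molecular mass for comparison with the values listed here. The residue masses are standard average values supplied for this example and are not taken from the entry; `parse_display` and `average_mass` are hypothetical helper names. The checksum is not recomputed.

```python
# Average residue masses in Da (standard values, an assumption of this sketch).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per chain for the free N- and C-termini

DISPLAY = """
        10         20         30         40         50
MFPSRRKAAQ LPWEDGRSGL LSGGLPRKCS VFHLFVACLS LGFFSLLWLQ
        60         70         80         90        100
LSCSGDVARA VRGQGQETSG PPRACPPEPP PEHWEEDASW GPHRLAVLVP
       110        120        130        140        150
FRERFEELLV FVPHMRRFLS RKKIRHHIYV LNQVDHFRFN RAALINVGFL
       160        170        180        190        200
ESSNSTDYIA MHDVDLLPLN EELDYGFPEA GPFHVASPEL HPLYHYKTYV
       210        220        230        240        250
GGILLLSKQH YRLCNGMSNR FWGWGREDDE FYRRIKGAGL QLFRPSGITT
       260        270        280        290        300
GYKTFRHLHD PAWRKRDQKR IAAQKQEQFK VDREGGLNTV KYHVASRTAL
       310        320
SVGGAPCTVL NIMLDCDKTA TPWCTFS
"""


def parse_display(block: str) -> str:
    """Drop ruler digits and whitespace, keeping only residue letters."""
    return "".join(ch for ch in block if ch.isalpha())


def average_mass(seq: str) -> float:
    """Average molecular mass of an unmodified linear peptide, in Da."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER


if __name__ == "__main__":
    sequence = parse_display(DISPLAY)
    print("length:", len(sequence))
    print("mass (Da):", round(average_mass(sequence)))
```

Under these assumptions the parsed sequence has 327 residues and the mass rounds to 37,406 Da, matching the listed values. The calculation ignores the N-linked glycan at position 154 and the disulfide bond.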

Experimental Info

| Feature key | Position(s) | Length | Description |
| --- | --- | --- | --- |
| Sequence conflict | 147 | 1 | V → L in AAF22225 (Ref. 3) (Curated) |

Sequence databases

EMBL / GenBank / DDBJ:
AJ005382 mRNA. Translation: CAB56424.1.
AB028600 mRNA. Translation: BAA83414.1.
AF142675 mRNA. Translation: AAF22225.1.
AY358578 mRNA. Translation: AAQ88941.1.
AK023506 mRNA. Translation: BAG51201.1.
CH471195 Genomic DNA. Translation: EAW84965.1.
BC007317 mRNA. Translation: AAH07317.1.
BC062983 mRNA. Translation: AAH62983.1.
BC072403 mRNA. Translation: AAH72403.1.
CCDS: CCDS4429.1.
RefSeq: NP_009186.1. NM_007255.2.
UniGene: Hs.455109.

Genome annotation databases

Ensembl: ENST00000029410; ENSP00000029410; ENSG00000027847.
GeneID: 11285.
KEGG: hsa:11285.
UCSC: uc003mhy.3. human.

Polymorphism databases

DMDM: 13123990.

Cross-references

Web resources

GGDB: GlycoGene database
Functional Glycomics Gateway - GTase: Beta-1,4-galactosyltransferase 7

Protocols and materials databases

DNASU: 11285.

Organism-specific databases

CTD: 11285.
GeneCards: GC05P177027.
HGNC: HGNC:930. B4GALT7.
HPA: HPA042330.
MIM: 130070. phenotype.
MIM: 604327. gene.
neXtProt: NX_Q9UBV7.
Orphanet: 75496. Ehlers-Danlos syndrome, progeroid type.
Orphanet: 294049. Reunion Island's Larsen syndrome.
PharmGKB: PA25229.

Miscellaneous databases

GeneWiki: B4GALT7.
GenomeRNAi: 11285.
NextBio: 42963.
PRO: Q9UBV7.

Publications

  1. "Cloning and expression of a proteoglycan UDP-galactose:beta-xylose beta1,4-galactosyltransferase I. A seventh member of the human beta4-galactosyltransferase gene family."
    Almeida R., Levery S.B., Mandel U., Kresse H., Schwientek T., Bennett E.P., Clausen H.
    J. Biol. Chem. 274:26165-26171(1999) [PubMed] [Europe PMC] [Abstract]
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
  2. "Human homolog of Caenorhabditis elegans sqv-3 gene is galactosyltransferase I involved in the biosynthesis of the glycosaminoglycan-protein linkage region of proteoglycans."
    Okajima T., Yoshida K., Kondo T., Furukawa K.
    J. Biol. Chem. 274:22915-22918(1999) [PubMed] [Europe PMC] [Abstract]
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
    Tissue: Melanoma.
  3. "Human beta-1,4-galactosyltransferase VII."
    Lo N.-W., Shaper N.L., Shaper J.H.
    Submitted (MAR-1999) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
  4. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
  5. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
    Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S.
    , Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
    Nat. Genet. 36:40-45(2004) [PubMed] [Europe PMC] [Abstract]
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Tissue: Placenta.
  6. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  7. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004) [PubMed] [Europe PMC] [Abstract]
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Tissue: Pancreas and Skin.
  8. "Crystal structures of beta-1,4-galactosyltransferase 7 enzyme reveal conformational changes and substrate binding."
    Tsutsui Y., Ramakrishnan B., Qasba P.K.
    J. Biol. Chem. 288:31963-31970(2013) [PubMed] [Europe PMC] [Abstract]
    Cited for: X-RAY CRYSTALLOGRAPHY (2.1 ANGSTROMS) OF 81-327 IN COMPLEX WITH UDP AND MANGANESE IONS, CATALYTIC ACTIVITY, FUNCTION, COFACTOR.
  9. "Molecular basis for the progeroid variant of Ehlers-Danlos syndrome. Identification and characterization of two mutations in galactosyltransferase I gene."
    Okajima T., Fukumoto S., Furukawa K., Urano T.
    J. Biol. Chem. 274:28841-28844(1999) [PubMed] [Europe PMC] [Abstract]
    Cited for: VARIANTS EDSP1 ASP-186 AND PRO-206.
  10. "Identification and characterization of large galactosyltransferase gene families: galactosyltransferases for all functions."
    Amado M., Almeida R., Schwientek T., Clausen H.
    Biochim. Biophys. Acta 1473:35-53(1999) [PubMed] [Europe PMC] [Abstract]
    Cited for: REVIEW.

Entry information

Entry name: B4GT7_HUMAN
Accession: Primary (citable) accession number: Q9UBV7
Secondary accession number(s): B3KN39, Q9UHN2
Entry history
Integrated into UniProtKB/Swiss-Prot: February 20, 2001
Last sequence update: April 30, 2000
Last modified: March 31, 2015
This is version 145 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 5
    Human chromosome 5: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PATHWAY comments
    Index of metabolic and biosynthesis pathways
  6. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  7. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
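
The membership rule described for UniRef90 and UniRef50 combines two thresholds: a minimum sequence identity to the seed and a minimum overlap with it. The sketch below illustrates that rule on sequences that are already aligned, with "-" marking gaps. It is a simplified illustration, not the UniRef pipeline: the function names are hypothetical, and the exact definitions of identity and overlap used in production clustering may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def overlap_with_seed(aligned_member: str, aligned_seed: str) -> float:
    """Fraction of seed residues that are aligned to a member residue."""
    if len(aligned_member) != len(aligned_seed):
        raise ValueError("aligned sequences must have equal length")
    seed_residues = sum(s != "-" for s in aligned_seed)
    if seed_residues == 0:
        return 0.0
    covered = sum(
        m != "-" and s != "-" for m, s in zip(aligned_member, aligned_seed)
    )
    return covered / seed_residues


def joins_cluster(aligned_member: str, aligned_seed: str,
                  min_identity: float, min_overlap: float = 0.8) -> bool:
    """Apply both thresholds: identity to the seed and overlap with the seed."""
    return (
        percent_identity(aligned_member, aligned_seed) >= min_identity
        and overlap_with_seed(aligned_member, aligned_seed) >= min_overlap
    )


if __name__ == "__main__":
    seed = "MFPSRRKAAQ"      # toy seed: first ten residues of this entry
    variant = "MFPSRRKAAE"   # hypothetical member with one substitution
    fragment = "MFPSR-----"  # hypothetical member covering half of the seed
    print(joins_cluster(variant, seed, min_identity=90.0))
    print(joins_cluster(fragment, seed, min_identity=90.0))
```

The toy fragment is identical to the seed wherever it aligns but covers only half of it, so it fails the overlap criterion even though its identity is 100%.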