Protein

2-amino-3-ketobutyrate coenzyme A ligase, mitochondrial

Gene

GCAT

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Catalytic activity

Acetyl-CoA + glycine = CoA + 2-amino-3-oxobutanoate.

Cofactor

Pyridoxal phosphate.

Pathway: L-threonine degradation via oxydo-reductase pathway

This protein is involved in step 2 of the subpathway that synthesizes glycine from L-threonine.
Proteins known to be involved in the 2 steps of the subpathway in this organism are:
  1. no protein annotated in this organism
  2. 2-amino-3-ketobutyrate coenzyme A ligase, mitochondrial (GCAT)
This subpathway is part of the pathway L-threonine degradation via oxydo-reductase pathway, which is itself part of Amino-acid degradation.

Sites

Feature key   Position(s)  Length  Description
Binding site  159 – 159    1       Substrate (By similarity)
Binding site  206 – 206    1       Pyridoxal phosphate (By similarity)
Binding site  389 – 389    1       Substrate (By similarity)

GO - Molecular function

  • glycine C-acetyltransferase activity (Source: UniProtKB)
  • pyridoxal phosphate binding (Source: InterPro)

Keywords - Molecular function

Acyltransferase, Transferase

Keywords - Ligand

Pyridoxal phosphate

Enzyme and pathway databases

UniPathway: UPA00046; UER00506.

Names & Taxonomy

Protein names
Recommended name: 2-amino-3-ketobutyrate coenzyme A ligase, mitochondrial (EC 2.3.1.29)
Short name: AKB ligase
Alternative name(s): Aminoacetone synthase; Glycine acetyltransferase
Gene names
Name: GCAT
Synonyms: KBL
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes: UP000005640. Component: Chromosome 22

Organism-specific databases

HGNC: HGNC:4188. GCAT.

Subcellular location

  • Mitochondrion (By similarity)
  • Nucleus (1 publication)

Note: Translocates to the nucleus upon cold and osmotic stress.

Keywords - Cellular component

Mitochondrion, Nucleus

Pathology & Biotech

Organism-specific databases

PharmGKB: PA28603.

Chemistry

DrugBank: DB00145. Glycine.

Polymorphism and mutation databases

BioMuta: GCAT.

PTM / Processing

Molecule processing

Feature key      Position(s)  Length  Description                                              Feature identifier
Transit peptide  1 – 21       21      Mitochondrion (By similarity)
Chain            22 – 419     398     2-amino-3-ketobutyrate coenzyme A ligase, mitochondrial  PRO_0000001246
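
The feature lengths in this table follow from the 1-based, inclusive position ranges. As a minimal sketch (the Feature class and its field names are illustrative assumptions, not part of any UniProt tool), the arithmetic can be checked like this:

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A positional feature with 1-based, inclusive coordinates (hypothetical helper)."""
    key: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive range, so both end points count.
        return self.end - self.start + 1


# Coordinates copied from the Molecule processing table.
FEATURES = [
    Feature("Transit peptide", 1, 21),
    Feature("Chain", 22, 419),
]

# Length of the unprocessed canonical sequence, as listed in the Sequences section.
PRECURSOR_LENGTH = 419

for feature in FEATURES:
    print(feature.key, feature.length)

# The transit peptide and the mature chain together cover the whole precursor.
print(sum(f.length for f in FEATURES) == PRECURSOR_LENGTH)
```

The two ranges are contiguous and non-overlapping, which is why their lengths (21 and 398) add up to the 419 residues of the precursor.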

Amino acid modifications

Feature key       Position(s)  Length  Description
Modified residue  45 – 45      1       N6-acetyllysine; alternate (By similarity)
Modified residue  45 – 45      1       N6-succinyllysine; alternate (By similarity)
Modified residue  187 – 187    1       N6-acetyllysine; alternate (By similarity)
Modified residue  187 – 187    1       N6-succinyllysine; alternate (By similarity)
Modified residue  265 – 265    1       N6-(pyridoxal phosphate)lysine (Curated)
Modified residue  326 – 326    1       N6-succinyllysine (By similarity)
Modified residue  368 – 368    1       N6-succinyllysine (By similarity)
Modified residue  383 – 383    1       N6-acetyllysine; alternate (By similarity)
Modified residue  383 – 383    1       N6-succinyllysine; alternate (By similarity)

Keywords - PTM

Acetylation

Proteomic databases

MaxQB: O75600.
PaxDb: O75600.
PRIDE: O75600.

PTM databases

PhosphoSite: O75600.

Expression

Tissue specificity

Strongly expressed in heart, brain, liver and pancreas. Also found in lung. (1 publication)

Gene expression databases

Bgee: O75600.
CleanEx: HS_GCAT.
ExpressionAtlas: O75600. baseline and differential.
Genevisible: O75600. HS.

Organism-specific databases

HPA: HPA020460.

Interaction

Protein-protein interaction databases

BioGrid: 117027. 10 interactions.
IntAct: O75600. 3 interactions.

Structure

3D structure databases

ProteinModelPortal: O75600.
SMR: O75600. Positions 21-418.

Family & Domains

Region

Feature key  Position(s)  Length  Description
Region       134 – 135    2       Pyridoxal phosphate binding (By similarity)
Region       262 – 265    4       Pyridoxal phosphate binding (By similarity)
Region       295 – 296    2       Pyridoxal phosphate binding; shared with dimeric partner (By similarity)

Sequence similarities

Keywords - Domain

Transit peptide

Phylogenomic databases

eggNOG: COG0156.
GeneTree: ENSGT00530000063111.
HOGENOM: HOG000221022.
HOVERGEN: HBG105208.
InParanoid: O75600.
KO: K00639.
OMA: CTIMATH.
OrthoDB: EOG79GT68.
PhylomeDB: O75600.
TreeFam: TF105923.

Family and domain databases

Gene3D: 3.40.640.10. 1 hit.
    3.90.1150.10. 1 hit.
HAMAP: MF_00985. 2am3keto_CoA_ligase.
InterPro: IPR011282. 2am3keto_CoA_ligase.
    IPR001917. Aminotrans_II_pyridoxalP_BS.
    IPR004839. Aminotransferase_I/II.
    IPR015424. PyrdxlP-dep_Trfase.
    IPR015421. PyrdxlP-dep_Trfase_major_sub1.
    IPR015422. PyrdxlP-dep_Trfase_major_sub2.
Pfam: PF00155. Aminotran_1_2. 1 hit.
SUPFAM: SSF53383. SSF53383. 1 hit.
TIGRFAMs: TIGR01822. 2am3keto_CoA. 1 hit.
PROSITE: PS00599. AA_TRANSFER_CLASS_2. 1 hit.

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: O75600-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MWPGNAWRAA LFWVPRGRRA QSALAQLRGI LEGELEGIRG AGTWKSERVI
        60         70         80         90        100
TSRQGPHIRV DGVSGGILNF CANNYLGLSS HPEVIQAGLQ ALEEFGAGLS
       110        120        130        140        150
SVRFICGTQS IHKNLEAKIA RFHQREDAIL YPSCYDANAG LFEALLTPED
       160        170        180        190        200
AVLSDELNHA SIIDGIRLCK AHKYRYRHLD MADLEAKLQE AQKHRLRLVA
       210        220        230        240        250
TDGAFSMDGD IAPLQEICCL ASRYGALVFM DECHATGFLG PTGRGTDELL
       260        270        280        290        300
GVMDQVTIIN STLGKALGGA SGGYTTGPGP LVSLLRQRAR PYLFSNSLPP
       310        320        330        340        350
AVVGCASKAL DLLMGSNTIV QSMAAKTQRF RSKMEAAGFT ISGASHPICP
       360        370        380        390        400
VMLGDARLAS RMADDMLKRG IFVIGFSYPV VPKGKARIRV QISAVHSEED
       410
IDRCVEAFVE VGRLHGALP
Length: 419
Mass (Da): 45,285
Last modified: November 1, 1998 - v1
Checksum: C7760699E0474821
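
Because all positional information in this entry refers to the canonical sequence, the annotated positions can be cross-checked against the displayed residues. The sketch below is illustrative only (the helper names flatten and residue_at are assumptions): it flattens the grouped sequence, confirms the stated length, and verifies that the modified residues are lysines and that the natural variant and sequence conflict positions carry the stated reference residues.

```python
# Canonical isoform (O75600-1), copied in the grouped form displayed in this entry.
DISPLAYED = """
MWPGNAWRAA LFWVPRGRRA QSALAQLRGI LEGELEGIRG AGTWKSERVI
TSRQGPHIRV DGVSGGILNF CANNYLGLSS HPEVIQAGLQ ALEEFGAGLS
SVRFICGTQS IHKNLEAKIA RFHQREDAIL YPSCYDANAG LFEALLTPED
AVLSDELNHA SIIDGIRLCK AHKYRYRHLD MADLEAKLQE AQKHRLRLVA
TDGAFSMDGD IAPLQEICCL ASRYGALVFM DECHATGFLG PTGRGTDELL
GVMDQVTIIN STLGKALGGA SGGYTTGPGP LVSLLRQRAR PYLFSNSLPP
AVVGCASKAL DLLMGSNTIV QSMAAKTQRF RSKMEAAGFT ISGASHPICP
VMLGDARLAS RMADDMLKRG IFVIGFSYPV VPKGKARIRV QISAVHSEED
IDRCVEAFVE VGRLHGALP
"""


def flatten(block: str) -> str:
    """Keep only letters, dropping spaces, newlines and any ruler digits."""
    return "".join(ch for ch in block if ch.isalpha())


def residue_at(seq: str, position: int) -> str:
    """Return the residue at a 1-based position, as used in the feature tables."""
    return seq[position - 1]


SEQ = flatten(DISPLAYED)

# Positions listed under "Amino acid modifications"; all are lysine modifications.
MODIFIED_LYSINES = [45, 187, 265, 326, 368, 383]

# Reference residues for the natural variants (39, 100) and the isoform 1 sequence conflict (387).
REFERENCE_RESIDUES = {39: "R", 100: "S", 387: "R"}

print(len(SEQ))  # 419
print(all(residue_at(SEQ, p) == "K" for p in MODIFIED_LYSINES))  # True
print(all(residue_at(SEQ, p) == aa for p, aa in REFERENCE_RESIDUES.items()))  # True
```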
Isoform 2 (identifier: O75600-2)

The sequence of this isoform differs from the canonical sequence as follows:
     65-65: G → GGPGTVIFPGLPLPHLSCCIHLLSFTS

Note: No experimental confirmation available. (Curated)
Length: 445
Mass (Da): 47,974
Checksum: 5AD82BF1119ADC87
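
The isoform 2 length follows from the single replacement at position 65 of the canonical sequence. As a hedged sketch (apply_replacement is a hypothetical helper, and only the first 70 canonical residues are used to keep the example short), the substitution and the resulting length can be reproduced as follows:

```python
def apply_replacement(seq: str, start: int, end: int, replacement: str) -> str:
    """Replace the 1-based, inclusive range start..end of seq with replacement."""
    return seq[:start - 1] + replacement + seq[end:]


# First 70 residues of the canonical isoform (O75600-1), taken from this entry.
CANONICAL_PREFIX = (
    "MWPGNAWRAALFWVPRGRRAQSALAQLRGILEGELEGIRGAGTWKSERVI"
    "TSRQGPHIRVDGVSGGILNF"
)

# Replacement described for isoform 2: position 65, G -> this 27-residue segment.
INSERT = "GGPGTVIFPGLPLPHLSCCIHLLSFTS"

CANONICAL_LENGTH = 419

isoform2_prefix = apply_replacement(CANONICAL_PREFIX, 65, 65, INSERT)

# One residue is removed and len(INSERT) residues are added.
isoform2_length = CANONICAL_LENGTH - 1 + len(INSERT)

print(CANONICAL_PREFIX[64])  # G, the residue being replaced
print(isoform2_length)       # 445
```

The computed value agrees with the length of 445 listed for isoform 2.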

Sequence caution

The sequence BAC85552.1 differs from that shown. Reason: Frameshift at position 232. (Curated)

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  387 – 387    1       R → W in AAH14457 (PubMed:15489334). (Curated)
Isoform 2 (identifier: O75600-2)
Sequence conflict  77 – 77      1       L → S in BAC85552 (PubMed:14702039). (Curated)

Natural variant

Feature key      Position(s)  Length  Description                                     Feature identifier
Natural variant  39 – 39      1       R → C. Corresponds to variant rs710187.         VAR_015094
Natural variant  100 – 100    1       S → N. Corresponds to variant rs34468367.       VAR_048229

Alternative sequence

Feature key           Position(s)  Length  Description                                                          Feature identifier
Alternative sequence  65 – 65      1       G → GGPGTVIFPGLPLPHLSCCIHLLSFTS in isoform 2. (1 publication)        VSP_044607

Sequence databases

AF077740 mRNA. Translation: AAC27720.1.
AK123190 mRNA. Translation: BAC85552.1. Frameshift.
Z97630 Genomic DNA. Translation: CAB42830.1.
Z97630 Genomic DNA. Translation: CAX14890.1.
BC014457 mRNA. Translation: AAH14457.1.
CCDS: CCDS13957.1. [O75600-1]
    CCDS54527.1. [O75600-2]
RefSeq: NP_001165161.1. NM_001171690.1. [O75600-2]
    NP_055106.1. NM_014291.3. [O75600-1]
    XP_005261467.1. XM_005261410.3. [O75600-2]
UniGene: Hs.54609.

Genome annotation databases

Ensembl: ENST00000248924; ENSP00000248924; ENSG00000100116.
    ENST00000323205; ENSP00000371110; ENSG00000100116. [O75600-2]
GeneID: 23464.
KEGG: hsa:23464.
UCSC: uc003atz.3. human. [O75600-1]
    uc003aua.2. human.

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Organism-specific databases

CTD: 23464.
GeneCards: GC22P038203.
HGNC: HGNC:4188. GCAT.
HPA: HPA020460.
MIM: 607422. gene.
neXtProt: NX_O75600.
PharmGKB: PA28603.

Miscellaneous databases

GenomeRNAi: 23464.
NextBio: 45781.
PRO: O75600.

Publications

  1. "Molecular cloning of the human and murine 2-amino-3-ketobutyrate coenzyme A ligase cDNAs."
    Edgar A.J., Polak J.M.
    Eur. J. Biochem. 267:1805-1812(2000)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1), TISSUE SPECIFICITY.
    Tissue: Lung.
  2. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
    Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
    Nat. Genet. 36:40-45(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).
    Tissue: Cerebellum.
  3. "The DNA sequence of human chromosome 22."
    Dunham I., Hunt A.R., Collins J.E., Bruskiewich R., Beare D.M., Clamp M., Smink L.J., Ainscough R., Almeida J.P., Babbage A.K., Bagguley C., Bailey J., Barlow K.F., Bates K.N., Beasley O.P., Bird C.P., Blakey S.E., Bridgeman A.M., Buck D., Burgess J., Burrill W.D., Burton J., Carder C., Carter N.P., Chen Y., Clark G., Clegg S.M., Cobley V.E., Cole C.G., Collier R.E., Connor R., Conroy D., Corby N.R., Coville G.J., Cox A.V., Davis J., Dawson E., Dhami P.D., Dockree C., Dodsworth S.J., Durbin R.M., Ellington A.G., Evans K.L., Fey J.M., Fleming K., French L., Garner A.A., Gilbert J.G.R., Goward M.E., Grafham D.V., Griffiths M.N.D., Hall C., Hall R.E., Hall-Tamlyn G., Heathcott R.W., Ho S., Holmes S., Hunt S.E., Jones M.C., Kershaw J., Kimberley A.M., King A., Laird G.K., Langford C.F., Leversha M.A., Lloyd C., Lloyd D.M., Martyn I.D., Mashreghi-Mohammadi M., Matthews L.H., Mccann O.T., Mcclay J., Mclaren S., McMurray A.A., Milne S.A., Mortimore B.J., Odell C.N., Pavitt R., Pearce A.V., Pearson D., Phillimore B.J.C.T., Phillips S.H., Plumb R.W., Ramsay H., Ramsey Y., Rogers L., Ross M.T., Scott C.E., Sehra H.K., Skuce C.D., Smalley S., Smith M.L., Soderlund C., Spragon L., Steward C.A., Sulston J.E., Swann R.M., Vaudin M., Wall M., Wallis J.M., Whiteley M.N., Willey D.L., Williams L., Williams S.A., Williamson H., Wilmer T.E., Wilming L., Wright C.L., Hubbard T., Bentley D.R., Beck S., Rogers J., Shimizu N., Minoshima S., Kawasaki K., Sasaki T., Asakawa S., Kudoh J., Shintani A., Shibuya K., Yoshizaki Y., Aoki N., Mitsuyama S., Roe B.A., Chen F., Chu L., Crabtree J., Deschamps S., Do A., Do T., Dorman A., Fang F., Fu Y., Hu P., Hua A., Kenton S., Lai H., Lao H.I., Lewis J., Lewis S., Lin S.-P., Loh P., Malaj E., Nguyen T., Pan H., Phan S., Qi S., Qian Y., Ray L., Ren Q., Shaull S., Sloan D., Song L., Wang Q., Wang Y., Wang Z., White J., Willingham D., Wu H., Yao Z., Zhan M., Zhang G., Chissoe S., Murray J., Miller N., Minx P., Fulton R., Johnson D., Bemis G., Bentley D., Bradshaw H., Bourne S., Cordes M., Du Z., Fulton L., Goela D., Graves T., Hawkins J., Hinds K., Kemp K., Latreille P., Layman D., Ozersky P., Rohlfing T., Scheet P., Walker C., Wamsley A., Wohldmann P., Pepin K., Nelson J., Korf I., Bedell J.A., Hillier L.W., Mardis E., Waterston R., Wilson R., Emanuel B.S., Shaikh T., Kurahashi H., Saitta S., Budarf M.L., McDermid H.E., Johnson A., Wong A.C.C., Morrow B.E., Edelmann L., Kim U.J., Shizuya H., Simon M.I., Dumanski J.P., Peyrard M., Kedra D., Seroussi E., Fransson I., Tapia I., Bruder C.E., O'Brien K.P., Wilkinson P., Bodenteich A., Hartman K., Hu X., Khan A.S., Lane L., Tilahun Y., Wright H.
    Nature 402:489-495(1999)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  4. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
    Tissue: Skin.
  5. "Nuclear translocation of 2-amino-3-ketobutyrate coenzyme A ligase by cold and osmotic stress."
    Hoshino A., Fujii H.
    Cell Stress Chaperones 12:186-191(2007)
    Cited for: SUBCELLULAR LOCATION.
  6. Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  7. "An enzyme assisted RP-RPLC approach for in-depth analysis of human liver phosphoproteome."
    Bian Y., Song C., Cheng K., Dong M., Wang F., Huang J., Sun D., Wang L., Ye M., Zou H.
    J. Proteomics 96:253-262(2014)
    Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Liver.

Entry information

Entry name: KBL_HUMAN
Accession: Primary (citable) accession number: O75600
Secondary accession number(s): E2QC23, Q6ZWF1, Q96CA9
Entry history
Integrated into UniProtKB/Swiss-Prot: May 30, 2000
Last sequence update: November 1, 1998
Last modified: July 22, 2015
This is version 146 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 22
    Human chromosome 22: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PATHWAY comments
    Index of metabolic and biosynthesis pathways
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.