Protein

Thiazole synthase

Gene

thiG

Organism
Escherichia coli (strain K12)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Catalyzes the rearrangement of 1-deoxy-D-xylulose 5-phosphate (DXP) to produce the thiazole phosphate moiety of thiamine. Sulfur is provided by the thiocarboxylate moiety of the carrier protein ThiS. In vitro, sulfur can be provided by H2S. (1 publication)

Catalytic activity

1-deoxy-D-xylulose 5-phosphate + 2-iminoacetate + thiocarboxy-adenylate-[sulfur-carrier protein ThiS] = 2-((2R,5Z)-2-carboxy-4-methylthiazol-5(2H)-ylidene)ethyl phosphate + [sulfur-carrier protein ThiS] + 2 H2O.

Pathway: thiamine diphosphate biosynthesis

This protein is involved in the pathway thiamine diphosphate biosynthesis, which is part of Cofactor biosynthesis.

Sites

Feature key    Position(s)  Length  Description
Active site    95 – 95      1       Schiff-base intermediate with DXP (By similarity)
Binding site   156 – 156    1       DXP; via amide nitrogen (By similarity)

GO - Molecular function

GO - Biological process

  • thiamine biosynthetic process Source: EcoCyc
  • thiamine diphosphate biosynthetic process Source: UniProtKB-UniPathway

Keywords - Molecular function

Transferase

Keywords - Biological process

Thiamine biosynthesis

Keywords - Ligand

Schiff base

Enzyme and pathway databases

BioCyc: EcoCyc:THIG-MONOMER.
    ECOL316407:JW5549-MONOMER.
    MetaCyc:THIG-MONOMER.
BRENDA: 2.8.1.10. 2026.
UniPathway: UPA00060.

Names & Taxonomy

Protein names
Recommended name: Thiazole synthase (EC 2.8.1.10)
Gene names
Name: thiG
Ordered Locus Names: b3991, JW5549
Organism: Escherichia coli (strain K12)
Taxonomic identifier: 83333 [NCBI]
Taxonomic lineage: Bacteria > Proteobacteria > Gammaproteobacteria > Enterobacteriales > Enterobacteriaceae > Escherichia
Proteomes
  • UP000000318 Component: Chromosome
  • UP000000625 Component: Chromosome

Organism-specific databases

EcoGene: EG11589. thiG.

Subcellular location

GO - Cellular component

  • cytosol Source: EcoCyc

Keywords - Cellular component

Cytoplasm

PTM / Processing

Molecule processing

Feature key  Position(s)  Length  Description        Feature identifier
Chain        1 – 256      256     Thiazole synthase  PRO_0000162815

Proteomic databases

EPD: P30139.
PaxDb: P30139.

Interaction

Subunit structure

Homotetramer. Forms heterodimers with either ThiH or ThiS. (1 publication)

Binary interactions

With  Entry   #Exp.  IntAct
thiH  P30140  2      EBI-547059, EBI-1125553

Protein-protein interaction databases

BioGrid: 4262656. 16 interactions.
DIP: DIP-6868N.
IntAct: P30139. 10 interactions.
MINT: MINT-1289283.
STRING: 511145.b3991.

Structure

3D structure databases

ProteinModelPortal: P30139.
SMR: P30139. Positions 11-235.

Family & Domains

Region

Feature key  Position(s)  Length  Description
Region       182 – 183    2       DXP binding (By similarity)
Region       204 – 205    2       DXP binding (By similarity)

Sequence similarities

Belongs to the ThiG family. (Curated)

Phylogenomic databases

eggNOG: ENOG4105CA8. Bacteria.
    COG2022. LUCA.
HOGENOM: HOG000248049.
InParanoid: P30139.
KO: K03149.
OMA: AQYPSPA.
OrthoDB: EOG6KMBD9.
PhylomeDB: P30139.

Family and domain databases

Gene3D: 3.20.20.70. 1 hit.
HAMAP: MF_00443. ThiG.
InterPro: IPR013785. Aldolase_TIM.
    IPR008867. ThiG.
SUPFAM: SSF110399. SSF110399. 1 hit.

Sequence

Sequence status: Complete.

P30139-1 [UniParc]

        10         20         30         40         50
MLRIADKTFD SHLFTGTGKF ASSQLMVEAI RASGSQLVTL AMKRVDLRQH
        60         70         80         90        100
NDAILEPLIA AGVTLLPNTS GAKTAEEAIF AAHLAREALG TNWLKLEIHP
       110        120        130        140        150
DARWLLPDPI ETLKAAETLV QQGFVVLPYC GADPVLCKRL EEVGCAAVMP
       160        170        180        190        200
LGAPIGSNQG LETRAMLEII IQQATVPVVV DAGIGVPSHA AQALEMGADA
       210        220        230        240        250
VLVNTAIAVA DDPVNMAKAF RLAVEAGLLA RQSGPGSRSY FAHATSPLTG

FLEASA
Length: 256
Mass (Da): 26,896
Last modified: August 29, 2001 - v3
Checksum: 8B0BDEE617104E38
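As a quick consistency check, the sequence above can be compared against the stated length and the annotated feature positions. The following is a minimal Python sketch; the names THIG_SEQ, FEATURES and residues_at are illustrative assumptions and are not part of the entry.

```python
# Sequence of P30139 as listed in this entry; spaces and line breaks are stripped.
THIG_SEQ = "".join("""
MLRIADKTFD SHLFTGTGKF ASSQLMVEAI RASGSQLVTL AMKRVDLRQH
NDAILEPLIA AGVTLLPNTS GAKTAEEAIF AAHLAREALG TNWLKLEIHP
DARWLLPDPI ETLKAAETLV QQGFVVLPYC GADPVLCKRL EEVGCAAVMP
LGAPIGSNQG LETRAMLEII IQQATVPVVV DAGIGVPSHA AQALEMGADA
VLVNTAIAVA DDPVNMAKAF RLAVEAGLLA RQSGPGSRSY FAHATSPLTG
FLEASA
""".split())

# (label, start, end) with 1-based inclusive positions,
# taken from the Sites and Region tables of this entry.
FEATURES = [
    ("Active site", 95, 95),
    ("Binding site", 156, 156),
    ("DXP-binding region", 182, 183),
    ("DXP-binding region", 204, 205),
]


def residues_at(seq: str, start: int, end: int) -> str:
    """Return the residues spanning 1-based inclusive positions start..end."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"positions {start}-{end} fall outside 1-{len(seq)}")
    return seq[start - 1:end]


if __name__ == "__main__":
    print("length:", len(THIG_SEQ))
    for label, start, end in FEATURES:
        print(f"{label} {start}-{end}: {residues_at(THIG_SEQ, start, end)}")
```

With the sequence as transcribed here, position 95 is a lysine (K), the residue type that typically forms Schiff-base intermediates.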

Sequence caution

The sequence AAC43089.1 differs from that shown. Reason: Erroneous initiation. (Curated)

Mass spectrometry

Molecular mass is 26893.3 Da from positions 1 - 256. Determined by ESI. (1 publication)
Molecular mass is 26896.5 Da from positions 1 - 256. Determined by ESI. (1 publication)
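The two ESI measurements can be compared with the mass listed for the sequence (26,896 Da). The sketch below works out the deviation in Da and in parts per million. It assumes the rounded sequence mass from this entry is an adequate reference, and the names SEQUENCE_MASS_DA, MEASURED_DA and mass_error are hypothetical.

```python
# Rounded mass listed for the 256-residue sequence in this entry (Da).
SEQUENCE_MASS_DA = 26896.0

# ESI measurements reported in the mass spectrometry section (Da).
MEASURED_DA = [26893.3, 26896.5]


def mass_error(measured: float, reference: float) -> tuple[float, float]:
    """Return (absolute deviation in Da, relative deviation in ppm)."""
    delta = measured - reference
    ppm = delta / reference * 1e6
    return delta, ppm


errors = [mass_error(m, SEQUENCE_MASS_DA) for m in MEASURED_DA]

if __name__ == "__main__":
    for measured, (delta, ppm) in zip(MEASURED_DA, errors):
        print(f"{measured:.1f} Da: {delta:+.1f} Da ({ppm:+.0f} ppm)")
```

Because the reference value is rounded to the nearest dalton, the ppm figures are only indicative.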

Sequence databases

EMBL / GenBank / DDBJ:
    M88701 Genomic DNA. Translation: AAB95621.1.
    U00006 Genomic DNA. Translation: AAC43089.1. Different initiation.
    U00096 Genomic DNA. Translation: AAC76965.2.
    AP009048 Genomic DNA. Translation: BAE77329.1.
PIR: B65206.
RefSeq: NP_418418.2. NC_000913.3.
    WP_000944104.1. NZ_LN832404.1.

Genome annotation databases

EnsemblBacteria: AAC76965; AAC76965; b3991.
    BAE77329; BAE77329; BAE77329.
GeneID: 948493.
KEGG: ecj:JW5549.
    eco:b3991.
PATRIC: 32123503. VBIEscCol129921_4104.

Cross-references

Organism-specific databases

EchoBASE: EB1547.
EcoGene: EG11589. thiG.

Miscellaneous databases

PRO: P30139.

Publications

  1. "Structural genes for thiamine biosynthetic enzymes (thiCEFGH) in Escherichia coli K-12."
    Vander Horn P.B., Backstrom A.D., Stewart V., Begley T.P.
    J. Bacteriol. 175:982-992(1993)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA].
    Strain: K12.
  2. "Analysis of the Escherichia coli genome. IV. DNA sequence of the region from 89.2 to 92.8 minutes."
    Blattner F.R., Burland V.D., Plunkett G. III, Sofia H.J., Daniels D.L.
    Nucleic Acids Res. 21:5408-5417(1993)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: K12 / MG1655 / ATCC 47076.
  3. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: K12 / MG1655 / ATCC 47076.
  4. "Highly accurate genome sequences of Escherichia coli K-12 strains MG1655 and W3110."
    Hayashi K., Morooka N., Yamamoto Y., Fujita K., Isono K., Choi S., Ohtsubo E., Baba T., Wanner B.L., Mori H., Horiuchi T.
    Mol. Syst. Biol. 2:E1-E5(2006)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: K12 / W3110 / ATCC 27325 / DSM 5911.
  5. "Thiamine biosynthesis in Escherichia coli: isolation and initial characterisation of the ThiGH complex."
    Leonardi R., Fairhurst S.A., Kriek M., Lowe D.J., Roach P.L.
    FEBS Lett. 539:95-99(2003)
    Cited for: PROTEIN SEQUENCE OF 1-7, FUNCTION, MASS SPECTROMETRY, SUBUNIT, INTERACTION WITH THIH.
  6. "Efficient sequence analysis of the six gene products (7-74 kDa) from the Escherichia coli thiamin biosynthetic operon by tandem high-resolution mass spectrometry."
    Kelleher N.L., Taylor S.V., Grannis D., Kinsland C., Chiu H.-J., Begley T.P., McLafferty F.W.
    Protein Sci. 7:1796-1801(1998)
    Cited for: MASS SPECTROMETRY.

Entry information

Entry name: THIG_ECOLI
Accession
Primary (citable) accession number: P30139
Secondary accession number(s): P76779, Q2M8S7
Entry history
Integrated into UniProtKB/Swiss-Prot: April 1, 1993
Last sequence update: August 29, 2001
Last modified: July 6, 2016
This is version 130 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Prokaryotic Protein Annotation Program
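When handling the identifiers above in scripts, it can help to tell accession numbers apart from entry names. The sketch below is a minimal Python check. The regular expression follows the accession-number pattern described in the UniProt help pages as recalled here, so treat it as an assumption to verify against the current documentation; the names ACCESSION_RE and is_accession are hypothetical.

```python
import re

# Assumed UniProtKB accession pattern (6 or 10 characters); verify against
# the current UniProt documentation before relying on it.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_accession(identifier: str) -> bool:
    """Return True if the whole string matches the assumed accession pattern."""
    return ACCESSION_RE.fullmatch(identifier) is not None


if __name__ == "__main__":
    # Primary and secondary accessions of this entry, plus its entry name.
    for identifier in ["P30139", "P76779", "Q2M8S7", "THIG_ECOLI"]:
        print(identifier, is_accession(identifier))
```

The entry name THIG_ECOLI is deliberately included as a negative case: it is a mnemonic identifier, not an accession number.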

Miscellaneous

Keywords - Technical term

Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Escherichia coli
    Escherichia coli (strain K12): entries and cross-references to EcoGene
  2. PATHWAY comments
    Index of metabolic and biosynthesis pathways
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
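The UniRef90 and UniRef50 thresholds described above can be written as a simple membership test. This is only a sketch of the stated criteria, not the actual UniRef clustering procedure: it assumes the identity and overlap fractions relative to the seed sequence have already been computed by an alignment tool, it does not model UniRef100, and the names ClusterRule, RULES and meets_threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterRule:
    min_identity: float  # minimum sequence identity to the seed (fraction)
    min_overlap: float   # minimum overlap with the seed (fraction)


# Thresholds as stated in the text above.
RULES = {
    "UniRef90": ClusterRule(min_identity=0.90, min_overlap=0.80),
    "UniRef50": ClusterRule(min_identity=0.50, min_overlap=0.80),
}


def meets_threshold(level: str, identity: float, overlap: float) -> bool:
    """Check precomputed identity/overlap fractions against a UniRef level."""
    rule = RULES[level]
    return identity >= rule.min_identity and overlap >= rule.min_overlap


if __name__ == "__main__":
    # Hypothetical alignment result: 93% identity over 85% of the seed.
    print(meets_threshold("UniRef90", 0.93, 0.85))
    # Hypothetical result with sufficient identity but too little overlap.
    print(meets_threshold("UniRef50", 0.70, 0.60))
```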