P10476 (GUNA_CELJU) Reviewed, UniProtKB/Swiss-Prot

Last modified January 25, 2012. Version 95.

Names and origin

Protein names
Recommended name: Endoglucanase A
Short name: EGA
EC: 3.2.1.4
Alternative name(s): Cellulase; Endo-1,4-beta-glucanase
Gene names
Name: celA
Synonyms: cel9A
Ordered locus names: CJA_2472
Organism: Cellvibrio japonicus (strain Ueda107) (Pseudomonas fluorescens subsp. cellulosa) [Complete proteome] [HAMAP]
Taxonomic identifier: 498211 [NCBI]
Taxonomic lineage: Bacteria > Proteobacteria > Gammaproteobacteria > Pseudomonadales > Pseudomonadaceae > Cellvibrio

Protein attributes

Sequence length: 962 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Inferred from homology

General annotation (Comments)

Catalytic activity

Endohydrolysis of (1->4)-beta-D-glucosidic linkages in cellulose, lichenin and cereal beta-D-glucans.

Sequence similarities

Belongs to the glycosyl hydrolase 9 (cellulase E) family.

Contains 1 CBM2 (carbohydrate binding type-2) domain.

Ontologies

Sequence annotation (Features)

Feature key         Position(s)  Length  Description      Feature identifier

Molecule processing

Signal peptide      1 – 32       32      Potential
Chain               33 – 962     930     Endoglucanase A  PRO_0000007957

Regions

Domain              859 – 962    104     CBM2
Compositional bias  608 – 664    57      Ser-rich
Compositional bias  823 – 859    37      Ser-rich (linker)

Sites

Active site         523          1       By similarity
Active site         573          1       By similarity
Active site         582          1       By similarity

Amino acid modifications

Disulfide bond      671 ↔ 702            By similarity
Disulfide bond      681 ↔ 696            By similarity
Disulfide bond      866 ↔ 961            By similarity
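
The positions in the feature table are 1-based and inclusive, so a feature's length is end minus start plus one, and slicing it out of a Python string needs the start shifted down by one. The sketch below illustrates this on the first 60 residues of the sequence shown in this entry. The helper names `feature_length` and `feature_slice` are hypothetical and are not part of any UniProt tool.

```python
# First 60 residues of P10476, copied from the sequence block of this entry.
PREFIX = "MINRSVLKIPALVKPLVQALVLVGCTLGVAQAEVGNPRVNQLGYIPNGDRIAVYKASNNS"


def feature_length(start, end):
    """Length of a feature whose start and end are 1-based and inclusive."""
    return end - start + 1


def feature_slice(seq, start, end):
    """Return residues start..end (1-based, inclusive) of seq."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"range {start}-{end} is outside 1-{len(seq)}")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]


# Signal peptide (1 – 32) and the start of the mature chain (33 onward).
signal_peptide = feature_slice(PREFIX, 1, 32)
mature_prefix = feature_slice(PREFIX, 33, len(PREFIX))

print(feature_length(33, 962))  # chain length, 930 as listed in the table
print(signal_peptide)
print(mature_prefix)
```

The same arithmetic reproduces the other listed lengths, for example 104 for the CBM2 domain at 859 – 962.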

Experimental info

Sequence conflict   14           1       K → T in CAA31082. Ref.1
Sequence conflict   25           1       C → G in CAA31082. Ref.1
Sequence conflict   503          1       S → I in CAA31082. Ref.1
Sequence conflict   507          1       S → P in CAA31082. Ref.1
Sequence conflict   555          1       L → F in CAA31082. Ref.1
Sequence conflict   894          1       G → R in CAA31082. Ref.1
Sequence conflict   951          1       A → R in CAA31082. Ref.1
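
Each sequence conflict is a single-residue substitution relative to the displayed sequence. As a rough sketch, the listed substitutions can be applied to the displayed sequence to approximate the CAA31082 translation, assuming these seven positions are the only differences. The example uses only the first 60 residues, so only positions 14 and 25 fall inside the fragment. The function name `apply_conflicts` is hypothetical.

```python
# First 60 residues of P10476, copied from the sequence block of this entry.
PREFIX = "MINRSVLKIPALVKPLVQALVLVGCTLGVAQAEVGNPRVNQLGYIPNGDRIAVYKASNNS"

# (1-based position, residue in this entry, residue in CAA31082),
# taken from the sequence conflict rows above.
CONFLICTS = [
    (14, "K", "T"),
    (25, "C", "G"),
    (503, "S", "I"),
    (507, "S", "P"),
    (555, "L", "F"),
    (894, "G", "R"),
    (951, "A", "R"),
]


def apply_conflicts(seq, conflicts):
    """Apply single-residue substitutions; skip those beyond the fragment."""
    residues = list(seq)
    for pos, ref, alt in conflicts:
        if pos > len(residues):
            continue  # conflict lies outside the supplied fragment
        if residues[pos - 1] != ref:
            raise ValueError(f"expected {ref} at {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = alt
    return "".join(residues)


variant_prefix = apply_conflicts(PREFIX, CONFLICTS)
print(variant_prefix)
```

Checking the reference residue before substituting guards against off-by-one errors when the positions are copied by hand.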

Sequences

Sequence            Length   Mass (Da)
P10476 [UniParc]    962      100,109

Last modified April 20, 2010. Version 2.
Checksum: 95212655950CB52A

        10         20         30         40         50         60 
MINRSVLKIP ALVKPLVQAL VLVGCTLGVA QAEVGNPRVN QLGYIPNGDR IAVYKASNNS 

        70         80         90        100        110        120 
AQTWQLTHNG SLIASGQTIP KGSDASSGDN IHHIDLSSVT ATGSGFTLTV GGDSSYPFSI 

       130        140        150        160        170        180 
SSTTFNAAFY DALKYFYHNR SGIAIETPYT GGGRGSYASH SRWSRPAGHL NQGANKGDMN 

       190        200        210        220        230        240 
VPCWSGTCNY SLNVTKGWYD AGDHGKYVVN GGISVWTLLN LYERAQHITG NLAAVADGSM 

       250        260        270        280        290        300 
NIPESGNGVA DILDEARWQM EFMLAMQVPQ GQAKAGMAHH KIHDVGWTGL PLAPHEDPQQ 

       310        320        330        340        350        360 
RALVPPSTAA TLNLAATAAQ AARIWKDIDA GFAALCLTAA ERAWNAAQAN PNDIYSGNYD 

       370        380        390        400        410        420 
NGGGGYGDRF VADEFYWAAA ELYITTGDSR YLPTINNYTL ERTDFGWPDT ELLGVMSLAV 

       430        440        450        460        470        480 
VPATHTNSLR IAARNHIQTI ASTHLTTQSA SGYPAPLSSL EYYWGSNSVI ANKLVLMGLA 

       490        500        510        520        530        540 
YDFSGNQNFA LGVSKGINYL FGSNVLSTSF ITGLGTNTVA QPHHRFWAGA LNSNYPWAPP 

       550        560        570        580        590        600 
GALSGGPNAG LEDSLSASRL SGCTSRPATC WLDSIDAWST NEITINWNAP LAWVLGFYND 

       610        620        630        640        650        660 
FAATQGGSSS SSSSSSSSVP VSSSSSSSII PSSSSSSIQP SSSSSSMPSS SSSSSSVVAS 

       670        680        690        700        710        720 
SSSSVSGGLR CNWYGTLYPL CVTTQSGWGW ENSQSCISAS TCSAQPAPYG IVGAASSSSQ 

       730        740        750        760        770        780 
AANRSPTLQL SANATGFEGG SMVCCTLHIN GAASDPDGDN LTYSWQVISG NTVVASGSSS 

       790        800        810        820        830        840 
SASIHVSNQR GYEVSMTVSD GRGGVATETT FVSVYFSDYF PGSSSSASNI NSSSSSSSSS 

       850        860        870        880        890        900 
SSSAIVSSSS SVVSSSSSSA ASGGNCQYVV TNQWNNGFTA VIRVRNNGSS AINGWSVNWS 

       910        920        930        940        950        960 
YSDGSRITNS WNANVTGNNP YAASALGWNA NIQPGQTAEF GFQGTKGAGS AQVPAVTGSV 


CQ 
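
The sequence above is displayed in blocks of ten residues with a position ruler above each row. A minimal sketch for recovering a plain residue string from such a block follows. The function name `strip_ruler` is hypothetical, and the sample holds only the first two rows of this entry.

```python
# First two rows of the displayed sequence, including their ruler lines.
SAMPLE = """\
        10         20         30         40         50         60
MINRSVLKIP ALVKPLVQAL VLVGCTLGVA QAEVGNPRVN QLGYIPNGDR IAVYKASNNS

        70         80         90        100        110        120
AQTWQLTHNG SLIASGQTIP KGSDASSGDN IHHIDLSSVT ATGSGFTLTV GGDSSYPFSI
"""


def strip_ruler(block):
    """Join residue groups, dropping blank lines and numeric ruler lines."""
    residues = []
    for line in block.splitlines():
        tokens = line.split()
        # Ruler lines consist only of position numbers; skip them and blanks.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.extend(tokens)
    return "".join(residues)


sequence = strip_ruler(SAMPLE)
print(len(sequence))
print(sequence)
```

Applied to the full block, the result should be 962 residues long, matching the sequence length stated in this entry.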

References

[1] "The nucleotide sequence of a carboxymethylcellulase gene from Pseudomonas fluorescens subsp. cellulosa."
Hall J., Gilbert H.J.
Mol. Gen. Genet. 213:112-117(1988) [PubMed: 2851699]
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA].

[2] "Insights into plant cell wall degradation from the genome sequence of the soil bacterium Cellvibrio japonicus."
DeBoy R.T., Mongodin E.F., Fouts D.E., Tailford L.E., Khouri H., Emerson J.B., Mohamoud Y., Watkins K., Henrissat B., Gilbert H.J., Nelson K.E.
J. Bacteriol. 190:5455-5463(2008) [PubMed: 18556790]
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
Strain: Ueda107.

Cross-references

Sequence databases

EMBL/GenBank/DDBJ: X12570 Genomic DNA. Translation: CAA31082.1.
                   CP000934 Genomic DNA. Translation: ACE85757.1.
RefSeq: YP_001982933.1. NC_010995.1.

3D structure databases

ProteinModelPortal: P10476.

Protein family/group databases

CAZy: CBM10. Carbohydrate-Binding Module Family 10.
      CBM2. Carbohydrate-Binding Module Family 2.
      GH9. Glycoside Hydrolase Family 9.

Genome annotation databases

GeneID: 6416717.
GenomeReviews: Gene locus CJA_2472 in contig CP000934_GR.
KEGG: cja:CJA_2472.
PATRIC: 21328310. VBICelJap122165_2424.

Phylogenomic databases

ProtClustDB: CLSK2312845.

Family and domain databases

InterPro:
IPR008928. 6-hairpin_glycosidase-like.
IPR012341. 6hp_glycosidase.
IPR008965. Carb-bd_dom.
IPR012291. CBD_carb-bd_dom.
IPR002883. CBM10/Dockerin_dom.
IPR018366. CBM2_CS.
IPR009031. CBM_fam10.
IPR001919. Cellulose-bd_dom_fam2_bac.
IPR001701. Glyco_hydro_9.
IPR018221. Glyco_hydro_9_AS.
IPR004197. Glyco_hydro_9_Ig-like.
IPR013783. Ig-like_fold.
IPR014756. Ig_E-set.
IPR000601. PKD_dom.
Gene3D:
G3DSA:2.60.40.290. CBD_carb_bd. 1 hit.
G3DSA:1.50.10.10. CelA/Cel48F_cat. 1 hit.
G3DSA:2.60.40.10. Ig-like_fold. 1 hit.
G3DSA:2.30.32.30. TypeX_cellulose-bd_reg_CBDX. 1 hit.
Pfam:
PF02013. CBM_10. 1 hit.
PF00553. CBM_2. 1 hit.
PF02927. CelD_N. 1 hit.
PF00759. Glyco_hydro_9. 1 hit.
PF00801. PKD. 1 hit.
SMART:
SM00637. CBD_II. 1 hit.
SM01064. CBM_10. 1 hit.
SUPFAM:
SSF57615. CBDX. 1 hit.
SSF49384. Cellul_bind. 1 hit.
SSF48208. Glyco_trans_6hp. 1 hit.
SSF81296. Ig_E-set. 1 hit.
SSF49299. PKD. 1 hit.
PROSITE:
PS51173. CBM2. 1 hit.
PS00561. CBM2_A. 1 hit.
PS00592. GLYCOSYL_HYDROL_F9_1. 1 hit.
PS00698. GLYCOSYL_HYDROL_F9_2. 1 hit.

Entry information

Entry name: GUNA_CELJU
Accession: Primary (citable) accession number: P10476
Secondary accession number(s): B3PKK4
Entry history:
Integrated into UniProtKB/Swiss-Prot: July 1, 1989
Last sequence update: April 20, 2010
Last modified: January 25, 2012
This is version 95 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Prokaryotic Protein Annotation Program

Relevant documents

Glycosyl hydrolases

Classification of glycosyl hydrolase families and list of entries

SIMILARITY comments

Index of protein domains and families