P50429 (ARSB_MOUSE) Reviewed, UniProtKB/Swiss-Prot

Last modified July 9, 2014. Version 123.

Names and origin

Protein names
    Recommended name: Arylsulfatase B
        Short name=ASB
        EC=3.1.6.12
    Alternative name(s): N-acetylgalactosamine-4-sulfatase
        Short name=G4S
Gene names
    Name: Arsb
    Synonyms: As1, As1-s
Organism: Mus musculus (Mouse) [Reference proteome]
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi; Muroidea; Muridae; Murinae; Mus; Mus

Protein attributes

Sequence length: 534 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at transcript level

General annotation (Comments)

Catalytic activity

Hydrolysis of the 4-sulfate groups of the N-acetyl-D-galactosamine 4-sulfate units of chondroitin sulfate and dermatan sulfate.

Cofactor

Binds 1 calcium ion per subunit (by similarity).

Subunit structure

Homodimer (by similarity).

Subcellular location

Lysosome.

Post-translational modification

The conversion of a serine or cysteine residue (in prokaryotes) or of a cysteine residue (in eukaryotes) to 3-oxoalanine, also known as C-formylglycine (FGly), is critical for catalytic activity (by similarity).
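
The modified cysteine can be located with a short pattern scan. The sketch below is illustrative only: it assumes the commonly cited C-x-P-x-R consensus for formylglycine formation, which this entry does not state, and the names `FRAGMENT`, `FRAGMENT_START`, and `motif_positions` are hypothetical. The fragment is residues 81-100 of the canonical sequence shown in this entry.

```python
import re

# Residues 81-100 of the canonical sequence (isoform 1), copied from this entry.
FRAGMENT_START = 81
FRAGMENT = "VVLDNYYVQPLCTPSRSQLL"

# Assumption: C-x-P-x-R consensus; the entry itself only annotates residue 92.
MOTIF = re.compile(r"C.P.R")


def motif_positions(fragment, start):
    """Return the 1-based entry position of the Cys opening each motif match."""
    return [start + match.start() for match in MOTIF.finditer(fragment)]


positions = motif_positions(FRAGMENT, FRAGMENT_START)
print(positions)
```

Under that assumption, the only match in this fragment starts at the position annotated in this entry as 3-oxoalanine (Cys).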

Sequence similarities

Belongs to the sulfatase family.

Alternative products

This entry describes 2 isoforms produced by alternative splicing.
Isoform 1 (identifier: P50429-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Isoform 2 (identifier: P50429-2)

The sequence of this isoform differs from the canonical sequence as follows:
     383-431: EGHPSPRVEL...EHSAFNTSIH → PVTGDHWHAE...PDCGRARWFL
     432-534: Missing.

Sequence annotation (Features)

Feature key           Position(s)  Length  Description  Feature identifier

Molecule processing

Signal peptide        1 – 41       41      Potential
Chain                 42 – 534     493     Arylsulfatase B  PRO_0000033423

Sites

Active site           148          1       By similarity
Metal binding         54           1       Calcium (by similarity)
Metal binding         55           1       Calcium (by similarity)
Metal binding         92           1       Calcium; via 3-oxoalanine (by similarity)
Metal binding         301          1       Calcium (by similarity)
Metal binding         302          1       Calcium (by similarity)
Binding site          146          1       Substrate (by similarity)
Binding site          243          1       Substrate (by similarity)
Binding site          319          1       Substrate (by similarity)

Amino acid modifications

Modified residue      92           1       3-oxoalanine (Cys) (by similarity)
Glycosylation         189          1       N-linked (GlcNAc...) (potential)
Glycosylation         280          1       N-linked (GlcNAc...) (potential)
Glycosylation         292          1       N-linked (GlcNAc...) (potential)
Glycosylation         367          1       N-linked (GlcNAc...) (potential)
Glycosylation         427          1       N-linked (GlcNAc...) (potential)
Glycosylation         459          1       N-linked (GlcNAc...) (potential)
Disulfide bond        118 ↔ 522            By similarity
Disulfide bond        122 ↔ 156            By similarity
Disulfide bond        182 ↔ 193            By similarity
Disulfide bond        406 ↔ 448            By similarity

Natural variations

Alternative sequence  383 – 431    49      EGHPS…NTSIH → PVTGDHWHAEGELGCSFRTA SAAEEEPTYKLREKKRRKSP DCGRARWFL in isoform 2.  VSP_007881
Alternative sequence  432 – 534    103     Missing in isoform 2.  VSP_022249
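
The two alternative-sequence features are enough to derive isoform 2 from the canonical sequence. The following is a minimal sketch, assuming 1-based inclusive coordinates as used in this entry; `apply_alternative_sequences` and the constant names are hypothetical helpers, not part of any UniProt tool. The canonical sequence and the replacement segment are copied from this entry.

```python
# Canonical sequence (isoform 1) in the 10-residue blocks shown in this entry.
CANONICAL_BLOCKS = """
MGKLSPCTGR SRPGGPGPQL PLLLLLLQLL LLLLSPARAS GATQPPHVVF VLADDLGWND
LGFHGSVIRT PHLDALAAGG VVLDNYYVQP LCTPSRSQLL TGRYQIHLGL QHYLIMTCQP
SCVPLDEKLL PQLLKEAGYA THMVGKWHLG MYRKECLPTR RGFDTYFGYL LGSEDYYTHE
ACAPIESLNG TRCALDLRDG EEPAKEYNNI YSTNIFTKRA TTVIANHPPE KPLFLYLAFQ
SVHDPLQVPE EYMEPYGFIQ DKHRRIYAGM VSLMDEAVGN VTKALKSHGL WNNTVFIFST
DNGGQTRSGG NNWPLRGRKG TLWEGGIRGT GFVASPLLKQ KGVKSRELMH ITDWLPTLVD
LAGGSTNGTK PLDGFNMWKT ISEGHPSPRV ELLHNIDQDF FDGLPCPGKN MTPAKDDSFP
LEHSAFNTSI HAGIRYKNWK LLTGHPGCGY WFPPPSQSNV SEIPPVGPPT KTLWLFDINQ
DPEERHDVSR EHPHIVQNLL SRLQYYHEHS VPSHFPPLDP RCDPKSTGVW SPWM
"""
CANONICAL = "".join(CANONICAL_BLOCKS.split())

# Replacement segment from feature VSP_007881, spaces removed.
REPLACEMENT = "".join("PVTGDHWHAEGELGCSFRTA SAAEEEPTYKLREKKRRKSP DCGRARWFL".split())

# (start, end, replacement); an empty replacement means "Missing".
ISOFORM2_EDITS = [
    (383, 431, REPLACEMENT),  # VSP_007881
    (432, 534, ""),           # VSP_022249
]


def apply_alternative_sequences(seq, edits):
    """Apply 1-based inclusive (start, end, replacement) edits to seq."""
    out = seq
    # Work from the C-terminal end so earlier coordinates remain valid.
    for start, end, replacement in sorted(edits, reverse=True):
        out = out[:start - 1] + replacement + out[end:]
    return out


isoform2 = apply_alternative_sequences(CANONICAL, ISOFORM2_EDITS)
print(len(CANONICAL), len(isoform2))
```

Applying the edits from the highest start position downward avoids having to shift coordinates after each substitution.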

Experimental info

Sequence conflict     60           1       D → A in AAA37261. Ref.3
Sequence conflict     352          1       T → S in BAE34455. Ref.1
Sequence conflict     352          1       T → S in CAA63067. Ref.4
Sequence conflict     467          1       G → D in CAI84992. Ref.2

Sequences

Isoform 1

Last modified January 9, 2007. Version 3.
Checksum: 78DAB2D65C71E97D
Length: 534 AA. Mass: 59,647 Da.

        10         20         30         40         50         60 
MGKLSPCTGR SRPGGPGPQL PLLLLLLQLL LLLLSPARAS GATQPPHVVF VLADDLGWND 

        70         80         90        100        110        120 
LGFHGSVIRT PHLDALAAGG VVLDNYYVQP LCTPSRSQLL TGRYQIHLGL QHYLIMTCQP 

       130        140        150        160        170        180 
SCVPLDEKLL PQLLKEAGYA THMVGKWHLG MYRKECLPTR RGFDTYFGYL LGSEDYYTHE 

       190        200        210        220        230        240 
ACAPIESLNG TRCALDLRDG EEPAKEYNNI YSTNIFTKRA TTVIANHPPE KPLFLYLAFQ 

       250        260        270        280        290        300 
SVHDPLQVPE EYMEPYGFIQ DKHRRIYAGM VSLMDEAVGN VTKALKSHGL WNNTVFIFST 

       310        320        330        340        350        360 
DNGGQTRSGG NNWPLRGRKG TLWEGGIRGT GFVASPLLKQ KGVKSRELMH ITDWLPTLVD 

       370        380        390        400        410        420 
LAGGSTNGTK PLDGFNMWKT ISEGHPSPRV ELLHNIDQDF FDGLPCPGKN MTPAKDDSFP 

       430        440        450        460        470        480 
LEHSAFNTSI HAGIRYKNWK LLTGHPGCGY WFPPPSQSNV SEIPPVGPPT KTLWLFDINQ 

       490        500        510        520        530 
DPEERHDVSR EHPHIVQNLL SRLQYYHEHS VPSHFPPLDP RCDPKSTGVW SPWM 
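
The positional annotations in this entry can be cross-checked against the displayed sequence. The sketch below is a hedged illustration rather than a UniProt procedure: the helper names are hypothetical, and the N-x-[S/T] sequon rule (x not Pro) for N-linked glycosylation is a general convention assumed here, not something this entry states. All positions and the sequence itself are copied from this entry.

```python
# Canonical sequence (isoform 1) in the 10-residue blocks shown above.
SEQUENCE_BLOCKS = """
MGKLSPCTGR SRPGGPGPQL PLLLLLLQLL LLLLSPARAS GATQPPHVVF VLADDLGWND
LGFHGSVIRT PHLDALAAGG VVLDNYYVQP LCTPSRSQLL TGRYQIHLGL QHYLIMTCQP
SCVPLDEKLL PQLLKEAGYA THMVGKWHLG MYRKECLPTR RGFDTYFGYL LGSEDYYTHE
ACAPIESLNG TRCALDLRDG EEPAKEYNNI YSTNIFTKRA TTVIANHPPE KPLFLYLAFQ
SVHDPLQVPE EYMEPYGFIQ DKHRRIYAGM VSLMDEAVGN VTKALKSHGL WNNTVFIFST
DNGGQTRSGG NNWPLRGRKG TLWEGGIRGT GFVASPLLKQ KGVKSRELMH ITDWLPTLVD
LAGGSTNGTK PLDGFNMWKT ISEGHPSPRV ELLHNIDQDF FDGLPCPGKN MTPAKDDSFP
LEHSAFNTSI HAGIRYKNWK LLTGHPGCGY WFPPPSQSNV SEIPPVGPPT KTLWLFDINQ
DPEERHDVSR EHPHIVQNLL SRLQYYHEHS VPSHFPPLDP RCDPKSTGVW SPWM
"""
seq = "".join(SEQUENCE_BLOCKS.split())

# Annotated positions copied from the feature table of this entry.
DISULFIDES = [(118, 522), (122, 156), (182, 193), (406, 448)]
GLYCOSYLATION = [189, 280, 292, 367, 427, 459]
OXOALANINE = 92


def residue(pos):
    """Return the residue at a 1-based position."""
    return seq[pos - 1]


def is_sequon(pos):
    """Assumed rule: Asn, then any residue except Pro, then Ser or Thr."""
    tri = seq[pos - 1:pos + 2]
    return len(tri) == 3 and tri[0] == "N" and tri[1] != "P" and tri[2] in "ST"


checks = {
    "length is 534": len(seq) == 534,
    "residue 92 is Cys": residue(OXOALANINE) == "C",
    "disulfide partners are Cys": all(
        residue(a) == "C" and residue(b) == "C" for a, b in DISULFIDES
    ),
    "glycosylation sites are sequons": all(is_sequon(p) for p in GLYCOSYLATION),
}
for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

The check only confirms that each annotated position is compatible with its feature type; it does not test whether other positions might also qualify.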

Isoform 2

Checksum: E3C8BBE3E0B7C6D5
Length: 431 AA. Mass: 47,934 Da.

References

[1] "The transcriptional landscape of the mammalian genome."
Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N., Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K., Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J., Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R., Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T., Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A., Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B., Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M., Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S., Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E., Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D., Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M., Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H., Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V., Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S., Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H., Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N., Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F., Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G., Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z., Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C., Schoenbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y., Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S., Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K., Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R., van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H., Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M., Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C., Mattick J.S., Hume D.A., Kai C., Sasaki D., Tomaru Y., Fukuda S., Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K., Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M., Kondo S., Konno H., Nakano K., Ninomiya N., Nishio T., Okada M., Plessy C., Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A., Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y.
Science 309:1559-1563(2005)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1 AND 2).
Strain: C57BL/6J.
Tissue: Inner ear and Thymus.
[2] "Lineage-specific biology revealed by a finished genome assembly of the mouse."
Church D.M., Goodstadt L., Hillier L.W., Zody M.C., Goldstein S., She X., Bult C.J., Agarwala R., Cherry J.L., DiCuccio M., Hlavina W., Kapustin Y., Meric P., Maglott D., Birtle Z., Marques A.C., Graves T., Zhou S., Teague B., Potamousis K., Churas C., Place M., Herschleb J., Runnheim R., Forrest D., Amos-Landgraf J., Schwartz D.C., Cheng Z., Lindblad-Toh K., Eichler E.E., Ponting C.P.
PLoS Biol. 7:E1000112-E1000112(2009)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
Strain: C57BL/6J.
[3] "The sulfatase gene family: cross-species PCR cloning using the MOPAC technique."
Grompe M., Pieretti M., Caskey C.T., Ballabio A.
Genomics 12:755-760(1992)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 58-90 (ISOFORM 1).
[4] "Targeted disruption of the arylsulfatase B gene results in mice resembling the phenotype of mucopolysaccharidosis VI."
Evers M., Saftig P., Schmidt P., Hafner A., McLoghlin D.B., Schmahl W., Hess B., von Figura K., Peters C.
Proc. Natl. Acad. Sci. U.S.A. 93:8214-8219(1996)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 137-388 (ISOFORM 1).
[5] "Sulfatases and sulfatase modifying factors: an exclusive and promiscuous relationship."
Sardiello M., Annunziata I., Roma G., Ballabio A.
Hum. Mol. Genet. 14:3203-3217(2005)
Cited for: IDENTIFICATION.

Cross-references

Sequence databases

EMBL/GenBank/DDBJ:
    AK083309 mRNA. Translation: BAC38859.1.
    AK154098 mRNA. Translation: BAE32375.1.
    AK158312 mRNA. Translation: BAE34455.1.
    AC131739 Genomic DNA. No translation available.
    AC136976 Genomic DNA. No translation available.
    M82877 mRNA. Translation: AAA37261.1.
    X92096 mRNA. Translation: CAA63067.1.
    BN000746 mRNA. Translation: CAI84992.1.
CCDS: CCDS36749.1. [P50429-1]
RefSeq: NP_033842.3. NM_009712.3.
UniGene: Mm.300178. Mm.472255.

3D structure databases

ProteinModelPortal: P50429.
SMR: P50429. Positions 43-534.

PTM databases

PhosphoSite: P50429.

Proteomic databases

MaxQB: P50429.
PaxDb: P50429.
PRIDE: P50429.

Protocols and materials databases

DNASU: 11881.

Genome annotation databases

Ensembl: ENSMUST00000091403; ENSMUSP00000088964; ENSMUSG00000042082.
GeneID: 11881.
KEGG: mmu:11881.
UCSC:
    uc007rlo.1. mouse. [P50429-1]
    uc011zcv.1. mouse. [P50429-2]

Organism-specific databases

CTD: 411.
MGI: MGI:88075. Arsb.

Phylogenomic databases

eggNOG: COG3119.
GeneTree: ENSGT00560000077076.
HOGENOM: HOG000135354.
HOVERGEN: HBG004282.
InParanoid: P50429.
KO: K01135.
OrthoDB: EOG7MKW5Q.
PhylomeDB: P50429.
TreeFam: TF314186.

Enzyme and pathway databases

SABIO-RK: P50429.

Gene expression databases

Bgee: P50429.
CleanEx: MM_ARSB.
Genevestigator: P50429.

Family and domain databases

Gene3D: 3.40.720.10. 1 hit.
InterPro:
    IPR017849. Alkaline_Pase-like_a/b/a.
    IPR017850. Alkaline_phosphatase_core.
    IPR000917. Sulfatase.
    IPR024607. Sulfatase_CS.
Pfam: PF00884. Sulfatase. 1 hit.
SUPFAM: SSF53649. SSF53649. 1 hit.
PROSITE:
    PS00523. SULFATASE_1. 1 hit.
    PS00149. SULFATASE_2. 1 hit.

Other

NextBio: 279911.
PRO: P50429.

Entry information

Entry name: ARSB_MOUSE
Accession
    Primary (citable) accession number: P50429
    Secondary accession number(s): Q32KJ1, Q3TYV7, Q3U4Q6, Q8C404
Entry history
    Integrated into UniProtKB/Swiss-Prot: October 1, 1996
    Last sequence update: January 9, 2007
    Last modified: July 9, 2014
    This is version 123 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Relevant documents

SIMILARITY comments

Index of protein domains and families

MGD cross-references

Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot