
O08710 (THYG_MOUSE) Reviewed, UniProtKB/Swiss-Prot

Last modified July 9, 2014. Version 126.


Names and origin

Protein names
  Recommended name: Thyroglobulin
  Short name: Tg
Gene names
  Name: Tg
  Synonyms: Tgn
Organism: Mus musculus (Mouse) [Reference proteome]
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus

Protein attributes

Sequence length: 2766 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level.

General annotation (Comments)

Function

Precursor of the iodinated thyroid hormones thyroxine (T4) and triiodothyronine (T3).

Subunit structure

Homodimer (by similarity).

Subcellular location

Secreted.

Tissue specificity

Thyroid gland specific.

Post-translational modification

Sulfated tyrosines are desulfated during iodination (by similarity).

Involvement in disease

Defects in Tg are the cause of some forms of goiter. Goiter is an enlargement of the thyroid gland. The variant Pro-2283 exhibits a defect in exit from the endoplasmic reticulum.

Sequence similarities

Belongs to the type-B carboxylesterase/lipase family.

Contains 11 thyroglobulin type-1 domains.

Sequence annotation (Features)

Feature key        Position(s)    Length  Description  Feature identifier

Molecule processing

Signal peptide     1 – 20         20      (by similarity)
Chain              21 – 2766      2746    Thyroglobulin  PRO_0000008637

Regions

Domain             32 – 93        62      Thyroglobulin type-1 1
Domain             94 – 161       68      Thyroglobulin type-1 2
Domain             162 – 298      137     Thyroglobulin type-1 3
Domain             299 – 359      61      Thyroglobulin type-1 4
Domain             605 – 658      54      Thyroglobulin type-1 5
Domain             659 – 726      68      Thyroglobulin type-1 6
Domain             727 – 922      196     Thyroglobulin type-1 7
Domain             923 – 1074     152     Thyroglobulin type-1 8
Domain             1075 – 1146    72      Thyroglobulin type-1 9
Domain             1147 – 1211    65      Thyroglobulin type-1 10
Repeat             1455 – 1468    14      Type II
Repeat             1469 – 1485    17      Type II
Repeat             1486 – 1502    17      Type II
Domain             1510 – 1564    55      Thyroglobulin type-1 11
Repeat             1602 – 1722    121     Type IIIA
Repeat             1723 – 1889    167     Type IIIB
Repeat             1890 – 1992    103     Type IIIA
Repeat             1993 – 2125    133     Type IIIB
Repeat             2126 – 2183    58      Type IIIA
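
The positions in this table are 1-based and inclusive, so each listed length should equal end minus start plus one. The following is a minimal sketch of that consistency check, using a handful of rows copied from the table; the function name feature_length is a hypothetical helper and not part of any UniProt tool.

```python
def feature_length(start: int, end: int) -> int:
    """Number of residues in the inclusive, 1-based range start..end."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


# (start, end, listed length) for a few rows of the feature table.
rows = [
    (1, 20, 20),        # signal peptide
    (21, 2766, 2746),   # mature chain
    (32, 93, 62),       # thyroglobulin type-1 domain 1
    (727, 922, 196),    # thyroglobulin type-1 domain 7
    (2126, 2183, 58),   # last type IIIA repeat
]

# Rows whose listed length disagrees with the computed one.
mismatches = [row for row in rows if feature_length(row[0], row[1]) != row[2]]
print(mismatches)
```

An empty list means the sampled rows are internally consistent.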

Amino acid modifications

Modified residue   25             1       Sulfotyrosine; alternate (by similarity)
Modified residue   25             1       Thyroxine; alternate (by similarity)
Modified residue   2572           1       Thyroxine (by similarity)
Modified residue   2586           1       Thyroxine (by similarity)
Modified residue   2764           1       Triiodothyronine (by similarity)
Glycosylation      111            1       N-linked (GlcNAc...) (potential)
Glycosylation      199            1       N-linked (GlcNAc...) (potential)
Glycosylation      484            1       N-linked (GlcNAc...) (potential)
Glycosylation      496            1       N-linked (GlcNAc...) (potential)
Glycosylation      748            1       N-linked (GlcNAc...) (potential)
Glycosylation      817            1       N-linked (GlcNAc...) (potential)
Glycosylation      948            1       N-linked (GlcNAc...) (potential)
Glycosylation      1141           1       N-linked (GlcNAc...) (potential)
Glycosylation      1349           1       N-linked (GlcNAc...) (potential)
Glycosylation      1365           1       N-linked (GlcNAc...) (potential)
Glycosylation      1715           1       N-linked (GlcNAc...) (potential)
Glycosylation      1729           1       N-linked (GlcNAc...) (potential)
Glycosylation      1773           1       N-linked (GlcNAc...) (potential)
Glycosylation      1864           1       N-linked (GlcNAc...) (potential)
Glycosylation      1935           1       N-linked (GlcNAc...) (potential)
Glycosylation      2010           1       N-linked (GlcNAc...) (potential)
Glycosylation      2120           1       N-linked (GlcNAc...) (potential)
Glycosylation      2249           1       N-linked (GlcNAc...) (potential)
Glycosylation      2294           1       N-linked (GlcNAc...) (potential)
Glycosylation      2581           1       N-linked (GlcNAc...) (potential)
Disulfide bond     35 ↔ 53                (by similarity)
Disulfide bond     64 ↔ 71                (by similarity)
Disulfide bond     73 ↔ 93                (by similarity)
Disulfide bond     97 ↔ 121               (by similarity)
Disulfide bond     132 ↔ 139              (by similarity)
Disulfide bond     141 ↔ 161              (by similarity)
Disulfide bond     165 ↔ 184              (by similarity)
Disulfide bond     195 ↔ 236              (by similarity)
Disulfide bond     302 ↔ 320              (by similarity)
Disulfide bond     331 ↔ 337              (by similarity)
Disulfide bond     339 ↔ 359              (by similarity)
Disulfide bond     608 ↔ 620              (by similarity)
Disulfide bond     631 ↔ 636              (by similarity)
Disulfide bond     638 ↔ 658              (by similarity)
Disulfide bond     662 ↔ 687              (by similarity)
Disulfide bond     698 ↔ 703              (by similarity)
Disulfide bond     705 ↔ 726              (by similarity)
Disulfide bond     730 ↔ 763              (by similarity)
Disulfide bond     774 ↔ 899              (by similarity)
Disulfide bond     901 ↔ 922              (by similarity)
Disulfide bond     1043 ↔ 1050            (by similarity)
Disulfide bond     1052 ↔ 1074            (by similarity)
Disulfide bond     1078 ↔ 1109            (by similarity)
Disulfide bond     1127 ↔ 1146            (by similarity)
Disulfide bond     1150 ↔ 1170            (by similarity)
Disulfide bond     1182 ↔ 1189            (by similarity)
Disulfide bond     1191 ↔ 1211            (by similarity)
Disulfide bond     1513 ↔ 1522            (by similarity)
Disulfide bond     1542 ↔ 1564            (by similarity)
Disulfide bond     2263 ↔ 2280            (potential)
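
The glycosylation entries above are marked as potential, that is, predicted rather than observed. N-linked sites are conventionally predicted from the sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below scans the first 120 residues of the displayed sequence for that pattern. It assumes the simple textbook rule only; the names SEQUON and find_sequons are illustrative, and the entry does not state which predictor was actually used.

```python
import re
from typing import List

# First 120 residues of the displayed sequence (rows 1-60 and 61-120).
FRAGMENT = (
    "MTALVLWVSTLLSSVCLVAANIFEYQVDAQPLRPCELQREKAFLKQAEYVPQCSEDGSFQ"
    "TVQCQNDGQSCWCVDSDGREVPGSRQLGRPTVCLSFCQLHKQRILLGSYINSTDALYLPQ"
)

# N, then any residue except P, then S or T. The lookahead keeps the match
# one residue wide so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq: str, first_pos: int = 1) -> List[int]:
    """Return 1-based positions of Asn residues that start an N-X-[ST] sequon."""
    return [m.start() + first_pos for m in SEQUON.finditer(seq)]


print(find_sequons(FRAGMENT))
```

In this window the only hit is position 111 (Asn-Ser-Thr), which agrees with the first glycosylation row of the table.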

Natural variations

Natural variant    2283           1       L → P in goiter. Ref.2

Experimental info

Sequence conflict  80             1       E → K in AAC32268. Ref.2
Sequence conflict  80             1       E → K in AAC32269. Ref.3
Sequence conflict  92             1       V → I in AAC32268. Ref.2
Sequence conflict  92             1       V → I in AAC32269. Ref.3
Sequence conflict  1327           1       A → T in AAB53204. Ref.1
Sequence conflict  1427           1       N → S in AAB53204. Ref.1
Sequence conflict  1436 – 1442    7       RTQLGCM → GLSLDVL in AAB53204. Ref.1
Sequence conflict  1721           1       T → I in AAB53204. Ref.1
Sequence conflict  1813           1       S → T in AAB53204. Ref.1
Sequence conflict  1957 – 1959    3       RVK → KVN in AAC32268. Ref.2
Sequence conflict  1957 – 1959    3       RVK → KVN in AAC32269. Ref.3
Sequence conflict  2090           1       S → SS in AAB53204. Ref.1
Sequence conflict  2407           1       R → K in AAC32268. Ref.2
Sequence conflict  2407           1       R → K in AAC32269. Ref.3
Sequence conflict  2414           1       G → S in AAC32268. Ref.2
Sequence conflict  2414           1       G → S in AAC32269. Ref.3
Sequence conflict  2427           1       R → K in AAC32268. Ref.2
Sequence conflict  2427           1       R → K in AAC32269. Ref.3
Sequence conflict  2434           1       A → T in AAC32268. Ref.2
Sequence conflict  2434           1       A → T in AAC32269. Ref.3
Sequence conflict  2443 – 2453    11      TSSIQEVVSCL → NFIHPGSGIMF in AAC32268. Ref.2
Sequence conflict  2443 – 2453    11      TSSIQEVVSCL → NFIHPGSGIMF in AAC32269. Ref.3
Sequence conflict  2728           1       D → GN in AAB53204. Ref.1
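
Each sequence conflict describes how a submitted translation differs from the displayed sequence: the residues on the left of the arrow are replaced by those on the right at the given position. As a rough sketch, the two conflicts reported for AAC32268 at positions 80 and 92 can be applied to residues 71-100 of the displayed sequence as follows. The helper apply_conflict is hypothetical, and checking the reference residues before replacing them is a design choice made here, not something the entry prescribes.

```python
def apply_conflict(fragment: str, first_pos: int, pos: int, ref: str, alt: str) -> str:
    """Replace `ref` at 1-based sequence position `pos` with `alt`.

    `first_pos` is the sequence position of fragment[0].
    """
    i = pos - first_pos
    if i < 0 or fragment[i:i + len(ref)] != ref:
        raise ValueError(f"reference {ref!r} not found at position {pos}")
    return fragment[:i] + alt + fragment[i + len(ref):]


# Residues 71-100 of the displayed sequence.
FRAGMENT = "CWCVDSDGREVPGSRQLGRPTVCLSFCQLH"

# Conflicts listed for AAC32268 inside this window. They are applied from the
# highest position down so that a length-changing edit (such as S -> SS at
# 2090) would not shift positions that are still to be processed.
conflicts = [(80, "E", "K"), (92, "V", "I")]

variant = FRAGMENT
for pos, ref, alt in sorted(conflicts, reverse=True):
    variant = apply_conflict(variant, 71, pos, ref, alt)
print(variant)
```

A ValueError from apply_conflict signals that the position or reference residue does not match the fragment, which is a quick way to catch off-by-one mistakes when transcribing the table.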

Sequences

Sequence: O08710 [UniParc]
Last modified July 27, 2011. Version 3.
Checksum: 06227D4192AC1902
Length: 2,766 AA
Mass (Da): 304,473
        10         20         30         40         50         60 
MTALVLWVST LLSSVCLVAA NIFEYQVDAQ PLRPCELQRE KAFLKQAEYV PQCSEDGSFQ 

        70         80         90        100        110        120 
TVQCQNDGQS CWCVDSDGRE VPGSRQLGRP TVCLSFCQLH KQRILLGSYI NSTDALYLPQ 

       130        140        150        160        170        180 
CQDSGNYAPV QCDLQRVQCW CVDTEGMEVY GTRQQGRPTR CPRSCEIRNR RLLHGVGDRS 

       190        200        210        220        230        240 
PPQCTADGEF MPVQCKFVNT TDMMIFDLIH NYNRFPDAFV TFSSFRGRFP EVSGYCYCAD 

       250        260        270        280        290        300 
SQGRELAETG LELLLDEIYD TIFAGLDQAS TFTQSTMYRI LQRRFLAIQL VISGRFRCPT 

       310        320        330        340        350        360 
KCEVEQFAAT RFGHSYIPRC HRDGHYQTVQ CQTEGMCWCV DAQGREVPGT RQQGQPPSCA 

       370        380        390        400        410        420 
ADQSCALERQ QALSRFYFET PDYFSPQDLL SSEDRLAPVS GVRSDTSCPP RIKELFVDSG 

       430        440        450        460        470        480 
LLRSIAVEHY QRLSESRSLL REAIRAVFPS RELAGLALQF TTNPKRLQQN LFGGTFLANA 

       490        500        510        520        530        540 
AQFNLSGALG TRSTFNFSQF FQQFGLPGFL NRDRVTTLAK LLPVRLDSSS TPETLRVSEK 

       550        560        570        580        590        600 
TVAMNKRVVG NFGFKVNLQE NQDALKFLVS LLELPEFLVF LQRAVSVPED IARDLGDVME 

       610        620        630        640        650        660 
MVFSAQACKQ MPGKFFVPSC TAGGSYEDIQ CYAGECWCVD SRGKELDGSR VRGGRPRCPT 

       670        680        690        700        710        720 
KCEKQRAQMQ SLASAQPAGS SFFVPTCTRE GYFLPVQCFN SECYCVDTEG QVIPGTQSTV 

       730        740        750        760        770        780 
GEAKQCPSVC QLQAEQAFLG VVGVLLSNSS MVPSISNVYI PQCSASGQWR HVQCDGPHEQ 

       790        800        810        820        830        840 
VFEWYERWKT QNGDGQELTP AALLMKIVSY REVASRNFSL FLQSLYDAGQ QRIFPVLAQY 

       850        860        870        880        890        900 
PSLQDVPQVV LEGATTPPGE NIFLDPYIFW QILNGQLSQY PGPYSDFNMP LEHFNLRSCW 

       910        920        930        940        950        960 
CVDEAGQKLD GTQTKPGEIP ACPGPCEEVK LRVLKFIKET EEIVSASNAS SFPLGESFLV 

       970        980        990       1000       1010       1020 
AKGIQLTSEE LDLPPQFPSR DAFSEKFLRG GEYAIRLAAQ STLTFYQSLR ASLGKSDGAA 

      1030       1040       1050       1060       1070       1080 
SLLWSGPYMP QCNMIGGWEP VQCHAGTGQC WCVDGRGEFI PGSLMSRSSQ MPQCPTNCEL 

      1090       1100       1110       1120       1130       1140 
SRASGLISAW KQAGPQRNPG PGDLFIPVCL QTGEYVRKQT SGTGTWCVDP ASGEGMPVNT 

      1150       1160       1170       1180       1190       1200 
NGSAQCPGLC DVLKSRALSR KVGLGYSPVC EALDGAFSPV QCDLAQGSCW CVLGSGEEVP 

      1210       1220       1230       1240       1250       1260 
GTRVVGTQPA CESPQCPLPF SGSDVADGVI FCETASSSGV TTVQQCQLLC RQGLRSAFSP 

      1270       1280       1290       1300       1310       1320 
GPLICSLESQ HWVTLPPPRA CQRPQLWQTM QTQAHFQLLL PPGKMCSVDY SGLLQAFQVF 

      1330       1340       1350       1360       1370       1380 
ILDELIARGF CQIQVKTFGT LVSSTVCDNS SIQVGCLTAE RLGVNVTWKL QLEDISVGSL 

      1390       1400       1410       1420       1430       1440 
PDLYSIERAV TGQDLLGRFA DLIQSGRFQL HLDSKTFSAD TTLYFLNGDS FVTSPRTQLG 

      1450       1460       1470       1480       1490       1500 
CMEGFYRVPT TRQDALGCVK CPEGSFSQDG RCTPCPAGTY QEQAGSSACI PCPRGRTTIT 

      1510       1520       1530       1540       1550       1560 
TGAFSKTHCV TDCQKNEAGL QCDQNGQYQA SQKNRDSGEV FCVDSEGRKL QWLQTEAGLS 

      1570       1580       1590       1600       1610       1620 
ESQCLMIRKF DKAPESKVIF DANSPVIVKS SVPSADSPLV QCLTDCANDE ACSFLTVSTM 

      1630       1640       1650       1660       1670       1680 
ESEVSCDFYS WTRDNFACVT SDQEQDAMGS LKATSFGSLR CQVKVRNSGK DSLAVYVKKG 

      1690       1700       1710       1720       1730       1740 
YESTAAGQKS FEPTGFQNVL SGLYSPVVFS ASGANLTDTH TYCLLACDND SCCDGFIITQ 

      1750       1760       1770       1780       1790       1800 
VKGGPTICGL LSSPDILLCH INDWRDTSAT QANATCAGVT YDQGSRQMTL SLGGQEFLQG 

      1810       1820       1830       1840       1850       1860 
LALLEGTQDS FTSFQQVYLW KDSDMGSRPE SMGCERGMVP RSDFPGDMAT ELFSPVDITQ 

      1870       1880       1890       1900       1910       1920 
VIVNTSHSLP SQQYWLFTHL FSAEQANLWC LSRCAQEPIF CQLADITKSS SLYFTCFLYP 

      1930       1940       1950       1960       1970       1980 
EAQVCDNVME SNAKNCSQIL PHQPTALFRR KVVLNDRVKN FYTRLPFQKL TGISIRDKVP 

      1990       2000       2010       2020       2030       2040 
MSGKLISNGF FECERLCDRD PCCTGFGFLN VSQLQGGEVT CLTLNSMGIQ TCNEESGATW 

      2050       2060       2070       2080       2090       2100 
RILDCGSEDT EVHTYPFGWY QKPAVWSDTP SFCPSAALQS LTEEKVTSDS WQTLALSSVI 

      2110       2120       2130       2140       2150       2160 
VDPSIKHFDV AHISTAATSN FSMAQDFCLQ QCSRHQDCLV TTLQIQPGVV RCVFYPDIQN 

      2170       2180       2190       2200       2210       2220 
CIHSLRSHTC WLLLHEEATY IYRKSGIPLV QSDVTSTPSV RIDSFGQLQG GSQVIKVGTA 

      2230       2240       2250       2260       2270       2280 
WKQVYRFLGV PYAAPPLADN RFRAPEVLNW TGSWDATKPR ASCWQPGTRT PTPPQINEDC 

      2290       2300       2310       2320       2330       2340 
LYLNVFVPEN LVSNASVLVF FHNTMEMEGS GGQLTIDGSI LAAVGNFIVV TANYRLGVFG 

      2350       2360       2370       2380       2390       2400 
FLSSGSDEVA GNWGLLDQVA ALTWVQSHIG AFGGDPQRVT LAADRSGADV ASIHLLISRP 

      2410       2420       2430       2440       2450       2460 
TRLQLFRKAL LMGGSALSPA AIISPERAQQ QAAALAKEVG CPTSSIQEVV SCLRQKPANI 

      2470       2480       2490       2500       2510       2520 
LNDAQTKLLA VSGPFHYWGP VVDGQYLREL PSRRLKRPLP VKVDLLIGGS QDDGLINRAK 

      2530       2540       2550       2560       2570       2580 
AVKQFEESQG RTNSKTAFYQ ALQNSLGGED SDARILAAAV WYYSLEHSTD DYASFSRALE 

      2590       2600       2610       2620       2630       2640 
NATRDYFIIC PMVNMASLWA RRTRGNVFMY HVPESYGHGS LELLADVQYA FGLPFYSAYQ 

      2650       2660       2670       2680       2690       2700 
GQFSTEEQSL SLKVMQYFSN FIRSGNPNYP HEFSRKAAEF ATPWPDFIPG AGGESYKELS 

      2710       2720       2730       2740       2750       2760 
AQLPNRQGLK QADCSFWSKY IQTLKDADGA KDAQLTKSEE EDLEVGPGLE EDLSGSLEPV 


PKSYSK 
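
The sequence above is laid out in rows of six 10-residue blocks, each row preceded by a ruler line of position numbers. To work with it programmatically, the ruler lines and spaces have to be stripped. The sketch below shows one way to do that on the first two rows; parse_numbered_block is an illustrative name, and the approach assumes ruler lines contain only digits and whitespace.

```python
def parse_numbered_block(text: str) -> str:
    """Join residue blocks into one string, dropping ruler and blank lines."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and ruler lines made up only of position numbers.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)


# First two rows of the displayed sequence, copied as shown.
SAMPLE = """\
        10         20         30         40         50         60
MTALVLWVST LLSSVCLVAA NIFEYQVDAQ PLRPCELQRE KAFLKQAEYV PQCSEDGSFQ

        70         80         90        100        110        120
TVQCQNDGQS CWCVDSDGRE VPGSRQLGRP TVCLSFCQLH KQRILLGSYI NSTDALYLPQ
"""

seq = parse_numbered_block(SAMPLE)
# Python indexes from 0, so sequence position p is seq[p - 1].
print(len(seq), seq[25 - 1], seq[111 - 1])
```

Run on the full block, the same function should yield a string of 2,766 residues, matching the length stated for this entry.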


References

[1] "Cloning and characterization of murine thyroglobulin cDNA."
Caturegli P., Vidalain P.O., Vali M., Aguilera-Galaviz L.A., Rose N.R.
Clin. Immunol. Immunopathol. 85:221-226(1997)
Cited for: NUCLEOTIDE SEQUENCE [MRNA].
[2] "A single amino acid change in the acetylcholinesterase-like domain of thyroglobulin causes congenital goiter with hypothyroidism in the cog/cog mouse: a model of human endoplasmic reticulum storage diseases."
Kim P.S., Hossain S.A., Park Y.-N., Lee I., Yoo S.-E., Arvan P.
Proc. Natl. Acad. Sci. U.S.A. 95:9909-9913(1998)
Cited for: NUCLEOTIDE SEQUENCE [MRNA], VARIANT GOITER PRO-2283.
Strain: COG.
Tissue: Thyroid.
[3] "Cloning, characterization, site-directed mutagenesis, and transient expression of 8301-nucleotide AKR/J mouse thyroglobulin cDNA: defective secretion of mutant thyroglobulins."
Hossain S.A., Yoo S.-E., Kim P.S.
Submitted (JUL-1998) to the EMBL/GenBank/DDBJ databases
Cited for: NUCLEOTIDE SEQUENCE [MRNA].
Strain: AKR/J.
Tissue: Thyroid.
[4] Mural R.J., Adams M.D., Myers E.W., Smith H.O., Venter J.C.
Submitted (JUL-2005) to the EMBL/GenBank/DDBJ databases
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
[5] "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
The MGC Project Team
Genome Res. 14:2121-2127(2004)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
Tissue: Thyroid.

Cross-references

Sequence databases

EMBL / GenBank / DDBJ:
  U76389 mRNA. Translation: AAB53204.1.
  AF076186 mRNA. Translation: AAC32268.1.
  AF076187 mRNA. Translation: AAC32269.1.
  CH466545 Genomic DNA. Translation: EDL29374.1.
  BC111467 mRNA. Translation: AAI11468.1.
CCDS: CCDS37091.1.
RefSeq: NP_033401.2. NM_009375.2.
UniGene: Mm.441333.

3D structure databases

ModBase: Search...
MobiDB: Search...

Protein family/group databases

MEROPS: S09.978.

PTM databases

PhosphoSite: O08710.

Proteomic databases

PRIDE: O08710.

Protocols and materials databases

StructuralBiologyKnowledgebase: Search...

Genome annotation databases

Ensembl: ENSMUST00000065916; ENSMUSP00000070239; ENSMUSG00000053469.
GeneID: 21819.
KEGG: mmu:21819.
UCSC: uc007wap.1. mouse.

Organism-specific databases

CTD: 7038.
MGI: MGI:98733. Tg.

Phylogenomic databases

eggNOG: COG2272.
GeneTree: ENSGT00680000100015.
HOGENOM: HOG000128427.
HOVERGEN: HBG017929.
InParanoid: Q2NKY1.
KO: K10809.
OMA: LRSCWCV.
OrthoDB: EOG77M8MP.
TreeFam: TF351833.

Gene expression databases

ArrayExpress: O08710.
Bgee: O08710.
CleanEx: MM_TG.
Genevestigator: O08710.

Family and domain databases

Gene3D:
  3.40.50.1820. 1 hit.
  4.10.800.10. 13 hits.
InterPro:
  IPR029058. AB_hydrolase.
  IPR002018. CarbesteraseB.
  IPR019819. Carboxylesterase_B_CS.
  IPR016324. Thyroglobulin.
  IPR000716. Thyroglobulin_1.
  IPR011641. Tyr-kin_ephrin_A/B_rcpt-like.
Pfam:
  PF00135. COesterase. 1 hit.
  PF07699. GCC2_GCC3. 1 hit.
  PF00086. Thyroglobulin_1. 7 hits.
PIRSF: PIRSF001831. Thyroglobulin. 1 hit.
SMART: SM00211. TY. 10 hits.
SUPFAM:
  SSF53474. SSF53474. 1 hit.
  SSF57610. SSF57610. 13 hits.
PROSITE:
  PS00941. CARBOXYLESTERASE_B_2. 1 hit.
  PS00484. THYROGLOBULIN_1_1. 9 hits.
  PS51162. THYROGLOBULIN_1_2. 11 hits.
ProtoNet: Search...

Other

ChiTaRS: TG. mouse.
NextBio: 301232.
PRO: O08710.
SOURCE: Search...

Entry information

Entry name: THYG_MOUSE
Accession
  Primary (citable) accession number: O08710
  Secondary accession number(s): O88590, Q2NKY1, Q9QWY7
Entry history
  Integrated into UniProtKB/Swiss-Prot: November 1, 1997
  Last sequence update: July 27, 2011
  Last modified: July 9, 2014
  This is version 126 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Relevant documents

SIMILARITY comments

Index of protein domains and families

MGD cross-references

Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot