Q96NM4 (TOX2_HUMAN) Reviewed, UniProtKB/Swiss-Prot

Last modified July 9, 2014. Version 118.

Names and origin

Protein names
Recommended name: TOX high mobility group box family member 2
Alternative name(s): Granulosa cell HMG box protein 1 (short name: GCX-1)
Gene names
Name: TOX2
Synonyms: C20orf100, GCX1
Organism: Homo sapiens (Human) [Reference proteome]
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo

Protein attributes

Sequence length: 488 AA.
Sequence status: Complete.
Protein existence: Evidence at transcript level.

General annotation (Comments)

Function

Putative transcriptional activator involved in the hypothalamo-pituitary-gonadal system.

Subcellular location

Nucleus (by similarity).

Sequence similarities

Contains 1 HMG box DNA-binding domain.

Caution

It is uncertain whether Met-1 or Met-52 is the initiator.

Alternative products

This entry describes 4 isoforms produced by alternative splicing.
Isoform 1 (identifier: Q96NM4-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Isoform 2 (identifier: Q96NM4-2)

The sequence of this isoform differs from the canonical sequence as follows:
     302-302: Q → QAYKRKTEAAKKEYLKALAAYRASLVSK
Note: No experimental confirmation available.
Isoform 3 (identifier: Q96NM4-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1-51: Missing.
     302-302: Q → QAYKRKTEAAKKEYLKALAAYRASLVSK
Note: No experimental confirmation available.
Isoform 4 (identifier: Q96NM4-4)

The sequence of this isoform differs from the canonical sequence as follows:
     1-41: MQQTRTEAVAGAFSRCLGFCGMRLGLLLLARHWCIAGVFPQ → MDVRLYPSAPAVGARPGAEPAGLAHLDYYHGG
     302-302: Q → QAYKRKTEAAKKEYLKALAAYRASLVSK
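
The isoform descriptions above are edit instructions against the canonical sequence, so the isoform sequences can be rebuilt mechanically. The following is a minimal sketch, assuming the canonical sequence is the isoform 1 sequence shown in the Sequences section of this entry; the names `apply_edits` and `ISOFORM_EDITS` are illustrative and not part of any UniProt tool.

```python
# Canonical isoform 1 (Q96NM4-1), copied from the Sequences section of this entry.
# Whitespace between the 10-residue groups is removed by split()/join().
CANONICAL = "".join("""
MQQTRTEAVA GAFSRCLGFC GMRLGLLLLA RHWCIAGVFP QKFDGDSAYV GMSDGNPELL
STSQTYNGQS ENNEDYEIPP ITPPNLPEPS LLHLGDHEAS YHSLCHGLTP NGLLPAYSYQ
AMDLPAIMVS NMLAQDSHLL SGQLPTIQEM VHSEVAAYDS GRPGPLLGRP AMLASHMSAL
SQSQLISQMG IRSSIAHSSP SPPGSKSATP SPSSSTQEEE SEVHFKISGE KRPSADPGKK
AKNPKKKKKK DPNEPQKPVS AYALFFRDTQ AAIKGQNPSA TFGDVSKIVA SMWDSLGEEQ
KQSSPDQGET KSTQANPPAK MLPPKQPMYA MPGLASFLTP SDLQAFRSGA SPASLARTLG
SKSLLPGLSA SPPPPPSFPL SPTLHQQLSL PPHAQGALLS PPVSMSPAPQ PPVLPTPMAL
QVQLAMSPSP PGPQDFPHIS EFPSSSGSCS PGPSNPTSSG DWDSSYPSGE CGISTCSLLP
RDKSLYLT
""".split())


def apply_edits(seq, edits):
    """Apply (start, end, replacement) edits; coordinates are 1-based, inclusive."""
    # Work from the C-terminal end so earlier canonical coordinates stay valid.
    for start, end, replacement in sorted(edits, reverse=True):
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq


# Edits transcribed from the isoform descriptions in this entry.
INSERT_302 = (302, 302, "QAYKRKTEAAKKEYLKALAAYRASLVSK")
ISOFORM_EDITS = {
    "Q96NM4-2": [INSERT_302],
    "Q96NM4-3": [(1, 51, ""), INSERT_302],
    "Q96NM4-4": [(1, 41, "MDVRLYPSAPAVGARPGAEPAGLAHLDYYHGG"), INSERT_302],
}

isoforms = {acc: apply_edits(CANONICAL, edits) for acc, edits in ISOFORM_EDITS.items()}

for acc, seq in isoforms.items():
    print(acc, len(seq))
```

The resulting lengths (515, 464 and 506 residues) agree with the lengths listed for isoforms 2, 3 and 4 in the Sequences section. Isoform 3 starts at Met-52 of the canonical sequence.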

Sequence annotation (Features)

Feature key | Position(s) | Length | Description | Feature identifier

Molecule processing

Chain | 1–488 | 488 | TOX high mobility group box family member 2 | PRO_0000048571

Regions

DNA binding | 255–323 | 69 | HMG box |
Region | 76–114 | 39 | Required for transcriptional activation (by similarity) |
Motif | 223–252 | 30 | Nuclear localization signal (by similarity) |
Compositional bias | 245–250 | 6 | Poly-Lys |
Compositional bias | 372–456 | 85 | Pro-rich |

Natural variations

Alternative sequence | 1–51 | 51 | Missing in isoform 3. | VSP_045645
Alternative sequence | 1–41 | 41 | MQQTR…GVFPQ → MDVRLYPSAPAVGARPGAEPAGLAHLDYYHGG in isoform 4. | VSP_047108
Alternative sequence | 302 | 1 | Q → QAYKRKTEAAKKEYLKALAAYRASLVSK in isoform 2, isoform 3 and isoform 4. | VSP_002187
Natural variant | 223 | 1 | V → A. Corresponds to variant rs6103584 (dbSNP, Ensembl). | VAR_049560

Experimental info

Sequence conflict | 372 | 1 | P → PP in BAF82595 (Ref. 1). |
Sequence conflict | 482 | 1 | D → N in BAB70860 (Ref. 1). |
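
Feature positions in this entry are 1-based and inclusive, which is easy to get wrong when slicing with 0-based, end-exclusive string indices. The sketch below shows one way to extract the annotated regions and check the residues named in the variant and conflict rows; the helper `feature_seq` and the `FEATURES` list are illustrative names, and the sequence is the canonical isoform 1 from the Sequences section.

```python
# Canonical isoform 1 (Q96NM4-1), copied from the Sequences section of this entry.
CANONICAL = "".join("""
MQQTRTEAVA GAFSRCLGFC GMRLGLLLLA RHWCIAGVFP QKFDGDSAYV GMSDGNPELL
STSQTYNGQS ENNEDYEIPP ITPPNLPEPS LLHLGDHEAS YHSLCHGLTP NGLLPAYSYQ
AMDLPAIMVS NMLAQDSHLL SGQLPTIQEM VHSEVAAYDS GRPGPLLGRP AMLASHMSAL
SQSQLISQMG IRSSIAHSSP SPPGSKSATP SPSSSTQEEE SEVHFKISGE KRPSADPGKK
AKNPKKKKKK DPNEPQKPVS AYALFFRDTQ AAIKGQNPSA TFGDVSKIVA SMWDSLGEEQ
KQSSPDQGET KSTQANPPAK MLPPKQPMYA MPGLASFLTP SDLQAFRSGA SPASLARTLG
SKSLLPGLSA SPPPPPSFPL SPTLHQQLSL PPHAQGALLS PPVSMSPAPQ PPVLPTPMAL
QVQLAMSPSP PGPQDFPHIS EFPSSSGSCS PGPSNPTSSG DWDSSYPSGE CGISTCSLLP
RDKSLYLT
""".split())

# (feature key, start, end, description), transcribed from the Regions rows.
FEATURES = [
    ("Region", 76, 114, "Required for transcriptional activation"),
    ("Motif", 223, 252, "Nuclear localization signal"),
    ("Compositional bias", 245, 250, "Poly-Lys"),
    ("DNA binding", 255, 323, "HMG box"),
    ("Compositional bias", 372, 456, "Pro-rich"),
]


def feature_seq(seq, start, end):
    """Return residues start..end, where both bounds are 1-based and inclusive."""
    return seq[start - 1:end]


lengths = {desc: len(feature_seq(CANONICAL, s, e)) for _, s, e, desc in FEATURES}
poly_lys = feature_seq(CANONICAL, 245, 250)

# Residues referenced by the natural variant and sequence conflict rows.
residue_223 = feature_seq(CANONICAL, 223, 223)  # V in V -> A (rs6103584)
residue_372 = feature_seq(CANONICAL, 372, 372)  # P in P -> PP
residue_482 = feature_seq(CANONICAL, 482, 482)  # D in D -> N

# Sequence carrying the V223A natural variant.
variant_v223a = CANONICAL[:222] + "A" + CANONICAL[223:]

for desc, n in lengths.items():
    print(f"{desc}: {n} residues")
print("Poly-Lys stretch:", poly_lys)
```

The computed lengths match the Length column, and the Poly-Lys slice is six consecutive lysines, which is a quick sanity check that the coordinates were interpreted correctly.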

Sequences

Isoform 1 [UniParc].

Last modified October 19, 2002. Version 2.
Checksum: 687FD144CF30731A
Length: 488 AA. Mass: 51,604 Da.
        10         20         30         40         50         60 
MQQTRTEAVA GAFSRCLGFC GMRLGLLLLA RHWCIAGVFP QKFDGDSAYV GMSDGNPELL 

        70         80         90        100        110        120 
STSQTYNGQS ENNEDYEIPP ITPPNLPEPS LLHLGDHEAS YHSLCHGLTP NGLLPAYSYQ 

       130        140        150        160        170        180 
AMDLPAIMVS NMLAQDSHLL SGQLPTIQEM VHSEVAAYDS GRPGPLLGRP AMLASHMSAL 

       190        200        210        220        230        240 
SQSQLISQMG IRSSIAHSSP SPPGSKSATP SPSSSTQEEE SEVHFKISGE KRPSADPGKK 

       250        260        270        280        290        300 
AKNPKKKKKK DPNEPQKPVS AYALFFRDTQ AAIKGQNPSA TFGDVSKIVA SMWDSLGEEQ 

       310        320        330        340        350        360 
KQSSPDQGET KSTQANPPAK MLPPKQPMYA MPGLASFLTP SDLQAFRSGA SPASLARTLG 

       370        380        390        400        410        420 
SKSLLPGLSA SPPPPPSFPL SPTLHQQLSL PPHAQGALLS PPVSMSPAPQ PPVLPTPMAL 

       430        440        450        460        470        480 
QVQLAMSPSP PGPQDFPHIS EFPSSSGSCS PGPSNPTSSG DWDSSYPSGE CGISTCSLLP 


RDKSLYLT 
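
The sequence display above interleaves position rulers with 10-residue groups, so it has to be cleaned before it can be used as a plain sequence string. A minimal sketch of that clean-up follows; `parse_numbered_block` is a hypothetical helper name, and the approach simply assumes that ruler lines contain only digits.

```python
RAW = """
        10         20         30         40         50         60
MQQTRTEAVA GAFSRCLGFC GMRLGLLLLA RHWCIAGVFP QKFDGDSAYV GMSDGNPELL

        70         80         90        100        110        120
STSQTYNGQS ENNEDYEIPP ITPPNLPEPS LLHLGDHEAS YHSLCHGLTP NGLLPAYSYQ

       130        140        150        160        170        180
AMDLPAIMVS NMLAQDSHLL SGQLPTIQEM VHSEVAAYDS GRPGPLLGRP AMLASHMSAL

       190        200        210        220        230        240
SQSQLISQMG IRSSIAHSSP SPPGSKSATP SPSSSTQEEE SEVHFKISGE KRPSADPGKK

       250        260        270        280        290        300
AKNPKKKKKK DPNEPQKPVS AYALFFRDTQ AAIKGQNPSA TFGDVSKIVA SMWDSLGEEQ

       310        320        330        340        350        360
KQSSPDQGET KSTQANPPAK MLPPKQPMYA MPGLASFLTP SDLQAFRSGA SPASLARTLG

       370        380        390        400        410        420
SKSLLPGLSA SPPPPPSFPL SPTLHQQLSL PPHAQGALLS PPVSMSPAPQ PPVLPTPMAL

       430        440        450        460        470        480
QVQLAMSPSP PGPQDFPHIS EFPSSSGSCS PGPSNPTSSG DWDSSYPSGE CGISTCSLLP


RDKSLYLT
"""


def parse_numbered_block(raw):
    """Join residue groups, skipping blank lines and digit-only ruler lines."""
    residues = []
    for line in raw.splitlines():
        tokens = line.split()
        if not tokens or all(token.isdigit() for token in tokens):
            continue  # blank line or position ruler
        residues.extend(tokens)
    return "".join(residues)


sequence = parse_numbered_block(RAW)
print(len(sequence))
```

The parsed length can be compared with the 488 AA stated for isoform 1 as a check that no group was lost.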

Isoform 2 [UniParc].

Checksum: 5B9ED0A9228B1449
Length: 515 AA. Mass: 54,645 Da.

Isoform 3 [UniParc].

Checksum: 07E22E3F3E8D782A
Length: 464 AA. Mass: 49,112 Da.

Isoform 4 [UniParc].

Checksum: E5B3941DAA4E8536
Length: 506 AA. Mass: 53,444 Da.

References

[1] "Complete sequencing and characterization of 21,243 full-length human cDNAs."
Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
Nat. Genet. 36:40-45(2004)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1 AND 3).
Tissue: Brain and Corpus callosum.
[2] "The DNA sequence and comparative analysis of human chromosome 20."
Deloukas P., Matthews L.H., Ashurst J.L., Burton J., Gilbert J.G.R., Jones M., Stavrides G., Almeida J.P., Babbage A.K., Bagguley C.L., Bailey J., Barlow K.F., Bates K.N., Beard L.M., Beare D.M., Beasley O.P., Bird C.P., Blakey S.E., Bridgeman A.M., Brown A.J., Buck D., Burrill W.D., Butler A.P., Carder C., Carter N.P., Chapman J.C., Clamp M., Clark G., Clark L.N., Clark S.Y., Clee C.M., Clegg S., Cobley V.E., Collier R.E., Connor R.E., Corby N.R., Coulson A., Coville G.J., Deadman R., Dhami P.D., Dunn M., Ellington A.G., Frankland J.A., Fraser A., French L., Garner P., Grafham D.V., Griffiths C., Griffiths M.N.D., Gwilliam R., Hall R.E., Hammond S., Harley J.L., Heath P.D., Ho S., Holden J.L., Howden P.J., Huckle E., Hunt A.R., Hunt S.E., Jekosch K., Johnson C.M., Johnson D., Kay M.P., Kimberley A.M., King A., Knights A., Laird G.K., Lawlor S., Lehvaeslaiho M.H., Leversha M.A., Lloyd C., Lloyd D.M., Lovell J.D., Marsh V.L., Martin S.L., McConnachie L.J., McLay K., McMurray A.A., Milne S.A., Mistry D., Moore M.J.F., Mullikin J.C., Nickerson T., Oliver K., Parker A., Patel R., Pearce T.A.V., Peck A.I., Phillimore B.J.C.T., Prathalingam S.R., Plumb R.W., Ramsay H., Rice C.M., Ross M.T., Scott C.E., Sehra H.K., Shownkeen R., Sims S., Skuce C.D., Smith M.L., Soderlund C., Steward C.A., Sulston J.E., Swann R.M., Sycamore N., Taylor R., Tee L., Thomas D.W., Thorpe A., Tracey A., Tromans A.C., Vaudin M., Wall M., Wallis J.M., Whitehead S.L., Whittaker P., Willey D.L., Williams L., Williams S.A., Wilming L., Wray P.W., Hubbard T., Durbin R.M., Bentley D.R., Beck S., Rogers J.
Nature 414:865-871(2001)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
[3] Mural R.J., Istrail S., Sutton G., Florea L., Halpern A.L., Mobarry C.M., Lippert R., Walenz B., Shatkay H., Dew I., Miller J.R., Flanigan M.J., Edwards N.J., Bolanos R., Fasulo D., Halldorsson B.V., Hannenhalli S., Turner R., Yooseph S., Lu F., Nusskern D.R., Shue B.C., Zheng X.H., Zhong F., Delcher A.L., Huson D.H., Kravitz S.A., Mouchard L., Reinert K., Remington K.A., Clark A.G., Waterman M.S., Eichler E.E., Adams M.D., Hunkapiller M.W., Myers E.W., Venter J.C.
Submitted (SEP-2005) to the EMBL/GenBank/DDBJ databases
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
[4] "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
The MGC Project Team
Genome Res. 14:2121-2127(2004)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).
Tissue: Muscle.

Cross-references

Sequence databases

EMBL / GenBank / DDBJ:
AK055135 mRNA. Translation: BAB70860.1.
AK289906 mRNA. Translation: BAF82595.1.
AL121587, AL034419 Genomic DNA. Translation: CAI21559.1.
AL121587, AL034419 Genomic DNA. Translation: CAI21560.1.
AL121587, AL034419 Genomic DNA. Translation: CAI21561.1.
AL034419, AL121587 Genomic DNA. Translation: CAI42198.1.
AL034419, AL121587 Genomic DNA. Translation: CAI42200.1.
AL034419, AL121587 Genomic DNA. Translation: CAI42201.1.
AL035089 Genomic DNA. No translation available.
AL353797 Genomic DNA. No translation available.
CH471077 Genomic DNA. Translation: EAW75944.1.
CH471077 Genomic DNA. Translation: EAW75945.1.
CH471077 Genomic DNA. Translation: EAW75946.1.
BC007636 mRNA. No translation available.
CCDS:
CCDS13324.1. [Q96NM4-3]
CCDS42875.1. [Q96NM4-1]
CCDS46603.1. [Q96NM4-4]
RefSeq:
NP_001092266.1. NM_001098796.1. [Q96NM4-3]
NP_001092267.1. NM_001098797.1. [Q96NM4-4]
NP_001092268.1. NM_001098798.1. [Q96NM4-1]
NP_116272.1. NM_032883.2. [Q96NM4-3]
XP_006723947.1. XM_006723884.1. [Q96NM4-2]
UniGene: Hs.26608.

3D structure databases

ProteinModelPortal: Q96NM4.
SMR: Q96NM4. Positions 251-302.

Protein-protein interaction databases

BioGrid: 124399. 3 interactions.
STRING: 9606.ENSP00000344724.

PTM databases

PhosphoSite: Q96NM4.

Polymorphism databases

DMDM: 24211591.

Proteomic databases

MaxQB: Q96NM4.
PaxDb: Q96NM4.
PRIDE: Q96NM4.

Protocols and materials databases

DNASU: 84969.

Genome annotation databases

Ensembl:
ENST00000341197; ENSP00000344724; ENSG00000124191. [Q96NM4-4]
ENST00000358131; ENSP00000350849; ENSG00000124191. [Q96NM4-1]
ENST00000372999; ENSP00000362090; ENSG00000124191. [Q96NM4-3]
ENST00000423191; ENSP00000390278; ENSG00000124191. [Q96NM4-3]
GeneID: 84969.
KEGG: hsa:84969.
UCSC:
uc002xlf.4. human. [Q96NM4-1]
uc010ggo.3. human.

Organism-specific databases

CTD: 84969.
GeneCards: GC20P042543.
HGNC: HGNC:16095. TOX2.
HPA: HPA049900.
MIM: 611163. gene.
neXtProt: NX_Q96NM4.
PharmGKB: PA162406727.

Phylogenomic databases

eggNOG: NOG291143.
HOGENOM: HOG000230949.
HOVERGEN: HBG051183.
OMA: DHEASYH.
OrthoDB: EOG7R834J.
PhylomeDB: Q96NM4.
TreeFam: TF106481.

Gene expression databases

ArrayExpress: Q96NM4.
Bgee: Q96NM4.
CleanEx: HS_TOX2.
Genevestigator: Q96NM4.

Family and domain databases

Gene3D: 1.10.30.10. 1 hit.
InterPro: IPR009071. HMG_box_dom.
Pfam: PF00505. HMG_box. 1 hit.
SMART: SM00398. HMG. 1 hit.
SUPFAM: SSF47095. SSF47095. 1 hit.
PROSITE: PS50118. HMG_BOX_2. 1 hit.

Other

GenomeRNAi: 84969.
NextBio: 35463834.
PRO: Q96NM4.

Entry information

Entry name: TOX2_HUMAN
Accession
Primary (citable) accession number: Q96NM4
Secondary accession number(s): A8K1J1, E1P5X0, G3XAC7, Q5TE33, Q5TE34, Q5TE35, Q96IC9, Q9BQN5
Entry history
Integrated into UniProtKB/Swiss-Prot: October 19, 2002
Last sequence update: October 19, 2002
Last modified: July 9, 2014
This is version 118 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Relevant documents

SIMILARITY comments

Index of protein domains and families

MIM cross-references

Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot

Human polymorphisms and disease mutations

Index of human polymorphisms and disease mutations

Human entries with polymorphisms or disease mutations

List of human entries with polymorphisms or disease mutations

Human chromosome 20

Human chromosome 20: entries, gene names and cross-references to MIM