P35712 (SOX6_HUMAN) Reviewed, UniProtKB/Swiss-Prot

Last modified April 16, 2014. Version 132.


Names and origin

Protein names: Recommended name: Transcription factor SOX-6
Gene names: Name: SOX6
Organism: Homo sapiens (Human) [Reference proteome]
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo

Protein attributes

Sequence length: 828 AA.
Sequence status: Complete.
Protein existence: Evidence at protein level

General annotation (Comments)

Function

Transcriptional activator. Binds specifically to the DNA sequence 5'-AACAAT-3'. Plays a key role in several developmental processes, including neurogenesis and skeleton formation.
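
As a rough illustration of what a search for the 5'-AACAAT-3' recognition sequence could look like, the sketch below scans a DNA string on both strands for exact matches. The function names and the example DNA string are hypothetical and not part of this entry, and exact string matching is a simplifying assumption that says nothing about binding affinity in vivo.

```python
MOTIF = "AACAAT"  # recognition sequence quoted in the Function comment


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif_sites(dna: str, motif: str = MOTIF):
    """Return (0-based start, strand) for every exact match on either strand."""
    dna = dna.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for start in range(len(dna) - len(motif) + 1):
        window = dna[start:start + len(motif)]
        if window == motif:
            hits.append((start, "+"))
        # A match to the reverse complement means the motif lies on the other strand.
        if window == rc_motif and rc_motif != motif:
            hits.append((start, "-"))
    return hits


# Hypothetical test sequence with one site on each strand.
example_dna = "GGAACAATCCATTGTTGG"
print(find_motif_sites(example_dna))
```

Coordinates are 0-based starts on the given strand; a minus-strand hit is reported at the position where the reverse complement (ATTGTT) begins.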

Subunit structure

Interacts with DAZAP2 (by similarity).

Subcellular location

Nucleus. Ref.7

Tissue specificity

Expressed in a wide variety of tissues, most abundantly in skeletal muscle. Ref.1

Post-translational modification

Sumoylation inhibits the transcriptional activity. Ref.7

Sequence similarities

Contains 1 HMG box DNA-binding domain.

Sequence caution

The sequence BC037866 differs from that shown. Reason: Frameshift at position 505.

Ontologies

Keywords
   Biological process: Transcription, Transcription regulation
   Cellular component: Nucleus
   Coding sequence diversity: Alternative splicing
   Domain: Coiled coil
   Ligand: DNA-binding
   Molecular function: Activator, Developmental protein
   PTM: Isopeptide bond, Ubl conjugation
   Technical term: Complete proteome, Reference proteome
Gene Ontology (GO)
   Biological_process

astrocyte differentiation

Inferred from electronic annotation. Source: Ensembl

cardiocyte differentiation

Inferred from electronic annotation. Source: Ensembl

cartilage development

Inferred from electronic annotation. Source: Ensembl

cell morphogenesis

Inferred from electronic annotation. Source: Ensembl

cellular response to transforming growth factor beta stimulus

Inferred from direct assay PubMed 21401405. Source: UniProtKB

erythrocyte development

Inferred from electronic annotation. Source: Ensembl

gene silencing

Inferred from electronic annotation. Source: Ensembl

in utero embryonic development

Inferred from electronic annotation. Source: Ensembl

muscle cell differentiation

Inferred from electronic annotation. Source: Ensembl

muscle organ development

Non-traceable author statement Ref.1. Source: UniProtKB

negative regulation of transcription from RNA polymerase II promoter

Inferred from electronic annotation. Source: Ensembl

oligodendrocyte cell fate specification

Inferred from electronic annotation. Source: Ensembl

positive regulation of cartilage development

Inferred from direct assay PubMed 21401405. Source: UniProtKB

positive regulation of chondrocyte differentiation

Inferred from direct assay PubMed 21401405. Source: UniProtKB

positive regulation of mesenchymal stem cell differentiation

Inferred from direct assay PubMed 21401405. Source: UniProtKB

positive regulation of transcription from RNA polymerase II promoter

Inferred from electronic annotation. Source: Ensembl

post-embryonic development

Inferred from electronic annotation. Source: Ensembl

regulation of transcription, DNA-templated

Non-traceable author statement Ref.6. Source: UniProtKB

transcription, DNA-templated

Inferred from electronic annotation. Source: UniProtKB-KW

   Cellular_component

nucleus

Non-traceable author statement Ref.6. Source: UniProtKB

   Molecular_function

DNA binding

Non-traceable author statement Ref.6. Source: UniProtKB

sequence-specific DNA binding

Inferred from electronic annotation. Source: Ensembl

sequence-specific DNA binding transcription factor activity

Non-traceable author statement Ref.6. Source: UniProtKB

transcription regulatory region DNA binding

Inferred from electronic annotation. Source: Ensembl


Binary interactions

With    Entry     #Exp.   IntAct                     Notes
SHOX    O15266    3       EBI-3505706,EBI-3505698

Alternative products

This entry describes 4 isoforms produced by alternative splicing.
Isoform 1 (identifier: P35712-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Isoform 2 (identifier: P35712-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-1: M → MGRM
     327-367: Missing.
     477-477: S → SLGKWKSQHQEETYE
Isoform 3 (identifier: P35712-3)

The sequence of this isoform differs from the canonical sequence as follows:
     578-597: Missing.
Isoform 4 (identifier: P35712-4)

The sequence of this isoform differs from the canonical sequence as follows:
     327-367: Missing.
     477-477: S → SLGKWKSQHQEETYE
Note: No experimental confirmation available.
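
The isoform differences listed above determine each isoform's length relative to the 828-residue canonical sequence. The sketch below works through that arithmetic; the data structure and function name are illustrative assumptions, while the positions and replacement strings are copied from the isoform descriptions.

```python
CANONICAL_LENGTH = 828  # length of isoform 1 (P35712-1)

# Each edit is (start, end, replacement), with 1-based inclusive positions on the
# canonical sequence. An empty replacement stands for "Missing".
ISOFORM_EDITS = {
    "P35712-2": [(1, 1, "MGRM"), (327, 367, ""), (477, 477, "SLGKWKSQHQEETYE")],
    "P35712-3": [(578, 597, "")],
    "P35712-4": [(327, 367, ""), (477, 477, "SLGKWKSQHQEETYE")],
}


def isoform_length(canonical_length: int, edits) -> int:
    """Apply each edit's net change (inserted minus removed residues)."""
    length = canonical_length
    for start, end, replacement in edits:
        removed = end - start + 1
        length += len(replacement) - removed
    return length


for isoform_id, edits in ISOFORM_EDITS.items():
    print(isoform_id, isoform_length(CANONICAL_LENGTH, edits))
```

For isoform 2 this gives 828 + 3 - 41 + 14 = 804 residues; isoforms 3 and 4 come out at 808 and 801, matching the lengths reported in the Sequences section.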

Sequence annotation (Features)

Feature key           Position(s)  Length  Description                                       Feature identifier

Molecule processing

Chain                 1 – 828      828     Transcription factor SOX-6                        PRO_0000048729

Regions

DNA binding           621 – 689    69      HMG box
Coiled coil           184 – 262    79      Potential
Compositional bias    219 – 261    43      Gln-rich
Compositional bias    280 – 285    6       Poly-Ala
Compositional bias    313 – 317    5       Poly-Ala

Amino acid modifications

Cross-link            404                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO) Ref.7
Cross-link            417                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO) Ref.7

Natural variations

Alternative sequence  1            1       M → MGRM in isoform 2.                            VSP_039693
Alternative sequence  327 – 367    41      Missing in isoform 2 and isoform 4.               VSP_039694
Alternative sequence  477          1       S → SLGKWKSQHQEETYE in isoform 2 and isoform 4.   VSP_039695
Alternative sequence  578 – 597    20      Missing in isoform 3.                             VSP_039696

Experimental info

Mutagenesis           404          1       K → R: Partial loss of sumoylation. Complete loss of sumoylation; when associated with R-417. Ref.7
Mutagenesis           417          1       K → R: Partial loss of sumoylation. Complete loss of sumoylation; when associated with R-404. Ref.7
Sequence conflict     330          1       V → A in AAK26115. Ref.1
Sequence conflict     633          1       K → R in CAA46614. Ref.6

Sequences

Isoform 1 [UniParc].

Last modified November 25, 2008. Version 3.
Checksum: 38CA781528C839CF
Length: 828 AA. Mass: 91,921 Da.

        10         20         30         40         50         60 
MSSKQATSPF ACAADGEDAM TQDLTSREKE EGSDQHVASH LPLHPIMHNK PHSEELPTLV 

        70         80         90        100        110        120 
STIQQDADWD SVLSSQQRME SENNKLCSLY SFRNTSTSPH KPDEGSRDRE IMTSVTFGTP 

       130        140        150        160        170        180 
ERRKGSLADV VDTLKQKKLE EMTRTEQEDS SCMEKLLSKD WKEKMERLNT SELLGEIKGT 

       190        200        210        220        230        240 
PESLAEKERQ LSTMITQLIS LREQLLAAHD EQKKLAASQI EKQRQQMDLA RQQQEQIARQ 

       250        260        270        280        290        300 
QQQLLQQQHK INLLQQQIQV QGHMPPLMIP IFPHDQRTLA AAAAAQQGFL FPPGITYKPG 

       310        320        330        340        350        360 
DNYPVQFIPS TMAAAAASGL SPLQLQKGHV SHPQINQRLK GLSDRFGRNL DTFEHGGGHS 

       370        380        390        400        410        420 
YNHKQIEQLY AAQLASMQVS PGAKMPSTPQ PPNTAGTVSP TGIKNEKRGT SPVTQVKDEA 

       430        440        450        460        470        480 
AAQPLNLSSR PKTAEPVKSP TSPTQNLFPA SKTSPVNLPN KSSIPSPIGG SLGRGSSLDI 

       490        500        510        520        530        540 
LSSLNSPALF GDQDTVMKAI QEARKMREQI QREQQQQQPH GVDGKLSSIN NMGLNSCRNE 

       550        560        570        580        590        600 
KERTRFENLG PQLTGKSNED GKLGPGVIDL TRPEDAEGSK AMNGSAAKLQ QYYCWPTGGA 

       610        620        630        640        650        660 
TVAEARVYRD ARGRASSEPH IKRPMNAFMV WAKDERRKIL QAFPDMHNSN ISKILGSRWK 

       670        680        690        700        710        720 
SMSNQEKQPY YEEQARLSKI HLEKYPNYKY KPRPKRTCIV DGKKLRIGEY KQLMRSRRQE 

       730        740        750        760        770        780 
MRQFFTVGQQ PQIPITTGTG VVYPGAITMA TTTPSPQMTS DCSSTSASPE PSLPVIQSTY 

       790        800        810        820 
GMKTDGGSLA GNEMINGEDE MEMYDDYEDD PKSDYSSENE APEAVSAN 
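
The sequence above is laid out in rows of six 10-residue blocks under a position ruler, so it has to be flattened before positional features can be looked up. The sketch below shows one possible way to do that, using only the row covering positions 361 to 420; the helper names are hypothetical, and the check confirms that the sumoylation sites at 404 and 417 are lysines.

```python
def parse_sequence_blocks(text: str) -> str:
    """Join the residue rows of a ruler-formatted block into one string."""
    residues = []
    for line in text.splitlines():
        compact = line.replace(" ", "")
        # Ruler rows hold only position numbers; skip them and blank lines.
        if not compact or compact.isdigit():
            continue
        residues.append(compact)
    return "".join(residues)


def residue_at(sequence: str, position: int, first_position: int = 1) -> str:
    """Return the residue at a 1-based position; first_position is the
    position of sequence[0] when only a fragment is supplied."""
    return sequence[position - first_position]


# Row copied from the canonical sequence (positions 361-420).
FRAGMENT = """
       370        380        390        400        410        420
YNHKQIEQLY AAQLASMQVS PGAKMPSTPQ PPNTAGTVSP TGIKNEKRGT SPVTQVKDEA
"""

fragment = parse_sequence_blocks(FRAGMENT)
print(len(fragment))
print(residue_at(fragment, 404, first_position=361))
print(residue_at(fragment, 417, first_position=361))
```

With the full 828-residue sequence pasted in, `first_position` can be left at its default of 1.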

Isoform 2 [UniParc].

Checksum: 9280E8989FDCFFEE
Length: 804 AA. Mass: 89,332 Da.

Isoform 3 [UniParc].

Checksum: 93969BB5B2C71036
Length: 808 AA. Mass: 89,735 Da.

Isoform 4 [UniParc].

Checksum: A9B101C9C38D0D84
Length: 801 AA. Mass: 88,988 Da.

References

[1]"Cloning, characterization and chromosome mapping of the human SOX6 gene."
Cohen-Barak O., Hagiwara N., Arlt M.F., Horton J.P., Brilliant M.H.
Gene 265:157-164(2001) [PubMed] [Europe PMC] [Abstract]
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA / MRNA] (ISOFORM 3), ALTERNATIVE SPLICING, TISSUE SPECIFICITY.
Tissue: Lymphocyte and Myoblast.
[2]"Towards a catalog of human genes and proteins: sequencing and analysis of 500 novel complete protein coding human cDNAs."
Wiemann S., Weil B., Wellenreuther R., Gassenhuber J., Glassl S., Ansorge W., Boecher M., Bloecker H., Bauersachs S., Blum H., Lauber J., Duesterhoeft A., Beyer A., Koehrer K., Strack N., Mewes H.-W., Ottenwaelder B., Obermaier B., Tampe J., Heubner D., Wambutt R., Korn B., Klein M., Poustka A.
Genome Res. 11:422-435(2001) [PubMed] [Europe PMC] [Abstract]
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).
Tissue: Testis.
[3]"Human chromosome 11 DNA sequence and analysis including novel gene identification."
Taylor T.D., Noguchi H., Totoki Y., Toyoda A., Kuroki Y., Dewar K., Lloyd C., Itoh T., Takeda T., Kim D.-W., She X., Barlow K.F., Bloom T., Bruford E., Chang J.L., Cuomo C.A., Eichler E., FitzGerald M.G., Jaffe D.B., LaButti K., Nicol R., Park H.-S., Seaman C., Sougnez C., Yang X., Zimmer A.R., Zody M.C., Birren B.W., Nusbaum C., Fujiyama A., Hattori M., Rogers J., Lander E.S., Sakaki Y.
Nature 440:497-500(2006) [PubMed] [Europe PMC] [Abstract]
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
[4]Mural R.J., Istrail S., Sutton G.G., Florea L., Halpern A.L., Mobarry C.M., Lippert R., Walenz B., Shatkay H., Dew I., Miller J.R., Flanigan M.J., Edwards N.J., Bolanos R., Fasulo D., Halldorsson B.V., Hannenhalli S., Turner R., Yooseph S., Lu F., Nusskern D.R., Shue B.C., Zheng X.H., Zhong F., Delcher A.L., Huson D.H., Kravitz S.A., Mouchard L., Reinert K., Remington K.A., Clark A.G., Waterman M.S., Eichler E.E., Adams M.D., Hunkapiller M.W., Myers E.W., Venter J.C.
Submitted (SEP-2005) to the EMBL/GenBank/DDBJ databases
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
[5]"The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
The MGC Project Team
Genome Res. 14:2121-2127(2004) [PubMed] [Europe PMC] [Abstract]
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1 AND 4).
Tissue: Testis.
[6]"A conserved family of genes related to the testis determining gene, SRY."
Denny P., Swift S., Brand N., Dabhade N., Barton P., Ashworth A.
Nucleic Acids Res. 20:2887-2887(1992) [PubMed] [Europe PMC] [Abstract]
Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 632-685 (ISOFORMS 1/2/3).
[7]"Repression of SOX6 transcriptional activity by SUMO modification."
Fernandez-Lloris R., Osses N., Jaffray E., Shen L.N., Vaughan O.A., Girwood D., Bartrons R., Rosa J.L., Hay R.T., Ventura F.
FEBS Lett. 580:1215-1221(2006) [PubMed] [Europe PMC] [Abstract]
Cited for: SUMOYLATION AT LYS-404 AND LYS-417, MUTAGENESIS OF LYS-404 AND LYS-417, SUBCELLULAR LOCATION.
[8]"Lys-N and trypsin cover complementary parts of the phosphoproteome in a refined SCX-based approach."
Gauci S., Helbig A.O., Slijper M., Krijgsveld J., Heck A.J., Mohammed S.
Anal. Chem. 81:4493-4501(2009) [PubMed] [Europe PMC] [Abstract]
Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].

Cross-references

Sequence databases

EMBL / GenBank / DDBJ:
AF309034 mRNA. Translation: AAK26115.1.
AF309476, AF309471, AF309472, AF309473, AF309474, AF309475 Genomic DNA. Translation: AAK26243.1.
AF309476, AF309471, AF309472, AF309473, AF309474, AF309475 Genomic DNA. Translation: AAK26244.1.
AL136780 mRNA. Translation: CAB66714.1.
AC009869 Genomic DNA. No translation available.
AC013595 Genomic DNA. No translation available.
AC027016 Genomic DNA. No translation available.
AC068405 Genomic DNA. No translation available.
AC103794 Genomic DNA. No translation available.
CH471064 Genomic DNA. Translation: EAW68458.1.
BC037866 mRNA. No translation available.
BC047064 mRNA. Translation: AAH47064.2.
X65663 mRNA. Translation: CAA46614.1.
RefSeq: NP_001139283.1. NM_001145811.1.
RefSeq: NP_001139291.1. NM_001145819.1.
RefSeq: NP_059978.1. NM_017508.2.
RefSeq: NP_201583.2. NM_033326.3.
UniGene: Hs.368226.

3D structure databases

ProteinModelPortal: P35712.
SMR: P35712. Positions 619-688.

Protein-protein interaction databases

BioGrid: 120714. 13 interactions.
IntAct: P35712. 3 interactions.
MINT: MINT-4719252.
STRING: 9606.ENSP00000336946.

PTM databases

PhosphoSite: P35712.

Polymorphism databases

DMDM: 215274178.

Proteomic databases

PaxDb: P35712.
PRIDE: P35712.

Protocols and materials databases

DNASU: 55553.

Genome annotation databases

Ensembl: ENST00000316399; ENSP00000324948; ENSG00000110693. [P35712-3]
Ensembl: ENST00000352083; ENSP00000339876; ENSG00000110693. [P35712-1]
Ensembl: ENST00000396356; ENSP00000379644; ENSG00000110693. [P35712-3]
Ensembl: ENST00000527619; ENSP00000434455; ENSG00000110693. [P35712-2]
Ensembl: ENST00000528252; ENSP00000432134; ENSG00000110693. [P35712-4]
Ensembl: ENST00000528429; ENSP00000433233; ENSG00000110693. [P35712-1]
GeneID: 55553.
KEGG: hsa:55553.
UCSC: uc001mmd.3. human. [P35712-2]
UCSC: uc001mme.3. human. [P35712-1]
UCSC: uc001mmf.3. human. [P35712-4]
UCSC: uc001mmg.3. human. [P35712-3]

Organism-specific databases

CTD: 55553.
GeneCards: GC11M015949.
HGNC: HGNC:16421. SOX6.
HPA: HPA001923.
HPA: HPA003908.
MIM: 607257. gene.
neXtProt: NX_P35712.
PharmGKB: PA38137.

Phylogenomic databases

eggNOG: NOG253815.
HOGENOM: HOG000056455.
HOVERGEN: HBG003915.
InParanoid: P35712.
KO: K09269.
OMA: FENLGPQ.
OrthoDB: EOG70087H.
PhylomeDB: P35712.
TreeFam: TF320471.

Gene expression databases

ArrayExpress: P35712.
Bgee: P35712.
CleanEx: HS_SOX6.
Genevestigator: P35712.

Family and domain databases

Gene3D: 1.10.30.10. 1 hit.
InterPro: IPR009071. HMG_box_dom.
Pfam: PF00505. HMG_box. 1 hit.
SMART: SM00398. HMG. 1 hit.
SUPFAM: SSF47095. SSF47095. 1 hit.
PROSITE: PS50118. HMG_BOX_2. 1 hit.

Other

ChiTaRS: SOX6. human.
GeneWiki: SOX6.
GenomeRNAi: 55553.
NextBio: 60012.
PRO: P35712.

Entry information

Entry name: SOX6_HUMAN
Accession: Primary (citable) accession number: P35712
Secondary accession number(s): Q86VX7, Q9BXQ3, Q9BXQ4, Q9BXQ5, Q9H0I8
Entry history:
Integrated into UniProtKB/Swiss-Prot: June 1, 1994
Last sequence update: November 25, 2008
Last modified: April 16, 2014
This is version 132 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Relevant documents

SIMILARITY comments

Index of protein domains and families

MIM cross-references

Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot

Human chromosome 11

Human chromosome 11: entries, gene names and cross-references to MIM