Protein

Protein single-minded

Gene

sim

Organism
Drosophila melanogaster (Fruit fly)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Transcription factor that functions as a master developmental regulator controlling midline development of the ventral nerve cord. Required to correctly specify the formation of the central brain complex, which controls walking behavior. Also required for correct patterning of the embryonic genital disk and anal pad anlage. Plays a role in synapse development. (3 publications)

GO - Molecular function

GO - Biological process

  • adult walking behavior Source: FlyBase
  • axon guidance Source: FlyBase
  • axonogenesis Source: FlyBase
  • brain development Source: FlyBase
  • central nervous system development Source: FlyBase
  • determination of genital disc primordium Source: FlyBase
  • ectoderm development Source: FlyBase
  • locomotion Source: FlyBase
  • positive regulation of transcription from RNA polymerase II promoter Source: FlyBase
  • regulation of transcription, DNA-templated Source: UniProtKB
  • transcription from RNA polymerase II promoter Source: GOC
  • ventral cord development Source: FlyBase
  • ventral midline development Source: UniProtKB

Keywords - Molecular function

Developmental protein

Keywords - Biological process

Differentiation, Neurogenesis, Transcription, Transcription regulation

Keywords - Ligand

DNA-binding

Names & Taxonomy

Protein names
Recommended name:
Protein single-minded
Gene names
Name: sim
ORF Names: CG7771
Organism: Drosophila melanogaster (Fruit fly)
Taxonomic identifier: 7227 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Ecdysozoa > Arthropoda > Hexapoda > Insecta > Pterygota > Neoptera > Endopterygota > Diptera > Brachycera > Muscomorpha > Ephydroidea > Drosophilidae > Drosophila > Sophophora
Proteomes
  • UP000000803 Component: Chromosome 3R

Organism-specific databases

FlyBase: FBgn0004666. sim.

Subcellular location

  • Nucleus (PROSITE-ProRule annotation)

GO - Cellular component

  • cytoplasm Source: FlyBase
  • nucleus Source: UniProtKB
  • transcription factor complex Source: InterPro

Keywords - Cellular component

Nucleus

Pathology & Biotech

Mutagenesis

Feature key | Position(s) | Length | Description
Mutagenesis | 65 – 65 | 1 | S → F in allele sim-J1-47; temperature sensitive embryonic midline axon phenotype. (1 publication)
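
The mutagenesis position is given in 1-based numbering on the canonical sequence. As a minimal sketch of how such a point substitution can be applied and sanity-checked, the Python below works on residues 51 to 70 of the canonical sequence only. The helper name `substitute` and the choice of a sub-segment are illustrative assumptions, not part of the entry.

```python
def substitute(segment: str, segment_start: int, position: int, ref: str, alt: str) -> str:
    """Apply a single-residue substitution at a 1-based sequence position.

    segment       -- contiguous piece of the protein sequence
    segment_start -- 1-based position of the first residue of `segment`
    position      -- 1-based position of the residue to change
    ref, alt      -- expected original residue and replacement residue
    """
    index = position - segment_start  # convert to a 0-based index into the segment
    if not 0 <= index < len(segment):
        raise ValueError(f"position {position} lies outside the segment")
    if segment[index] != ref:
        raise ValueError(f"expected {ref} at {position}, found {segment[index]}")
    return segment[:index] + alt + segment[index + 1:]


# Residues 51-70 of the canonical sequence, with the display spacing removed.
segment_51_70 = "LPLPAAITSQ LDKASVIRLT".replace(" ", "")

# S -> F at position 65, as in allele sim-J1-47.
mutant_51_70 = substitute(segment_51_70, 51, 65, "S", "F")
```

Checking the reference residue before substituting catches off-by-one errors between 0-based string indices and the 1-based positions used in the feature table.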

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Feature identifier
Chain | 1 – 697 | 697 | Protein single-minded | PRO_0000127438

Proteomic databases

PaxDb: P05709.

Expression

Tissue specificity

Embryonic nerve cord. (1 publication)

Gene expression databases

Bgee: P05709.
Genevisible: P05709. DM.

Interaction

Subunit structure

Efficient DNA binding requires dimerization with another bHLH protein.

GO - Molecular function

  • protein heterodimerization activity Source: FlyBase

Protein-protein interaction databases

BioGrid: 66699. 6 interactions.
IntAct: P05709. 1 interaction.
STRING: 7227.FBpp0082178.

Structure

3D structure databases

ProteinModelPortal: P05709.
SMR: P05709. Positions 26-381.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Length | Description
Domain | 24 – 77 | 54 | bHLH (PROSITE-ProRule annotation)
Domain | 100 – 172 | 73 | PAS 1 (PROSITE-ProRule annotation)
Domain | 266 – 336 | 71 | PAS 2 (PROSITE-ProRule annotation)
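
Feature positions in this entry are 1-based and inclusive, so a feature's length is end minus start plus one. A minimal Python sketch of that arithmetic, checked against the three domains listed above, follows. The function names `feature_length` and `feature_slice` are hypothetical helpers introduced for illustration.

```python
def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive start and end positions."""
    if start < 1 or end < start:
        raise ValueError(f"invalid feature range {start}-{end}")
    return end - start + 1


def feature_slice(sequence: str, start: int, end: int) -> str:
    """Extract a feature from a sequence using 1-based inclusive positions."""
    feature_length(start, end)  # reuse the range validation
    return sequence[start - 1:end]


# Domain coordinates as listed in the feature table.
domains = {"bHLH": (24, 77), "PAS 1": (100, 172), "PAS 2": (266, 336)}
domain_lengths = {name: feature_length(*span) for name, span in domains.items()}
```

Given the full canonical sequence as a string, `feature_slice(sequence, 24, 77)` would return the 54-residue bHLH domain.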

Region

Feature key | Position(s) | Length | Description
Region | 406 – 446 | 41 | 14 X 3 AA repeats of A-A-Q (approximate)

Compositional bias

Feature key | Position(s) | Length | Description
Compositional bias | 553 – 672 | 120 | Ser-rich
Compositional bias | 673 – 693 | 21 | Gln/His-rich

Sequence similarities

Contains 1 bHLH (basic helix-loop-helix) domain. (PROSITE-ProRule annotation)
Contains 2 PAS (PER-ARNT-SIM) domains. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat

Phylogenomic databases

eggNOG: KOG3559. Eukaryota.
ENOG410XY57. LUCA.
InParanoid: P05709.
KO: K09100.
OrthoDB: EOG790G0R.
PhylomeDB: P05709.

Family and domain databases

InterPro: IPR011598. bHLH_dom.
IPR001067. Nuc_translocat.
IPR001610. PAC.
IPR000014. PAS.
IPR013767. PAS_fold.
IPR013655. PAS_fold_3.
Pfam: PF00010. HLH. 1 hit.
PF00989. PAS. 1 hit.
PF08447. PAS_3. 1 hit.
PRINTS: PR00785. NCTRNSLOCATR.
SMART: SM00353. HLH. 1 hit.
SM00086. PAC. 1 hit.
SM00091. PAS. 2 hits.
SUPFAM: SSF47459. SSF47459. 1 hit.
SSF55785. SSF55785. 2 hits.
TIGRFAMs: TIGR00229. sensory_box. 1 hit.
PROSITE: PS50888. BHLH. 1 hit.
PS50112. PAS. 2 hits.

Sequences (2)

Sequence status: Complete.

This entry describes 2 isoforms produced by alternative splicing.

Isoform A (identifier: P05709-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MTNHRRVRKD CYESRLHDIA KTCAMKEKSK NAARTRREKE NTEFCELAKL
        60         70         80         90        100
LPLPAAITSQ LDKASVIRLT TSYLKMRQVF PDGLGEAWGS SPAMQRGATI
       110        120        130        140        150
KELGSHLLQT LDGFIFVVAP DGKIMYISET ASVHLGLSQV ELTGNSIFEY
       160        170        180        190        200
IHNYDQDEMN AILSLHPHIN QHPLAQTHTP IGSPNGVQHP SAYDHDRGSH
       210        220        230        240        250
TIEIEKTFFL RMKCVLAKRN AGLTTSGFKV IHCSGYLKAR IYPDRGDGQG
       260        270        280        290        300
SLIQNLGLVA VGHSLPSSAI TEIKLHQNMF MFRAKLDMKL IFFDARVSQL
       310        320        330        340        350
TGYEPQDLIE KTLYQYIHAA DIMAMRCSHQ ILLYKGQVTT KYYRFLTKGG
       360        370        380        390        400
GWVWVQSYAT LVHNSRSSRE VFIVSVNYVL SEREVKDLVL NEIQTGVVKR
       410        420        430        440        450
EPISPAAQAA QAAQAAQAAQ AAQAAQAAQA AQAAQAAHVA QAVQAQVVVV
       460        470        480        490        500
PQQSVVVQPQ CAGATGQPVG PGTPVSLALS ASPKLDPYFE PELPLQPAVT
       510        520        530        540        550
PVPPTNNSSS SSNNNNGVWH HHHVQQQQQS GSMDHDSLSY TQLYPPLNDL
       560        570        580        590        600
VVSSSSSVGG GTASSAGGGS SASASSSGVY STEMQYPDTT TGNLYYNNNN
       610        620        630        640        650
HYYYDYDATV DVATSMIRPF SANSNSCSSS SESERQLSTG NASIVNETSP
       660        670        680        690
SQTTYSDLSH NFELSYFSDN SSQQHQHQQQ QQHLMEQQHL QYQYATW
Note: No experimental confirmation available.
Length:697
Mass (Da):76,475
Last modified:May 23, 2003 - v3
Checksum:i588414A4A17101AD
Isoform B (identifier: P05709-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-24: Missing.

Length:673
Mass (Da):73,589
Checksum:i2F9F0ABBA2BC0FBE
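
Isoform B is described as the canonical sequence with residues 1-24 missing. A minimal Python sketch of that derivation is shown below, using only the first 50 residues of the canonical sequence for brevity. The helper name `remove_range` and the constants are illustrative assumptions.

```python
CANONICAL_LENGTH = 697   # length of isoform A as stated in the entry
MISSING = (1, 24)        # 1-based inclusive range absent from isoform B


def remove_range(sequence: str, start: int, end: int) -> str:
    """Delete residues start..end (1-based, inclusive) from a sequence."""
    if start < 1 or end < start or end > len(sequence):
        raise ValueError(f"invalid range {start}-{end}")
    return sequence[:start - 1] + sequence[end:]


# First display row of the canonical sequence (residues 1-50), spacing removed.
canonical_1_50 = "MTNHRRVRKD CYESRLHDIA KTCAMKEKSK NAARTRREKE NTEFCELAKL".replace(" ", "")

# What remains of that row in isoform B (canonical residues 25-50).
isoform_b_start = remove_range(canonical_1_50, *MISSING)

# Expected isoform B length from the canonical length and the deleted range.
isoform_b_length = CANONICAL_LENGTH - (MISSING[1] - MISSING[0] + 1)
```

The computed length agrees with the 673 residues reported for isoform B, and the derived sequence begins at the methionine at canonical position 25.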

Sequence caution

The sequence AAC64519.1 differs from that shown. Reason: Erroneous gene model prediction. (Curated)

Experimental Info

Feature key | Position(s) | Length | Description
Sequence conflict | 151 – 151 | 1 | I → Y in AAC64519 (PubMed:9840810). (Curated)

Polymorphism

Berkeley strain has 11 A-A-Q repeats. (1 publication)

Natural variant

Feature key | Position(s) | Length | Description
Natural variant | 406 – 414 | 9 | Missing in strain: Berkeley.
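
The repeat counts can be reconciled with simple arithmetic: the repeat region spans residues 406-446 of the canonical sequence (about 14 units of 3 residues), and the Berkeley variant lacks residues 406-414, that is, 3 units. The Python below is a minimal sketch of that check. The region string is assembled from the canonical sequence rows, and the helper name `split_units` is a hypothetical name used for illustration.

```python
UNIT = 3                                   # residues per A-A-Q repeat unit
REGION_START, REGION_END = 406, 446        # repeat region, 1-based inclusive
VARIANT_START, VARIANT_END = 406, 414      # residues missing in strain Berkeley

# Residues 406-446, pieced together from the canonical sequence rows.
region = "AAQAA" + "QAAQAAQAAQ" + "AAQAAQAAQA" + "AQAAQAAHVA" + "QAVQAQ"


def split_units(sequence: str, size: int) -> list[str]:
    """Cut a sequence into consecutive chunks of `size` (last one may be shorter)."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


units = split_units(region, UNIT)
exact_units = sum(1 for unit in units if unit == "AAQ")    # perfect A-A-Q copies

deleted_residues = VARIANT_END - VARIANT_START + 1
deleted_units = deleted_residues // UNIT
remaining_units = len(units) - deleted_units
```

Under this counting the region splits into 14 units, of which the later ones are imperfect copies, which matches the "approximate" qualifier; removing 3 units leaves the 11 reported for the Berkeley strain.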

Alternative sequence

Feature key | Position(s) | Length | Description | Feature identifier
Alternative sequence | 1 – 24 | 24 | Missing in isoform B. (Curated) | VSP_011812

Sequence databases

EMBL/GenBank/DDBJ:
AF071934 Genomic DNA. Translation: AAC64519.1. Sequence problems.
AE014297 Genomic DNA. Translation: AAF54902.3.
AE014297 Genomic DNA. Translation: AAN14343.3.
AY129457 mRNA. Translation: AAM76199.1.
M19020 mRNA. Translation: AAA28900.1.
PIR: A29945.
A41647.
RefSeq: NP_524340.2. NM_079616.4.
NP_731771.3. NM_169495.4.
UniGene: Dm.4557.

Genome annotation databases

GeneID: 41612.
KEGG: dme:Dmel_CG7771.

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Organism-specific databases

CTD: 41612.
FlyBase: FBgn0004666. sim.

Miscellaneous databases

GenomeRNAi: 41612.
PRO: P05709.

Publications

  1. "Specification of the Drosophila CNS midline cell lineage: direct control of single-minded transcription by dorsal/ventral patterning genes."
    Kasai Y., Stahl S., Crews S.
    Gene Expr. 7:171-189(1998)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] (ISOFORM B), FUNCTION.
  2. "Novel behavioral and developmental defects associated with Drosophila single-minded."
    Pielage J., Steffes G., Lau D.C., Parente B.A., Crews S.T., Strauss R., Klambt C.
    Dev. Biol. 249:283-299(2002)
    Cited for: NUCLEOTIDE SEQUENCE (ALLELE SIM-J1-47), FUNCTION, TISSUE SPECIFICITY, MUTAGENESIS OF SER-65.
  3. "The genome sequence of Drosophila melanogaster."
    Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G., Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D., Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M., Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J., Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., Kimmel B.E., Kodira C.D., Kraft C.L., Kravitz S., Kulp D., Lai Z., Lasko P., Lei Y., Levitsky A.A., Li J.H., Li Z., Liang Y., Lin X., Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H., Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.J., Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S.C., Zhu X., Smith H.O., Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.
    Science 287:2185-2195(2000)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: Berkeley.
  4. Cited for: GENOME REANNOTATION, ALTERNATIVE SPLICING.
    Strain: Berkeley.
  5. "A Drosophila full-length cDNA resource."
    Stapleton M., Carlson J.W., Brokstein P., Yu C., Champe M., George R.A., Guarin H., Kronmiller B., Pacleb J.M., Park S., Wan K.H., Rubin G.M., Celniker S.E.
    Genome Biol. 3:RESEARCH0080.1-RESEARCH0080.8(2002)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM A).
    Strain: Berkeley.
    Tissue: Embryo.
  6. "The Drosophila single-minded gene encodes a helix-loop-helix protein that acts as a master regulator of CNS midline development."
    Nambu J.R., Lewis J.O., Wharton K.A. Jr., Crews S.T.
    Cell 67:1157-1167(1991)
    Cited for: NUCLEOTIDE SEQUENCE OF 25-42, SIMILARITY TO HLH PROTEINS.
  7. "The Drosophila single-minded gene encodes a nuclear protein with sequence similarity to the per gene product."
    Crews S.T., Thomas J.B., Goodman C.S.
    Cell 52:143-151(1988)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 43-697.
  8. "GATAD2B loss-of-function mutations cause a recognisable syndrome with intellectual disability and are associated with learning deficits and synaptic undergrowth in Drosophila."
    Willemsen M.H., Nijhof B., Fenckova M., Nillesen W.M., Bongers E.M., Castells-Nobau A., Asztalos L., Viragh E., van Bon B.W., Tezel E., Veltman J.A., Brunner H.G., de Vries B.B., de Ligt J., Yntema H.G., van Bokhoven H., Isidor B., Le Caignec C., Lorino E., Asztalos Z., Koolen D.A., Vissers L.E., Schenck A., Kleefstra T.
    J. Med. Genet. 50:507-514(2013)
    Cited for: FUNCTION IN SYNAPSE DEVELOPMENT.

Entry information

Entry name: SIM_DROME
Accession: Primary (citable) accession number: P05709
Secondary accession number(s): O96521, Q7KSL7, Q8MQI7, Q9VFZ3
Entry history
Integrated into UniProtKB/Swiss-Prot: November 1, 1988
Last sequence update: May 23, 2003
Last modified: June 8, 2016
This is version 166 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Drosophila annotation project

Miscellaneous

Mutations result in the loss of the precursor cells that give rise to midline cells of the embryonic central nervous system.

Keywords - Technical term

Complete proteome, Reference proteome
