Protein

Symplekin

Gene

Sym

Organism
Drosophila melanogaster (Fruit fly)
Status
Reviewed. Annotation score: 4 out of 5. Experimental evidence at protein level.

Function

Component of a protein complex required for cotranscriptional processing of 3'-ends of polyadenylated and histone pre-mRNA (2 publications).

GO - Molecular function

  • DNA binding (Source: FlyBase)
  • RNA binding (Source: UniProtKB-KW)

GO - Biological process

  • mRNA 3'-end processing by stem-loop binding and cleavage (Source: FlyBase)
  • mRNA polyadenylation (Source: FlyBase)

Keywords - Biological process

mRNA processing

Keywords - Ligand

RNA-binding

Names & Taxonomy

Protein names
Recommended name: Symplekin (Imported)
Gene names
Name: Sym
ORF names: CG2097
Organism: Drosophila melanogaster (Fruit fly)
Taxonomic identifier: 7227 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Ecdysozoa > Arthropoda > Hexapoda > Insecta > Pterygota > Neoptera > Endopterygota > Diptera > Brachycera > Muscomorpha > Ephydroidea > Drosophilidae > Drosophila > Sophophora
Proteomes
  • UP000000803 Component: Chromosome 3R

Organism-specific databases

FlyBase: FBgn0037371. Sym.

Subcellular location

GO - Cellular component

  • histone locus body (Source: UniProtKB)
  • nucleus (Source: FlyBase)
  • tricellular tight junction (Source: FlyBase)

Keywords - Cellular component

Nucleus

PTM / Processing

Molecule processing

Feature key   Identifier       Position(s)   Description   Length
Chain         PRO_0000421978   1 – 1165      Symplekin     1165

Proteomic databases

PaxDb: Q8MSU4.
PRIDE: Q8MSU4.

Expression

Gene expression databases

Bgee: FBgn0037371.
Genevisible: Q8MSU4. DM.

Interaction

Subunit structure

Interacts with Cpsf73 and Cpsf100, forming a core cleavage factor required for both polyadenylated and histone mRNA processing. Interacts with Slbp and Lsm11 (1 publication).

Protein-protein interaction databases

BioGrid: 65914. 3 interactors.
IntAct: Q8MSU4. 12 interactors.
MINT: MINT-754132.
STRING: 7227.FBpp0078372.

Structure

Secondary structure

Feature key   Position(s)   Length   Evidence
Helix         22 – 37       16       Combined sources
Helix         42 – 56       15       Combined sources
Turn          57 – 60       4        Combined sources
Helix         61 – 67       7        Combined sources
Helix         68 – 72       5        Combined sources
Helix         73 – 75       3        Combined sources
Helix         80 – 96       17       Combined sources
Helix         98 – 103      6        Combined sources
Helix         105 – 111     7        Combined sources
Helix         117 – 140     24       Combined sources
Beta strand   141 – 143     3        Combined sources
Helix         146 – 164     19       Combined sources
Helix         165 – 167     3        Combined sources
Helix         171 – 187     17       Combined sources
Beta strand   193 – 195     3        Combined sources
Helix         204 – 206     3        Combined sources
Helix         216 – 234     19       Combined sources
Helix         241 – 257     17       Combined sources
Helix         259 – 261     3        Combined sources
Helix         262 – 274     13       Combined sources
Helix         282 – 300     19       Combined sources
Helix         303 – 308     6        Combined sources
Helix         309 – 318     10       Combined sources
Helix         323 – 327     5        Combined sources
Helix         335 – 348     14       Combined sources

3D structure databases

PDB entry   Method   Resolution (Å)   Chain   Positions
3GS3        X-ray    2.40             A       19-270
4IMI        X-ray    2.35             A/C     19-351
4IMJ        X-ray    2.58             A/C     19-351
4YGX        X-ray    2.95             A/C     19-351

ProteinModelPortal: Q8MSU4.
SMR: Q8MSU4.

Miscellaneous databases

EvolutionaryTrace: Q8MSU4.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Description   Length   Evidence
Repeat        23 – 58       HEAT 1        36       1 publication
Repeat        61 – 95       HEAT 2        35       1 publication
Repeat        98 – 140      HEAT 3        43       1 publication
Repeat        147 – 186     HEAT 4        40       1 publication
Repeat        218 – 257     HEAT 5        40       1 publication
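
The Length values in the feature table follow from the positions, which are 1-based and inclusive, so length = end - start + 1. A minimal Python sketch of that arithmetic for the five HEAT repeats is given here. The names `Feature` and `HEAT_REPEATS` are illustrative assumptions and are not part of any UniProt tooling.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """One row of a feature table (hypothetical helper type)."""
    key: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    description: str

    @property
    def length(self) -> int:
        # Inclusive coordinates: both end points count as residues.
        return self.end - self.start + 1


# Positions copied from the Domains and Repeats table of this entry.
HEAT_REPEATS = [
    Feature("Repeat", 23, 58, "HEAT 1"),
    Feature("Repeat", 61, 95, "HEAT 2"),
    Feature("Repeat", 98, 140, "HEAT 3"),
    Feature("Repeat", 147, 186, "HEAT 4"),
    Feature("Repeat", 218, 257, "HEAT 5"),
]

for feature in HEAT_REPEATS:
    print(feature.description, feature.length)
```

The computed lengths (36, 35, 43, 40, 40) agree with the Length column of the table.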

Domain

The HEAT repeats have been determined based on 3D-structure analysis and are not detected by sequence-based prediction programs (1 publication).

Sequence similarities

Belongs to the Symplekin family (curated).
Contains 5 HEAT repeats (1 publication).

Keywords - Domain

Coiled coil, Repeat

Phylogenomic databases

eggNOG: KOG1895. Eukaryota.
eggNOG: ENOG410XQAS. LUCA.
GeneTree: ENSGT00390000017045.
HOGENOM: HOG000247061.
InParanoid: Q8MSU4.
KO: K06100.
OMA: NDGIRTN.
OrthoDB: EOG091G02S0.
PhylomeDB: Q8MSU4.

Family and domain databases

Gene3D: 1.25.10.10. 4 hits.
InterPro: IPR011989. ARM-like.
InterPro: IPR016024. ARM-type_fold.
InterPro: IPR021850. Symplekin/Pta1.
InterPro: IPR032460. Symplekin/Pta1_N.
InterPro: IPR022075. Symplekin_C.
PANTHER: PTHR15245:SF20. PTHR15245:SF20. 1 hit.
Pfam: PF11935. DUF3453. 1 hit.
Pfam: PF12295. Symplekin_C. 1 hit.
SUPFAM: SSF48371. SSF48371. 7 hits.

Sequence

Sequence status: Complete.

Q8MSU4-1 [UniParc]

        10         20         30         40         50
MDSIIGRSQF VSETANLFTD EKTATARAKV VDWCNELVIA SPSTKCELLA
        60         70         80         90        100
KVQETVLGSC AELAEEFLES VLSLAHDSNM EVRKQVVAFV EQVCKVKVEL
       110        120        130        140        150
LPHVINVVSM LLRDNSAQVI KRVIQACGSI YKNGLQYLCS LMEPGDSAEQ
       160        170        180        190        200
AWNILSLIKA QILDMIDNEN DGIRTNAIKF LEGVVVLQSF ADEDSLKRDG
       210        220        230        240        250
DFSLADVPDH CTLFRREKLQ EEGNNILDIL LQFHGTTHIS SVNLIACTSS
       260        270        280        290        300
LCTIAKMRPI FMGAVVEAFK QLNANLPPTL TDSQVSSVRK SLKMQLQTLL
       310        320        330        340        350
KNRGAFEFAS TIRGMLVDLG SSTNEIQKLI PKMDKQEMAR RQKRILENAA
       360        370        380        390        400
QSLAKRARLA CEQQDQQQRE MELDTEELER QKQKSTRVNE KFLAEHFRNP
       410        420        430        440        450
ETVVTLVLEF LPSLPTEVPQ KFLQEYTPIR EMSIQQQVTN ISRFFGEQLS
       460        470        480        490        500
EKRLGPGAAT FSREPPMRVK KVQAIESTLT AMEVDEDAVQ KLSEEEFQRK
       510        520        530        540        550
EEATKKLRET MERAKGEQTV IEKMKERAKT LKLQEITKPL PRNLKEKFLT
       560        570        580        590        600
DAVRRILNSE RQCIKGGVSS KRRKLVTVIA ATFPDNVRYG IMEFILEDIK
       610        620        630        640        650
QRIDLAFSWL FEEYSLLQGF TRHTYVKTEN RPDHAYNELL NKLIFGIGER
       660        670        680        690        700
CDHKDKIILI RRVYLEAPIL PEVSIGHLVQ LSLDDEFSQH GLELIKDLAV
       710        720        730        740        750
LRPPRKNRFV RVLLNFSVHE RLDLRDLAQA HLVSLYHVHK ILPARIDEFA
       760        770        780        790        800
LEWLKFIEQE SPPAAVFSQD FGRPTEEPDW REDTTKVCFG LAFTLLPYKP
       810        820        830        840        850
EVYLQQICQV FVSTSAELKR TILRSLDIPI KKMGVESPTL LQLIEDCPKG
       860        870        880        890        900
METLVIRIIY ILTERVPSPH EELVRRVRDL YQNKVKDVRV MIPVLSGLTR
       910        920        930        940        950
SELISVLPKL IKLNPAVVKE VFNRLLGIGA EFAHQTMAMT PTDILVALHT
       960        970        980        990       1000
IDTSVCDIKA IVKATSLCLA ERDLYTQEVL MAVLQQLVEV TPLPTLMMRT
      1010       1020       1030       1040       1050
TIQSLTLYPR LANFVMNLLQ RLIIKQVWRQ KVIWEGFLKT VQRLKPQSMP
      1060       1070       1080       1090       1100
ILLHLPPAQL VDALQQCPDL RPALSEYAES MQDEPMNGSG ITQQVLDIIS
      1110       1120       1130       1140       1150
GKSVDVFVTD ESGGYISAEH IKKEAPDPSE ISVISTVPVL TSLVPLPVPP
      1160
PIGSDLNQPL PPGED
Length: 1,165
Mass (Da): 132,077
Last modified: October 1, 2002 - v1
Checksum: CFA818C50B2CC847
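
The sequence above is displayed in blocks of 10 residues, with a ruler line of positions above each row. To check the stated length or reuse the sequence elsewhere, the display has to be flattened into one contiguous string. The following is a minimal Python sketch of one way to do that, assuming ruler lines contain only digits and whitespace. The function name `parse_grouped_sequence` is a hypothetical helper, not a UniProt API.

```python
def parse_grouped_sequence(text: str) -> str:
    """Flatten a ruler-annotated, 10-residue-grouped sequence display.

    Assumption: a ruler line consists solely of integer tokens, and
    every other non-blank line holds space-separated residue groups.
    """
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and position rulers such as "10 20 30 40 50".
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.extend(tokens)
    return "".join(residues)


# First row of this entry's sequence, copied from the display above.
EXAMPLE = """\
        10         20         30         40         50
MDSIIGRSQF VSETANLFTD EKTATARAKV VDWCNELVIA SPSTKCELLA
"""

seq = parse_grouped_sequence(EXAMPLE)
print(len(seq), seq[:10])
```

Applied to the full display, the length of the returned string can be compared with the Length field of the entry.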

Sequence databases

EMBL / GenBank / DDBJ: AE014297 Genomic DNA. Translation: AAF51962.2.
EMBL / GenBank / DDBJ: AY118592 mRNA. Translation: AAM49961.1.
RefSeq: NP_649580.1. NM_141323.2.
UniGene: Dm.31227.

Genome annotation databases

EnsemblMetazoa: FBtr0078723; FBpp0078372; FBgn0037371.
GeneID: 40709.
KEGG: dme:Dmel_CG2097.
UCSC: CG2097-RA. d. melanogaster.

Cross-references

Organism-specific databases

CTD: 40709.

Miscellaneous databases

GenomeRNAi: 40709.
PRO: Q8MSU4.

Entry information

Entry name: SYMPK_DROME
Accession: Primary (citable) accession number: Q8MSU4
Secondary accession number(s): Q9VNH4
Entry history
Integrated into UniProtKB/Swiss-Prot: April 3, 2013
Last sequence update: October 1, 2002
Last modified: November 30, 2016
This is version 119 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Drosophila annotation project

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Drosophila
    Drosophila: entries, gene names and cross-references to FlyBase
  2. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
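
The UniRef90 and UniRef50 descriptions above each combine two conditions: a minimum sequence identity to the seed sequence and a minimum 80% overlap with it. The following Python sketch expresses only that membership test as a predicate. It is an illustration under stated assumptions: the function and constant names are hypothetical, identity and overlap are taken as precomputed fractions between 0 and 1, and the actual clustering procedure used to build UniRef is not described in this entry.

```python
# Thresholds as stated in the descriptions above (as fractions).
UNIREF90_MIN_IDENTITY = 0.90
UNIREF50_MIN_IDENTITY = 0.50
MIN_OVERLAP_WITH_SEED = 0.80


def meets_cluster_criteria(
    identity: float,
    overlap: float,
    min_identity: float,
    min_overlap: float = MIN_OVERLAP_WITH_SEED,
) -> bool:
    """Return True if a sequence satisfies both stated thresholds.

    identity: fraction of identical residues relative to the seed.
    overlap:  fraction of the seed sequence covered by the alignment.
    Both values are assumed to be computed elsewhere.
    """
    return identity >= min_identity and overlap >= min_overlap


# A sequence 93% identical to the seed but covering only 70% of it
# fails the overlap condition, so it would not join that cluster.
print(meets_cluster_criteria(0.93, 0.85, UNIREF90_MIN_IDENTITY))
print(meets_cluster_criteria(0.93, 0.70, UNIREF90_MIN_IDENTITY))
```

UniRef100 is deliberately left out of the sketch, since it is defined by identical sequences and sub-fragments of 11 or more residues, not by an identity threshold.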