UniProt

Q8CFT2 - SET1B_MOUSE


Protein: Histone-lysine N-methyltransferase SETD1B
Gene: Setd1b
Organism: Mus musculus (Mouse)
Status: Reviewed - Annotation score: 4 out of 5 - Experimental evidence at protein level

Function

Histone methyltransferase that specifically methylates 'Lys-4' of histone H3, when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated. H3 'Lys-4' methylation represents a specific tag for epigenetic transcriptional activation. The non-overlapping localization with SETD1A suggests that SETD1A and SETD1B make non-redundant contributions to the epigenetic control of chromatin structure and gene expression (By similarity).

Catalytic activity

S-adenosyl-L-methionine + L-lysine-[histone] = S-adenosyl-L-homocysteine + N(6)-methyl-L-lysine-[histone].

GO - Molecular function

  1. histone methyltransferase activity (H3-K4 specific) Source: Ensembl
  2. nucleotide binding Source: InterPro
  3. RNA binding Source: UniProtKB-KW

GO - Biological process

  1. regulation of transcription, DNA-templated Source: UniProtKB-KW
  2. transcription, DNA-templated Source: UniProtKB-KW

Keywords - Molecular function

Activator, Chromatin regulator, Methyltransferase, Transferase

Keywords - Biological process

Transcription, Transcription regulation

Keywords - Ligand

RNA-binding, S-adenosyl-L-methionine

Names & Taxonomy

Protein names
Recommended name: Histone-lysine N-methyltransferase SETD1B (EC:2.1.1.43)
Alternative name(s): SET domain-containing protein 1B
Gene names
Name: Setd1b
Synonyms: Kiaa1076, Set1b
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes: UP000000589: Chromosome 5

Organism-specific databases

MGI: MGI:2652820. Setd1b.

Subcellular location

Nucleus speckle (By similarity). Chromosome (By similarity).
Note: Localizes to a largely non-overlapping set of euchromatic nuclear speckles with SETD1A, suggesting that SETD1A and SETD1B each bind to a unique set of target genes (By similarity).

GO - Cellular component

  1. chromosome Source: UniProtKB-KW
  2. nucleus Source: MGI
  3. Set1C/COMPASS complex Source: UniProtKB

Keywords - Cellular component

Chromosome, Nucleus

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Feature identifier
Chain | 1 – 1985 | 1985 | Histone-lysine N-methyltransferase SETD1B | PRO_0000316994

Amino acid modifications

Feature key | Position(s) | Length | Description
Modified residue | 1678 – 1678 | 1 | Phosphoserine (By similarity)
Modified residue | 1682 – 1682 | 1 | Phosphoserine (By similarity)

Keywords - PTM

Phosphoprotein

Proteomic databases

MaxQB: Q8CFT2.
PaxDb: Q8CFT2.
PRIDE: Q8CFT2.

PTM databases

PhosphoSite: Q8CFT2.

Expression

Tissue specificity

Widely expressed (1 Publication).

Gene expression databases

Bgee: Q8CFT2.
CleanEx: MM_SETD1B.
ExpressionAtlas: Q8CFT2. baseline and differential.
Genevestigator: Q8CFT2.

Interaction

Subunit structure

Component of the SET1 complex, at least composed of the catalytic subunit (SETD1A or SETD1B), WDR5, WDR82, RBBP5, ASH2L/ASH2, CXXC1/CFP1, HCFC1 and DPY30. Interacts with HCFC1 and ASH2L/ASH2. Interacts (via the RRM domain) with WDR82. Interacts (via the RRM domain) with hyperphosphorylated C-terminal domain (CTD) of RNA polymerase II large subunit (POLR2A) only in the presence of WDR82. Binds specifically to CTD heptad repeats phosphorylated on 'Ser-5' of each heptad. Interacts with RBM15 (By similarity).

Structure

3D structure databases

ProteinModelPortal: Q8CFT2.
SMR: Q8CFT2. Positions 97-202.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Length | Description
Domain | 92 – 180 | 89 | RRM (PROSITE-ProRule annotation)
Domain | 1846 – 1963 | 118 | SET (PROSITE-ProRule annotation)
Domain | 1969 – 1985 | 17 | Post-SET (PROSITE-ProRule annotation)

Compositional bias

Feature key | Position(s) | Length | Description
Compositional bias | 373 – 1690 | 1318 | Pro-rich
Compositional bias | 1031 – 1179 | 149 | Ser-rich
Compositional bias | 1059 – 1331 | 273 | Glu-rich
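
The Length values listed for the domains and compositional-bias regions follow from their positions as inclusive, 1-based spans (end - start + 1). The following is a minimal sketch that re-derives those lengths and the overlap between two regions; the function and variable names are illustrative assumptions and are not part of the UniProt entry or any UniProt tool.

```python
def feature_length(start: int, end: int) -> int:
    """Inclusive length of a feature covering residues start..end (1-based)."""
    if start < 1 or end < start:
        raise ValueError(f"invalid feature range: {start}-{end}")
    return end - start + 1


def overlap_length(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


# (description, start, end, length as listed in the entry)
FEATURES = [
    ("RRM", 92, 180, 89),
    ("SET", 1846, 1963, 118),
    ("Post-SET", 1969, 1985, 17),
    ("Pro-rich", 373, 1690, 1318),
    ("Ser-rich", 1031, 1179, 149),
    ("Glu-rich", 1059, 1331, 273),
]


def check_features(features, sequence_length: int = 1985):
    """Return the names of features whose listed length or range is inconsistent."""
    mismatches = []
    for name, start, end, listed in features:
        if end > sequence_length or feature_length(start, end) != listed:
            mismatches.append(name)
    return mismatches


if __name__ == "__main__":
    print(check_features(FEATURES))
    # Ser-rich (1031-1179) and Glu-rich (1059-1331) share residues 1059-1179.
    print(overlap_length(1031, 1179, 1059, 1331))
```

All positions refer to the canonical isoform 1 sequence of 1,985 residues, so a range ending beyond 1985 is reported as inconsistent.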

Sequence similarities

Belongs to the class V-like SAM-binding methyltransferase superfamily (PROSITE-ProRule annotation).
Contains 1 post-SET domain (PROSITE-ProRule annotation).
Contains 1 RRM (RNA recognition motif) domain (PROSITE-ProRule annotation).
Contains 1 SET domain (PROSITE-ProRule annotation).

Phylogenomic databases

eggNOG: COG2940.
GeneTree: ENSGT00760000119228.
HOVERGEN: HBG055596.
InParanoid: Q8CFT2.
KO: K11422.
OMA: QHWPPLP.
OrthoDB: EOG7GQXTT.
TreeFam: TF106436.

Family and domain databases

Gene3D: 3.30.70.330. 2 hits.
InterPro: IPR024657. COMPASS_Set1_N-SET.
IPR012677. Nucleotide-bd_a/b_plait.
IPR003616. Post-SET_dom.
IPR000504. RRM_dom.
IPR001214. SET_dom.
Pfam: PF11764. N-SET. 1 hit.
PF00076. RRM_1. 1 hit.
PF00856. SET. 1 hit.
SMART: SM00508. PostSET. 1 hit.
SM00360. RRM. 1 hit.
SM00317. SET. 1 hit.
PROSITE: PS50868. POST_SET. 1 hit.
PS50102. RRM. 1 hit.
PS50280. SET. 1 hit.

Sequences (2)

Sequence status: Complete.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q8CFT2-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MENSHPHHHH QQPPPQPGPS GERRNHHWRS YKLMIDPALK KGHHKLYRYD
        60         70         80         90        100
GQHFSLAMSS NRPVEIVEDP RVVGIWTKNK ELELSVPKFK IDEFYVGPVP
       110        120        130        140        150
PKQVTFAKLN DNVRENFLRD MCKKYGEVEE VEILYNPKTK KHLGIAKVVF
       160        170        180        190        200
ATVRGAKEAV QHLHSTSVMG NIIHVELDTK GETRMRFYEL LVTGRYTPQT
       210        220        230        240        250
LPVGELDAIS PIVSETLQLS DALKRLKDGS LSAGCGSGSS SVTPNSGGTP
       260        270        280        290        300
FSQDTAYSSC RLDTPNSYGQ GTPITPRLGT PFSQDSSYSS RQPTPSYLFS
       310        320        330        340        350
QDPTATFKAR RHESKFTDAY NRRHEHHYVH NSAVAGATAP FRGSSDLSFG
       360        370        380        390        400
TVGSSGTPFK AQSQDATTFA HTPPPAQTAT ASGFKSAFSP YQTPAPPFPP
       410        420        430        440        450
PPEEPTATAA FGSRDSGEFR RAPAPPPLPP AEPPAKEKPG TPPGPPPPDS
       460        470        480        490        500
NSMELGGRPT FGWSPEPCDS PGTPTLESSP AGPEKPHDSL DSRIEMLLKE
       510        520        530        540        550
QRTKLPFLRE QDSDTEIQME GSPISSSSSQ LSPLSHFGTN SQPGFRGPSP
       560        570        580        590        600
PSSRPSSTGL EDISPTPLPD SDEDEDLGLG LGPRPPPEPG PPDPMGLLGQ
       610        620        630        640        650
TAEVDLDLAG DRTPTSERMD EGQQSSGEDM EISDDEMPSA PITSADCPKP
       660        670        680        690        700
MVVTPGAGAV AAPNVLAPNL PLPPPPGFPP LPPPPPPPPP QPGFPMPPPL
       710        720        730        740        750
PPPPPPPPPA HPAVTVPPPP LPAPPGVPPP PILPPLPPFP PGLFPVMQVD
       760        770        780        790        800
MSHVLGGQWG GMPMSFQMQT QMLSRLMTGQ GACPYPPFMA AAAAAASAGL
       810        820        830        840        850
QFVNLPPYRS PFSLSNSGPG RGQHWPPLPK FDPSVPPPGY IPRQEDPHKA
       860        870        880        890        900
TVDGVLLVVL KELKAIMKRD LNRKMVEVVA FRAFDEWWDK KERMAKASLT
       910        920        930        940        950
PVKSGEHKDE DRPKPKDRIA SCLLESWGKG EGLGYEGLGL GIGLRGAIRL
       960        970        980        990       1000
PSFKVKRKEP PDTASSGDQK RLRPSTSVDE EDEESERERD RDIADAPCEL
      1010       1020       1030       1040       1050
TKRDPKSVGV RRRPGRPLEL DSGGEEDEKE SLSASSSSSA SSSSGSSTTS
      1060       1070       1080       1090       1100
PSSSASDKEE EDRESTEEEE EEEEEEAEEE EEEGPRSRIS SPSSSSSSDK
      1110       1120       1130       1140       1150
DDEDDNEADS DGQIDSDIDD QGAPLSEASE KDNGDSEEEE TESITTSKAP
      1160       1170       1180       1190       1200
AESSSSSSES SGSSEFESSS ESESSSSSSE DEEEMTVPGV EEEEEEEEEE
      1210       1220       1230       1240       1250
EKETAMAAAT VVAMAEESMP PAGGQDFEQD RAEVPLGPRG PMRESLGTEE
      1260       1270       1280       1290       1300
EVDIEAEDEV PEMQAPELEE PPLPMGARKL EGSPEPPEEP GPNTQGDMLL
      1310       1320       1330       1340       1350
SPELPARETE EAQLPSPPEH GPESDLDMEP EPPPMLSLPL QPPLPPPRLL
      1360       1370       1380       1390       1400
RPPSPPPEPE TPEPPKPPVP LEPPPEDHPP RTPGLCGSLA KSQSTETVPA
      1410       1420       1430       1440       1450
TPGGEPPLSG SSSGLSLSSP QVPGSPFSYP SPSPGLSSGG LPRTPGRDFS
      1460       1470       1480       1490       1500
FTPTFPEPSG PLLLPVCPLP TGRRDERTGP LASPVLLETG LPLPLPLPLP
      1510       1520       1530       1540       1550
LPLALPVPVL RAQPRPPPQL PPLLPATLAP CPTPIKRKPG RPRRSPPSML
      1560       1570       1580       1590       1600
SLDGPLVRPP PGPALGRDLL LLPGQPPAPI FPSAHDPRAV TLDFRNTGIP
      1610       1620       1630       1640       1650
APPPPLPPQP PPPPPPPPVE STKLPFKELD NQWPSEAIPP GPRRDEVTEE
      1660       1670       1680       1690       1700
YVDLAKVRGP WRRPPKKRHE DLVAPSASPE PSPPQPLFRP RSEFEEMTIL
      1710       1720       1730       1740       1750
YDIWNGGIDE EDIRFLCVTY ERLLQQDNGM DWLNDTLWVY HPSTSLSSAK
      1760       1770       1780       1790       1800
KKKREDGIRE HVTGCARSEG FYTIDKKDKL RYLNSSRAST DEPPMDTQGM
      1810       1820       1830       1840       1850
SIPAQPHAST RAGSERRSEQ RRLLSSFTGS CDSDLLKFNQ LKFRKKKLKF
      1860       1870       1880       1890       1900
CKSHIHDWGL FAMEPIAADE MVIEYVGQNI RQVIADMREK RYEDEGIGSS
      1910       1920       1930       1940       1950
YMFRVDHDTI IDATKCGNFA RFINHSCNPN CYAKVITVES QKKIVIYSKQ
      1960       1970       1980
HINVNEEITY DYKFPIEDVK IPCLCGSENC RGTLN
Length: 1,985
Mass (Da): 215,352
Last modified: February 5, 2008 - v2
Checksum: 760EA261769292EC
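
The canonical sequence above is displayed in blocks of 10 residues, 50 per line, with position rulers between the lines. As a rough sketch of how that display layout could be turned back into a plain residue string, the snippet below drops ruler lines and whitespace; the function name and the two-line excerpt are illustrative assumptions, not part of any UniProt tooling.

```python
def join_displayed_sequence(lines):
    """Rebuild a plain sequence string from the ruler-and-blocks display layout.

    Lines consisting only of numbers (position rulers) and blank lines are
    skipped; every other line is treated as whitespace-separated residue blocks.
    """
    residues = []
    for line in lines:
        tokens = line.split()
        if not tokens or all(token.isdigit() for token in tokens):
            continue  # blank line or position ruler
        for token in tokens:
            if not (token.isalpha() and token.isupper()):
                raise ValueError(f"unexpected token in sequence line: {token!r}")
            residues.append(token)
    return "".join(residues)


# First 100 residues of isoform 1, copied from the display above.
EXCERPT = [
    "        10         20         30         40         50",
    "MENSHPHHHH QQPPPQPGPS GERRNHHWRS YKLMIDPALK KGHHKLYRYD",
    "60 70 80 90 100",
    "GQHFSLAMSS NRPVEIVEDP RVVGIWTKNK ELELSVPKFK IDEFYVGPVP",
]

if __name__ == "__main__":
    sequence = join_displayed_sequence(EXCERPT)
    print(len(sequence))
    print(sequence[:10])
```

Applied to all displayed lines, the joined string should have the 1,985 residues stated for isoform 1; positions in the entry are 1-based, so residue n is `sequence[n - 1]`.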
Isoform 2 (identifier: Q8CFT2-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1139-1179: Missing.

Note: No experimental confirmation available.

Length: 1,944
Mass (Da): 211,286
Checksum: E2BD1B70C876448C
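
The isoform 2 length is consistent with removing the inclusive range 1139-1179 (41 residues) from the 1,985-residue canonical sequence. A minimal sketch of that derivation follows; `remove_range` is a hypothetical helper written for illustration, and the placeholder string stands in for the real sequence.

```python
def remove_range(sequence: str, start: int, end: int) -> str:
    """Delete residues start..end (1-based, inclusive) from a sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    return sequence[: start - 1] + sequence[end:]


CANONICAL_LENGTH = 1985                   # isoform 1 (Q8CFT2-1)
MISSING_START, MISSING_END = 1139, 1179   # "1139-1179: Missing." in isoform 2

missing_count = MISSING_END - MISSING_START + 1
isoform2_length = CANONICAL_LENGTH - missing_count

if __name__ == "__main__":
    print(missing_count, isoform2_length)
    # Placeholder of the right length; substitute the real canonical sequence.
    placeholder = "X" * CANONICAL_LENGTH
    print(len(remove_range(placeholder, MISSING_START, MISSING_END)))
```

Features located after residue 1179 shift by 41 positions in isoform 2, which is why the entry states that all positional information refers to the canonical sequence.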

Experimental Info

Feature key | Position(s) | Length | Description
Sequence conflict | 1222 – 1222 | 1 | A → V in BAC65717. (PubMed:12693553) Curated
Sequence conflict | 1411 – 1411 | 1 | S → G in BAC65717. (PubMed:12693553) Curated

Alternative sequence

Feature key | Position(s) | Length | Description | Feature identifier
Alternative sequence | 1139 – 1179 | 41 | Missing in isoform 2. (1 Publication) | VSP_030851

Sequence databases

EMBL / GenBank / DDBJ:
AC158114 Genomic DNA. No translation available.
BC038367 mRNA. Translation: AAH38367.2.
BC040775 mRNA. Translation: AAH40775.1.
BC041681 mRNA. Translation: AAH41681.1.
AK122435 mRNA. Translation: BAC65717.1.
CCDS: CCDS59684.1. [Q8CFT2-1]
RefSeq: NP_001035488.2. NM_001040398.2. [Q8CFT2-1]
UniGene: Mm.250391.

Genome annotation databases

Ensembl: ENSMUST00000056053; ENSMUSP00000134686; ENSMUSG00000038384. [Q8CFT2-1]
ENSMUST00000163030; ENSMUSP00000133933; ENSMUSG00000038384. [Q8CFT2-1]
ENSMUST00000174836; ENSMUSP00000134461; ENSMUSG00000038384. [Q8CFT2-2]
GeneID: 208043.
KEGG: mmu:208043.

Keywords - Coding sequence diversity

Alternative splicing

Cross-references


Organism-specific databases

CTD: 23067.
MGI: MGI:2652820. Setd1b.


Miscellaneous databases

PRO: Q8CFT2.


Publications

  1. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: C57BL/6J.
  2. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] OF 883-1985 (ISOFORM 1).
    Strain: FVB/N.
    Tissue: Eye and Mammary tumor.
  3. "Prediction of the coding sequences of mouse homologues of KIAA gene: II. The complete nucleotide sequences of 400 mouse KIAA-homologous cDNAs identified by screening of terminal sequences of cDNA clones randomly sampled from size-fractionated libraries."
    Okazaki N., Kikuno R., Ohara R., Inamoto S., Aizawa H., Yuasa S., Nakajima D., Nagase T., Ohara O., Koga H.
    DNA Res. 10:35-48(2003)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] OF 1090-1985 (ISOFORM 2).
  4. "Identification and characterization of the human Set1B histone H3-Lys4 methyltransferase complex."
    Lee J.-H., Tate C.M., You J.-S., Skalnik D.G.
    J. Biol. Chem. 282:13419-13428(2007)
    Cited for: TISSUE SPECIFICITY.
  5. Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Liver.

Entry information

Entry name: SET1B_MOUSE
Accession: Primary (citable) accession number: Q8CFT2
Secondary accession number(s): Q80TK9, Q8CFQ8, Q8CGD1
Entry history
Integrated into UniProtKB/Swiss-Prot: February 5, 2008
Last sequence update: February 5, 2008
Last modified: October 29, 2014
This is version 101 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families
