Q18221 (SET2_CAEEL) Reviewed, UniProtKB/Swiss-Prot

Last modified June 11, 2014. Version 111.

Names and origin

Protein names
Recommended name: Probable histone-lysine N-methyltransferase set-2
EC: 2.1.1.43
Alternative name(s): SET domain-containing protein 2
Gene names
Name: set-2
ORF names: C26E6.9
Organism: Caenorhabditis elegans [Reference proteome]
Taxonomic identifier: 6239 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Ecdysozoa > Nematoda > Chromadorea > Rhabditida > Rhabditoidea > Rhabditidae > Peloderinae > Caenorhabditis

Protein attributes

Sequence length: 1507 AA.
Sequence status: Complete.
Protein existence: Evidence at transcript level.

General annotation (Comments)

Function

Probable histone methyltransferase involved in chromatin modification and/or remodeling in meiotic germ cells. May act redundantly with mes-3 and mes-4 proteins. Required for RNAi. Functions as an antagonist of hpl-1 and hpl-2 activity in growth and somatic gonad development. Ref.2 Ref.3 Ref.4 Ref.5

Catalytic activity

S-adenosyl-L-methionine + L-lysine-[histone] = S-adenosyl-L-homocysteine + N(6)-methyl-L-lysine-[histone].

Subcellular location

Nucleus. Note: Localized in mitotic and mid-late-stage meiotic nuclei but is undetectable in early pachytene nuclei. Ref.2

Tissue specificity

Expressed in all cells of embryo. In L1 larva, it is predominantly expressed in Z2 and Z3 primordial germ cells. In adults, it is predominantly expressed in the germline. Ref.2

Developmental stage

Expressed throughout embryogenesis.

Sequence similarities

Belongs to the class V-like SAM-binding methyltransferase superfamily.

Contains 1 post-SET domain.

Contains 1 RRM (RNA recognition motif) domain.

Contains 1 SET domain.

Alternative products

This entry describes 3 isoforms produced by alternative splicing.
Isoform a (identifier: Q18221-1)

Also known as: L;

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Isoform b (identifier: Q18221-2)

Also known as: S;

The sequence of this isoform differs from the canonical sequence as follows:
     1-768: Missing.
     769-831: MDELSRKVAE...PSNHLIADMM → MYNNSAPYLN...PQRVYRSINS
Isoform c (identifier: Q18221-3)

The sequence of this isoform differs from the canonical sequence as follows:
     831-831: M → MPSQ
Note: No experimental confirmation available.
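
The isoform descriptions above are edits expressed in canonical (isoform a) coordinates, so they can be replayed mechanically. The following is a minimal sketch, not UniProt's own tooling: `apply_variants` is a hypothetical helper, the canonical sequence is a placeholder string of the right length, and the isoform b replacement is assumed to be 63 residues long because the description abbreviates it.

```python
def apply_variants(canonical, variants):
    """Apply (start, end, replacement) edits given in canonical coordinates.

    Coordinates are 1-based and inclusive, as in the entry. An empty
    replacement means the range is missing from the isoform.
    """
    seq = canonical
    # Work from the C-terminal end so that earlier coordinates stay valid.
    for start, end, replacement in sorted(variants, reverse=True):
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq


# Placeholder with the canonical length only; it is NOT the real sequence.
canonical = "X" * 1507

# Isoform b: 1-768 missing, 769-831 replaced by another segment.
# "Z" * 63 is a stand-in (assumed length) for the abbreviated replacement.
isoform_b = apply_variants(canonical, [(1, 768, ""), (769, 831, "Z" * 63)])

# Isoform c: the M at position 831 is replaced by MPSQ.
isoform_c = apply_variants(canonical, [(831, 831, "MPSQ")])

print(len(isoform_b), len(isoform_c))  # 739 1510
```

Under these assumptions the resulting lengths agree with the isoform lengths reported in the Sequences section of this entry (739 and 1,510 residues).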

Sequence annotation (Features)

Feature key           Position(s)   Length  Feature identifier  Description

Molecule processing

Chain                 1 – 1507      1507    PRO_0000097695      Probable histone-lysine N-methyltransferase set-2

Regions

Domain                128 – 199     72                          RRM
Domain                1368 – 1485   118                         SET
Domain                1491 – 1507   17                          Post-SET
Compositional bias    296 – 354     59                          Pro-rich
Compositional bias    554 – 664     111                         Pro-rich
Compositional bias    870 – 1011    142                         Ser-rich

Natural variations

Alternative sequence  1 – 768       768     VSP_007217          Missing in isoform b.
Alternative sequence  769 – 831     63      VSP_007218          MDELS…IADMM → MYNNSAPYLNHSSLNTVRKK VVTVRRVLPSLPPPPPPPPS LYPPCSVFKVPYIPQRVYRS INS in isoform b.
Alternative sequence  831           1       VSP_038347          M → MPSQ in isoform c.
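
Because positions are 1-based and inclusive, each stated feature length should equal end - start + 1. The sketch below re-checks the feature table under that assumption. The tuple layout and the helper names `span_length` and `mismatches` are illustrative choices, not part of the entry.

```python
# (feature, start, end, stated_length), copied from the feature table above.
FEATURES = [
    ("Chain", 1, 1507, 1507),
    ("Domain RRM", 128, 199, 72),
    ("Domain SET", 1368, 1485, 118),
    ("Domain Post-SET", 1491, 1507, 17),
    ("Compositional bias Pro-rich", 296, 354, 59),
    ("Compositional bias Pro-rich", 554, 664, 111),
    ("Compositional bias Ser-rich", 870, 1011, 142),
    ("Alternative sequence VSP_007217", 1, 768, 768),
    ("Alternative sequence VSP_007218", 769, 831, 63),
    ("Alternative sequence VSP_038347", 831, 831, 1),
]


def span_length(start, end):
    """Length of a 1-based, inclusive residue range."""
    return end - start + 1


def mismatches(features):
    """Return the features whose stated length disagrees with their range."""
    return [f for f in features if span_length(f[1], f[2]) != f[3]]


print(mismatches(FEATURES))  # [] means every stated length is consistent
```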

Sequences

Isoform a (L) [UniParc].

Last modified April 23, 2003. Version 2.
Checksum: E7D9689DA720C34A
Length: 1,507 AA. Mass: 171,683 Da.

        10         20         30         40         50         60 
MSTHDMNHHP PRKSHSKRDK PSSSNSGPKI ENHKCKWAWQ KVFETGKSFL RRDGFPQDCK 

        70         80         90        100        110        120 
SKEDFERIKR TGVRKTSENM LEDPRKNFES LQQSSVYQTN SFRNPRYLCR AHLRVDSYYC 

       130        140        150        160        170        180 
TIPPKREVSL FNMDDNCTEV LLRDFAKDCG KVEKAYVCIH PETKRHMKMA YVKFATVKEA 

       190        200        210        220        230        240 
HNFYSMYHAQ NLLATKCTPR IDPFLSILNE EYEVATNGQV LPILPDDLAS IDPSVLRDLR 

       250        260        270        280        290        300 
ANFLRDQNEK YELAMRNTYE DEGGMLSGVI MDTSDHYERD YTMDHDVGPS SMKMSPIPPP 

       310        320        330        340        350        360 
PIKEESPPPP PPPPVASVSN LAPVPSVQLP YYNNIQPSSS TMHMPEFRPT EPPPSYSRED 

       370        380        390        400        410        420 
PYRSTSRSSL SRHRNRSRSP SDGMDRSGRS SSRRTHRRPE SRNGSKNANG DVVKYETYKM 

       430        440        450        460        470        480 
EKRKIKYEGG NKKYEQVHIK ERTAVIRGKN QLENVSSESA SGSSSVDTYP DFSDEERKKK 

       490        500        510        520        530        540 
KRPKSPNRSK KDSRAFGWDS TDESDEDTRR RRSGRSQNRS SERKFQTTSS SSTRRELSST 

       550        560        570        580        590        600 
HTNSVPNLKS HETPPPPPPK GHPSVHLQTP YQHVQPQMIP ATYYNLPPQH MAPPPITTSL 

       610        620        630        640        650        660 
PPFCDFSQPP PGFTPTFKPI TNAPLPTPYQ ASNIPQPGLV QIAALSAAPE PFSSIPGPPP 

       670        680        690        700        710        720 
GPAPIQEDVG RAESPEKPSL SERFSGIFGP TQREEPAQVE VEYDYPLKHS ESHDDRHSLE 

       730        740        750        760        770        780 
DMDVEVSSDG ETVSNVEKIE CMEEKKRQDL ERIAIARTPI VKKCKKRMMD ELSRKVAEDI 

       790        800        810        820        830        840 
RQQIMRQCFA ALDEKLHLKA IADEEKRKKE REEKARQEAE KPSNHLIADM MTLYNNQSFA 

       850        860        870        880        890        900 
SSSRGFYRKQ KPIPKSHPKH QEHHHHAKAS VSTPVHSSST SRNSSVAPTP QRTVSTSSSS 

       910        920        930        940        950        960 
SSAATSARVS EDESDSDSTP GEVQRRKTSV LSNDKRRRRA SFSSTSIQSS PERQRDVSSS 

       970        980        990       1000       1010       1020 
SRTSSSSSTS SMKQEETADE KSRKRKLIMS SDESSTTGST ATSVVSSRQS SLEPQQEKTD 

      1030       1040       1050       1060       1070       1080 
GEPPKKKSQT DFISERVSKI EGEERPLPEP VETSGPIIGD SSYLPYKIVH WEKAGIIEMN 

      1090       1100       1110       1120       1130       1140 
LPANSIRAHE YHPFTTEHCY FGIDDPRQPK IQIFDHSPCK SEPGSEPLKI TPAPWGPIDN 

      1150       1160       1170       1180       1190       1200 
VAETGPLIYM DVVTAPKTVQ KKQKPRKQVF EKDPYEYYEP PPTKRPAPPP RFKKTFKPRS 

      1210       1220       1230       1240       1250       1260 
EEEKKKIIGD CEDLPDLEDQ WYLRAALNEM QSEVKSADEL PWKKMLTFKE MLRSEDPLLR 

      1270       1280       1290       1300       1310       1320 
LNPIRSKKGL PDAFYEDEEL DGVIPVAAGC SRARPYEKMT MKQKRSLVRR PDNESHPTAI 

      1330       1340       1350       1360       1370       1380 
FSERDETAIR HQHLASKDMR LLQRRLLTSL GDANNDFFKI NQLKFRKKMI KFARSRIHGW 

      1390       1400       1410       1420       1430       1440 
GLYAMESIAP DEMIVEYIGQ TIRSLVAEER EKAYERRGIG SSYLFRIDLH HVIDATKRGN 

      1450       1460       1470       1480       1490       1500 
FARFINHSCQ PNCYAKVLTI EGEKRIVIYS RTIIKKGEEI TYDYKFPIED DKIDCLCGAK 


TCRGYLN 
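
The listing above interleaves position rulers with residues in space-separated groups of ten. To get a contiguous string for downstream use, one possible approach is to drop the ruler lines and the whitespace, as sketched here. The function name `parse_numbered_sequence` is hypothetical, and the sample covers only the first two rows of the listing.

```python
def parse_numbered_sequence(text):
    """Join residue rows into one string, skipping rulers and blank lines."""
    residues = []
    for line in text.splitlines():
        compact = "".join(line.split())  # remove all whitespace
        if not compact or compact.isdigit():
            continue  # blank line or position ruler such as "10 20 30"
        residues.append(compact)
    return "".join(residues)


# First two rows of the isoform a listing, copied from this entry.
SAMPLE = """\
        10         20         30         40         50         60
MSTHDMNHHP PRKSHSKRDK PSSSNSGPKI ENHKCKWAWQ KVFETGKSFL RRDGFPQDCK

        70         80         90        100        110        120
SKEDFERIKR TGVRKTSENM LEDPRKNFES LQQSSVYQTN SFRNPRYLCR AHLRVDSYYC
"""

sequence = parse_numbered_sequence(SAMPLE)
print(len(sequence), sequence[:10])  # 120 MSTHDMNHHP
```

Applied to the full listing, the result should be 1,507 residues long if the page was captured without loss.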

Isoform b (S) [UniParc].

Checksum: 6588F8D83C72BABD
Length: 739 AA. Mass: 83,999 Da.

Isoform c [UniParc].

Checksum: F301028054840D2E
Length: 1,510 AA. Mass: 171,995 Da.

References

[1] "Genome sequence of the nematode C. elegans: a platform for investigating biology."
The C. elegans sequencing consortium
Science 282:2012-2018(1998)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA], ALTERNATIVE SPLICING.
Strain: Bristol N2.
[2] "Depletion of a novel SET-domain protein enhances the sterility of mes-3 and mes-4 mutants of Caenorhabditis elegans."
Xu L., Strome S.
Genetics 159:1019-1029(2001)
Cited for: FUNCTION, SUBCELLULAR LOCATION, TISSUE SPECIFICITY, ALTERNATIVE SPLICING.
[3] "A targeted RNAi screen for genes involved in chromosome morphogenesis and nuclear organization in the Caenorhabditis elegans germline."
Colaiacovo M.P., Stanfield G.M., Reddy K.C., Reinke V., Kim S.K., Villeneuve A.M.
Genetics 162:113-128(2002)
Cited for: FUNCTION.
[4] "Telomeric position effect variegation in Saccharomyces cerevisiae by Caenorhabditis elegans linker histones suggests a mechanistic connection between germ line and telomeric silencing."
Jedrusik M.A., Schulze E.
Mol. Cell. Biol. 23:3681-3691(2003)
Cited for: FUNCTION.
[5] "Antagonistic functions of SET-2/SET1 and HPL/HP1 proteins in C. elegans development."
Simonet T., Dulermo R., Schott S., Palladino F.
Dev. Biol. 312:367-383(2007)
Cited for: FUNCTION.

Cross-references

Sequence databases

EMBL / GenBank / DDBJ: FO080680 Genomic DNA. Translation: CCD65735.1.
FO080680 Genomic DNA. Translation: CCD65734.1.
FO080680 Genomic DNA. Translation: CCD65736.1.
PIR: A88445.
RefSeq: NP_498039.1. NM_065638.3. [Q18221-3]
NP_498040.1. NM_065639.4. [Q18221-1]
NP_498041.1. NM_065640.3. [Q18221-2]
UniGene: Cel.8145.

3D structure databases

ProteinModelPortal: Q18221.
SMR: Q18221. Positions 124-222, 1340-1507.

Protein-protein interaction databases

BioGrid: 40896. 2 interactions.

Proteomic databases

PaxDb: Q18221.
PRIDE: Q18221.

Genome annotation databases

EnsemblMetazoa: C26E6.9a; C26E6.9a; WBGene00004782. [Q18221-1]
GeneID: 175662.
KEGG: cel:CELE_C26E6.9.
UCSC: C26E6.9a. c. elegans. [Q18221-1]

Organism-specific databases

CTD: 175662.
WormBase: C26E6.9a; CE27735; WBGene00004782; set-2.
C26E6.9b; CE01158; WBGene00004782; set-2.
C26E6.9c; CE27736; WBGene00004782; set-2.

Phylogenomic databases

eggNOG: COG2940.
GeneTree: ENSGT00740000115089.
HOGENOM: HOG000021414.
InParanoid: Q18221.
KO: K11422.
OMA: YCTIPPK.

Family and domain databases

Gene3D: 3.30.70.330. 1 hit.
InterPro: IPR012677. Nucleotide-bd_a/b_plait.
IPR003616. Post-SET_dom.
IPR000504. RRM_dom.
IPR001214. SET_dom.
Pfam: PF00076. RRM_1. 1 hit.
PF00856. SET. 1 hit.
SMART: SM00508. PostSET. 1 hit.
SM00360. RRM. 1 hit.
SM00317. SET. 1 hit.
PROSITE: PS50868. POST_SET. 1 hit.
PS50280. SET. 1 hit.

Other

NextBio: 889112.
PRO: Q18221.

Entry information

Entry name: SET2_CAEEL
Accession: Primary (citable) accession number: Q18221
Secondary accession number(s): Q95QU6, Q95QU7
Entry history
Integrated into UniProtKB/Swiss-Prot: April 23, 2003
Last sequence update: April 23, 2003
Last modified: June 11, 2014
This is version 111 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Caenorhabditis annotation project

Relevant documents

SIMILARITY comments: Index of protein domains and families

Caenorhabditis elegans: entries, gene names and cross-references to WormBase