Protein

Cleavage stimulation factor subunit 1

Gene

CSTF1

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

One of the multiple factors required for polyadenylation and 3'-end cleavage of mammalian pre-mRNAs. May be responsible for the interaction of CSTF with other factors to form a stable complex on the pre-mRNA.

GO - Molecular function

  • poly(A) RNA binding Source: UniProtKB
  • RNA binding Source: ProtInc

GO - Biological process

  • mRNA 3'-end processing Source: Reactome
  • mRNA cleavage Source: ProtInc
  • mRNA polyadenylation Source: ProtInc
  • mRNA splicing, via spliceosome Source: Reactome
  • RNA processing Source: ProtInc
  • termination of RNA polymerase II transcription Source: Reactome

Keywords - Biological process

mRNA processing

Enzyme and pathway databases

Reactome: R-HSA-109688. Cleavage of Growing Transcript in the Termination Region.
R-HSA-72163. mRNA Splicing - Major Pathway.
R-HSA-72187. mRNA 3'-end processing.
R-HSA-77595. Processing of Intronless Pre-mRNAs.
SignaLink: Q05048.

Names & Taxonomy

Protein names
Recommended name: Cleavage stimulation factor subunit 1
Alternative name(s):
CF-1 50 kDa subunit
Cleavage stimulation factor 50 kDa subunit
Short name: CSTF 50 kDa subunit
Short name: CstF-50
Gene names
Name: CSTF1
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 20

Organism-specific databases

HGNC: HGNC:2483. CSTF1.

Subcellular location

GO - Cellular component

  • nucleoplasm Source: Reactome
  • nucleus Source: ProtInc

Keywords - Cellular component

Nucleus

Pathology & Biotech

Organism-specific databases

PharmGKB: PA26985.

Polymorphism and mutation databases

BioMuta: CSTF1.

PTM / Processing

Molecule processing

Feature key  Position(s)  Length  Description                            Feature identifier
Chain        1 – 431      431     Cleavage stimulation factor subunit 1  PRO_0000050944

Post-translational modification

The N-terminus is blocked.

Proteomic databases

EPD: Q05048.
PaxDb: Q05048.
PeptideAtlas: Q05048.
PRIDE: Q05048.

2D gel databases

REPRODUCTION-2DPAGE: IPI00011528.

PTM databases

iPTMnet: Q05048.
PhosphoSite: Q05048.

Expression

Gene expression databases

Bgee: Q05048.
CleanEx: HS_CSTF1.
ExpressionAtlas: Q05048. baseline and differential.
Genevisible: Q05048. HS.

Organism-specific databases

HPA: CAB019270.
HPA047275.
HPA050983.

Interaction

Subunit structure

Homodimer. The CSTF complex is composed of CSTF1 (50 kDa subunit), CSTF2 (64 kDa subunit) and CSTF3 (77 kDa subunit). Interacts directly with CSTF3. Interacts with BARD1. (2 publications)

Binary interactions

With    Entry   #Exp.  IntAct
BARD1   Q99728  4      EBI-1789619, EBI-473181
TOLLIP  Q9H0E2  2      EBI-1789619, EBI-74615

Protein-protein interaction databases

BioGrid: 107859. 40 interactions.
IntAct: Q05048. 11 interactions.
STRING: 9606.ENSP00000217109.

Structure

3D structure databases

ProteinModelPortal: Q05048.
SMR: Q05048. Positions 8-59, 104-424.
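The SMR entry gives model coverage as residue ranges. A small Python sketch of the arithmetic, assuming the ranges are inclusive and using the 431-residue chain length stated under Molecule processing; the names `SMR_RANGES` and `CHAIN_LENGTH` are illustrative only:

```python
# Residue ranges reported for SMR, treated as inclusive (start, end) pairs.
SMR_RANGES = [(8, 59), (104, 424)]
# Chain length of CSTF1 as given in the Chain feature (1 - 431).
CHAIN_LENGTH = 431

# Count residues in each inclusive range and sum them.
covered = sum(end - start + 1 for start, end in SMR_RANGES)
fraction = covered / CHAIN_LENGTH

print(f"{covered} of {CHAIN_LENGTH} residues covered ({fraction:.1%})")
```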

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Description
Repeat       106 – 145    40      WD 1
Repeat       171 – 210    40      WD 2
Repeat       215 – 254    40      WD 3
Repeat       260 – 301    42      WD 4
Repeat       303 – 343    41      WD 5
Repeat       395 – 430    36      WD 6
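The Length value of each feature follows from its inclusive residue positions (end - start + 1). A minimal Python sketch that recomputes it for the six WD repeats listed above; the `WD_REPEATS` mapping and `feature_length` helper are illustrative names, not part of any UniProt tooling:

```python
# Inclusive (start, end) residue positions of the six WD repeats.
WD_REPEATS = {
    "WD 1": (106, 145),
    "WD 2": (171, 210),
    "WD 3": (215, 254),
    "WD 4": (260, 301),
    "WD 5": (303, 343),
    "WD 6": (395, 430),
}


def feature_length(start: int, end: int) -> int:
    """Length of a feature whose start and end positions are both inclusive."""
    return end - start + 1


lengths = {name: feature_length(*pos) for name, pos in WD_REPEATS.items()}
for name, n in lengths.items():
    print(f"{name}: {n} residues")
```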

Region

Feature key  Position(s)  Length  Description
Region       14 – 35      22      Hydrophobic

Domain

The WD6 domain is required for interaction with BARD1. WD domains are responsible for interaction with CSTF3.
N-terminus mediates homodimerization.

Sequence similarities

Contains 6 WD repeats. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat, WD repeat

Phylogenomic databases

eggNOG: KOG0640. Eukaryota.
ENOG410XPWJ. LUCA.
GeneTree: ENSGT00840000129765.
HOGENOM: HOG000234077.
HOVERGEN: HBG051144.
InParanoid: Q05048.
KO: K14406.
OMA: WELSTNR.
OrthoDB: EOG7X0VH2.
PhylomeDB: Q05048.
TreeFam: TF314234.

Family and domain databases

Gene3D: 2.130.10.10. 1 hit.
InterPro: IPR032028. CSTF1_dimer.
IPR015943. WD40/YVTN_repeat-like_dom.
IPR001680. WD40_repeat.
IPR019775. WD40_repeat_CS.
IPR017986. WD40_repeat_dom.
Pfam: PF16699. CSTF1_dimer. 1 hit.
PF00400. WD40. 5 hits.
SMART: SM00320. WD40. 6 hits.
SUPFAM: SSF50978. SSF50978. 1 hit.
PROSITE: PS00678. WD_REPEATS_1. 1 hit.
PS50082. WD_REPEATS_2. 4 hits.
PS50294. WD_REPEATS_REGION. 1 hit.

Sequence

Sequence status: Complete.

Q05048-1 [UniParc]

        10         20         30         40         50
MYRTKVGLKD RQQLYKLIIS QLLYDGYISI ANGLINEIKP QSVCAPSEQL
        60         70         80         90        100
LHLIKLGMEN DDTAVQYAIG RSDTVAPGTG IDLEFDADVQ TMSPEASEYE
       110        120        130        140        150
TCYVTSHKGP CRVATYSRDG QLIATGSADA SIKILDTERM LAKSAMPIEV
       160        170        180        190        200
MMNETAQQNM ENHPVIRTLY DHVDEVTCLA FHPTEQILAS GSRDYTLKLF
       210        220        230        240        250
DYSKPSAKRA FKYIQEAEML RSISFHPSGD FILVGTQHPT LRLYDINTFQ
       260        270        280        290        300
CFVSCNPQDQ HTDAICSVNY NSSANMYVTG SKDGCIKLWD GVSNRCITTF
       310        320        330        340        350
EKAHDGAEVC SAIFSKNSKY ILSSGKDSVA KLWEISTGRT LVRYTGAGLS
       360        370        380        390        400
GRQVHRTQAV FNHTEDYVLL PDERTISLCC WDSRTAERRN LLSLGHNNIV
       410        420        430
RCIVHSPTNP GFMTCSDDFR ARFWYRRSTT D
Length:431
Mass (Da):48,358
Last modified:February 1, 1994 - v1
Checksum:i88A5BE53022AD9E3
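The grouped layout of the sequence (blocks of ten residues under position rulers) is a display convention only. A minimal Python sketch, assuming the residue lines are pasted as plain text, that strips the whitespace, checks the stated length of 431, and slices out the hydrophobic region annotated at positions 14 to 35; the names `RAW`, `sequence` and `region` are illustrative:

```python
# Canonical sequence Q05048-1 as displayed in this entry, ten residues per block.
RAW = """
MYRTKVGLKD RQQLYKLIIS QLLYDGYISI ANGLINEIKP QSVCAPSEQL
LHLIKLGMEN DDTAVQYAIG RSDTVAPGTG IDLEFDADVQ TMSPEASEYE
TCYVTSHKGP CRVATYSRDG QLIATGSADA SIKILDTERM LAKSAMPIEV
MMNETAQQNM ENHPVIRTLY DHVDEVTCLA FHPTEQILAS GSRDYTLKLF
DYSKPSAKRA FKYIQEAEML RSISFHPSGD FILVGTQHPT LRLYDINTFQ
CFVSCNPQDQ HTDAICSVNY NSSANMYVTG SKDGCIKLWD GVSNRCITTF
EKAHDGAEVC SAIFSKNSKY ILSSGKDSVA KLWEISTGRT LVRYTGAGLS
GRQVHRTQAV FNHTEDYVLL PDERTISLCC WDSRTAERRN LLSLGHNNIV
RCIVHSPTNP GFMTCSDDFR ARFWYRRSTT D
"""

# Remove all whitespace to obtain the plain one-letter sequence.
sequence = "".join(RAW.split())


def region(seq: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive positions."""
    return seq[start - 1:end]


hydrophobic = region(sequence, 14, 35)
print(len(sequence))
print(hydrophobic)
```

The same `region` helper can be applied to any of the feature positions in this entry, since they all use 1-based inclusive coordinates.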

Sequence caution

The sequence CAC12718.2 differs from that shown. Reason: Erroneous gene model prediction. (Curated)

Sequence databases

EMBL / GenBank / DDBJ:
L02547 mRNA. Translation: AAA35691.1.
BT007138 mRNA. Translation: AAP35802.1.
AK312774 mRNA. Translation: BAG35638.1.
AL121914 Genomic DNA. Translation: CAC12718.2. Sequence problems.
AL121914 Genomic DNA. Translation: CAI19328.1.
CH471077 Genomic DNA. Translation: EAW75548.1.
CH471077 Genomic DNA. Translation: EAW75549.1.
BC001011 mRNA. Translation: AAH01011.1.
BC007425 mRNA. Translation: AAH07425.1.
CCDS: CCDS13452.1.
PIR: A45142.
RefSeq: NP_001028693.1. NM_001033521.1.
NP_001028694.1. NM_001033522.1.
NP_001315.1. NM_001324.2.
XP_011526902.1. XM_011528600.1.
UniGene: Hs.172865.

Genome annotation databases

Ensembl: ENST00000217109; ENSP00000217109; ENSG00000101138.
GeneID: 1477.
KEGG: hsa:1477.
UCSC: uc002xxm.2. human.

Cross-references

Protocols and materials databases

DNASU: 1477.

Organism-specific databases

CTD: 1477.
GeneCards: CSTF1.
H-InvDB: HIX0015931.
HGNC: HGNC:2483. CSTF1.
HPA: CAB019270.
HPA047275.
HPA050983.
MIM: 600369. gene.
neXtProt: NX_Q05048.
PharmGKB: PA26985.

Miscellaneous databases

ChiTaRS: CSTF1. human.
GeneWiki: CSTF1.
GenomeRNAi: 1477.
PRO: Q05048.

Publications

  1. "A human polyadenylation factor is a G protein beta-subunit homologue."
    Takagaki Y., Manley J.L.
    J. Biol. Chem. 267:23471-23474(1992)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], PROTEIN SEQUENCE OF 101-119 AND 155-170.
  2. "Cloning of human full-length CDSs in BD Creator(TM) system donor vector."
    Kalnine N., Chen X., Rolfs A., Halleck A., Hines L., Eisenstein S., Koundinya M., Raphael J., Moreira D., Kelley T., LaBaer J., Lin Y., Phelan M., Farmer A.
    Submitted (MAY-2003) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
  3. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
    Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
    Nat. Genet. 36:40-45(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Tissue: Testis.
  4. "The DNA sequence and comparative analysis of human chromosome 20."
    Deloukas P., Matthews L.H., Ashurst J.L., Burton J., Gilbert J.G.R., Jones M., Stavrides G., Almeida J.P., Babbage A.K., Bagguley C.L., Bailey J., Barlow K.F., Bates K.N., Beard L.M., Beare D.M., Beasley O.P., Bird C.P., Blakey S.E., Bridgeman A.M., Brown A.J., Buck D., Burrill W.D., Butler A.P., Carder C., Carter N.P., Chapman J.C., Clamp M., Clark G., Clark L.N., Clark S.Y., Clee C.M., Clegg S., Cobley V.E., Collier R.E., Connor R.E., Corby N.R., Coulson A., Coville G.J., Deadman R., Dhami P.D., Dunn M., Ellington A.G., Frankland J.A., Fraser A., French L., Garner P., Grafham D.V., Griffiths C., Griffiths M.N.D., Gwilliam R., Hall R.E., Hammond S., Harley J.L., Heath P.D., Ho S., Holden J.L., Howden P.J., Huckle E., Hunt A.R., Hunt S.E., Jekosch K., Johnson C.M., Johnson D., Kay M.P., Kimberley A.M., King A., Knights A., Laird G.K., Lawlor S., Lehvaeslaiho M.H., Leversha M.A., Lloyd C., Lloyd D.M., Lovell J.D., Marsh V.L., Martin S.L., McConnachie L.J., McLay K., McMurray A.A., Milne S.A., Mistry D., Moore M.J.F., Mullikin J.C., Nickerson T., Oliver K., Parker A., Patel R., Pearce T.A.V., Peck A.I., Phillimore B.J.C.T., Prathalingam S.R., Plumb R.W., Ramsay H., Rice C.M., Ross M.T., Scott C.E., Sehra H.K., Shownkeen R., Sims S., Skuce C.D., Smith M.L., Soderlund C., Steward C.A., Sulston J.E., Swann R.M., Sycamore N., Taylor R., Tee L., Thomas D.W., Thorpe A., Tracey A., Tromans A.C., Vaudin M., Wall M., Wallis J.M., Whitehead S.L., Whittaker P., Willey D.L., Williams L., Williams S.A., Wilming L., Wray P.W., Hubbard T., Durbin R.M., Bentley D.R., Beck S., Rogers J.
    Nature 414:865-871(2001)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  5. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  6. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Tissue: Muscle.
  7. "Functional interaction of BRCA1-associated BARD1 with polyadenylation factor CstF-50."
    Kleiman F.E., Manley J.L.
    Science 285:1576-1579(1999)
    Cited for: INTERACTION WITH BARD1.
  8. "Complex protein interactions within the human polyadenylation machinery identify a novel component."
    Takagaki Y., Manley J.L.
    Mol. Cell. Biol. 20:1515-1525(2000)
    Cited for: SUBUNIT.
  9. Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].

Entry information

Entry name: CSTF1_HUMAN
Accession: Primary (citable) accession number: Q05048
Secondary accession number(s): Q5QPD8
Entry history
Integrated into UniProtKB/Swiss-Prot: February 1, 1994
Last sequence update: February 1, 1994
Last modified: June 8, 2016
This is version 163 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Human chromosome 20
    Human chromosome 20: entries, gene names and cross-references to MIM
  2. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
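The UniRef90 and UniRef50 membership rule described above can be written as a simple predicate. This is an illustrative Python sketch of the stated thresholds only, not UniProt's clustering pipeline; the function name, the dictionary, and the convention that identity and overlap are passed as fractions are all assumptions made for the example:

```python
# Minimum sequence identity to the seed (longest) sequence, per cluster level.
IDENTITY_THRESHOLD = {"UniRef90": 0.90, "UniRef50": 0.50}
# Minimum overlap with the seed sequence, the same for both levels.
MIN_OVERLAP = 0.80


def meets_cluster_criteria(level: str, identity: float, overlap: float) -> bool:
    """Check a candidate against a seed; identity and overlap are fractions in [0, 1]."""
    return identity >= IDENTITY_THRESHOLD[level] and overlap >= MIN_OVERLAP


# A hypothetical candidate with 93% identity and 85% overlap to the seed.
print(meets_cluster_criteria("UniRef90", 0.93, 0.85))
```

Both conditions must hold at once: a sequence that is highly identical to the seed but only overlaps a short stretch of it would fail the check under this reading of the rule.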