Q5EG65 (POLG_HCVGL) Reviewed, UniProtKB/Swiss-Prot

Last modified July 9, 2014. Version 75.

Names and origin

Protein names
Recommended name: Genome polyprotein

Cleaved into the following 6 chains:

  1. Core protein p21
    Alternative name(s):
    Capsid protein C
    p21
  2. Core protein p19
  3. Envelope glycoprotein E1
    Alternative name(s):
    gp32
    gp35
  4. Envelope glycoprotein E2
    Alternative name(s):
    NS1
    gp68
    gp70
  5. p7
  6. Protease NS2-3
    Short name=p23
    EC=3.4.22.-
Organism: Hepatitis C virus (isolate Glasgow) (HCV)
Taxonomic identifier: 329389 [NCBI]
Taxonomic lineage: Viruses › ssRNA positive-strand viruses, no DNA stage › Flaviviridae › Hepacivirus
Virus host: Homo sapiens (Human) [TaxID: 9606]

Protein attributes

Sequence length: 829 AA.
Sequence status: Fragment.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level

General annotation (Comments)

Function

Core protein packages viral RNA to form a viral nucleocapsid, and promotes virion budding. Modulates viral translation initiation by interacting with HCV IRES and 40S ribosomal subunit. Also regulates many host cellular functions such as signaling pathways and apoptosis. Prevents the establishment of cellular antiviral state by blocking the interferon-alpha/beta (IFN-alpha/beta) and IFN-gamma signaling pathways and by inducing human STAT1 degradation. Thought to play a role in virus-mediated cell transformation leading to hepatocellular carcinomas. Interacts with and activates STAT3, leading to cellular transformation. May repress the promoter of p53, and sequester CREB3 and SP110 isoform 3/Sp110b in the cytoplasm. Also represses cell cycle negative regulating factor CDKN1A, thereby interrupting an important checkpoint of normal cell cycle regulation. Targets transcription factors involved in the regulation of inflammatory responses and in the immune response: suppresses NF-kappaB activation, and activates AP-1. Could mediate apoptotic pathways through association with TNF-type receptors TNFRSF1A and LTBR, although its effect on death receptor-induced apoptosis remains controversial. Enhances TRAIL-mediated apoptosis, suggesting that it might play a role in immune-mediated liver cell injury. Serum core protein is able to bind C1QR1 at the T-cell surface, resulting in down-regulation of T-lymphocyte proliferation. May transactivate human MYC, Rous sarcoma virus LTR, and SV40 promoters. May suppress the human FOS and HIV-1 LTR activity. Alters lipid metabolism by interacting with hepatocellular proteins involved in lipid accumulation and storage. Core protein induces up-regulation of FAS promoter activity, and thereby probably contributes to the increased triglyceride accumulation in hepatocytes (steatosis). By similarity.

E1 and E2 glycoproteins form a heterodimer that is involved in virus attachment to the host cell, virion internalization through clathrin-dependent endocytosis and fusion with host membrane. E1/E2 heterodimer binds to human LDLR, CD81 and SCARB1/SR-BI receptors, but this binding is not sufficient for infection; some additional liver-specific cofactors may be needed. The fusion function may be carried by E1. E2 inhibits human EIF2AK2/PKR activation, preventing the establishment of an antiviral state. E2 is a viral ligand for CD209/DC-SIGN and CLEC4M/DC-SIGNR, which are found respectively on dendritic cells (DCs), and on liver sinusoidal endothelial cells and macrophage-like cells of lymph node sinuses. These interactions allow capture of circulating HCV particles by these cells and subsequent transmission to permissive cells. DCs act as sentinels in various tissues where they entrap pathogens and convey them to local lymphoid tissue or lymph node for establishment of immunity. Capture of circulating HCV particles by these SIGN+ cells may facilitate virus infection of proximal hepatocytes and lymphocyte subpopulations and may be essential for the establishment of persistent infection. By similarity.

P7 seems to be a heptameric ion channel protein (viroporin) and is inhibited by the antiviral drug amantadine. Also inhibited by long-alkyl-chain iminosugar derivatives. Essential for infectivity By similarity.

Protease NS2-3 is a cysteine protease responsible for the autocatalytic cleavage of NS2-NS3. Seems to undergo self-inactivation following maturation By similarity.

Enzyme regulation

Activity of auto-protease NS2-3 is dependent on zinc ions and completely inhibited by EDTA By similarity.

Subunit structure

Core protein is a homomultimer that binds the C-terminal part of E1 and interacts with numerous cellular proteins. Interaction with human STAT1 SH2 domain seems to result in decreased STAT1 phosphorylation, leading to decreased IFN-stimulated gene transcription. In addition to blocking the formation of phosphorylated STAT1, the core protein also promotes ubiquitin-mediated proteasome-dependent degradation of STAT1. Interacts with and constitutively activates human STAT3. Associates with human LTBR and TNFRSF1A receptors and possibly induces apoptosis. Binds to human SP110 isoform 3/Sp110b, HNRPK, C1QR1, YWHAE, UBE3A/E6AP, DDX3X, APOA2 and RXRA proteins. Interacts with human CREB3 nuclear transcription protein, triggering cell transformation. May interact with human p53. Also binds human cytokeratins KRT8, KRT18, KRT19 and VIM (vimentin). E1 and E2 glycoproteins form a heterodimer that binds to human LDLR, CD81 and SCARB1 receptors. E2 binds and inhibits human EIF2AK2/PKR. Also binds human CD209/DC-SIGN and CLEC4M/DC-SIGNR. p7 forms a homoheptamer in vitro. By similarity. Ref.2

Subcellular location

Core protein p21: Host endoplasmic reticulum membrane; Single-pass membrane protein By similarity. Host mitochondrion membrane; Single-pass type I membrane protein By similarity. Host lipid droplet By similarity. Note: The C-terminal transmembrane domain of core protein p21 contains an ER signal leading the nascent polyprotein to the ER membrane. Only a minor proportion of core protein is present in the nucleus and an unknown proportion is secreted By similarity. Ref.3 Ref.5

Core protein p19: Virion By similarity. Host cytoplasm By similarity. Host nucleus By similarity. Secreted By similarity. Ref.3 Ref.5

Envelope glycoprotein E1: Virion membrane; Single-pass type I membrane protein Potential. Host endoplasmic reticulum membrane; Single-pass type I membrane protein By similarity. Note: The C-terminal transmembrane domain acts as a signal sequence and forms a hairpin structure before cleavage by host signal peptidase. After cleavage, the membrane sequence is retained at the C-terminus of the protein, serving as ER membrane anchor. A reorientation of the second hydrophobic stretch occurs after cleavage producing a single reoriented transmembrane domain. These events explain the final topology of the protein. ER retention of E1 is leaky and, in overexpression conditions, only a small fraction reaches the plasma membrane By similarity. Ref.3 Ref.5

Envelope glycoprotein E2: Virion membrane; Single-pass type I membrane protein Potential. Host endoplasmic reticulum membrane; Single-pass type I membrane protein By similarity. Note: The C-terminal transmembrane domain acts as a signal sequence and forms a hairpin structure before cleavage by host signal peptidase. After cleavage, the membrane sequence is retained at the C-terminus of the protein, serving as ER membrane anchor. A reorientation of the second hydrophobic stretch occurs after cleavage producing a single reoriented transmembrane domain. These events explain the final topology of the protein. ER retention of E2 is leaky and, in overexpression conditions, only a small fraction reaches the plasma membrane By similarity. Ref.3 Ref.5

p7: Host endoplasmic reticulum membrane; Multi-pass membrane protein By similarity. Host cell membrane By similarity. Note: The C-terminus of p7 membrane domain acts as a signal sequence. After cleavage by host signal peptidase, the membrane sequence is retained at the C-terminus of the protein, serving as ER membrane anchor. Only a fraction localizes to the plasma membrane By similarity. Ref.3 Ref.5

Domain

The transmembrane regions of envelope E1 and E2 glycoproteins are involved in heterodimer formation, ER localization, and assembly of these proteins. Envelope E2 glycoprotein contains two highly variable regions called hypervariable region 1 and 2 (HVR1 and HVR2) and two CD81-binding sites. HVR1 is implicated in the SCARB1-mediated cell entry. HVR2 and CD81-binding sites may be involved in sensitivity and/or resistance to IFN-alpha therapy. By similarity.

Post-translational modification

Specific enzymatic cleavages in vivo yield mature proteins. The structural proteins, core, E1, E2 and p7 are produced by proteolytic processing by host signal peptidases. The core protein is synthesized as a 21 kDa precursor which is retained in the ER membrane through the hydrophobic signal peptide. Cleavage by the signal peptidase releases the 19 kDa mature core protein. The other proteins (p7 and NS2-3) are cleaved by the viral proteases By similarity. Ref.3

Envelope E1 and E2 glycoproteins are highly N-glycosylated By similarity.

Core protein is phosphorylated by host PKC and PKA By similarity.

Core protein is ubiquitinated; mediated by UBE3A and leading to core protein subsequent proteasomal degradation By similarity.

Miscellaneous

Core protein exerts viral interference on hepatitis B virus when HCV and HBV coinfect the same cell, by suppressing HBV gene expression, RNA encapsidation and budding By similarity.

Sequence similarities

Belongs to the hepacivirus polyprotein family.

Caution

The core gene probably also codes for alternative reading frame proteins (ARFPs). Many functions depicted for the core protein might belong to the ARFPs.

Ontologies

Keywords
   Biological process
      Apoptosis
      Clathrin-mediated endocytosis of virus by host
      Fusion of virus membrane with host endosomal membrane
      Fusion of virus membrane with host membrane
      Host-virus interaction
      Interferon antiviral system evasion
      Ion transport
      Transport
      Viral attachment to host cell
      Viral penetration into host cytoplasm
      Virus endocytosis by host
      Virus entry into host cell
   Cellular component
      Capsid protein
      Host cell membrane
      Host cytoplasm
      Host endoplasmic reticulum
      Host lipid droplet
      Host membrane
      Host mitochondrion
      Host nucleus
      Membrane
      Secreted
      Viral envelope protein
      Virion
   Disease
      Oncogene
   Domain
      Transmembrane
      Transmembrane helix
   Ligand
      RNA-binding
      Viral nucleoprotein
      Zinc
   Molecular function
      Hydrolase
      Ion channel
      Protease
      Ribonucleoprotein
      Thiol protease
      Viral ion channel
   PTM
      Acetylation
      Glycoprotein
      Phosphoprotein
      Ubl conjugation
   Technical term
      3D-structure
Gene Ontology (GO)
All terms below are inferred from electronic annotation; the source is given in parentheses.

   Biological_process
      apoptotic process (UniProtKB-KW)
      clathrin-mediated endocytosis of virus by host cell (UniProtKB-KW)
      evasion or tolerance by virus of host immune response (UniProtKB-KW)
      fusion of virus membrane with host endosome membrane (UniProtKB-KW)
      pore formation by virus in membrane of host cell (UniProtKB-KW)
      protein oligomerization (UniProtKB-KW)
      virion attachment to host cell (UniProtKB-KW)
   Cellular_component
      host cell endoplasmic reticulum membrane (UniProtKB-SubCell)
      host cell lipid particle (UniProtKB-SubCell)
      host cell mitochondrial membrane (UniProtKB-SubCell)
      host cell nucleus (UniProtKB-SubCell)
      host cell plasma membrane (UniProtKB-SubCell)
      integral component of membrane (UniProtKB-KW)
      integral to membrane of host cell (UniProtKB-KW)
      ribonucleoprotein complex (UniProtKB-KW)
      viral envelope (UniProtKB-KW)
      viral nucleocapsid (UniProtKB-KW)
      virion membrane (UniProtKB-SubCell)
   Molecular_function
      RNA binding (UniProtKB-KW)
      cysteine-type peptidase activity (UniProtKB-KW)
      ion channel activity (UniProtKB-KW)
      structural molecule activity (InterPro)


Sequence annotation (Features)

Feature key            Position(s)   Length   Description   Feature identifier

Molecule processing

Initiator methionine   1             1        Removed; by host By similarity
Chain                  2 – 191       190      Core protein p21   PRO_0000037559
Chain                  2 – 177       176      Core protein p19 By similarity   PRO_0000037560
Propeptide             178 – 191     14       ER anchor for the core protein, removed in mature form by host signal peptidase By similarity   PRO_0000037561
Chain                  192 – 383     192      Envelope glycoprotein E1 Potential   PRO_0000037562
Chain                  384 – 746     363      Envelope glycoprotein E2 Potential   PRO_0000037563
Chain                  747 – 809     63       p7 By similarity   PRO_0000037564
Chain                  810 – ›829    ›20      Protease NS2-3 Potential   PRO_0000037565
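
The Length column follows from the 1-based, inclusive coordinates used in this table: length = end - start + 1. The following is a minimal sketch that recomputes the lengths of the fully bounded chains. The `CHAINS` dictionary and the `feature_length` helper are illustrative names introduced here, not part of any UniProt tooling; the open-ended NS2-3 chain (810 to ›829) is left out because its true end lies beyond this fragment.

```python
# Chain coordinates copied from the Molecule processing table (1-based, inclusive).
# CHAINS and feature_length are hypothetical names used only for this sketch.
CHAINS = {
    "Core protein p21": (2, 191),
    "Core protein p19": (2, 177),
    "Envelope glycoprotein E1": (192, 383),
    "Envelope glycoprotein E2": (384, 746),
    "p7": (747, 809),
}


def feature_length(start: int, end: int) -> int:
    """Length of a feature given inclusive 1-based start and end positions."""
    return end - start + 1


lengths = {name: feature_length(*span) for name, span in CHAINS.items()}

for name, n in lengths.items():
    print(f"{name}: {n} AA")
```

The computed values agree with the Length column above (190, 176, 192, 363 and 63).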

Regions

Topological domain     2 – 168       167      Cytoplasmic Potential
Transmembrane          169 – 189     21       Helical; Potential
Topological domain     190 – 358     169      Lumenal Potential
Transmembrane          359 – 379     21       Helical; Potential
Topological domain     380 – 725     346      Lumenal Potential
Transmembrane          726 – 746     21       Helical; Potential
Topological domain     747 – 757     11       Lumenal Potential
Transmembrane          758 – 778     21       Helical; Potential
Topological domain     779 – 782     4        Cytoplasmic Potential
Transmembrane          783 – 803     21       Helical; Potential
Topological domain     804 – 813     10       Lumenal Potential
Transmembrane          814 – ›829    ›16      Helical; Potential
Region                 2 – 59        58       Interaction with DDX3X By similarity
Region                 2 – 23        22       Interaction with STAT1 By similarity
Region                 122 – 173     52       Interaction with APOA2 By similarity
Region                 150 – 159     10       Mitochondrial targeting signal By similarity
Region                 164 – 167     4        Important for lipid droplets localization By similarity
Region                 265 – 296     32       Fusion peptide Potential
Region                 385 – 411     27       HVR1 By similarity
Region                 475 – 481     7        HVR2 By similarity
Region                 482 – 494     13       CD81-binding 1 Potential
Region                 522 – 553     32       CD81-binding 2 Potential
Region                 660 – 671     12       PKR/eIF2-alpha phosphorylation homology domain (PePHD)
Motif                  5 – 13        9        Nuclear localization signal Potential
Motif                  38 – 43       6        Nuclear localization signal Potential
Motif                  58 – 64       7        Nuclear localization signal Potential
Motif                  66 – 71       6        Nuclear localization signal Potential
Compositional bias     476 – 479     4        Poly-Gly
Compositional bias     796 – 803     8        Poly-Leu
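
The topological domains and transmembrane helices listed above alternate along the fragment and should tile it from residue 2 to residue 829 without gaps or overlaps. The following is a rough sketch of that consistency check; `SEGMENTS` and `find_breaks` are names made up for illustration, and the coordinates are simply copied from the table.

```python
# (kind, start, end) copied from the Regions table, in sequence order.
# Coordinates are 1-based and inclusive. Names here are illustrative only.
SEGMENTS = [
    ("topological domain", 2, 168),
    ("transmembrane", 169, 189),
    ("topological domain", 190, 358),
    ("transmembrane", 359, 379),
    ("topological domain", 380, 725),
    ("transmembrane", 726, 746),
    ("topological domain", 747, 757),
    ("transmembrane", 758, 778),
    ("topological domain", 779, 782),
    ("transmembrane", 783, 803),
    ("topological domain", 804, 813),
    ("transmembrane", 814, 829),  # truncated: 829 is a non-terminal residue
]


def find_breaks(segments):
    """Return (previous_end, next_start) pairs where neighbours do not abut."""
    breaks = []
    for (_, _, prev_end), (_, start, _) in zip(segments, segments[1:]):
        if start != prev_end + 1:
            breaks.append((prev_end, start))
    return breaks


breaks = find_breaks(SEGMENTS)
helix_lengths = [end - start + 1 for kind, start, end in SEGMENTS
                 if kind == "transmembrane"]

print(breaks)
print(helix_lengths)
```

An empty `breaks` list means each segment begins one residue after the previous one ends. Every helix except the truncated last one comes out at 21 residues, matching the Length column.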

Sites

Site                   177 – 178     2        Cleavage; by host signal peptidase By similarity
Site                   191 – 192     2        Cleavage; by host signal peptidase Potential
Site                   383 – 384     2        Cleavage; by host signal peptidase Potential
Site                   746 – 747     2        Cleavage; by host signal peptidase By similarity
Site                   809 – 810     2        Cleavage; by host signal peptidase By similarity

Amino acid modifications

Modified residue       2             1        N-acetylserine; by host By similarity
Modified residue       53            1        Phosphoserine; by host By similarity
Modified residue       99            1        Phosphoserine; by host By similarity
Modified residue       116           1        Phosphoserine; by host PKA By similarity
Glycosylation          196           1        N-linked (GlcNAc...); by host Potential
Glycosylation          209           1        N-linked (GlcNAc...); by host Potential
Glycosylation          234           1        N-linked (GlcNAc...); by host Potential
Glycosylation          305           1        N-linked (GlcNAc...); by host Potential
Glycosylation          417           1        N-linked (GlcNAc...); by host Potential
Glycosylation          423           1        N-linked (GlcNAc...); by host Potential
Glycosylation          430           1        N-linked (GlcNAc...); by host Potential
Glycosylation          448           1        N-linked (GlcNAc...); by host Potential
Glycosylation          540           1        N-linked (GlcNAc...); by host Potential
Glycosylation          556           1        N-linked (GlcNAc...); by host Potential
Glycosylation          576           1        N-linked (GlcNAc...); by host Potential
Glycosylation          623           1        N-linked (GlcNAc...); by host Potential
Glycosylation          645           1        N-linked (GlcNAc...); by host Potential
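
The potential N-linked glycosylation sites can be cross-checked against the usual consensus sequon N-X-[S/T], where X is any residue except proline. The sketch below scans residues 181 to 240 of the displayed sequence. It assumes that this standard sequon rule is an adequate stand-in; the entry itself does not state how the sites were predicted, and `find_sequons` is a name invented for this example.

```python
import re

# Residues 181-240 of the displayed sequence (one 60-residue row).
SEGMENT_START = 181
SEGMENT = "LLSCLTVPASAYQVRNSTGLYHVTNDCPNSSIVYEAVDAILHTPGCVPCVREGNASRCWV"


def find_sequons(segment: str, first_pos: int) -> list[int]:
    """Return 1-based positions of Asn residues in an N-X-[S/T] sequon (X != P).

    A lookahead is used so that overlapping sequons are all reported.
    """
    return [first_pos + m.start()
            for m in re.finditer(r"N(?=[^P][ST])", segment)]


sites = find_sequons(SEGMENT, SEGMENT_START)
print(sites)  # [196, 209, 234]
```

Within this window the scan recovers the first three glycosylation positions in the table. Asn-205 is skipped because it is followed by D-C, which does not complete a sequon.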

Experimental info

Mutagenesis            180 – 184     5        ALLSC → VLLLV: Complete loss of processing. Ref.3
Non-terminal residue   829           1
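
To make the mutagenesis record concrete, the sketch below applies the 180 to 184 substitution (ALLSC → VLLLV) to residues 171 to 190 of the displayed sequence, after first checking that the wild-type residues really are at those coordinates. The `apply_substitution` helper is hypothetical and written only for this illustration.

```python
# Residues 171-190 of the displayed sequence, spanning the mutated stretch.
SEGMENT_START = 171
WILD_TYPE = "GCSFSIFLLALLSCLTVPAS"


def apply_substitution(segment: str, first_pos: int, start: int,
                       wild: str, mutant: str) -> str:
    """Replace `wild` with `mutant` at 1-based position `start`.

    Raises ValueError if the segment does not carry `wild` at that position,
    which guards against off-by-one coordinate mistakes.
    """
    i = start - first_pos
    if segment[i:i + len(wild)] != wild:
        raise ValueError(f"expected {wild!r} at position {start}")
    return segment[:i] + mutant + segment[i + len(wild):]


mutant = apply_substitution(WILD_TYPE, SEGMENT_START, 180, "ALLSC", "VLLLV")
print(WILD_TYPE)  # GCSFSIFLLALLSCLTVPAS
print(mutant)     # GCSFSIFLLVLLLVLTVPAS
```

The substituted stretch lies inside the 178 to 191 propeptide that anchors the core protein in the ER membrane.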


Sequences

Sequence: Q5EG65 [UniParc].
Length: 829 AA. Mass: 90,587 Da.

Last modified January 23, 2007. Version 3.
Checksum: 17AD3868F50B4AD4

        10         20         30         40         50         60 
MSTNPKPQRK TKRNTNRRPQ DVKFPGGGQI VGGVYLLPRR GPRLGVRATR KTSERSQPRG 

        70         80         90        100        110        120 
RRQPIPKARR PKGRNWAQPG YPWPLYGNEG CGWAGWLPSP RGSRPSWGPN DPRRRSRNLG 

       130        140        150        160        170        180 
KVIDTLTCGF VDLMGYIPLV GAPLRGAARA LAHGVRVLED GVNYATGNLP GCSFSIFLLA 

       190        200        210        220        230        240 
LLSCLTVPAS AYQVRNSTGL YHVTNDCPNS SIVYEAVDAI LHTPGCVPCV REGNASRCWV 

       250        260        270        280        290        300 
AMTPTVATRD GRLPTTQLRR HIDLLVGSAT LCSALYVGDL CGSVFLVGQL FTFSPRRHWT 

       310        320        330        340        350        360 
TQGCNCSIYP GHITGHRMAW DMMMNWSPTT ALVVAQLLRI PQAILDMIAG AHWGVLAGMA 

       370        380        390        400        410        420 
YFSMVGNWAK VLAVLLLFAG VDAETHVTGG AAARSTLQLA GLFQPGAKQN VQLINTNGSW 

       430        440        450        460        470        480 
HVNRTALNCN DSLNTGWIAG LFYYHGFNSS GCSERLASCR SLTDFDQGWG PISYAGGGGP 

       490        500        510        520        530        540 
DHRPYCWHYP PKPCGIVPAK SVCGPVYCFT PSPVVVGTTD RSGAPTYSWG ADDTDVFVLN 

       550        560        570        580        590        600 
NTRPPLGNWF GCTWMNSTGF TKVCGAPPCV IGGVGNNTLH CPTDCFRKHP EATYSRCGSG 

       610        620        630        640        650        660 
PWLTPRCLVD YPYRLWHYPC TINHSIFKVR MYVGGVEHRL DAACNWTRGE RCDLEDRDRS 

       670        680        690        700        710        720 
ELSPLLLSTT QWQVLPCSFT TLPALSTGLI HLHQNIVDVQ YLYGVGSSIA SWAIKWEYVV 

       730        740        750        760        770        780 
LLFLLLADAR VCSCLWMMLL ISQAEAALEN LVVLNAASLA GTHGLVSFLV FFCFAWFLRG 

       790        800        810        820 
KWVPGAVYAL YGMWPLLLLL LALPQRAYAL DTEVAASCGG VVLVGLMAL 
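
The sequence above is laid out in blocks of ten residues under a position ruler, so it has to be flattened before positions from the feature table can be looked up. The following is a minimal sketch of that step, shown on the first row only; `parse_sequence_block` is an illustrative helper, not a UniProt utility.

```python
import re


def parse_sequence_block(block: str) -> str:
    """Strip ruler numbers and whitespace, keeping only residue letters."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


# First ruler line and first row of the displayed sequence.
first_row = """
        10         20         30         40         50         60
MSTNPKPQRK TKRNTNRRPQ DVKFPGGGQI VGGVYLLPRR GPRLGVRATR KTSERSQPRG
"""

seq = parse_sequence_block(first_row)

print(len(seq))     # 60
print(seq[2 - 1])   # S: feature positions are 1-based, Python indices 0-based
print(seq[53 - 1])  # S
```

Positions 2 and 53 are both serine, consistent with the N-acetylserine and phosphoserine annotations at those positions.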


References

[1] "Covalent interactions are not required to permit or stabilize the non-covalent association of hepatitis C virus glycoproteins E1 and E2."
Patel J., Patel A.H., McLauchlan J.
J. Gen. Virol. 80:1681-1690(1999)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC RNA].
[2] "Hepatitis C virus core protein interacts with a human DEAD box protein DDX3."
Owsianka A.M., Patel A.H.
Virology 257:330-340(1999)
Cited for: INTERACTION OF CORE PROTEIN WITH HUMAN DDX3X.
[3] "Intramembrane proteolysis promotes trafficking of hepatitis C virus core protein to lipid droplets."
McLauchlan J., Lemberg M.K., Hope G., Martoglio B.
EMBO J. 21:3980-3988(2002)
Cited for: CLEAVAGE OF CORE PROTEIN BY THE SIGNAL PEPTIDASE, SUBCELLULAR LOCATION, MUTAGENESIS OF 180-ALA--CYS-184.
[4] "Properties of the hepatitis C virus core protein: a structural protein that modulates cellular processes."
McLauchlan J.
J. Viral Hepat. 7:2-14(2000)
Cited for: REVIEW.
[5] "Structural biology of hepatitis C virus."
Penin F., Dubuisson J., Rey F.A., Moradpour D., Pawlotsky J.-M.
Hepatology 39:5-19(2004)
Cited for: REVIEW, SUBCELLULAR LOCATION.

Web resources

euHCVdb

The European HCV database

Virus Pathogen Resource

Cross-references

Sequence databases

EMBL / GenBank / DDBJ: AY885238 Genomic RNA. Translation: AAW78019.1.

3D structure databases

PDBe / RCSB-PDB / PDBj
Entry   Method   Resolution (Å)   Chain   Positions
4GAG    X-ray    1.80             P       412-423
ProteinModelPortal: Q5EG65.
SMR: Q5EG65. Positions 2-45.

Organism-specific databases

euHCVdb: AY885238.

Family and domain databases

InterPro
   IPR002521. HCV_core_C.
   IPR002522. HCV_core_N.
   IPR002519. HCV_env.
   IPR002531. HCV_NS1.
Pfam
   PF01543. HCV_capsid. 1 hit.
   PF01542. HCV_core. 1 hit.
   PF01539. HCV_env. 1 hit.
   PF01560. HCV_NS1. 1 hit.
ProDom
   PD001388. HCV_env. 1 hit.

Entry information

Entry name: POLG_HCVGL
Accession: Primary (citable) accession number: Q5EG65
Entry history
Integrated into UniProtKB/Swiss-Prot: July 19, 2005
Last sequence update: January 23, 2007
Last modified: July 9, 2014
This is version 75 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Viral Protein Annotation Program

Relevant documents

SIMILARITY comments

Index of protein domains and families

PDB cross-references

Index of Protein Data Bank (PDB) cross-references