Protein

Gag polyprotein

Gene

gag

Organism
Woolly monkey sarcoma virus (WMSV) (Simian sarcoma-associated virus)
Status
Reviewed - Protein inferred from homology

Function

Gag polyprotein: Plays a role in budding and is processed by the viral protease during virion maturation outside the cell. During budding, it recruits, in a PPXY-dependent or independent manner, Nedd4-like ubiquitin ligases that conjugate ubiquitin molecules to Gag, or to Gag binding host factors. Interaction with HECT ubiquitin ligases probably links the viral protein to the host ESCRT pathway and facilitates release. (By similarity)
Matrix protein p15: Targets Gag and Gag-Pol polyproteins to the plasma membrane via a multipartite membrane binding signal that includes its myristoylated N-terminus. Also mediates nuclear localization of the pre-integration complex. (By similarity)
RNA-binding phosphoprotein p12: Constituent of the pre-integration complex (PIC) which tethers the latter to mitotic chromosomes. (By similarity)
Capsid protein p30: Forms the spherical core of the virion that encapsulates the genomic RNA-nucleocapsid complex. (By similarity)
Nucleocapsid protein p10-Gag: Involved in the packaging and encapsidation of two copies of the genome. Binds with high affinity to conserved elements within the packaging signal, located near the 5'-end of the genome. This binding is dependent on genome dimerization. (By similarity)

Regions

Feature key   Position(s)   Description                               Length
Zinc finger   490 – 507     CCHC-type (PROSITE-ProRule annotation)    18

Keywords

Molecular function: RNA-binding, Viral nucleoprotein
Biological process: Host-virus interaction, Viral budding, Viral budding via the host ESCRT complexes, Viral release from host cell
Ligand: Metal-binding, Zinc

Names & Taxonomy

Protein names
Recommended name:
Gag polyprotein
Alternative name(s):
Core polyprotein
Cleaved into the following 4 chains:
Matrix protein p15
Short name:
MA
RNA-binding phosphoprotein p12
Alternative name(s):
pp12
Capsid protein p30
Short name:
CA
Nucleocapsid protein p10-Gag
Gene names
Name: gag
Organism: Woolly monkey sarcoma virus (WMSV) (Simian sarcoma-associated virus)
Taxonomic identifier: 11970 [NCBI]
Taxonomic lineage: Viruses > Retro-transcribing viruses > Retroviridae > Orthoretrovirinae > Gammaretrovirus
Virus host: Lagothrix (woolly monkeys) [TaxID: 9518]
Proteomes
  • UP000167400 Component: Genome
  • UP000203831 Component: Genome

Subcellular location

Gag polyprotein:
  • Virion (By similarity)
  • Host cell membrane (By similarity); Lipid-anchor (By similarity)
  • Host late endosome membrane (By similarity); Lipid-anchor (By similarity)
  • Host multivesicular body (By similarity)
  • Note: These locations are probably linked to virus assembly sites. (Curated)
Matrix protein p15:
  • Virion (By similarity)
Capsid protein p30:
  • Virion (By similarity)
Nucleocapsid protein p10-Gag:
  • Virion (By similarity)
RNA-binding phosphoprotein p12:
  • Host cytoplasm (By similarity)
  • Note: Localizes to the host cytoplasm early in infection and binds to the mitotic chromosomes later on. (By similarity)

Keywords - Cellular component

Capsid protein, Host cell membrane, Host cytoplasm, Host endosome, Host membrane, Membrane, Viral matrix protein, Virion

PTM / Processing

Molecule processing

Feature key            Identifier       Position(s)   Description                      Length
Initiator methionine                                  Removed (Sequence analysis)
Chain                  PRO_0000390820   2 – 521       Gag polyprotein                  520
Chain                  PRO_0000040960   2 – 128       Matrix protein p15               127
Chain                  PRO_0000040961   129 – 196     RNA-binding phosphoprotein p12   68
Chain                  PRO_0000040962   197 – 455     Capsid protein p30               259
Chain                  PRO_0000040963   456 – 521     Nucleocapsid protein p10-Gag     66

Amino acid modifications

Feature key   Position(s)   Description                                       Length
Lipidation    2             N-myristoyl glycine; by host (Sequence analysis)  1

Post-translational modification

Gag polyprotein: Specific enzymatic cleavages by the viral protease yield mature proteins. The protease is released by autocatalytic cleavage. The polyprotein is cleaved during and after budding; this process is termed maturation. (By similarity)
RNA-binding phosphoprotein p12 is phosphorylated on serine residues. (By similarity)

Sites

Feature key   Position(s)   Description                                  Length
Site          128 – 129     Cleavage; by viral protease (By similarity)  2
Site          196 – 197     Cleavage; by viral protease (By similarity)  2
Site          455 – 456     Cleavage; by viral protease (By similarity)  2
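
As a quick consistency check, the chain coordinates listed under Molecule processing can be compared with the three cleavage sites above. The following is a minimal sketch rather than part of any UniProt tooling: the names `CHAINS`, `chain_length`, and `cleavage_sites` are hypothetical, and the coordinates are simply copied from this entry (1-based, inclusive).

```python
# Mature chains of the Gag polyprotein, copied from the Molecule processing
# table of this entry. Coordinates are 1-based and inclusive.
CHAINS = [
    ("Matrix protein p15", 2, 128),
    ("RNA-binding phosphoprotein p12", 129, 196),
    ("Capsid protein p30", 197, 455),
    ("Nucleocapsid protein p10-Gag", 456, 521),
]


def chain_length(start, end):
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def cleavage_sites(chains):
    """Return the (last residue, first residue) pair between adjacent chains."""
    sites = []
    for (_, _, prev_end), (_, next_start, _) in zip(chains, chains[1:]):
        if next_start != prev_end + 1:
            raise ValueError("chains are not contiguous")
        sites.append((prev_end, next_start))
    return sites


lengths = {name: chain_length(start, end) for name, start, end in CHAINS}
sites = cleavage_sites(CHAINS)

for name, length in lengths.items():
    print(f"{name}: {length} residues")
print("Cleavage between residues:", sites)
```

Under these coordinates the four chain lengths add up to the 520 residues of the processed polyprotein (2 – 521), and the junctions coincide with the annotated cleavage sites.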

Keywords - PTM

Lipoprotein, Myristate, Phosphoprotein

Interaction

Subunit structure

Capsid protein p30: Homohexamer; further associates as homomultimer. Capsid protein p30: The virus core is composed of a lattice formed from hexagonal rings, each containing six capsid monomers. Gag polyprotein: Interacts (via PPXY motif) with host NEDD4. Gag polyprotein: Interacts (via PSAP motif) with host TSG101. (By similarity)

Structure

3D structure databases

ProteinModelPortal: P03330

Family & Domains

Coiled coil

Feature key   Position(s)   Description          Length
Coiled coil   408 – 455     (Sequence analysis)  48

Motif

Feature key   Position(s)   Description                      Length
Motif         119 – 122     PTAP/PSAP motif (By similarity)  4
Motif         140 – 143     PPXY motif (Curated)             4
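
The annotated motif positions can be checked against the displayed sequence. The sketch below is illustrative only: it uses residues 101 – 150 copied from the Sequence section of this entry, and the regular expressions `P[TS]AP` and `PP.Y` are assumed, simplified readings of the motif names, not the curators' actual rules. The helper `residues` is a hypothetical name.

```python
import re

# Residues 101-150 of the displayed sequence, copied from this entry.
FRAGMENT = "KIAVVSSPENTRGPSAGRPSAPPRPPIYPATDDLLLLSEPPPYPAALPPP"
FRAGMENT_START = 101  # 1-based position of the first residue in FRAGMENT

# Annotated motifs: (name, start, end, assumed regular expression).
ANNOTATED = [
    ("PTAP/PSAP", 119, 122, r"P[TS]AP"),
    ("PPXY", 140, 143, r"PP.Y"),
]


def residues(start, end):
    """Return residues start..end (1-based, inclusive) from FRAGMENT."""
    return FRAGMENT[start - FRAGMENT_START:end - FRAGMENT_START + 1]


checked = {}
for name, start, end, pattern in ANNOTATED:
    segment = residues(start, end)
    checked[name] = (segment, re.fullmatch(pattern, segment) is not None)

for name, (segment, ok) in checked.items():
    print(f"{name}: {segment} {'matches' if ok else 'does not match'}")
```

Extracting by coordinate rather than scanning is deliberate: a scan of the same fragment with `PP.Y` also reports PPIY at 125 – 128, which this entry does not annotate, so a pattern match alone should not be read as evidence of a functional L domain.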

Compositional bias

Feature key          Position(s)   Description                            Length
Compositional bias   92 – 217      Pro-rich (PROSITE-ProRule annotation)  126

Domain

Gag polyprotein: Late-budding domains (L domains) are short sequence motifs essential for viral particle budding. They recruit proteins of the host ESCRT machinery (Endosomal Sorting Complex Required for Transport) or ESCRT-associated proteins. RNA-binding phosphoprotein p12 contains one L domain: a PPXY motif which potentially interacts with the WW domain 3 of NEDD4 E3 ubiquitin ligase. Matrix protein p15 contains one L domain: a PTAP/PSAP motif, which potentially interacts with the UEV domain of TSG101. (By similarity)

Zinc finger

Feature key   Position(s)   Description                               Length
Zinc finger   490 – 507     CCHC-type (PROSITE-ProRule annotation)    18
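
To see which residues in the 490 – 507 region could act as the zinc ligands of a CCHC-type finger, the region can be matched against the common C-X2-C-X4-H-X4-C spacing. This is a rough sketch under stated assumptions: the regular expression is a simplification and is not the PROSITE profile (PS50158) that produced the annotation, and the variable names are hypothetical. The region is copied from the displayed sequence of this entry.

```python
import re

# Residues 490-507 of the displayed sequence, copied from this entry.
ZNF_REGION = "DQCAYCKEKGHWARECPQ"
ZNF_START = 490  # 1-based position of the first residue in ZNF_REGION

# Assumed, simplified CCHC spacing: C-X2-C-X4-H-X4-C.
CCHC_PATTERN = re.compile(r"C.{2}C.{4}H.{4}C")

# Offsets of the four putative ligand residues within a match.
LIGAND_OFFSETS = (0, 3, 8, 13)

match = CCHC_PATTERN.search(ZNF_REGION)
ligands = []
if match:
    for offset in LIGAND_OFFSETS:
        index = match.start() + offset
        ligands.append((ZNF_REGION[index], ZNF_START + index))

for residue, position in ligands:
    print(f"{residue}{position}")
```

With this pattern the candidate ligands fall at C492, C495, H500, and C505, all inside the annotated 490 – 507 range.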

Keywords - Domain

Coiled coil, Zinc-finger

Family and domain databases

Gene3D
  • 1.10.150.180, 1 hit
  • 1.10.375.10, 1 hit
InterPro
  • IPR000840 G_retro_matrix
  • IPR036946 G_retro_matrix_sf
  • IPR003036 Gag_P30
  • IPR008919 Retrov_capsid_N
  • IPR010999 Retrovr_matrix
  • IPR001878 Znf_CCHC
  • IPR036875 Znf_CCHC_sf
Pfam
  • PF01140 Gag_MA, 1 hit
  • PF02093 Gag_p30, 1 hit
  • PF00098 zf-CCHC, 1 hit
SMART
  • SM00343 ZnF_C2HC, 1 hit
SUPFAM
  • SSF47836 SSF47836, 1 hit
  • SSF47943 SSF47943, 1 hit
  • SSF57756 SSF57756, 1 hit
PROSITE
  • PS50158 ZF_CCHC, 1 hit

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

P03330-1 [UniParc]

        10         20         30         40         50
MGQNNSTPLS LTLDHWKDVR TRAHNLSVKI RKGKWQTFCS SEWPTFGVGW
        60         70         80         90        100
PPEGTFNLSV IFAVKRIVFQ ETGGHPDQVP YIVVWQDLAQ SPPPWVPPSA
       110        120        130        140        150
KIAVVSSPEN TRGPSAGRPS APPRPPIYPA TDDLLLLSEP PPYPAALPPP
       160        170        180        190        200
LAPPAVGPAP GQAPDSSDPE GPAAGTRSRR ARSPADDSGP DSTVILPLRA
       210        220        230        240        250
IGPPAEPNGL VPLQYWPFSS ADLYNWKSNH PSFSENPAGL TGLLESLMFS
       260        270        280        290        300
HQPTWDDCQQ LLQILFTTEE RERILLEARK NVLGDNGAPT QLENLINEAF
       310        320        330        340        350
PLNRPQWDYN TAAGRERLLV YRRTLVAGLK GAARRPTNLA KVREVLQGPA
       360        370        380        390        400
EPPSVFLERL MEAYRRYTPF DPSEEGQQAA VAMAFIGQSA PDIKKKLQRL
       410        420        430        440        450
EGLQDYSLQD LVREAEKVYH KRETEEERQE REKKEAEERE RRRDRRQEKN
       460        470        480        490        500
LTRILAAVVS ERGSRDRQTG NLSNRARKTP RDGRPPLDKD QCAYCKEKGH
       510        520
WARECPQKKN VREAKVLALD D
Length: 521
Mass (Da): 58,316
Last modified: January 31, 2018 - v4
Checksum: 471F0C4FF190FDD0

Sequence caution

The sequence CAA24514 differs from that shown. (Curated)

Experimental Info

Feature key         Position(s)   Description                           Length
Sequence conflict   14            D → G in CAA24514 (PubMed:6298772).   1
Sequence conflict   33            G → E in CAA24514 (PubMed:6298772).   1
Sequence conflict   112           R → Q in CAA24514 (PubMed:6298772).   1
Sequence conflict   317           R → L in CAA24514 (PubMed:6298772).   1
Sequence conflict   383           M → T in CAA24514 (PubMed:6298772).   1
Sequence conflict   386           I → T in CAA24514 (PubMed:6298772).   1
Sequence conflict   462           R → G in CAA24514 (PubMed:6298772).   1
Sequence conflict   465           R → G in CAA24514 (PubMed:6298772).   1
Sequence conflict   468           Q → R in CAA24514 (PubMed:6298772).   1
Sequence conflict   473           S → G in CAA24514 (PubMed:6298772).   1
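
The conflicts above can be applied to the displayed sequence to see what the listed substitutions imply. The sketch below is an illustration under stated assumptions: the sequence block and the ten substitutions are copied from this entry, the helper names `parse_display` and `apply_conflicts` are hypothetical, and the result is only the displayed sequence with these ten point changes. This entry does not state whether CAA24514 differs in any other way, so the output should not be taken as the CAA24514 sequence itself.

```python
# Displayed sequence of P03330, copied from this entry (position rulers omitted).
DISPLAYED = """
MGQNNSTPLS LTLDHWKDVR TRAHNLSVKI RKGKWQTFCS SEWPTFGVGW
PPEGTFNLSV IFAVKRIVFQ ETGGHPDQVP YIVVWQDLAQ SPPPWVPPSA
KIAVVSSPEN TRGPSAGRPS APPRPPIYPA TDDLLLLSEP PPYPAALPPP
LAPPAVGPAP GQAPDSSDPE GPAAGTRSRR ARSPADDSGP DSTVILPLRA
IGPPAEPNGL VPLQYWPFSS ADLYNWKSNH PSFSENPAGL TGLLESLMFS
HQPTWDDCQQ LLQILFTTEE RERILLEARK NVLGDNGAPT QLENLINEAF
PLNRPQWDYN TAAGRERLLV YRRTLVAGLK GAARRPTNLA KVREVLQGPA
EPPSVFLERL MEAYRRYTPF DPSEEGQQAA VAMAFIGQSA PDIKKKLQRL
EGLQDYSLQD LVREAEKVYH KRETEEERQE REKKEAEERE RRRDRRQEKN
LTRILAAVVS ERGSRDRQTG NLSNRARKTP RDGRPPLDKD QCAYCKEKGH
WARECPQKKN VREAKVLALD D
"""

# (position, displayed residue, residue reported for CAA24514); 1-based positions.
CONFLICTS = [
    (14, "D", "G"), (33, "G", "E"), (112, "R", "Q"), (317, "R", "L"),
    (383, "M", "T"), (386, "I", "T"), (462, "R", "G"), (465, "R", "G"),
    (468, "Q", "R"), (473, "S", "G"),
]


def parse_display(block):
    """Strip whitespace from a grouped sequence display, keeping letters only."""
    return "".join(ch for ch in block if ch.isalpha())


def apply_conflicts(sequence, conflicts):
    """Apply point substitutions after checking each displayed residue."""
    residues = list(sequence)
    for position, displayed, alternative in conflicts:
        if residues[position - 1] != displayed:
            raise ValueError(f"residue {position} is not {displayed}")
        residues[position - 1] = alternative
    return "".join(residues)


sequence = parse_display(DISPLAYED)
variant = apply_conflicts(sequence, CONFLICTS)
differences = [i + 1 for i, (a, b) in enumerate(zip(sequence, variant)) if a != b]

print("Length:", len(sequence))
print("Positions changed:", differences)
```

Checking the displayed residue before each substitution guards against off-by-one errors when converting the 1-based positions in the table to string indices.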

Sequence databases

EMBL / GenBank / DDBJ
  • V01201 Genomic DNA. Translation: CAA24514.1. Sequence problems.
  • KT724051 Genomic DNA. Translation: ALV83311.1
PIR
  • A03928 FOMVGS
RefSeq
  • YP_001165469.2, NC_009424.4
  • YP_003580184.1, NC_009424.4

Genome annotation databases

GeneID: 5176148
KEGG: vg:5176148

Entry information

Entry name: GAG_WMSV
Accession: Primary (citable) accession number: P03330
Secondary accession number(s): A0A0U3TJX4
Entry history: Integrated into UniProtKB/Swiss-Prot: July 21, 1986
Last sequence update: January 31, 2018
Last modified: April 25, 2018
This is version 99 of the entry and version 4 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Viral Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome
