P04591 (GAG_HV1H2) Reviewed, UniProtKB/Swiss-Prot

Last modified May 1, 2013. Version 130.

Names and origin

Protein names
Recommended name: Gag polyprotein
Alternative name(s): Pr55Gag

Cleaved into the following 6 chains:

  1. Matrix protein p17
    Short name=MA
  2. Capsid protein p24
    Short name=CA
  3. Spacer peptide p2
  4. Nucleocapsid protein p7
    Short name=NC
  5. Spacer peptide p1
  6. p6-gag
Gene names
Name: gag
Organism: Human immunodeficiency virus type 1 group M subtype B (isolate HXB2) (HIV-1) [Reference proteome]
Taxonomic identifier: 11706 [NCBI]
Taxonomic lineage: Viruses; Retro-transcribing viruses; Retroviridae; Orthoretrovirinae; Lentivirus; Primate lentivirus group
Virus host: Homo sapiens (Human) [TaxID: 9606]

Protein attributes

Sequence length: 500 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level.

General annotation (Comments)

Function

Matrix protein p17 targets Gag and Gag-Pol polyproteins to the plasma membrane via a multipartite membrane-binding signal that includes its myristoylated N-terminus. It also mediates nuclear localization of the preintegration complex and is implicated in the Vpu-mediated release from the host cell.

Capsid protein p24 forms the conical core of the virus that encapsulates the genomic RNA-nucleocapsid complex.

Nucleocapsid protein p7 encapsulates and protects viral dimeric unspliced (genomic) RNA. Binds these RNAs through its zinc fingers.

p6-gag plays a role in budding of the assembled particle by interacting with the host class E VPS proteins TSG101 and PDCD6IP/AIP1 (by similarity).

Subunit structure

Matrix protein p17, and probably Pr55Gag, form hexamer rings of trimers (probable). Oligomerization possibly creates a central hole into which the cytoplasmic tail of the gp41 envelope protein may be inserted. Pr55Gag interacts with host TRIM22; this interaction seems to disrupt proper trafficking of Gag polyprotein and may interfere with budding. p6-gag interacts with Vpr. p6-gag interacts with host TSG101. p6-gag interacts with host PDCD6IP/AIP1 (by similarity). Pr55Gag interacts with host PDZD8 (by similarity). Ref.5, Ref.6

Subcellular location

Matrix protein p17: Virion (potential; Ref.2). Host nucleus (by similarity). Host cytoplasm (by similarity). Host cell membrane; Lipid-anchor (potential). Note: Following virus entry, the nuclear localization signal (NLS) of the matrix protein participates with Vpr in the nuclear localization of the viral genome. During virus production, the nuclear export activity of the matrix protein counteracts the NLS to maintain the Gag and Gag-Pol polyproteins in the cytoplasm, thereby directing unspliced RNA to the plasma membrane (by similarity). Ref.2

Capsid protein p24: Virion (potential; Ref.2).

Nucleocapsid protein p7: Virion (potential; Ref.2).

Domain

Late-budding domains (L domains) are short sequence motifs essential for viral particle budding. They recruit proteins of the host ESCRT machinery (Endosomal Sorting Complex Required for Transport) or ESCRT-associated proteins. p6-gag contains two L domains: a PTAP/PSAP motif, which interacts with the UEV domain of TSG101, and a LYPX(n)L motif, which interacts with PDCD6IP/AIP1 (by similarity).
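
The two L domains can be located in the p6-gag sequence with simple pattern matching. The following Python sketch is illustrative only: the names `P6`, `PTAP`, `LYPXNL` and `locate` are hypothetical, the p6-gag residues (449-500) are copied from the sequence displayed in this entry, and the bound of 1 to 6 residues for X(n) is an assumption chosen to span the annotated motif, not a definition taken from the entry.

```python
import re

# p6-gag, residues 449-500 of the displayed sequence, written in the
# same 10-residue groups as the sequence listing.
P6_START = 449
P6 = (
    "LQ"
    "SRPEPTAPPE" "ESFRSGVETT" "TPPQKQEPID" "KELYPLTSLR" "SLFGNDPSSQ"
)

# PTAP/PSAP motif: P, then T or S, then A, then P.
PTAP = re.compile(r"P[TS]AP")
# LYPX(n)L motif; the 1-6 residue range for X(n) is an assumption.
LYPXNL = re.compile(r"LYP.{1,6}L")

def locate(pattern, seq, offset):
    """Return 1-based inclusive polyprotein coordinates of the first match."""
    m = pattern.search(seq)
    if m is None:
        return None
    return (m.start() + offset, m.end() + offset - 1)

ptap_span = locate(PTAP, P6, P6_START)
lypxnl_span = locate(LYPXNL, P6, P6_START)

if __name__ == "__main__":
    print("PTAP/PSAP motif:", ptap_span)
    print("LYPX(n)L motif:", lypxnl_span)
```

Under these assumptions the matches coincide with the motif positions given in the feature table of this entry (455 – 458 and 483 – 492).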

Post-translational modification

Capsid protein p24 is phosphorylated.

Specific enzymatic cleavages by the viral protease yield mature proteins. The polyprotein is cleaved during and after budding; this process is termed maturation (by similarity).

Nucleocapsid protein p7 is methylated by host PRMT6, impairing its function by reducing RNA annealing and the initiation of reverse transcription (by similarity).

Miscellaneous

HIV-1 lineages are divided into three main groups, M (for Major), O (for Outlier), and N (for New, or Non-M, Non-O). The vast majority of strains found worldwide belong to group M. Group O seems to be endemic to and largely confined to Cameroon and neighboring countries in West Central Africa, where these viruses represent a small minority of HIV-1 strains. Group N is represented by a limited number of isolates from Cameroonian persons. Group M is further subdivided into 9 clades or subtypes (A to D, F to H, J and K).

Sequence similarities

Belongs to the primate lentivirus group gag polyprotein family.

Contains 2 CCHC-type zinc fingers.

Ontologies

Keywords
   Biological process: Host-virus interaction; Viral budding; Viral budding via the host ESCRT complexes; Virus exit from host cell
   Cellular component: Host cell membrane; Host cytoplasm; Host membrane; Host nucleus; Membrane; Virion
   Coding sequence diversity: Ribosomal frameshifting
   Disease: AIDS
   Domain: Repeat; Zinc-finger
   Ligand: Metal-binding; RNA-binding; Viral nucleoprotein; Zinc
   Molecular function: Capsid protein
   PTM: Lipoprotein; Methylation; Myristate; Phosphoprotein
   Technical term: 3D-structure; Complete proteome; Reference proteome
Gene Ontology (GO)
   Biological_process
      RNA-dependent DNA replication (Traceable author statement. Source: Reactome)
      egress of virus within host cell (Traceable author statement. Source: Reactome)
      entry into host cell (Traceable author statement. Source: Reactome)
      uncoating of virus (Traceable author statement. Source: Reactome)
      viral release from host cell (Inferred from electronic annotation. Source: UniProtKB-KW)
   Cellular_component
      cytosol (Traceable author statement. Source: Reactome)
      endosome membrane (Traceable author statement. Source: Reactome)
      host cell cytoplasm (Inferred from electronic annotation. Source: UniProtKB-SubCell)
      host cell nucleus (Inferred from electronic annotation. Source: UniProtKB-SubCell)
      host cell plasma membrane (Inferred from electronic annotation. Source: UniProtKB-SubCell)
      nucleoplasm (Traceable author statement. Source: Reactome)
      viral capsid (Inferred from electronic annotation. Source: UniProtKB-KW)
   Molecular_function
      RNA binding (Inferred from electronic annotation. Source: UniProtKB-KW)
      structural molecule activity (Inferred from electronic annotation. Source: InterPro)
      zinc ion binding (Inferred from electronic annotation. Source: InterPro)

Binary interactions

With      Entry    #Exp.  IntAct                     Notes
AIMP1     Q12904   3      EBI-6179719, EBI-1045802   From a different organism.
AIMP2     Q13155   3      EBI-6179719, EBI-745226    From a different organism.
CTNNA3    Q9UI47   2      EBI-6179727, EBI-3937546   From a different organism.
EEF1E1    O43324   4      EBI-6179719, EBI-1048486   From a different organism.
LRRC47    Q8N1G4   2      EBI-6179727, EBI-2509921   From a different organism.
MRPL11    Q9Y3B7   3      EBI-6179727, EBI-5453723   From a different organism.
NHP2L1    P55769   2      EBI-6179727, EBI-712228    From a different organism.
NOLC1     Q14978   2      EBI-6179719, EBI-396155    From a different organism.
OLA1      Q9NTK5   4      EBI-6179719, EBI-766468    From a different organism.
PRMT1     Q99873   2      EBI-6179727, EBI-78738     From a different organism.
SDCCAG8   Q86SQ7   2      EBI-6179719, EBI-1047850   From a different organism.
SEPSECS   Q9HD40   3      EBI-6163428, EBI-6163446   From a different organism.
YTHDF3    Q7Z739   2      EBI-6179727, EBI-2849837   From a different organism.

Alternative products

This entry describes 2 isoforms produced by ribosomal frameshifting.

Note: Translation results in the formation of the Gag polyprotein most of the time. Ribosomal frameshifting at the gag-pol gene boundary occurs at low frequency and produces the Gag-Pol polyprotein. This translation strategy probably allows the virus to modulate the quantity of each viral protein. Maintenance of a correct Gag to Gag-Pol ratio is essential for RNA dimerization and viral infectivity.
Isoform Gag polyprotein (identifier: P04591-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Note: Produced by conventional translation.
Isoform Gag-Pol polyprotein (identifier: P04585-1)

The sequence of this isoform can be found in the external entry P04585.
Isoforms of the same protein are often annotated in two different entries if their sequences differ significantly.
Note: Produced by -1 ribosomal frameshifting.

Sequence annotation (Features)

Feature key           Position(s)  Length  Description                              Feature identifier

Molecule processing

Initiator methionine  1            1       Removed; by host (by similarity)
Chain                 2 – 500      499     Gag polyprotein                          PRO_0000261216
Chain                 2 – 132      131     Matrix protein p17 (by similarity)       PRO_0000038593
Chain                 133 – 363    231     Capsid protein p24 (by similarity)       PRO_0000038594
Peptide               364 – 377    14      Spacer peptide p2 (by similarity)        PRO_0000038595
Chain                 378 – 432    55      Nucleocapsid protein p7 (by similarity)  PRO_0000038596
Peptide               433 – 448    16      Spacer peptide p1 (by similarity)        PRO_0000038597
Chain                 449 – 500    52      p6-gag (by similarity)                   PRO_0000038598
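
The positions above are 1-based and inclusive, so each length equals end minus start plus one, and the six mature products should cover residues 2 to 500 of the precursor without gaps or overlaps. A minimal Python sketch of that arithmetic follows; `CHAINS`, `length` and `tiles_without_gaps` are illustrative names, and the coordinates are copied from the rows above.

```python
# Mature products of the Gag polyprotein as (name, start, end), using the
# 1-based inclusive coordinates listed under "Molecule processing".
CHAINS = [
    ("Matrix protein p17", 2, 132),
    ("Capsid protein p24", 133, 363),
    ("Spacer peptide p2", 364, 377),
    ("Nucleocapsid protein p7", 378, 432),
    ("Spacer peptide p1", 433, 448),
    ("p6-gag", 449, 500),
]

def length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1

def tiles_without_gaps(chains, first, last):
    """True if consecutive chains abut and together cover first..last."""
    if chains[0][1] != first or chains[-1][2] != last:
        return False
    return all(prev[2] + 1 == nxt[1] for prev, nxt in zip(chains, chains[1:]))

lengths = {name: length(start, end) for name, start, end in CHAINS}
total = sum(lengths.values())

if __name__ == "__main__":
    for name, n in lengths.items():
        print(f"{name}: {n} aa")
    print("sum of mature products:", total)
    print("tiles 2..500:", tiles_without_gaps(CHAINS, 2, 500))
```

The summed length is 499, which matches the Gag polyprotein chain (2 – 500) after removal of the initiator methionine.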

Regions

Zinc finger           390 – 407    18      CCHC-type 1
Zinc finger           411 – 428    18      CCHC-type 2
Motif                 16 – 22      7       Nuclear export signal
Motif                 26 – 32      7       Nuclear localization signal
Motif                 455 – 458    4       PTAP/PSAP motif
Motif                 483 – 492    10      LYPX(n)L motif
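
As a rough cross-check of the two zinc-finger rows, the Cys and His residues that give the CCHC type its name can be searched for in the nucleocapsid sequence. The sketch below is illustrative: the C-X2-C-X4-H-X4-C spacing is the pattern commonly described for retroviral zinc knuckles and is an assumption here, not something stated in this entry; `NC`, `CCHC` and `find_cchc` are hypothetical names, and the residues are copied from positions 378-432 of the displayed sequence.

```python
import re

# Nucleocapsid protein p7, residues 378-432 of the displayed sequence,
# written in the same 10-residue groups as the sequence listing.
NC_START = 378
NC = (
    "MQR"
    "GNFRNQRKIV" "KCFNCGKEGH" "TARNCRAPRK" "KGCWKCGKEG" "HQMKDCTERQ"
    "AN"
)

# Assumed spacing of the zinc-coordinating residues: C-X2-C-X4-H-X4-C.
CCHC = re.compile(r"C.{2}C.{4}H.{4}C")

def find_cchc(seq, offset):
    """Return 1-based inclusive polyprotein coordinates of each match."""
    return [(m.start() + offset, m.end() + offset - 1)
            for m in CCHC.finditer(seq)]

fingers = find_cchc(NC, NC_START)

if __name__ == "__main__":
    for start, end in fingers:
        print(f"CCHC core at {start}-{end}")
```

With this pattern the two cores fall at 392-405 and 413-426, each lying inside the corresponding annotated 18-residue zinc-finger range.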

Sites

Site                  132 – 133    2       Cleavage; by viral protease (by similarity)
Site                  363 – 364    2       Cleavage; by viral protease (by similarity)
Site                  377 – 378    2       Cleavage; by viral protease (by similarity)
Site                  432 – 433    2       Cleavage; by viral protease (by similarity)
Site                  448 – 449    2       Cleavage; by viral protease (by similarity)
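
Each cleavage site spans the last residue of one mature product and the first residue of the next, so the five sites can be derived from the chain end positions. A short Python sketch of that relationship, with illustrative names (`CHAIN_ENDS`, `scissile_bonds`) and the end positions copied from the "Molecule processing" rows of this entry:

```python
# End positions (1-based) of the first five mature products. p6-gag ends
# at the C-terminus of the polyprotein and has no downstream site.
CHAIN_ENDS = {
    "Matrix protein p17": 132,
    "Capsid protein p24": 363,
    "Spacer peptide p2": 377,
    "Nucleocapsid protein p7": 432,
    "Spacer peptide p1": 448,
}

# Cleavage sites as listed under "Sites".
LISTED_SITES = [(132, 133), (363, 364), (377, 378), (432, 433), (448, 449)]

def scissile_bonds(chain_ends):
    """Return (last residue, next residue) pairs, sorted by position."""
    return sorted((end, end + 1) for end in chain_ends.values())

derived = scissile_bonds(CHAIN_ENDS)

if __name__ == "__main__":
    print(derived == LISTED_SITES)
```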

Amino acid modifications

Modified residue      387          1       Asymmetric dimethylarginine; in Nucleocapsid protein p7; by host PRMT6 (by similarity)
Modified residue      409          1       Asymmetric dimethylarginine; in Nucleocapsid protein p7; by host PRMT6 (by similarity)
Lipidation            2            1       N-myristoyl glycine; by host (by similarity)

Experimental info

Mutagenesis           18           1       K → A: Replication-defective, induces nuclear mislocalization of matrix protein; when associated with G-22. Ref.2
Mutagenesis           22           1       R → G: Replication-defective, induces nuclear mislocalization of matrix protein; when associated with A-18. Ref.2
Mutagenesis           27           1       K → A: No effect on subcellular localization of matrix protein; when associated with A-18 and G-22. Ref.2
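
The mutagenesis rows use 1-based positions in the canonical sequence, so the wild-type residue named in each row can be checked against the displayed sequence before a substitution is applied. The Python sketch below is illustrative only; `N_TERM`, `MUTATIONS`, `residue` and `apply_mutations` are hypothetical names, and the 30 residues are copied from the start of the sequence shown in this entry.

```python
# Residues 1-30 of the displayed sequence, in the 10-residue groups of
# the sequence listing.
N_TERM = "MGARASVLSG" "GELDRWEKIR" "LRPGGKKKYK"

# Mutagenesis rows as (position, wild-type residue, mutant residue).
MUTATIONS = [(18, "K", "A"), (22, "R", "G"), (27, "K", "A")]

def residue(seq, pos):
    """Residue at a 1-based position."""
    return seq[pos - 1]

def apply_mutations(seq, mutations):
    """Apply point substitutions after checking each wild-type residue."""
    out = list(seq)
    for pos, wild_type, mutant in mutations:
        found = residue(seq, pos)
        if found != wild_type:
            raise ValueError(f"expected {wild_type} at {pos}, found {found}")
        out[pos - 1] = mutant
    return "".join(out)

# The replication-defective combination described above: A-18 with G-22.
double_mutant = apply_mutations(N_TERM, MUTATIONS[:2])

if __name__ == "__main__":
    print(N_TERM)
    print(double_mutant)
```

Both substitutions of the double mutant fall within the annotated nuclear export signal (16 – 22), while position 27 lies in the nuclear localization signal (26 – 32).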

Secondary structure

[Graphical view: helix, strand and turn positions along the 500-residue sequence]

Sequences

Isoform Gag polyprotein [UniParc].

Last modified January 23, 2007. Version 3.
Checksum: B74C3858C20EF82C
Length: 500 AA. Mass: 55,930 Da.

        10         20         30         40         50         60 
MGARASVLSG GELDRWEKIR LRPGGKKKYK LKHIVWASRE LERFAVNPGL LETSEGCRQI 

        70         80         90        100        110        120 
LGQLQPSLQT GSEELRSLYN TVATLYCVHQ RIEIKDTKEA LDKIEEEQNK SKKKAQQAAA 

       130        140        150        160        170        180 
DTGHSNQVSQ NYPIVQNIQG QMVHQAISPR TLNAWVKVVE EKAFSPEVIP MFSALSEGAT 

       190        200        210        220        230        240 
PQDLNTMLNT VGGHQAAMQM LKETINEEAA EWDRVHPVHA GPIAPGQMRE PRGSDIAGTT 

       250        260        270        280        290        300 
STLQEQIGWM TNNPPIPVGE IYKRWIILGL NKIVRMYSPT SILDIRQGPK EPFRDYVDRF 

       310        320        330        340        350        360 
YKTLRAEQAS QEVKNWMTET LLVQNANPDC KTILKALGPA ATLEEMMTAC QGVGGPGHKA 

       370        380        390        400        410        420 
RVLAEAMSQV TNSATIMMQR GNFRNQRKIV KCFNCGKEGH TARNCRAPRK KGCWKCGKEG 

       430        440        450        460        470        480 
HQMKDCTERQ ANFLGKIWPS YKGRPGNFLQ SRPEPTAPPE ESFRSGVETT TPPQKQEPID 

       490        500 
KELYPLTSLR SLFGNDPSSQ 

Isoform Gag-Pol polyprotein [UniParc].

See P04585.

References

[1] "Complete nucleotide sequences of functional clones of the AIDS virus."
    Ratner L., Fisher A., Jagodzinski L.L., Mitsuya H., Liou R.-S., Gallo R.C., Wong-Staal F.
    AIDS Res. Hum. Retroviruses 3:57-69 (1987)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC RNA].
[2] "A novel nuclear export activity in HIV-1 matrix protein required for viral replication."
    Dupont S., Sharova N., DeHoratius C., Virbasius C.M., Zhu X., Bukrinskaya A.G., Stevenson M., Green M.R.
    Nature 402:681-685 (1999)
    Cited for: SUBCELLULAR LOCATION OF POLYPROTEIN, MUTAGENESIS OF LYS-18; ARG-22 AND LYS-27.
[3] "Maintenance of the Gag/Gag-Pol ratio is important for human immunodeficiency virus type 1 RNA dimerization and viral infectivity."
    Shehu-Xhilaga M., Crowe S.M., Mak J.
    J. Virol. 75:1834-1841 (2001)
    Cited for: GAG/GAG-POL RATIO.
[4] "Role of HIV-1 Gag domains in viral assembly."
    Scarlata S., Carter C.
    Biochim. Biophys. Acta 1614:62-72 (2003)
    Cited for: REVIEW.
[5] "Human immunodeficiency virus type 1 matrix protein assembles on membranes as a hexamer."
    Alfadhli A., Huseby D., Kapit E., Colman D., Barklis E.
    J. Virol. 81:1472-1478 (2007)
    Cited for: SUBUNIT.
[6] "The interferon response inhibits HIV particle production by induction of TRIM22."
    Barr S.D., Smiley J.R., Bushman F.D.
    PLoS Pathog. 4:E1000007-E1000007 (2008)
    Cited for: INTERACTION WITH HUMAN TRIM22.

Cross-references

Sequence databases

EMBL / GenBank / DDBJ: K03455 Genomic RNA. Translation: AAB50258.1.
RefSeq: NP_057850.1. NC_001802.1.

3D structure databases

PDBe / RCSB PDB / PDBj
Entry  Method  Resolution (Å)  Chain  Positions
1TSQ   X-ray   2.00            P      429-437
1TSU   X-ray   2.10            P      429-436
ProteinModelPortal: P04591.
SMR: P04591. Positions 1-432, 449-500.

Protein-protein interaction databases

IntAct: P04591. 39 interactions.

Enzyme and pathway databases

Reactome: REACT_116125. Disease.

Family and domain databases

Gene3D
   1.10.1200.30. 1 hit.
   1.10.150.90. 1 hit.
   1.10.375.10. 1 hit.
   4.10.60.10. 1 hit.
InterPro
   IPR000721. Gag_p24.
   IPR014817. Gag_p6.
   IPR000071. Lentvrl_matrix_N.
   IPR012344. Matrix_N_HIV/RSV.
   IPR008916. Retrov_capsid_C.
   IPR008919. Retrov_capsid_N.
   IPR010999. Retrovr_matrix_N.
   IPR001878. Znf_CCHC.
Pfam
   PF00540. Gag_p17. 1 hit.
   PF00607. Gag_p24. 1 hit.
   PF08705. Gag_p6. 1 hit.
   PF00098. zf-CCHC. 2 hits.
PRINTS
   PR00234. HIV1MATRIX.
SMART
   SM00343. ZnF_C2HC. 2 hits.
SUPFAM
   SSF47353. Retrov_capsid_C. 1 hit.
   SSF47943. Retrov_capsid_N. 1 hit.
   SSF47836. Retrovir_matrix. 1 hit.
   SSF57756. SSF57756. 1 hit.
PROSITE
   PS50158. ZF_CCHC. 2 hits.

Other

EvolutionaryTrace: P04591.

Entry information

Entry name: GAG_HV1H2
Accession: Primary (citable) accession number: P04591
Entry history
   Integrated into UniProtKB/Swiss-Prot: August 13, 1987
   Last sequence update: January 23, 2007
   Last modified: May 1, 2013
   This is version 130 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Viral Protein Annotation Program

Relevant documents

PDB cross-references

Index of Protein Data Bank (PDB) cross-references

SIMILARITY comments

Index of protein domains and families