P14077 (GAG_HTL1M) Reviewed, UniProtKB/Swiss-Prot

Last modified February 19, 2014. Version 99.

Names and origin

Protein names
Recommended name: Gag polyprotein
Alternative name(s): Pr53Gag

Cleaved into the following 3 chains:

  1. Matrix protein p19
    Short name=MA
  2. Capsid protein p24
    Short name=CA
  3. Nucleocapsid protein p15-gag
    Short name=NC-gag
Gene names
Name: gag
Organism: Human T-cell leukemia virus 1 (strain Japan MT-2 subtype A) (HTLV-1)
Taxonomic identifier: 11928 [NCBI]
Taxonomic lineage: Viruses > Retro-transcribing viruses > Retroviridae > Orthoretrovirinae > Deltaretrovirus
Virus host: Homo sapiens (Human) [TaxID: 9606]

Protein attributes

Sequence length: 429 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level

General annotation (Comments)

Function

Matrix protein p19 targets Gag, Gag-Pro and Gag-Pro-Pol polyproteins to the plasma membrane via a multipartite membrane-binding signal that includes its myristoylated N-terminus. Also mediates nuclear localization of the preintegration complex (By similarity).

Capsid protein p24 forms the conical core of the virus that encapsulates the genomic RNA-nucleocapsid complex (By similarity).

Nucleocapsid protein p15 is involved in the packaging and encapsidation of two copies of the genome (By similarity).

Subunit structure

Interacts with human TSG101 and NEDD4. These interactions are essential for budding and release of viral particles (By similarity).

Subcellular location

Matrix protein p19: Virion (Potential).

Capsid protein p24: Virion (Potential).

Nucleocapsid protein p15-gag: Virion (Potential).

Domain

Late-budding domains (L domains) are short sequence motifs essential for viral particle release. They can occur individually or in close proximity within structural proteins. They interact with sorting cellular proteins of the multivesicular body (MVB) pathway. Most of these proteins are class E vacuolar protein sorting factors belonging to ESCRT-I, ESCRT-II or ESCRT-III complexes. Matrix protein p19 contains two L domains: a PTAP/PSAP motif, which interacts with the UEV domain of TSG101, and a PPXY motif, which binds to the WW domains of HECT (homologous to E6-AP C-terminus) E3 ubiquitin ligases such as NEDD4 (By similarity).

The capsid protein N-terminus seems to be involved in Gag-Gag interactions (By similarity).

Post-translational modification

Specific enzymatic cleavages by the viral protease yield mature proteins. The polyprotein is cleaved during and after budding; this process is termed maturation (By similarity).

Phosphorylation of the matrix protein p19 by MAPK1 seems to play a role in budding (By similarity).

Miscellaneous

HTLV-1 lineages are divided into four clades: A (Cosmopolitan), B (Central African group), C (Melanesian group) and D (New Central African group).

Sequence similarities

Contains 2 CCHC-type zinc fingers.

Ontologies

Keywords
   Biological process: Host-virus interaction
   Cellular component: Capsid protein, Virion
   Coding sequence diversity: Ribosomal frameshifting
   Domain: Repeat, Zinc-finger
   Ligand: Metal-binding, Viral nucleoprotein, Zinc
   PTM: Lipoprotein, Myristate, Phosphoprotein
   Technical term: 3D-structure
Gene Ontology (GO)
   Biological process: viral process
      Inferred from electronic annotation. Source: UniProtKB-KW
   Cellular component: viral nucleocapsid
      Inferred from electronic annotation. Source: UniProtKB-KW
   Molecular function: nucleic acid binding
      Inferred from electronic annotation. Source: InterPro
   Molecular function: structural molecule activity
      Inferred from electronic annotation. Source: InterPro
   Molecular function: zinc ion binding
      Inferred from electronic annotation. Source: InterPro

Alternative products

This entry describes 3 isoforms produced by ribosomal frameshifting.

Note: This strategy of translation probably allows the virus to modulate the quantity of each viral protein.
Isoform Gag polyprotein (identifier: P14077-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Note: Produced by conventional translation.
Isoform Gag-Pro polyprotein (identifier: P14077-2)

The sequence of this isoform is not available.
Note: Produced by -1 ribosomal frameshifting at the gag-pro genes boundary.
Isoform Gag-Pol polyprotein (identifier: P14077-3)

The sequence of this isoform is not available.
Note: Produced by -1 ribosomal frameshifting at the gag-pol genes boundary.
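
The -1 frameshifting noted for the Gag-Pro and Gag-Pol isoforms can be illustrated with a toy reading-frame sketch. The RNA string and the slip position below are hypothetical and are not taken from the HTLV-1 genome; the function names are likewise made up for illustration. The sketch only shows how stepping back one nucleotide changes every downstream codon, so that a stop codon in the original frame is no longer read.

```python
# Toy illustration of -1 ribosomal frameshifting.
# RNA and SLIP_AT are hypothetical, chosen only to make the effect visible.
RNA = "AUGGCCUUUAAACGGUGA"
SLIP_AT = 12  # number of nucleotides read before the ribosome slips back by one


def codons(rna, start=0):
    """Split rna into complete triplets, reading from index `start`."""
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]


def read_with_minus1_shift(rna, slip_at):
    """Read in frame 0 up to `slip_at`, then continue one nucleotide upstream."""
    before = codons(rna[:slip_at])
    after = codons(rna, slip_at - 1)
    return before + after


print("conventional:", codons(RNA))
print("-1 shifted:  ", read_with_minus1_shift(RNA, SLIP_AT))
```

In the conventional reading the last codon of the toy string is the stop codon UGA; after the -1 shift that triplet is never read in frame, which is the general way a frameshift lets translation continue into a downstream gene.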

Sequence annotation (Features)

Feature key            Position(s)   Length   Description                                      Feature identifier

Molecule processing

Initiator methionine   1             1        Removed; by host
Chain                  2 – 429       428      Gag polyprotein                                  PRO_0000259773
Chain                  2 – 130       129      Matrix protein p19 (By similarity)               PRO_0000038817
Chain                  131 – 344     214      Capsid protein p24 (By similarity)               PRO_0000038818
Chain                  345 – 429     85       Nucleocapsid protein p15-gag (By similarity)     PRO_0000038819

Regions

Zinc finger            355 – 372     18       CCHC-type 1
Zinc finger            378 – 395     18       CCHC-type 2
Motif                  118 – 121     4        PPXY motif
Motif                  124 – 127     4        PTAP/PSAP motif
Compositional bias     95 – 144      50       Pro-rich
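
The motif and zinc-finger annotations can be cross-checked against the canonical sequence with regular expressions. This is a sketch under two assumptions that do not come from the entry itself: the zinc fingers are matched with the commonly used C-X2-C-X4-H-X4-C spacing, and the helper name `region` is illustrative.

```python
import re

# Canonical sequence copied from this entry; whitespace is stripped below.
SEQ = "".join("""
MGQIFSRSAS PIPRPPRGLA AHHWLNFLQA AYRLEPGPSS YDFHQLKKFL KIALETPVWI
CPINYSLLAS LLPKGYPGRV NEILHILIQT QAQIPSRPAP PPPSSPTHDP PDSDPQIPPP
YVEPTAPQVL PVMHPHGAPP NHRPWQMKDL QAIKQEVSQA APGSPQFMQT IRLAVQQFDP
TAKDLQDLLQ YLCSSLVASL HHQQLDSLIS EAETRGITSY NPLAGPLRVQ ANNPQQQGLR
REYQQLWLAA FAALPGSAKD PSWASILQGL EEPYHAFVER LNIALDNGLP EGTPKDPILR
SLAYSNANKE CQKLLQARGH TNSPLGDMLR ACQTWTPKDK TKVLVVQPKK PPPNQPCFRC
GKAGHWSRDC TQPRPPPGPC PLCQDPTHWK RDCPRLKPTI PEPEPEEDAL LLDLPADIPH
PKNSIGGEV
""".split())


def region(seq, start, end):
    """Return residues start..end using 1-based, inclusive coordinates."""
    return seq[start - 1:end]


# L-domain motifs at the annotated positions.
ppxy = region(SEQ, 118, 121)
ptap = region(SEQ, 124, 127)
print("PPXY region:", ppxy, bool(re.fullmatch(r"PP.Y", ppxy)))
print("PTAP/PSAP region:", ptap, bool(re.fullmatch(r"P[TS]AP", ptap)))

# Assumed CCHC spacing: C-X2-C-X4-H-X4-C (not stated in the entry).
CCHC = re.compile(r"C.{2}C.{4}H.{4}C")
fingers = [(m.start() + 1, m.end()) for m in CCHC.finditer(SEQ)]  # 1-based, inclusive
print("CCHC cores:", fingers)
```

Each CCHC core found this way should fall inside one of the two annotated zinc-finger ranges, since the annotated ranges include a few flanking residues.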

Sites

Site                   130 – 131     2        Cleavage; by viral protease (By similarity)
Site                   344 – 345     2        Cleavage; by viral protease (By similarity)

Amino acid modifications

Modified residue       105           1        Phosphoserine; by host MAPK1 (By similarity)
Lipidation             2             1        N-myristoyl glycine; by host (Ref.2)

Secondary structure

[Graphical view of helix, strand and turn assignments along residues 1 to 429; not reproduced here.]

Sequences

Isoform Gag polyprotein.

Last modified January 23, 2007. Version 3.
Checksum: EF5201C934EF0291

Length: 429 AA. Mass: 47,585 Da.
        10         20         30         40         50         60 
MGQIFSRSAS PIPRPPRGLA AHHWLNFLQA AYRLEPGPSS YDFHQLKKFL KIALETPVWI 

        70         80         90        100        110        120 
CPINYSLLAS LLPKGYPGRV NEILHILIQT QAQIPSRPAP PPPSSPTHDP PDSDPQIPPP 

       130        140        150        160        170        180 
YVEPTAPQVL PVMHPHGAPP NHRPWQMKDL QAIKQEVSQA APGSPQFMQT IRLAVQQFDP 

       190        200        210        220        230        240 
TAKDLQDLLQ YLCSSLVASL HHQQLDSLIS EAETRGITSY NPLAGPLRVQ ANNPQQQGLR 

       250        260        270        280        290        300 
REYQQLWLAA FAALPGSAKD PSWASILQGL EEPYHAFVER LNIALDNGLP EGTPKDPILR 

       310        320        330        340        350        360 
SLAYSNANKE CQKLLQARGH TNSPLGDMLR ACQTWTPKDK TKVLVVQPKK PPPNQPCFRC 

       370        380        390        400        410        420 
GKAGHWSRDC TQPRPPPGPC PLCQDPTHWK RDCPRLKPTI PEPEPEEDAL LLDLPADIPH 


PKNSIGGEV 
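
The displayed block mixes position rulers with residues, so a parser only needs to keep the letters. The sketch below does that and then estimates the average molecular mass. The residue mass table holds standard average values supplied here as an assumption; it is not part of the entry, and the estimate is only expected to land near the listed 47,585 Da, not to reproduce it exactly.

```python
# Sequence block as displayed in this entry, including the position rulers.
RAW = """
        10         20         30         40         50         60
MGQIFSRSAS PIPRPPRGLA AHHWLNFLQA AYRLEPGPSS YDFHQLKKFL KIALETPVWI
        70         80         90        100        110        120
CPINYSLLAS LLPKGYPGRV NEILHILIQT QAQIPSRPAP PPPSSPTHDP PDSDPQIPPP
       130        140        150        160        170        180
YVEPTAPQVL PVMHPHGAPP NHRPWQMKDL QAIKQEVSQA APGSPQFMQT IRLAVQQFDP
       190        200        210        220        230        240
TAKDLQDLLQ YLCSSLVASL HHQQLDSLIS EAETRGITSY NPLAGPLRVQ ANNPQQQGLR
       250        260        270        280        290        300
REYQQLWLAA FAALPGSAKD PSWASILQGL EEPYHAFVER LNIALDNGLP EGTPKDPILR
       310        320        330        340        350        360
SLAYSNANKE CQKLLQARGH TNSPLGDMLR ACQTWTPKDK TKVLVVQPKK PPPNQPCFRC
       370        380        390        400        410        420
GKAGHWSRDC TQPRPPPGPC PLCQDPTHWK RDCPRLKPTI PEPEPEEDAL LLDLPADIPH
PKNSIGGEV
"""

# Keep only letters: this drops ruler digits, spaces and newlines.
sequence = "".join(ch for ch in RAW if ch.isalpha())

# Assumed standard average residue masses in Da (not taken from the entry).
AVG_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini

mass = sum(AVG_MASS[aa] for aa in sequence) + WATER
print(f"length: {len(sequence)} aa")
print(f"estimated average mass: {mass:,.0f} Da")
```

The estimate covers the full 429-residue precursor including the initiator methionine, without myristoylation or phosphorylation.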

Isoform Gag-Pro polyprotein (Sequence not available).
Isoform Gag-Pol polyprotein (Sequence not available).

References

[1] "Nucleotide sequence of the core (gag) gene from HTLV-1 isolate MT-2."
    Gray G.S., Bartman T., White M.
    Nucleic Acids Res. 17:7998-7998(1989)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC RNA].
[2] "Antibodies to an NH2-terminal myristoyl glycine moiety can detect NH2-terminal myristoylated proteins in the retrovirus-infected cells."
    Shoji S., Tashiro A., Furuishi K., Takenaka O., Kida Y., Horiuchi S., Funakoshi T., Kubota Y.
    Biochem. Biophys. Res. Commun. 162:724-732(1989)
    Cited for: MYRISTOYLATION AT GLY-2.
[3] "Structural analysis of the N-terminal domain of the human T-cell leukemia virus capsid protein."
    Cornilescu C.C., Bouamr F., Yao X., Carter C., Tjandra N.
    J. Mol. Biol. 306:783-797(2001)
    Cited for: STRUCTURE BY NMR OF 131-264.

Cross-references

Sequence databases

EMBL / GenBank / DDBJ: X15951 Genomic RNA. Translation: CAA34075.1.
PIR: S06073.

3D structure databases

PDBe / RCSB PDB / PDBj:
Entry   Method   Resolution (Å)   Chain   Positions
1G03    NMR      -                A       131-264
ProteinModelPortal: P14077.
SMR: P14077. Positions 1-344.

Family and domain databases

Gene3D: 1.10.1200.30. 1 hit.
        1.10.185.10. 1 hit.
        1.10.375.10. 1 hit.
        4.10.60.10. 1 hit.
InterPro: IPR003139. D_retro_matrix_N.
          IPR000721. Gag_p24.
          IPR008916. Retrov_capsid_C.
          IPR008919. Retrov_capsid_N.
          IPR010999. Retrovr_matrix_N.
          IPR001878. Znf_CCHC.
Pfam: PF02228. Gag_p19. 1 hit.
      PF00607. Gag_p24. 1 hit.
      PF00098. zf-CCHC. 1 hit.
SMART: SM00343. ZnF_C2HC. 2 hits.
SUPFAM: SSF47353. SSF47353. 1 hit.
        SSF47836. SSF47836. 1 hit.
        SSF47943. SSF47943. 1 hit.
        SSF57756. SSF57756. 1 hit.
PROSITE: PS50158. ZF_CCHC. 1 hit.

Other

EvolutionaryTrace: P14077.

Entry information

Entry name: GAG_HTL1M
Accession: Primary (citable) accession number: P14077
Entry history:
Integrated into UniProtKB/Swiss-Prot: January 1, 1990
Last sequence update: January 23, 2007
Last modified: February 19, 2014
This is version 99 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Viral Protein Annotation Program

Relevant documents

SIMILARITY comments

Index of protein domains and families

PDB cross-references

Index of Protein Data Bank (PDB) cross-references