ID   GAG_HTL1M               Reviewed;         429 AA.
AC   P14077;
DT   01-JAN-1990, integrated into UniProtKB/Swiss-Prot.
DT   23-JAN-2007, sequence version 3.
DT   24-JAN-2024, entry version 125.
DE   RecName: Full=Gag polyprotein;
DE   AltName: Full=Pr53Gag;
DE   Contains:
DE     RecName: Full=Matrix protein p19;
DE              Short=MA;
DE   Contains:
DE     RecName: Full=Capsid protein p24;
DE              Short=CA;
DE   Contains:
DE     RecName: Full=Nucleocapsid protein p15-gag;
DE              Short=NC-gag;
GN   Name=gag;
OS   Human T-cell leukemia virus 1 (strain Japan MT-2 subtype A) (HTLV-1).
OC   Viruses; Riboviria; Pararnavirae; Artverviricota; Revtraviricetes;
OC   Ortervirales; Retroviridae; Orthoretrovirinae; Deltaretrovirus;
OC   Primate T-lymphotropic virus 1.
OX   NCBI_TaxID=11928;
OH   NCBI_TaxID=9606; Homo sapiens (Human).
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA].
RX   PubMed=2678008; DOI=10.1093/nar/17.19.7998;
RA   Gray G.S., Bartman T., White M.;
RT   "Nucleotide sequence of the core (gag) gene from HTLV-1 isolate MT-2.";
RL   Nucleic Acids Res. 17:7998-7998(1989).
RN   [2]
RP   MYRISTOYLATION AT GLY-2.
RX   PubMed=2547372; DOI=10.1016/0006-291x(89)92370-x;
RA   Shoji S., Tashiro A., Furuishi K., Takenaka O., Kida Y., Horiuchi S.,
RA   Funakoshi T., Kubota Y.;
RT   "Antibodies to an NH2-terminal myristoyl glycine moiety can detect NH2-
RT   terminal myristoylated proteins in the retrovirus-infected cells.";
RL   Biochem. Biophys. Res. Commun. 162:724-732(1989).
RN   [3]
RP   STRUCTURE BY NMR OF 131-264.
RX   PubMed=11243788; DOI=10.1006/jmbi.2000.4395;
RA   Cornilescu C.C., Bouamr F., Yao X., Carter C., Tjandra N.;
RT   "Structural analysis of the N-terminal domain of the human T-cell leukemia
RT   virus capsid protein.";
RL   J. Mol. Biol. 306:783-797(2001).
CC   -!- FUNCTION: [Gag polyprotein]: The matrix domain targets Gag, Gag-Pro and
CC       Gag-Pro-Pol polyproteins to the plasma membrane via a multipartite
CC       membrane binding signal, that includes its myristoylated N-terminus.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- FUNCTION: [Matrix protein p19]: Matrix protein.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- FUNCTION: [Capsid protein p24]: Forms the spherical core of the virus
CC       that encapsulates the genomic RNA-nucleocapsid complex.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- FUNCTION: [Nucleocapsid protein p15-gag]: Binds strongly to viral
CC       nucleic acids and promotes their aggregation. Also destabilizes the
CC       nucleic acid duplexes via highly structured zinc-binding motifs.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- SUBUNIT: [Gag polyprotein]: Homodimer; the homodimers are part of the
CC       immature particles. Interacts with human TSG101 and NEDD4; these
CC       interactions are essential for budding and release of viral particles.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- SUBUNIT: [Matrix protein p19]: Homodimer; further assembles as
CC       homohexamers. {ECO:0000250|UniProtKB:P03345}.
CC   -!- SUBCELLULAR LOCATION: [Matrix protein p19]: Virion
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- SUBCELLULAR LOCATION: [Capsid protein p24]: Virion
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- SUBCELLULAR LOCATION: [Nucleocapsid protein p15-gag]: Virion
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- ALTERNATIVE PRODUCTS:
CC       Event=Ribosomal frameshifting; Named isoforms=3;
CC         Comment=This strategy of translation probably allows the virus to
CC         modulate the quantity of each viral protein. {ECO:0000305};
CC       Name=Gag polyprotein;
CC         IsoId=P14077-1; Sequence=Displayed;
CC       Name=Gag-Pro polyprotein;
CC         IsoId=P14077-2; Sequence=Not described;
CC       Name=Gag-Pol polyprotein;
CC         IsoId=P14077-3; Sequence=Not described;
CC   -!- DOMAIN: [Gag polyprotein]: Late-budding domains (L domains) are short
CC       sequence motifs essential for viral particle release. They can occur
CC       individually or in close proximity within structural proteins. They
CC       interact with cellular sorting proteins of the multivesicular body
CC       (MVB) pathway. Most of these proteins are class E vacuolar protein
CC       sorting factors belonging to ESCRT-I, ESCRT-II or ESCRT-III complexes.
CC       Matrix protein p19 contains two L domains: a PTAP/PSAP motif which
CC       interacts with the UEV domain of TSG101, and a PPXY motif which binds
CC       to the WW domains of the ubiquitin ligase NEDD4.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- DOMAIN: [Capsid protein p24]: The capsid protein N-terminus seems to be
CC       involved in Gag-Gag interactions. {ECO:0000250|UniProtKB:P03345}.
CC   -!- DOMAIN: [Nucleocapsid protein p15-gag]: The C-terminus is acidic.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- PTM: [Gag polyprotein]: Specific enzymatic cleavages by the viral
CC       protease yield mature proteins. The polyprotein is cleaved during and
CC       after budding; this process is termed maturation.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- PTM: [Matrix protein p19]: Phosphorylation of the matrix protein p19 by
CC       MAPK1 seems to play a role in budding. {ECO:0000250|UniProtKB:P03345}.
CC   -!- PTM: [Gag polyprotein]: Ubiquitinated by host NEDD4.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- PTM: [Gag polyprotein]: Myristoylated (PubMed:2547372). Myristoylation
CC       of the matrix (MA) domain mediates the transport and binding of Gag
CC       polyproteins to the host plasma membrane and is required for the
CC       assembly of viral particles. {ECO:0000250|UniProtKB:P03345,
CC       ECO:0000269|PubMed:2547372}.
CC   -!- MISCELLANEOUS: HTLV-1 lineages are divided into four clades, A
CC       (Cosmopolitan), B (Central African group), C (Melanesian group) and D
CC       (New Central African group). {ECO:0000305}.
CC   -!- MISCELLANEOUS: [Isoform Gag polyprotein]: Produced by conventional
CC       translation. {ECO:0000250|UniProtKB:P03345}.
CC   -!- MISCELLANEOUS: [Isoform Gag-Pro polyprotein]: Produced by -1 ribosomal
CC       frameshifting at the gag-pro genes boundary.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- MISCELLANEOUS: [Isoform Gag-Pol polyprotein]: Produced by -1 ribosomal
CC       frameshifting at the gag-pol genes boundary.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; X15951; CAA34075.1; -; Genomic_RNA.
DR   PIR; S06073; S06073.
DR   PDB; 1G03; NMR; -; A=131-264.
DR   PDBsum; 1G03; -.
DR   SMR; P14077; -.
DR   iPTMnet; P14077; -.
DR   EvolutionaryTrace; P14077; -.
DR   GO; GO:0019013; C:viral nucleocapsid; IEA:UniProtKB-KW.
DR   GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR   GO; GO:0005198; F:structural molecule activity; IEA:InterPro.
DR   GO; GO:0008270; F:zinc ion binding; IEA:InterPro.
DR   GO; GO:0016032; P:viral process; IEA:InterPro.
DR   Gene3D; 1.10.1200.30; -; 1.
DR   Gene3D; 1.10.185.10; Delta-retroviral matrix; 1.
DR   Gene3D; 1.10.375.10; Human Immunodeficiency Virus Type 1 Capsid Protein; 1.
DR   Gene3D; 4.10.60.10; Zinc finger, CCHC-type; 1.
DR   InterPro; IPR003139; D_retro_matrix.
DR   InterPro; IPR045345; Gag_p24_C.
DR   InterPro; IPR008916; Retrov_capsid_C.
DR   InterPro; IPR008919; Retrov_capsid_N.
DR   InterPro; IPR010999; Retrovr_matrix.
DR   InterPro; IPR001878; Znf_CCHC.
DR   InterPro; IPR036875; Znf_CCHC_sf.
DR   PANTHER; PTHR40389; ENDOGENOUS RETROVIRUS GROUP K MEMBER 24 GAG POLYPROTEIN-RELATED; 1.
DR   PANTHER; PTHR40389:SF3; IGE-BINDING PROTEIN; 1.
DR   Pfam; PF02228; Gag_p19; 1.
DR   Pfam; PF00607; Gag_p24; 1.
DR   Pfam; PF19317; Gag_p24_C; 1.
DR   Pfam; PF00098; zf-CCHC; 1.
DR   SMART; SM00343; ZnF_C2HC; 2.
DR   SUPFAM; SSF47836; Retroviral matrix proteins; 1.
DR   SUPFAM; SSF47353; Retrovirus capsid dimerization domain-like; 1.
DR   SUPFAM; SSF47943; Retrovirus capsid protein, N-terminal core domain; 1.
DR   SUPFAM; SSF57756; Retrovirus zinc finger-like domains; 1.
DR   PROSITE; PS50158; ZF_CCHC; 1.
PE   1: Evidence at protein level;
KW   3D-structure; Capsid protein; Disulfide bond; Host-virus interaction;
KW   Lipoprotein; Metal-binding; Myristate; Phosphoprotein; Repeat;
KW   Ribosomal frameshifting; Ubl conjugation; Viral nucleoprotein; Virion;
KW   Zinc; Zinc-finger.
FT   INIT_MET        1
FT                   /note="Removed; by host"
FT                   /evidence="ECO:0000255"
FT   CHAIN           2..429
FT                   /note="Gag polyprotein"
FT                   /id="PRO_0000259773"
FT   CHAIN           2..130
FT                   /note="Matrix protein p19"
FT                   /id="PRO_0000038817"
FT   CHAIN           131..344
FT                   /note="Capsid protein p24"
FT                   /id="PRO_0000038818"
FT   CHAIN           345..429
FT                   /note="Nucleocapsid protein p15-gag"
FT                   /id="PRO_0000038819"
FT   ZN_FING         355..372
FT                   /note="CCHC-type 1"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00047"
FT   ZN_FING         378..395
FT                   /note="CCHC-type 2"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00047"
FT   REGION          93..143
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   MOTIF           118..121
FT                   /note="PPXY motif"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   MOTIF           124..127
FT                   /note="PTAP/PSAP motif"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   COMPBIAS        95..126
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   SITE            130..131
FT                   /note="Cleavage; by viral protease"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   SITE            344..345
FT                   /note="Cleavage; by viral protease"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   MOD_RES         105
FT                   /note="Phosphoserine; by host MAPK1"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   LIPID           2
FT                   /note="N-myristoyl glycine; by host"
FT                   /evidence="ECO:0000255, ECO:0000269|PubMed:2547372"
FT   DISULFID        61
FT                   /note="Interchain"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   STRAND          136..138
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           148..159
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   STRAND          160..163
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           166..177
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           182..190
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           195..216
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   STRAND          217..219
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           227..231
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           238..252
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   TURN            254..257
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   TURN            261..263
FT                   /evidence="ECO:0007829|PDB:1G03"
SQ   SEQUENCE   429 AA;  47585 MW;  EF5201C934EF0291 CRC64;
     MGQIFSRSAS PIPRPPRGLA AHHWLNFLQA AYRLEPGPSS YDFHQLKKFL KIALETPVWI
     CPINYSLLAS LLPKGYPGRV NEILHILIQT QAQIPSRPAP PPPSSPTHDP PDSDPQIPPP
     YVEPTAPQVL PVMHPHGAPP NHRPWQMKDL QAIKQEVSQA APGSPQFMQT IRLAVQQFDP
     TAKDLQDLLQ YLCSSLVASL HHQQLDSLIS EAETRGITSY NPLAGPLRVQ ANNPQQQGLR
     REYQQLWLAA FAALPGSAKD PSWASILQGL EEPYHAFVER LNIALDNGLP EGTPKDPILR
     SLAYSNANKE CQKLLQARGH TNSPLGDMLR ACQTWTPKDK TKVLVVQPKK PPPNQPCFRC
     GKAGHWSRDC TQPRPPPGPC PLCQDPTHWK RDCPRLKPTI PEPEPEEDAL LLDLPADIPH
     PKNSIGGEV
//