O92956 (POL_RSVSB) Reviewed, UniProtKB/Swiss-Prot

Last modified December 14, 2011. Version 77.

Names and origin

Protein names: Recommended name: Gag-Pro-Pol polyprotein
Gene names: Name: gag-pro-pol
Organism: Rous sarcoma virus (strain Schmidt-Ruppin B) (RSV-SRB)
Taxonomic identifier: 269447 [NCBI]
Taxonomic lineage: Viruses > Retro-transcribing viruses > Retroviridae > Orthoretrovirinae > Alpharetrovirus
Virus host: Gallus gallus (Chicken) [TaxID: 9031]

Protein attributes

Sequence length: 1603 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level.

General annotation (Comments)

Function

Capsid protein p27 forms the spherical core of the virus that encapsulates the genomic RNA-nucleocapsid complex (By similarity).

The aspartyl protease mediates proteolytic cleavages of Gag and Gag-Pol polyproteins during or shortly after the release of the virion from the plasma membrane. Cleavages take place as an ordered, step-wise cascade to yield mature proteins. This process is called maturation. Displays maximal activity during the budding process just prior to particle release from the cell (By similarity).

Catalytic activity

Deoxynucleoside triphosphate + DNA(n) = diphosphate + DNA(n+1).

Endonucleolytic cleavage to 5'-phosphomonoester.

Cofactor

Binds 2 magnesium ions for reverse transcriptase polymerase activity (By similarity).

Binds 2 magnesium ions for ribonuclease H (RNase H) activity. Substrate-binding is a precondition for magnesium binding (By similarity).

Binds 8 magnesium ions per integrase homotetramer (By similarity).

Subunit structure

The protease is active as a homodimer (By similarity). The integrase forms a homotetramer. Reverse transcriptase is a heterodimer of alpha and beta subunits. Three forms of RT exist: alpha-alpha (alpha-Pol), beta-beta (beta-Pol), and alpha-beta, with the major form being the heterodimer. Both the polymerase and RNase H active sites are located in the alpha subunit of heterodimeric RT alpha-beta.

Subcellular location

Matrix protein p19: Virion (Potential).

Capsid protein p27: Virion (Potential).

Nucleocapsid protein p12: Virion (Potential).

Domain

Late-budding domains (L domains) are short sequence motifs essential for viral particle release. They can occur individually or in close proximity within structural proteins. They interact with cellular sorting proteins of the multivesicular body (MVB) pathway. Most of these proteins are class E vacuolar protein sorting factors belonging to ESCRT-I, ESCRT-II or ESCRT-III complexes. P2B contains one L domain: a PPXY motif which probably binds to the WW domains of HECT (homologous to E6-AP C-terminus) E3 ubiquitin ligases (By similarity).

Integrase core domain contains the D-x(n)-D-x(35)-E motif, named for the phylogenetically conserved glutamic acid and aspartic acid residues and the invariant 35 amino acid spacing between the second and third acidic residues. Each acidic residue of the D,D(35)E motif is independently essential for the 3'-processing and strand transfer activities of purified integrase protein (By similarity).
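
The 35-residue spacing can be checked against the three integrase metal-binding positions listed for this entry (1344, 1401 and 1437). The following is a minimal sketch, assuming the string below is residues 1321 to 1440 copied from the displayed sequence and that the integrase chain starts at position 1281 as annotated; the names `SEGMENT`, `residue_at` and `integrase_numbering` are illustrative only.

```python
# Residues 1321-1440 of the canonical sequence, copied from this entry.
SEGMENT_START = 1321
SEGMENT = (
    "NSAPALEAGVNPRGLGPLQIWQTDFTLEPRMAPRSWLAVTVDTASSAIVVTQHGRVTSVA"
    "AQHHWATAIAVLGRPKAIKTDNGSCFTSKSTREWLARWGIAHTTGIPGNSQGQAMVERAN"
)
INTEGRASE_START = 1281  # first residue of the integrase chain (1281-1567)


def residue_at(position: int) -> str:
    """Return the one-letter residue at a 1-based polyprotein position."""
    return SEGMENT[position - SEGMENT_START]


def integrase_numbering(position: int) -> int:
    """Convert a polyprotein position to numbering within the integrase chain."""
    return position - INTEGRASE_START + 1


motif_positions = (1344, 1401, 1437)
motif = [residue_at(p) for p in motif_positions]

# Number of residues lying strictly between the second and third acidic residues.
spacing = motif_positions[2] - motif_positions[1] - 1

print(motif, spacing, [integrase_numbering(p) for p in motif_positions])
```

In integrase numbering, position 1344 corresponds to residue 64, which matches the "D64N mutant" named in the title of Ref.7.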

Post-translational modification

Specific enzymatic cleavages in vivo yield mature proteins.

Miscellaneous

The reverse transcriptase is an error-prone enzyme that lacks a proof-reading function. The high mutation rate is a direct consequence of this characteristic. RT also displays frequent template switching, leading to a high recombination rate. Recombination mostly occurs between homologous regions of the two copackaged RNA genomes. If these two RNA molecules derive from different viral strains, reverse transcription will give rise to highly recombinant proviral DNAs.

Sequence similarities

Contains 2 CCHC-type zinc fingers.

Contains 1 integrase catalytic domain.

Contains 1 integrase-type DNA-binding domain.

Contains 1 integrase-type zinc finger.

Contains 1 peptidase A2 domain.

Contains 1 reverse transcriptase domain.

Contains 1 RNase H domain.

Sequence caution

The sequence AAC08988.1 differs from that shown. Reason: Erroneous initiation.

Ontologies

Keywords
   Biological process: DNA integration; DNA recombination; Initiation of viral infection; Viral genome integration
   Cellular component: Virion
   Coding sequence diversity: Ribosomal frameshifting
   Domain: Zinc-finger
   Ligand: DNA-binding; Magnesium; Metal-binding; Zinc
   Molecular function: Endonuclease; Hydrolase; Nuclease; Nucleotidyltransferase; RNA-directed DNA polymerase; Transferase
   Technical term: 3D-structure; Multifunctional enzyme
Gene Ontology (GO)
   Biological process:
      DNA integration. Inferred from electronic annotation. Source: UniProtKB-KW
      DNA recombination. Inferred from electronic annotation. Source: UniProtKB-KW
      RNA-dependent DNA replication. Inferred from electronic annotation. Source: InterPro
      proteolysis. Inferred from electronic annotation. Source: InterPro
      viral reproduction. Inferred from electronic annotation. Source: InterPro
   Cellular component:
      viral capsid. Inferred from electronic annotation. Source: InterPro
   Molecular function:
      DNA binding. Inferred from electronic annotation. Source: UniProtKB-KW
      DNA-directed DNA polymerase activity. Inferred from electronic annotation. Source: EC
      RNA binding. Inferred from electronic annotation. Source: InterPro
      RNA-directed DNA polymerase activity. Inferred from electronic annotation. Source: UniProtKB-KW
      aspartic-type endopeptidase activity. Inferred from electronic annotation. Source: InterPro
      ribonuclease H activity. Inferred from electronic annotation. Source: EC
      structural molecule activity. Inferred from electronic annotation. Source: InterPro
      zinc ion binding. Inferred from electronic annotation. Source: InterPro

Alternative products

This entry describes 2 isoforms produced by ribosomal frameshifting.

Note: Conventional translation produces the Gag-Pro polyprotein. Ribosomal frameshifting at the boundary between the gag-pro and pol genes produces the Gag-Pro-Pol polyprotein.
Isoform Gag-Pro-Pol polyprotein (identifier: O92956-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Note: Produced by -1 ribosomal frameshifting.
Isoform Gag-Pro polyprotein (identifier: O92954-1)

The sequence of this isoform can be found in the external entry O92954.
Isoforms of the same protein are often annotated in two different entries if their sequences differ significantly.
Note: Produced by conventional translation.
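
To illustrate how a -1 frameshift lets translation read past a stop codon that ends the conventionally translated product, here is a toy sketch. The nucleotide sequence and slip position are invented for illustration and are not the actual RSV frameshift site, which this entry does not give; the function names are hypothetical. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, start: int = 0) -> str:
    """Translate from `start` until a stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_with_minus1_shift(seq: str, codons_before_slip: int) -> str:
    """Read `codons_before_slip` codons in frame 0, then slip back one nucleotide."""
    slip = 3 * codons_before_slip
    upstream = "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, slip, 3))
    return upstream + translate(seq, slip - 1)


toy = "ATGTTTAAACGGTGA"                       # invented sequence, not from RSV
conventional = translate(toy)                 # stops at the in-frame TGA
shifted = translate_with_minus1_shift(toy, 3) # slips after the third codon
print(conventional, shifted)
```

In this toy case the conventional product ends at the in-frame stop codon, while the shifted reading frame continues past it, which mirrors the relationship between the Gag-Pro and Gag-Pro-Pol isoforms.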

Sequence annotation (Features)

Feature key         Position(s)    Length  Description                                          Feature identifier

Molecule processing

Chain               1 – 155        155     Matrix protein p19 (By similarity)                   PRO_0000397058
Chain               156 – 166      11      p2A (By similarity)                                  PRO_0000397059
Chain               167 – 177      11      p2B (By similarity)                                  PRO_0000397060
Chain               178 – 239      62      p10 (By similarity)                                  PRO_0000397061
Chain               240 – 479      240     Capsid protein p27 (By similarity)                   PRO_0000397062
Chain               480 – 488      9       p3 (By similarity)                                   PRO_0000397063
Chain               489 – 577      89      Nucleocapsid protein p12 (By similarity)             PRO_0000397064
Chain               578 – 708      131     Protease p15 (By similarity)                         PRO_0000397065
Chain               709 – 1567     859     Reverse transcriptase beta-subunit (By similarity)   PRO_0000397066
Chain               709 – 1280     572     Reverse transcriptase alpha-subunit (By similarity)  PRO_0000040984
Chain               1281 – 1567    287     Integrase (By similarity)                            PRO_0000040985
Chain               1568 – 1603    36      p4 (By similarity)                                   PRO_0000397067
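
The chain coordinates can be cross-checked with a short script: each length should equal end minus start plus one, and the mature chains should tile the 1603-residue polyprotein without gaps. This is a minimal sketch using the coordinates listed in this entry; the names `CHAINS`, `NESTED`, `chain_length` and `tiles` are illustrative, and treating the alpha-subunit and integrase as nested inside the beta-subunit span is an assumption drawn from their coordinates.

```python
# (name, start, end) for chains that together span the polyprotein.
CHAINS = [
    ("Matrix protein p19", 1, 155),
    ("p2A", 156, 166),
    ("p2B", 167, 177),
    ("p10", 178, 239),
    ("Capsid protein p27", 240, 479),
    ("p3", 480, 488),
    ("Nucleocapsid protein p12", 489, 577),
    ("Protease p15", 578, 708),
    ("Reverse transcriptase beta-subunit", 709, 1567),
    ("p4", 1568, 1603),
]
# Chains whose coordinates fall inside the beta-subunit span.
NESTED = [
    ("Reverse transcriptase alpha-subunit", 709, 1280),
    ("Integrase", 1281, 1567),
]
TOTAL_LENGTH = 1603


def chain_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based residue range."""
    return end - start + 1


def tiles(chains, total: int) -> bool:
    """True if the chains cover positions 1..total with no gap or overlap."""
    expected_start = 1
    for _, start, end in chains:
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == total + 1


covered = sum(chain_length(s, e) for _, s, e in CHAINS)
nested_total = sum(chain_length(s, e) for _, s, e in NESTED)
print(tiles(CHAINS, TOTAL_LENGTH), covered, nested_total)
```

The boundaries between consecutive chains are the same residue pairs listed as protease p15 cleavage sites (for example 155 – 156 and 1280 – 1281).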

Regions

Domain              609 – 690      82      Peptidase A2
Domain              750 – 938      189     Reverse transcriptase
Domain              1163 – 1280    118     RNase H
Domain              1333 – 1496    164     Integrase catalytic
Zinc finger         533 – 550      18      CCHC-type 2
Zinc finger         1280 – 1321    42      Integrase-type
DNA binding         1502 – 1550    49      Integrase-type (By similarity)
Motif               172 – 175      4       PPXY motif (By similarity)
Compositional bias  171 – 174      4       Poly-Pro

Sites

Active site         614            1       For protease activity; shared with dimeric partner (By similarity)
Metal binding       815            1       Magnesium; catalytic; for reverse transcriptase activity (By similarity)
Metal binding       890            1       Magnesium; catalytic; for reverse transcriptase activity (By similarity)
Metal binding       891            1       Magnesium; catalytic; for reverse transcriptase activity (By similarity)
Metal binding       1158           1       Magnesium; catalytic; for RNase H activity (By similarity)
Metal binding       1192           1       Magnesium; catalytic; for RNase H activity (By similarity)
Metal binding       1213           1       Magnesium; catalytic; for RNase H activity (By similarity)
Metal binding       1272           1       Magnesium; catalytic; for RNase H activity (By similarity)
Metal binding       1344           1       Magnesium; catalytic; for integrase activity
Metal binding       1401           1       Magnesium; catalytic; for integrase activity
Metal binding       1437           1       Magnesium; catalytic; for integrase activity
Site                155 – 156      2       Cleavage; by viral protease p15 (By similarity)
Site                166 – 167      2       Cleavage; by viral protease p15 (By similarity)
Site                177 – 178      2       Cleavage; by viral protease p15 (By similarity)
Site                239 – 240      2       Cleavage; by viral protease p15 (By similarity)
Site                479 – 480      2       Cleavage; by viral protease p15 (By similarity)
Site                488 – 489      2       Cleavage; by viral protease p15 (By similarity)
Site                577 – 578      2       Cleavage; by viral protease p15 (By similarity)
Site                708 – 709      2       Cleavage; by viral protease p15 (By similarity)
Site                1280 – 1281    2       Cleavage; by viral protease p15 (By similarity)
Site                1567 – 1568    2       Cleavage; by viral protease p15 (By similarity)

Experimental info

Mutagenesis         1344           1       D → N: Complete loss of activity. (Ref.6)
Sequence conflict   1376           1       V → A (Ref.3)
Sequence conflict   1441           1       R → K (Ref.3)

Sequences

Isoform Gag-Pro-Pol polyprotein

Last modified August 10, 2010. Version 2.
Checksum: 3BEFAA397C54CDE7

Length: 1,603 AA. Mass: 173,990 Da.
        10         20         30         40         50         60 
MEAVIKVISS ACKTYCGKTS PSKKEIGAML SLLQKEGLLM SPSDLYSPGS WDPITAALSQ 

        70         80         90        100        110        120 
RAMVLGKSGE LKTWGLVLGA LKAAREEQVT SEQAKFWLGL GGGRVSPPGP ECIEKPATER 

       130        140        150        160        170        180 
RIDKGEEVGE TTVQRDAKMA PEETATPKTV GTSCYHCGTA IGCNCATASA PPPPYVGSGL 

       190        200        210        220        230        240 
YPSLAGVGEQ QGQGGDTPRG AEQPRAEPGH AGLAPGPALT DWARIREELA STGPPVVAMP 

       250        260        270        280        290        300 
VVIKTEGPAW TPLEPKLITR LADTVRTKGL RSPITMAEVE ALMSSPLLPH DVTNLMRVIL 

       310        320        330        340        350        360 
GPAPYALWMD AWGVQLQTVI AAATRDPRHP ANGQGRGERT NLDRLKGLAD GMVGNPQGQA 

       370        380        390        400        410        420 
ALLRPGELVA ITASALQAFR EVARLAEPAG PWADITQGPS ESFVDFANRL IKAVEGSDLP 

       430        440        450        460        470        480 
PSARAPVIID CFRQKSQPDI QQLIRAAPST LTTPGEIIKY VLDRQKIAPL TDQGIAAAMS 

       490        500        510        520        530        540 
SAIQPLVMAV VNRERDGQTG SGGRARRLCY TCGSPGHYQA QCPKKRKSGN SRERCQLCDG 

       550        560        570        580        590        600 
MGHNAKQCRR RDSNQGQRPG RGLSSGPWPV SEQPAVSLAM TMEHKDRPLV RVILTNTGSH 

       610        620        630        640        650        660 
PVKQRSVYIT ALLDSGADIT IISEEDWPTD WPVVDTANPQ IHGIGGGIPM RKSRDMIELG 

       670        680        690        700        710        720 
VINRDGSLER PLLLFPAVAM VRGSILGRDC LQGLGLRLTN LVGRATVLTV ALHLAIPLKW 

       730        740        750        760        770        780 
KPDHTPVWID QWPLPEGKLV ALTQLVEKEL QLGHIEPSLS CWNTPVFVIR KASGSYRLLH 

       790        800        810        820        830        840 
DLRAVNAKLV PFGAVQQGAP VLSALPRGWP LMVLDLKDCF FSIPLAEQDR EAFAFTLPSV 

       850        860        870        880        890        900 
NNQAPARRFQ WKVLPQGMTC SPTICQLVVG QVLEPLRLKH PSLRMLHYMD DLLLAASSHD 

       910        920        930        940        950        960 
GLEAAGEEVI NTLERAGFTI SPDKIQREPG VQYLGYKLGS TYVAPVGLVA EPRIATLWDV 

       970        980        990       1000       1010       1020 
QKLVGSLQWL RPALGIPPRL MGPFYEQLRG SDPNEAREWN LDMKMAWREI VQLSTTAALE 

      1030       1040       1050       1060       1070       1080 
RWDPALPLEG AVVRCEQGAI GVLGQGLSTH PRPCLWLFST QPTKAFTAWL EVLTLLITKL 

      1090       1100       1110       1120       1130       1140 
RASAVRTFGK EVDILLLPAC FREDLPLPEG ILLALRGFAG KIRSSDTPSI FDIARPLHVS 

      1150       1160       1170       1180       1190       1200 
LKVRVTDHPV PGPTVFTDAS SSTHKGVVVW REGPRWEIKE IADLGASVQQ LEARAVAMAL 

      1210       1220       1230       1240       1250       1260 
LLWPTTPTNV VTDSAFVAKM LLKMGQEGVP STAAAFILED ALSQRSAMAA VLHVRSHSEV 

      1270       1280       1290       1300       1310       1320 
PGFFTEGNDV ADSQATFQAY PLREAKDLHT TLHIGPRALS KACNISMQQA REVVQTCPHC 

      1330       1340       1350       1360       1370       1380 
NSAPALEAGV NPRGLGPLQI WQTDFTLEPR MAPRSWLAVT VDTASSAIVV TQHGRVTSVA 

      1390       1400       1410       1420       1430       1440 
AQHHWATAIA VLGRPKAIKT DNGSCFTSKS TREWLARWGI AHTTGIPGNS QGQAMVERAN 

      1450       1460       1470       1480       1490       1500 
RLLKDKIRVL AEGDGFMKRI PASKQGELLA KAMYALNHFE RGENTKTPVQ KHWRPTVLTE 

      1510       1520       1530       1540       1550       1560 
GPPVKIRIET GEWEKGWNVL VWGRGYAAVK NRDTDKVIWV PSRKVKPDIT QKDEVTKKDE 

      1570       1580       1590       1600 
ASPLFAGSSD WIPWGDEQEG LQEEAASNKQ EGPGEDTLAA NES 
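
The display above interleaves position rulers and ten-residue groups with the sequence itself. As a rough sketch of how to recover a contiguous sequence from this layout, the helper below drops the ruler lines and joins the residue groups. The function name `clean_sequence_block` is hypothetical, and the sample contains only the first two rows of the sequence.

```python
def clean_sequence_block(text: str) -> str:
    """Strip ruler lines and spacing from a grouped sequence display."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Ruler lines consist only of position numbers (10, 20, 30, ...).
        if tokens and all(token.isdigit() for token in tokens):
            continue
        residues.extend(tokens)
    return "".join(residues)


# First two rows of the displayed sequence, copied from this entry.
sample = """
        10         20         30         40         50         60
MEAVIKVISS ACKTYCGKTS PSKKEIGAML SLLQKEGLLM SPSDLYSPGS WDPITAALSQ

        70         80         90        100        110        120
RAMVLGKSGE LKTWGLVLGA LKAAREEQVT SEQAKFWLGL GGGRVSPPGP ECIEKPATER
"""

sequence = clean_sequence_block(sample)
print(len(sequence), sequence[:10])
```

Applied to the full display, the result should be 1603 residues long if the copy is intact, which gives a quick integrity check against the stated sequence length.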


Isoform Gag-Pro polyprotein [UniParc].

See O92954.

References

[1] "Complete nucleotide sequence of avian sarcoma virus."
Bouck J., Skalka A.M., Katz R.A.
Submitted (MAR-1998) to the EMBL/GenBank/DDBJ databases
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA].
[2] "High-resolution structure of the catalytic domain of avian sarcoma virus integrase."
Bujacz G., Jaskolski M., Alexandratos J., Wlodawer A., Merkel G., Katz R.A., Skalka A.M.
J. Mol. Biol. 253:333-346(1995) [PubMed: 7563093]
Cited for: X-RAY CRYSTALLOGRAPHY (1.7 ANGSTROMS) OF 1332-1487.
[3] "The catalytic domain of avian sarcoma virus integrase: conformation of the active-site residues in the presence of divalent cations."
Bujacz G., Jaskolski M., Alexandratos J., Wlodawer A., Merkel G., Katz R.A., Skalka A.M.
Structure 4:89-96(1996) [PubMed: 8805516]
Cited for: X-RAY CRYSTALLOGRAPHY (1.7 ANGSTROMS) OF 1334-1477.
[4] "Binding of different divalent cations to the active site of avian sarcoma virus integrase and their effects on enzymatic activity."
Bujacz G., Alexandratos J., Wlodawer A., Merkel G., Andrake M., Katz R.A., Skalka A.M.
J. Biol. Chem. 272:18161-18168(1997) [PubMed: 9218451]
Cited for: X-RAY CRYSTALLOGRAPHY (1.95 ANGSTROMS) OF 1334-1479, ACTIVE SITE.
[5] "Structure of the catalytic domain of avian sarcoma virus integrase with a bound HIV-1 integrase-targeted inhibitor."
Lubkowski J., Yang F., Alexandratos J., Wlodawer A., Zhao H., Burke T.R. Jr., Neamati N., Pommier Y., Merkel G., Skalka A.M.
Proc. Natl. Acad. Sci. U.S.A. 95:4831-4836(1998) [PubMed: 9560188]
Cited for: X-RAY CRYSTALLOGRAPHY (1.9 ANGSTROMS) OF 1332-1487 IN COMPLEX WITH A HIV-1 INTEGRASE-TARGETED INHIBITOR.
[6] "Structural basis for inactivating mutations and pH-dependent activity of avian sarcoma virus integrase."
Lubkowski J., Yang F., Alexandratos J., Merkel G., Katz R.A., Gravuer K., Skalka A.M., Wlodawer A.
J. Biol. Chem. 273:32685-32689(1998) [PubMed: 9830010]
Cited for: X-RAY CRYSTALLOGRAPHY (2.2 ANGSTROMS) OF 1334-1479, MUTAGENESIS OF ASP-1344.
[7] "Atomic resolution structures of the core domain of avian sarcoma virus integrase and its D64N mutant."
Lubkowski J., Dauter Z., Yang F., Alexandratos J., Merkel G., Skalka A.M., Wlodawer A.
Biochemistry 38:13512-13522(1999) [PubMed: 10521258]
Cited for: X-RAY CRYSTALLOGRAPHY (1.02 ANGSTROMS) OF 1333-1487.

Cross-references

Sequence databases

EMBL/GenBank/DDBJ: AF052428. Genomic DNA. Translation: AAC08988.1. Different initiation.

3D structure databases

PDBe / RCSB PDB / PDBj

Entry   Method   Resolution (Å)   Chain   Positions
1A5V    X-ray    1.90             A       1332-1487
1A5W    X-ray    2.00             A       1332-1487
1A5X    X-ray    1.90             A       1332-1487
1ASU    X-ray    1.70             A       1332-1487
1ASV    X-ray    2.20             A       1332-1487
1ASW    X-ray    1.80             A       1332-1487
1CXQ    X-ray    1.02             A       1333-1487
1CXU    X-ray    1.42             A       1333-1487
1CZ9    X-ray    1.20             A       1333-1487
1CZB    X-ray    1.06             A       1333-1487
1VSD    X-ray    1.70             A       1334-1479
1VSE    X-ray    2.20             A       1334-1479
1VSF    X-ray    2.05             A       1334-1479
1VSH    X-ray    1.95             A       1334-1479
1VSI    X-ray    2.20             A       1334-1479
1VSJ    X-ray    2.10             A       1334-1479
1VSK    X-ray    2.20             A       1334-1479
1VSL    X-ray    2.20             A       1334-1479
1VSM    X-ray    2.15             A       1334-1479
3O4N    X-ray    1.80             A/B     1333-1479
3O4Q    X-ray    1.55             A       1333-1479

ProteinModelPortal: O92956.
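
When choosing among these structures, the listing can be queried programmatically. The following is a small sketch that transcribes the entries above as tuples and picks the one with the best (numerically lowest) resolution; the tuple layout and variable names are illustrative choices, not a UniProt or PDB API.

```python
# (PDB entry, resolution in angstroms, first residue, last residue),
# transcribed from the 3D structure listing in this entry.
STRUCTURES = [
    ("1A5V", 1.90, 1332, 1487), ("1A5W", 2.00, 1332, 1487),
    ("1A5X", 1.90, 1332, 1487), ("1ASU", 1.70, 1332, 1487),
    ("1ASV", 2.20, 1332, 1487), ("1ASW", 1.80, 1332, 1487),
    ("1CXQ", 1.02, 1333, 1487), ("1CXU", 1.42, 1333, 1487),
    ("1CZ9", 1.20, 1333, 1487), ("1CZB", 1.06, 1333, 1487),
    ("1VSD", 1.70, 1334, 1479), ("1VSE", 2.20, 1334, 1479),
    ("1VSF", 2.05, 1334, 1479), ("1VSH", 1.95, 1334, 1479),
    ("1VSI", 2.20, 1334, 1479), ("1VSJ", 2.10, 1334, 1479),
    ("1VSK", 2.20, 1334, 1479), ("1VSL", 2.20, 1334, 1479),
    ("1VSM", 2.15, 1334, 1479), ("3O4N", 1.80, 1333, 1479),
    ("3O4Q", 1.55, 1333, 1479),
]

# Lower resolution values mean finer structural detail.
best = min(STRUCTURES, key=lambda entry: entry[1])

# Overall residue span covered by at least one structure.
span = (min(s[2] for s in STRUCTURES), max(s[3] for s in STRUCTURES))

print(best[0], best[1], span)
```

All listed structures fall within residues 1332 to 1487, that is, inside the integrase chain (1281 – 1567) and largely within its annotated catalytic domain.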

Family and domain databases

InterPro:
   IPR004028. Gag_M.
   IPR000721. Gag_p24.
   IPR001037. Integrase_C_retrovir.
   IPR001584. Integrase_cat-core.
   IPR017856. Integrase_Zn-bd_dom-like_N.
   IPR003308. Integrase_Zn-bd_dom_N.
   IPR012344. Matrix_N_HIV/RSV.
   IPR018061. Pept_A2A_retrovirus_sg.
   IPR001995. Peptidase_A2_cat.
   IPR021109. Peptidase_aspartic.
   IPR001969. Peptidase_aspartic_AS.
   IPR009007. Peptidase_aspartic_catalytic.
   IPR008916. Retrov_capsid_C.
   IPR008919. Retrov_capsid_N.
   IPR010999. Retrovr_matrix_N.
   IPR012337. RNaseH-like_dom.
   IPR002156. RNaseH_domain.
   IPR000477. RVT.
   IPR013084. Znf_CCH_retrovir.
   IPR001878. Znf_CCHC.
Gene3D:
   G3DSA:2.30.30.10. Integrase_C. 1 hit.
   G3DSA:1.10.10.200. Intgrase_N_Zn_bd. 1 hit.
   G3DSA:1.10.150.90. Matrix_HIV/RSV_N. 1 hit.
   G3DSA:2.40.70.10. Pept_Aspartc_cat. 1 hit.
   G3DSA:1.10.1200.30. Retrov_capsid_C. 1 hit.
   G3DSA:1.10.375.10. Retrov_capsid_N. 1 hit.
   G3DSA:4.10.60.10. Znf_CCH_retrovir. 1 hit.
Pfam:
   PF00607. Gag_p24. 1 hit.
   PF00552. IN_DBD_C. 1 hit.
   PF02022. Integrase_Zn. 1 hit.
   PF02813. Retro_M. 1 hit.
   PF00075. RNase_H. 1 hit.
   PF00665. rve. 1 hit.
   PF00077. RVP. 1 hit.
   PF00078. RVT_1. 1 hit.
   PF00098. zf-CCHC. 1 hit.
SMART:
   SM00343. ZnF_C2HC. 2 hits.
SUPFAM:
   SSF50122. Integrase_C. 1 hit.
   SSF46919. Integrase_Zn_N. 1 hit.
   SSF50630. Pept_Aspartic. 1 hit.
   SSF47353. Retrov_capsid_C. 1 hit.
   SSF47943. Retrov_capsid_N. 1 hit.
   SSF47836. Retrovir_matrix. 1 hit.
   SSF53098. RNaseH_fold. 2 hits.
PROSITE:
   PS50175. ASP_PROT_RETROV. 1 hit.
   PS00141. ASP_PROTEASE. 1 hit.
   PS50994. INTEGRASE. 1 hit.
   PS51027. INTEGRASE_DBD. 1 hit.
   PS50879. RNASE_H. 1 hit.
   PS50878. RT_POL. 1 hit.
   PS50158. ZF_CCHC. 1 hit.
   PS50876. ZF_INTEGRASE. 1 hit.

Entry information

Entry name: POL_RSVSB
Accession: Primary (citable) accession number: O92956
Entry history:
Integrated into UniProtKB/Swiss-Prot: September 27, 2004
Last sequence update: August 10, 2010
Last modified: December 14, 2011
This is version 77 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Viral Protein Annotation Program

Relevant documents

Peptidase families

Classification of peptidase families and list of entries

PDB cross-references

Index of Protein Data Bank (PDB) cross-references

SIMILARITY comments

Index of protein domains and families