Protein

Copia protein

Gene

GIP

Organism
Drosophila melanogaster (Fruit fly)
Status
Reviewed - Annotation score: 3 out of 5 - Experimental evidence at protein level

Function

Sites

Feature key | Position(s) | Length | Description
Active site | 292 – 292 | 1 | For protease activity (By similarity)

Regions

Feature key | Position(s) | Length | Description
Zinc finger | 230 – 247 | 18 | CCHC-type (PROSITE-ProRule annotation)
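
The zinc finger annotation can be sanity-checked against the sequence itself. The sketch below is only an illustration: it assumes the commonly cited CCHC "zinc knuckle" spacing C-x2-C-x4-H-x4-C (the PROSITE rule behind the annotation is not reproduced in this entry), and `find_cchc` and `ligand_positions` are hypothetical helper names. The segment is residues 230-247 copied from the canonical sequence listed in this entry.

```python
import re

# Residues 230-247 of the canonical sequence, copied from this entry's listing.
ZNF_START = 230
ZNF_SEGMENT = "VKCHHCGREGHIKKDCFH"

# Assumption: consensus spacing of a CCHC zinc knuckle, C-x2-C-x4-H-x4-C.
CCHC_PATTERN = re.compile(r"C.{2}C.{4}H.{4}C")


def find_cchc(segment, offset):
    """Return the 1-based (start, end) of the first CCHC match, or None."""
    match = CCHC_PATTERN.search(segment)
    if match is None:
        return None
    return (offset + match.start(), offset + match.end() - 1)


def ligand_positions(segment, offset):
    """Return 1-based positions of the Cys, Cys, His, Cys residues, or None."""
    span = find_cchc(segment, offset)
    if span is None:
        return None
    start = span[0]
    # Offsets of the four conserved residues inside C-x2-C-x4-H-x4-C.
    return [start + delta for delta in (0, 3, 8, 13)]


if __name__ == "__main__":
    print(find_cchc(ZNF_SEGMENT, ZNF_START))
    print(ligand_positions(ZNF_SEGMENT, ZNF_START))
```

Under that assumed spacing, the motif falls inside the annotated 230 – 247 region.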

GO - Molecular function

GO - Biological process

Keywords - Molecular function

Aspartyl protease, Hydrolase, Protease

Keywords - Ligand

ATP-binding, Metal-binding, Nucleotide-binding, Zinc

Protein family/group databases

MEROPS: A11.001.

Names & Taxonomy

Protein names
Recommended name:
Copia protein
Alternative name(s):
Gag-int-pol protein
Cleaved into the following 2 chains: Copia VLP protein, Copia protease
Gene names
Name: GIP
Synonyms: COPIA
Organism: Drosophila melanogaster (Fruit fly)
Taxonomic identifier: 7227 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Ecdysozoa > Arthropoda > Hexapoda > Insecta > Pterygota > Neoptera > Endopterygota > Diptera > Brachycera > Muscomorpha > Ephydroidea > Drosophilidae > Drosophila > Sophophora

Organism-specific databases

FlyBase: FBgn0013437. copia\GIP.

Pathology & Biotech

Mutagenesis

Feature key | Position(s) | Length | Description
Mutagenesis | 292 – 292 | 1 | D → A: Loss of activity. 1 Publication
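
As a small illustration of how the D → A substitution maps onto the sequence, the sketch below applies it to residues 251-300 copied from the canonical sequence in this entry. The helper name `substitute` is hypothetical; it checks that the expected residue is present at the stated 1-based position before replacing it.

```python
# Residues 251-300 of the canonical sequence, copied from this entry's listing.
BLOCK_START = 251
BLOCK = "ILNNKNKENEKQVQTATSHGIAFMVKEVNNTSVMDNCGFVLDSGASDHLI"


def substitute(block, block_start, position, expected, replacement):
    """Replace one residue at a 1-based position after verifying the original."""
    index = position - block_start
    if not 0 <= index < len(block):
        raise ValueError(f"position {position} is outside the block")
    if block[index] != expected:
        raise ValueError(
            f"expected {expected} at {position}, found {block[index]}"
        )
    return block[:index] + replacement + block[index + 1:]


if __name__ == "__main__":
    mutant = substitute(BLOCK, BLOCK_START, 292, "D", "A")
    # Show the residues around position 292 before and after the change.
    print(BLOCK[40:44], "->", mutant[40:44])
```

The same check also confirms the residue reported at position 300 in the sequence conflict table (I in this entry).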

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Feature identifier
Chain | 1 – 270 | 270 | Copia VLP protein (Sequence analysis) | PRO_0000026135
Chain | 271 – 1409 | 1139 | Copia protease (Sequence analysis) | PRO_0000026136
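
Feature positions in this entry are 1-based and inclusive, so a feature's length is end - start + 1. The sketch below shows that arithmetic for the two chains and checks that they tile the 1,409-residue precursor without gaps or overlaps. The `Feature` class and `tiles_exactly` function are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self):
        return self.end - self.start + 1

    def as_slice(self):
        """Convert 1-based inclusive coordinates to a 0-based Python slice."""
        return slice(self.start - 1, self.end)


# Coordinates taken from the molecule processing table of this entry.
CHAINS = [
    Feature("Copia VLP protein", 1, 270),
    Feature("Copia protease", 271, 1409),
]


def tiles_exactly(features, total_length):
    """True if the features cover 1..total_length with no gaps or overlaps."""
    expected_start = 1
    for feature in sorted(features, key=lambda f: f.start):
        if feature.start != expected_start:
            return False
        expected_start = feature.end + 1
    return expected_start == total_length + 1


if __name__ == "__main__":
    for chain in CHAINS:
        print(chain.name, chain.length)
    print(tiles_exactly(CHAINS, 1409))
```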

Proteomic databases

PRIDE: P04146.

Structure

3D structure databases

ProteinModelPortal: P04146.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Length | Description
Domain | 476 – 644 | 169 | Integrase catalytic (PROSITE-ProRule annotation)

Sequence similarities

Contains 1 CCHC-type zinc finger. (PROSITE-ProRule annotation)
Contains 1 integrase catalytic domain. (PROSITE-ProRule annotation)
Contains 1 peptidase A11 domain. (Curated)

Zinc finger

Feature key | Position(s) | Length | Description
Zinc finger | 230 – 247 | 18 | CCHC-type (PROSITE-ProRule annotation)

Keywords - Domain

Zinc-finger

Family and domain databases

Gene3D: 3.30.420.10. 1 hit.
    4.10.60.10. 1 hit.
InterPro: IPR025724. GAG-pre-integrase_dom.
    IPR001584. Integrase_cat-core.
    IPR012337. RNaseH-like_dom.
    IPR013103. RVT_2.
    IPR001878. Znf_CCHC.
Pfam: PF13976. gag_pre-integrs. 1 hit.
    PF00665. rve. 1 hit.
    PF07727. RVT_2. 1 hit.
SUPFAM: SSF53098. SSF53098. 1 hit.
    SSF57756. SSF57756. 1 hit.
PROSITE: PS50994. INTEGRASE. 1 hit.
    PS50158. ZF_CCHC. 1 hit.

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative splicing.

Isoform Long (identifier: P04146-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MDKAKRNIKP FDGEKYAIWK FRIRALLAEQ DVLKVVDGLM PNEVDDSWKK
60 70 80 90 100
AERCAKSTII EYLSDSFLNF ATSDITARQI LENLDAVYER KSLASQLALR
110 120 130 140 150
KRLLSLKLSS EMSLLSHFHI FDELISELLA AGAKIEEMDK ISHLLITLPS
160 170 180 190 200
CYDGIITAIE TLSEENLTLA FVKNRLLDQE IKIKNDHNDT SKKVMNAIVH
210 220 230 240 250
NNNNTYKNNL FKNRVTKPKK IFKGNSKYKV KCHHCGREGH IKKDCFHYKR
260 270 280 290 300
ILNNKNKENE KQVQTATSHG IAFMVKEVNN TSVMDNCGFV LDSGASDHLI
310 320 330 340 350
NDESLYTDSV EVVPPLKIAV AKQGEFIYAT KRGIVRLRND HEITLEDVLF
360 370 380 390 400
CKEAAGNLMS VKRLQEAGMS IEFDKSGVTI SKNGLMVVKN SGMLNNVPVI
410 420 430 440 450
NFQAYSINAK HKNNFRLWHE RFGHISDGKL LEIKRKNMFS DQSLLNNLEL
460 470 480 490 500
SCEICEPCLN GKQARLPFKQ LKDKTHIKRP LFVVHSDVCG PITPVTLDDK
510 520 530 540 550
NYFVIFVDQF THYCVTYLIK YKSDVFSMFQ DFVAKSEAHF NLKVVYLYID
560 570 580 590 600
NGREYLSNEM RQFCVKKGIS YHLTVPHTPQ LNGVSERMIR TITEKARTMV
610 620 630 640 650
SGAKLDKSFW GEAVLTATYL INRIPSRALV DSSKTPYEMW HNKKPYLKHL
660 670 680 690 700
RVFGATVYVH IKNKQGKFDD KSFKSIFVGY EPNGFKLWDA VNEKFIVARD
710 720 730 740 750
VVVDETNMVN SRAVKFETVF LKDSKESENK NFPNDSRKII QTEFPNESKE
760 770 780 790 800
CDNIQFLKDS KESENKNFPN DSRKIIQTEF PNESKECDNI QFLKDSKESN
810 820 830 840 850
KYFLNESKKR KRDDHLNESK GSGNPNESRE SETAEHLKEI GIDNPTKNDG
860 870 880 890 900
IEIINRRSER LKTKPQISYN EEDNSLNKVV LNAHTIFNDV PNSFDEIQYR
910 920 930 940 950
DDKSSWEEAI NTELNAHKIN NTWTITKRPE NKNIVDSRWV FSVKYNELGN
960 970 980 990 1000
PIRYKARLVA RGFTQKYQID YEETFAPVAR ISSFRFILSL VIQYNLKVHQ
1010 1020 1030 1040 1050
MDVKTAFLNG TLKEEIYMRL PQGISCNSDN VCKLNKAIYG LKQAARCWFE
1060 1070 1080 1090 1100
VFEQALKECE FVNSSVDRCI YILDKGNINE NIYVLLYVDD VVIATGDMTR
1110 1120 1130 1140 1150
MNNFKRYLME KFRMTDLNEI KHFIGIRIEM QEDKIYLSQS AYVKKILSKF
1160 1170 1180 1190 1200
NMENCNAVST PLPSKINYEL LNSDEDCNTP CRSLIGCLMY IMLCTRPDLT
1210 1220 1230 1240 1250
TAVNILSRYS SKNNSELWQN LKRVLRYLKG TIDMKLIFKK NLAFENKIIG
1260 1270 1280 1290 1300
YVDSDWAGSE IDRKSTTGYL FKMFDFNLIC WNTKRQNSVA ASSTEAEYMA
1310 1320 1330 1340 1350
LFEAVREALW LKFLLTSINI KLENPIKIYE DNQGCISIAN NPSCHKRAKH
1360 1370 1380 1390 1400
IDIKYHFARE QVQNNVICLE YIPTENQLAD IFTKPLPAAR FVELRDKLGL

LQDDQSNAE
Length: 1,409
Mass (Da): 162,818
Last modified: February 21, 2001 - v3
Checksum: BE89440763A47691
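
The sequence display above interleaves position rulers with rows of ten-residue groups. A minimal sketch for recovering the plain residue string from such a display is shown below; `parse_display_block` is a hypothetical helper, and it assumes ruler lines contain only numbers while residue lines contain only letters and spaces. The example input is the first two rows of the listing above.

```python
def parse_display_block(text):
    """Join residue rows into one string, skipping blank and ruler lines."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # A ruler line consists solely of numeric position markers.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)


# First two rows of the canonical sequence display, copied from this entry.
EXAMPLE = """\
        10         20         30         40         50
MDKAKRNIKP FDGEKYAIWK FRIRALLAEQ DVLKVVDGLM PNEVDDSWKK
60 70 80 90 100
AERCAKSTII EYLSDSFLNF ATSDITARQI LENLDAVYER KSLASQLALR
"""

if __name__ == "__main__":
    sequence = parse_display_block(EXAMPLE)
    print(len(sequence), sequence[:10])
```

Applied to the full listing, the result can be compared with the stated length of 1,409 residues as a quick integrity check.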
Isoform Short (identifier: P04146-2)

The sequence of this isoform differs from the canonical sequence as follows:
     392-1374: Missing.

Length: 426
Mass (Da): 48,281
Checksum: 843AF790653512AF
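
The length of isoform Short follows from the canonical length and the missing range: 1409 - (1374 - 392 + 1) = 1409 - 983 = 426. The sketch below shows the same deletion on a string; `delete_range` and `isoform_length` are hypothetical helper names, and the placeholder sequence only stands in for the real residues.

```python
def delete_range(sequence, start, end):
    """Remove residues start..end (1-based, inclusive) from a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range lies outside the sequence")
    return sequence[:start - 1] + sequence[end:]


def isoform_length(canonical_length, start, end):
    """Length remaining after deleting the inclusive range start..end."""
    return canonical_length - (end - start + 1)


# Values taken from this entry: canonical length and the range missing in Short.
CANONICAL_LENGTH = 1409
MISSING_START, MISSING_END = 392, 1374

if __name__ == "__main__":
    print(isoform_length(CANONICAL_LENGTH, MISSING_START, MISSING_END))
    # Placeholder residues: 391 kept, 983 removed, 35 kept.
    placeholder = "N" * 391 + "x" * 983 + "C" * 35
    short = delete_range(placeholder, MISSING_START, MISSING_END)
    print(len(short))
```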

Sequence caution

The sequence CAD27357.1 differs from that shown. Reason: Erroneous initiation. Curated

Experimental Info

Feature key | Position(s) | Length | Description
Sequence conflict | 191 – 191 | 1 | S → N in CAA26447 (PubMed:2409449). (Curated)
Sequence conflict | 300 – 300 | 1 | I → V in CAA26447 (PubMed:2409449). (Curated)
Sequence conflict | 866 – 866 | 1 | Q → E in CAA26447 (PubMed:2409449). (Curated)

Natural variant

Feature key | Position(s) | Length | Description
Natural variant | 1265 – 1288 | 24 | STTGY…KRQNS → VQQGIYSKCLILISFVGIQR DRTQ in variant copia-related.
Natural variant | 1289 – 1409 | 121 | Missing in variant copia-related.

Alternative sequence

Feature key | Position(s) | Length | Description | Feature identifier
Alternative sequence | 392 – 1374 | 983 | Missing in isoform Short. 1 Publication | VSP_005226

Sequence databases

EMBL / GenBank / DDBJ:
    X04456 Genomic DNA. Translation: CAA28054.2.
    X04456 Genomic DNA. Translation: CAD27357.1. Different initiation.
    X02599 Genomic DNA. Translation: CAA26444.1.
    X02599 Genomic DNA. Translation: CAA26445.1.
    X02600 mRNA. Translation: CAA26446.1.
    X02600 mRNA. Translation: CAA26447.1.
    X13719 mRNA. Translation: CAA31997.1.
    X54147 Genomic DNA. Translation: CAA38086.1.
    BT011428 mRNA. Translation: AAR99086.1.
PIR: A03324. OFFFCP.

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references


Miscellaneous databases

PRO: P04146.


Publications

  1. "Complete nucleotide sequence of the Drosophila transposable element copia: homology between copia and retroviral proteins."
    Mount S.M., Rubin G.M.
    Mol. Cell. Biol. 5:1630-1638(1985)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA].
  2. "The nucleotide sequences of copia and copia-related RNA in Drosophila virus-like particles."
    Emori Y., Shiba T., Kanaya S., Inouye S., Yuki S., Saigo K.
    Nature 315:773-776(1985)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA / MRNA], PROTEIN SEQUENCE OF 2-10, ALTERNATIVE SPLICING.
  3. "The nucleotide sequence of Drosophila melanogaster copia-specific 2.1-kb mRNA."
    Miller K., Rosenbaum J., Zbrzezna V., Pogo A.O.
    Nucleic Acids Res. 17:2134-2134(1989)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM SHORT).
  4. "Virus-like particle formation of Drosophila copia through autocatalytic processing."
    Yoshioka K., Honma H., Zushi M., Kondo S., Togashi S., Miyake T., Shiba T.
    EMBO J. 9:535-541(1990)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] (ISOFORM SHORT), MUTAGENESIS OF ASP-292.
    Tissue: Larva.
  5. Stapleton M., Carlson J.W., Chavez C., Frise E., George R.A., Pacleb J.M., Park S., Wan K.H., Yu C., Rubin G.M., Celniker S.E.
    Submitted (JAN-2004) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM LONG).
    Strain: Berkeley.
    Tissue: Testis.

Entry information

Entry name: COPIA_DROME
Accession: Primary (citable) accession number: P04146
Secondary accession number(s): Q03728, Q24280, Q24555, Q24585, Q24586, Q24587, Q53XF8, Q8T391
Entry history
Integrated into UniProtKB/Swiss-Prot: November 1, 1986
Last sequence update: February 21, 2001
Last modified: May 11, 2016
This is version 114 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Drosophila annotation project

Miscellaneous

Keywords - Technical term

Direct protein sequencing, Transposable element

Documents

  1. Drosophila
    Drosophila: entries, gene names and cross-references to FlyBase
  2. Peptidase families
    Classification of peptidase families and list of entries
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
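
The identity and overlap thresholds described above can be illustrated with a toy membership test. This is a simplified sketch and not the procedure UniRef actually uses: it assumes the two sequences are already aligned to equal length with `-` as the gap character, measures identity over columns where both have a residue, and measures overlap as the fraction of seed residues covered. All function names are hypothetical.

```python
def percent_identity(aligned_member, aligned_seed):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_member) != len(aligned_seed):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (m, s)
        for m, s in zip(aligned_member, aligned_seed)
        if m != "-" and s != "-"
    ]
    if not pairs:
        return 0.0
    return 100.0 * sum(m == s for m, s in pairs) / len(pairs)


def percent_overlap(aligned_member, aligned_seed):
    """Percent of seed residues that are aligned to a member residue."""
    seed_length = sum(s != "-" for s in aligned_seed)
    if seed_length == 0:
        return 0.0
    covered = sum(
        m != "-" and s != "-" for m, s in zip(aligned_member, aligned_seed)
    )
    return 100.0 * covered / seed_length


def fits_cluster(aligned_member, aligned_seed, min_identity, min_overlap=80.0):
    """Apply an identity threshold (e.g. 90 or 50) plus the 80% overlap rule."""
    return (
        percent_identity(aligned_member, aligned_seed) >= min_identity
        and percent_overlap(aligned_member, aligned_seed) >= min_overlap
    )


if __name__ == "__main__":
    seed = "MDKAKRNIKP"  # first ten residues of this entry, used as a toy seed
    print(fits_cluster("MDKAKRNI--", seed, 90.0))
    print(fits_cluster("MDKAK-----", seed, 90.0))
```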