Swiss-Prot release 11.0

Published July 10, 1989


             SWISS-PROT RELEASE 11.0 RELEASE NOTES


   Date:     July 10, 1989
   Author:   A. Bairoch


                         1. INTRODUCTION

   1.1  Evolution

   Release 11.0 of SWISS-PROT contains 10856 sequence entries,
   comprising 3 265 966 amino acids abstracted from 10775
   references. This represents an increase of about 9% over
   release 10.0 (8.5% in entries, 10.6% in amino acids). The
   recent growth of the data bank is summarised below:

   Release    Date   Number of entries     Nb of amino acids

   3.0        11/86               4160               969 641
   4.0        04/87               4387             1 036 010
   5.0        09/87               5205             1 327 683
   6.0        01/88               6102             1 653 982
   7.0        04/88               6821             1 885 771
   8.0        08/88               7724             2 224 465
   9.0        11/88               8702             2 498 140
   10.0       03/89              10008             2 952 613
   11.0       07/89              10856             3 265 966
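
   The release-to-release growth can be recomputed from the
   table above. The following Python sketch is illustrative
   only; the names RELEASES, growth_percent and growth_table
   are introduced here and are not part of the SWISS-PROT
   distribution.

```python
# (release, number of entries, number of amino acids), copied from the table.
RELEASES = [
    ("3.0", 4160, 969_641),
    ("4.0", 4387, 1_036_010),
    ("5.0", 5205, 1_327_683),
    ("6.0", 6102, 1_653_982),
    ("7.0", 6821, 1_885_771),
    ("8.0", 7724, 2_224_465),
    ("9.0", 8702, 2_498_140),
    ("10.0", 10008, 2_952_613),
    ("11.0", 10856, 3_265_966),
]


def growth_percent(old, new):
    """Relative increase from `old` to `new`, in percent."""
    return 100.0 * (new - old) / old


def growth_table(releases):
    """Return (release, entry growth %, amino acid growth %) for each
    release compared with the one listed immediately before it."""
    rows = []
    for (_, e0, a0), (rel, e1, a1) in zip(releases, releases[1:]):
        rows.append((rel, growth_percent(e0, e1), growth_percent(a0, a1)))
    return rows


for rel, d_entries, d_aa in growth_table(RELEASES):
    print(f"{rel:>5}  entries +{d_entries:5.1f}%  amino acids +{d_aa:5.1f}%")
```

   Note that the intervals between releases are not equal (three
   to five months), so the percentages are per release and not
   per unit of time.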



   1.2  Source of data

   Release 11.0  has been  updated using protein sequence data
   from  release  20.0  of  the  PIR  (Protein  Identification
   Resource) protein  data bank,  as well  as  translation  of
   nucleotide sequence  data from  release 19.0  of  the  EMBL
   nucleotide sequence Data Library.

   As an indication of the source of the sequence data in the
   SWISS-PROT data bank we list here the statistics concerning
   the DR (Databank Reference) pointer lines:

   Entries with pointer(s) to only PIR entry(ies):          2992
   Entries with pointer(s) to only EMBL entry(ies):         3875
   Entries with pointer(s) to both EMBL and PIR entry(ies): 3272
   Entries with no pointer lines (entered in house):         717
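
   Statistics of this kind could be gathered along the following
   lines. This is only a sketch: it assumes a flat file in which
   each DR line starts with "DR" followed by the data bank name
   and a semicolon, and in which every entry ends with a "//"
   line. The function name classify_entries is hypothetical, and
   pointers to data banks other than PIR and EMBL are ignored.

```python
from collections import Counter


def classify_entries(lines):
    """Count entries by the data banks their DR lines point to."""
    counts = Counter()
    seen = set()  # data banks referenced by the current entry
    for line in lines:
        if line.startswith("DR "):
            # "DR   EMBL; X00001; ..." -> "EMBL"
            seen.add(line[2:].strip().split(";")[0].strip().upper())
        elif line.startswith("//"):
            has_pir = "PIR" in seen
            has_embl = "EMBL" in seen
            if has_pir and has_embl:
                counts["both"] += 1
            elif has_pir:
                counts["PIR only"] += 1
            elif has_embl:
                counts["EMBL only"] += 1
            else:
                counts["none"] += 1
            seen = set()  # start the next entry with a clean slate
    return counts
```

   The four published counts add up to 10856, the total number
   of entries in this release, so every entry falls into exactly
   one of the four classes.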



     2. DESCRIPTION OF THE CHANGES MADE TO SWISS-PROT SINCE
                           RELEASE 10


   2.1  Sequences and annotations

   Some 848 new sequences have been added since the last
   release, the sequence data of 113 existing entries has been
   updated and the annotations of 1366 entries have been
   revised. In particular we have used review articles to
   update the annotations of the following groups or families
   of proteins:

      Adenylate kinases
      Bacterial restriction systems proteins
      Bacterial transduction systems proteins
      Caseins
      Chitin-binding proteins
      Cutinases
      Cytochromes P450
      DNA polymerases
      DNA topoisomerases type I
      Esterases
      Heat shock hsp70 proteins
      Lipases
      Microtubule-associated proteins
      2-oxo acid dehydrogenases complex components
      Paramyxoviruses proteins
      Protein disulfide isomerases
      Purine/pyrimidine phosphoribosyl transferases
      Rhabdoviruses proteins
      Ribonucleotide reductases
      Rotaviruses proteins
      Serine hydroxymethyltransferases
      Small, acid-soluble spore proteins
      Xylose isomerases


   2.2  Standardized journal abbreviations

   Journal names are now abbreviated according to the
   conventions used by the National Library of Medicine
   (Bethesda, Maryland, USA) and are based on the existing ISO
   and ANSI standards. In most cases the changes are small,
   and the  new abbreviations  are at  least as  meaningful as
   the old ones. As in previous releases the abbreviations for
   the journals cited in SWISS-PROT are listed in the document
   file JOURLIST.TXT.


   2.3  New feature key

   A new  feature key  has been  introduced in  this  release:
   THIOETH, which  describes  a  thioether  bond  between  two
   residues.
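
   As an illustration, thioether bonds could be extracted from
   an entry as sketched below. The sketch assumes that THIOETH
   appears on FT (feature table) lines in the same layout as the
   other feature keys, that is the key followed by the positions
   of the two residues. The function name thioether_bonds is
   hypothetical.

```python
def thioether_bonds(lines):
    """Return (from, to) residue positions for every THIOETH feature."""
    bonds = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "FT" and parts[1] == "THIOETH":
            try:
                bonds.append((int(parts[2]), int(parts[3])))
            except ValueError:
                # Positions that are not plain integers are skipped.
                continue
    return bonds
```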



                       3. THE NEXT RELEASE

   SWISS-PROT release 12.0 will be available in November 1989.




                     4. WE NEED YOUR HELP!

   We welcome any feedback from our users. We would especially
   appreciate being notified if you find that sequences
   belonging to your field of expertise are missing from the
   data bank. We would also like to be notified about
   annotations that need updating, for example if the function
   of a protein has been clarified or if new
   post-translational information has become available.



                   APPENDIX A: SOME STATISTICS


   A.1  Amino acid composition

        A.1.1  Composition in percent for the complete data
        bank

   Ala (A) 7.74   Gln (Q) 4.11   Leu (L) 9.08   Ser (S) 7.01
   Arg (R) 5.22   Glu (E) 6.19   Lys (K) 5.83   Thr (T) 5.84
   Asn (N) 4.38   Gly (G) 7.27   Met (M) 2.27   Trp (W) 1.34
   Asp (D) 5.22   His (H) 2.29   Phe (F) 3.94   Tyr (Y) 3.23
   Cys (C) 1.88   Ile (I) 5.31   Pro (P) 5.17   Val (V) 6.51

   Asx (B) 0.01   Glx (Z) 0.01   Xaa (X) 0.03


        A.1.2  Classification of the amino acids by their
        frequency

   Leu, Ala, Gly, Ser, Val, Glu, Thr, Lys, Ile, Arg = Asp,
   Pro, Asn, Gln, Phe, Tyr, His, Met, Cys, Trp
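
   The composition and the ranking can be computed as sketched
   below. The function names are hypothetical, and the table
   PUBLISHED_PERCENT simply repeats the values of A.1.1 for the
   twenty standard amino acids (Asx, Glx and Xaa are left out).
   Ties are broken alphabetically by one-letter code, which is
   an arbitrary choice made for this sketch.

```python
from collections import Counter

# Values of table A.1.1, keyed by one-letter code.
PUBLISHED_PERCENT = {
    "A": 7.74, "Q": 4.11, "L": 9.08, "S": 7.01,
    "R": 5.22, "E": 6.19, "K": 5.83, "T": 5.84,
    "N": 4.38, "G": 7.27, "M": 2.27, "W": 1.34,
    "D": 5.22, "H": 2.29, "F": 3.94, "Y": 3.23,
    "C": 1.88, "I": 5.31, "P": 5.17, "V": 6.51,
}


def composition_percent(sequences):
    """Percentage of each residue over all the given sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def rank_by_frequency(percentages):
    """One-letter codes sorted from most to least frequent."""
    return sorted(percentages, key=lambda aa: (-percentages[aa], aa))


print(" ".join(rank_by_frequency(PUBLISHED_PERCENT)))
```

   Applied to the published percentages, the ranking reproduces
   the order given in A.1.2, with Asp and Arg tied at 5.22%.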


   A.2   Distribution of the sequences by their organism of
   origin

   Total number  of species represented in this release of the
   data bank: 1687

        A.2.1  Table of the frequency of occurrence of species

   Species represented 1x: 785
                       2x: 304
                       3x: 169
                       4x: 102
                       5x:  69
                       6x:  47
                       7x:  29
                       8x:  28
                       9x:  34
                      10x:  16
                  11- 20x:  51
                  21-100x:  42
                    >100x:  11
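
   A table of this form can be derived from the list of species
   names, one per entry, as sketched below. The function names
   and the PUBLISHED table (a copy of the counts above) are
   introduced for illustration only.

```python
from collections import Counter

# Counts copied from the table above.
PUBLISHED = {
    "1x": 785, "2x": 304, "3x": 169, "4x": 102, "5x": 69,
    "6x": 47, "7x": 29, "8x": 28, "9x": 34, "10x": 16,
    "11-20x": 51, "21-100x": 42, ">100x": 11,
}


def bin_label(n):
    """Label of the table row for a species that has n entries."""
    if n <= 10:
        return f"{n}x"
    if n <= 20:
        return "11-20x"
    if n <= 100:
        return "21-100x"
    return ">100x"


def species_histogram(species_per_entry):
    """Count how many species fall into each row of the table."""
    per_species = Counter(species_per_entry)
    return Counter(bin_label(n) for n in per_species.values())


# The rows add up to the total number of species quoted above.
print(sum(PUBLISHED.values()))
```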


        A.2.2  Table of the most represented species

    Number   Frequency          Species
         1        1003          Human
         2         885          Escherichia coli
         3         555          Mouse
         4         458          Rat
         5         354          Baker's yeast (Saccharomyces cerevisiae)
         6         313          Bovine
         7         185          Fruit fly (Drosophila melanogaster)
         8         183          Chicken
         9         151          Rabbit
        10         131          Pig
        11         102          African clawed frog (Xenopus laevis)
        12          96          Bacillus subtilis
        13          84          Salmonella typhimurium
        14          83          Maize
        15          79          Bacteriophage T4
        16          70          Herpes virus (Type 1, Strain 17)
                    70          Tobacco
        18          67          Varicella-Zoster virus (Strain Dumas)
        19          62          Bacteriophage Lambda
                    62          Vaccinia Virus
                    62          Wheat


   A.3  Distribution of the sequences by size

        From   To   Number            From   To   Number
           1-  50      626            1001-1100       77
          51- 100     1368            1101-1200       53
         101- 150     2159            1201-1300       44
         151- 200     1130            1301-1400       26
         201- 250      876            1401-1500       17
         251- 300      729            1501-1600       10
         301- 350      643            1601-1700       14
         351- 400      608            1701-1800       12
         401- 450      468            1801-1900        8
         451- 500      527            1901-2000        7
         501- 550      401                >2000       54
         551- 600      244
         601- 650      188
         651- 700      130
         701- 750      108
         751- 800       76
         801- 850       77
         851- 900       94
         901- 950       43
         951-1000       39

   Currently the two largest sequences are:

   APB$HUMAN   4563 a.a.
   APOA$HUMAN  4548 a.a.
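
   The size classes used in the table are 50 residues wide up to
   1000, 100 residues wide up to 2000, and open-ended above
   that. A sketch of the corresponding binning follows; the
   function names size_bin and size_histogram are hypothetical.

```python
from collections import Counter


def size_bin(length):
    """Return the (from, to) class of table A.3 containing `length`.

    The open-ended class above 2000 is returned as (2001, None).
    """
    if length < 1:
        raise ValueError("sequence length must be at least 1")
    if length > 2000:
        return (2001, None)
    width = 50 if length <= 1000 else 100
    low = ((length - 1) // width) * width + 1
    return (low, low + width - 1)


def size_histogram(lengths):
    """Count the sequences falling into each size class."""
    return Counter(size_bin(n) for n in lengths)
```

   Both of the largest sequences listed above fall into the
   open-ended class.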


                APPENDIX B: DISKS FOR SWISS-PROT


   B.1  IBM PC/AT 1.2 Mb disks

   SWISS-PROT is stored on fourteen 1.2 Mb disks. Each of
   these disks contains a single bulk file (PRT11_01.BLK to
   PRT11_14.BLK):

   Disk     First sequence        Last Sequence
    1       10KA$MYCTU            B1AR$HUMAN
    2       B1AR$MELGA            COLI$SQUAC
    3       COLI$STRCA            DPOL$HPBVY
    4       DPOL$HPBVZ            GC2$HUMAN
    5       GC3$HUMAN             HEMA$INCMI
    6       HEMA$INCP1            K1CS$BOVIN
    7       K1CS$HUMAN            MAP2$HUMAN
    8       MAS1$YEAST            ODB1$BOVIN
    9       ODB2$HUMAN            PRP2$MOUSE
   10       PRP2$RAT              RRPO$BPSP
   11       RRPO$CARMV            TKNG$RAT
   12       TKNK$BOVIN            VGLG$VSVJ
   13       VGLG$VSVO             YVL6$HCMVA
   14       YWL1$HCMVA            ZP3$MOUSE
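
   The disk holding a given entry can be found from the first
   entry name of each disk. The sketch below assumes that entry
   names are ordered by plain ASCII comparison, which is
   consistent with the table above but is not stated in these
   notes. The names FIRST_ON_DISK, disk_for and bulk_file are
   hypothetical.

```python
from bisect import bisect_right

# First entry name on disks 1 to 14, copied from the table above.
FIRST_ON_DISK = [
    "10KA$MYCTU", "B1AR$MELGA", "COLI$STRCA", "DPOL$HPBVZ",
    "GC3$HUMAN", "HEMA$INCP1", "K1CS$HUMAN", "MAS1$YEAST",
    "ODB2$HUMAN", "PRP2$RAT", "RRPO$CARMV", "TKNK$BOVIN",
    "VGLG$VSVO", "YWL1$HCMVA",
]


def disk_for(entry_name):
    """Return the disk number (1 to 14) whose range covers entry_name.

    Assumption: names sort in ASCII order. A name sorting after the
    last entry of the release is still reported as disk 14.
    """
    disk = bisect_right(FIRST_ON_DISK, entry_name)
    if disk == 0:
        raise ValueError(f"{entry_name!r} sorts before the first entry")
    return disk


def bulk_file(entry_name):
    """Name of the bulk file expected to contain entry_name."""
    return f"PRT11_{disk_for(entry_name):02d}.BLK"
```

   Whether a given name actually exists in the release still has
   to be checked in the bulk file itself; the lookup only
   selects the disk.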



   B.2  IBM PS/2 1.4 Mb disks

   The number and content of the 1.4 Mb disks for the PS/2
   systems are identical to those of the 1.2 Mb disks (see
   above).