Proteins: 4,870 (number of protein entries associated with this proteome: UniProtKB entries for regular proteomes or UniParc entries for redundant proteomes)
Gene count: - (the total number of unique genes found in the proteome set, algorithmically computed; for each gene, a single representative protein sequence is chosen from the proteome, and where possible reviewed (Swiss-Prot) protein sequences are chosen as the representatives)
Proteome ID: UP000008216 (the proteome identifier (UPID) is the unique identifier assigned to the set of proteins that constitute the proteome; it consists of the characters 'UP' followed by 9 digits, is stable across releases and can therefore be used to cite a UniProt proteome)
Taxonomy: 405955 - Escherichia coli O1:K1 / APEC
Last modified: August 22, 2020
Genome assembly and annotation: GCA_000014845.1 from ENA/EMBL full (identifier for the genome assembly)
Pan proteome: This proteome is part of the Escherichia coli (strain K12) pan proteome (a pan proteome is the full set of proteins thought to be expressed by a group of highly related organisms, e.g. multiple strains of the same bacterial species)
BUSCO: C:90.7%[S:90%,D:0.7%],F:0.2%,M:9.1%,n:440 enterobacterales_odb10 (the Benchmarking Universal Single-Copy Ortholog (BUSCO) assessment tool is used, for eukaryotic and bacterial proteomes, to provide quantitative measures of UniProt proteome data completeness in terms of expected gene content; BUSCO scores include percentages of complete (C) single-copy (S) genes, complete (C) duplicated (D) genes, fragmented (F) and missing (M) genes, as well as the total number of orthologous clusters (n) used in the BUSCO assessment)
Completeness: Standard (Complete Proteome Detector (CPD) is an algorithm which employs statistical evaluation of the completeness and quality of proteomes in UniProt, by looking at the sizes of taxonomically close proteomes; possible values are 'Standard', 'Close to Standard' and 'Outlier')
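
The identifier and score formats described in the fields above can be checked mechanically. The following is a minimal Python sketch, not part of UniProt or BUSCO tooling: the function names `is_valid_upid` and `parse_busco` are hypothetical, and the patterns assume only what this entry states (a UPID is 'UP' followed by 9 digits; a BUSCO summary lists C, S, D, F, M percentages and n).

```python
import re

# Assumption: a UPID is exactly 'UP' followed by 9 digits, as described in this entry.
UPID_PATTERN = re.compile(r"UP\d{9}")

# Assumption: the BUSCO summary has the layout shown in this entry,
# e.g. "C:90.7%[S:90%,D:0.7%],F:0.2%,M:9.1%,n:440".
BUSCO_PATTERN = re.compile(
    r"C:(?P<complete>[\d.]+)%"
    r"\[S:(?P<single>[\d.]+)%,D:(?P<duplicated>[\d.]+)%\],"
    r"F:(?P<fragmented>[\d.]+)%,"
    r"M:(?P<missing>[\d.]+)%,"
    r"n:(?P<n>\d+)"
)


def is_valid_upid(identifier: str) -> bool:
    """Return True if the whole string matches the 'UP' + 9 digits format."""
    return UPID_PATTERN.fullmatch(identifier) is not None


def parse_busco(summary: str) -> dict:
    """Extract the BUSCO percentages (as floats) and cluster count n (as int)."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError(f"not a recognised BUSCO summary: {summary!r}")
    scores = {name: float(value) for name, value in match.groupdict().items()}
    scores["n"] = int(match.group("n"))
    return scores


upid_ok = is_valid_upid("UP000008216")
scores = parse_busco("C:90.7%[S:90%,D:0.7%],F:0.2%,M:9.1%,n:440 enterobacterales_odb10")
```

For the values in this entry, the parsed scores are internally consistent: single-copy plus duplicated equals complete (90 + 0.7 = 90.7), and complete, fragmented and missing together account for 100% of the 440 clusters.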

Escherichia coli is a Gram-negative straight rod, which either uses peritrichous flagella for motility or is nonmotile. It is a facultatively anaerobic chemoorganotroph capable of both respiratory and fermentative metabolism. E.coli serves a useful function in the body by suppressing the growth of harmful bacterial species and by synthesising appreciable amounts of vitamins. It is an important component of the biosphere. It colonizes the lower gut of animals and survives when released to the natural environment, allowing widespread dissemination to new hosts. Pathogenic E.coli strains are responsible for infection of the enteric, urinary, pulmonary and nervous systems. Comparison of 20 E.coli/Shigella strains shows the core genome to be about 2000 genes while the pan-genome has over 18,000 genes. There are multiple, striking integration hotspots that are conserved across the genomes, corresponding to regions of abundant and parallel insertions and deletions of genetic material.

This strain is an avian pathogenic E.coli (APEC), and was isolated from the lung of a chicken with colisepticemia. E.coli APEC O1 is an O1:K1:H7 strain belonging to phylogroup B2 and was chosen for sequencing as it possesses traits characteristic of E.coli strains that cause disease outside the intestinal tract, i.e. APEC and UPEC (uropathogenic E.coli) strains. It is highly virulent in chickens. It is closely related to E.coli UTI89, a UPEC strain of E.coli (ECOUT). It contains 4 plasmids: pAPEC-O1-ColBM, pAPEC-O1-R, pAPEC-O1-Cryptic1 and pAPEC-O1-Cryptic2. Plasmid pAPEC-O1-ColBM is an F-type plasmid that produces colicins B and M and encodes a putative virulence cluster. Plasmid pAPEC-O1-R encodes resistance to eight antimicrobial agents. The cryptic plasmids are somewhat related to Yersinia-type plasmids and do not confer any apparent phenotypes.

Components (genomic components encoding the proteome)

Component name            Genome Accession(s)    Proteins
Plasmid pAPEC-O1-ColBM                           192
Plasmid pAPEC-O1-R                               222


  1. "Complete DNA sequence of a ColBM plasmid from avian pathogenic Escherichia coli suggests that it evolved from closely related ColV virulence plasmids."
    Johnson T.J., Johnson S.J., Nolan L.K.
    J. Bacteriol. 188:5975-5983(2006)
  2. "The genome sequence of avian pathogenic Escherichia coli strain O1:K1:H7 shares strong similarities with human extraintestinal pathogenic E. coli genomes."
    Johnson T.J., Kariyawasam S., Wannemuehler Y., Mangiamele P., Johnson S.J., Doetkott C., Skyberg J.A., Lynne A.M., Johnson J.R., Nolan L.K.
    J. Bacteriol. 189:3228-3236(2007)
  3. "Complete DNA sequence, comparative genomics, and prevalence of an IncHI2 plasmid occurring among extraintestinal pathogenic Escherichia coli isolates."
    Johnson T.J., Wannemuehler Y.M., Scaccianoce J.A., Johnson S.J., Nolan L.K.
    Antimicrob. Agents Chemother. 50:3929-3933(2006)
