Proteins: 6,585
Gene count: - (the total number of unique genes found in the proteome set, algorithmically computed; for each gene, a single representative protein sequence is chosen from the proteome, and reviewed Swiss-Prot sequences are preferred where possible)
Proteome ID: UP000006158 (the proteome identifier, or UPID, is the unique identifier assigned to the set of proteins that constitute the proteome; it consists of the characters 'UP' followed by 9 digits, is stable across releases, and can therefore be used to cite a UniProt proteome)
Taxonomy: 246196 - Mycobacterium smegmatis (strain ATCC 700084 / mc(2)155)
Strain: mc(2)155
Last modified: March 29, 2019
Genome assembly and annotation: GCA_000283295.1 from ENA/EMBL
Pan proteome: This proteome is part of the Mycobacterium smegmatis (strain ATCC 700084 / mc(2)155) pan proteome. A pan proteome is the full set of proteins thought to be expressed by a group of highly related organisms (e.g. multiple strains of the same bacterial species).
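
The proteome identifier format described above ('UP' followed by 9 digits) can be checked mechanically before an identifier is used in a citation or a lookup. The following is a minimal sketch, not part of UniProt's own tooling: the function name `is_valid_upid` and the pattern constant are hypothetical names chosen for illustration, and the check covers only the shape of the identifier, not whether the proteome actually exists.

```python
import re

# Assumption: the pattern is derived only from the description in this record,
# i.e. the literal characters "UP" followed by exactly 9 ASCII digits.
UPID_PATTERN = re.compile(r"UP[0-9]{9}")


def is_valid_upid(identifier: str) -> bool:
    """Return True if `identifier` has the shape 'UP' + 9 digits."""
    return UPID_PATTERN.fullmatch(identifier) is not None


if __name__ == "__main__":
    # The identifier of this proteome matches the described format.
    print(is_valid_upid("UP000006158"))
```

A full match is used rather than a search so that identifiers with trailing or leading characters are rejected.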

Mycobacterium smegmatis is a fast-growing, usually non-pathogenic Mycobacterium widely used as a model system. It was originally isolated from human smegma. M. smegmatis mc(2)155 was isolated in 1990 and, unlike other M. smegmatis strains, can be easily transformed by electroporation. Although they are considered Gram-positive, mycobacteria have an unusual outer membrane approximately 8 nm thick. The outer membrane and the mycolic acid-arabinogalactan-peptidoglycan polymer form the cell wall, which, together with the inner cell membrane, constitutes an efficient permeability barrier.

The second genome of this strain (dated 2007-2012) is a re-sequencing of some small regions of the genome, not a de novo sequencing of the full genome. The group of O. Lecompte wanted to submit the sequence as a third-party annotation, but GenBank considered it a new sequence because some bases differ from the initial TIGR sequence. UniProtKB has decided to give it the status of Complete proteome, as it is based on the original TIGR sequence.

Components (genomic components encoding the proteome)

Component name | Genome Accession(s)


  1. "Ortho-proteogenomics: multiple proteomes investigation through orthology and a new MS-based protocol."
    Gallien S., Perrodou E., Carapito C., Deshayes C., Reyrat J.-M., Van Dorsselaer A., Poch O., Schaeffer C., Lecompte O.
    Genome Res. 2009:128-135(2009)
  2. "Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors?"
    Deshayes C., Perrodou E., Gallien S., Euphrasie D., Schaeffer C., Van-Dorsselaer A., Poch O., Lecompte O., Reyrat J.-M.
    Genome Biol. 2007:R20.1-R20.9(2007)
