Overview

Status: Reference proteome
Proteins: 9,320
Gene count: -
Proteome ID: UP000002139
Taxonomy: 448385 - Sorangium cellulosum (strain So ce56)
Strain: So ce56
Last modified: November 5, 2019
Genome assembly and annotation: GCA_000067165.1 from ENA/EMBL, full
Pan proteome: This proteome is part of the Sorangium cellulosum (strain So ce56) (Polyangium cellulosum (strain So ce56)) pan proteome
BUSCO: C:86.5%[S:80.4%,D:6.1%],F:4.7%,M:8.8%,n:296
Completeness: Standard
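
The BUSCO entry above uses the standard BUSCO short-summary notation: C (complete) is the sum of S (single-copy) and D (duplicated), F is fragmented, M is missing, and n is the number of BUSCO groups searched. The proteome identifier follows the UniProt format of 'UP' followed by 9 digits. A minimal Python sketch for parsing and sanity-checking both values is shown here; the function names `parse_busco`, `is_consistent` and `is_valid_upid` and the rounding tolerance are illustrative assumptions, not part of any UniProt or BUSCO tooling.

```python
import re

# Pattern for a BUSCO short summary such as
# "C:86.5%[S:80.4%,D:6.1%],F:4.7%,M:8.8%,n:296".
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)

# A UniProt proteome identifier: 'UP' followed by exactly 9 digits.
UPID_PATTERN = re.compile(r"UP\d{9}")


def parse_busco(summary):
    """Return the BUSCO percentages as floats and n as an int."""
    match = BUSCO_PATTERN.fullmatch(summary.strip())
    if match is None:
        raise ValueError(f"not a BUSCO short summary: {summary!r}")
    scores = {key: float(match[key]) for key in ("C", "S", "D", "F", "M")}
    scores["n"] = int(match["n"])
    return scores


def is_consistent(scores, tol=0.2):
    """Check S + D == C and C + F + M == 100, within a rounding tolerance.

    The tolerance is an assumption: each value is rounded to one decimal.
    """
    complete_ok = abs(scores["S"] + scores["D"] - scores["C"]) <= tol
    total_ok = abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tol
    return complete_ok and total_ok


def is_valid_upid(identifier):
    """Return True if the string has the 'UP' + 9 digits shape."""
    return UPID_PATTERN.fullmatch(identifier) is not None


scores = parse_busco("C:86.5%[S:80.4%,D:6.1%],F:4.7%,M:8.8%,n:296")
print(scores, is_consistent(scores), is_valid_upid("UP000002139"))
```

The check only confirms that the reported values are internally consistent and well-formed; it does not say anything about how the BUSCO scores were computed.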

Sorangium cellulosum (strain So ce56) is an obligately aerobic, Gram-negative bacterium. Its genome, with 13,033,779 base pairs, is the largest bacterial genome sequenced to date and contains 9,367 predicted protein-coding genes. Of these CDS, 34.7% (3,248 genes) have no significant similarity to predicted proteins in the databases. Noncoding regions represent 14% of the genome.
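
The figures in the paragraph above can be cross-checked with a few lines of arithmetic. The following Python sketch uses only the numbers quoted in the text; the constant and function names are illustrative, and the derived noncoding length is an approximation because the 14% figure is itself rounded.

```python
# Values quoted in the genome description above.
GENOME_LENGTH_BP = 13_033_779
PREDICTED_CDS = 9_367
CDS_WITHOUT_SIMILARITY = 3_248
NONCODING_FRACTION = 0.14  # rounded figure from the text


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of CDS with no significant database similarity.
orphan_share = percent(CDS_WITHOUT_SIMILARITY, PREDICTED_CDS)

# Approximate noncoding length implied by the rounded 14% figure.
noncoding_bp = round(NONCODING_FRACTION * GENOME_LENGTH_BP)

# Average gene density in genes per kilobase.
genes_per_kb = PREDICTED_CDS / (GENOME_LENGTH_BP / 1000)

print(f"{orphan_share:.1f}% of CDS without similarity")  # 34.7%
print(f"about {noncoding_bp:,} noncoding base pairs")
print(f"{genes_per_kb:.2f} genes per kb")
```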

The strain possesses 317 ELKs (eukaryotic protein kinase-like kinases), all of which contain the 11 subdomains known to be critical for activity. It produces the metabolite etnangien, an inhibitor of bacterial and viral nucleic acid polymerases. It also has numerous enzymes with biotechnological potential, including lipolytic enzymes, proteases, cellulases, nitrilases, amidases and hydantoinases, and it appears to encode a rich supply of S8 family peptidases, subtilisin-like enzymes that are widely exploited as additives in laundry detergents.

Components (genomic components encoding the proteome)

Component name    Genome accession(s)    Proteins
Chromosome                               9320