Status: Reference proteome
Gene count: The total number of unique genes found in the proteome set, algorithmically computed. For each gene, a single representative protein sequence is chosen from the proteome. Where possible, reviewed (Swiss-Prot) protein sequences are chosen as the representatives.
Proteome ID: UP000000314. The proteome identifier (UPID) is the unique identifier assigned to the set of proteins that constitute the proteome. It consists of the characters ‘UP’ followed by 9 digits, is stable across releases and can therefore be used to cite a UniProt proteome.
Taxonomy: 644223 - Komagataella phaffii (strain GS115 / ATCC 20864)
Strain: GS115 / ATCC 20864
Last modified: November 9, 2018
Genome assembly and annotation: GCA_000027005.1 from ENA/EMBL
Pan proteome: This proteome is part of the Komagataella phaffii (strain GS115 / ATCC 20864) (Yeast) (Pichia pastoris) pan proteome. A pan proteome is the full set of proteins thought to be expressed by a group of highly related organisms (e.g. multiple strains of the same bacterial species).
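
The proteome identifier format described above ('UP' followed by 9 digits) can be checked with a short pattern match. The following is a minimal sketch in Python; the function name `is_valid_upid` is hypothetical and chosen only for illustration, and the pattern encodes nothing beyond the rule stated in this entry.

```python
import re

# Pattern derived only from the description in this entry:
# the characters 'UP' followed by exactly 9 ASCII digits.
UPID_PATTERN = re.compile(r"UP[0-9]{9}")


def is_valid_upid(identifier: str) -> bool:
    """Return True if the whole string has the UPID shape (hypothetical helper)."""
    return UPID_PATTERN.fullmatch(identifier) is not None


# The proteome ID of this entry matches the described format.
print(is_valid_upid("UP000000314"))
```

A check like this only confirms the shape of the identifier; it does not confirm that a proteome with that ID exists in UniProt.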

Pichia pastoris has been reassigned to the genus Komagataella after phylogenetic analysis of gene sequences, and split into three species, K. pastoris, K. phaffii, and K. pseudopastoris. These species are facultative methylotrophic yeasts: they are able to use single carbon compounds (e.g. methanol or methane) or multi-carbon compounds with no carbon-carbon bonds (e.g. dimethyl ether or dimethylamine) as a food source.

Pichia pastoris has been used extensively as an expression system for protein production because of its high growth rate and low cost of culture, and because it can produce proteins with disulfide bonds and glycosylation.

The genome sequence of Komagataella phaffii strain GS115 was published in 2009. It consists of 9.4 Mb in 4 chromosomes, with 5,313 predicted protein-coding genes.

Components: Genomic components encoding the proteome

Component name    Genome Accession(s)
Chromosome 1    1538
Chromosome 1, Contig 3434
Chromosome 2    1333
Chromosome 3    1198
Chromosome 4    971
Chromosome 4, Contig 1573


