Status: Reference proteome
Proteins: 34,652. This is the number of protein entries associated with this proteome: UniProtKB entries for regular proteomes or UniParc entries for redundant proteomes.
Gene count: -. This is the total number of unique genes found in the proteome set, algorithmically computed. For each gene, a single representative protein sequence is chosen from the proteome. Where possible, reviewed (Swiss-Prot) protein sequences are chosen as the representatives.
Proteome ID: UP000004994. The proteome identifier (UPID) is the unique identifier assigned to the set of proteins that constitute the proteome. It consists of the characters 'UP' followed by 9 digits, is stable across releases and can therefore be used to cite a UniProt proteome.
Taxonomy: 4081 - Solanum lycopersicum
Strain: cv. Heinz 1706
Last modified: December 21, 2020
Genome assembly and annotation: GCA_000188115.3 from EnsemblPlants, full. This is the identifier for the genome assembly.
BUSCO: C:92.3%[S:90.2%,D:2.2%],F:2.6%,M:5.1%,n:5950 (solanales_odb10). The Benchmarking Universal Single-Copy Ortholog (BUSCO) assessment tool is used, for eukaryotic and bacterial proteomes, to provide quantitative measures of UniProt proteome data completeness in terms of expected gene content. BUSCO scores include percentages of complete (C) single-copy (S) genes, complete (C) duplicated (D) genes, fragmented (F) and missing (M) genes, as well as the total number of orthologous clusters (n) used in the BUSCO assessment.
Completeness: Standard. Complete Proteome Detector (CPD) is an algorithm which employs statistical evaluation of the completeness and quality of proteomes in UniProt, by looking at the sizes of taxonomically close proteomes. Possible values are 'Standard', 'Close to Standard' and 'Outlier'.
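
The proteome identifier and the BUSCO summary both follow fixed textual patterns, so they can be checked and unpacked mechanically. The following is a minimal sketch in Python, assuming the layouts shown on this page ('UP' plus 9 digits, and the C/S/D/F/M/n string). The function names `is_valid_upid` and `parse_busco` are hypothetical helpers written for illustration and are not part of any UniProt or BUSCO tooling.

```python
import re

# 'UP' followed by exactly 9 digits, as described for the proteome ID.
UPID_PATTERN = re.compile(r"UP\d{9}")

# Layout of the BUSCO summary string as it appears on this page.
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def is_valid_upid(identifier: str) -> bool:
    """Return True if the whole string has the form 'UP' + 9 digits."""
    return UPID_PATTERN.fullmatch(identifier) is not None


def parse_busco(summary: str) -> dict:
    """Extract the percentages (as floats) and cluster count n (as int)."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError(f"unrecognised BUSCO summary: {summary!r}")
    scores = {key: float(value) for key, value in match.groupdict().items()
              if key != "n"}
    scores["n"] = int(match.group("n"))
    return scores


scores = parse_busco("C:92.3%[S:90.2%,D:2.2%],F:2.6%,M:5.1%,n:5950")
# Complete, fragmented and missing should account for all clusters,
# up to the rounding of the published one-decimal percentages.
total = scores["C"] + scores["F"] + scores["M"]
```

For the values on this page, complete, fragmented and missing add up to 100% within rounding, while single-copy plus duplicated (90.2% + 2.2%) differs from the reported complete figure (92.3%) by 0.1, which is consistent with each percentage being rounded independently.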

Solanum lycopersicum (tomato) is the second most important vegetable crop after the closely related potato (Solanum tuberosum) and is cultivated in over 140 different countries. The species originated in South America, where a number of wild relatives can be found. It is a perennial, though as a crop plant it is often grown as an annual.

Solanum lycopersicum is particularly studied as a model organism for fruit development.

The tomato reference proteome is derived from the genome of the inbred cultivar Heinz 1706, which was sequenced by the Tomato Genome Consortium and published in 2012. The tomato genome has a haploid chromosome number of 12 and comprises approximately 900 Mb and about 35,000 protein-coding genes.

Components (genomic components encoding the proteome)

Component name    Proteins
Chromosome 1      4220
Chromosome 2      3399
Chromosome 3      3335
Chromosome 4      2726
Chromosome 5      2506
Chromosome 6      2793
Chromosome 7      2474
Chromosome 8      2434
Chromosome 9      2473
Chromosome 10     2534
Chromosome 11     2304
Chromosome 12     2421
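
As a quick consistency check, the per-chromosome protein counts can be summed and compared with the 34,652 proteins reported for the whole proteome. The sketch below reads each component row as a chromosome number followed by its protein count. The variable names are illustrative, and the interpretation of the remainder (entries not assigned to any of the twelve listed chromosomes) is an assumption, since the page does not say where those entries belong.

```python
# Protein counts per chromosome, transcribed from the components listing.
proteins_per_chromosome = {
    1: 4220, 2: 3399, 3: 3335, 4: 2726, 5: 2506, 6: 2793,
    7: 2474, 8: 2434, 9: 2473, 10: 2534, 11: 2304, 12: 2421,
}

# Total number of protein entries reported for the proteome.
reported_total = 34652

listed_total = sum(proteins_per_chromosome.values())

# Entries not accounted for by the twelve listed chromosomes
# (assumed here to be assigned to other components or unplaced).
remainder = reported_total - listed_total

print(f"listed on chromosomes 1-12: {listed_total}")
print(f"not listed per chromosome:  {remainder}")
```

The twelve chromosomes account for 33,619 entries, leaving 1,033 of the 34,652 that the listing above does not attribute to a chromosome.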


  1. "Sequence of the tomato chloroplast DNA and evolutionary comparison of solanaceous plastid genomes."
    Kahlau S., Aspinall S., Gray J.C., Bock R.
    J. Mol. Evol. 63:194-207(2006)
  2. "The tomato genome sequence provides insights into fleshy fruit evolution."
    Tomato Genome Consortium
    Nature 485:635-641(2012)