Overview

Status: Reference proteome
Proteins: 11,718 (number of protein entries associated with this proteome: UniProtKB entries for regular proteomes or UniParc entries for redundant proteomes)
Gene count: the total number of unique genes found in the proteome set, algorithmically computed. For each gene, a single representative protein sequence is chosen from the proteome. Where possible, reviewed (Swiss-Prot) protein sequences are chosen as the representatives.
Proteome ID: UP000001449 (the proteome identifier, or UPID, is the unique identifier assigned to the set of proteins that constitute the proteome. It consists of the characters 'UP' followed by 9 digits, is stable across releases, and can therefore be used to cite a UniProt proteome)
Taxonomy: 35128 - Thalassiosira pseudonana
Strain: CCMP1335
Last modified: February 26, 2021
Genome assembly and annotation: GCA_000149405.2 from ENA/EMBL full (identifier for the genome assembly)
BUSCO: C:97%[S:95%,D:2%],F:3%,M:0%,n:100 stramenopiles_odb10 (the Benchmarking Universal Single-Copy Ortholog (BUSCO) assessment tool is used, for eukaryotic and bacterial proteomes, to provide quantitative measures of UniProt proteome data completeness in terms of expected gene content. BUSCO scores include the percentages of complete (C) genes, split into single-copy (S) and duplicated (D), fragmented (F) genes, and missing (M) genes, as well as the total number of orthologous clusters (n) used in the BUSCO assessment)
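
The identifier format and the BUSCO summary notation described above can be checked mechanically. The following is a minimal Python sketch, not an official UniProt or BUSCO tool: the function names `is_valid_upid` and `parse_busco` are hypothetical, and the regular expressions are assumptions based only on the description ('UP' followed by 9 digits) and on the single summary string shown on this page.

```python
import re

# Assumption: a UPID is exactly 'UP' followed by 9 digits, as described above.
UPID_PATTERN = re.compile(r"UP\d{9}")

# Assumption: the summary follows the layout of the string shown on this page,
# C:..%[S:..%,D:..%],F:..%,M:..%,n:.., optionally followed by a lineage name.
_PCT = r"(\d+(?:\.\d+)?)%"
BUSCO_PATTERN = re.compile(
    rf"C:{_PCT}\[S:{_PCT},D:{_PCT}\],F:{_PCT},M:{_PCT},n:(\d+)"
)


def is_valid_upid(upid: str) -> bool:
    """Return True if the whole string matches the assumed UPID format."""
    return UPID_PATTERN.fullmatch(upid) is not None


def parse_busco(summary: str) -> dict:
    """Extract the percentages and cluster count from a BUSCO summary string."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError(f"unrecognised BUSCO summary: {summary!r}")
    complete, single, duplicated, fragmented, missing, n = match.groups()
    return {
        "complete": float(complete),
        "single": float(single),
        "duplicated": float(duplicated),
        "fragmented": float(fragmented),
        "missing": float(missing),
        "n": int(n),
    }


scores = parse_busco("C:97%[S:95%,D:2%],F:3%,M:0%,n:100 stramenopiles_odb10")

# Consistency checks implied by the notation: complete = single + duplicated,
# and complete + fragmented + missing should account for all clusters.
assert scores["single"] + scores["duplicated"] == scores["complete"]
assert scores["complete"] + scores["fragmented"] + scores["missing"] == 100.0
assert is_valid_upid("UP000001449")
```

For this proteome the two consistency checks hold: 95% single-copy plus 2% duplicated gives the 97% complete figure, and complete, fragmented, and missing together sum to 100%.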

Thalassiosira pseudonana is a marine diatom, a unicellular photosynthetic alga. A hallmark of diatoms is their intricately patterned silicified cell wall. In contrast to higher plants, diatoms acquired their plastids via secondary endosymbiosis, the merging of a eukaryotic host cell with a eukaryotic unicellular photosynthetic alga. The genome, including the mitochondrial and plastid genomes, was published in 2004. The nuclear genome comprises 34 Mb and about 11,000 predicted protein-coding genes. The plastid genome comprises about 0.13 Mb and 144 predicted protein-coding genes. The mitochondrial genome comprises about 0.04 Mb and 40 predicted protein-coding genes.

Components

Genomic components encoding the proteome

Component name              Proteins
Unassembled WGS sequence    1219
Chromosome 7                701
Chromosome 18               303
Chloroplast                 127
Chromosome 2                1020
Chromosome 15               302
Chromosome 17               215
Chromosome 1                1177
Chromosome 9                393
Chromosome 23               152
Chromosome 14               351
Chromosome 22               324
Chromosome 4                912
Chromosome 24               86
Chromosome 8                483
Chromosome 10               386
Chromosome 13               376
Chromosome 6                772
Chromosome 3                985
Chromosome 5                871
Chromosome 12               348
Chromosome 20               256

Publications

  1. "Chloroplast genomes of the diatoms Phaeodactylum tricornutum and Thalassiosira pseudonana: comparison with other plastid genomes of the red lineage."
    Oudot-Le Secq M.-P., Grimwood J., Shapiro H., Armbrust E.V., Bowler C., Green B.R.
    Mol. Genet. Genomics 277:427-439(2007)
  2. "The Phaeodactylum genome reveals the evolutionary history of diatom genomes."
    Bowler C., Allen A.E., Badger J.H., Grimwood J., Jabbari K., Kuo A., Maheswari U., Martens C., Maumus F., Otillar R.P., Rayko E., Salamov A., Vandepoele K., Beszteri B., Gruber A., Heijde M., Katinka M., Mock T., ..., Grigoriev I.V.
    Nature 456:239-244(2008)
