Status: Reference proteome
Proteins: 2,426. Number of protein entries associated with this proteome: UniProtKB entries for regular proteomes or UniParc entries for redundant proteomes.
Gene count: -. This is the total number of unique genes found in the proteome set, algorithmically computed. For each gene, a single representative protein sequence is chosen from the proteome. Where possible, reviewed (Swiss-Prot) protein sequences are chosen as the representatives.
Proteome ID: UP000000554. The proteome identifier (UPID) is the unique identifier assigned to the set of proteins that constitute the proteome. It consists of the characters 'UP' followed by 9 digits, is stable across releases and can therefore be used to cite a UniProt proteome.
Taxonomy: 64091 - Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1)
Strain: ATCC 700922 / JCM 11081 / NRC-1
Last modified: May 24, 2021
Genome assembly and annotation: GCA_000006805.1 from ENA/EMBL, full. This is the identifier for the genome assembly.
Pan proteome: This proteome is part of the Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1) (Halobacterium halobium) pan proteome. A pan proteome is the full set of proteins thought to be expressed by a group of highly related organisms (e.g. multiple strains of the same bacterial species).
BUSCO: C:86.4%[S:86.1%,D:0.3%],F:3.1%,M:10.5%,n:904 halobacteriales_odb10. The Benchmarking Universal Single-Copy Ortholog (BUSCO) assessment tool is used, for eukaryotic and bacterial proteomes, to provide quantitative measures of UniProt proteome data completeness in terms of expected gene content. BUSCO scores include percentages of complete (C) single-copy (S) genes, complete (C) duplicated (D) genes, fragmented (F) and missing (M) genes, as well as the total number of orthologous clusters (n) used in the BUSCO assessment, and the name of the taxonomic lineage dataset used.
Completeness: Standard. Complete Proteome Detector (CPD) is an algorithm which employs statistical evaluation of the completeness and quality of proteomes in UniProt, by looking at the sizes of taxonomically close proteomes. Possible values are 'Standard', 'Close to Standard' and 'Outlier'.
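
The identifier and BUSCO fields above follow fixed textual formats, so they can be checked mechanically. The following is a minimal Python sketch, not an official UniProt or BUSCO parser: the function names `is_valid_upid` and `parse_busco` and both regular expressions are assumptions written for illustration, based only on the format descriptions given on this page ('UP' followed by 9 digits, and the compact C/S/D/F/M/n summary string).

```python
import re

# Assumed pattern: the page states a UPID is 'UP' followed by 9 digits.
UPID_RE = re.compile(r"UP\d{9}")

# Assumed pattern for the compact BUSCO summary string shown on this page.
BUSCO_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def is_valid_upid(identifier: str) -> bool:
    """Return True if the whole string matches 'UP' plus 9 digits."""
    return UPID_RE.fullmatch(identifier) is not None


def parse_busco(summary: str) -> dict:
    """Extract the percentages and cluster count from a BUSCO summary."""
    match = BUSCO_RE.search(summary)
    if match is None:
        raise ValueError(f"not a recognisable BUSCO summary: {summary!r}")
    scores = {key: float(match.group(key)) for key in "CSDFM"}
    scores["n"] = int(match.group("n"))
    return scores


scores = parse_busco(
    "C:86.4%[S:86.1%,D:0.3%],F:3.1%,M:10.5%,n:904 halobacteriales_odb10"
)

# Sanity checks, with a small tolerance because the page rounds each
# percentage to one decimal place:
# complete = single-copy + duplicated, and C + F + M covers all clusters.
complete_is_consistent = abs(scores["S"] + scores["D"] - scores["C"]) < 0.1
total_is_consistent = abs(scores["C"] + scores["F"] + scores["M"] - 100.0) < 0.1

print(is_valid_upid("UP000000554"), scores["n"])
print(complete_is_consistent, total_is_consistent)
```

For the values on this page, both consistency checks hold: the single-copy and duplicated percentages add up to the complete percentage, and the complete, fragmented and missing percentages together account for all 904 clusters.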

Halobacterium salinarum is an aerobic, halophilic chemoorganotroph that grows on the degradation products of less halophilic organisms as the salinity approaches saturation. Halobacterium species have adapted to grow optimally under conditions of extremely high salinity (10 times that of sea water).

Halobacterium salinarum (ATCC 700922 / JCM 11081 / NRC-1) has 1 chromosome and 2 plasmids. The chromosome has a very high GC content of 68%, whereas the plasmids have a lower GC content of 58.8%.

The chromosome of strain R1 is completely colinear with, and virtually identical to, that of strain NRC-1. Besides differences due to insertion elements, there are only 12 other differences: four point mutations, five frameshifts and three insertion/deletion events. More than 350 kb of plasmid sequence can be matched between strains R1 and NRC-1 and is virtually identical at the DNA sequence level. This contrasts sharply with a highly different overall plasmid architecture: the number of plasmids is different, the patterns of the large-scale duplications are highly dissimilar in the two strains, the regions of colinearity are short and all colinearity breakpoints are associated with insertion elements. These differences in plasmid architecture may reflect biological variations among the strains. Alternatively, the excessive duplication may have resulted in sequence assembly errors.

Despite the near identity of the DNA sequences of strains R1 and NRC-1, major differences in the protein-coding set have been found. There are 111 CDS that have not been annotated for strain NRC-1. A total of 2375 CDS map to each other in the two strains, among which 475 differ, mainly because of alternative start codon selection. This illustrates the difficulty of correct ORF prediction in GC-rich genomes. Based on several lines of evidence, it appears that strains R1 and NRC-1 do not represent independent strains but very probably originate from the same cultivation event of a natural isolate. In this view, the differences between the two strains originate from evolution in the laboratory.
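
To make the two quantities in this description concrete, here is a small Python sketch under stated assumptions. The function `gc_content` and the toy sequence are hypothetical illustrations of how a GC percentage is usually computed (G plus C as a share of all A, C, G and T bases); they are not the method used to obtain the 68% and 58.8% figures, and the toy sequence is not taken from the NRC-1 genome. The second part only restates the CDS counts quoted above as a percentage.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among the A/C/G/T bases of a sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Hypothetical 10-base toy sequence, for illustration only.
toy_sequence = "GCGCGATCGC"
toy_gc = gc_content(toy_sequence)

# Figures quoted in the text: 2375 CDS map between R1 and NRC-1, 475 differ.
mapped_cds = 2375
differing_cds = 475
differing_pct = 100 * differing_cds / mapped_cds

print(f"GC content of toy sequence: {toy_gc:.1f}%")
print(f"Mapped CDS that differ: {differing_pct:.1f}%")
```

By the counts quoted above, one in five of the mapped CDS differs between the two annotations.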

Components (genomic components encoding the proteome)

Component name: Proteins
Plasmid pNRC100: 138
Plasmid pNRC200: 261
UniProt is an ELIXIR core data resource
Main funding by: National Institutes of Health
