Last modified September 9, 2013
This subsection of the ‘Sequence annotation (Features)’ section reports differences between the canonical sequence (displayed by default in the entry) and the different sequence submissions merged in the entry. These various submissions may originate from different sequencing projects, different types of experiments, or different biological samples.
For each conflict, the ‘Sequence conflict’ subsection indicates the position at which a submitted sequence differs from the canonical sequence, the resulting change in the protein sequence, and the reference for the submission that reports the differing sequence.
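As an illustration only, the three pieces of information listed above (position, change, reference) can be sketched in a few lines of Python. This is not how the subsection is produced: the `Conflict` class, the `find_conflicts` function, the toy sequences and the "Ref. 1" label are all hypothetical, and the sketch assumes the submitted sequence is already aligned to the canonical one and has the same length.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Conflict:
    """One difference between a submitted and the canonical sequence (hypothetical type)."""
    position: int   # 1-based position in the canonical sequence
    canonical: str  # residue found in the canonical sequence
    submitted: str  # residue found in the submitted sequence
    reference: str  # label of the submission reporting the difference


def find_conflicts(canonical: str, submitted: str, reference: str) -> List[Conflict]:
    """Compare two pre-aligned sequences residue by residue (assumption: equal length)."""
    if len(canonical) != len(submitted):
        raise ValueError("this sketch only handles equal-length, pre-aligned sequences")
    return [
        Conflict(position, c, s, reference)
        for position, (c, s) in enumerate(zip(canonical, submitted), start=1)
        if c != s
    ]


# Toy data, invented for the example.
canonical_seq = "MKTAYIAKQR"
submitted_seq = "MKTAYIAQQR"

for conflict in find_conflicts(canonical_seq, submitted_seq, "Ref. 1"):
    print(f"{conflict.position}: {conflict.canonical} -> {conflict.submitted} "
          f"(in {conflict.reference})")
# prints "8: K -> Q (in Ref. 1)"
```

Differences that change the sequence length, such as insertions or deletions, would require a proper alignment step that this sketch leaves out.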
We usually define the canonical sequence as the most frequent protein sequence or the most conserved in orthologous species. For organisms whose genome has been fully sequenced, we generally define the canonical protein sequence as the one derived from the translation of the genomic sequence. All sequence conflicts are described with reference to this canonical sequence.
The sequence differences described in this subsection complement those in the ‘Sequence caution’ subsection of the ‘General annotation (Comments)’ section, which describes substantial sequence differences, such as errors in translation, frameshifts or erroneous gene model predictions.
Sequence conflicts are usually of unknown origin: they may result from sequencing errors or may represent an as yet uncharacterized natural polymorphism (single amino acid polymorphism, SAP). If a sequence conflict turns out to be an actual SAP, the annotation is updated: the variation is deleted from this subsection and reported under ‘Natural variations’ instead.