Last modified December 6, 2011
This subsection of the ‘Sequence annotation (Features)’ section denotes the presence of an N-terminal signal peptide.
Signal peptides are found in proteins that are targeted to the endoplasmic reticulum and are eventually destined to be secreted (extracellular, periplasmic, etc.), to be retained in the lumen of the endoplasmic reticulum, of the lysosome or of any other organelle along the secretory pathway, or to become type I single-pass membrane proteins.
The signal sequence is usually removed in the mature protein; in these cases, the comment ‘The displayed sequence is further processed into a mature form’ is added in the ‘Sequence processing’ subsection of the ‘Protein attributes’ section.
1. Annotation of experimentally proven signal peptides
We annotate experimentally proven signal peptides when the cleavage site has been determined by direct protein sequencing.
This information can then be propagated ‘By similarity’ to closely related species, provided that the signal sequence is conserved.
When it is clear that a protein is cleaved (according to experimental data or its similarity with a family of proteins), but direct protein sequencing has not been used to determine the precise cleavage position, we use a question mark instead of a position.
In some rare cases, signal sequences are not cleaved. This is indicated by the note ‘Not cleaved’ in the feature description.
2. Annotation of predicted signal peptides
We annotate signal peptides that are predicted by the tools Phobius, Predotar, SignalP and TargetP. At least two methods must return a positive signal peptide prediction for it to be annotated in UniProtKB. When a predicted N-terminal signal peptide and a predicted transmembrane region overlap, the prediction returned by Phobius is used to discriminate between the two possibilities.
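The consensus rule described above can be sketched in a few lines of Python. This is only an illustration, not the UniProtKB pipeline: the function name `call_signal_peptide`, the dictionary input and the `overlaps_transmembrane` flag are hypothetical, and the sketch assumes that the two-method threshold still applies when Phobius is used as the tie-breaker, which the text does not state explicitly.

```python
def call_signal_peptide(predictions, overlaps_transmembrane=False):
    """Decide whether a predicted signal peptide would be annotated.

    predictions: dict mapping a tool name ("Phobius", "Predotar",
    "SignalP", "TargetP") to True if that tool predicts a signal peptide.
    overlaps_transmembrane: True if the predicted signal peptide overlaps
    a predicted N-terminal transmembrane region (hypothetical flag).
    """
    positives = [tool for tool, hit in predictions.items() if hit]

    # At least two methods must return a positive prediction.
    if len(positives) < 2:
        return False

    # On overlap with a transmembrane region, defer to Phobius
    # (assumption: the two-method threshold above still applies).
    if overlaps_transmembrane:
        return bool(predictions.get("Phobius", False))

    return True
```

For example, a protein called positive by SignalP and TargetP only would be accepted when there is no overlap, but rejected if the predicted peptide overlaps a transmembrane region and Phobius does not support it.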
3. Annotation of Tat signal sequences in bacteria and plants
This subsection is also used for the annotation of proteins with a Tat (twin-arginine translocation) signal sequence, which serves to translocate folded proteins in bacteria and plants. Substrate proteins are directed to the Tat apparatus by distinctive N-terminal signal peptides containing a consensus SRRxFLK ‘twin-arginine’ motif. The related comment ‘Predicted to be exported by the Tat system’ can be found in the ‘General annotation (comments)’ section, in the ‘Post-translational modification’ subsection.
Predicted Tat signal sequences are identified using the TIGRFAM HMM TIGR01409. In all cases predicted cleavage sites are annotated as ‘Potential’.
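To make the SRRxFLK consensus concrete, the following minimal Python sketch scans the N-terminal part of a sequence for a literal match. It is not the HMM-based method named above: a regular expression is a much cruder stand-in for TIGR01409 and will miss sequences that deviate from the strict consensus. The function name `find_tat_motif` and the 40-residue window are assumptions made for illustration only.

```python
import re

# Literal reading of the SRRxFLK consensus, where x is any residue.
TAT_MOTIF = re.compile(r"SRR.FLK")


def find_tat_motif(sequence, n_terminal_window=40):
    """Return the 1-based start of the first SRRxFLK match, or None.

    Only the first n_terminal_window residues are searched; the window
    size is an arbitrary illustrative choice, not a UniProtKB rule.
    """
    match = TAT_MOTIF.search(sequence[:n_terminal_window].upper())
    return match.start() + 1 if match else None
```

With the made-up sequence `MSDKLSRRQFLKAAGA`, the motif `SRRQFLK` starts at residue 6, so `find_tat_motif` returns 6; a sequence without the motif returns `None`.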
Related keyword: Signal
See also: Non-experimental qualifiers