
4.3 Consensus sequence
A consensus sequence can be useful in some cases. The EMBOSS program showalign can take a multiple-sequence FastA file and present it in a format similar to the MEGA format, showing only the differences, with an added consensus sequence. Other options can show dissimilarities, similarities, identities, non-identities, etc.
To show dissimilarities:
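A minimal sketch of such a call, assuming the input is spike_filtered_omega.fa (the file name used in the Explore commands of this section) and writing to showalign_dissimilar.txt, a hypothetical output name chosen for illustration. The -show D value asks showalign to display only dissimilarities and draw everything else as dots; the guard simply skips the call when the program or the file is missing.

```shell
# Assumed input: the file named in the Explore commands of this section.
infile=spike_filtered_omega.fa
# Hypothetical output name, chosen for this sketch only.
outfile=showalign_dissimilar.txt

# -show D: display dissimilarities only; other positions appear as dots.
if command -v showalign >/dev/null 2>&1 && [ -f "$infile" ]; then
  showalign -sequence "$infile" -outfile "$outfile" -show D
else
  echo "showalign or $infile not found; nothing was run" >&2
fi
```

Replacing D with S, I, N or A should switch the display to similarities, identities, non-identities or all residues, respectively.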
We can check the result with:
                   10        20        30        40        50        60
           ----:----|----:----|----:----|----:----|----:----|----:----|
QIU81885.1 ............................................................
QIU80913.1 .................................................L..........
QIU81585.1 ............................................................
QIU80973.1 ..........................V.................................
QIS61422.1 ............................................................
Note: Your results may differ, as the number of sequences increases over time, and no differences may be visible in the portion shown.
The consensus sequence appears as the last line of the output, shown here at the very end:
QIJ96493.1 .............
QII57278.1 .............
Consensus  SEPVLKGVKLHYT
The computation of the consensus sequence can be adjusted with the optional parameter plurality (default value 50%) and with other options that affect the similarity calculations based on the chosen "scoring matrix." Further options control the appearance of the output, such as uppercase/lowercase and the ruler.
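To give a feel for what a plurality cut-off does, here is a simplified sketch on a toy alignment. It is not the EMBOSS algorithm: it only counts identical residues per column and ignores the scoring matrix entirely. The file toy_aln.txt, its contents, the consensus function, the "x" placeholder, and the choice of "at least" (rather than "more than") the cut-off are all assumptions made for this illustration.

```shell
# Toy alignment (hypothetical data): four aligned sequences of equal length.
cat > toy_aln.txt <<'EOF'
SEPVLKG
SEPVLKG
SEPILKG
SAPTLRG
EOF

# consensus PLURALITY FILE
# For each column, print the most frequent residue when it occurs in at
# least PLURALITY percent of the sequences; otherwise print "x".
# Simplification: plain counts, no scoring matrix, no sequence weights.
consensus() {
  awk -v plurality="$1" '
    {
      n++
      if (length($0) > width) width = length($0)
      for (i = 1; i <= length($0); i++) {
        c = ++count[i, substr($0, i, 1)]
        # Track the highest count seen so far in this column.
        if (c > best[i]) { best[i] = c; residue[i] = substr($0, i, 1) }
      }
    }
    END {
      for (i = 1; i <= width; i++)
        printf "%s", (100 * best[i] >= plurality * n) ? residue[i] : "x"
      printf "\n"
    }' "$2"
}

consensus 50 toy_aln.txt
consensus 80 toy_aln.txt
```

Raising the cut-off from 50 to 80 turns the columns where the top residue is shared by only 2 or 3 of the 4 sequences into "x", which is the general effect to expect when increasing plurality: fewer positions reach a consensus.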
Other EMBOSS apps can also be used to calculate a consensus or display the alignment in graphical formats.
Explore
Try the EMBOSS apps suggested below, varying the options; the options shown here are examples that avoid manual user input:
prettyplot -boxcol -consensus -cidentity grey -graph png spike_filtered_omega.fa
/usr/lib/emboss/cons spike_filtered_omega.fa -out consensus.fasta
Note: in some installations, cons is not readily available unless the full path is given.
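One way to cope with both kinds of installation is to pick the program location at run time. The sketch below assumes the full path quoted above (/usr/lib/emboss/cons) and falls back to a plain cons on the PATH; the variable names are hypothetical. On systems where another program is also called cons, check that the fallback really is the EMBOSS one before relying on it.

```shell
# Prefer the full path given in this section; otherwise rely on the PATH.
cons_bin=/usr/lib/emboss/cons
[ -x "$cons_bin" ] || cons_bin=cons

# Same input and output names as in the command shown above.
infile=spike_filtered_omega.fa
if command -v "$cons_bin" >/dev/null 2>&1 && [ -f "$infile" ]; then
  "$cons_bin" "$infile" -out consensus.fasta
else
  echo "cons or $infile not found; nothing was run" >&2
fi
```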