
Chapter 4 Sequence alignment
A simple web search would easily turn up many multiple sequence alignment (MSA) web servers. Our purpose here, however, is to find ways to “do the analysis again” in a “reproducible way” and “without manual input” for all the analysis steps, ideally so that they can be placed within a script.
The spike protein sequences in the file spike_filtered.fa differ from one another by only one to a handful of amino acids, so most MSA programs should have no difficulty aligning them.
We’ll use one of the more recent MSA programs, Clustal Omega (Sievers et al. 2011), in its command-line version.
Clustal Omega can be downloaded and installed locally, but it is also available as a docker container, which avoids the installation process altogether. (See the Introduction, Chapter 1, for suggested material on learning how to use docker.)
Whether you use a docker container or a locally installed version, the commands should remain the same. Here we’ll start a container with access to the local directory.
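To make the point about identical commands concrete, here is a minimal sketch of a script helper that builds the same Clustal Omega command either for a local installation or wrapped in `docker run`. The image name `example/clustal-omega`, the output file name `spike_filtered.aln`, and the `/data` mount point are assumptions made for illustration, not values from this chapter; the sketch also assumes the image has `clustalo` on its path and no custom entrypoint. The options `-i`, `-o`, `--outfmt` and `--force` are standard Clustal Omega options.

```python
import shlex
from pathlib import Path


def clustalo_command(infile, outfile, image=None, workdir="."):
    """Return the argument list for a Clustal Omega run.

    If `image` is given, the command is wrapped in `docker run` and
    `workdir` is mounted at /data inside the container (assumed layout).
    """
    # -i: input FASTA, -o: output alignment, --outfmt=clu: Clustal format,
    # --force: overwrite the output file if it already exists.
    clustalo = ["clustalo", "-i", infile, "-o", outfile,
                "--outfmt=clu", "--force"]
    if image is None:
        return clustalo

    # Mount the local directory so the container can read and write files there.
    mount = f"{Path(workdir).resolve()}:/data"
    docker = ["docker", "run", "--rm", "-v", mount, "-w", "/data", image]
    return docker + clustalo


if __name__ == "__main__":
    # Hypothetical image and output names, for illustration only.
    cmd = clustalo_command("spike_filtered.fa", "spike_filtered.aln",
                           image="example/clustal-omega")
    # Print the command so it can be inspected before anything is run.
    print(shlex.join(cmd))
```

The helper only assembles and prints the command. Passing the returned list to `subprocess.run(cmd, check=True)` would execute it, and leaving out the `image` argument gives the equivalent command for a locally installed version.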
References
Sievers, F., A. Wilm, D. Dineen, T. J. Gibson, K. Karplus, W. Li, R. Lopez, et al. 2011. “Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega.” Mol. Syst. Biol. 7 (October): 539.