A sequence alignment and analysis of SARS-CoV-2 spike glycoprotein
Last Updated: 2020-06-29
Preamble
In this age of next-generation sequencing and complete genomes, it is easy to forget the “earlier days of sequencing”, when the sequences obtained were much smaller but still covered complete genes or proteins.
Free tools hosted on web servers can be useful. However, this tutorial explores command-line options, which have the advantage of being scriptable and of scaling to a much larger number of files.
Creating this tutorial document became more complex and intricate than first anticipated.
The original aim of creating just a few examples of sequence alignment became a more elaborate project that:
- provides examples of how simple Unix tools can be combined into powerful manipulations without programming
- details methods for sequence retrieval online
- shows examples of 3D structure download without visiting a web site
- helps create automated sequence alignment with command-line tools
In some ways this document represents “how I work”, and hopefully it will be useful to, or perhaps an inspiration for, even “casual” users.