This is a summary of notes during the presentation of guest speaker Dr Kumaran Baskaran, author of the RBMRB package
Note: the interactive plots are only visible on the HTML version of this document. PDF and Docx versions will not show them.
RBMRB is an R package to access data from the Biological Magnetic Resonance Data Bank (BMRB), a repository for data from NMR spectroscopy on proteins, peptides, nucleic acids, and other biomolecules.
This short document is a set of notes from a “hands-on” session; more complete “how to” guides and descriptions can be found in the RBMRB documentation1.
The RBMRB package can be installed from the R command line with:
install.packages('RBMRB')
or from the Packages tab within **[RStudio](https://www.rstudio.com/products/rstudio/download/)**.
Note: There are many dependencies that will be installed automatically at the same time.
Before it can be used, the package needs to be loaded within R with the command:
library(RBMRB)
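In a script that is run repeatedly, it can be convenient to install a package only when it is missing. The following is a minimal base R sketch of that idea; the helper name `ensure_installed` is hypothetical (it is not part of RBMRB), and it assumes the package is available from the repository used by `install.packages()`, as the command above implies.

```r
# Hypothetical helper: install a package only if it is not already available.
ensure_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package can now be loaded.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Usage (commented out here because it may trigger a download):
# ensure_installed("RBMRB")
# library(RBMRB)
```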
The RBMRB package contains commands to access data directly from the online database of the Biological Magnetic Resonance Data Bank (BMRB).
At the top of the BMRB web page is a search field for looking up the accession codes of compounds of interest.
The database contains only chemical shift data, while 3D coordinates are available from the Protein Data Bank[^2].
Therefore, with the RBMRB package we will only be interested in chemical shift metadata. These can be fetched by accession code (see above). For example, here we get data for entry 15060 (NMR structure of the murine DLC2 (deleted in liver cancer-2) SAM (sterile alpha motif) domain).
We’ll save the chemical shift data in the R object cs:
cs <- fetch_entry_chemical_shifts(15060)
You can explore the data table with these commands:
To see the data in a spreadsheet-like format:
View(cs)
To see the first 6 lines of data:
head(cs)
You will see that the data contain 24 variables (columns). If you explore further, you may also find that there are 974 observations (rows). This can be confirmed by checking the dimensions:
dim(cs)
[1] 974 24
We are mostly interested in the column Val, which contains the value of the chemical shift. This is the 11th column in the data. We could access this column simply by using the command:
cs$Val
which would display all values within that column on the screen.
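A few other base R commands are useful for the same kind of exploration. The sketch below runs on a small made-up data frame standing in for cs, so that it works offline; the values are invented, and the column names Comp_ID and Atom_ID are assumed to match those of the real table (only Val is confirmed by these notes).

```r
# Hypothetical stand-in for a few rows of `cs`; values are invented for illustration.
cs_demo <- data.frame(
  Comp_ID = c("MET", "MET", "ALA", "ALA"),
  Atom_ID = c("H", "N", "H", "N"),
  Val     = c(8.31, 121.4, 8.12, 124.9),
  stringsAsFactors = FALSE
)

str(cs_demo)                          # column names and types
names(cs_demo)                        # column names only
val_pos <- which(names(cs_demo) == "Val")
val_pos                               # position of the Val column in this toy table
summary(cs_demo$Val)                  # range and quartiles of the shift values

# Keep only the rows for nitrogen atoms.
n_rows <- cs_demo[cs_demo$Atom_ID == "N", ]
n_rows
```

On the real object, the same commands would be applied to cs instead of cs_demo.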
The following command is meant to simplify the data by keeping only the relevant columns. The command convert_cs_to_n15hsqc is part of the RBMRB package. See ?convert_cs_to_n15hsqc for help.
spect_data <- convert_cs_to_n15hsqc(cs)
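Conceptually, a 1H-15N HSQC peak list pairs the amide proton shift and the amide nitrogen shift of each residue. The base R sketch below illustrates that idea on a made-up table. It is not the implementation used by convert_cs_to_n15hsqc, and the input column names other than Val are assumptions.

```r
# Toy long-format table: one row per atom (values invented for illustration).
long <- data.frame(
  Comp_index_ID = c(1, 1, 2, 2, 3),
  Comp_ID       = c("MET", "MET", "ALA", "ALA", "PRO"),
  Atom_ID       = c("H", "N", "H", "N", "N"),
  Val           = c(8.31, 121.4, 8.12, 124.9, 135.0),
  stringsAsFactors = FALSE
)

# Split into proton and nitrogen rows, then rename Val to the nucleus name.
h <- long[long$Atom_ID == "H", c("Comp_index_ID", "Comp_ID", "Val")]
n <- long[long$Atom_ID == "N", c("Comp_index_ID", "Comp_ID", "Val")]
names(h)[3] <- "H"
names(n)[3] <- "N"

# merge() keeps only residues that have both shifts, so the proline row,
# which has no amide proton, drops out.
peaks <- merge(h, n, by = c("Comp_index_ID", "Comp_ID"))
peaks
```

The result has one row per residue with an H and an N column, which is the shape used in the plotting commands that follow.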
The usual “base R” plotting commands are available once we choose the columns we need. For example:
plot(spect_data$H, spect_data$N)
This could also be written as:
with(spect_data, plot(H,N))
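NMR spectra are conventionally drawn with reversed axes, so that the chemical shift decreases from left to right. A minimal sketch of how base R can reproduce that orientation is shown here, using a small made-up data frame as a stand-in for spect_data (only the H and N columns are assumed).

```r
# Stand-in for spect_data with invented values; only the H and N columns are used.
demo <- data.frame(H = c(8.31, 8.12, 7.65, 9.02),
                   N = c(121.4, 124.9, 110.2, 128.7))

# Reversing the limits flips each axis: the largest ppm value ends up
# at the left (1H) and at the bottom (15N).
x_limits <- rev(range(demo$H))
y_limits <- rev(range(demo$N))

plot(demo$H, demo$N,
     xlim = x_limits, ylim = y_limits,
     xlab = "1H (ppm)", ylab = "15N (ppm)",
     pch = 19)
```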
The package contains built-in graphical functions that take advantage of modern plotting packages installed as dependencies, such as ggplot2 and plotly, to provide interactive plots. This is one reason why a plot is first assigned to an R object, here plt.
plt <- HSQC_15N(15060)
plt
See Wikipedia definition of TOCSY2.
This plot connects all hydrogen atoms (protons) of one residue to help identify amino acids.
plt2 <- TOCSY(15060)
plt2
However this plot is very busy and not so useful here.
In BMRB, assignments have already been done and can be used for comparison.
plt3 <- HSQC_15N(c(11505, 11547))
plt3
Each entry will have a different color.
The plot can be modified by adding a connecting line between related residues:
plt3 <- HSQC_15N(c(11505, 11547), 'line')
plt3
Important note: peaks that are far apart on the plot differ in frequency (chemical shift), NOT in spatial distance! One example of where this can happen is proximity to water.
plt4 <- HSQC_15N(18857)
plt4
Modify to connect by line to see the changes:
plt4 <- HSQC_15N(18857, 'line')
plt4
The command HSQC_15N simulates H1-N15 HSQC spectra for a given entry or list of entries from BMRB. More information is available in the help:
?HSQC_15N
One example within the help page can be used to try to find ‘active sites’ by interaction with ligand(s).
plot_hsqc<-HSQC_15N(c(17074,17076,17077),'line')
plot_hsqc
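When comparing entries with and without a ligand, peak movements are often summarised per residue as a combined chemical shift perturbation. One commonly used form is sqrt(dH^2 + (alpha * dN)^2), where alpha down-weights the nitrogen difference. The value of alpha is a convention (values around 0.14 to 0.2 are typical) and is not defined by RBMRB. The sketch below uses invented numbers, and the helper csp is hypothetical.

```r
# Hypothetical helper: combined 1H/15N chemical shift perturbation per residue.
# alpha scales the 15N difference; 0.14 is one commonly used choice, assumed here.
csp <- function(h_free, n_free, h_bound, n_bound, alpha = 0.14) {
  sqrt((h_bound - h_free)^2 + (alpha * (n_bound - n_free))^2)
}

# Invented shifts for three residues in two states.
free  <- data.frame(res = c(10, 11, 12),
                    H = c(8.30, 8.10, 7.60),
                    N = c(121.0, 125.0, 110.0))
bound <- data.frame(res = c(10, 11, 12),
                    H = c(8.30, 8.40, 7.62),
                    N = c(121.0, 126.0, 110.1))

# Align the two states residue by residue, then compute the perturbation.
both <- merge(free, bound, by = "res", suffixes = c("_free", "_bound"))
both$csp <- csp(both$H_free, both$N_free, both$H_bound, both$N_bound)

# Residues sorted by how far their peak moved; the largest values are the candidates.
both[order(-both$csp), c("res", "csp")]
```

Residues with the largest values are the ones whose peaks moved the most on the overlaid plot.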
The plot can be done in 2 ways:
Note: many of the commands download ALL the data from the database and may require time and lots of computer RAM!
Example: all CB atoms for all ALA residues in the whole database:
cs_hist <- chemical_shift_hist('ALA','CB')
cs_hist
A similar command will apply to all atoms of ALA residues by replacing CB with *.
cs_hist <- chemical_shift_hist('ALA','*')
cs_hist
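Because such calls can be slow and memory-hungry, it may help to measure a smaller call before scaling up. The sketch below uses base R's system.time() and object.size(). The function slow_fetch is a hypothetical stand-in that only simulates a large download, so the example runs offline.

```r
# Hypothetical stand-in for a database call; it just builds a large data frame.
slow_fetch <- function(n = 1e6) {
  data.frame(Val = rnorm(n))
}

# system.time() reports how long the expression takes to evaluate.
timing <- system.time(result <- slow_fetch())
timing["elapsed"]

# object.size() reports the memory used by the result.
size_bytes <- as.numeric(object.size(result))
format(object.size(result), units = "MB")

# Free the memory once the object is no longer needed.
rm(result)
invisible(gc())
```

With RBMRB loaded, slow_fetch() would be replaced by the real call, for example fetch_entry_chemical_shifts(15060).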