2024-06-10

Intermission: data wrangling

NHANES datasets are “curated”: they are created following standard practice and released as tabular data in a format well suited for R.

This section is an “intermission” in the form of a lecture by Garrett Grolemund, Data Scientist and Master Instructor at RStudio, split into four YouTube videos.

All four parts are listed here, but the most important one for working with NHANES data is Part 3, which covers the dplyr package from the tidyverse.
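
As a rough idea of what dplyr offers for this kind of data, here is a minimal sketch. The data frame is a toy one with invented values; only the column names follow NHANES conventions (SEQN, RIAGENDR, RIDAGEYR, BMXBMI), and the coding of RIAGENDR as 1 = male, 2 = female is assumed here for illustration.

```r
library(dplyr)

# Toy data with invented values; not real NHANES records
nhanes_toy <- data.frame(
  SEQN     = 1:6,                               # respondent identifier
  RIAGENDR = c(1, 2, 2, 1, 2, 1),               # assumed coding: 1 = male, 2 = female
  RIDAGEYR = c(34, 51, 17, 68, 45, 29),         # age in years
  BMXBMI   = c(27.1, 31.4, 22.0, 25.3, NA, 24.8) # body mass index
)

adult_bmi <- nhanes_toy %>%
  filter(RIDAGEYR >= 18) %>%                    # keep adults only
  select(SEQN, RIAGENDR, BMXBMI) %>%            # keep the columns of interest
  mutate(sex = if_else(RIAGENDR == 1, "male", "female")) %>%
  group_by(sex) %>%
  summarise(n = n(), mean_bmi = mean(BMXBMI, na.rm = TRUE))

print(adult_bmi)
```

Each verb does one thing (filter rows, select columns, add a column, summarise by group), and the pipe chains them so the code reads from top to bottom.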

Part 1 reviews what was learned in the previous chapter (tidyverse another R universe). Part 2 covers the tidyr package, which helps reshape data; it is a very useful tool but not really necessary for NHANES data.
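
For readers curious about what reshaping with tidyr looks like, here is a minimal sketch. The data frame and its column names (id, sbp_1, sbp_2) are hypothetical and chosen only for illustration; they do not come from NHANES.

```r
library(tidyr)

# Hypothetical "wide" table: one row per person, one column per reading
wide <- data.frame(
  id    = 1:2,
  sbp_1 = c(118, 132),
  sbp_2 = c(120, 130)
)

# Reshape to "long" form: one row per person and reading
long <- pivot_longer(
  wide,
  cols      = c(sbp_1, sbp_2),
  names_to  = "reading",
  values_to = "sbp"
)

print(long)
```

The long table has one row for each combination of person and reading, which is the layout most tidyverse functions expect.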

Description of the RStudio videos:

Data wrangling is too often the most time-consuming part of data science and applied statistics.

Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier.

These videos introduce you to these tools. (Table on next slide.)

Keep your R code clean and clear and reduce the cognitive load required for common but often complex data science tasks.

Part 3 embedded here

HTML version (book or slides) has Part 3 embedded here. (See next slide for time selection)

Part 3 time selection