In this one-day hands-on workshop, RStudio certified instructor Jeroen Janssens will walk you through the so-called tidyverse to transform data. The tidyverse is an ecosystem of R packages that share an underlying design philosophy, grammar, and data structures.
We’ll start at the beginning, with importing CSV data using readr and spreadsheets using readxl. We’ll cover the most important functions from tidyr for generic data wrangling and cleaning. We’ll also look at dealing with dates, factors, and textual data specifically using the packages lubridate, forcats, and stringr, respectively. Note that this workshop does not cover ggplot2; for that we recommend our one-day workshop Data Visualisation with R and ggplot2.
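To give a flavour of the import and type-specific steps described above, here is a minimal sketch. It is not taken from the workshop materials: the inline CSV, the `customers` table, and its column names are made up for illustration, and it assumes readr 2.0 or later (for the `I()` literal-data syntax).

```r
library(readr)
library(lubridate)
library(forcats)
library(stringr)

# Hypothetical inline CSV; in practice you would pass a file path to read_csv().
csv_text <- "name,signup,plan
alice,2021-03-01,basic
BOB,2021-04-15,pro
carol,2021-04-20,basic"

# Read every column as text first, so each conversion below is explicit.
customers <- read_csv(I(csv_text), col_types = cols(.default = col_character()))

customers$signup <- ymd(customers$signup)         # lubridate: text -> Date
customers$plan   <- fct_infreq(customers$plan)    # forcats: levels ordered by frequency
customers$name   <- str_to_title(customers$name)  # stringr: consistent capitalisation

# Date components become easy to extract once the column is a proper Date.
signup_month <- month(customers$signup)
```

Reading a spreadsheet follows the same pattern with `readxl::read_excel()` and a path to an `.xlsx` file.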
By the end of this workshop, you’ll have a good understanding of the tidyverse ecosystem and you’ll be able to apply many of its packages to your own data.
- The concept of tidy data
- Filtering rows
- Selecting columns
- Replacing values
- Handling missing values
- Cleaning column names
- Making groups
- Computing summary statistics
- Dealing with dates, factors, and textual data
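Several of the topics above can be combined in a single pipeline. The following is a rough sketch using dplyr and tidyr, not an excerpt from the workshop: the `sales` table is invented, and treating `-1` as a placeholder for a missing value is an assumption made only for this example.

```r
library(dplyr)
library(tidyr)

# Made-up data, for illustration only.
sales <- tibble(
  Region  = c("north", "north", "south", "south", "south"),
  Product = c("a", "b", "a", "b", "a"),
  Units   = c(10, NA, 7, 3, -1)
)

summary_tbl <- sales %>%
  rename_with(tolower) %>%              # cleaning column names
  mutate(units = na_if(units, -1)) %>%  # replacing values: -1 becomes NA
  replace_na(list(units = 0)) %>%       # handling missing values: NA becomes 0
  filter(units > 0) %>%                 # filtering rows
  select(region, units) %>%             # selecting columns
  group_by(region) %>%                  # making groups
  summarise(                            # computing summary statistics
    total_units = sum(units),
    n_sales     = n()
  )
```

The `%>%` pipe is used here rather than the native `|>` so that the sketch also runs on R 4.0, the minimum version listed for the workshop.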
You’re expected to have some experience with programming in R and RStudio. Our workshop Introduction to Programming in R is one option that can help you with that.
Participants are kindly requested to have the following items installed prior to the start of the workshop:
- R version 4.0 or later
- RStudio v1.3 or later
- The latest version of the tidyverse, by running:
install.packages("tidyverse", dependencies = TRUE)
About your instructor
Jeroen is an RStudio Certified Instructor who enjoys visualizing data, building machine learning models, and automating things using either Python, R, or Bash. Previously, he was an assistant professor at Jheronimus Academy of Data Science and a data scientist at Elsevier in Amsterdam and various startups in New York City. He is the author of Data Science at the Command Line. Jeroen holds a PhD in machine learning from Tilburg University and an MSc in artificial intelligence from Maastricht University.
Photos and testimonials
“Data Science Workshops organised for KPN a ten-week course on Data Science with R. The combination of training, on-site coaching, and remote support ensured that our analysts are applying the new knowledge and skills in their daily projects. For instance, they’re now capable to implement complex predictive models using R. We’re looking forward to the follow-up course on Advanced Machine Learning.”