Welcome

These are the materials for workshops on tidymodels presented at the 2022 RStudio conference. This workshop provides an introduction to machine learning with R using the tidymodels framework, a collection of packages for modeling and machine learning using tidyverse principles. We will build, evaluate, compare, and tune predictive models. Along the way, we’ll learn about key concepts in machine learning including overfitting, resampling, and feature engineering. Learners will gain knowledge about good predictive modeling practices, as well as hands-on experience using tidymodels packages like parsnip, rsample, recipes, yardstick, tune, and workflows.

Is this workshop for me?

This course assumes intermediate R knowledge. This workshop is for you if:

  • You can use the magrittr pipe %>% and/or native pipe |>
  • You are familiar with functions from dplyr, tidyr, and ggplot2
  • You can read data into R, transform and reshape data, and make a wide variety of graphs

We expect participants to have some exposure to basic statistical concepts, but NOT intermediate or expert familiarity with modeling or machine learning.

Preparation

Please join the workshop with a computer that has the following installed (all available for free):

install.packages(c("DALEXtra", "doParallel", "embed", "forcats",
                   "lme4", "ranger", "remotes", "rpart", 
                   "rpart.plot", "stacks", "tidymodels",
                   "vetiver", "xgboost"))

remotes::install_github("topepo/ongoal@v0.0.2")

Slides

These slides are designed to use with live teaching and are published for workshop participants’ convenience. There are not meant as standalone learning materials. For that, we recommend tidymodels.org and Tidy Modeling with R.

Day One

Day Two

There’s also a page for slide annotations; these are extra notes for selected slides.

Code

Quarto files for working along are available on GitHub. (Don’t worry if you haven’t used Quarto before; it will feel familiar to R Markdown users.)

Acknowledgments

This website, including the slides, is made with Quarto. Please submit an issue on the GitHub repo for this workshop if you find something that could be fixed or improved.

Reuse and licensing

Unless otherwise noted (i.e. not an original creation and reused from another source), these educational materials are licensed under Creative Commons Attribution CC BY-SA 4.0.