# First, install the pak package:
install.packages("pak")
# Then the packages for both days
pkgs <- c("bonsai", "doParallel", "finetune", "lightgbm", "lme4", "plumber",
          "probably", "ranger", "rpart", "rpart.plot", "stacks", "textrecipes",
          "tidymodels", "tidymodels/modeldatatoo", "vetiver")
pak::pak(pkgs)
Welcome
These are the materials for workshops on tidymodels. This workshop provides an introduction to machine learning with R using the tidymodels framework, a collection of packages for modeling and machine learning using tidyverse principles. We will build, evaluate, compare, and tune predictive models. Along the way, we’ll learn about key concepts in machine learning including overfitting, resampling, and feature engineering. Learners will gain knowledge about good predictive modeling practices, as well as hands-on experience using tidymodels packages like parsnip, rsample, recipes, yardstick, tune, and workflows.
Is this workshop for me?
This course assumes intermediate R knowledge. This workshop is for you if:
- You can use the magrittr pipe %>% and/or the native pipe |>
- You are familiar with functions from dplyr, tidyr, and ggplot2
- You can read data into R, transform and reshape data, and make a wide variety of graphs
We expect participants to have some exposure to basic statistical concepts, but NOT intermediate or expert familiarity with modeling or machine learning.
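As a rough self-check against the list above, the following sketch uses both pipes with a few dplyr verbs on the built-in mtcars data. The object names four_cyl and by_cyl are illustrative and not part of the workshop materials, and the native pipe assumes R 4.1 or later. If code like this reads comfortably, you likely have the R background the workshop assumes.

```r
library(dplyr)

# magrittr pipe (re-exported by dplyr): mean mpg for four-cylinder cars
four_cyl <- mtcars %>%
  filter(cyl == 4) %>%
  summarize(mean_mpg = mean(mpg))

# native pipe (R >= 4.1): mean mpg and row count per cylinder group
by_cyl <- mtcars |>
  group_by(cyl) |>
  summarize(mean_mpg = mean(mpg), n = n())

print(four_cyl)
print(by_cyl)
```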
Preparation
Please join the workshop with a computer that has the following installed (all available for free):
- A recent version of R, available at https://cran.r-project.org/
- A recent version of RStudio Desktop (RStudio Desktop Open Source License, at least v2022.02), available at https://www.rstudio.com/download
- The following R packages, which you can install from the R console:
Slides
These slides are designed for use with live teaching and are published for workshop participants’ convenience. They are not meant as standalone learning materials. For that, we recommend tidymodels.org and Tidy Modeling with R.
Introduction to tidymodels
- 01: Introduction
- 02: Your data budget
- 03: What makes a model?
- 04: Evaluating models
Advanced tidymodels
Extra content (time permitting)
There’s also a page for slide annotations; these are extra notes for selected slides.
Code
Quarto files (version 1.4.104) for working along are available on GitHub. (Don’t worry if you haven’t used Quarto before; it will feel familiar to R Markdown users.)
Past workshops
- July 2022 at rstudio::conf()
- August 2022 in Reykjavik
- July 2023 at the New York R Conference
Acknowledgments
This website, including the slides, is made with Quarto. Please submit an issue on the GitHub repo for this workshop if you find something that could be fixed or improved.
Reuse and licensing
Unless otherwise noted (i.e., not an original creation and reused from another source), these educational materials are licensed under the Creative Commons Attribution-ShareAlike 4.0 license (CC BY-SA 4.0).