Welcome

These are the materials for workshops on tidymodels offered at the 2025 New York Data Science & AI Conference. The tidymodels framework is a collection of packages for modeling and machine learning using tidyverse principles. This website hosts the materials for both the Introduction to Machine Learning in R with tidymodels and Getting More Out of Feature Engineering and Tuning for Machine Learning courses.

Introduction to Machine Learning in R with tidymodels will teach you core tidymodels packages and their uses: data splitting/resampling with rsample, model fitting with parsnip, measuring model performance with yardstick, and model optimization using the tune package. Time permitting, you’ll be introduced to basic pre-processing with recipes. You’ll learn tidymodels syntax as well as the process of predictive modeling for tabular data.
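As a rough preview of how those pieces fit together, here is a minimal sketch of a split, fit, and evaluate workflow. The built-in `mtcars` data, the formula, and the seed are illustrative assumptions and are not taken from the workshop materials. Tuning with tune is left out to keep it short.

```r
library(tidymodels)

set.seed(123)

# rsample: hold out a quarter of the rows for testing
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test  <- testing(car_split)

# parsnip: specify a linear regression and fit it on the training set
lm_fit <- linear_reg() |>
  set_engine("lm") |>
  fit(mpg ~ wt + hp, data = car_train)

# yardstick: add predictions to the test set and compute default metrics
car_preds   <- augment(lm_fit, new_data = car_test)
car_metrics <- metrics(car_preds, truth = mpg, estimate = .pred)
car_metrics
```

The same pattern (split, specify, fit, measure) carries over when the model or data set changes; only the parsnip specification and the formula need to be swapped.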

Getting More Out of Feature Engineering and Tuning for Machine Learning will teach you about model optimization using the tune and finetune packages, including racing and iterative methods. You’ll be able to do more sophisticated feature engineering with recipes. Time permitting, model ensembles via stacking will be introduced. This course is focused on the analysis of tabular data and does not include deep learning methods.
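To give a flavor of racing, the sketch below tunes a decision tree with `tune_race_anova()` from finetune, which stops evaluating parameter combinations that are unlikely to win after a few resamples. The `ames` data from modeldata, the predictors, the grid size, and the seed are assumptions chosen for illustration, not the workshop's own example. Racing with the ANOVA method also needs the lme4 package, which is in the install list on this page.

```r
library(tidymodels)
library(finetune)

# Illustrative data set (an assumption, not the workshop's data)
data(ames, package = "modeldata")

set.seed(123)
ames_folds <- vfold_cv(ames, v = 10)

# Mark two decision tree parameters for tuning
tree_spec <- decision_tree(cost_complexity = tune(), min_n = tune()) |>
  set_engine("rpart") |>
  set_mode("regression")

# Race 10 candidate configurations; poor ones are dropped early
race_res <- tune_race_anova(
  tree_spec,
  Sale_Price ~ Gr_Liv_Area + Year_Built + Latitude + Longitude,
  resamples = ames_folds,
  grid = 10,
  metrics = metric_set(rmse)
)

# Best configuration among those that survived the race
best_params <- select_best(race_res, metric = "rmse")
best_params
```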

Is this workshop for me?

Depending on your background, either Introduction to Machine Learning in R with tidymodels or Getting More Out of Feature Engineering and Tuning for Machine Learning may serve you better.

Introduction to Machine Learning in R with tidymodels

This workshop is for you if you:

  • are comfortable using tidyverse packages to read data into R, transform and reshape data, and make a variety of graphs, and
  • have had some exposure to basic statistical concepts such as linear models, residuals, etc.

No intermediate or expert familiarity with modeling or machine learning is required. If you already have that level of experience, the Getting More Out of Feature Engineering and Tuning for Machine Learning workshop may be a better fit.

Getting More Out of Feature Engineering and Tuning for Machine Learning

This workshop is for you if you:

  • have the prerequisite skills listed for the Introduction to Machine Learning in R with tidymodels workshop,
  • have used tidymodels packages like recipes, rsample, and parsnip, and
  • have some experience with evaluating statistical models using resampling techniques like v-fold cross-validation or the bootstrap.

Participants who are new to tidymodels or machine learning will benefit from taking the Introduction to Machine Learning in R with tidymodels workshop before joining this one. Those who have already completed it will be well prepared for this course.
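As a quick self-check for the resampling prerequisite, the following sketch should look familiar. It creates v-fold cross-validation and bootstrap resamples with rsample and evaluates a linear model across the folds. The `mtcars` data, the formula, and the resample counts are illustrative assumptions.

```r
library(tidymodels)

set.seed(123)

# Two common resampling schemes from rsample
folds <- vfold_cv(mtcars, v = 5)
boots <- bootstraps(mtcars, times = 10)

lm_spec <- linear_reg() |>
  set_engine("lm")

# Fit on each analysis set and measure on each assessment set
cv_res <- fit_resamples(lm_spec, mpg ~ wt + hp, resamples = folds)

# Average the per-fold metrics
cv_metrics <- collect_metrics(cv_res)
cv_metrics
```

If each step here makes sense to you, you likely have the background this workshop assumes.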

Preparation

Setup is the same for both workshops. Please join the workshop with a computer that has R and the following packages installed (all available for free):

# Install the packages for the workshop
pkgs <- 
  c("bonsai", "Cubist", "doParallel", "earth", "embed", "finetune",
    "forested", "lightgbm", "lme4", "pak", "parallelly", "plumber", 
    "probably", "ranger", "rpart", "rpart.plot", "rules", "splines2", 
    "stacks", "text2vec", "textrecipes", "tidymodels", "vetiver")

install.packages(pkgs)

If you’re a Windows user and encounter an error message during installation noting a missing Rtools installation, install Rtools using the installer linked here.
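Once installation finishes, one way to confirm that nothing was skipped is to compare the package list against what is installed. This is a small base R sketch; the `missing_pkgs` name is just an illustrative choice.

```r
# Same package list as in the install step
pkgs <-
  c("bonsai", "Cubist", "doParallel", "earth", "embed", "finetune",
    "forested", "lightgbm", "lme4", "pak", "parallelly", "plumber",
    "probably", "ranger", "rpart", "rpart.plot", "rules", "splines2",
    "stacks", "text2vec", "textrecipes", "tidymodels", "vetiver")

# Packages from the list that are not found in your library
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```

If any packages are reported as missing, rerunning `install.packages(missing_pkgs)` targets only those.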

Slides

These slides are designed for use with live teaching and are published for workshop participants’ convenience. They are not meant as standalone learning materials. For that, we recommend tidymodels.org and Tidy Modeling with R.

Introduction to Machine Learning in R with tidymodels

Getting More Out of Feature Engineering and Tuning for Machine Learning

Extra content (time permitting)

There’s also a page for slide annotations; these are extra notes for selected slides.

Code

Quarto files for working along are available on GitHub. (Don’t worry if you haven’t used Quarto before; it will feel familiar to R Markdown users.)

Past workshops

English

Spanish

Acknowledgments

This website, including the slides, is made with Quarto. Please submit an issue on the GitHub repo for this workshop if you find something that could be fixed or improved.

Reuse and licensing

Unless otherwise noted (i.e., material reused from another source rather than created for this workshop), these educational materials are licensed under the Creative Commons Attribution-ShareAlike 4.0 license (CC BY-SA 4.0).