Machine learning with tidymodels

Welcome!

Wi-Fi network name

TODO-ADD-LATER

Wi-Fi password

TODO-ADD-LATER

Venue information

Gender-neutral bathrooms are located on levels 3, 4, 5, 6, and 7
A meditation/prayer room is located in 503 (Mon & Tue 7am - 7pm, Wed 7am - 5pm)
A lactation room is located in 509 (Mon & Tue 7am - 7pm, Wed 7am - 5pm)

Workshop policies

Please review the posit::conf code of conduct, which applies to all workshops: https://posit.co/code-of-conduct
The code of conduct site explains how to report a problem (in person, by email, or by phone)
Please do not photograph people wearing red lanyards

Who are you?

You can use the magrittr %>% or base R |> pipe
You are familiar with functions from dplyr, tidyr, ggplot2
You have some exposure to basic statistical concepts like linear models and residuals
You do not need intermediate or expert familiarity with modeling or ML
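
As a rough self-check on these prerequisites, the sketch below uses only base R and the built-in mtcars data (chosen here for illustration; it is not a workshop dataset). The base pipe requires R 4.1 or later.

```r
# Base R pipe: keep the four-cylinder cars, then count them
n_four_cyl <- mtcars |>
  subset(cyl == 4) |>
  nrow()
n_four_cyl
#> [1] 11

# A simple linear model and its residuals (observed minus fitted values)
fit <- lm(mpg ~ wt, data = mtcars)
head(residuals(fit), 3)
```

If both pieces read comfortably, the material in this workshop should be approachable.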

Who are tidymodels?

Simon Couch
Hannah Frick
Emil Hvitfeldt
Max Kuhn

+ our TA today, Sara Altman!

Many thanks to Davis Vaughan, Julia Silge, David Robinson, Julie Jung, Alison Hill, and Desirée De Leon for their role in creating these materials!

Asking for help

🟪 “I’m stuck and need help!”

🟩 “I finished the exercise”

Discord

pos.it/conf-event-portal (login)
Click on “Join Discord, the virtual networking platform!”
Browse Channels -> #workshop-tidymodels-intro

Plan for this workshop

Your data budget
What makes a model
Evaluating models
Tuning models

Introduce yourself to your neighbors 👋

Log in to Posit Cloud (free): TODO-ADD-LATER

What is machine learning?

What is machine learning? (2024 edition)

Your turn

How are statistics and machine learning related?

How are they similar? Different?

What is tidymodels?

library(tidymodels)
#> ── Attaching packages ──────────────────────────── tidymodels 1.2.0 ──
#> ✔ broom        1.0.6     ✔ rsample      1.2.1
#> ✔ dials        1.3.0     ✔ tibble       3.2.1
#> ✔ dplyr        1.1.4     ✔ tidyr        1.3.1
#> ✔ infer        1.0.7     ✔ tune         1.2.1
#> ✔ modeldata    1.4.0     ✔ workflows    1.1.4
#> ✔ parsnip      1.2.1     ✔ workflowsets 1.1.0
#> ✔ purrr        1.0.2     ✔ yardstick    1.3.1
#> ✔ recipes      1.1.0
#> ── Conflicts ─────────────────────────────── tidymodels_conflicts() ──
#> ✖ purrr::discard() masks scales::discard()
#> ✖ dplyr::filter()  masks stats::filter()
#> ✖ dplyr::lag()     masks stats::lag()
#> ✖ recipes::step()  masks stats::step()
#> • Use tidymodels_prefer() to resolve common conflicts.
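
The last line of the startup message points to tidymodels_prefer(). A minimal sketch of how it might sit at the top of a script, assuming the tidymodels meta-package is installed; mtcars is used only as a convenient built-in dataset.

```r
library(tidymodels)

# Prefer the tidymodels/tidyverse versions of clashing function names,
# e.g. dplyr::filter() over stats::filter()
tidymodels_prefer()

# filter() resolves to dplyr::filter() here
four_cyl <- filter(mtcars, cyl == 4)
nrow(four_cyl)
```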

The whole game

Roadmap for today
Minimal version of predictive modeling process
Feature engineering and tuning as iterative extensions
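
One way to picture the minimal version of the process is the sketch below. It is an illustration under stated assumptions, not the workshop's own analysis: it uses the built-in mtcars data and an ordinary linear regression, both chosen here only because they need no extra setup.

```r
library(tidymodels)

set.seed(123)

# 1. Spend the data budget: hold out part of the data for testing
car_split <- initial_split(mtcars, prop = 0.8)
car_train <- training(car_split)
car_test  <- testing(car_split)

# 2. Specify and fit a model on the training set only
car_fit <- linear_reg() |>
  fit(mpg ~ wt + hp, data = car_train)

# 3. Predict on the held-out data and measure performance
car_preds <- augment(car_fit, new_data = car_test)
car_rmse  <- rmse(car_preds, truth = mpg, estimate = .pred)
car_rmse
```

Feature engineering and tuning would slot in around step 2, with the test set still touched only once at the end.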

Let’s install some packages

If you are using your own laptop instead of Posit Cloud:

# Install the packages for the workshop
pkgs <- 
  c("bonsai", "Cubist", "doParallel", "earth", "embed", "finetune", 
    "forested", "lightgbm", "lme4", "parallelly", "plumber", "probably", 
    "ranger", "rpart", "rpart.plot", "rules", "splines2", "stacks", 
    "text2vec", "textrecipes", "tidymodels", "vetiver")

install.packages(pkgs)
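
If some of these packages are already installed, a small check like the following sketch lists only the ones still missing. The variable name missing_pkgs is ours, and the install call is left commented out so that nothing is installed by accident.

```r
pkgs <-
  c("bonsai", "Cubist", "doParallel", "earth", "embed", "finetune",
    "forested", "lightgbm", "lme4", "parallelly", "plumber", "probably",
    "ranger", "rpart", "rpart.plot", "rules", "splines2", "stacks",
    "text2vec", "textrecipes", "tidymodels", "vetiver")

# Workshop packages that are not yet in the local library
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install them
} else {
  message("All workshop packages are installed.")
}
```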

Our versions

R version 4.4.1 (2024-06-14), Quarto (1.6.1)

package	version
bonsai	0.3.1
broom	1.0.6
Cubist	0.4.4
dials	1.3.0
doParallel	1.0.17
dplyr	1.1.4
earth	5.3.3
embed	1.1.4
finetune	1.2.0
forested	0.1.0
Formula	1.2-5
ggplot2	3.5.1
lattice	0.22-6
lightgbm	4.3.0
lme4	1.1-35.5
modeldata	1.4.0
parallelly	1.38.0
parsnip	1.2.1
plotmo	3.6.3
plotrix	3.8-4
plumber	1.2.2
probably	1.0.3
purrr	1.0.2
ranger	0.16.0
recipes	1.1.0
rpart	4.1.23
rpart.plot	3.1.2
rsample	1.2.1
rules	1.0.2
scales	1.3.0
splines2	0.5.2
stacks	1.0.4
text2vec	0.6.4
textrecipes	1.0.6
tibble	3.2.1
tidymodels	1.2.0
tidyr	1.3.1
tune	1.2.1
vetiver	0.2.5
workflows	1.1.4
workflowsets	1.1.0
yardstick	1.3.1
