Machine learning with tidymodels

How do you fit a linear model in R?
How many different ways can you think of?
lm for linear model
glm for generalized linear model (e.g. logistic regression)
glmnet for regularized regression
keras for regression using TensorFlow
stan for Bayesian regression
spark for large data sets
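As a small illustration of how two of these interfaces relate, the sketch below fits the same linear model with lm() and glm() from base R. The built-in mtcars data and the formula mpg ~ wt + hp are stand-ins chosen for this example, not part of the workshop data.

```r
# Two base R interfaces to the same ordinary least squares fit.
# mtcars and the formula are illustrative assumptions.
fit_lm <- lm(mpg ~ wt + hp, data = mtcars)

# glm() with a gaussian family and identity link describes the same model
fit_glm <- glm(mpg ~ wt + hp, data = mtcars, family = gaussian())

# The estimated coefficients agree up to numerical tolerance
all.equal(coef(fit_lm), coef(fit_glm))
```

The other tools in the list each have their own function names and argument conventions.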



All available models are listed at https://www.tidymodels.org/find/parsnip/
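A minimal sketch of a parsnip model specification, assuming the tidymodels packages are installed. The mtcars data and the formula are illustrative choices, not the workshop data.

```r
library(tidymodels)

# Choose a model type, an engine, and a mode.
# "lm" and "regression" are also the defaults for linear_reg().
lm_spec <-
  linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")

# mtcars and the formula are illustrative assumptions
lm_fit <-
  lm_spec %>%
  fit(mpg ~ wt + hp, data = mtcars)

# The underlying engine object is an ordinary lm fit
coef(extract_fit_engine(lm_fit))
```

In many cases, switching to another engine from the page above only means changing the set_engine() line, provided the engine's package is installed and any arguments it requires are supplied.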


Run the tree_spec chunk in your .qmd.
Edit this code so it creates a different model.
\(\mbox{latency} = \beta_0 + \beta_1\cdot\mbox{age} + \epsilon\)
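To make the equation concrete, the sketch below simulates data from it and recovers the coefficients with lm(). The values of beta_0, beta_1, the age range, and the noise level are made up for illustration and are not estimates from the frog data.

```r
# Simulate data from latency = beta_0 + beta_1 * age + epsilon.
# All numbers here are hypothetical.
set.seed(1)
n <- 500
beta_0 <- 200
beta_1 <- -20
age <- runif(n, min = 4, max = 6)
epsilon <- rnorm(n, mean = 0, sd = 2)
latency <- beta_0 + beta_1 * age + epsilon

sim_fit <- lm(latency ~ age)

# The estimates should land near beta_0 and beta_1
coef(sim_fit)
```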
A decision tree is a series of splits or if/then statements based on predictors.
First the tree grows until some condition is met (maximum depth, no more data).
Then the tree is pruned to reduce its complexity.
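The grow-then-prune idea can be sketched directly with the rpart package, which is the default engine behind decision_tree(). The mtcars data and the particular control values are assumptions picked for this example.

```r
library(rpart)

# Grow a deliberately large tree: cp = 0 turns off the complexity penalty,
# so growth stops only at the depth and node-size limits.
full_tree <- rpart(
  mpg ~ .,
  data = mtcars,
  control = rpart.control(cp = 0, minsplit = 5, maxdepth = 10)
)

# Prune back: splits whose complexity value falls below cp are removed
pruned_tree <- prune(full_tree, cp = 0.05)

# Number of nodes before and after pruning
c(full = nrow(full_tree$frame), pruned = nrow(pruned_tree$frame))
```

In parsnip, the corresponding tuning arguments of decision_tree() are tree_depth, min_n, and cost_complexity.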
Why use a workflow()?
fit() and predict() apply to the preprocessing steps in addition to the actual model fit.

tree_spec <-
  decision_tree() %>%
  set_mode("regression")

tree_spec %>%
  fit(latency ~ ., data = frog_train)
#> parsnip model object
#>
#> n= 456
#>
#> node), split, n, deviance, yval
#> * denotes terminal node
#>
#> 1) root 456 2197966.00 92.90351
#> 2) age>=4.947975 256 252347.40 60.89844
#> 4) treatment=control 131 91424.06 48.42748 *
#> 5) treatment=gentamicin 125 119197.90 73.96800 *
#> 3) age< 4.947975 200 1347741.00 133.87000
#> 6) treatment=control 140 986790.70 118.25710
#> 12) reflex=mid,full 129 754363.70 111.56590 *
#> 13) reflex=low 11 158918.20 196.72730 *
#> 7) treatment=gentamicin 60 247194.60 170.30000
#> 14) age< 4.664439 30 102190.20 147.83330
#> 28) age>=4.566638 22 53953.86 129.77270 *
#> 29) age< 4.566638 8 21326.00 197.50000 *
#> 15) age>=4.664439 30 114719.40 192.76670 *

tree_spec <-
  decision_tree() %>%
  set_mode("regression")

workflow() %>%
  add_formula(latency ~ .) %>%
  add_model(tree_spec) %>%
  fit(data = frog_train)
#> ══ Workflow [trained] ════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: decision_tree()
#>
#> ── Preprocessor ──────────────────────────────────────────────────────
#> latency ~ .
#>
#> ── Model ─────────────────────────────────────────────────────────────
#> n= 456
#>
#> node), split, n, deviance, yval
#> * denotes terminal node
#>
#> 1) root 456 2197966.00 92.90351
#> 2) age>=4.947975 256 252347.40 60.89844
#> 4) treatment=control 131 91424.06 48.42748 *
#> 5) treatment=gentamicin 125 119197.90 73.96800 *
#> 3) age< 4.947975 200 1347741.00 133.87000
#> 6) treatment=control 140 986790.70 118.25710
#> 12) reflex=mid,full 129 754363.70 111.56590 *
#> 13) reflex=low 11 158918.20 196.72730 *
#> 7) treatment=gentamicin 60 247194.60 170.30000
#> 14) age< 4.664439 30 102190.20 147.83330
#> 28) age>=4.566638 22 53953.86 129.77270 *
#> 29) age< 4.566638 8 21326.00 197.50000 *
#> 15) age>=4.664439 30 114719.40 192.76670 *

tree_spec <-
  decision_tree() %>%
  set_mode("regression")

workflow(latency ~ ., tree_spec) %>%
  fit(data = frog_train)
#> ══ Workflow [trained] ════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: decision_tree()
#>
#> ── Preprocessor ──────────────────────────────────────────────────────
#> latency ~ .
#>
#> ── Model ─────────────────────────────────────────────────────────────
#> n= 456
#>
#> node), split, n, deviance, yval
#> * denotes terminal node
#>
#> 1) root 456 2197966.00 92.90351
#> 2) age>=4.947975 256 252347.40 60.89844
#> 4) treatment=control 131 91424.06 48.42748 *
#> 5) treatment=gentamicin 125 119197.90 73.96800 *
#> 3) age< 4.947975 200 1347741.00 133.87000
#> 6) treatment=control 140 986790.70 118.25710
#> 12) reflex=mid,full 129 754363.70 111.56590 *
#> 13) reflex=low 11 158918.20 196.72730 *
#> 7) treatment=gentamicin 60 247194.60 170.30000
#> 14) age< 4.664439 30 102190.20 147.83330
#> 28) age>=4.566638 22 53953.86 129.77270 *
#> 29) age< 4.566638 8 21326.00 197.50000 *
#> 15) age>=4.664439 30 114719.40 192.76670 *
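The two ways of building a workflow shown above should be interchangeable. A quick check, sketched on the built-in mtcars data (an assumption for this example, since frog_train is not defined here):

```r
library(tidymodels)

tree_spec <-
  decision_tree() %>%
  set_mode("regression")

# Step-by-step construction
wflow_long <-
  workflow() %>%
  add_formula(mpg ~ .) %>%
  add_model(tree_spec)

# Shorthand: preprocessor and model passed directly to workflow()
wflow_short <- workflow(mpg ~ ., tree_spec)

fit_long <- fit(wflow_long, data = mtcars)
fit_short <- fit(wflow_short, data = mtcars)

# Both fitted workflows give the same predictions
all.equal(
  predict(fit_long, new_data = mtcars),
  predict(fit_short, new_data = mtcars)
)
```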
Run the tree_wflow chunk in your .qmd.
Edit this code so it uses a linear model.
How do you use your new tree_fit model?

Run:
predict(tree_fit, new_data = frog_test)
What do you get?

Run:
augment(tree_fit, new_data = frog_test)
What do you get?
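A rough sketch of what these two calls return, using a stand-in workflow fitted to mtcars. The names car_train, car_test, and car_fit are hypothetical and only mirror frog_train, frog_test, and tree_fit.

```r
library(tidymodels)

# Hypothetical train/test split on built-in data
set.seed(2024)
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test <- testing(car_split)

car_fit <-
  workflow(mpg ~ wt + hp, linear_reg()) %>%
  fit(data = car_train)

# predict() returns a tibble of predictions, one row per row of new_data
preds <- predict(car_fit, new_data = car_test)
names(preds)
nrow(preds) == nrow(car_test)

# augment() returns new_data with the prediction columns attached
aug <- augment(car_fit, new_data = car_test)
names(aug)
```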
The number of rows in new_data and the output are the same.
How do you understand your new tree_fit model?
You can extract_*() several components of your fitted workflow.
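A short sketch of the extract_*() family, assuming a linear workflow fitted to mtcars as a stand-in. The name lm_wflow_fit is hypothetical.

```r
library(tidymodels)

lm_wflow_fit <-
  workflow(mpg ~ wt + hp, linear_reg()) %>%
  fit(data = mtcars)

# The preprocessor (here, the formula)
extract_preprocessor(lm_wflow_fit)

# The model specification, before fitting
extract_spec_parsnip(lm_wflow_fit)

# The fitted parsnip model
extract_fit_parsnip(lm_wflow_fit)

# The underlying engine object (an lm fit in this case)
engine_fit <- extract_fit_engine(lm_wflow_fit)
coef(engine_fit)
```

Extracted components are meant for inspection. Predicting from them would skip whatever preprocessing the workflow applies, so predictions should come from the workflow itself.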
You can use your fitted workflow for model and/or prediction explanations.
Learn more at https://www.tmwr.org/explain.html

Extract the model engine object from your fitted linear workflow.
⚠️ Never predict() with any extracted components!
How do you use your new tree_fit model in production?
Learn more at https://vetiver.rstudio.com
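library(vetiver)
library(plumber)

# v wraps the fitted workflow for deployment; the model name is illustrative
v <- vetiver_model(tree_fit, "frog_hatching")

pr() %>%
  vetiver_api(v)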
#> # Plumber router with 2 endpoints, 4 filters, and 1 sub-router.
#> # Use `pr_run()` on this object to start the API.
#> ├──[queryString]
#> ├──[body]
#> ├──[cookieParser]
#> ├──[sharedSecret]
#> ├──/logo
#> │ │ # Plumber static router serving from directory: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/vetiver
#> ├──/ping (GET)
#> └──/predict (POST)

Run the vetiver chunk in your .qmd.
Check out the automated visual documentation.