Machine learning with tidymodels
How do you fit a linear model in R?
How many different ways can you think of?
lm for linear model
glm for generalized linear model (e.g. logistic regression)
glmnet for regularized regression
keras for regression using TensorFlow
stan for Bayesian regression
spark for large data sets
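To make the point concrete, here is a minimal sketch that fits the same straight-line model with two of the functions listed above, lm() and glm(), and checks that the estimates agree. It uses the built-in mtcars data as a stand-in, which is an assumption made only so the example runs on its own; it is not the workshop's frog data.

```r
# Two base R interfaces that fit the same ordinary least squares model.
# mtcars is a stand-in data set (an assumption, not the workshop data).
fit_lm  <- lm(mpg ~ wt, data = mtcars)
fit_glm <- glm(mpg ~ wt, data = mtcars, family = gaussian())

# Same estimates, but a different function, arguments, and object class
coef(fit_lm)
coef(fit_glm)
all.equal(coef(fit_lm), coef(fit_glm))
class(fit_lm)
class(fit_glm)
```

Each function has its own interface and return type, which is the inconsistency a unified model specification is meant to smooth over.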
All available models are listed at https://www.tidymodels.org/find/parsnip/
Run the tree_spec chunk in your .qmd.
Edit this code so it creates a different model.
\(\mbox{latency} = \beta_0 + \beta_1\cdot\mbox{age} + \epsilon\)
Series of splits or if/then statements based on predictors
First the tree grows until some condition is met (maximum depth, no more data)
Then the tree is pruned to reduce its complexity
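The grow-then-prune idea can be sketched directly with the rpart package. This is an illustration under assumptions: it uses the built-in mtcars data as a stand-in, and the control values and the pruning threshold of 0.05 are arbitrary choices, not recommendations.

```r
library(rpart)

# Step 1: grow a deliberately large tree. cp = 0 disables the complexity
# penalty, so growth stops only at maxdepth or when nodes get too small.
big_tree <- rpart(
  mpg ~ ., data = mtcars, method = "anova",
  control = rpart.control(cp = 0, minsplit = 4, maxdepth = 10)
)

# Step 2: prune back splits whose improvement falls below the chosen cp.
# The value 0.05 is arbitrary and for illustration only.
small_tree <- prune(big_tree, cp = 0.05)

# Count terminal nodes (leaves) before and after pruning
count_leaves <- function(tree) sum(tree$frame$var == "<leaf>")
count_leaves(big_tree)
count_leaves(small_tree)
```

Printing either tree shows the series of if/then splits on the predictors.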
Why a workflow()?
fit() and predict() apply to the preprocessing steps in addition to the actual model fit.
tree_spec <-
  decision_tree() %>%
  set_mode("regression")

tree_spec %>%
  fit(latency ~ ., data = frog_train)
#> parsnip model object
#>
#> n= 456
#>
#> node), split, n, deviance, yval
#> * denotes terminal node
#>
#> 1) root 456 2197966.00 92.90351
#> 2) age>=4.947975 256 252347.40 60.89844
#> 4) treatment=control 131 91424.06 48.42748 *
#> 5) treatment=gentamicin 125 119197.90 73.96800 *
#> 3) age< 4.947975 200 1347741.00 133.87000
#> 6) treatment=control 140 986790.70 118.25710
#> 12) reflex=mid,full 129 754363.70 111.56590 *
#> 13) reflex=low 11 158918.20 196.72730 *
#> 7) treatment=gentamicin 60 247194.60 170.30000
#> 14) age< 4.664439 30 102190.20 147.83330
#> 28) age>=4.566638 22 53953.86 129.77270 *
#> 29) age< 4.566638 8 21326.00 197.50000 *
#> 15) age>=4.664439 30 114719.40 192.76670 *
tree_spec <-
  decision_tree() %>%
  set_mode("regression")

workflow() %>%
  add_formula(latency ~ .) %>%
  add_model(tree_spec) %>%
  fit(data = frog_train)
#> ══ Workflow [trained] ════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: decision_tree()
#>
#> ── Preprocessor ──────────────────────────────────────────────────────
#> latency ~ .
#>
#> ── Model ─────────────────────────────────────────────────────────────
#> n= 456
#>
#> node), split, n, deviance, yval
#> * denotes terminal node
#>
#> 1) root 456 2197966.00 92.90351
#> 2) age>=4.947975 256 252347.40 60.89844
#> 4) treatment=control 131 91424.06 48.42748 *
#> 5) treatment=gentamicin 125 119197.90 73.96800 *
#> 3) age< 4.947975 200 1347741.00 133.87000
#> 6) treatment=control 140 986790.70 118.25710
#> 12) reflex=mid,full 129 754363.70 111.56590 *
#> 13) reflex=low 11 158918.20 196.72730 *
#> 7) treatment=gentamicin 60 247194.60 170.30000
#> 14) age< 4.664439 30 102190.20 147.83330
#> 28) age>=4.566638 22 53953.86 129.77270 *
#> 29) age< 4.566638 8 21326.00 197.50000 *
#> 15) age>=4.664439 30 114719.40 192.76670 *
tree_spec <-
  decision_tree() %>%
  set_mode("regression")

workflow(latency ~ ., tree_spec) %>%
  fit(data = frog_train)
#> ══ Workflow [trained] ════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: decision_tree()
#>
#> ── Preprocessor ──────────────────────────────────────────────────────
#> latency ~ .
#>
#> ── Model ─────────────────────────────────────────────────────────────
#> n= 456
#>
#> node), split, n, deviance, yval
#> * denotes terminal node
#>
#> 1) root 456 2197966.00 92.90351
#> 2) age>=4.947975 256 252347.40 60.89844
#> 4) treatment=control 131 91424.06 48.42748 *
#> 5) treatment=gentamicin 125 119197.90 73.96800 *
#> 3) age< 4.947975 200 1347741.00 133.87000
#> 6) treatment=control 140 986790.70 118.25710
#> 12) reflex=mid,full 129 754363.70 111.56590 *
#> 13) reflex=low 11 158918.20 196.72730 *
#> 7) treatment=gentamicin 60 247194.60 170.30000
#> 14) age< 4.664439 30 102190.20 147.83330
#> 28) age>=4.566638 22 53953.86 129.77270 *
#> 29) age< 4.566638 8 21326.00 197.50000 *
#> 15) age>=4.664439 30 114719.40 192.76670 *
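The formula examples above do very little preprocessing, so the benefit of a workflow is easier to see with a recipe. The sketch below is an assumption-laden illustration: it uses mtcars instead of the frog data, a linear model instead of a tree, and a normalization step chosen only to have something to estimate.

```r
library(tidymodels)

# Stand-in data: mtcars instead of the workshop's frog data
set.seed(123)
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test  <- testing(car_split)

# Preprocessing: center and scale every numeric predictor
car_rec <- recipe(mpg ~ wt + hp, data = car_train) %>%
  step_normalize(all_numeric_predictors())

lm_spec <- linear_reg()  # default engine is "lm"

# fit() estimates the normalization statistics and the model together
car_fit <- workflow() %>%
  add_recipe(car_rec) %>%
  add_model(lm_spec) %>%
  fit(data = car_train)

# predict() applies the trained recipe to the raw test rows first
car_preds <- predict(car_fit, new_data = car_test)
car_preds
```

The test rows are never normalized by hand; the workflow carries the training-set means and standard deviations with it.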
Run the tree_wflow chunk in your .qmd.
Edit this code so it uses a linear model.
How do you use your new tree_fit model?
Run:
predict(tree_fit, new_data = frog_test)
What do you get?
Run:
augment(tree_fit, new_data = frog_test)
What do you get?
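For comparison after trying both calls, here is a minimal sketch of how predict() and augment() differ. It assumes a stand-in linear model on mtcars rather than tree_fit and frog_test, so the numbers will not match the exercise.

```r
library(tidymodels)

# Stand-in model: linear regression on mtcars (not the workshop's tree_fit)
lm_fit <- linear_reg() %>%
  fit(mpg ~ wt + hp, data = mtcars)

new_cars <- head(mtcars, 5)

# predict(): a tibble with a single .pred column, one row per input row
preds <- predict(lm_fit, new_data = new_cars)

# augment(): the input columns plus the prediction columns
aug <- augment(lm_fit, new_data = new_cars)

nrow(preds) == nrow(new_cars)
names(preds)
names(aug)
```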
The number of rows in new_data and the output are the same.
How do you understand your new tree_fit model?
You can extract_*() several components of your fitted workflow.
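A short sketch of a few extract_*() functions follows. It assumes a stand-in linear workflow on mtcars rather than tree_fit, so the object names here are hypothetical.

```r
library(tidymodels)

# Stand-in fitted workflow (hypothetical; not the workshop's tree_fit)
wf_fit <- workflow(mpg ~ wt + hp, linear_reg()) %>%
  fit(data = mtcars)

# The parsnip model object wrapped by the workflow
parsnip_fit <- extract_fit_parsnip(wf_fit)

# The underlying engine object, here a plain lm fit
engine_fit <- extract_fit_engine(wf_fit)
class(engine_fit)

# The preprocessor that was supplied (a formula in this sketch)
wf_pre <- extract_preprocessor(wf_fit)
wf_pre
```

The extracted engine object knows nothing about the workflow's preprocessing, so it is useful for inspection (for example summary(engine_fit)) rather than for prediction.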
You can use your fitted workflow for model and/or prediction explanations:
Learn more at https://www.tmwr.org/explain.html
Extract the model engine object from your fitted linear workflow.
⚠️ Never predict() with any extracted components!
How do you use your new tree_fit model in production?
Learn more at https://vetiver.rstudio.com
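library(vetiver)
library(plumber)
# v is a deployable model object built from the fitted workflow; the model name is illustrative
v <- vetiver_model(tree_fit, "frog_hatching")
pr() %>%
  vetiver_api(v)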
#> # Plumber router with 2 endpoints, 4 filters, and 1 sub-router.
#> # Use `pr_run()` on this object to start the API.
#> ├──[queryString]
#> ├──[body]
#> ├──[cookieParser]
#> ├──[sharedSecret]
#> ├──/logo
#> │ │ # Plumber static router serving from directory: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/vetiver
#> ├──/ping (GET)
#> └──/predict (POST)
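For readers without tree_fit at hand, the same steps can be sketched end to end on a stand-in model. The workflow, the mtcars data, and the model name "cars_mpg" are all assumptions for illustration; the router is built but not started.

```r
library(tidymodels)
library(vetiver)
library(plumber)

# Stand-in fitted workflow (hypothetical; not the workshop's tree_fit)
wf_fit <- workflow(mpg ~ wt + hp, linear_reg()) %>%
  fit(data = mtcars)

# Bundle the fitted workflow with the metadata needed for deployment.
# "cars_mpg" is a hypothetical model name.
v <- vetiver_model(wf_fit, "cars_mpg")

# Build (but do not start) a Plumber router with a /predict endpoint
api <- pr() %>%
  vetiver_api(v)

# Calling pr_run(api) would start serving requests; it is omitted here
# so that the script finishes instead of blocking.
```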
Run the vetiver chunk in your .qmd.
Check out the automated visual documentation.