Grid search, combined with resampling, requires fitting a lot of models!
These models don’t depend on one another and can be run in parallel.
We can use a parallel backend to do this:
cores <- parallel::detectCores(logical = FALSE)
cl <- parallel::makePSOCKcluster(cores)
doParallel::registerDoParallel(cl)

# Now call `tune_grid()`!

# Shut it down with:
foreach::registerDoSEQ()
parallel::stopCluster(cl)
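To see why independent fits parallelize cleanly, here is a minimal base-R sketch. It does not use tidymodels: the helper `fit_one()`, the `mtcars` data, and the choice of two workers are all assumptions made for illustration. Each "model" is fit on its own bootstrap sample, so the sequential and parallel runs should agree.

```r
library(parallel)

# Hypothetical stand-in for "fit one model on one resample":
# fit a linear model on a bootstrap sample of mtcars and return its R-squared.
fit_one <- function(seed) {
  set.seed(seed)
  idx <- sample(nrow(mtcars), replace = TRUE)
  fit <- lm(mpg ~ wt + hp, data = mtcars[idx, ])
  summary(fit)$r.squared
}

seeds <- 1:8

# Sequential baseline
seq_res <- vapply(seeds, fit_one, numeric(1))

# Same work spread over two PSOCK workers (assumed worker count)
cl <- makePSOCKcluster(2)
par_res <- parLapply(cl, seeds, fit_one)
stopCluster(cl)

# Each fit depends only on its own seed, so the two runs can be compared directly
all.equal(seq_res, unlist(par_res))
```

Because no fit needs the result of another, the work can be handed to workers in any order. `tune_grid()` relies on the same property when a parallel backend is registered.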
Running in parallel
Speed-ups are fairly linear up to the number of physical cores (10 here).
test_res <- glm_spline_wflow %>%
  last_fit(split = nhl_split)

test_res
#> # Resampling results
#> # Manual resampling 
#> # A tibble: 1 × 6
#>   splits              id               .metrics         .notes           .predictions         .workflow 
#>   <list>              <chr>            <list>           <list>           <list>               <list>    
#> 1 <split [9110/3037]> train/test split <tibble [2 × 4]> <tibble [1 × 3]> <tibble [3,037 × 6]> <workflow>
#> 
#> There were issues with some computations:
#> 
#>   - Warning(s) x1: prediction from a rank-deficient fit may be misleading
#> 
#> Run `show_notes(.Last.tune.result)` for more information.
Remember that last_fit() fits one time with the combined training and validation set, then evaluates one time with the testing set.
Your turn
Finalize your workflow with the best parameters.
Create a final fit.
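One possible shape for these two steps, as a self-contained sketch rather than the workshop's own solution. It assumes the `two_class_dat` data from the modeldata package in place of the NHL data, and the names `demo_split`, `demo_folds`, `demo_rec`, `demo_wflow`, and `demo_res` are hypothetical stand-ins for the workshop objects.

```r
library(tidymodels)

# Assumption: a small built-in data set stands in for the NHL data
data(two_class_dat, package = "modeldata")

set.seed(123)
demo_split <- initial_split(two_class_dat, strata = Class)
demo_train <- training(demo_split)
demo_folds <- vfold_cv(demo_train, v = 5, strata = Class)

# Logistic regression with a natural spline whose degrees of freedom are tuned
demo_rec <- recipe(Class ~ A + B, data = demo_train) %>%
  step_ns(A, deg_free = tune())

demo_wflow <- workflow() %>%
  add_recipe(demo_rec) %>%
  add_model(logistic_reg())

demo_res <- tune_grid(
  demo_wflow,
  resamples = demo_folds,
  grid = tibble(deg_free = c(2L, 4L, 6L)),
  metrics = metric_set(roc_auc)
)

# Step 1: plug the best parameter values into the workflow
best_params <- select_best(demo_res, metric = "roc_auc")
final_wflow <- finalize_workflow(demo_wflow, best_params)

# Step 2: fit once on the training data, evaluate once on the test data
final_res <- last_fit(final_wflow, split = demo_split)
collect_metrics(final_res)
```

With the workshop objects, the same pattern would apply `select_best()` to the tuning results and pass the finalized workflow and the original split to `last_fit()`.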
Estimates of ROC AUC
Validation results from tuning:
glm_spline_res %>%
  show_best(metric = "roc_auc", n = 1) %>%
  select(.metric, mean, n, std_err)
#> # A tibble: 1 × 4
#>   .metric  mean     n std_err
#>   <chr>   <dbl> <int>   <dbl>
#> 1 roc_auc 0.653     1      NA
Extract the final fitted workflow, fit using the training set:
final_glm_spline_wflow <- test_res %>%
  extract_workflow()

# use this object to predict or deploy
predict(final_glm_spline_wflow, nhl_test[1:3,])
#> # A tibble: 3 × 1
#>   .pred_class
#>   <fct>      
#> 1 no         
#> 2 yes        
#> 3 no