Grid search, combined with resampling, requires fitting a lot of models!
These models don’t depend on one another and can be run in parallel.
We can use a parallel backend to do this:
cores <- parallelly::availableCores(logical = FALSE)
cl <- parallel::makePSOCKcluster(cores)
doParallel::registerDoParallel(cl)

# Now call `tune_grid()`!

# Shut it down with:
foreach::registerDoSEQ()
parallel::stopCluster(cl)
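To see why independent fits parallelize so easily, here is a minimal sketch that uses only base R's parallel package. The bootstrap of `mtcars`, the two-worker cluster, and the helper `fit_one()` are illustrative assumptions and not part of the workshop code; `tune_grid()` takes care of distributing the fits itself once a backend is registered.

```r
library(parallel)

# Hypothetical stand-in for "fit one model on one resample":
# fit a linear model to a bootstrap sample of mtcars and return the slope.
fit_one <- function(i) {
  idx <- sample(nrow(mtcars), replace = TRUE)
  coef(lm(mpg ~ wt, data = mtcars[idx, ]))[["wt"]]
}

cl <- makePSOCKcluster(2)        # two workers, chosen arbitrarily
clusterSetRNGStream(cl, 123)     # reproducible random streams on the workers

# Each call to fit_one() is independent of the others,
# so the workers can split the eight fits between them.
slopes <- parLapply(cl, 1:8, fit_one)

stopCluster(cl)                  # always release the workers when done

length(slopes)                   # one result per fit
```

No fit needs the result of another, which is the same property that lets the grid search spread its models across cores.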
Running in parallel
Speed-ups are fairly linear up to the number of physical cores (10 here).
Remember that last_fit() fits one time with the combined training and validation set, then evaluates one time with the testing set.
Your turn
Finalize your workflow with the best parameters.
Create a final fit.
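One possible way to approach these two steps, sketched on stand-in data so that it runs on its own. Everything prefixed with `demo_` is hypothetical, as are `best_params`, the natural-spline recipe, and the three-value grid; `two_class_dat` comes from the modeldata package rather than the workshop's hockey data. With the workshop objects, the same `select_best()`, `finalize_workflow()`, and `last_fit()` calls would apply to your own tuning results, workflow, and split.

```r
library(tidymodels)

set.seed(123)

# Stand-in data and a three-way split (training / validation / testing).
demo_split <- initial_validation_split(two_class_dat, strata = Class)
demo_val   <- validation_set(demo_split)

# A logistic regression with a tunable natural spline on predictor A.
demo_rec <- recipe(Class ~ A + B, data = training(demo_split)) %>%
  step_ns(A, deg_free = tune())

demo_wflow <- workflow(demo_rec, logistic_reg())

# Tune over a small, arbitrary grid using the validation set.
demo_res <- tune_grid(
  demo_wflow,
  resamples = demo_val,
  grid = tibble(deg_free = c(2, 4, 6)),
  metrics = metric_set(roc_auc)
)

# 1. Finalize the workflow with the best parameters.
best_params      <- select_best(demo_res, metric = "roc_auc")
demo_final_wflow <- finalize_workflow(demo_wflow, best_params)

# 2. Create a final fit, evaluated once on the testing set.
demo_test_res <- last_fit(
  demo_final_wflow,
  split = demo_split,
  metrics = metric_set(roc_auc)
)

collect_metrics(demo_test_res)
```

`finalize_workflow()` only replaces the `tune()` placeholders with the chosen values; no model is fitted until `last_fit()` is called.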
Estimates of ROC AUC
Validation results from tuning:
glm_spline_res %>%
  show_best(metric = "roc_auc", n = 1) %>%
  select(.metric, mean, n, std_err)
#> # A tibble: 1 × 4
#>   .metric  mean     n std_err
#>   <chr>   <dbl> <int>   <dbl>
#> 1 roc_auc 0.820     1      NA
Extract the final fitted workflow, fit using the training set:
final_glm_spline_wflow <-
  test_res %>%
  extract_workflow()

# use this object to predict or deploy
predict(final_glm_spline_wflow, nhl_test[1:3, ])
#> # A tibble: 3 × 1
#>   .pred_class
#>   <fct>
#> 1 yes
#> 2 yes
#> 3 no