library(tidymodels)
library(textrecipes) # provides step_dummy_hash()
library(bonsai)      # provides the "lightgbm" engine for boost_tree()

# hotel_train is the training set created in the earlier sections
hotel_rec <-
  recipe(avg_price_per_room ~ ., data = hotel_train) %>%
  step_YeoJohnson(lead_time) %>%
  step_dummy_hash(agent, num_terms = tune("agent hash")) %>%
  step_dummy_hash(company, num_terms = tune("company hash")) %>%
  step_zv(all_predictors())

lgbm_spec <-
  boost_tree(trees = tune(), learn_rate = tune(), min_n = tune()) %>%
  set_mode("regression") %>%
  set_engine("lightgbm", num_threads = 1)

lgbm_wflow <- workflow(hotel_rec, lgbm_spec)

# Widen the default ranges for the two hashing parameters
lgbm_param <-
  lgbm_wflow %>%
  extract_parameter_set_dials() %>%
  update(
    `agent hash` = num_hash(c(3, 8)),
    `company hash` = num_hash(c(3, 8))
  )
In the last section, we evaluated 250 models (25 candidates times 10 resamples).
We can make this go faster using parallel processing.
Also, for some models, we can fit far fewer models than the number that are being evaluated.
A model fit with X trees can often predict on candidates with fewer than X trees.
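To make the savings concrete, here is a small counting sketch in base R. The grid values are hypothetical, and it assumes a regular grid in which each learning rate is paired with every tree count, so one fit at the largest tree count can serve all smaller tree counts.

```r
# Hypothetical regular grid: 5 tree counts x 5 learning rates = 25 candidates
grid <- expand.grid(
  trees = c(100, 500, 1000, 1500, 2000),
  learn_rate = c(0.001, 0.01, 0.03, 0.1, 0.3)
)
n_resamples <- 10

# Every candidate is evaluated on every resample
n_evaluated <- nrow(grid) * n_resamples

# Only one fit per learning rate (at the largest tree count) is needed per resample
n_fits <- nrow(unique(grid["learn_rate"])) * n_resamples

c(evaluated = n_evaluated, fit = n_fits)
```

With a space-filling grid, where tree counts rarely share the other parameter values, the savings would be smaller or absent.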
Both of these methods can lead to enormous speed-ups.
Racing is an old tool that we can use to go even faster.
This can result in fitting a small number of models.
How do we eliminate tuning parameter combinations?
There are a few methods to do so. We’ll use one based on analysis of variance (ANOVA).
However… there is typically a large difference between resamples in the results.
Here are some realistic (but simulated) examples of two candidate models.
An error estimate is measured for each of 10 resamples.
There is usually a significant resample-to-resample effect (rank corr: 0.83).
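As a minimal sketch of how such a rank correlation can be computed, the base R snippet below uses made-up error values for two candidates over 10 resamples. The numbers are hypothetical and are not the simulated data behind the figure quoted above.

```r
# Hypothetical error estimates, one per resample, for two candidates
cand_1 <- c(10.2, 11.5, 9.8, 12.1, 10.9, 11.8, 9.5, 10.4, 12.4, 11.1)
cand_2 <- c( 9.2, 10.1, 9.1, 10.9, 10.0, 10.2, 8.7,  9.3, 11.0,  9.8)

# Spearman (rank) correlation across resamples:
# a high value means hard resamples are hard for both candidates
rho <- cor(cand_1, cand_2, method = "spearman")
rho
```

A strong correlation like this is what makes a paired comparison much more sensitive than an unpaired one.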
One way to evaluate these models is to do a paired t-test.
With \(n = 10\) resamples, the confidence interval is (0.99, 2.8), indicating that candidate number 2 has smaller error.
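A paired comparison of this kind can be sketched with base R's `t.test()`. The error values below are hypothetical, so the interval will not match the one reported above.

```r
# Hypothetical error estimates, matched by resample
cand_1 <- c(10.2, 11.5, 9.8, 12.1, 10.9, 11.8, 9.5, 10.4, 12.4, 11.1)
cand_2 <- c( 9.2, 10.1, 9.1, 10.9, 10.0, 10.2, 8.7,  9.3, 11.0,  9.8)

# Paired t-test: equivalent to a one-sample t-test on the per-resample differences
paired_res <- t.test(cand_1, cand_2, paired = TRUE)
diff_res   <- t.test(cand_1 - cand_2)

# Two-sided 95% interval for the mean difference (candidate 1 minus candidate 2)
paired_res$conf.int
```

An interval that lies entirely above zero indicates that candidate 2 has the smaller error.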
What if we were to compare each model candidate to the current best at each resample?
One candidate shows superiority once 4 resamples have been evaluated.
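The idea of re-testing as resamples accumulate can be sketched in base R. This is a simplified illustration with hypothetical data and a one-sided t-test on the differences; it is not the ANOVA-based procedure, and it starts at 3 resamples only as an assumed burn-in.

```r
# Hypothetical error estimates, matched by resample
cand_1 <- c(10.2, 11.5, 9.8, 12.1, 10.9, 11.8, 9.5, 10.4, 12.4, 11.1)
cand_2 <- c( 9.2, 10.1, 9.1, 10.9, 10.0, 10.2, 8.7,  9.3, 11.0,  9.8)
diffs <- cand_1 - cand_2

# After each new resample, recompute a one-sided 95% lower bound on the mean difference
n_seen <- 3:length(diffs)
lower_bound <- sapply(n_seen, function(k) {
  t.test(diffs[1:k], alternative = "greater", conf.level = 0.95)$conf.int[1]
})
names(lower_bound) <- n_seen

# First point at which the bound excludes zero, i.e. candidate 1 is clearly worse
first_decisive <- n_seen[which(lower_bound > 0)[1]]
first_decisive
```

Once the bound excludes zero, there is little reason to keep spending resamples on the losing candidate.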
One version of racing uses a mixed model ANOVA to construct one-sided confidence intervals for each candidate versus the current best.
Any candidates whose bound does not include zero are discarded. Here is an animation.
The resamples are analyzed in a random order.
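One elimination step can be sketched in base R. As a labeled simplification, this uses `lm()` with the resample as a fixed blocking factor as a stand-in for the mixed model; the data, the candidate names, and the 95% level are all hypothetical choices for illustration.

```r
# Hypothetical errors for 3 candidates after 4 resamples
race <- data.frame(
  resample  = factor(rep(1:4, times = 3)),
  candidate = rep(c("A", "B", "C"), each = 4),
  error     = c(10.0, 11.0,  9.5, 10.5,
                10.3, 10.8,  9.7, 10.3,
                12.0, 13.1, 11.4, 12.6)
)

# Current best = smallest mean error; make it the reference level
best <- names(which.min(tapply(race$error, race$candidate, mean)))
race$candidate <- relevel(factor(race$candidate), ref = best)

# Candidate coefficients estimate (candidate error - best error), adjusted for resample
fit   <- lm(error ~ candidate + resample, data = race)
coefs <- summary(fit)$coefficients
rows  <- grep("^candidate", rownames(coefs))

# One-sided 95% lower bound on each difference from the best
lower <- coefs[rows, "Estimate"] -
  qt(0.95, df = df.residual(fit)) * coefs[rows, "Std. Error"]

# Discard candidates whose bound excludes zero
eliminated <- sub("^candidate", "", names(lower)[lower > 0])
survivors  <- setdiff(levels(race$candidate), eliminated)
survivors
```

In a full race this step would be repeated after each additional resample, using only the surviving candidates.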
Kuhn (2014) has examples and simulations to show that the method works.
The finetune package has functions
The syntax and helper functions are extremely similar to those shown for
show_best(lgbm_race_res, metric = "mae")
#> # A tibble: 2 × 11
#>   trees min_n learn_rate `agent hash` `company hash` .metric .estimator  mean     n std_err .config
#>   <int> <int>      <dbl>        <int>          <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>
#> 1  1516     7     0.0421          176             12 mae     standard    9.60    10   0.181 Preprocessor42_Model1
#> 2  1014     5     0.0791           35            181 mae     standard    9.61    10   0.179 Preprocessor06_Model1
tune_race_anova() with a different seed.