3.3 Nested Resampling
To obtain unbiased performance estimates for learners, all parts of the model building (preprocessing and model selection steps) should be included in the resampling, i.e., repeated for every pair of training/test data. For steps that themselves require resampling, like hyperparameter tuning or feature selection (via the wrapper approach), this results in two nested resampling loops.
The graphic above illustrates nested resampling for parameter tuning with 3-fold cross-validation in the outer and 4-fold cross-validation in the inner loop.
In the outer resampling loop, we have three pairs of training/test sets. On each of these outer training sets parameter tuning is done, thereby executing the inner resampling loop. This way, we get one set of selected hyperparameters for each outer training set. Then the learner is fitted on each outer training set using the corresponding selected hyperparameters. Subsequently, we can evaluate the performance of the learner on the outer test sets.
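The procedure described above can be sketched without any mlr3 classes. The following base R example is only an illustration under stated assumptions: the data are synthetic, a polynomial regression stands in for the learner, its degree plays the role of the hyperparameter, and the helper names `make_folds` and `mse` are hypothetical. It uses 3 outer and 4 inner folds, as in the description.

```r
# Nested resampling with base R only: 3 outer folds, 4 inner folds.
# Assumption: synthetic data; polynomial degree is the tuned hyperparameter.
set.seed(1)
n = 120
dat = data.frame(x = runif(n, -2, 2))
dat$y = sin(dat$x) + rnorm(n, sd = 0.2)

# Randomly assign each of n rows to one of k folds of (near) equal size
make_folds = function(n, k) sample(rep(seq_len(k), length.out = n))

# Mean squared error of a degree-d polynomial fitted on train, scored on test
mse = function(train, test, d) {
  fit = lm(y ~ poly(x, d), data = train)
  mean((test$y - predict(fit, newdata = test))^2)
}

degrees = 1:5
outer_folds = make_folds(n, 3)

outer_result = lapply(1:3, function(i) {
  outer_train = dat[outer_folds != i, ]
  outer_test = dat[outer_folds == i, ]

  # Inner loop: 4-fold cross-validation on the outer training set only
  inner_folds = make_folds(nrow(outer_train), 4)
  inner_mse = sapply(degrees, function(d) {
    mean(sapply(1:4, function(j) {
      mse(outer_train[inner_folds != j, ], outer_train[inner_folds == j, ], d)
    }))
  })
  best = degrees[which.min(inner_mse)]

  # Refit on the full outer training set with the selected degree,
  # then evaluate on the outer test set, which was never used for tuning
  data.frame(fold = i, degree = best, mse = mse(outer_train, outer_test, best))
})
outer_result = do.call(rbind, outer_result)

print(outer_result)        # one selected degree and one test error per outer fold
print(mean(outer_result$mse))  # aggregated outer performance estimate
```

Note that each outer fold may select a different degree. The mean of the outer test errors estimates the performance of the whole procedure (tuning plus fitting), not of one particular hyperparameter setting.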
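Conducting nested resampling with mlr3 requires the following steps:
- Generate a wrapped Learner via class mlr3tuning::AutoTuner or mlr3filters::AutoSelect (not yet implemented).
- Specify all required settings - see section “Automating the Tuning” for help.
- Call function resample() or benchmark() with the created Learner.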
You can freely combine different inner and outer resampling strategies.
A common setup is prediction and performance evaluation on a fixed outer test set.
This can be achieved by passing the holdout Resampling strategy (rsmp("holdout")) as the outer resampling instance to either resample() or benchmark().
The inner resampling strategy could be a cross-validation one (rsmp("cv")) as the sizes of the outer training sets might differ.
By default, the inner resample description is instantiated once for every outer training set.
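As a minimal sketch of this setup, the code below combines a holdout split in the outer loop with a 3-fold cross-validation in the inner loop. It assumes a recent release of mlr3, mlr3tuning and paradox, where the terminator is created with `trm()` and the search space with `ps()`; the fold count, the split ratio and the tuning budget are illustrative choices, not values prescribed by this section.

```r
library(mlr3)
library(mlr3tuning)
library(paradox)

task = tsk("iris")

# Inner loop: 3-fold cross-validation, used only for tuning cp
at = AutoTuner$new(
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.ce"),
  search_space = ps(cp = p_dbl(lower = 0.001, upper = 0.1)),
  terminator = trm("evals", n_evals = 5),
  tuner = tnr("grid_search", resolution = 10)
)

# Outer loop: a single holdout split, i.e. one fixed outer test set
resampling_outer = rsmp("holdout", ratio = 2 / 3)
rr = resample(task, at, resampling_outer)

# Classification error on the outer test set
score = rr$aggregate(msr("classif.ce"))
print(score)
```

Because the AutoTuner is itself a Learner, the outer `resample()` call treats tuning as part of model fitting, so the outer test set is never seen during tuning.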
Note that nested resampling is computationally expensive. For this reason, the examples shown below use relatively small search spaces and a low number of resampling iterations. In practice, you normally have to increase both. To reduce the run time, you might want to have a look at the section on Parallelization.
To optimize hyperparameters or conduct feature selection in a nested resampling you need to create learners using either:
- the mlr3tuning::AutoTuner class, or
- the mlr3filters::AutoSelect class (not yet implemented)
library("mlr3tuning")

task = tsk("iris")
learner = lrn("classif.rpart")
resampling = rsmp("holdout")
measures = msr("classif.ce")
param_set = paradox::ParamSet$new(
  params = list(paradox::ParamDbl$new("cp", lower = 0.001, upper = 0.1)))
terminator = term("evals", n_evals = 5)
tuner = tnr("grid_search", resolution = 10)

at = AutoTuner$new(learner, resampling, measures = measures,
  param_set, terminator, tuner = tuner)
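Now construct the resample() call, passing the AutoTuner as the learner and a resampling strategy for the outer loop.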
For example, we can query the aggregated performance result:
## classif.ce
##       0.06
Check for any errors in the folds during execution (if no output, warnings, or errors were recorded, this is an empty data.table):
## Empty data.table (0 rows and 2 cols): iteration,msg
Or take a look at the confusion matrix of the joined predictions:
##             truth
## response     setosa versicolor virginica
##   setosa         50          0         0
##   versicolor      0         45         4
##   virginica       0          5        46
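The calls that produced the outputs above are not preserved in this section. The following end-to-end sketch shows one way to obtain results of this kind. It is an assumption-laden reconstruction rather than the original code: it uses the newer `trm()` and `ps()` helpers instead of `term()` and `ParamSet$new()`, and it assumes a 3-fold cross-validation as the outer resampling. The exact numbers depend on the random splits and will generally differ from those shown.

```r
library(mlr3)
library(mlr3tuning)
library(paradox)

set.seed(1)  # splits, and therefore the results, depend on the seed
task = tsk("iris")

# AutoTuner: rpart with cp tuned by grid search on an inner holdout split
at = AutoTuner$new(
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  search_space = ps(cp = p_dbl(lower = 0.001, upper = 0.1)),
  terminator = trm("evals", n_evals = 5),
  tuner = tnr("grid_search", resolution = 10)
)

# Outer loop (assumed here): 3-fold cross-validation
resampling_outer = rsmp("cv", folds = 3)
rr = resample(task, at, resampling_outer)

# Aggregated performance over the outer test sets
print(rr$aggregate(msr("classif.ce")))

# Errors recorded during execution (an empty data.table if there were none)
print(rr$errors)

# Confusion matrix of the joined predictions from all outer test sets
confusion = rr$prediction()$confusion
print(confusion)
```

With a cross-validation in the outer loop, every observation is predicted exactly once, so the entries of the confusion matrix sum to the number of rows in the task.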