9  Hyperparameter Tuning

Hyperparameters are the parameters of the learners that control how a model is fit to the data. They are sometimes called second-level or second-order parameters of machine learning: the parameters of the model are the first-order parameters and are “fit” to the data during model training. The hyperparameters of a learner can have a major impact on the performance of a learned model, but are often optimized only in an ad-hoc manner or not at all. Optimizing them systematically is often called model ‘tuning’.

Hyperparameter tuning is supported via the mlr3tuning extension package. Below you can find an illustration of the general process:

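At the heart of mlr3tuning are the R6 classes TuningInstanceSingleCrit and TuningInstanceMultiCrit, which describe the tuning problem and store its results, and Tuner, which implements the tuning algorithms.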

9.1 The TuningInstance* Classes

We will examine the optimization of a simple classification tree on the Pima Indian Diabetes data set as an introductory example here.

library("mlr3verse")
task = tsk("pima")
print(task)
<TaskClassif:pima> (768 x 9): Pima Indian Diabetes
* Target: diabetes
* Properties: twoclass
* Features (8):
  - dbl (8): age, glucose, insulin, mass, pedigree, pregnant, pressure,
    triceps

We use the rpart classification tree and choose a subset of the hyperparameters we want to tune. This is often referred to as the “tuning space”. First, let’s look at all the hyperparameters that are available. Information on what they do can be found in the documentation of the learner.

learner = lrn("classif.rpart")
learner$param_set
<ParamSet>
                id    class lower upper nlevels        default value
 1:             cp ParamDbl     0     1     Inf           0.01      
 2:     keep_model ParamLgl    NA    NA       2          FALSE      
 3:     maxcompete ParamInt     0   Inf     Inf              4      
 4:       maxdepth ParamInt     1    30      30             30      
 5:   maxsurrogate ParamInt     0   Inf     Inf              5      
 6:      minbucket ParamInt     1   Inf     Inf <NoDefault[3]>      
 7:       minsplit ParamInt     1   Inf     Inf             20      
 8: surrogatestyle ParamInt     0     1       2              0      
 9:   usesurrogate ParamInt     0     2       3              2      
10:           xval ParamInt     0   Inf     Inf             10     0

Here, we opt to tune two hyperparameters:

  • The complexity hyperparameter cp that controls when the learner considers introducing another branch.
  • The minsplit hyperparameter that controls how many observations must be present in a node for another split to be attempted.

The tuning space needs to be bounded with lower and upper bounds for the values of the hyperparameters:

search_space = ps(
  cp = p_dbl(lower = 0.001, upper = 0.1),
  minsplit = p_int(lower = 1, upper = 10)
)
search_space
<ParamSet>
         id    class lower upper nlevels        default value
1:       cp ParamDbl 0.001   0.1     Inf <NoDefault[3]>      
2: minsplit ParamInt 1.000  10.0      10 <NoDefault[3]>      

The bounds are usually set based on experience.

Next, we need to specify how to evaluate the performance of a trained model. For this, we need to choose a resampling strategy and a performance measure.

hout = rsmp("holdout")
measure = msr("classif.ce")

Finally, we have to specify the budget available for tuning. This is a crucial step, as exhaustively evaluating all possible hyperparameter configurations is usually not feasible. mlr3 lets you specify complex termination criteria by selecting one of the available Terminators.
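
As a rough sketch of what such termination criteria can look like, the snippet below builds a few Terminators with trm() from the bbotk package. The 60-second limit and the combination are illustrative choices that do not come from this chapter; only the 20-evaluation budget is used later on.

```r
library("bbotk")  # provides trm() and the Terminator classes used by mlr3tuning

# Stop after 20 evaluations (the budget used in this chapter).
evals20 = trm("evals", n_evals = 20)

# Stop after about 60 seconds of optimization run time (illustrative value).
one_minute = trm("run_time", secs = 60)

# Combine both: stop as soon as either criterion is met.
either = trm("combo", list(evals20, one_minute), any = TRUE)

# Keys of all Terminators available in the installed version.
mlr_terminators$keys()
```

Setting any = FALSE instead would require all combined criteria to be met before the tuning stops.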

For this short introduction, we specify a budget of 20 iterations and then put everything together into a TuningInstanceSingleCrit:

evals20 = trm("evals", n_evals = 20)

instance = TuningInstanceSingleCrit$new(
  task = task,
  learner = learner,
  resampling = hout,
  measure = measure,
  search_space = search_space,
  terminator = evals20
)
instance
<TuningInstanceSingleCrit>
* State:  Not optimized
* Objective: <ObjectiveTuning:classif.rpart_on_pima>
* Search Space:
         id    class lower upper nlevels
1:       cp ParamDbl 0.001   0.1     Inf
2: minsplit ParamInt 1.000  10.0      10
* Terminator: <TerminatorEvals>

To start the tuning, we still need to select how the optimization should take place. In other words, we need to choose the optimization algorithm via the Tuner class.

9.2 The Tuner Class

The following algorithms are currently implemented in mlr3tuning:

If you’re interested in learning more about these approaches, the Wikipedia page on hyperparameter optimization is a good place to start.
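
One way to check which tuners your installed version provides is to query the mlr_tuners dictionary. This is a small sketch; the exact set of keys depends on the installed packages, so no output is shown.

```r
library("mlr3tuning")

# Keys of all tuners registered in the dictionary.
mlr_tuners$keys()

# Any key can be passed to tnr() to construct the corresponding Tuner.
random_tuner = tnr("random_search")
```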

In this example, we will use a simple grid search with a grid resolution of 5.

tuner = tnr("grid_search", resolution = 5)

As we have only numeric parameters, TunerGridSearch will create an equidistant grid between the respective upper and lower bounds. Our two-dimensional grid of resolution 5 consists of \(5^2 = 25\) configurations. Each configuration is a distinct setting of hyperparameter values for the previously defined Learner which is then fitted to the task and evaluated using the provided Resampling. All configurations will be examined by the tuner (in a random order), until either all configurations are evaluated or the Terminator signals that the budget is exhausted, i.e. here the tuner will stop after evaluating 20 of the 25 total configurations.
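
The size of this grid can be checked with a few lines of base R. This is only an illustration, not the code TunerGridSearch runs internally: the cp values follow from an equidistant sequence between the bounds, and the minsplit values are the five integer grid points that appear in the log output of this run.

```r
# cp is a double in [0.001, 0.1]: five equidistant values.
cp_grid = seq(0.001, 0.1, length.out = 5)

# minsplit is an integer in [1, 10]: the five grid points seen in this run.
minsplit_grid = c(1L, 3L, 5L, 8L, 10L)

# Every combination of the two is one candidate configuration.
grid = expand.grid(cp = cp_grid, minsplit = minsplit_grid)

cp_grid     # 0.00100 0.02575 0.05050 0.07525 0.10000
nrow(grid)  # 25
```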

9.3 Triggering the Tuning

To start the tuning, we simply pass the TuningInstanceSingleCrit to the $optimize() method of the initialized Tuner. The tuner proceeds as follows:

  1. The Tuner proposes at least one hyperparameter configuration to evaluate. It may propose several at once so that they can be evaluated in parallel; this is controlled via the batch_size setting.
  2. For each configuration, the given Learner is fitted on the Task and evaluated using the provided Resampling. All evaluations are stored in the archive of the TuningInstanceSingleCrit.
  3. The Terminator is queried to check whether the budget is exhausted. If it is not, go back to step 1; otherwise, terminate.
  4. Determine the configurations with the best observed performance from the archive.
  5. Store the best configurations as the result in the tuning instance object. The best hyperparameter settings ($result_learner_param_vals) and the corresponding measured performance ($result_y) can be retrieved from the tuning instance.

tuner$optimize(instance)
INFO  [21:32:51.906] [bbotk] Starting to optimize 2 parameter(s) with '<TunerGridSearch>' and '<TerminatorEvals> [n_evals=20, k=0]' 
INFO  [21:32:51.950] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:51.975] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:52.081] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:52.122] [mlr3] Finished benchmark 
INFO  [21:32:52.161] [bbotk] Result of batch 1: 
INFO  [21:32:52.163] [bbotk]      cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:52.163] [bbotk]  0.0505        5  0.2890625        0      0            0.021 
INFO  [21:32:52.163] [bbotk]                                 uhash 
INFO  [21:32:52.163] [bbotk]  4968d354-691a-4157-a9a8-612e5710fcdb 
INFO  [21:32:52.165] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:52.180] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:52.187] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:52.212] [mlr3] Finished benchmark 
INFO  [21:32:52.254] [bbotk] Result of batch 2: 
INFO  [21:32:52.256] [bbotk]     cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:52.256] [bbotk]  0.001        5    0.28125        0      0            0.015 
INFO  [21:32:52.256] [bbotk]                                 uhash 
INFO  [21:32:52.256] [bbotk]  51d4d469-fad5-469d-ac58-48e2410a4664 
INFO  [21:32:52.258] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:52.272] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:52.279] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:52.304] [mlr3] Finished benchmark 
INFO  [21:32:52.339] [bbotk] Result of batch 3: 
INFO  [21:32:52.341] [bbotk]       cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:52.341] [bbotk]  0.02575        1  0.2890625        0      0            0.015 
INFO  [21:32:52.341] [bbotk]                                 uhash 
INFO  [21:32:52.341] [bbotk]  375d3998-5a28-46c2-8577-52e904263a35 
INFO  [21:32:52.343] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:52.357] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:52.369] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:52.394] [mlr3] Finished benchmark 
INFO  [21:32:52.430] [bbotk] Result of batch 4: 
INFO  [21:32:52.432] [bbotk]       cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:52.432] [bbotk]  0.07525        5  0.2890625        0      0            0.015 
INFO  [21:32:52.432] [bbotk]                                 uhash 
INFO  [21:32:52.432] [bbotk]  0b966ad1-450f-4bd9-ba69-9d76e0517657 
INFO  [21:32:52.434] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:52.449] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:52.456] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:52.481] [mlr3] Finished benchmark 
INFO  [21:32:52.524] [bbotk] Result of batch 5: 
INFO  [21:32:52.526] [bbotk]      cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:52.526] [bbotk]  0.0505       10  0.2890625        0      0            0.013 
INFO  [21:32:52.526] [bbotk]                                 uhash 
INFO  [21:32:52.526] [bbotk]  ecbab2d5-03a5-43fa-900e-8f3ee5fc466d 
INFO  [21:32:52.528] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:52.542] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:52.549] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:52.574] [mlr3] Finished benchmark 
INFO  [21:32:52.612] [bbotk] Result of batch 6: 
INFO  [21:32:52.614] [bbotk]      cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:52.614] [bbotk]  0.0505        3  0.2890625        0      0            0.014 
INFO  [21:32:52.614] [bbotk]                                 uhash 
INFO  [21:32:52.614] [bbotk]  85eac96a-121e-4ac4-85f6-384ace82301b 
INFO  [21:32:52.617] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:52.631] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:52.645] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:52.669] [mlr3] Finished benchmark 
INFO  [21:32:52.707] [bbotk] Result of batch 7: 
INFO  [21:32:52.710] [bbotk]       cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:52.710] [bbotk]  0.07525        1  0.2890625        0      0            0.014 
INFO  [21:32:52.710] [bbotk]                                 uhash 
INFO  [21:32:52.710] [bbotk]  260ba49d-e295-4a92-a133-cf557696a7f2 
INFO  [21:32:52.712] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:52.727] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:52.734] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:52.762] [mlr3] Finished benchmark 
INFO  [21:32:52.805] [bbotk] Result of batch 8: 
INFO  [21:32:52.807] [bbotk]   cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:52.807] [bbotk]  0.1        1  0.2890625        0      0            0.017 
INFO  [21:32:52.807] [bbotk]                                 uhash 
INFO  [21:32:52.807] [bbotk]  69fbaded-f0bb-4b49-b76a-b98c21ade33b 
INFO  [21:32:52.809] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:52.824] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:52.831] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:52.855] [mlr3] Finished benchmark 
INFO  [21:32:52.894] [bbotk] Result of batch 9: 
INFO  [21:32:52.897] [bbotk]   cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:52.897] [bbotk]  0.1        8  0.2890625        0      0            0.014 
INFO  [21:32:52.897] [bbotk]                                 uhash 
INFO  [21:32:52.897] [bbotk]  e5ede0d6-1a33-41f6-93ca-080012a3b1be 
INFO  [21:32:52.898] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:52.922] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:52.929] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:52.955] [mlr3] Finished benchmark 
INFO  [21:32:52.992] [bbotk] Result of batch 10: 
INFO  [21:32:52.994] [bbotk]     cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:52.994] [bbotk]  0.001        8  0.2851562        0      0            0.016 
INFO  [21:32:52.994] [bbotk]                                 uhash 
INFO  [21:32:52.994] [bbotk]  5dbacc61-4495-4f53-a074-22102282cd9c 
INFO  [21:32:52.996] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:53.011] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:53.019] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:53.045] [mlr3] Finished benchmark 
INFO  [21:32:53.090] [bbotk] Result of batch 11: 
INFO  [21:32:53.093] [bbotk]       cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:53.093] [bbotk]  0.02575        5  0.2890625        0      0            0.015 
INFO  [21:32:53.093] [bbotk]                                 uhash 
INFO  [21:32:53.093] [bbotk]  b53504cc-fd76-4764-ba7c-db2d8b333a95 
INFO  [21:32:53.095] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:53.109] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:53.116] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:53.143] [mlr3] Finished benchmark 
INFO  [21:32:53.183] [bbotk] Result of batch 12: 
INFO  [21:32:53.185] [bbotk]     cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:53.185] [bbotk]  0.001        3  0.3242188        0      0            0.017 
INFO  [21:32:53.185] [bbotk]                                 uhash 
INFO  [21:32:53.185] [bbotk]  bab74462-926b-4db8-a51f-7f29f66ccb01 
INFO  [21:32:53.187] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:53.209] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:53.217] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:53.242] [mlr3] Finished benchmark 
INFO  [21:32:53.280] [bbotk] Result of batch 13: 
INFO  [21:32:53.283] [bbotk]       cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:53.283] [bbotk]  0.07525       10  0.2890625        0      0            0.015 
INFO  [21:32:53.283] [bbotk]                                 uhash 
INFO  [21:32:53.283] [bbotk]  0bcabe5d-6904-4349-89ec-aa8074e6eb67 
INFO  [21:32:53.285] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:53.301] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:53.308] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:53.335] [mlr3] Finished benchmark 
INFO  [21:32:53.377] [bbotk] Result of batch 14: 
INFO  [21:32:53.379] [bbotk]      cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:53.379] [bbotk]  0.0505        8  0.2890625        0      0            0.016 
INFO  [21:32:53.379] [bbotk]                                 uhash 
INFO  [21:32:53.379] [bbotk]  6137897a-40d3-4d56-a851-fa609286611a 
INFO  [21:32:53.381] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:53.396] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:53.403] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:53.429] [mlr3] Finished benchmark 
INFO  [21:32:53.471] [bbotk] Result of batch 15: 
INFO  [21:32:53.474] [bbotk]   cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:53.474] [bbotk]  0.1        3  0.2890625        0      0            0.014 
INFO  [21:32:53.474] [bbotk]                                 uhash 
INFO  [21:32:53.474] [bbotk]  92d9f595-ca47-4333-a569-604ff8852f18 
INFO  [21:32:53.476] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:53.497] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:53.504] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:53.529] [mlr3] Finished benchmark 
INFO  [21:32:53.567] [bbotk] Result of batch 16: 
INFO  [21:32:53.575] [bbotk]       cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:53.575] [bbotk]  0.02575        3  0.2890625        0      0            0.015 
INFO  [21:32:53.575] [bbotk]                                 uhash 
INFO  [21:32:53.575] [bbotk]  25d32a32-8148-45ce-9f07-107c040b7299 
INFO  [21:32:53.577] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:53.593] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:53.601] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:53.636] [mlr3] Finished benchmark 
INFO  [21:32:53.674] [bbotk] Result of batch 17: 
INFO  [21:32:53.676] [bbotk]   cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:53.676] [bbotk]  0.1       10  0.2890625        0      0            0.016 
INFO  [21:32:53.676] [bbotk]                                 uhash 
INFO  [21:32:53.676] [bbotk]  fd0b0b11-cc38-430b-b6e6-ffe2a1899820 
INFO  [21:32:53.678] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:53.694] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:53.701] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:53.728] [mlr3] Finished benchmark 
INFO  [21:32:53.775] [bbotk] Result of batch 18: 
INFO  [21:32:53.777] [bbotk]       cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:53.777] [bbotk]  0.07525        3  0.2890625        0      0            0.015 
INFO  [21:32:53.777] [bbotk]                                 uhash 
INFO  [21:32:53.777] [bbotk]  a3ffe12d-b571-4dcd-9d79-6209cb61f8a0 
INFO  [21:32:53.779] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:53.794] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:53.802] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:53.826] [mlr3] Finished benchmark 
INFO  [21:32:53.863] [bbotk] Result of batch 19: 
INFO  [21:32:53.865] [bbotk]      cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:53.865] [bbotk]  0.0505        1  0.2890625        0      0            0.014 
INFO  [21:32:53.865] [bbotk]                                 uhash 
INFO  [21:32:53.865] [bbotk]  63f12400-4f97-4c99-9ecb-271c46adafce 
INFO  [21:32:53.867] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:53.882] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:53.896] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:53.925] [mlr3] Finished benchmark 
INFO  [21:32:53.964] [bbotk] Result of batch 20: 
INFO  [21:32:53.966] [bbotk]     cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:53.966] [bbotk]  0.001        1  0.3359375        0      0            0.017 
INFO  [21:32:53.966] [bbotk]                                 uhash 
INFO  [21:32:53.966] [bbotk]  8111cdfa-d50f-4b67-afb9-0d1747d1eba6 
INFO  [21:32:53.974] [bbotk] Finished optimizing after 20 evaluation(s) 
INFO  [21:32:53.975] [bbotk] Result: 
INFO  [21:32:53.977] [bbotk]     cp minsplit learner_param_vals  x_domain classif.ce 
INFO  [21:32:53.977] [bbotk]  0.001        5          <list[3]> <list[2]>    0.28125 
      cp minsplit learner_param_vals  x_domain classif.ce
1: 0.001        5          <list[3]> <list[2]>    0.28125
instance$result_learner_param_vals
$xval
[1] 0

$cp
[1] 0.001

$minsplit
[1] 5
instance$result_y
classif.ce 
   0.28125 
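
The loop that $optimize() ran can be mimicked in a few lines of base R. This is a toy sketch, not mlr3tuning's implementation: evaluate() is a hypothetical stand-in that returns a made-up error instead of fitting and resampling a learner, and the names archive, n_evals and batch_size only mirror the concepts described above.

```r
set.seed(1)  # only so that the shuffled order is reproducible

# Hypothetical stand-in for "fit the Learner and resample it":
# a made-up error surface, not a real model evaluation.
evaluate = function(cp, minsplit) (cp - 0.03)^2 + abs(minsplit - 5) / 100

# Candidate grid, shuffled so that it is visited in random order.
grid = expand.grid(cp = seq(0.001, 0.1, length.out = 5),
                   minsplit = c(1, 3, 5, 8, 10))
grid = grid[sample(nrow(grid)), ]

n_evals = 20     # budget, as in trm("evals", n_evals = 20)
batch_size = 1   # configurations proposed per iteration
archive = data.frame()

# Steps 1-3: propose a batch, evaluate and store it, then check the budget.
while (nrow(archive) < n_evals && nrow(grid) > 0) {
  batch = head(grid, batch_size)
  grid = tail(grid, -batch_size)
  batch$score = mapply(evaluate, batch$cp, batch$minsplit)
  archive = rbind(archive, batch)
}

# Steps 4-5: the best observed configuration becomes the result.
best = archive[which.min(archive$score), ]
```

With a budget of 20 and 25 candidates, the loop stops on the budget check, so five grid points are never evaluated.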

You can investigate all of the evaluations that were performed; they are stored in the archive of the TuningInstanceSingleCrit and can be accessed by using as.data.table():

as.data.table(instance$archive)
         cp minsplit classif.ce x_domain_cp x_domain_minsplit runtime_learners
 1: 0.05050        5  0.2890625     0.05050                 5            0.021
 2: 0.00100        5  0.2812500     0.00100                 5            0.015
 3: 0.02575        1  0.2890625     0.02575                 1            0.015
 4: 0.07525        5  0.2890625     0.07525                 5            0.015
 5: 0.05050       10  0.2890625     0.05050                10            0.013
 6: 0.05050        3  0.2890625     0.05050                 3            0.014
 7: 0.07525        1  0.2890625     0.07525                 1            0.014
 8: 0.10000        1  0.2890625     0.10000                 1            0.017
 9: 0.10000        8  0.2890625     0.10000                 8            0.014
10: 0.00100        8  0.2851562     0.00100                 8            0.016
11: 0.02575        5  0.2890625     0.02575                 5            0.015
12: 0.00100        3  0.3242188     0.00100                 3            0.017
13: 0.07525       10  0.2890625     0.07525                10            0.015
14: 0.05050        8  0.2890625     0.05050                 8            0.016
15: 0.10000        3  0.2890625     0.10000                 3            0.014
16: 0.02575        3  0.2890625     0.02575                 3            0.015
17: 0.10000       10  0.2890625     0.10000                10            0.016
18: 0.07525        3  0.2890625     0.07525                 3            0.015
19: 0.05050        1  0.2890625     0.05050                 1            0.014
20: 0.00100        1  0.3359375     0.00100                 1            0.017
              timestamp batch_nr warnings errors      resample_result
 1: 2022-06-29 21:32:52        1        0      0 <ResampleResult[22]>
 2: 2022-06-29 21:32:52        2        0      0 <ResampleResult[22]>
 3: 2022-06-29 21:32:52        3        0      0 <ResampleResult[22]>
 4: 2022-06-29 21:32:52        4        0      0 <ResampleResult[22]>
 5: 2022-06-29 21:32:52        5        0      0 <ResampleResult[22]>
 6: 2022-06-29 21:32:52        6        0      0 <ResampleResult[22]>
 7: 2022-06-29 21:32:52        7        0      0 <ResampleResult[22]>
 8: 2022-06-29 21:32:52        8        0      0 <ResampleResult[22]>
 9: 2022-06-29 21:32:52        9        0      0 <ResampleResult[22]>
10: 2022-06-29 21:32:52       10        0      0 <ResampleResult[22]>
11: 2022-06-29 21:32:53       11        0      0 <ResampleResult[22]>
12: 2022-06-29 21:32:53       12        0      0 <ResampleResult[22]>
13: 2022-06-29 21:32:53       13        0      0 <ResampleResult[22]>
14: 2022-06-29 21:32:53       14        0      0 <ResampleResult[22]>
15: 2022-06-29 21:32:53       15        0      0 <ResampleResult[22]>
16: 2022-06-29 21:32:53       16        0      0 <ResampleResult[22]>
17: 2022-06-29 21:32:53       17        0      0 <ResampleResult[22]>
18: 2022-06-29 21:32:53       18        0      0 <ResampleResult[22]>
19: 2022-06-29 21:32:53       19        0      0 <ResampleResult[22]>
20: 2022-06-29 21:32:53       20        0      0 <ResampleResult[22]>
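
The classif.ce column of this archive can be summarised with base R to see which batch scored best and how often each error value occurs. The values below are copied by hand from the table above, so this is a check on the printed output rather than a query against the live instance.

```r
# classif.ce values copied from the archive printed above, in batch order.
ce = c(0.2890625, 0.2812500, 0.2890625, 0.2890625, 0.2890625,
       0.2890625, 0.2890625, 0.2890625, 0.2890625, 0.2851562,
       0.2890625, 0.3242188, 0.2890625, 0.2890625, 0.2890625,
       0.2890625, 0.2890625, 0.2890625, 0.2890625, 0.3359375)

which.min(ce)       # 2: batch number of the best configuration
sum(ce == min(ce))  # 1: configurations that share the best error
table(ce)           # how often each error value occurs
```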

Altogether, the grid search evaluated 20 of the 25 hyperparameter configurations in a random order before the Terminator stopped the tuning. In this run, a single configuration (cp = 0.001, minsplit = 5) achieved the lowest classification error of 0.28125, while most of the others tied at 0.2890625. When several configurations share the best classification error, only one of them is returned unless further criteria are given. You may therefore want to select a configuration based on the classification error together with the time to train the model, or on some other combination of criteria. You can do this with TuningInstanceMultiCrit; see Tuning with Multiple Performance Measures.

The associated resampling iterations can be accessed in the BenchmarkResult of the tuning instance:

instance$archive$benchmark_result
<BenchmarkResult> of 20 rows with 20 resampling runs
 nr task_id    learner_id resampling_id iters warnings errors
  1    pima classif.rpart       holdout     1        0      0
  2    pima classif.rpart       holdout     1        0      0
  3    pima classif.rpart       holdout     1        0      0
  4    pima classif.rpart       holdout     1        0      0
  5    pima classif.rpart       holdout     1        0      0
  6    pima classif.rpart       holdout     1        0      0
  7    pima classif.rpart       holdout     1        0      0
  8    pima classif.rpart       holdout     1        0      0
  9    pima classif.rpart       holdout     1        0      0
 10    pima classif.rpart       holdout     1        0      0
 11    pima classif.rpart       holdout     1        0      0
 12    pima classif.rpart       holdout     1        0      0
 13    pima classif.rpart       holdout     1        0      0
 14    pima classif.rpart       holdout     1        0      0
 15    pima classif.rpart       holdout     1        0      0
 16    pima classif.rpart       holdout     1        0      0
 17    pima classif.rpart       holdout     1        0      0
 18    pima classif.rpart       holdout     1        0      0
 19    pima classif.rpart       holdout     1        0      0
 20    pima classif.rpart       holdout     1        0      0

The uhash column links the resampling iterations to the evaluated configurations stored in instance$archive$data. This makes it possible, for example, to score the included ResampleResults on a different performance measure.

instance$archive$benchmark_result$score(msr("classif.acc"))
                                   uhash nr              task task_id
 1: 4968d354-691a-4157-a9a8-612e5710fcdb  1 <TaskClassif[50]>    pima
 2: 51d4d469-fad5-469d-ac58-48e2410a4664  2 <TaskClassif[50]>    pima
 3: 375d3998-5a28-46c2-8577-52e904263a35  3 <TaskClassif[50]>    pima
 4: 0b966ad1-450f-4bd9-ba69-9d76e0517657  4 <TaskClassif[50]>    pima
 5: ecbab2d5-03a5-43fa-900e-8f3ee5fc466d  5 <TaskClassif[50]>    pima
 6: 85eac96a-121e-4ac4-85f6-384ace82301b  6 <TaskClassif[50]>    pima
 7: 260ba49d-e295-4a92-a133-cf557696a7f2  7 <TaskClassif[50]>    pima
 8: 69fbaded-f0bb-4b49-b76a-b98c21ade33b  8 <TaskClassif[50]>    pima
 9: e5ede0d6-1a33-41f6-93ca-080012a3b1be  9 <TaskClassif[50]>    pima
10: 5dbacc61-4495-4f53-a074-22102282cd9c 10 <TaskClassif[50]>    pima
11: b53504cc-fd76-4764-ba7c-db2d8b333a95 11 <TaskClassif[50]>    pima
12: bab74462-926b-4db8-a51f-7f29f66ccb01 12 <TaskClassif[50]>    pima
13: 0bcabe5d-6904-4349-89ec-aa8074e6eb67 13 <TaskClassif[50]>    pima
14: 6137897a-40d3-4d56-a851-fa609286611a 14 <TaskClassif[50]>    pima
15: 92d9f595-ca47-4333-a569-604ff8852f18 15 <TaskClassif[50]>    pima
16: 25d32a32-8148-45ce-9f07-107c040b7299 16 <TaskClassif[50]>    pima
17: fd0b0b11-cc38-430b-b6e6-ffe2a1899820 17 <TaskClassif[50]>    pima
18: a3ffe12d-b571-4dcd-9d79-6209cb61f8a0 18 <TaskClassif[50]>    pima
19: 63f12400-4f97-4c99-9ecb-271c46adafce 19 <TaskClassif[50]>    pima
20: 8111cdfa-d50f-4b67-afb9-0d1747d1eba6 20 <TaskClassif[50]>    pima
                      learner    learner_id              resampling
 1: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
 2: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
 3: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
 4: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
 5: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
 6: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
 7: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
 8: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
 9: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
10: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
11: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
12: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
13: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
14: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
15: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
16: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
17: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
18: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
19: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
20: <LearnerClassifRpart[38]> classif.rpart <ResamplingHoldout[20]>
    resampling_id iteration              prediction classif.acc
 1:       holdout         1 <PredictionClassif[20]>   0.7109375
 2:       holdout         1 <PredictionClassif[20]>   0.7187500
 3:       holdout         1 <PredictionClassif[20]>   0.7109375
 4:       holdout         1 <PredictionClassif[20]>   0.7109375
 5:       holdout         1 <PredictionClassif[20]>   0.7109375
 6:       holdout         1 <PredictionClassif[20]>   0.7109375
 7:       holdout         1 <PredictionClassif[20]>   0.7109375
 8:       holdout         1 <PredictionClassif[20]>   0.7109375
 9:       holdout         1 <PredictionClassif[20]>   0.7109375
10:       holdout         1 <PredictionClassif[20]>   0.7148438
11:       holdout         1 <PredictionClassif[20]>   0.7109375
12:       holdout         1 <PredictionClassif[20]>   0.6757812
13:       holdout         1 <PredictionClassif[20]>   0.7109375
14:       holdout         1 <PredictionClassif[20]>   0.7109375
15:       holdout         1 <PredictionClassif[20]>   0.7109375
16:       holdout         1 <PredictionClassif[20]>   0.7109375
17:       holdout         1 <PredictionClassif[20]>   0.7109375
18:       holdout         1 <PredictionClassif[20]>   0.7109375
19:       holdout         1 <PredictionClassif[20]>   0.7109375
20:       holdout         1 <PredictionClassif[20]>   0.6640625

Now we can take the optimized hyperparameters, set them for the previously-created Learner, and train it on the full dataset.

learner$param_set$values = instance$result_learner_param_vals
learner$train(task)

The trained model can now be used to make a prediction on new, external data. Note that predicting on observations present in the Task should be avoided because the model has seen these observations already during tuning and training and therefore performance values would be statistically biased – the resulting performance measure would be over-optimistic. To get statistically unbiased performance estimates for a given task, nested resampling is required.
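
A minimal sketch of nested resampling is given below. It relies on the AutoTuner class of mlr3tuning, which is not introduced in this section, and the 3-fold outer cross-validation is an arbitrary choice for illustration. All arguments are passed by name because their order differs between package versions.

```r
library("mlr3verse")

task = tsk("pima")

# The AutoTuner wraps the learner together with the complete tuning setup.
at = AutoTuner$new(
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),  # inner resampling, used for tuning
  measure = msr("classif.ce"),
  search_space = ps(
    cp = p_dbl(lower = 0.001, upper = 0.1),
    minsplit = p_int(lower = 1, upper = 10)
  ),
  terminator = trm("evals", n_evals = 20),
  tuner = tnr("grid_search", resolution = 5)
)

# Outer resampling: each fold tunes on its training part only and is
# scored on observations that the tuning never saw.
rr = resample(task, at, rsmp("cv", folds = 3))
rr$aggregate(msr("classif.ce"))
```

The aggregated error estimates the performance of the whole procedure (tuning plus training), not of one fixed hyperparameter configuration.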

9.4 Tuning with Multiple Performance Measures

When tuning, you might want to use multiple criteria to find the best configuration of hyperparameters. For example, you might want the configuration with the lowest classification error and lowest time to train the model. The full list of performance measures is available in the mlr_measures dictionary of mlr3.

Continuing the above example and tuning the same hyperparameters:

  • The complexity hyperparameter cp that controls when the learner considers introducing another branch.
  • The minsplit hyperparameter that controls how many observations must be present in a node for another split to be attempted.

The tuning process is identical to the previous example. This time, however, we specify two performance measures: classification error and time to train the model (time_train).

measures = msrs(c("classif.ce", "time_train"))

Instead of creating a new TuningInstanceSingleCrit with a single measure, we create a new TuningInstanceMultiCrit with the two measures we are interested in here. Otherwise, it is the same as above.

library("mlr3tuning")

evals20 = trm("evals", n_evals = 20)

instance = TuningInstanceMultiCrit$new(
  task = task,
  learner = learner,
  resampling = hout,
  measures = measures,
  search_space = search_space,
  terminator = evals20
)
instance
<TuningInstanceMultiCrit>
* State:  Not optimized
* Objective: <ObjectiveTuning:classif.rpart_on_pima>
* Search Space:
         id    class lower upper nlevels
1:       cp ParamDbl 0.001   0.1     Inf
2: minsplit ParamInt 1.000  10.0      10
* Terminator: <TerminatorEvals>

After triggering the tuning, the instance holds the configurations that offer the best trade-offs between classification error and time to train the model (the Pareto-optimal configurations).
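
With two measures there is usually no single best configuration. A configuration is kept if no other one is at least as good on both measures and strictly better on at least one. The base R sketch below illustrates this rule; the scores and the helper is_dominated() are hypothetical and are not taken from this run or from mlr3tuning's implementation.

```r
# Hypothetical scores for four configurations; both measures are minimized.
scores = cbind(
  classif.ce = c(0.28, 0.29, 0.28, 0.33),
  time_train = c(0.020, 0.010, 0.030, 0.015)
)

# Row i is dominated if some other row j is at least as good on every
# measure and strictly better on at least one.
is_dominated = function(i, s) {
  any(vapply(seq_len(nrow(s)), function(j) {
    j != i && all(s[j, ] <= s[i, ]) && any(s[j, ] < s[i, ])
  }, logical(1)))
}

dominated = vapply(seq_len(nrow(scores)), is_dominated, logical(1), s = scores)

# The non-dominated rows form the Pareto set: here rows 1 and 2.
pareto = scores[!dominated, , drop = FALSE]
```

Row 3 is dominated by row 1 (same error, slower training) and row 4 by row 2 (worse on both measures), so two configurations remain.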

tuner$optimize(instance)
INFO  [21:32:54.473] [bbotk] Starting to optimize 2 parameter(s) with '<TunerGridSearch>' and '<TerminatorEvals> [n_evals=20, k=0]' 
INFO  [21:32:54.477] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:54.493] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:54.501] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:54.525] [mlr3] Finished benchmark 
INFO  [21:32:54.573] [bbotk] Result of batch 1: 
INFO  [21:32:54.575] [bbotk]       cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:54.575] [bbotk]  0.07525        1  0.2773438          0        0      0            0.013 
INFO  [21:32:54.575] [bbotk]                                 uhash 
INFO  [21:32:54.575] [bbotk]  105b7c7f-b9f6-4e7f-8ee3-444ab62581c2 
INFO  [21:32:54.577] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:54.593] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:54.600] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:54.624] [mlr3] Finished benchmark 
INFO  [21:32:54.675] [bbotk] Result of batch 2: 
INFO  [21:32:54.677] [bbotk]      cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:54.677] [bbotk]  0.0505        1  0.2773438          0        0      0            0.014 
INFO  [21:32:54.677] [bbotk]                                 uhash 
INFO  [21:32:54.677] [bbotk]  b77d5bfb-6272-47aa-8fd9-bb18ed136f99 
INFO  [21:32:54.679] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:54.695] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:54.702] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:54.737] [mlr3] Finished benchmark 
INFO  [21:32:54.805] [bbotk] Result of batch 3: 
INFO  [21:32:54.807] [bbotk]      cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:54.807] [bbotk]  0.0505       10  0.2773438          0        0      0            0.025 
INFO  [21:32:54.807] [bbotk]                                 uhash 
INFO  [21:32:54.807] [bbotk]  a6d1470a-7ddc-46d0-ba25-cdfff210fbfa 
INFO  [21:32:54.809] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:54.827] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:54.834] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:54.859] [mlr3] Finished benchmark 
INFO  [21:32:54.907] [bbotk] Result of batch 4: 
INFO  [21:32:54.909] [bbotk]       cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:54.909] [bbotk]  0.07525        5  0.2773438          0        0      0            0.014 
INFO  [21:32:54.909] [bbotk]                                 uhash 
INFO  [21:32:54.909] [bbotk]  52b1f8ce-18d8-4760-a2b0-56dbd38b6748 
INFO  [21:32:54.911] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:54.935] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:54.944] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:54.971] [mlr3] Finished benchmark 
INFO  [21:32:55.019] [bbotk] Result of batch 5: 
INFO  [21:32:55.021] [bbotk]   cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:55.021] [bbotk]  0.1       10  0.2773438          0        0      0            0.016 
INFO  [21:32:55.021] [bbotk]                                 uhash 
INFO  [21:32:55.021] [bbotk]  de1c69ee-c235-4852-ba1d-3044652dff25 
INFO  [21:32:55.023] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:55.041] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:55.048] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:55.083] [mlr3] Finished benchmark 
INFO  [21:32:55.130] [bbotk] Result of batch 6: 
INFO  [21:32:55.132] [bbotk]      cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:55.132] [bbotk]  0.0505        8  0.2773438          0        0      0            0.023 
INFO  [21:32:55.132] [bbotk]                                 uhash 
INFO  [21:32:55.132] [bbotk]  cadaec8f-eda6-4501-a929-7758359ec268 
INFO  [21:32:55.134] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:55.150] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:55.157] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:55.181] [mlr3] Finished benchmark 
INFO  [21:32:55.236] [bbotk] Result of batch 7: 
INFO  [21:32:55.238] [bbotk]   cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:55.238] [bbotk]  0.1        3  0.2773438          0        0      0            0.014 
INFO  [21:32:55.238] [bbotk]                                 uhash 
INFO  [21:32:55.238] [bbotk]  d05cb8b6-53bb-4c7c-a5f4-0091139e9d63 
INFO  [21:32:55.240] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:55.256] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:55.264] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:55.290] [mlr3] Finished benchmark 
INFO  [21:32:55.341] [bbotk] Result of batch 8: 
INFO  [21:32:55.349] [bbotk]       cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:55.349] [bbotk]  0.02575        1  0.2617188          0        0      0            0.016 
INFO  [21:32:55.349] [bbotk]                                 uhash 
INFO  [21:32:55.349] [bbotk]  6c612ea9-8851-4ff0-a9db-33e889c267b8 
INFO  [21:32:55.351] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:55.372] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:55.380] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:55.405] [mlr3] Finished benchmark 
INFO  [21:32:55.452] [bbotk] Result of batch 9: 
INFO  [21:32:55.454] [bbotk]       cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:55.454] [bbotk]  0.02575       10  0.2617188          0        0      0            0.015 
INFO  [21:32:55.454] [bbotk]                                 uhash 
INFO  [21:32:55.454] [bbotk]  8ead2337-1a7f-442e-bbfa-a0de00e664e9 
INFO  [21:32:55.456] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:55.479] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:55.487] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:55.515] [mlr3] Finished benchmark 
INFO  [21:32:55.561] [bbotk] Result of batch 10: 
INFO  [21:32:55.563] [bbotk]     cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:55.563] [bbotk]  0.001       10  0.2695312          0        0      0            0.017 
INFO  [21:32:55.563] [bbotk]                                 uhash 
INFO  [21:32:55.563] [bbotk]  84e0e9be-66ea-4149-b267-fb620b7320f5 
INFO  [21:32:55.565] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:55.581] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:55.588] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:55.620] [mlr3] Finished benchmark 
INFO  [21:32:55.671] [bbotk] Result of batch 11: 
INFO  [21:32:55.673] [bbotk]     cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:55.673] [bbotk]  0.001        8  0.2695312          0        0      0            0.021 
INFO  [21:32:55.673] [bbotk]                                 uhash 
INFO  [21:32:55.673] [bbotk]  27c0a032-8981-4420-b035-f0e285bf48ec 
INFO  [21:32:55.675] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:55.692] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:55.699] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:55.723] [mlr3] Finished benchmark 
INFO  [21:32:55.778] [bbotk] Result of batch 12: 
INFO  [21:32:55.780] [bbotk]      cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:55.780] [bbotk]  0.0505        3  0.2773438          0        0      0            0.014 
INFO  [21:32:55.780] [bbotk]                                 uhash 
INFO  [21:32:55.780] [bbotk]  2573317d-76fd-4a5a-8220-89a42d2b8233 
INFO  [21:32:55.782] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:55.798] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:55.805] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:55.830] [mlr3] Finished benchmark 
INFO  [21:32:55.883] [bbotk] Result of batch 13: 
INFO  [21:32:55.886] [bbotk]      cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:55.886] [bbotk]  0.0505        5  0.2773438          0        0      0            0.015 
INFO  [21:32:55.886] [bbotk]                                 uhash 
INFO  [21:32:55.886] [bbotk]  daf12ef1-ba93-46b2-a0f8-1e1905b1c66f 
INFO  [21:32:55.889] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:55.908] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:55.915] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:55.940] [mlr3] Finished benchmark 
INFO  [21:32:55.987] [bbotk] Result of batch 14: 
INFO  [21:32:55.989] [bbotk]       cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:55.989] [bbotk]  0.02575        3  0.2617188          0        0      0            0.015 
INFO  [21:32:55.989] [bbotk]                                 uhash 
INFO  [21:32:55.989] [bbotk]  22555d47-7b42-4a01-a039-90b1e45db940 
INFO  [21:32:55.991] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:56.016] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:56.025] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:56.051] [mlr3] Finished benchmark 
INFO  [21:32:56.104] [bbotk] Result of batch 15: 
INFO  [21:32:56.107] [bbotk]       cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:56.107] [bbotk]  0.07525        8  0.2773438          0        0      0            0.014 
INFO  [21:32:56.107] [bbotk]                                 uhash 
INFO  [21:32:56.107] [bbotk]  96a7f17b-b34c-43f0-ade6-61acd699e844 
INFO  [21:32:56.109] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:56.127] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:56.135] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:56.173] [mlr3] Finished benchmark 
INFO  [21:32:56.226] [bbotk] Result of batch 16: 
INFO  [21:32:56.229] [bbotk]       cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:56.229] [bbotk]  0.02575        8  0.2617188          0        0      0            0.025 
INFO  [21:32:56.229] [bbotk]                                 uhash 
INFO  [21:32:56.229] [bbotk]  bcbaef93-ffb3-4b9b-ae12-4f319df65f08 
INFO  [21:32:56.231] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:56.249] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:56.256] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:56.281] [mlr3] Finished benchmark 
INFO  [21:32:56.336] [bbotk] Result of batch 17: 
INFO  [21:32:56.338] [bbotk]       cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:56.338] [bbotk]  0.07525       10  0.2773438          0        0      0            0.015 
INFO  [21:32:56.338] [bbotk]                                 uhash 
INFO  [21:32:56.338] [bbotk]  944b7c67-9786-4de9-a7f0-07ca837b69b1 
INFO  [21:32:56.340] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:56.356] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:56.363] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:56.388] [mlr3] Finished benchmark 
INFO  [21:32:56.448] [bbotk] Result of batch 18: 
INFO  [21:32:56.450] [bbotk]       cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:56.450] [bbotk]  0.02575        5  0.2617188          0        0      0            0.015 
INFO  [21:32:56.450] [bbotk]                                 uhash 
INFO  [21:32:56.450] [bbotk]  9d711228-a324-4d0a-839d-47663ddfae4c 
INFO  [21:32:56.452] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:56.468] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:56.476] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:56.501] [mlr3] Finished benchmark 
INFO  [21:32:56.556] [bbotk] Result of batch 19: 
INFO  [21:32:56.559] [bbotk]   cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:56.559] [bbotk]  0.1        1  0.2773438          0        0      0            0.015 
INFO  [21:32:56.559] [bbotk]                                 uhash 
INFO  [21:32:56.559] [bbotk]  d5d92baf-f089-45a3-9480-a4aa018d15fb 
INFO  [21:32:56.561] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:56.582] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:56.590] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:56.616] [mlr3] Finished benchmark 
INFO  [21:32:56.664] [bbotk] Result of batch 20: 
INFO  [21:32:56.666] [bbotk]   cp minsplit classif.ce time_train warnings errors runtime_learners 
INFO  [21:32:56.666] [bbotk]  0.1        8  0.2773438          0        0      0            0.016 
INFO  [21:32:56.666] [bbotk]                                 uhash 
INFO  [21:32:56.666] [bbotk]  49b8be52-c6a2-49a2-8f58-7f99ea9c16a9 
INFO  [21:32:56.675] [bbotk] Finished optimizing after 20 evaluation(s) 
INFO  [21:32:56.676] [bbotk] Result: 
INFO  [21:32:56.684] [bbotk]       cp minsplit learner_param_vals  x_domain classif.ce time_train 
INFO  [21:32:56.684] [bbotk]  0.02575        1          <list[3]> <list[2]>  0.2617188          0 
INFO  [21:32:56.684] [bbotk]  0.02575       10          <list[3]> <list[2]>  0.2617188          0 
INFO  [21:32:56.684] [bbotk]  0.02575        3          <list[3]> <list[2]>  0.2617188          0 
INFO  [21:32:56.684] [bbotk]  0.02575        8          <list[3]> <list[2]>  0.2617188          0 
INFO  [21:32:56.684] [bbotk]  0.02575        5          <list[3]> <list[2]>  0.2617188          0 
        cp minsplit learner_param_vals  x_domain classif.ce time_train
1: 0.02575        1          <list[3]> <list[2]>  0.2617188          0
2: 0.02575       10          <list[3]> <list[2]>  0.2617188          0
3: 0.02575        3          <list[3]> <list[2]>  0.2617188          0
4: 0.02575        8          <list[3]> <list[2]>  0.2617188          0
5: 0.02575        5          <list[3]> <list[2]>  0.2617188          0
instance$result_learner_param_vals
[[1]]
[[1]]$xval
[1] 0

[[1]]$cp
[1] 0.02575

[[1]]$minsplit
[1] 1


[[2]]
[[2]]$xval
[1] 0

[[2]]$cp
[1] 0.02575

[[2]]$minsplit
[1] 10


[[3]]
[[3]]$xval
[1] 0

[[3]]$cp
[1] 0.02575

[[3]]$minsplit
[1] 3


[[4]]
[[4]]$xval
[1] 0

[[4]]$cp
[1] 0.02575

[[4]]$minsplit
[1] 8


[[5]]
[[5]]$xval
[1] 0

[[5]]$cp
[1] 0.02575

[[5]]$minsplit
[1] 5
instance$result_y
   classif.ce time_train
1:  0.2617188          0
2:  0.2617188          0
3:  0.2617188          0
4:  0.2617188          0
5:  0.2617188          0
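
To illustrate why several configurations can be returned at once, here is a minimal base R sketch of Pareto dominance when all measures are minimized. The helper `pareto_front()` and the objective values are hypothetical and only serve the illustration; this is not how mlr3tuning or bbotk implement the selection internally.

```r
# Return the rows of `y` that are not dominated by any other row,
# assuming every column is to be minimized.
pareto_front = function(y) {
  y = as.matrix(y)
  n = nrow(y)
  dominated = vapply(seq_len(n), function(i) {
    any(vapply(seq_len(n), function(j) {
      # Row j dominates row i if it is no worse everywhere
      # and strictly better in at least one column
      j != i && all(y[j, ] <= y[i, ]) && any(y[j, ] < y[i, ])
    }, logical(1)))
  }, logical(1))
  y[!dominated, , drop = FALSE]
}

# Hypothetical objective values (not taken from the run above)
y = rbind(
  a = c(classif.ce = 0.26, time_train = 0.03),
  b = c(classif.ce = 0.28, time_train = 0.01),
  c = c(classif.ce = 0.28, time_train = 0.03),
  d = c(classif.ce = 0.26, time_train = 0.03)
)

front = pareto_front(y)
print(front)
```

In this sketch, row c is dropped because row a is better on classification error and equal on training time. Rows a and d have identical values, so neither dominates the other and both are kept, which mirrors the tied configurations in the result above.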

9.5 Automating the Tuning

We can automate this entire process in mlr3 so that learners are tuned transparently, without the need to extract the best hyperparameter settings manually at the end. The AutoTuner wraps a learner and augments it with an automatic tuning process for a given set of hyperparameters. Because the AutoTuner itself inherits from the Learner base class, it can be used like any other learner. In keeping with our example above, we create a classification tree learner that tunes itself automatically: it tunes the hyperparameters cp and minsplit using an inner holdout resampling. We create a terminator that allows 10 evaluations and use a simple random search as the tuning algorithm:

learner = lrn("classif.rpart")
search_space = ps(
  cp = p_dbl(lower = 0.001, upper = 0.1),
  minsplit = p_int(lower = 1, upper = 10)
)
terminator = trm("evals", n_evals = 10)
tuner = tnr("random_search")

at = AutoTuner$new(
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  search_space = search_space,
  terminator = terminator,
  tuner = tuner
)
at
<AutoTuner:classif.rpart.tuned>
* Model: -
* Search Space:
<ParamSet>
         id    class lower upper nlevels        default value
1:       cp ParamDbl 0.001   0.1     Inf <NoDefault[3]>      
2: minsplit ParamInt 1.000  10.0      10 <NoDefault[3]>      
* Packages: mlr3, mlr3tuning, rpart
* Predict Type: response
* Feature Types: logical, integer, numeric, factor, ordered
* Properties: importance, missings, multiclass, selected_features,
  twoclass, weights

We can now use the learner like any other learner, calling the $train() and $predict() methods. The difference from a normal learner is that $train() runs the tuning, which takes longer than a normal training process.

at$train(task)
INFO  [21:32:56.874] [bbotk] Starting to optimize 2 parameter(s) with '<OptimizerRandomSearch>' and '<TerminatorEvals> [n_evals=10, k=0]' 
INFO  [21:32:56.900] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:56.918] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:56.925] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:56.949] [mlr3] Finished benchmark 
INFO  [21:32:56.983] [bbotk] Result of batch 1: 
INFO  [21:32:56.985] [bbotk]         cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:56.985] [bbotk]  0.0926891        3  0.2539062        0      0            0.014 
INFO  [21:32:56.985] [bbotk]                                 uhash 
INFO  [21:32:56.985] [bbotk]  d3d95421-468b-4060-b0e7-bf81536c5e0d 
INFO  [21:32:56.990] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:57.012] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:57.020] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:57.045] [mlr3] Finished benchmark 
INFO  [21:32:57.082] [bbotk] Result of batch 2: 
INFO  [21:32:57.084] [bbotk]          cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:57.084] [bbotk]  0.09506428        2  0.2539062        0      0            0.014 
INFO  [21:32:57.084] [bbotk]                                 uhash 
INFO  [21:32:57.084] [bbotk]  da639aef-a959-468a-8dd6-d66ba175dc08 
INFO  [21:32:57.089] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:57.104] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:57.112] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:57.145] [mlr3] Finished benchmark 
INFO  [21:32:57.181] [bbotk] Result of batch 3: 
INFO  [21:32:57.183] [bbotk]          cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:57.183] [bbotk]  0.06826063        3  0.2539062        0      0            0.022 
INFO  [21:32:57.183] [bbotk]                                 uhash 
INFO  [21:32:57.183] [bbotk]  e36c5f1f-1acd-49b7-89da-555bb70d9734 
INFO  [21:32:57.188] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:57.203] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:57.210] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:57.250] [mlr3] Finished benchmark 
INFO  [21:32:57.288] [bbotk] Result of batch 4: 
INFO  [21:32:57.291] [bbotk]          cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:57.291] [bbotk]  0.02106238        4  0.2460938        0      0             0.03 
INFO  [21:32:57.291] [bbotk]                                 uhash 
INFO  [21:32:57.291] [bbotk]  d81d012d-2caa-487f-b24b-ecd5ad14db57 
INFO  [21:32:57.296] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:57.310] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:57.317] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:57.341] [mlr3] Finished benchmark 
INFO  [21:32:57.386] [bbotk] Result of batch 5: 
INFO  [21:32:57.388] [bbotk]          cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:57.388] [bbotk]  0.07210373        1  0.2539062        0      0            0.013 
INFO  [21:32:57.388] [bbotk]                                 uhash 
INFO  [21:32:57.388] [bbotk]  aca2c2fb-5a32-491b-9bef-b9714f53079a 
INFO  [21:32:57.394] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:57.410] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:57.417] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:57.441] [mlr3] Finished benchmark 
INFO  [21:32:57.478] [bbotk] Result of batch 6: 
INFO  [21:32:57.480] [bbotk]          cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:57.480] [bbotk]  0.03204024        4  0.2578125        0      0            0.014 
INFO  [21:32:57.480] [bbotk]                                 uhash 
INFO  [21:32:57.480] [bbotk]  d5c4cb4f-b375-4b42-9225-13dd305c1d3a 
INFO  [21:32:57.484] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:57.507] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:57.515] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:57.540] [mlr3] Finished benchmark 
INFO  [21:32:57.576] [bbotk] Result of batch 7: 
INFO  [21:32:57.578] [bbotk]          cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:57.578] [bbotk]  0.03397072        4  0.2539062        0      0            0.015 
INFO  [21:32:57.578] [bbotk]                                 uhash 
INFO  [21:32:57.578] [bbotk]  7ae77207-bb59-4bde-a8a2-74f070e24280 
INFO  [21:32:57.583] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:57.597] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:57.605] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:57.788] [mlr3] Finished benchmark 
INFO  [21:32:57.823] [bbotk] Result of batch 8: 
INFO  [21:32:57.825] [bbotk]          cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:57.825] [bbotk]  0.08755967        9  0.2539062        0      0            0.174 
INFO  [21:32:57.825] [bbotk]                                 uhash 
INFO  [21:32:57.825] [bbotk]  38ea51ca-70f1-4960-ac04-98db3342a661 
INFO  [21:32:57.830] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:57.843] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:57.850] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:57.873] [mlr3] Finished benchmark 
INFO  [21:32:57.909] [bbotk] Result of batch 9: 
INFO  [21:32:57.912] [bbotk]          cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:57.912] [bbotk]  0.02264868        5  0.2460938        0      0            0.013 
INFO  [21:32:57.912] [bbotk]                                 uhash 
INFO  [21:32:57.912] [bbotk]  c8257794-7a55-45e0-bc79-f24408932b50 
INFO  [21:32:57.916] [bbotk] Evaluating 1 configuration(s) 
INFO  [21:32:57.932] [mlr3] Running benchmark with 1 resampling iterations 
INFO  [21:32:57.940] [mlr3] Applying learner 'classif.rpart' on task 'pima' (iter 1/1) 
INFO  [21:32:57.966] [mlr3] Finished benchmark 
INFO  [21:32:58.003] [bbotk] Result of batch 10: 
INFO  [21:32:58.005] [bbotk]          cp minsplit classif.ce warnings errors runtime_learners 
INFO  [21:32:58.005] [bbotk]  0.04781472        8  0.2539062        0      0            0.015 
INFO  [21:32:58.005] [bbotk]                                 uhash 
INFO  [21:32:58.005] [bbotk]  60fcff48-25e6-4329-8766-4f66f625836d 
INFO  [21:32:58.015] [bbotk] Finished optimizing after 10 evaluation(s) 
INFO  [21:32:58.016] [bbotk] Result: 
INFO  [21:32:58.017] [bbotk]          cp minsplit learner_param_vals  x_domain classif.ce 
INFO  [21:32:58.017] [bbotk]  0.02106238        4          <list[3]> <list[2]>  0.2460938 
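
The following sketch shows how a trained AutoTuner might be inspected and then used for prediction. It rebuilds the AutoTuner from above so that the block is self-contained. The train/test split, the seed, and the lowered logging thresholds are assumptions added for the illustration, and the selected configuration will vary from run to run because random search is used.

```r
library("mlr3verse")

# Reduce console output from the tuning loop (optional)
lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")

task = tsk("pima")

at = AutoTuner$new(
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  search_space = ps(
    cp = p_dbl(lower = 0.001, upper = 0.1),
    minsplit = p_int(lower = 1, upper = 10)
  ),
  terminator = trm("evals", n_evals = 10),
  tuner = tnr("random_search")
)

# Keep 100 rows out of tuning and training (assumed split)
set.seed(1)
test_ids = sample(task$row_ids, 100)
train_ids = setdiff(task$row_ids, test_ids)

# Tuning happens inside $train(); the final model is fit on train_ids
at$train(task, row_ids = train_ids)

# Best configuration found by the inner tuning loop
print(at$tuning_result)

# The wrapped learner now carries the selected hyperparameter values
print(at$learner$param_set$values)

# Predict on rows that took no part in tuning or training
prediction = at$predict(task, row_ids = test_ids)
print(prediction$score(msr("classif.ce")))
```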

We can also pass it to resample() and benchmark(), just like any other learner. This would result in a nested resampling.
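
A minimal sketch of such a nested resampling is given below, assuming a 3-fold cross-validation as the outer resampling. The outer resampling choice, the seed, and the logging thresholds are assumptions made for this illustration. The inner holdout used by the AutoTuner is the same as above.

```r
library("mlr3verse")

# Reduce console output from the inner tuning loops (optional)
lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")

task = tsk("pima")

at = AutoTuner$new(
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),          # inner resampling, used for tuning
  measure = msr("classif.ce"),
  search_space = ps(
    cp = p_dbl(lower = 0.001, upper = 0.1),
    minsplit = p_int(lower = 1, upper = 10)
  ),
  terminator = trm("evals", n_evals = 10),
  tuner = tnr("random_search")
)

# Outer resampling (assumed: 3-fold CV), used for performance estimation
set.seed(1)
outer = rsmp("cv", folds = 3)
rr = resample(task, at, outer)

# Classification error aggregated over the outer test sets
score = rr$aggregate(msr("classif.ce"))
print(score)
```

Each outer fold runs a complete tuning on its training portion, so the total cost is roughly the number of outer folds times the cost of a single tuning run.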