6.1 Design Creation

In mlr3 we require you to supply a “design” of your benchmark experiment. By “design” we essentially mean a table of settings you want to execute, where each row consists of a Task, a Learner and a Resampling. Additionally, you can supply different Measures alongside.

Here, we call benchmark to perform a single holdout split on a single task and two learners:

design = data.table(
  task = mlr_tasks$mget("iris"),
  learner = mlr_learners$mget(c("classif.rpart", "classif.featureless")),
  resampling = mlr_resamplings$mget("holdout")
)
print(design)
##             task                     learner          resampling
## 1: <TaskClassif>       <LearnerClassifRpart> <ResamplingHoldout>
## 2: <TaskClassif> <LearnerClassifFeatureless> <ResamplingHoldout>
bmr = benchmark(design)

Note that the holdout splits have been automatically instantiated for each row of the design. As a result, the rpart learner was trained on a different training set than the featureless learner. However, for a fair comparison of learners you usually want both learners to see exactly the same splits into train and test sets. To overcome this issue, the resampling strategy needs to be manually instantiated before creating the design.
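As a sketch of this manual instantiation (assuming the instantiate() method of Resampling objects; details may differ between mlr3 versions), the split can be fixed on the task once and then reused in every row of the design:

```r
# instantiate the holdout split once on the task
task = mlr_tasks$get("iris")
resampling = mlr_resamplings$get("holdout")
resampling$instantiate(task)

# both rows now reference the same instantiated resampling,
# so both learners are trained and tested on identical splits
design = data.table(
  task = list(task, task),
  learner = mlr_learners$mget(c("classif.rpart", "classif.featureless")),
  resampling = list(resampling, resampling)
)
bmr = benchmark(design)
```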

While the interface of benchmark() allows full flexibility, the creation of such design tables can be tedious. Therefore, mlr3 provides a convenience function to quickly generate design tables and instantiate resampling strategies in an exhaustive grid fashion: expand_grid().

# get some example tasks
tasks = mlr_tasks$mget(c("pima", "sonar", "spam"))

# set measures for all tasks: accuracy (acc) and area under the curve (auc)
measures = mlr_measures$mget(c("classif.acc", "classif.auc"))
tasks = lapply(tasks, function(task) { task$measures = measures; task })

# get a featureless learner and a classification tree
learners = mlr_learners$mget(c("classif.featureless", "classif.rpart"))

# let the learners predict probabilities instead of class labels
learners$classif.featureless$predict_type = "prob"
learners$classif.rpart$predict_type = "prob"

# compare via 10-fold cross validation
resamplings = mlr_resamplings$mget("cv")

# create the design table (cross product of tasks, learners and resamplings)
design = expand_grid(tasks, learners, resamplings)
print(design)
##             task                     learner     resampling
## 1: <TaskClassif> <LearnerClassifFeatureless> <ResamplingCV>
## 2: <TaskClassif>       <LearnerClassifRpart> <ResamplingCV>
## 3: <TaskClassif> <LearnerClassifFeatureless> <ResamplingCV>
## 4: <TaskClassif>       <LearnerClassifRpart> <ResamplingCV>
## 5: <TaskClassif> <LearnerClassifFeatureless> <ResamplingCV>
## 6: <TaskClassif>       <LearnerClassifRpart> <ResamplingCV>
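With the design in place, the experiment is executed exactly as in the first example, by passing the design to benchmark(). The measures attached to the tasks above are then used to score the results (a sketch; result accessors may differ between mlr3 versions):

```r
# execute all 6 resampling experiments defined by the design:
# 3 tasks x 2 learners, each with 10-fold cross-validation
bmr = benchmark(design)
```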