6.2 Execution and Aggregation of Results

After the benchmark design is ready, we can directly call benchmark():

# execute the benchmark
bmr = benchmark(design)

Note that in the code example above, we used mlr_resamplings$mget() to instantiate the resampling instance for each Task.

After the benchmark, we can access the aggregated results with .$aggregated():

bmr$aggregated(objects = FALSE)
##                hash  resample_result task_id          learner_id resampling_id
## 1: aa16bf476300fc84 <ResampleResult>    pima classif.featureless            cv
## 2: ede489b5e7d5983a <ResampleResult>    pima       classif.rpart            cv
## 3: 518bc2a7709ccca1 <ResampleResult>   sonar classif.featureless            cv
## 4: 97753fda02aef034 <ResampleResult>   sonar       classif.rpart            cv
## 5: 9bbdb27f46f0ffc3 <ResampleResult>    spam classif.featureless            cv
## 6: 0278680c4f4c3094 <ResampleResult>    spam       classif.rpart            cv
##    classif.acc classif.auc
## 1:      0.6511      0.5000
## 2:      0.7448      0.7805
## 3:      0.5329      0.5000
## 4:      0.7117      0.7427
## 5:      0.6060      0.5000
## 6:      0.8896      0.8931

We can aggregate the results further. For example, we might be interested in which learner performed best across all tasks. Since the result is a data.table object, we can compute the mean of each measure per learner:

bmr$aggregated(objects = FALSE)[, list(acc = mean(classif.acc), auc = mean(classif.auc)), by = "learner_id"]
##             learner_id    acc    auc
## 1: classif.featureless 0.5966 0.5000
## 2:       classif.rpart 0.7820 0.8054

Alternatively, we can use a tidyverse approach:

library("magrittr")
bmr$aggregated(objects = FALSE) %>%
  tibble::as_tibble() %>%
  dplyr::group_by(learner_id) %>%
  dplyr::summarise(acc = mean(classif.acc), auc = mean(classif.auc))
## # A tibble: 2 x 3
##   learner_id            acc   auc
##   <chr>               <dbl> <dbl>
## 1 classif.featureless 0.597 0.5  
## 2 classif.rpart       0.782 0.805
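The same grouped mean can also be computed without additional packages. The following is a minimal sketch in base R, not part of the original benchmark code: it assumes the rounded scores printed above are copied by hand into a data frame called `scores` (a name chosen for this illustration) instead of being taken from the `bmr` object.

```r
# Aggregated scores copied by hand from the printed table above
# (rounded values; `scores` is a hypothetical name for this sketch).
scores = data.frame(
  task_id = rep(c("pima", "sonar", "spam"), each = 2),
  learner_id = rep(c("classif.featureless", "classif.rpart"), times = 3),
  classif.acc = c(0.6511, 0.7448, 0.5329, 0.7117, 0.6060, 0.8896),
  classif.auc = c(0.5000, 0.7805, 0.5000, 0.7427, 0.5000, 0.8931)
)

# Mean of both measures per learner, averaged over the three tasks
by_learner = aggregate(
  cbind(classif.acc, classif.auc) ~ learner_id,
  data = scores,
  FUN = mean
)
print(by_learner)
```

Because the inputs are the rounded values from the printout, the means should agree with the tables above only up to rounding in the last digit.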

Unsurprisingly, the classification tree outperformed the featureless learner, with a mean accuracy of 0.782 versus 0.597 and a mean AUC of 0.805 versus 0.5.
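An average can hide a loss on a single task, so it may be worth checking the comparison task by task. The sketch below is an illustration in base R under the assumption that the rounded accuracies from the first table are typed in by hand; the names `acc` and `gain` are hypothetical and do not come from the mlr3 API.

```r
# Accuracy per task (rows) and learner (columns), copied by hand
# from the aggregated benchmark table; values are rounded.
acc = rbind(
  pima  = c(classif.featureless = 0.6511, classif.rpart = 0.7448),
  sonar = c(classif.featureless = 0.5329, classif.rpart = 0.7117),
  spam  = c(classif.featureless = 0.6060, classif.rpart = 0.8896)
)

# Accuracy gain of the tree over the featureless baseline on each task
gain = acc[, "classif.rpart"] - acc[, "classif.featureless"]
print(gain)

# TRUE only if the tree is better on every single task
print(all(gain > 0))
```

With these numbers the tree is ahead on all three tasks, and the margin is largest on spam.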