## 5.1 Settings

In this example we use the iris task and a simple classification tree (package rpart).

```r
task = mlr_tasks$get("iris")
learner = mlr_learners$get("classif.rpart")
```

When performing resampling on a dataset, we first need to define which approach should be used. The resampling strategies of mlr3 can be queried using the `$keys()` method of the `mlr_resamplings` dictionary:

```r
mlr_resamplings$keys()
## [1] "bootstrap"   "custom"      "cv"          "cv3"         "holdout"
## [6] "repeated_cv" "subsampling"
```
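Any of the listed strategies can be retrieved the same way. As a brief sketch, here is cross-validation (the `"cv"` key above), with its `folds` parameter set via the `param_set` slot discussed below; the parameter name `folds` is an assumption based on the usual mlr3 naming:

```r
library(mlr3)

# Retrieve cross-validation from the dictionary and request 5 folds
cv = mlr_resamplings$get("cv")
cv$param_set$values = list(folds = 5)

# The number of iterations now corresponds to the number of folds
print(cv$iters)
```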

Additional resampling methods for special use cases will be available via extension packages, such as mlr3spatiotemporal for spatial data (still in development).

The experiment conducted in the train/predict/score chapter is equivalent to “holdout”, so let’s consider this one first.

```r
resampling = mlr_resamplings$get("holdout")
print(resampling)
## <ResamplingHoldout> with 1 iterations
## Instantiated: FALSE
## Parameters: ratio=0.6667
##
## Public: clone, duplicated_ids, format, hash, id, instance,
##   instantiate, is_instantiated, iters, param_set, task_hash, test_set,
##   train_set

print(resampling$param_set$values)
## $ratio
## [1] 0.6667
```

Note that the `Instantiated` field is set to `FALSE`. This means we have not actually applied the strategy to a dataset yet, but only performed a dry run. Applying the strategy to a dataset is covered in the next section, Instantiation.
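To make this distinction concrete, here is a short sketch of what instantiation looks like, using only the `instantiate()`, `train_set()`, and `test_set()` members listed in the `Public` section of the printout above:

```r
library(mlr3)

task = mlr_tasks$get("iris")
resampling = mlr_resamplings$get("holdout")

# Applying the strategy to a task samples and stores the row ids
resampling$instantiate(task)
print(resampling$is_instantiated)

# Row ids of the train/test split of the first (and only) iteration
str(resampling$train_set(1))
str(resampling$test_set(1))
```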

By default we get a 0.66/0.33 split of the data. There are two ways to change the ratio:

1. Overwriting the slot in `$param_set$values` using a named list:

```r
resampling$param_set$values = list(ratio = 0.8)
```
2. Specifying the resampling parameters directly during construction using the `param_vals` argument:

```r
mlr_resamplings$get("holdout", param_vals = list(ratio = 0.8))
## <ResamplingHoldout> with 1 iterations
## Instantiated: FALSE
## Parameters: ratio=0.8
##
## Public: clone, duplicated_ids, format, hash, id, instance,
##   instantiate, is_instantiated, iters, param_set, task_hash, test_set,
##   train_set
```
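Once configured, the strategy is typically passed to `resample()` together with the task and learner defined at the start of this section; a minimal sketch, assuming `resample()` and the default aggregation behave as elsewhere in the book:

```r
library(mlr3)

task = mlr_tasks$get("iris")
learner = mlr_learners$get("classif.rpart")

resampling = mlr_resamplings$get("holdout")
resampling$param_set$values = list(ratio = 0.8)

# Trains and predicts once per iteration (holdout: a single iteration)
rr = resample(task, learner, resampling)
print(rr$aggregate())
```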