Parallelization refers to the process of running multiple jobs simultaneously, in parallel. This allows for significant savings in computation time. We distinguish between implicit parallelism and explicit parallelism.
5.1.1 Implicit Parallelization
We talk about implicit parallelization in this context if we call external code (i.e., code from foreign CRAN packages) which runs in parallel.
Many machine learning algorithms can parallelize their model fit using threading, e.g. the random forest implementation in the ranger package.
Unfortunately, threading conflicts with certain parallel backends used during explicit parallelization: in the best case, the system is merely overutilized; in the worst case, this causes hangs or segfaults.
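To illustrate the conflict, here is a short base R sketch (the worker and thread counts are hypothetical, chosen only for illustration) of how explicit workers and implicit threads multiply:

```r
# Illustrative numbers only; parallel::detectCores() reports the real count
available_cores = 8  # assume a machine with 8 CPUs
workers         = 8  # explicit parallelization: one R process per CPU
threads_per_fit = 4  # implicit parallelization inside each model fit

# Every worker spawns its own threads, so the totals multiply:
total_threads = workers * threads_per_fit
total_threads                    # 32 threads competing for 8 CPUs
total_threads > available_cores  # TRUE: the system is oversubscribed
```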
For this reason, we introduced the convention that implicit parallelization is turned off in the defaults, but can be enabled again via a hyperparameter which is tagged with the label "threads":
library("mlr3verse")

learner = lrn("classif.ranger")
learner$param_set$ids(tags = "threads")
## [1] "num.threads"
To enable parallelization for this learner, we can simply call the helper function set_threads():
# set to use 4 CPUs
set_threads(learner, n = 4)
## <LearnerClassifRanger:classif.ranger>
## * Model: -
## * Parameters: num.threads=4
## * Packages: ranger
## * Predict Type: response
## * Feature types: logical, integer, numeric, character, factor, ordered
## * Properties: importance, multiclass, oob_error, twoclass, weights
# auto-detect cores on the local machine
set_threads(learner)
## <LearnerClassifRanger:classif.ranger>
## * Model: -
## * Parameters: num.threads=2
## * Packages: ranger
## * Predict Type: response
## * Feature types: logical, integer, numeric, character, factor, ordered
## * Properties: importance, multiclass, oob_error, twoclass, weights
This also works for filters from mlr3filters and lists of objects, even if some objects do not support threading at all:
# retrieve 2 filters
# * variance filter with no support for threading
# * mrmr filter with threading support
filters = flts(c("variance", "mrmr"))

# set threads for all filters which support it
set_threads(filters, n = 4)
## [[1]]
## <FilterVariance:variance>
## Task Types: classif, regr
## Task Properties: -
## Packages: stats
## Feature types: integer, numeric
##
## [[2]]
## <FilterMRMR:mrmr>
## Task Types: classif, regr
## Task Properties: -
## Packages: praznik
## Feature types: integer, numeric, factor, ordered
# variance filter is unchanged
filters[[1]]$param_set
## <ParamSet>
##       id    class lower upper nlevels default value
## 1: na.rm ParamLgl    NA    NA       2    TRUE
# mrmr now works in parallel with 4 cores
filters[[2]]$param_set
## <ParamSet>
##         id    class lower upper nlevels default value
## 1: threads ParamInt     0   Inf     Inf       0     4
5.1.2 Explicit Parallelization
We talk about explicit parallelization here if mlr3 starts the parallelization itself. To support a broad range of parallel backends, mlr3 relies on the abstraction implemented in the future package. There are two use cases where mlr3 calls future:
During resampling, all resampling iterations can be executed in parallel, as they are independent of each other. The same holds for benchmarking, where all combinations in the provided design are additionally independent.
These loops are performed by future using the parallel backend configured with future::plan().
Extension packages like mlr3tuning internally call benchmark() during tuning and thus work in parallel, too.
In this section, we will use the spam task and a simple classification tree to showcase explicit parallelization. In this example, the future::multisession parallel backend is selected, which should work on all systems.
# select the multisession backend
future::plan("multisession")

task = tsk("spam")
learner = lrn("classif.rpart")
resampling = rsmp("subsampling")

time = Sys.time()
resample(task, learner, resampling)
Sys.time() - time
By default, all CPUs of your machine are used unless you specify the workers argument in future::plan().
On most systems you should see a decrease in the reported elapsed time, but in practice you cannot expect the runtime to decrease linearly as the number of cores increases (Amdahl's law). Depending on the parallel backend, the technical overhead for starting workers, communicating objects, sending back results and shutting down the workers can be quite large. Therefore, it is advised to enable parallelization only for resamplings where each iteration runs for at least a few seconds.
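Amdahl's law itself is easy to sketch: if a fraction p of the total runtime can be parallelized, the speedup on n cores is bounded by 1 / ((1 - p) + p / n). A base R illustration with made-up numbers:

```r
# Theoretical speedup under Amdahl's law:
# p = parallelizable fraction of the runtime, n = number of cores
amdahl_speedup = function(p, n) 1 / ((1 - p) + p / n)

# Suppose 90% of a resampling run parallelizes (an illustrative figure)
amdahl_speedup(0.9, 4)    # ~3.08, not 4
amdahl_speedup(0.9, 16)   # ~6.40, not 16
amdahl_speedup(0.9, Inf)  # 10: the hard ceiling at 1 / (1 - p)
```

The fixed overhead per worker mentioned above effectively shrinks p, which is why short resampling iterations parallelize so poorly.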
If you are transitioning from mlr, you might be used to selecting different parallelization levels, e.g. for resampling, benchmarking or tuning. In mlr3 this is no longer required (except for nested resampling, briefly described in the following section). All kinds of events are rolled out on the same level, so there is no need to decide whether you want to parallelize the tuning OR the resampling.
Just lean back and let the machine do the work :-)
5.1.3 Nested Resampling Parallelization
Nested resampling results in two nested resampling loops. We can choose different parallelization backends for the inner and outer resampling loop, respectively. We just have to pass a list of future backends:
# Runs the outer loop in parallel and the inner loop sequentially
future::plan(list("multisession", "sequential"))

# Runs the outer loop sequentially and the inner loop in parallel
future::plan(list("sequential", "multisession"))
While nesting real parallelization backends is often unintended and causes unnecessary overhead, it is useful in some distributed computing setups. It can be achieved with future by forcing a fixed number of workers for each loop:
# Runs both loops in parallel
future::plan(list(
  future::tweak("multisession", workers = 2),
  future::tweak("multisession", workers = 4)
))
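Note that with such a fixed-worker plan, the number of concurrently running R sessions is the product of the two levels, so it is worth checking it against the hardware actually available. A quick base R sketch with the numbers from the example above:

```r
# Worker counts from the nested plan above
outer_workers = 2
inner_workers = 4

# Each outer worker starts its own set of inner workers,
# so up to outer * inner processes can run at once:
total_workers = outer_workers * inner_workers
total_workers  # 8 concurrent R sessions in the worst case
```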