5.4 Modeling

The main purpose of a Graph is to build combined preprocessing and model fitting pipelines that can be used as an mlr3 Learner. In the following, we chain two preprocessing steps:

  • mutate (creation of a new feature)
  • filter (filtering the dataset)

and then append a PipeOpLearner to train and predict on the modified dataset.
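The chain described above can be sketched as follows. This is a minimal sketch, not the book's exact chunk: the task ("iris"), the concrete mutation formula, and the filter fraction of 0.5 are illustrative assumptions.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3filters)

task = tsk("iris")

# PipeOp that creates a new feature; the mutation formula is an
# illustrative assumption, not taken from the text
mutate = po("mutate",
  mutation = list(Sepal.Sum = ~ Sepal.Length + Sepal.Width))

# PipeOp that keeps the fraction of features with the highest variance
filter = po("filter", filter = flt("variance"), filter.frac = 0.5)

# chain both preprocessing steps and a learner into one Graph
graph = mutate %>>% filter %>>% po("learner", learner = lrn("classif.rpart"))
```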

So far we have only defined the main pipeline, stored as a Graph. Now we can train and predict with the pipeline.
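Training and prediction on a Graph might look as follows; this sketch rebuilds a reduced version of the pipeline (variance filter plus rpart on the assumed "iris" task) so that it is self-contained.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3filters)

task = tsk("iris")

# a reduced version of the pipeline: variance filter, then rpart
graph = po("filter", filter = flt("variance"), filter.frac = 0.5) %>>%
  po("learner", learner = lrn("classif.rpart"))

graph$train(task)                       # trains every PipeOp in the Graph
prediction = graph$predict(task)[[1]]   # $predict() returns a list of outputs
prediction$score(msr("classif.ce"))     # evaluate the resulting Prediction
```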

Rather than calling $train() and $predict() manually, we can put the pipeline Graph into a GraphLearner object. A GraphLearner encapsulates the whole pipeline (including the preprocessing steps) and can be passed to resample() or benchmark(). If you are familiar with the old mlr package, this is the equivalent of all the make*Wrapper() functions. The pipeline being encapsulated (here, the Graph) must always produce a Prediction in its $predict() call, so it will usually contain at least one PipeOpLearner.
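Wrapping the Graph and resampling it could look like this; the task, filter fraction, and 3-fold cross-validation are illustrative assumptions.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3filters)

graph = po("filter", filter = flt("variance"), filter.frac = 0.5) %>>%
  po("learner", learner = lrn("classif.rpart"))

# wrap the Graph so it behaves like any other mlr3 Learner
glrn = GraphLearner$new(graph)

# the GraphLearner can go straight into resample() or benchmark()
rr = resample(tsk("iris"), glrn, rsmp("cv", folds = 3))
rr$aggregate(msr("classif.ce"))
```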

This learner can be used for model fitting, resampling, benchmarking, and tuning.

5.4.1 Setting Hyperparameters

Individual PipeOps offer hyperparameters through their $param_set slot, which can be read and written via $param_set$values (using the paradox package). These parameters are collected in the Graph, and finally in the GraphLearner. This makes it possible not only to change the behavior of a Graph / GraphLearner and try different settings manually, but also to perform tuning using the mlr3tuning package.
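In the combined ParamSet, each hyperparameter is prefixed with the id of the PipeOp it belongs to. A sketch, assuming a variance filter chained to an rpart learner (the concrete values 0.25 and 0.05 are arbitrary choices for illustration):

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3filters)

glrn = GraphLearner$new(
  po("filter", filter = flt("variance")) %>>%
    po("learner", learner = lrn("classif.rpart"))
)

# hyperparameters of all PipeOps appear in one ParamSet,
# prefixed with the id of the PipeOp they belong to
glrn$param_set$values$variance.filter.frac = 0.25
glrn$param_set$values$classif.rpart.cp = 0.05
```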

5.4.2 Tuning

If you are not yet familiar with tuning in mlr3, we recommend taking a look at the section on tuning first. Here we define a ParamSet for the “rpart” learner and the “variance” filter that should be optimized during tuning.
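With paradox, such a search space could be defined as below; the bounds on the complexity parameter cp and on filter.frac are illustrative assumptions, and the names assume the PipeOp ids "classif.rpart" and "variance" from the pipeline above.

```r
library(paradox)

# search space over the rpart complexity parameter and the fraction
# of features kept by the variance filter (bounds are assumptions)
search_space = ps(
  classif.rpart.cp = p_dbl(lower = 0, upper = 0.05),
  variance.filter.frac = p_dbl(lower = 0.25, upper = 1)
)
```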

After the PerformanceEvaluator has been defined, a random search with 10 iterations is created. For the inner resampling, we simply use holdout (a single split into train/test sets) to keep the runtime reasonable.
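A sketch of the full tuning run follows. Note that current mlr3tuning releases express this setup as a tuning instance rather than a PerformanceEvaluator, so the class and function names below are assumptions tied to that newer API; the task and search-space bounds are also illustrative.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3filters)
library(mlr3tuning)
library(paradox)

glrn = GraphLearner$new(
  po("filter", filter = flt("variance")) %>>%
    po("learner", learner = lrn("classif.rpart"))
)

instance = TuningInstanceSingleCrit$new(
  task = tsk("iris"),
  learner = glrn,
  resampling = rsmp("holdout"),    # single train/test split
  measure = msr("classif.ce"),
  search_space = ps(
    classif.rpart.cp = p_dbl(0, 0.05),
    variance.filter.frac = p_dbl(0.25, 1)
  ),
  terminator = trm("evals", n_evals = 10)
)

tuner = tnr("random_search")
tuner$optimize(instance)
instance$result   # best configuration found
</```>
```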

The tuning result can be inspected using the $tune_result() method.