The data, which mlr3 encapsulates in tasks, is split into non-overlapping training and test sets. Since we are interested in models that generalize to new data rather than just memorizing the training data, the separate test data allows us to objectively evaluate how well a model generalizes. The training data is given to a machine learning algorithm, which we call a learner in mlr3. The learner uses the training data to build a model of the relationship between the input features and the output target values. This model is then used to produce predictions on the test data, which are compared to the ground truth values to assess the quality of the model. mlr3 offers a number of different measures to quantify how well a model performs based on the difference between predicted and actual values. Each measure usually yields a numeric score.
The process of splitting the data into training and test sets, building a model, and evaluating it may be repeated several times, resampling different training and test sets from the original data each time. Multiple resampling iterations give a more reliable performance estimate for a particular type of model, because the model is tested under different conditions and the estimate is less likely to be unusually good or bad because of one particular split of the data.
In many cases, this simple workflow is not sufficient to deal with real-world data, which may require normalization, imputation of missing values, or feature selection. We will cover more complex workflows that handle these steps, and more, later in the book.
This chapter covers the following subtopics:
Tasks encapsulate the data with meta-information, such as the name of the prediction target column. We cover how to:
- access predefined tasks,
- specify a task type,
- create a task,
- work with a task’s API,
- assign roles to rows and columns of a task,
- implement task mutators, and
- retrieve the data that is stored in a task.
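The task operations listed above can be sketched as follows. This is a minimal illustration rather than the chapter's own example: it assumes the mlr3 package is installed, and the choice of the built-in `mtcars` data, the task id `"cars"`, and the selected columns are arbitrary assumptions made for the sketch.

```r
library(mlr3)

# Access a predefined task from the dictionary of example tasks.
task_penguins = tsk("penguins")

# Create a regression task from a data.frame; the task type is fixed by the
# converter (as_task_regr) and the target column is named explicitly.
task_mtcars = as_task_regr(mtcars, target = "mpg", id = "cars")

# Work with the task's API: type, dimensions, and column names.
task_mtcars$task_type
task_mtcars$nrow
task_mtcars$feature_names
task_mtcars$target_names

# Assign roles: remove "gear" from the feature role so learners ignore it.
task_mtcars$set_col_roles("gear", remove_from = "feature")

# Mutators modify the task in place: keep two features and the first 20 rows.
task_mtcars$select(c("cyl", "hp"))
task_mtcars$filter(1:20)

# Retrieve the data stored in the task (target plus remaining features).
task_data = task_mtcars$data()
head(task_data)
```

Note that `$select()` and `$filter()` change the task object itself rather than returning a modified copy, so a task should be cloned first if the original is still needed.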
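Learners encapsulate machine learning algorithms that are trained on a task to build a model. We cover how to:
- access the set of classification and regression learners that come with mlr3 and retrieve a specific learner,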
- access the set of hyperparameter values of a learner and modify them.
How to modify and extend learners is covered in a supplemental advanced technical section.
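Retrieving a learner and changing its hyperparameters might look like the sketch below. It assumes mlr3 is installed and uses the decision tree learner `classif.rpart`; the specific hyperparameter values are arbitrary and chosen only for illustration.

```r
library(mlr3)

# Keys of all learners currently registered in the learner dictionary.
learner_keys = mlr_learners$keys()
head(learner_keys)

# Retrieve a specific learner by its key.
learner = lrn("classif.rpart")

# Inspect the available hyperparameters, their types, and their ranges.
learner$param_set

# Modify a hyperparameter value on an existing learner ...
learner$param_set$values$maxdepth = 3

# ... or set values directly when constructing the learner.
learner_small = lrn("classif.rpart", cp = 0.05, minsplit = 10)
learner_small$param_set$values
```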
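Train and predict brings tasks and learners together: a learner is trained on a task to produce a model, which is then used to make predictions. We cover how to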
- properly set up tasks and learners,
- set up train and test splits for a task,
- train the learner on the training set to produce a model,
- generate predictions on the test set, and
- assess the performance of the model by comparing predicted and actual values.
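The steps above can be sketched end to end as follows. This is an illustrative example under assumptions, not the chapter's definitive workflow: the penguins task, the `classif.rpart` learner, the 80/20 split, and the accuracy measure are all choices made for the sketch.

```r
library(mlr3)

set.seed(1)  # the split is random; fix the seed for reproducibility

# Set up the task and the learner.
task = tsk("penguins")
learner = lrn("classif.rpart")

# Set up a train/test split: partition() returns row ids for each set.
split = partition(task, ratio = 0.8)

# Train the learner on the training rows only; the fitted model is stored
# inside the learner object.
learner$train(task, row_ids = split$train)
learner$model

# Generate predictions for the held-out test rows.
prediction = learner$predict(task, row_ids = split$test)

# Assess performance by comparing predicted and actual values.
acc = prediction$score(msr("classif.acc"))
acc
```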
A resampling is a method to create training and test splits. We cover how to
- access and select resampling strategies,
- instantiate the split into training and test sets by applying the resampling, and
- execute the resampling to obtain results.
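A resampling run could be sketched as below, assuming mlr3 is installed. The 3-fold cross-validation, the penguins task, and the classification error measure are assumptions chosen to keep the example small.

```r
library(mlr3)

set.seed(1)
task = tsk("penguins")
learner = lrn("classif.rpart")

# Access the available resampling strategies and select one.
mlr_resamplings$keys()
resampling = rsmp("cv", folds = 3)

# Instantiate: fix the concrete train/test row ids for this task.
resampling$instantiate(task)
train_ids = resampling$train_set(1)
test_ids = resampling$test_set(1)

# Execute: train and predict once per iteration.
rr = resample(task, learner, resampling)

# Per-iteration scores and the aggregated performance estimate.
scores = rr$score(msr("classif.ce"))
rr$aggregate(msr("classif.ce"))
```

Instantiating the resampling before calling `resample()` is optional here, but it makes the splits inspectable and lets the same splits be reused for other learners.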
Benchmarking is used to compare the performance of different models, for example models trained with different learners, on different tasks, or with different resampling methods. We cover how to
- create a benchmarking design,
- execute a design and aggregate results, and
- convert benchmarking objects to resample objects.
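A small benchmark might be set up as in the following sketch. The two tasks, the two learners, and the 3-fold cross-validation are arbitrary assumptions for illustration; a real comparison would use the tasks and learners of interest.

```r
library(mlr3)

set.seed(1)

# Create a benchmarking design: every combination of task, learner, and
# resampling strategy.
design = benchmark_grid(
  tasks = tsks(c("penguins", "sonar")),
  learners = lrns(c("classif.featureless", "classif.rpart")),
  resamplings = rsmp("cv", folds = 3)
)

# Execute the design.
bmr = benchmark(design)

# Aggregate results: one row per task/learner/resampling combination.
agg = bmr$aggregate(msr("classif.ce"))
agg[, c("task_id", "learner_id", "classif.ce")]

# Convert part of the benchmark result to a resample result, here the
# first combination in the design.
rr = bmr$resample_result(1)
rr$aggregate(msr("classif.ce"))
```

Including the featureless learner gives a baseline, so the error of the decision tree can be read relative to always predicting the most frequent class.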
Binary classification is a special case of classification where the target variable to predict has only two possible values. In this case, additional considerations apply; in particular:
- ROC curves and the threshold at which one class is predicted rather than the other, and
- threshold tuning (WIP).
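The effect of the decision threshold in binary classification can be illustrated with the sketch below. It assumes mlr3 is installed, uses the sonar task and a decision tree that predicts probabilities, and picks the threshold 0.3 arbitrarily; it changes the threshold by hand and is not a threshold tuning procedure.

```r
library(mlr3)

set.seed(1)
task = tsk("sonar")
task$positive  # the class treated as "positive"

# Probability predictions are needed for ROC analysis and thresholding.
learner = lrn("classif.rpart", predict_type = "prob")
split = partition(task)
learner$train(task, row_ids = split$train)
prediction = learner$predict(task, row_ids = split$test)

# Area under the ROC curve, a threshold-independent summary.
prediction$score(msr("classif.auc"))

# Confusion matrix at the default threshold of 0.5.
prediction$confusion
n_pos_default = sum(prediction$response == task$positive)

# Lower the threshold: the positive class is now predicted more readily.
prediction$set_threshold(0.3)
prediction$confusion
n_pos_low = sum(prediction$response == task$positive)
```

Lowering the threshold can only keep or increase the number of positive predictions, which trades fewer false negatives for more false positives.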
Before we get into the details of how to use mlr3 for machine learning, we give a brief introduction to R6, as it is a relatively recent addition to the R ecosystem. mlr3 relies heavily on R6, and all the basic building blocks it provides are R6 classes: