Performance Evaluation and Comparison
Now that we are familiar with the basics of creating tasks and learners and fitting models, let's have a look at some of the details, in particular how mlr3 makes it easy to perform many common machine learning steps.
We will cover the following topics:
Performance Scoring
Resampling
Resampling is a methodology for creating training and test splits. We cover how to
- access and select resampling strategies,
- instantiate the split into training and test sets by applying the resampling, and
- execute the resampling to obtain results.
Additional information on resampling can be found in the section about nested resampling and in the chapter on model optimization.
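The following minimal sketch walks through these three steps with mlr3. The built-in iris task, the classif.rpart learner, 3-fold cross-validation, and the classification error measure are used purely as placeholders; any task, learner, resampling strategy, and measure can be substituted.

```r
library(mlr3)

# Task and learner as introduced in the previous sections
task = tsk("iris")
learner = lrn("classif.rpart")

# 1. Access and select a resampling strategy: 3-fold cross-validation
resampling = rsmp("cv", folds = 3)

# 2. Instantiate the split, i.e. fix the train/test indices on this task
resampling$instantiate(task)

# 3. Execute the resampling: fit the learner on each training set
#    and predict on the corresponding test set
rr = resample(task, learner, resampling)

# Per-iteration scores and the aggregated performance
rr$score(msr("classif.ce"))
rr$aggregate(msr("classif.ce"))
```

The resample() call returns a ResampleResult, whose $score() and $aggregate() methods give per-iteration and aggregated performance values, respectively.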
Benchmarking
Benchmarking is used to compare the performance of different models, for example models trained with different learners, on different tasks, or with different resampling methods. This is usually done to get an overview of how different methods perform across different tasks. We cover how to
- create a benchmarking design,
- execute a design and aggregate results, and
- convert benchmark result objects to other types of objects, for example to extract individual resample results or tabular summaries for further analysis.
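A minimal sketch of this workflow is shown below. The design combines two built-in tasks (iris, sonar), two learners (classif.rpart and the featureless baseline), and 3-fold cross-validation; these particular choices are only illustrative.

```r
library(mlr3)

# 1. Create a benchmarking design: a grid of tasks, learners, and resamplings
design = benchmark_grid(
  tasks = tsks(c("iris", "sonar")),
  learners = lrns(c("classif.rpart", "classif.featureless")),
  resamplings = rsmps("cv", folds = 3)
)

# 2. Execute the design and aggregate the results per task/learner combination
bmr = benchmark(design)
bmr$aggregate(msr("classif.ce"))

# 3. Convert the benchmark result to other objects:
#    a data.table of all results, or a single ResampleResult
as.data.table(bmr)
rr = bmr$resample_result(1)
```

benchmark() returns a BenchmarkResult, which can be aggregated with $aggregate(), converted to a data.table with as.data.table(), or split into individual ResampleResult objects via $resample_result().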