## 7.1 Survival Analysis

Survival analysis examines whether a specific event of interest takes place and how long it takes until this event occurs. Ordinary regression analysis is not suited to survival data for two reasons. First, survival times are strictly positive and therefore need to be transformed to avoid biases. Second, ordinary regression cannot properly handle censored observations, that is, observations for which the event of interest has not yet occurred by the end of the observation period. Survival analysis is designed for such data, where the limited time frame of a study means that the event is not observed for every subject. It uses both censored and uncensored observations when estimating the model parameters.
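To make the notion of censoring concrete, the following minimal sketch builds a survival object from a small, made-up data set using the survival package. The vectors `time` and `status` are hypothetical toy values, not data from this chapter.

```r
library(survival)

# Hypothetical toy data: follow-up time and event indicator
# (1 = event observed, 0 = right-censored at that time)
time   = c(5, 8, 12, 12, 20)
status = c(1, 0, 1, 0, 1)

# Surv() combines time and status into a single survival outcome;
# censored observations are printed with a trailing "+"
y = Surv(time, status)
print(y)

# The Kaplan-Meier estimator uses censored and uncensored observations:
# censored subjects count as "at risk" until they drop out
fit = survfit(y ~ 1)
summary(fit)
```

Simply dropping the censored rows, or treating their censoring times as event times, would bias the estimated survival curve, which is why dedicated estimators are used.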

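The package mlr3proba extends mlr3 with objects for survival analysis, including survival tasks (TaskSurv), survival learners (for example LearnerSurvCoxPH) and survival measures (for example MeasureSurvUnoC).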

In this example, we demonstrate the basic functionality of the package on the rats data from the survival package. This task ships as a pre-defined TaskSurv with mlr3proba.

    library(mlr3proba)
    # load the pre-defined rats survival task
    task = tsk("rats")

    # the target is a survival object; "+" marks censored observations
    head(task$truth())
    ## [1] 101+ 49 104+ 91+ 104+ 102+

    # kaplan-meier plot
    library(mlr3viz)
    autoplot(task)

Now, we conduct a small benchmark study on the rats task using some of the integrated survival learners:

    # some integrated learners
    learners = lapply(c("surv.coxph", "surv.kaplan", "surv.ranger"), lrn)
    print(learners)
    ## [[1]]
    ## <LearnerSurvCoxPH:surv.coxph>
    ## * Model: -
    ## * Parameters: list()
    ## * Packages: survival, distr6
    ## * Predict Type: distr
    ## * Feature types: logical, integer, numeric, factor
    ## * Properties: importance
    ##
    ## [[2]]
    ## <LearnerSurvKaplan:surv.kaplan>
    ## * Model: -
    ## * Parameters: list()
    ## * Packages: survival, distr6
    ## * Predict Type: crank
    ## * Feature types: logical, integer, numeric, character, factor, ordered
    ## * Properties: missings
    ##
    ## [[3]]
    ## <LearnerSurvRanger:surv.ranger>
    ## * Model: -
    ## * Parameters: list()
    ## * Packages: ranger, distr6
    ## * Predict Type: distr
    ## * Feature types: logical, integer, numeric, character, factor, ordered
    ## * Properties: importance, oob_error, weights

    # Uno's C-Index for survival
    measure = msr("surv.unoC")
    print(measure)
    ## <MeasureSurvUnoC:surv.unoC>
    ## * Packages: survAUC
    ## * Range: [0, 1]
    ## * Minimize: FALSE
    ## * Properties: na_score, requires_task, requires_train_set
    ## * Predict type: crank

    # 3-fold cross-validation of all learners on the rats task
    set.seed(1)
    bmr = benchmark(benchmark_grid(task, learners, rsmp("cv", folds = 3)))
    bmr$aggregate(measure)
    autoplot(bmr, measure = measure)
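A benchmark repeats the same train, predict and score cycle for every learner and every resampling fold. As a rough sketch of a single such cycle, the code below fits one learner on a manually chosen training subset. The 80/20 split and the names `train_ids` and `test_ids` are illustrative assumptions, and the measure `surv.unoC` is assumed to be available as in the mlr3proba version used in this section.

```r
library(mlr3)
library(mlr3proba)

task = tsk("rats")

# Illustrative 80/20 split of the row ids (an assumption, not from the text)
set.seed(1)
train_ids = sample(task$row_ids, size = floor(0.8 * task$nrow))
test_ids = setdiff(task$row_ids, train_ids)

# Train a Cox proportional hazards model on the training rows only
learner = lrn("surv.coxph")
learner$train(task, row_ids = train_ids)

# Predict on the held-out rows
prediction = learner$predict(task, row_ids = test_ids)

# Uno's C requires the task and the training set to compute its weights
measure = msr("surv.unoC")
prediction$score(measure, task = task, train_set = train_ids)
```

A single split like this gives a noisier performance estimate than the cross-validated benchmark, which averages the score over several folds.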