16 References
Apley, Daniel W., and Jingyu Zhu. 2020. “Visualizing the Effects
of Predictor Variables in Black Box Supervised Learning Models.”
Journal of the Royal Statistical Society Series B: Statistical
Methodology 82 (4): 1059–86. https://doi.org/10.1111/rssb.12377.
Au, Quay, Julia Herbinger, Clemens Stachl, Bernd Bischl, and Giuseppe
Casalicchio. 2022. “Grouped Feature Importance and Combined
Features Effect Plot.” Data Mining and Knowledge
Discovery 36 (4): 1401–50. https://doi.org/10.1007/s10618-022-00840-5.
Bagnall, Anthony, Jason Lines, Aaron Bostrom, James Large, and Eamonn
Keogh. 2017. “The Great Time Series Classification Bake Off: A
Review and Experimental Evaluation of Recent Algorithmic
Advances.” Data Mining and Knowledge Discovery 31:
606–60. https://doi.org/10.1007/s10618-016-0483-9.
Baniecki, Hubert, and Przemyslaw Biecek. 2019. “modelStudio: Interactive Studio with Explanations
for ML Predictive Models.” Journal of Open Source
Software 4 (43): 1798. https://doi.org/10.21105/joss.01798.
Baniecki, Hubert, Dariusz Parzych, and Przemyslaw Biecek. 2023.
“The Grammar of Interactive Explanatory Model Analysis.”
Data Mining and Knowledge Discovery. https://doi.org/10.1007/s10618-023-00924-w.
Barocas, Solon, Moritz Hardt, and Arvind Narayanan. 2019. Fairness
and Machine Learning: Limitations and Opportunities.
fairmlbook.org.
Bengtsson, Henrik. 2020. “Future 1.19.1 - Making Sure Proper
Random Numbers Are Produced in Parallel Processing.” https://www.jottr.org/2020/09/22/push-for-statistical-sound-rng/.
———. 2022. “Please Avoid detectCores() in Your R
Packages.” https://www.jottr.org/2022/12/05/avoid-detectcores/.
Bergstra, James, and Yoshua Bengio. 2012. “Random Search for
Hyper-Parameter Optimization.” Journal of Machine Learning
Research 13: 281–305. https://jmlr.org/papers/v13/bergstra12a.html.
Biecek, Przemyslaw. 2018. “DALEX: Explainers for
Complex Predictive Models in R.” Journal of
Machine Learning Research 19 (84): 1–5. https://jmlr.org/papers/v19/18-416.html.
Biecek, Przemyslaw, and Tomasz Burzykowski. 2021. Explanatory Model
Analysis. Chapman & Hall/CRC, New York. https://ema.drwhy.ai/.
Binder, Martin, Florian Pfisterer, and Bernd Bischl. 2020.
“Collecting Empirical Data about Hyperparameters for Data Driven
AutoML.” In Proceedings of the 7th ICML Workshop on Automated
Machine Learning (AutoML 2020). https://www.automl.org/wp-content/uploads/2020/07/AutoML_2020_paper_63.pdf.
Binder, Martin, Florian Pfisterer, Michel Lang, Lennart Schneider, Lars
Kotthoff, and Bernd Bischl. 2021. “mlr3pipelines - Flexible Machine Learning
Pipelines in R.” Journal of Machine Learning
Research 22 (184): 1–7. https://jmlr.org/papers/v22/21-0281.html.
Bischl, Bernd, Martin Binder, Michel Lang, Tobias Pielok, Jakob Richter,
Stefan Coors, Janek Thomas, et al. 2023. “Hyperparameter
Optimization: Foundations, Algorithms, Best Practices, and Open
Challenges.” Wiley Interdisciplinary Reviews: Data Mining and
Knowledge Discovery, e1484. https://doi.org/10.1002/widm.1484.
Bischl, Bernd, Giuseppe Casalicchio, Matthias Feurer, Pieter Gijsbers,
Frank Hutter, Michel Lang, Rafael Gomes Mantovani, Jan N. van Rijn, and
Joaquin Vanschoren. 2021. “OpenML Benchmarking
Suites.” In Thirty-Fifth Conference on Neural Information
Processing Systems Datasets and Benchmarks Track (Round 2). https://openreview.net/forum?id=OCrD8ycKjG.
Bischl, Bernd, Michel Lang, Olaf Mersmann, Jörg Rahnenführer, and Claus
Weihs. 2015. “BatchJobs and BatchExperiments: Abstraction
Mechanisms for Using R in Batch Environments.” Journal of
Statistical Software 64 (11): 1–25. https://doi.org/10.18637/jss.v064.i11.
Bischl, Bernd, Olaf Mersmann, Heike Trautmann, and Claus Weihs. 2012.
“Resampling Methods for Meta-Model Validation with Recommendations
for Evolutionary Computation.” Evolutionary Computation
20 (2): 249–75. https://doi.org/10.1162/EVCO_a_00069.
Bishop, Christopher M. 2006. Pattern Recognition and Machine
Learning. Springer.
Bommert, Andrea, Xudong Sun, Bernd Bischl, Jörg Rahnenführer, and Michel
Lang. 2020. “Benchmark for Filter Methods for Feature Selection in
High-Dimensional Classification Data.” Computational
Statistics & Data Analysis 143: 106839. https://doi.org/10.1016/j.csda.2019.106839.
Breiman, Leo. 1996. “Bagging Predictors.” Machine
Learning 24 (2): 123–40. https://doi.org/10.1007/BF00058655.
———. 2001a. “Random Forests.” Machine Learning 45:
5–32. https://doi.org/10.1023/A:1010933404324.
———. 2001b. “Statistical Modeling: The Two Cultures (with Comments
and a Rejoinder by the Author).” Statistical Science 16
(3). https://doi.org/10.1214/ss/1009213726.
Bücker, Michael, Gero Szepannek, Alicja Gosiewska, and Przemyslaw
Biecek. 2022. “Transparency, Auditability, and Explainability of
Machine Learning Models in Credit Scoring.” Journal of the
Operational Research Society 73 (1): 70–90. https://doi.org/10.1080/01605682.2021.1922098.
Byrd, Richard H., Peihuang Lu, Jorge Nocedal, and Ciyou Zhu. 1995.
“A Limited Memory Algorithm for Bound Constrained
Optimization.” SIAM Journal on Scientific Computing 16
(5): 1190–1208. https://doi.org/10.1137/0916069.
Caton, Simon, and Christian Haas. 2020. “Fairness in Machine Learning: A
Survey.” arXiv preprint arXiv:2010.04053. https://doi.org/10.48550/arXiv.2010.04053.
Chandrashekar, Girish, and Ferat Sahin. 2014. “A Survey on Feature
Selection Methods.” Computers and Electrical Engineering
40 (1): 16–28. https://doi.org/10.1016/j.compeleceng.2013.11.024.
Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost: A
Scalable Tree Boosting System.” In Proceedings of the 22nd
ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, 785–94. https://doi.org/10.1145/2939672.2939785.
Collett, David. 2014. Modelling Survival Data in Medical
Research. 3rd ed. CRC. https://doi.org/10.1201/b18041.
Couronné, Raphael, Philipp Probst, and Anne-Laure Boulesteix. 2018.
“Random Forest Versus Logistic Regression: A Large-Scale Benchmark
Experiment.” BMC Bioinformatics 19: 1–14. https://doi.org/10.1186/s12859-018-2264-5.
Dandl, Susanne, Christoph Molnar, Martin Binder, and Bernd Bischl. 2020.
“Multi-Objective Counterfactual Explanations.” In
Parallel Problem Solving from Nature PPSN
XVI, 448–69. Springer International Publishing. https://doi.org/10.1007/978-3-030-58112-1_31.
Davis, Jesse, and Mark Goadrich. 2006. “The Relationship Between
Precision-Recall and ROC Curves.” In Proceedings of the 23rd
International Conference on Machine Learning, 233–40. https://doi.org/10.1145/1143844.1143874.
De Cock, Dean. 2011. “Ames, Iowa: Alternative to the Boston
Housing Data as an End of Semester Regression Project.”
Journal of Statistics Education 19 (3). https://doi.org/10.1080/10691898.2011.11889627.
Demšar, Janez. 2006. “Statistical Comparisons of Classifiers over
Multiple Data Sets.” Journal of Machine Learning
Research 7 (1): 1–30. https://jmlr.org/papers/v7/demsar06a.html.
Ding, Yufeng, and Jeffrey S. Simonoff. 2010. “An Investigation of
Missing Data Methods for Classification Trees Applied to Binary Response
Data.” Journal of Machine Learning Research 11 (6):
131–70. https://www.jmlr.org/papers/v11/ding10a.html.
Dobbin, Kevin K., and Richard M. Simon. 2011. “Optimally Splitting
Cases for Training and Testing High Dimensional Classifiers.”
BMC Medical Genomics 4 (1): 31. https://doi.org/10.1186/1755-8794-4-31.
Dua, Dheeru, and Casey Graff. 2017. “UCI Machine
Learning Repository.” University of California, Irvine, School of
Information and Computer Sciences. https://archive.ics.uci.edu/ml.
Eddelbuettel, Dirk. 2020. “Parallel Computing with R:
A Brief Review.” WIREs Computational
Statistics 13 (2). https://doi.org/10.1002/wics.1515.
Feurer, Matthias, and Frank Hutter. 2019. “Hyperparameter
Optimization.” In Automated Machine Learning: Methods,
Systems, Challenges, edited by Frank Hutter, Lars Kotthoff, and
Joaquin Vanschoren, 3–33. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-05318-5_1.
Feurer, Matthias, Jost Springenberg, and Frank Hutter. 2015.
“Initializing Bayesian Hyperparameter Optimization
via Meta-Learning.” In Proceedings of the AAAI Conference on
Artificial Intelligence. Vol. 29. 1. https://doi.org/10.1609/aaai.v29i1.9354.
Fisher, Aaron, Cynthia Rudin, and Francesca Dominici. 2019. “All
Models Are Wrong, but Many Are Useful: Learning a Variable’s Importance
by Studying an Entire Class of Prediction Models Simultaneously.”
arXiv preprint arXiv:1801.01489. https://doi.org/10.48550/arXiv.1801.01489.
Friedman, Jerome H. 2001. “Greedy Function Approximation: A
Gradient Boosting Machine.” The Annals of Statistics 29
(5). https://doi.org/10.1214/aos/1013203451.
Garnett, Roman. 2022. Bayesian Optimization. Cambridge
University Press. https://bayesoptbook.com/.
Gijsbers, Pieter, Marcos L. P. Bueno, Stefan Coors, Erin LeDell,
Sébastien Poirier, Janek Thomas, Bernd Bischl, and Joaquin Vanschoren.
2022. “AMLB: An AutoML Benchmark.” arXiv preprint arXiv:2207.12560. https://doi.org/10.48550/arXiv.2207.12560.
Goldstein, Alex, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2015.
“Peeking Inside the Black Box: Visualizing Statistical Learning
with Plots of Individual Conditional Expectation.” Journal of
Computational and Graphical Statistics 24 (1): 44–65. https://doi.org/10.1080/10618600.2014.907095.
Gower, John C. 1971. “A General Coefficient of Similarity and Some
of Its Properties.” Biometrics 27 (4): 857–71. https://doi.org/10.2307/2528823.
Grinsztajn, Léo, Edouard Oyallon, and Gaël Varoquaux. 2022. “Why
Do Tree-Based Models Still Outperform Deep Learning on Typical Tabular
Data?” In Thirty-Sixth Conference on Neural Information
Processing Systems Datasets and Benchmarks Track. https://openreview.net/forum?id=Fp7__phQszn.
Guidotti, Riccardo. 2022. “Counterfactual Explanations and How to
Find Them: Literature Review and Benchmarking.” Data Mining
and Knowledge Discovery, 1–55. https://doi.org/10.1007/s10618-022-00831-6.
Guidotti, Riccardo, Anna Monreale, Salvatore Ruggieri, Franco Turini,
Fosca Giannotti, and Dino Pedreschi. 2018. “A Survey of Methods
for Explaining Black Box Models.” ACM Computing Surveys
(CSUR) 51 (5): 1–42. https://doi.org/10.1145/3236009.
Guyon, Isabelle, and André Elisseeff. 2003. “An Introduction to
Variable and Feature Selection.” Journal of Machine Learning
Research 3 (Mar): 1157–82. https://www.jmlr.org/papers/v3/guyon03a.html.
Hand, David J., and Robert J. Till. 2001. “A Simple Generalisation
of the Area Under the ROC Curve for Multiple Class Classification
Problems.” Machine Learning 45: 171–86. https://doi.org/10.1023/A:1010920819831.
Hansen, Nikolaus, and Anne Auger. 2011. “CMA-ES: Evolution
Strategies and Covariance Matrix Adaptation.” In Proceedings
of the 13th Annual Conference Companion on Genetic and Evolutionary
Computation, 991–1010. https://doi.org/10.1145/2001858.2002123.
Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2001. The
Elements of Statistical Learning. Springer New York. https://doi.org/10.1007/978-0-387-21606-5.
Hooker, Giles, and Lucas K. Mentch. 2019. “Please Stop Permuting
Features: An Explanation and Alternatives.” arXiv preprint arXiv:1905.03151. https://doi.org/10.48550/arXiv.1905.03151.
Horn, Daniel, Tobias Wagner, Dirk Biermann, Claus Weihs, and Bernd
Bischl. 2015. “Model-Based Multi-Objective Optimization: Taxonomy,
Multi-Point Proposal, Toolbox and Benchmark.” In Evolutionary
Multi-Criterion Optimization, edited by António Gaspar-Cunha,
Carlos Henggeler Antunes, and Carlos Coello Coello, 64–78. https://doi.org/10.1007/978-3-319-15934-8_5.
Huang, D., T. T. Allen, W. I. Notz, and N. Zheng. 2012. “Erratum
to: Global Optimization of Stochastic Black-Box Systems via Sequential
Kriging Meta-Models.” Journal of Global Optimization 54
(2): 431. https://doi.org/10.1007/s10898-011-9821-z.
Huang, Jonathan, Galal Galal, Mozziyar Etemadi, and Mahesh Vaidyanathan.
2022. “Evaluation and Mitigation of Racial Bias in Clinical
Machine Learning Models: Scoping Review.” JMIR Medical
Informatics 10 (5). https://doi.org/10.2196/36388.
Hutter, Frank, Lars Kotthoff, and Joaquin Vanschoren, eds. 2019.
Automated Machine Learning: Methods, Systems, Challenges.
Springer.
“Introduction to data.table.” 2023. https://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.html.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani.
2014. An Introduction to Statistical Learning: With Applications in
R. Springer Publishing Company, Incorporated. https://doi.org/10.1007/978-1-4614-7138-7.
Jamieson, Kevin, and Ameet Talwalkar. 2016. “Non-Stochastic Best
Arm Identification and Hyperparameter Optimization.” In
Proceedings of the 19th International Conference on Artificial
Intelligence and Statistics, edited by Arthur Gretton and Christian
C. Robert, 51:240–48. Proceedings of Machine Learning Research. Cadiz,
Spain: PMLR. https://proceedings.mlr.press/v51/jamieson16.html.
Japkowicz, Nathalie, and Mohak Shah. 2011. Evaluating Learning
Algorithms: A Classification Perspective. Cambridge University
Press. https://doi.org/10.1017/CBO9780511921803.
Jones, Donald R., Cary D. Perttunen, and Bruce E. Stuckman. 1993.
“Lipschitzian Optimization Without the Lipschitz
Constant.” Journal of Optimization Theory and
Applications 79 (1): 157–81. https://doi.org/10.1007/BF00941892.
Jones, Donald R., Matthias Schonlau, and William J. Welch. 1998.
“Efficient Global Optimization of Expensive Black-Box
Functions.” Journal of Global Optimization 13 (4):
455–92. https://doi.org/10.1023/A:1008306431147.
Kalbfleisch, John D., and Ross L. Prentice. 2011. The Statistical
Analysis of Failure Time Data. Vol. 360. John Wiley & Sons. https://doi.org/10.1002/9781118032985.
Karl, Florian, Tobias Pielok, Julia Moosbauer, Florian Pfisterer, Stefan
Coors, Martin Binder, Lennart Schneider, et al. 2022.
“Multi-Objective Hyperparameter Optimization–an Overview.”
arXiv preprint arXiv:2206.07438. https://doi.org/10.48550/arXiv.2206.07438.
Kim, Ji-Hyun. 2009. “Estimating Classification Error Rate:
Repeated Cross-Validation, Repeated Hold-Out and Bootstrap.”
Computational Statistics & Data Analysis 53 (11): 3735–45.
https://doi.org/10.1016/j.csda.2009.04.009.
Kim, Jungtaek, and Seungjin Choi. 2021. “On Local Optimizers of
Acquisition Functions in Bayesian Optimization.” In Machine
Learning and Knowledge Discovery in Databases, edited by Frank
Hutter, Kristian Kersting, Jefrey Lijffijt, and Isabel Valera, 675–90.
https://doi.org/10.1007/978-3-030-67661-2_40.
Knowles, Joshua. 2006. “ParEGO: A Hybrid Algorithm with on-Line
Landscape Approximation for Expensive Multiobjective Optimization
Problems.” IEEE Transactions on Evolutionary Computation
10 (1): 50–66. https://doi.org/10.1109/TEVC.2005.851274.
Kohavi, Ron. 1995. “A Study of Cross-Validation and Bootstrap for
Accuracy Estimation and Model Selection.” In Proceedings of
the 14th International Joint Conference on Artificial
Intelligence - Volume 2, 1137–43.
IJCAI’95. San Francisco, CA, USA: Morgan
Kaufmann Publishers Inc.
Kohavi, Ron, and George H. John. 1997. “Wrappers for Feature
Subset Selection.” Artificial Intelligence 97 (1):
273–324. https://doi.org/10.1016/S0004-3702(97)00043-X.
Krzyziński, Mateusz, Mikołaj Spytek, Hubert Baniecki, and Przemysław
Biecek. 2023. “SurvSHAP(t):
Time-Dependent Explanations of Machine Learning Survival Models.”
Knowledge-Based Systems 262: 110234. https://doi.org/10.1016/j.knosys.2022.110234.
Kuehn, Daniel, Philipp Probst, Janek Thomas, and Bernd Bischl. 2018.
“Automatic Exploration of Machine Learning Experiments on
OpenML.” arXiv preprint arXiv:1806.10961. https://arxiv.org/abs/1806.10961.
Lang, Michel. 2017. “checkmate: Fast
Argument Checks for Defensive R Programming.”
The R Journal 9 (1): 437–45. https://doi.org/10.32614/RJ-2017-028.
Lang, Michel, Martin Binder, Jakob Richter, Patrick Schratz, Florian
Pfisterer, Stefan Coors, Quay Au, Giuseppe Casalicchio, Lars Kotthoff,
and Bernd Bischl. 2019. “mlr3: A
Modern Object-Oriented Machine Learning Framework in
R.” Journal of Open Source Software 4 (44): 1903.
https://doi.org/10.21105/joss.01903.
Lang, Michel, Bernd Bischl, and Dirk Surmann. 2017. “batchtools: Tools for R to Work on
Batch Systems.” The Journal of Open Source Software 2
(10): 135. https://doi.org/10.21105/joss.00135.
LeCun, Yann, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998.
“Gradient-Based Learning Applied to Document Recognition.”
Proceedings of the IEEE 86 (11): 2278–2324. https://doi.org/10.1109/5.726791.
Li, Lisha, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and
Ameet Talwalkar. 2018. “Hyperband: A Novel Bandit-Based Approach
to Hyperparameter Optimization.” Journal of Machine Learning
Research 18 (185): 1–52. https://jmlr.org/papers/v18/16-558.html.
Lindauer, Marius, Katharina Eggensperger, Matthias Feurer, André
Biedenkapp, Difan Deng, Carolin Benjamins, Tim Ruhkopf, René Sass, and
Frank Hutter. 2022. “SMAC3: A Versatile Bayesian
Optimization Package for Hyperparameter Optimization.”
Journal of Machine Learning Research 23 (54): 1–9. https://www.jmlr.org/papers/v23/21-0888.html.
Lipton, Zachary C. 2018. “The Mythos of Model Interpretability: In
Machine Learning, the Concept of Interpretability Is Both Important and
Slippery.” Queue 16 (3): 31–57. https://doi.org/10.1145/3236386.3241340.
López-Ibáñez, Manuel, Jérémie Dubois-Lacoste, Leslie Pérez Cáceres,
Mauro Birattari, and Thomas Stützle. 2016. “The irace Package: Iterated Racing for Automatic
Algorithm Configuration.” Operations Research
Perspectives 3: 43–58. https://doi.org/10.1016/j.orp.2016.09.002.
Lundberg, Scott M., Gabriel G. Erion, and Su-In Lee. 2019.
“Consistent Individualized Feature Attribution for Tree
Ensembles.” arXiv preprint arXiv:1802.03888. https://doi.org/10.48550/arXiv.1802.03888.
Mehrabi, Ninareh, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and
Aram Galstyan. 2021. “A Survey on Bias and Fairness in Machine
Learning.” ACM Computing Surveys 54 (6). https://doi.org/10.1145/3457607.
Micci-Barreca, Daniele. 2001. “A Preprocessing Scheme for
High-Cardinality Categorical Attributes in Classification and Prediction
Problems.” ACM SIGKDD Explorations
Newsletter 3 (1): 27–32. https://doi.org/10.1145/507533.507538.
Mitchell, Shira, Eric Potash, Solon Barocas, Alexander D’Amour, and
Kristian Lum. 2021. “Algorithmic Fairness: Choices, Assumptions,
and Definitions.” Annual Review of Statistics and Its
Application 8: 141–63. https://doi.org/10.1146/annurev-statistics-042720-125902.
Molinaro, Annette M., Richard Simon, and Ruth M. Pfeiffer. 2005.
“Prediction Error Estimation: A Comparison of Resampling
Methods.” Bioinformatics 21 (15): 3301–7. https://doi.org/10.1093/bioinformatics/bti499.
Molnar, Christoph. 2022. Interpretable Machine Learning: A Guide for
Making Black Box Models Explainable. 2nd ed. https://christophm.github.io/interpretable-ml-book.
Molnar, Christoph, Bernd Bischl, and Giuseppe Casalicchio. 2018.
“iml: An R Package for
Interpretable Machine Learning.” Journal of Open Source Software 3 (26): 786. https://doi.org/10.21105/joss.00786.
Molnar, Christoph, Gunnar König, Julia Herbinger, Timo Freiesleben,
Susanne Dandl, Christian A. Scholbeck, Giuseppe Casalicchio, Moritz
Grosse-Wentrup, and Bernd Bischl. 2022. “General Pitfalls
of Model-Agnostic Interpretation Methods for Machine Learning
Models.” In xxAI - Beyond Explainable AI: International
Workshop, Held in Conjunction with ICML 2020, July 18, 2020, Vienna,
Austria, Revised and Extended Papers, edited by Andreas Holzinger,
Randy Goebel, Ruth Fong, Taesup Moon, Klaus-Robert Müller, and Wojciech
Samek, 39–68. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-031-04083-2_4.
Morales-Hernández, Alejandro, Inneke Van Nieuwenhuyse, and Sebastian
Rojas Gonzalez. 2022. “A Survey on Multi-Objective Hyperparameter
Optimization Algorithms for Machine Learning.” Artificial
Intelligence Review, 1–51. https://doi.org/10.1007/s10462-022-10359-2.
Niederreiter, Harald. 1988. “Low-Discrepancy and Low-Dispersion
Sequences.” Journal of Number Theory 30 (1): 51–70. https://doi.org/10.1016/0022-314X(88)90025-X.
Pargent, Florian, Florian Pfisterer, Janek Thomas, and Bernd Bischl.
2022. “Regularized Target Encoding Outperforms Traditional Methods
in Supervised Machine Learning with High Cardinality Features.”
Computational Statistics 37 (5): 2671–92. https://doi.org/10.1007/s00180-022-01207-6.
Poulos, Jason, and Rafael Valle. 2018. “Missing Data Imputation
for Supervised Learning.” Applied Artificial
Intelligence 32 (2): 186–96. https://doi.org/10.1080/08839514.2018.1448143.
Provost, Foster, and Tom Fawcett. 2013. Data Science for Business:
What You Need to Know about Data Mining and Data-Analytic Thinking.
O’Reilly Media.
R Core Team. 2019. R: A Language and Environment for Statistical
Computing. Vienna, Austria: R Foundation for
Statistical Computing. https://www.R-project.org/.
Ribeiro, Marco, Sameer Singh, and Carlos Guestrin. 2016.
““Why Should I Trust You?”:
Explaining the Predictions of Any Classifier.” In Proceedings
of the 2016 Conference of the North American Chapter of the
Association for Computational Linguistics: Demonstrations, 97–101.
San Diego, California: Association for Computational Linguistics. https://doi.org/10.18653/v1/N16-3020.
Romaszko, Kamil, Magda Tatarynowicz, Mateusz Urbański, and Przemysław
Biecek. 2019. “modelDown: Automated Website Generator with
Interpretable Documentation for Predictive Machine Learning
Models.” Journal of Open Source Software 4 (38): 1444.
https://doi.org/10.21105/joss.01444.
Ruspini, Enrique H. 1970. “Numerical Methods for Fuzzy
Clustering.” Information Sciences 2 (3): 319–50. https://doi.org/10.1016/S0020-0255(70)80056-1.
Saleiro, Pedro, Benedict Kuester, Abby Stevens, Ari Anisfeld, Loren
Hinkson, Jesse London, and Rayid Ghani. 2018. “Aequitas: A Bias
and Fairness Audit Toolkit.” arXiv preprint
arXiv:1811.05577. https://doi.org/10.48550/arXiv.1811.05577.
Schmidberger, Markus, Martin Morgan, Dirk Eddelbuettel, Hao Yu, Luke
Tierney, and Ulrich Mansmann. 2009. “State of the Art in Parallel
Computing with R.” Journal of Statistical
Software 31 (1). https://doi.org/10.18637/jss.v031.i01.
Schratz, Patrick, Marc Becker, Michel Lang, and Alexander Brenning.
2021. “mlr3spatiotempcv:
Spatiotemporal Resampling Methods for Machine Learning in
R.” arXiv preprint arXiv:2110.12674. https://arxiv.org/abs/2110.12674.
Silverman, Bernard W. 1986. Density Estimation for Statistics and
Data Analysis. Vol. 26. CRC Press.
Simon, Richard. 2007. “Resampling Strategies for Model Assessment
and Selection.” In Fundamentals of Data Mining in Genomics
and Proteomics, edited by Werner Dubitzky, Martin Granzow, and
Daniel Berrar, 173–86. Boston, MA: Springer
US. https://doi.org/10.1007/978-0-387-47509-7_8.
Snoek, Jasper, Hugo Larochelle, and Ryan P. Adams. 2012. “Practical
Bayesian Optimization of Machine Learning Algorithms.” In
Advances in Neural Information Processing Systems, edited by F.
Pereira, C. J. Burges, L. Bottou, and K. Q. Weinberger. Vol. 25. https://proceedings.neurips.cc/paper_files/paper/2012/file/05311655a15b75fab86956663e1819cd-Paper.pdf.
Sonabend, Raphael Edward Benjamin. 2021. “A Theoretical and
Methodological Framework for Machine Learning in Survival Analysis:
Enabling Transparent and Accessible Predictive Modelling on
Right-Censored Time-to-Event Data.” PhD thesis, University College
London (UCL). https://discovery.ucl.ac.uk/id/eprint/10129352/.
Sonabend, Raphael, and Andreas Bender. 2023. Machine Learning in
Survival Analysis. https://www.mlsabook.com.
Sonabend, Raphael, Andreas Bender, and Sebastian Vollmer. 2022.
“Avoiding C-Hacking When Evaluating Survival
Distribution Predictions with Discrimination Measures.” Edited by
Zhiyong Lu. Bioinformatics 38 (17): 4178–84. https://doi.org/10.1093/bioinformatics/btac451.
Sonabend, Raphael, Franz J Király, Andreas Bender, Bernd Bischl, and
Michel Lang. 2021. “mlr3proba: An
R Package for Machine Learning in Survival
Analysis.” Bioinformatics 37 (17): 2789–91. https://doi.org/10.1093/bioinformatics/btab039.
Sonabend, Raphael, Florian Pfisterer, Alan Mishler, Moritz Schauer,
Lukas Burk, Sumantrak Mukherjee, and Sebastian Vollmer. 2022.
“Flexible Group Fairness Metrics for Survival Analysis.” In
DSHealth 2022 Workshop on Applied Data Science for Healthcare at
KDD2022. https://arxiv.org/abs/2206.03256.
Stein, Michael. 1987. “Large Sample Properties of Simulations
Using Latin Hypercube Sampling.” Technometrics 29 (2):
143–51. https://doi.org/10.2307/1269769.
Strobl, Carolin, Anne-Laure Boulesteix, Thomas Kneib, Thomas Augustin,
and Achim Zeileis. 2008. “Conditional Variable Importance for
Random Forests.” BMC Bioinformatics 9 (1).
https://doi.org/10.1186/1471-2105-9-307.
Štrumbelj, Erik, and Igor Kononenko. 2013. “Explaining Prediction
Models and Individual Predictions with Feature Contributions.”
Knowledge and Information Systems 41 (3): 647–65. https://doi.org/10.1007/s10115-013-0679-x.
Thornton, Chris, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown.
2013. “Auto-WEKA: Combined Selection and Hyperparameter
Optimization of Classification Algorithms.” In Proceedings of the
19th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining. ACM. https://doi.org/10.1145/2487575.2487629.
Tsallis, Constantino, and Daniel A. Stariolo. 1996. “Generalized
Simulated Annealing.” Physica A: Statistical Mechanics and
Its Applications 233 (1–2): 395–406. https://doi.org/10.1016/S0378-4371(96)00271-3.
Vanschoren, Joaquin, Jan N. van Rijn, Bernd Bischl, and Luis Torgo.
2013. “OpenML: Networked Science in Machine Learning.”
SIGKDD Explorations 15 (2): 49–60. https://doi.org/10.1145/2641190.2641198.
Wachter, Sandra, Brent Mittelstadt, and Chris Russell. 2017.
“Counterfactual Explanations Without Opening the Black Box:
Automated Decisions and the GDPR.”
SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3063289.
———. 2021. “Why Fairness Cannot Be Automated: Bridging the Gap
Between EU Non-Discrimination Law and AI.” Computer Law &
Security Review 41: 105567. https://doi.org/10.1016/j.clsr.2021.105567.
Watson, David S., and Marvin N. Wright. 2021. “Testing Conditional
Independence in Supervised Learning Algorithms.” Machine
Learning 110 (8): 2107–29. https://doi.org/10.1007/s10994-021-06030-6.
Wexler, James, Mahima Pushkarna, Tolga Bolukbasi, Martin Wattenberg,
Fernanda Viégas, and Jimbo Wilson. 2019. “The What-If Tool:
Interactive Probing of Machine Learning Models.” IEEE
Transactions on Visualization and Computer Graphics 26 (1): 56–65.
https://doi.org/10.1109/TVCG.2019.2934619.
Wickham, Hadley, and Garrett Grolemund. 2017. R for
Data Science: Import, Tidy, Transform, Visualize, and Model Data.
1st ed. O’Reilly Media. https://r4ds.had.co.nz/.
Williams, Christopher K. I., and Carl Edward Rasmussen. 2006. Gaussian
Processes for Machine Learning. Cambridge, MA: MIT Press.
Wiśniewski, Jakub, and Przemysław Biecek. 2022.
“fairmodels: A Flexible Tool for Bias Detection,
Visualization, and Mitigation in Binary Classification Models.”
The R Journal 14: 227–43. https://doi.org/10.32614/RJ-2022-019.
Wolpert, David H. 1992. “Stacked Generalization.”
Neural Networks 5 (2): 241–59. https://doi.org/10.1016/S0893-6080(05)80023-1.
Xiang, Yang, Sylvain Gubian, Brian Suomela, and Julia Hoeng. 2013.
“Generalized Simulated Annealing for Global Optimization: The
GenSA Package.” The R Journal 5 (1): 13–28. https://doi.org/10.32614/RJ-2013-002.