Foundations of Machine Learning-Based Clinical Prediction Modeling: Part IV - A Practical Approach to Binary Classification Problems

Acta Neurochirurgica Supplement, 2022, Vol. 134, pp. 33-41

Staartjes VE, Kernbach JM

Abstract

We illustrate the steps required to train and validate a simple, machine learning-based clinical prediction model for any binary outcome, such as the occurrence of a complication, in the statistical programming language R. To illustrate the methods applied, we supply a simulated database of 10,000 glioblastoma patients who underwent microsurgery, and predict 12-month survival. We walk the reader through each step, including import, checking, and splitting of datasets. In terms of pre-processing, we focus on how to practically implement imputation using a k-nearest neighbor algorithm, and how to perform feature selection using recursive feature elimination. When it comes to training models, we apply the theory discussed in Parts I-III. We show how to implement bootstrapping and how to evaluate and select models based on out-of-sample error. Specifically for classification, we discuss how to counteract class imbalance by using upsampling techniques. We discuss why it is paramount to report, at a minimum, accuracy, area under the curve (AUC), sensitivity, and specificity for discrimination, as well as slope and intercept for calibration, if possible alongside a calibration plot. Finally, we explain how to arrive at a measure of variable importance using a universal, AUC-based method. We provide the full, structured code, as well as the complete glioblastoma survival database for the readers to download and execute in parallel to this section.
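The discrimination metrics named in the abstract can be made concrete with a small example. The chapter itself works in R; what follows is only a language-neutral sketch in plain Python, not the authors' code. The function name `discrimination_metrics`, the 0.5 classification threshold, and the toy labels and probabilities are all assumptions made for illustration.

```python
def discrimination_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity and AUC for a binary outcome.

    y_true: list of 0/1 labels (1 = event, e.g. 12-month survival).
    y_prob: predicted probabilities of the event.
    Assumes both classes occur at least once (hypothetical helper).
    """
    # Dichotomize the predicted probabilities at the chosen threshold.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # AUC as the probability that a randomly chosen positive case is
    # ranked above a randomly chosen negative case (ties count one half).
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)

    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "auc": wins / (len(pos) * len(neg)),
    }


# Toy data, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
metrics = discrimination_metrics(y_true, y_prob)
print(metrics)
```

On this toy data, accuracy, sensitivity, and specificity each come out at 0.75 and the AUC at 0.9375. Unlike the other three, the AUC does not depend on the threshold, which is one reason to report it alongside them. Calibration slope and intercept are not covered by this sketch.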

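Extracted medical entities (NER)

Type | English term | Korean / gloss | UMLS CUI | Source | Occurrences
Procedure | microsurgery | 미세수술 | - | dict | 1
Disease | glioblastoma | - | C0017636 (Glioblastoma) | scispacy | 1
Disease | Part IV-A | - | - | scispacy | 1
Disease | glioblastoma patients | - | - | scispacy | 1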

MeSH Terms

Algorithms; Humans; Logistic Models; Machine Learning; Models, Statistical; Prognosis
