Comparison of the effectiveness of different machine learning algorithms in predicting new fractures after PKP for osteoporotic vertebral compression fractures

J Orthop Surg Res. 2023 Jan 23;18(1):62. doi: 10.1186/s13018-023-03551-9.

ABSTRACT

BACKGROUND: Machine learning has the potential to estimate the probability of a binary classification event more accurately than traditional statistical methods, yet few previous studies on predicting new fractures after osteoporotic vertebral compression fractures (OVCFs) have focused on this point. The aim of this study was to explore whether several different machine learning models could produce better predictions than logistic regression models and to select an optimal model.

METHODS: A retrospective analysis was performed of 529 patients who underwent percutaneous kyphoplasty (PKP) for OVCFs at our institution between June 2017 and June 2020. The patient data were used to build six machine learning models (decision trees (DT), random forests (RF), support vector machines (SVM), gradient boosting machines (GBM), neural networks (NNET), and regularized discriminant analysis (RDA)) as well as a logistic regression (LR) model to estimate the probability of new fractures occurring after surgery. The dataset was divided into a training set (75%) and a test set (25%); the models were built on the training set with tenfold cross-validation and then evaluated on the test set, with performance assessed by comparing the area under the curve (AUC) of each model.
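The abstract describes the modelling workflow but not its implementation. As a purely illustrative sketch, the Python/scikit-learn code below reproduces the same steps (75/25 split, tenfold cross-validation on the training set, test-set AUC) on placeholder data; the feature matrix X, outcome y, model hyperparameters, and the use of regularized quadratic discriminant analysis as a stand-in for RDA are assumptions, not the authors' code.

import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_auc_score

# Placeholder data standing in for the 529 patients' clinical predictors
# and the binary outcome (1 = new fracture after PKP).
rng = np.random.default_rng(0)
X = rng.normal(size=(529, 10))
y = rng.integers(0, 2, size=529)

# 75% training set / 25% test set, as described in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=42),
    "RF": RandomForestClassifier(random_state=42),
    "GBM": GradientBoostingClassifier(random_state=42),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=42)),
    "NNET": make_pipeline(StandardScaler(), MLPClassifier(max_iter=2000, random_state=42)),
    # Regularized QDA used here as an approximation of RDA.
    "RDA": QuadraticDiscriminantAnalysis(reg_param=0.5),
}

for name, model in models.items():
    # Tenfold cross-validated AUC within the training set.
    cv_auc = cross_val_score(model, X_train, y_train, cv=10, scoring="roc_auc").mean()
    # Refit on the full training set, then evaluate on the held-out test set.
    model.fit(X_train, y_train)
    test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: CV AUC = {cv_auc:.3f}, test AUC = {test_auc:.3f}")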

RESULTS: Among the six machine learning algorithms, only DT [AUC 0.775 (95% CI 0.728-0.822)] performed worse than LR [0.831 (95% CI 0.783-0.878)]; RF [0.953 (95% CI 0.927-0.980)], GBM [0.941 (95% CI 0.911-0.971)], SVM [0.869 (95% CI 0.827-0.910)], NNET [0.869 (95% CI 0.826-0.912)], and RDA [0.890 (95% CI 0.851-0.929)] all outperformed LR.
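The abstract does not state how the 95% confidence intervals for the AUCs were obtained; a common choice is a percentile bootstrap over the held-out test set. The sketch below assumes that approach, with hypothetical arrays y_test (true labels) and y_prob (a model's predicted probabilities); it is not necessarily the method the authors used.

import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, seed=0):
    # Percentile-bootstrap 95% CI for the AUC of one model on the test set.
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)            # resample patients with replacement
        if len(np.unique(y_true[idx])) < 2:    # AUC requires both classes present
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lower, upper = np.percentile(aucs, [2.5, 97.5])
    return roc_auc_score(y_true, y_score), lower, upper

# Example: auc, lo, hi = bootstrap_auc_ci(y_test, model.predict_proba(X_test)[:, 1])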

CONCLUSIONS: For predicting the probability of new fractures after PKP, machine learning algorithms outperformed logistic regression, with random forest having the strongest predictive power.

PMID:36683045 | DOI:10.1186/s13018-023-03551-9
