
Toward Reliable Coronary Heart Disease Prediction: Integrating Multi-source Data with Ensemble Machine Learning

J Imaging Inform Med. 2025 Aug 15. doi: 10.1007/s10278-025-01644-x. Online ahead of print.

ABSTRACT

Coronary heart disease is one of the leading causes of death worldwide. Early detection and accurate risk assessment are critical for improving patient health and reducing fatality rates. Machine learning has recently emerged as a powerful approach for predicting heart disease from clinical data, enabling timely intervention and personalized treatment. This study therefore proposes a reliable model for predicting coronary heart disease that integrates multi-source heart disease data with several machine learning models. It uses heart disease datasets from four separate databases (Cleveland, Hungary, Switzerland, and VA Long Beach) provided by the UCI Machine Learning Repository. The machine learning models used include logistic regression, Naive Bayes, random forest, extreme gradient boosting, K-nearest neighbors, decision trees, and support vector machines. These models were assessed with evaluation metrics including the confusion matrix, accuracy, precision, recall, and F1-score. The models with the highest accuracy were integrated into the proposed ensemble-learning model. The synthetic minority oversampling technique (SMOTE) was applied before training to address class imbalance, which is common in medical datasets. The proposed ensemble model achieved 98.46% accuracy, 96% precision, 100% recall, and a 98% F1-score. These findings demonstrate the effectiveness and robustness of the proposed model in accurately predicting coronary heart disease.
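As a rough illustration of how ensemble predictions and the reported metrics relate, the sketch below combines three sets of binary predictions and scores the result from confusion-matrix counts. The abstract does not say how the selected models are combined, so hard majority voting is an assumption here. The function names (`majority_vote`, `confusion_counts`, `classification_metrics`) and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, return the label most models predicted.

    Assumes an odd number of models with binary labels, so ties cannot occur.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels (1 = disease, 0 = no disease); purely illustrative data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 0, 1, 1, 0, 0, 0, 0]
model_c = [1, 1, 1, 1, 0, 1, 1, 0]

ensemble_pred = majority_vote([model_a, model_b, model_c])
scores = classification_metrics(y_true, ensemble_pred)
```

In this toy case the ensemble misses no positive case but raises one false alarm, so recall is perfect while precision is lower. This is the same pattern as the reported figures, where 96% precision and 100% recall give an F1-score of 2(0.96)(1.00)/(0.96 + 1.00) ≈ 0.98.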

PMID: 40817318 | DOI: 10.1007/s10278-025-01644-x

By Nevin Manimala
