JMIR Med Inform. 2022 Jul 26. doi: 10.2196/37578. Online ahead of print.
BACKGROUND: Healthcare costs have continued to rise in recent years despite various government efforts and policies. The Centers for Medicare and Medicaid Services projects that healthcare costs will keep growing over the next few years. Rising readmission costs have been a significant contributor to this increase. Multiple areas of healthcare, including readmissions, have benefited from the application of machine learning algorithms.
OBJECTIVE: We aimed to identify suitable models for predicting readmission charges billed by hospitals. Our literature review showed that this application of machine learning remains underexplored. We evaluated a range of predictive methods, from glass-box models (such as regularization techniques) to black-box models (such as deep learning-based models).
METHODS: We defined readmissions in two ways: readmission with the same major diagnostic category (RSDC) and all-cause readmission category (RADC). For 2013, we identified 576,701 individuals for RSDC and 1,091,580 individuals for RADC from the Nationwide Readmissions Database (NRD) of the Healthcare Cost and Utilization Project (HCUP), Agency for Healthcare Research and Quality (AHRQ). We tested six machine learning algorithms for both RSDC and RADC using 10-fold cross-validation: linear regression, Lasso regression, Elastic Net, Ridge regression, XGBoost, and a deep learning model based on a multilayer perceptron (MLP).
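The 10-fold cross-validation named in the methods can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' pipeline: the helper names `k_fold_indices` and `cross_validate_mean_baseline` are hypothetical, and a mean-of-training-fold predictor stands in for the six models actually tested.

```python
import math
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: shuffles the sample indices once, deals them into
    k folds, and holds each fold out in turn.
    """
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def cross_validate_mean_baseline(y, k=10, seed=0):
    """Return the per-fold RMSE of a predictor that always outputs the
    training-fold mean (a placeholder model, assumed for illustration)."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        mean_train = sum(y[j] for j in train_idx) / len(train_idx)
        mse = sum((y[j] - mean_train) ** 2 for j in test_idx) / len(test_idx)
        scores.append(math.sqrt(mse))
    return scores
```

Each sample appears in exactly one held-out fold, so every model is scored on data it was not fitted to; replacing the placeholder predictor with a fitted regressor gives one RMSE per fold for each model.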
RESULTS: Our preliminary, data-driven analysis showed that the subsequent readmission charge billed per patient was higher than the previous charge for 541,090 individuals under RADC and for 319,233 individuals under RSDC. The top three major diagnostic categories (MDCs) for such instances were the same for both RADC and RSDC. The average readmission charge billed was higher than the previous charge for 21 of the MDCs under RSDC but for only 13 of the MDCs under RADC. We recommend XGBoost and a deep learning model based on MLP for predicting readmission charges. XGBoost achieved the following performance: (i) RADC (MAPE 3.121%; RMSE 0.414; MAE 0.317; RRSE 0.410; RAE 0.399; NRMSE 0.040; MAD 0.031) and (ii) RSDC (MAPE 3.171%; RMSE 0.421; MAE 0.321; RRSE 0.407; RAE 0.393; NRMSE 0.041; MAD 0.031). The MLP-based deep neural network achieved the following performance: (i) RADC (MAPE 3.103%; RMSE 0.413; MAE 0.316; RRSE 0.410; RAE 0.397; NRMSE 0.040; MAD 0.031) and (ii) RSDC (MAPE 3.202%; RMSE 0.427; MAE 0.326; RRSE 0.413; RAE 0.399; NRMSE 0.041; MAD 0.032). Repeated measures ANOVA showed that the mean RMSE differed significantly across models (P<.001). Post hoc tests with the Bonferroni correction indicated that the mean RMSE of the deep learning and XGBoost models was significantly lower (P<.001) than that of all other models, i.e., linear regression, Elastic Net, Lasso, and Ridge regression.
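For readers unfamiliar with the error metrics reported in the results, the sketch below computes most of them from paired observed and predicted values using their common textbook definitions. It is an illustration under stated assumptions, not the authors' code: the function name `regression_metrics` is hypothetical, NRMSE is assumed here to be RMSE divided by the range of the observed values (the abstract does not say which normalization was used), and MAD is omitted because the abstract does not define it.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute common regression error metrics (hypothetical helper).

    Assumes no observed value is zero (needed for MAPE) and that the
    observed values are not all identical (needed for RRSE, RAE, NRMSE).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    sq_err = sum(e * e for e in errors)                    # model squared error
    abs_err = sum(abs(e) for e in errors)                  # model absolute error
    sq_dev = sum((t - mean_true) ** 2 for t in y_true)     # mean-predictor squared error
    abs_dev = sum(abs(t - mean_true) for t in y_true)      # mean-predictor absolute error
    rmse = math.sqrt(sq_err / n)

    return {
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "RMSE": rmse,
        "MAE": abs_err / n,
        "RRSE": math.sqrt(sq_err / sq_dev),
        "RAE": abs_err / abs_dev,
        # Assumption: range normalization; other conventions divide by the mean.
        "NRMSE": rmse / (max(y_true) - min(y_true)),
    }
```

RRSE and RAE compare the model with a predictor that always outputs the observed mean, so values below 1 indicate that the model does better than that baseline.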
CONCLUSIONS: Models built using XGBoost and MLP deep neural networks are suitable for predicting readmission charges billed by hospitals. Models can use the MDCs to predict hospital readmission charges accurately.
CLINICALTRIAL: Not applicable.