Comparing deep learning and classical regression approaches for predicting healthcare expenditure and spending: a systematic review

J Med Econ. 2026 Dec;29(1):654-671. doi: 10.1080/13696998.2026.2630598. Epub 2026 Mar 4.

ABSTRACT

AIMS: This study compares deep learning architectures with traditional regression and tree-based models for individual-level healthcare cost prediction, with particular attention to performance differences across data contexts.

METHODS: We conducted a preregistered systematic review (PROSPERO CRD420251129440). Web of Science, PubMed, Embase, and Scopus were searched through August 2025. Eligible studies used real-world individual-level data (claims, electronic health records, or registries) to predict cost-related outcomes with at least one deep learning architecture and one classical regression comparator, and reported quantitative performance. Data were extracted on population, predictors, outcome horizon, model type, validation strategy, performance metrics, calibration, and interpretability.

RESULTS: Eight studies met inclusion criteria, spanning the United States, Europe, and Asia. In longitudinal designs (such as multi-year claims prediction and medication or hospitalization time-series forecasting), sequential deep learning models, particularly LSTM and CNN-LSTM hybrids, consistently outperformed regression and tree-based algorithms. Reported gains included approximately 10-20% reductions in RMSE/MAE, R² improvements of 0.01-0.15, and AUROC values up to 0.78 for high-risk classification. Across studies, prior costs and utilization were consistently the strongest predictors, while social determinants and free-text features were rarely incorporated. In contrast, for low-dimensional, structured, cross-sectional medical data, generalized linear models and tree-based approaches remain robust baseline models due to their interpretability and calibration stability.
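
For readers unfamiliar with how a relative error reduction such as the 10-20% figure is typically derived, the following is a minimal sketch, not taken from the reviewed studies. It assumes the reduction is computed as (baseline error minus candidate error) divided by baseline error; the helper names and the cost values are hypothetical and purely illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted costs."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted costs."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def relative_reduction(baseline_error, candidate_error):
    """Fraction by which the candidate model lowers the baseline error."""
    return (baseline_error - candidate_error) / baseline_error


# Hypothetical annual costs for four individuals (illustrative only).
actual = [1000.0, 2000.0, 3000.0, 4000.0]
# Hypothetical predictions from a classical baseline (each off by 500).
baseline_pred = [1500.0, 1500.0, 3500.0, 3500.0]
# Hypothetical predictions from a sequential model (each off by 400).
candidate_pred = [1400.0, 1600.0, 3400.0, 3600.0]

rmse_gain = relative_reduction(rmse(actual, baseline_pred),
                               rmse(actual, candidate_pred))
mae_gain = relative_reduction(mae(actual, baseline_pred),
                              mae(actual, candidate_pred))

print(f"RMSE reduction: {rmse_gain:.0%}")
print(f"MAE reduction: {mae_gain:.0%}")
```

In this toy case both metrics fall from 500 to 400, a 20% reduction. On real cost data, which is usually right-skewed, RMSE and MAE reductions can differ substantially because RMSE weights large errors more heavily.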

LIMITATIONS: Evidence is based on a small and heterogeneous set of eight studies, with limited external or temporal validation, short prediction horizons, and sparse assessment of calibration, economic interpretability, and fairness, warranting cautious interpretation.

CONCLUSIONS: Deep learning offers clear gains for longitudinal, sequence-rich cost forecasting, whereas tree-based methods remain highly competitive for cross-sectional tabular prediction. Overall, these findings are consistent with the proposed Complexity-Performance Hypothesis, which posits that the predictive advantages of deep learning emerge primarily when model capacity is well matched to data complexity.

PMID:41779998 | DOI:10.1080/13696998.2026.2630598

By Nevin Manimala