Determining health care cost drivers in older Hodgkin lymphoma survivors using interpretable machine learning methods

J Manag Care Spec Pharm. 2025 Apr;31(4):406-420. doi: 10.18553/jmcp.2025.31.4.406.

ABSTRACT

BACKGROUND: The cost of health care for patients with Hodgkin lymphoma (HL) is projected to rise, making it essential to understand expenditure drivers across different demographics, including the older adult population. Although older adults make up a substantial share of HL patients, the literature on their health care expenditures is sparse. The predictive capabilities of machine learning (ML) methods support a data-driven approach to identifying key predictors of expenditures and planning for future expenditures.

OBJECTIVE: To determine the leading predictors of health care expenditures among older HL survivors across prediagnosis, treatment, and posttreatment phases of care.

METHODS: This retrospective study identified incident cases of HL diagnosed between 2009 and 2017 using Surveillance, Epidemiology, and End Results-Medicare data. Three phases of cancer care (prediagnosis, treatment, and posttreatment) were indexed around the diagnosis date, with each phase divided into 12 months of baseline and 12 months of follow-up. ML methods, including XGBoost, Random Forest, and Cross-Validated linear regressions, were compared to identify the best regression model for predicting Medicare and out-of-pocket (OOP) health care expenditures. The interpretable ML method SHapley Additive exPlanations (SHAP) was used to identify the leading predictors of Medicare and OOP health care expenditures in each phase.
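To illustrate the idea behind SHapley Additive exPlanations, the following minimal sketch computes exact Shapley values for a single patient by averaging each feature's marginal contribution over all feature orderings. It is not the authors' pipeline and does not use the SHAP library: the cost model, its coefficients, the feature names, and the patient values are all hypothetical, and a single reference patient stands in for the background data that SHAP would normally average over.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features start at their baseline values and are switched to the
    instance's values one at a time; each feature's marginal change in
    the prediction is averaged over every possible ordering.
    """
    features = list(instance)
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

# Hypothetical toy cost model (illustrative coefficients, not study estimates).
def toy_cost_model(x):
    return (0.8 * x["baseline_expenditure"]
            + 150.0 * x["n_prescriptions"]
            + 2000.0 * x["cardiac_dysrhythmia"]
            + 50.0 * x["n_prescriptions"] * x["cardiac_dysrhythmia"])

# Hypothetical reference patient and patient of interest.
reference = {"baseline_expenditure": 5000.0, "n_prescriptions": 4, "cardiac_dysrhythmia": 0}
patient = {"baseline_expenditure": 9000.0, "n_prescriptions": 10, "cardiac_dysrhythmia": 1}

phi = shapley_values(toy_cost_model, patient, reference)

# Rank features by the size of their contribution to this prediction.
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:.1f}")
```

The contributions sum to the difference between the patient's prediction and the reference prediction, which is the additivity property that makes Shapley-based rankings of predictors interpretable. Exact enumeration grows factorially with the number of features, so practical tools rely on approximations or model-specific algorithms.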

RESULTS: The study analyzed 1,242 patients in the prediagnosis phase, 902 in the treatment phase, and 873 in the posttreatment phase. XGBoost regression outperformed the Random Forest and Cross-Validated linear regression models in predicting Medicare expenditures, with R-squared (root mean square error) values of 0.42 (1.39), 0.43 (0.56), and 0.46 (0.90) across the 3 phases of care, respectively. Interpretable ML methods highlighted baseline expenditures, number of prescription medications, and cardiac dysrhythmia as the leading predictors of Medicare and OOP expenditures in the prediagnosis phase. Chemotherapy and immunotherapy were the leading predictors of expenditures in the treatment phase, and surgical treatment and immunotherapy were the leading predictors in the posttreatment phase.
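For readers unfamiliar with the two reported metrics, the sketch below shows how R-squared and root mean square error are computed from observed and predicted values. The numbers are made up for illustration; the abstract does not state the scale on which expenditures were modeled, so nothing here reproduces the study's results.

```python
from math import sqrt

def r_squared(actual, predicted):
    """Share of variance in the observed values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean square error, in the same units as the outcome."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical observed and predicted expenditures (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(f"R-squared: {r_squared(observed, predicted):.2f}")
print(f"RMSE: {rmse(observed, predicted):.2f}")
```

A higher R-squared and a lower root mean square error both indicate a better fit, which is the basis on which competing regression models are typically ranked.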

CONCLUSIONS: As ML applications for predicting health care expenditures increase, researchers should consider fitting models separately for each phase of care to identify how the predictors change. Leading predictors of health care expenditures can be targeted for informed policy development to address financial hardship in HL survivors.

PMID:40152796 | DOI:10.18553/jmcp.2025.31.4.406

By Nevin Manimala
