Dementia risk predictions from German claims data using methods of machine learning

Alzheimers Dement. 2022 Apr 22. doi: 10.1002/alz.12663. Online ahead of print.

ABSTRACT

INTRODUCTION: We examined whether German claims data are suitable for dementia risk prediction, how machine learning (ML) compares to classical regression, and what the important predictors for dementia risk are.

METHODS: We analyzed data from the largest German health insurance company, including 117,895 dementia-free people aged 65 and older. Follow-up was 10 years. Predictors were 23 age-related diseases, 212 medical prescriptions, and 87 surgery codes, as well as age and sex. Statistical methods included logistic regression (LR), gradient boosting (GBM), and random forest (RF).

RESULTS: Discriminatory power was moderate for LR (C-statistic = 0.714; 95% confidence interval [CI] = 0.708-0.720) and GBM (C-statistic = 0.707; 95% CI = 0.700-0.713) and lower for RF (C-statistic = 0.636; 95% CI = 0.628-0.643). GBM had the best model calibration. Important predictors included antipsychotic medications and cerebrovascular disease, as well as a less-established one: a specific antibacterial prescription.
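
For readers unfamiliar with the C-statistic reported above, the following is a minimal sketch of how it can be computed for a binary outcome: the share of (case, non-case) pairs in which the case receives the higher predicted risk, with ties counted as one half. The function name `c_statistic` and the toy data are illustrative assumptions and are not taken from the study.

```python
def c_statistic(y_true, y_score):
    """Concordance statistic for a binary outcome.

    y_true  : iterable of 0/1 labels (1 = developed the outcome)
    y_score : iterable of predicted risks, same length as y_true
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    non_cases = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0   # case ranked above non-case
            elif c == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(non_cases))


# Hypothetical toy data: four people, two of whom developed the outcome.
y = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(c_statistic(y, scores))  # 3 of 4 pairs are concordant -> 0.75
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, so for a cohort of the size analyzed here a rank-based implementation would be the practical choice.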

DISCUSSION: Our models from German claims data have acceptable accuracy and may provide cost-effective decision support for early dementia screening.

PMID:35451562 | DOI:10.1002/alz.12663

By Nevin Manimala