JMIR Med Inform. 2026 Mar 23;14:e86171. doi: 10.2196/86171.
ABSTRACT
BACKGROUND: Digital health literacy (DHL) is the ability to locate, understand, evaluate, and apply health information in digital environments. It is essential for older adults to effectively engage with contemporary health care. However, existing DHL assessments primarily rely on self-reported measures, which are susceptible to subjective bias and often fail to capture actual performance. There is a need for a comprehensive, data-driven approach that integrates objective performance indicators with self-assessments to accurately predict and explain DHL levels in older adults.
OBJECTIVE: This study develops and validates a machine learning approach to predict DHL levels in older adults by integrating performance-based and self-assessed evaluations.
METHODS: We applied a 2-stage methodological framework using 2 independent datasets. In the first stage, to identify performance-based determinants, we assessed actual digital and information comprehension in a separate pilot cohort of 30 older adults (aged 60-74 years). In parallel, to measure self-reported DHL, we conducted an online survey with a distinct group of 1000 older adults (aged 55-74 years) using the Digital Health Literacy Scale and the Korean version of the eHealth Literacy Scale (KeHEALS). Bayesian linear regression was applied to both datasets to identify significant explanatory variables. In the second stage, we trained and validated a binary classification model to predict KeHEALS levels using the survey dataset (n=1000), leveraging the features identified in the first stage. Five machine learning algorithms were evaluated, and the best-performing model was interpreted using Shapley Additive Explanations (SHAP) analysis.
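SHAP attributes a single prediction to its input features using Shapley values. The sketch below is an illustration only: it computes exact Shapley values by enumerating feature subsets for a toy linear score. The function names, feature names, weights, and baseline are all hypothetical and are not taken from the study, and practical SHAP analyses of boosted-tree models typically rely on faster tree-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating all feature subsets.

    Features outside a subset are set to their baseline value.
    Cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(subset):
        # Evaluate the model with only the features in `subset` "present".
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for subsets of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                with_i = value(set(s) | {i})
                without_i = value(set(s))
                total += weight * (with_i - without_i)
        phi.append(total)
    return phi


# Hypothetical toy score (NOT the study's model): two positive
# contributors and one negative contributor.
FEATURES = ["self_care_confidence", "devices_used", "alcohol_intake"]


def toy_score(z):
    return 0.5 * z[0] + 0.3 * z[1] - 0.2 * z[2]


x = [1.0, 1.0, 1.0]          # instance to explain (made-up values)
baseline = [0.0, 0.0, 0.0]   # reference input (made-up values)
phi = shapley_values(toy_score, x, baseline)

for name, contribution in zip(FEATURES, phi):
    print(f"{name}: {contribution:+.3f}")
```

For a linear score, each Shapley value reduces to the feature weight times the feature's difference from the baseline, and the values sum to the gap between the prediction for the instance and the prediction for the baseline.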
RESULTS: In the pilot performance-based assessment, using a greater number of electronic devices and having higher educational attainment were positively associated with comprehension, whereas alcohol intake showed a negative association. In the self-assessed survey data, key correlates included interest in health-related apps, self-care confidence, age, smoking, alcohol intake, number of devices used, and exercise frequency. Among the machine learning models, categorical boosting demonstrated the most balanced performance (accuracy 0.785, precision 0.769, F1-score 0.765, area under the receiver operating characteristic curve 0.835), outperforming the dummy classifier (accuracy 0.540). SHAP analysis indicated that self-care confidence, health information search, interest in health-related apps, number of electronic devices used, and exercise frequency were the strongest positive contributors to high-DHL predictions, whereas older age and lifestyle factors (alcohol intake, smoking) contributed negatively.
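The reported metrics can be related to one another with a short worked example. The sketch below computes accuracy, precision, recall, and F1-score from binary labels and compares them with a majority-class baseline, one common form of dummy classifier (the abstract does not state which dummy strategy was used). The labels are invented for illustration and are not study data, and the function names are hypothetical.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def majority_baseline(y_train, n_predictions):
    """Always predict the most frequent training label."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n_predictions


# Invented labels (1 = high DHL, 0 = low DHL); not study data.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0, 0, 0]

model_metrics = classification_metrics(y_true, y_pred)
baseline_pred = majority_baseline(y_true, len(y_true))
baseline_metrics = classification_metrics(y_true, baseline_pred)

print("model:   ", model_metrics)
print("baseline:", baseline_metrics)
```

A majority-class baseline can reach moderate accuracy on imbalanced labels while scoring zero precision and recall for the minority class, which is why accuracy is usually read alongside the other metrics.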
CONCLUSIONS: By explicitly integrating performance-based and self-assessed indicators within an explainable machine learning framework, this study demonstrates that DHL in older adults is influenced by both digital engagement and health management factors. These findings suggest that the proposed framework can serve as a structured approach for evaluating DHL in older adults and inform the design of personalized digital health interventions in clinical and community settings.
PMID:41871331 | DOI:10.2196/86171