BMC Psychol. 2025 Apr 17;13(1):395. doi: 10.1186/s40359-025-02691-3.
ABSTRACT
BACKGROUND: Population aging is a global trend, and the depression that accompanies it poses a significant threat to the health of middle-aged and older adults. Existing studies rely primarily on statistical methods such as logistic regression applied to small-scale data, while research applying machine learning to large-scale data remains limited. This study therefore uses machine learning methods to explore the risk factors for depression among middle-aged and older adults in China.
METHODS: Using a two-step hybrid model combining long short-term memory (LSTM) and machine learning (ML), we compared 20 depression risk/protective factors in a balanced panel dataset of middle-aged and older Chinese adults (N = 3706; aged 45-94; 64.65% female; 41.20% middle-aged) from the China Health and Retirement Longitudinal Study (CHARLS). Data were collected across five waves (2011, 2013, 2015, 2018, and 2020). The LSTM model predicted the fifth-wave values of the risk factors from data in the preceding four waves. Five ML models were then used to classify depression (yes/no) based on these factors, which included demographic, lifestyle, health, and socioeconomic variables.
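As an illustration of the data layout the first step implies, the following minimal sketch arranges long-format panel records into per-person sequences, with the first four waves as input and the fifth wave as the prediction target. The record format, the function name `build_sequences`, and the toy values are assumptions made for illustration; they are not taken from the study or from the CHARLS release, and the LSTM and classifiers themselves are not shown.

```python
from typing import Dict, List, Tuple

# Survey waves named in the abstract.
WAVES = [2011, 2013, 2015, 2018, 2020]

# Hypothetical long-format record: (person_id, wave, feature_vector).
Record = Tuple[str, int, List[float]]


def build_sequences(
    records: List[Record], waves: List[int] = WAVES
) -> Tuple[Dict[str, List[List[float]]], Dict[str, List[float]]]:
    """Split each person's panel into an input sequence and a target.

    Inputs hold the features from every wave except the last, in wave order.
    Targets hold the features from the last wave. People missing any wave
    are dropped so that the resulting panel is balanced.
    """
    by_person: Dict[str, Dict[int, List[float]]] = {}
    for person_id, wave, features in records:
        by_person.setdefault(person_id, {})[wave] = features

    inputs: Dict[str, List[List[float]]] = {}
    targets: Dict[str, List[float]] = {}
    for person_id, per_wave in by_person.items():
        if all(w in per_wave for w in waves):
            inputs[person_id] = [per_wave[w] for w in waves[:-1]]
            targets[person_id] = per_wave[waves[-1]]
    return inputs, targets


# Toy data (invented): person "a" is observed in all five waves,
# person "b" is missing the 2018 wave and is therefore excluded.
toy_records: List[Record] = [
    ("a", 2011, [0.0, 1.0]),
    ("a", 2013, [0.1, 1.0]),
    ("a", 2015, [0.2, 0.0]),
    ("a", 2018, [0.3, 0.0]),
    ("a", 2020, [0.4, 1.0]),
    ("b", 2011, [0.5, 0.0]),
    ("b", 2013, [0.5, 0.0]),
    ("b", 2015, [0.6, 1.0]),
    ("b", 2020, [0.7, 1.0]),
]

toy_inputs, toy_targets = build_sequences(toy_records)
```

In a pipeline of this shape, the input sequences would feed a sequence model and the predicted fifth-wave features would then be passed to a separate classifier.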
RESULTS: The LSTM model effectively predicted depression-related variables (mean squared error = 0.067). The average AUC of the five ML models ranged from 0.78 to 0.82. The key predictive factors were disability, life satisfaction, activities of daily living (ADL) impairment, chronic diseases, and self-reported memory. For the middle-aged group, the top three factors were disability, life satisfaction, and chronic diseases; for the older group, they were life satisfaction, chronic diseases, and ADL impairment.
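For readers less familiar with the two metrics reported above, the sketch below shows how mean squared error and AUC can be computed from scratch. It uses the pairwise (rank-based) definition of AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function names and toy numbers are illustrative assumptions and do not reproduce the study's data or results.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between true and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs contribute one half.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy values (invented for illustration only).
toy_mse = mean_squared_error([1.0, 0.0], [0.5, 0.5])
toy_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; library implementations typically use a sorted, threshold-based computation instead.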
CONCLUSION: The two-step hybrid model (“LSTM + ML”) effectively predicted depression over 2 years from demographic and health data, aiding early diagnosis and intervention.
PMID:40247342 | DOI:10.1186/s40359-025-02691-3