Severity Classification of Anxiety and Depression Using Generalized Anxiety Disorder Scale and Patient Health Questionnaire: National Cross-Sectional Study Applying Classification and Regression Tree Models

JMIR Public Health Surveill. 2025 Sep 30;11:e72591. doi: 10.2196/72591.

ABSTRACT

BACKGROUND: Scalable and accurate screening tools are critical for public mental health strategies, especially in low- and middle-income countries (LMICs). While the Generalized Anxiety Disorder Scale (GAD-7) and Patient Health Questionnaire (PHQ-9) are widely used, their full application in large-scale programs can pose feasibility challenges. By contrast, shorter versions like GAD-2 and PHQ-2 reduce burdens but fail to capture symptom diversity.

OBJECTIVE: This study aimed to optimize screening for anxiety and depression severity using classification and regression tree (CART) models, identifying concise and high-performing decision rules based on the GAD-7 and PHQ-9 items, and to test their reproducibility in 5 independent datasets.

METHODS: A cross-sectional, nonprobabilistic study was conducted with 20,585 Brazilian adults from all 27 states and more than 3,000 cities, recruited through digital outreach. Anxiety and depression symptoms were assessed using the GAD-7 and PHQ-9. CART models were trained and tested on bootstrapped samples (70% training, 30% testing), totaling 45,000 trees per scale. Each model used combinations of scale items and sociodemographic predictors. Robustness was evaluated via 10-fold cross-validation and by repeating the analysis under 3 hyperparameter configurations (minsplit and minbucket set to 500, 1000, or 2000). Performance metrics included accuracy, sensitivity, specificity, precision, F1-score, and area under the curve (AUC).
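As an illustration of the threshold-based metrics listed above, the following minimal Python sketch computes accuracy, sensitivity, specificity, precision, and F1-score for one severity class. It assumes a one-vs-rest computation, which the abstract does not confirm; the function name `one_vs_rest_metrics` and the class labels are hypothetical. AUC is omitted because it requires predicted probabilities rather than hard labels.

```python
from typing import Dict, Sequence


def one_vs_rest_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Dict[str, float]:
    """Score one class against all others (one-vs-rest is an assumption here)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = ratio(tn, tn + fp)  # true negative rate
    precision = ratio(tp, tp + fp)    # positive predictive value
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not study data.
    truth = ["mild", "mild", "severe", "severe", "moderate"]
    preds = ["mild", "severe", "severe", "severe", "mild"]
    print(one_vs_rest_metrics(truth, preds, positive="severe"))
```

Calling the function once per severity class gives a per-class breakdown of the kind reported in the results.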

RESULTS: The CART models produced concise, high-performing decision rules, using only 2 items for the GAD-7 and 3 for the PHQ-9. No sociodemographic variable appeared in the final classification paths. For GAD-7, the models achieved an accuracy of 86.1% for minimal or mild severity and 85.1% for severe cases, with both categories showing AUC values above 0.900. By contrast, the moderate severity class had lower performance, with accuracy around 51% and an AUC of 0.728. For PHQ-9, the models achieved 81.7% accuracy for minimal or mild cases and 78.8% for severe cases, with AUCs again exceeding 0.900 for the extreme classes; the moderate or moderately severe class showed 66.9% accuracy and an AUC of 0.776. The most frequently repeated rules included the following: “GAD2<2 and GAD4<2” for identifying minimal or mild anxiety and “GAD2≥2 and GAD4=3” for severe anxiety; for depression, “PHQ2<2 and PHQ4<2” for minimal or mild cases and “PHQ2≥2 and PHQ8≥2” for severe cases. These rule-based models demonstrated stable performance across thousands of bootstrapped replications and showed reproducibility in 5 independent datasets through external validation.
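The four rules quoted above can be written as a short Python sketch. It assumes that GAD2, GAD4, PHQ2, PHQ4, and PHQ8 denote individual GAD-7 and PHQ-9 items scored 0 to 3, and the function names and the "not_covered" label are hypothetical. Responses matching neither quoted rule are left unclassified, because the abstract reports only the most frequent rules and not the full trees.

```python
from typing import Mapping


def _check(score: int) -> int:
    # GAD-7 and PHQ-9 items are each scored from 0 to 3.
    if score not in (0, 1, 2, 3):
        raise ValueError(f"item score must be 0-3, got {score!r}")
    return score


def classify_anxiety(items: Mapping[str, int]) -> str:
    """Apply the two quoted GAD rules; any other pattern is left unclassified."""
    gad2, gad4 = _check(items["GAD2"]), _check(items["GAD4"])
    if gad2 < 2 and gad4 < 2:
        return "minimal_or_mild"
    if gad2 >= 2 and gad4 == 3:
        return "severe"
    return "not_covered"  # hypothetical label, not a category from the study


def classify_depression(items: Mapping[str, int]) -> str:
    """Apply the two quoted PHQ rules; any other pattern is left unclassified."""
    phq2 = _check(items["PHQ2"])
    if phq2 < 2 and _check(items["PHQ4"]) < 2:
        return "minimal_or_mild"
    if phq2 >= 2 and _check(items["PHQ8"]) >= 2:
        return "severe"
    return "not_covered"  # hypothetical label, not a category from the study


if __name__ == "__main__":
    print(classify_anxiety({"GAD2": 1, "GAD4": 0}))
    print(classify_depression({"PHQ2": 3, "PHQ4": 1, "PHQ8": 2}))
```

Because each function reads only the items its rules mention, a respondent could in principle be stratified after two or three questions rather than the full scale, which is the reduction in item burden the results describe.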

CONCLUSIONS: CART models enabled simplified, symptom-specific pathways for stratifying anxiety and depression severity with high precision and minimal item burden. These rule-based shortcuts offer an efficient alternative to fixed short forms (eg, GAD-2, PHQ-2) by preserving symptom diversity and severity discrimination. The findings support and lay the groundwork for adaptive, cost-effective screening and intervention models, especially in resource-limited settings and LMICs.

PMID:41027019 | DOI:10.2196/72591

By Nevin Manimala