J Med Internet Res. 2026 May 27;28:e87049. doi: 10.2196/87049.
ABSTRACT
BACKGROUND: Health care workers (HCWs) face sustained psychological demands that place them at heightened risk for burnout and posttraumatic stress disorder (PTSD). However, assessing psychological distress in this population remains challenging because of stigma, underreporting, and the limitations of self-report tools. Although nonverbal behaviors such as facial expressions and gaze hold diagnostic promise, most approaches overlook the fine-grained, temporal fluctuations in these signals. In this study, we focused on microbehavior intervals (brief, involuntary changes in multimodal nonverbal signals) that emerge during emotion-eliciting interviews.
OBJECTIVE: This study aimed to determine whether microbehavior intervals improve the discrimination of psychological distress profiles among HCWs with symptoms of burnout and PTSD.
METHODS: HCWs participated in a semistructured interview that included 5 work-related, emotionally charged questions and was recorded via Webex (an online video platform). Participants also completed validated questionnaires for burnout (Maslach Burnout Inventory General Survey 9-item) and PTSD (PTSD Checklist for the Diagnostic and Statistical Manual of Mental Disorders, 5th edition). Recordings were analyzed with computer vision models to generate time-series data of facial expressions, head movement, gaze, body posture, and hand gestures. An unsupervised anomaly detection model (MOMENT [a Family of Open Time-Series Foundation Models]) isolated microbehavior intervals without requiring manual labels. Features derived from these intervals were used to train a deep learning classifier that predicted 4 symptom classes of psychological distress: “moderate-severe burnout,” “subthreshold-provisional PTSD,” “burnout+PTSD,” and “resilient.” We conducted an ablation study by systematically removing one behavioral data stream at a time. Finally, we conducted an explainability analysis to characterize the features driving model predictions.
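As an illustration only, the leave-one-stream-out ablation described above might be organized as in the following minimal Python sketch. The stream identifiers, the `leave_one_stream_out` helper, and the `toy_evaluate` scorer are hypothetical stand-ins; the abstract does not describe the study's actual training or evaluation code, and the toy weights are not study results.

```python
from typing import Callable, Dict, List, Sequence

# Stream names paraphrase the modalities listed in the abstract; the
# identifiers themselves are hypothetical.
STREAMS: List[str] = [
    "facial_expression",
    "head_movement",
    "gaze",
    "body_posture",
    "hand_gesture",
]


def leave_one_stream_out(
    streams: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Dict[str, float]:
    """Return the score drop caused by removing each stream in turn.

    `evaluate` is assumed to train and score a classifier using only the
    streams it is given, returning a single metric (for example, macro-F1).
    """
    baseline = evaluate(list(streams))
    drops: Dict[str, float] = {}
    for removed in streams:
        kept = [s for s in streams if s != removed]
        drops[removed] = baseline - evaluate(kept)
    return drops


def toy_evaluate(kept: List[str]) -> int:
    # Made-up integer weights standing in for a real train-and-score step.
    # These are NOT results from the study.
    weights = {"gaze": 3, "facial_expression": 2}
    return sum(weights.get(s, 1) for s in kept)


drops = leave_one_stream_out(STREAMS, toy_evaluate)
most_important = max(drops, key=drops.get)
```

A larger drop for a given stream suggests that the classifier relies more heavily on it, which is the kind of comparison an ablation of this form supports.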
RESULTS: We analyzed 258 interview recordings from 151 HCWs. Per interview, an average of 19.65 (SD 6.01) microbehavior intervals were detected, each lasting an average of 1.31 (SD 1.10) seconds. The classifier demonstrated robust performance across classes, achieving a macro-F1-score of 0.75 and a macro area under the receiver operating characteristic curve of 0.80 on held-out data. Ablation analysis showed that excluding gaze or arousal-valence signals caused the largest performance declines, particularly in recall and F1-score. The explainability analysis revealed distinct temporal patterns across symptom classes, with irregularity and variability in microbehaviors emerging as key predictors.
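For readers unfamiliar with the metric, a macro-F1-score is the unweighted mean of per-class F1-scores, so each symptom class counts equally regardless of its size. The following minimal Python sketch shows the standard computation; the `macro_f1` helper and the toy labels are illustrative assumptions and do not reproduce the study's data or code.

```python
from typing import Hashable, Sequence


def macro_f1(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    labels: Sequence[Hashable],
) -> float:
    """Unweighted mean of per-class F1-scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never
        # appears in either the true or the predicted labels.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only (not study data).
toy_true = ["resilient", "resilient", "burnout+PTSD", "burnout+PTSD"]
toy_pred = ["resilient", "burnout+PTSD", "burnout+PTSD", "burnout+PTSD"]
toy_score = macro_f1(toy_true, toy_pred, ["resilient", "burnout+PTSD"])
```

Because the mean is unweighted, poor performance on a small class lowers the macro score as much as poor performance on a large one.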
CONCLUSIONS: Focusing on microbehavior intervals yields a scalable, interpretable, and annotation-free framework for detecting psychological distress from nonverbal signals. By moving from whole-video features to fine-grained multimodal temporal modeling, we successfully captured subtle, involuntary fluctuations in nonverbal responses to emotion-eliciting questions. This multimodal approach enables an objective, robust, and explainable assessment of psychological distress and offers a promising complement to conventional psychometric assessments.
PMID:42202278 | DOI:10.2196/87049