Otolaryngol Head Neck Surg. 2026 Jun 17. doi: 10.1002/ohn.70313. Online ahead of print.
ABSTRACT
OBJECTIVE: To determine whether contemporary large language models can match clinician performance in evaluating the urgency of emergency otolaryngology referrals.
STUDY DESIGN: Blinded cross-sectional diagnostic reasoning study.
SETTING: Simulated emergency referral environment modeled on tertiary care otolaryngology practice.
METHODS: Thirty emergency referral scenarios spanning the spectrum of otolaryngologic urgency were independently evaluated by 4 large language models (GPT-5, GPT-4, DeepSeek, and Grok) and 4 clinicians (otolaryngology attending and resident, emergency attending and resident). Outputs were anonymized and scored by 10 blinded otolaryngologists for appropriateness of urgency and quality of explanation using a three-point scale. Statistical analyses included nonparametric group comparisons, adjusted ordinary least squares modeling with case-level control, and correlation of each entity’s case profile with that of the otolaryngology attending.
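The profile-correlation step named in METHODS can be illustrated with a minimal sketch. The abstract does not state how a case profile is built or which correlation coefficient was used, so this sketch assumes a profile is the vector of per-case mean ratings across raters, that the three-point scale is coded 1 to 3, and that a Spearman rank correlation is applied. All function names and the toy ratings are hypothetical and are not study data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def case_profile(ratings_by_case):
    """Collapse each case's rater scores into one mean score per case."""
    return [mean(case) for case in ratings_by_case]


# Toy example (NOT study data): 4 cases, 3 raters each, scale assumed 1-3.
attending = case_profile([[3, 3, 3], [2, 3, 2], [1, 2, 1], [3, 2, 3]])
model = case_profile([[3, 3, 2], [2, 2, 2], [1, 1, 2], [3, 3, 3]])
rho = spearman(model, attending)
```

A rank-based coefficient is chosen here only because the ratings are ordinal; if the study used a different coefficient, only the `spearman` call would change.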
RESULTS: Inter-rater reliability was excellent. The otolaryngology attending achieved the highest overall performance. GPT-5 demonstrated comparable mean performance, with no statistically significant difference in either domain. GPT-4 scored modestly lower but received higher mean ratings than both emergency clinicians. DeepSeek and the otolaryngology resident demonstrated intermediate performance, while Grok and the emergency clinicians performed lowest. Group-level analyses showed no significant difference between the large language model and otolaryngology cohorts; both were rated higher than emergency clinicians in this sample.
CONCLUSION: GPT-5 demonstrated triage performance comparable to the otolaryngology attending in this controlled sample. Large language models may support emergency decision-making and education when specialist consultation is limited, but require supervision, transparency, and local calibration.
PMID:42307998 | DOI:10.1002/ohn.70313