Eur Radiol. 2026 Feb 21. doi: 10.1007/s00330-026-12385-y. Online ahead of print.
ABSTRACT
OBJECTIVE: To evaluate the performance of commercial artificial intelligence (AI) software in detecting intracranial hemorrhage (ICH) in emergency settings, compared with that of on-call radiology residents.
MATERIALS AND METHODS: All consecutive unenhanced cerebral CT scans performed in the emergency department of a single center over a 3-month period in patients with suspected ICH were included. The scans were initially interpreted by on-call radiology residents, subsequently verified and approved by a board-certified radiologist, and concomitantly analyzed by the AI software for the presence of ICH. Results from the AI software were stored in a separate PACS partition and were unavailable to the radiologists during case reading. We assessed the diagnostic performance of the AI software and of the radiology residents in detecting ICH. The reference standard was the final report of the board-certified radiologist.
RESULTS: Radiology reports of 2153 CT scans were analyzed; ICH prevalence was 15.4% (331/2153). The AI software achieved an overall sensitivity of 84% and a specificity of 94.4%, whereas radiology residents achieved a sensitivity of 96.4% and a specificity of 99.6% (p < 0.001 for both comparisons). When CT examinations showed a combination of multiple hemorrhage types, sensitivity was 97.7% for AI and 98.5% for residents (p = 1). In the presence of multiple ICH sites, sensitivity was 95.2% for AI and 98.4% for residents (p = 0.11).
CONCLUSION: Radiology residents demonstrated significantly higher performance in detecting ICH than the AI software. The AI software exhibited very good diagnostic performance only in the presence of multiple hemorrhagic sites or multiple hemorrhage types.
KEY POINTS: Question: How does the performance of the AI software compare to that of radiology residents in detecting ICH on unenhanced CT in real-life emergency workflow conditions? Findings: In the emergency setting, the AI software demonstrated lower overall sensitivity and specificity than radiology residents for detecting ICH. Clinical relevance: In real-life emergency conditions at a university hospital, the AI software did not offer a superior performance compared to radiology residents in detecting ICH. The integration of AI in this specific setting remains to be defined.
PMID:41721849 | DOI:10.1007/s00330-026-12385-y
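
The figures reported under RESULTS can be turned into approximate confusion-matrix counts, which makes the gap between the two readers easier to see. The sketch below is a back-of-the-envelope reconstruction, not data from the study: it assumes the reported percentages apply to all 2153 scans, rounds to whole scans, and uses hypothetical names (`Counts`, `reconstruct_counts`). The derived predictive values are likewise not reported in the abstract and should be read as estimates only.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Approximate confusion-matrix counts (reconstructed, not reported)."""
    tp: int
    fn: int
    tn: int
    fp: int

    @property
    def ppv(self) -> float:
        # Positive predictive value: share of positive calls that are true ICH.
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        # Negative predictive value: share of negative calls that are truly ICH-free.
        return self.tn / (self.tn + self.fn)


def reconstruct_counts(n_positive: int, n_negative: int,
                       sensitivity: float, specificity: float) -> Counts:
    """Estimate counts from rounded sensitivity/specificity (assumption: same denominators)."""
    tp = round(n_positive * sensitivity)
    tn = round(n_negative * specificity)
    return Counts(tp=tp, fn=n_positive - tp, tn=tn, fp=n_negative - tn)


# Values taken from the RESULTS paragraph.
TOTAL_SCANS = 2153
ICH_POSITIVE = 331
ICH_NEGATIVE = TOTAL_SCANS - ICH_POSITIVE

ai = reconstruct_counts(ICH_POSITIVE, ICH_NEGATIVE, 0.84, 0.944)
residents = reconstruct_counts(ICH_POSITIVE, ICH_NEGATIVE, 0.964, 0.996)

for name, c in (("AI software", ai), ("Residents", residents)):
    print(f"{name}: TP={c.tp} FN={c.fn} TN={c.tn} FP={c.fp} "
          f"PPV={c.ppv:.3f} NPV={c.npv:.3f}")
```

Under these assumptions, the AI software would have missed roughly 53 hemorrhages and raised about 102 false alarms, against roughly 12 and 7 for the residents. Because the published percentages are rounded, each count may be off by a scan or two.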