Curr Probl Diagn Radiol. 2026 Mar 11:S0363-0188(26)00042-3. doi: 10.1067/j.cpradiol.2026.03.001. Online ahead of print.
ABSTRACT
BACKGROUND: Artificial intelligence (AI) is increasingly integrated into radiology across multiple workflow levels, with its role as a simultaneous second reader holding particular promise.
METHODS: We performed an umbrella review of systematic reviews and meta-analyses reporting pooled diagnostic accuracy of AI models using clinician (including radiologist) interpretation as the reference standard. References were identified through queries of PubMed, Scopus, Embase, and Google Scholar (last updated January 7th, 2025). Data were analyzed using the metaumbrella package within R statistical software, with stratification of evidence by Ioannidis criteria. Study quality was assessed using the AMSTAR-2 tool.
RESULTS: From 1,719 unique references, ten meta-analyses met inclusion criteria, encompassing 147 primary studies with over 722,000 case images and 3.6 million control images. Diagnostic odds ratios ranged from 30.67 (95% CI, 10.06-102.87) for fracture detection on X-ray to 273.60 (95% CI, 130.51-573.58) for pulmonary nodule detection on CT. Most meta-analyses (n = 9) provided Class II evidence, reflecting highly suggestive findings limited by substantial heterogeneity in every analysis (I² = 89.9%-99.9%). Methodological quality was rated critically low in nine reviews and low in one.
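The diagnostic odds ratio reported above summarizes a 2x2 accuracy table in a single figure: the odds of a positive AI call in cases divided by the odds of a positive call in controls, with a confidence interval computed on the log scale. As a minimal sketch of that standard calculation (the 2x2 counts here are hypothetical illustrations, not data from any included study):

```python
import math

def diagnostic_odds_ratio(tp, fp, fn, tn, z=1.96):
    """Diagnostic odds ratio with a Wald-type CI on the log scale.

    DOR = (TP/FN) / (FP/TN); the standard error of log(DOR) is
    sqrt(1/TP + 1/FP + 1/FN + 1/TN).
    """
    dor = (tp / fn) / (fp / tn)
    se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lo = math.exp(math.log(dor) - z * se)
    hi = math.exp(math.log(dor) + z * se)
    return dor, lo, hi

# Hypothetical counts: 90/100 cases and 90/100 controls classified correctly.
dor, lo, hi = diagnostic_odds_ratio(tp=90, fp=10, fn=10, tn=90)
```

With these illustrative counts the point estimate is 81, showing how even moderate sensitivity and specificity (90% each) yield a large DOR, which is worth keeping in mind when reading pooled values in the hundreds.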
DISCUSSION: AI models have shown strong diagnostic performance across various radiologic applications. Because our inclusion criteria required clinician/radiologist interpretation as the reference standard, these findings reflect AI-human agreement rather than AI accuracy against a more definitive ground truth (e.g., histopathology). Furthermore, the strength of this evidence is limited by substantial heterogeneity, variability in imaging modalities, and differences in model development and validation.
PMID:41850944 | DOI:10.1067/j.cpradiol.2026.03.001