Artificial intelligence for oral cancer diagnosis: a systematic review and meta-analysis of image-based and non-imaging models

BMC Cancer. 2026 May 25. doi: 10.1186/s12885-026-16154-4. Online ahead of print.

ABSTRACT

BACKGROUND: Artificial intelligence (AI) is increasingly recognized as a valuable tool for the early detection and prognosis of oral cancer, a disease in which late diagnosis contributes to high mortality. AI-based diagnostic models have the potential to improve accuracy in differentiating between malignant, premalignant and benign oral lesions. This systematic review and meta-analysis evaluated the diagnostic performance of non-imaging and image-based AI models and narratively synthesized evidence on prognostic and risk stratification applications in oral cancer.

METHODS: This study followed PRISMA guidelines to ensure quality and reproducibility. A systematic search of PubMed, Embase, Web of Science, Google Scholar and Scopus identified studies published from 2010 to 2024 on artificial intelligence applications in oral cancer diagnosis. Sixteen studies met the predefined inclusion criteria, which included comparison of AI-based screening with histology. Data extraction and risk-of-bias assessment, the latter using QUADAS-2, were conducted independently. The findings highlight AI’s potential in early detection and prognosis, emphasizing the need for further validation and clinical integration to enhance diagnostic accuracy.

RESULTS: Of 801 studies initially identified, 53 underwent further review and 16 were ultimately included. Sample sizes ranged from 70 to 44,000, allowing a broad evaluation of AI’s diagnostic performance. Artificial intelligence models showed a wide range of sensitivity (42%-100%), specificity (63%-100%), and accuracy (63%-100%). Meta-analysis revealed a pooled sensitivity of 0.90 (95% CI: 0.81-0.98), specificity of 0.89 (95% CI: 0.84-0.95), and accuracy of 0.89 (95% CI: 0.83-0.95), with substantial heterogeneity (I² = 100%). Image-based models had higher pooled sensitivity (0.94 vs. 0.76, P = 0.320), specificity (0.93 vs. 0.79, P = 0.025), and accuracy (0.93 vs. 0.81, P = 0.042) than non-imaging models.
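
For readers unfamiliar with how a pooled estimate, its 95% CI, and I² relate to one another, the following is a minimal sketch of random-effects pooling using the DerSimonian-Laird estimator. It is an illustration only: the abstract does not state which pooling model or software the authors used, the function name pool_random_effects is hypothetical, and the per-study counts are invented rather than taken from the review. It also pools raw proportions for simplicity, whereas published diagnostic meta-analyses commonly apply a logit or similar transform first.

```python
import math


def pool_random_effects(estimates, variances):
    """DerSimonian-Laird random-effects pooling (illustrative sketch).

    Returns (pooled estimate, (ci_low, ci_high), I-squared in percent).
    """
    k = len(estimates)
    # Inverse-variance (fixed-effect) weights and pooled value.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw

    # Cochran's Q measures dispersion of studies around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1

    # Between-study variance tau^2, truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Hypothetical (true positives, false negatives) per study; NOT from the review.
counts = [(45, 5), (80, 20), (30, 30), (190, 10)]
sens = [tp / (tp + fn) for tp, fn in counts]
# Binomial variance of each sensitivity estimate, p(1 - p) / n.
var = [p * (1 - p) / (tp + fn) for p, (tp, fn) in zip(sens, counts)]

pooled, ci, i2 = pool_random_effects(sens, var)
print(f"pooled sensitivity = {pooled:.2f}, "
      f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), I^2 = {i2:.0f}%")
```

In this sketch a large I² arises when per-study estimates differ by far more than their sampling error would explain, which is the situation the reported I² = 100% describes.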

CONCLUSIONS: Artificial intelligence models show promising diagnostic performance for oral cancer based on retrospective clinical data. Although image-based models, particularly convolutional neural networks, demonstrated higher pooled sensitivity and specificity than non-imaging models, these differences were not statistically significant. Results should be interpreted with caution due to substantial heterogeneity. Advances reported in the literature, such as multimodal approaches and data augmentation, may improve non-imaging model performance and help narrow the gap between methodologies. These developments highlight AI’s potential in enhancing early detection and prognosis of oral cancer.

PMID:42185818 | DOI:10.1186/s12885-026-16154-4

By Nevin Manimala