
AI-assisted statistical review of 100 oncology research articles: compliance with SAMPL guidelines

Curr Res Transl Med. 2025 Sep 19;73(4):103544. doi: 10.1016/j.retram.2025.103544. Online ahead of print.

ABSTRACT

BACKGROUND: Ensuring accurate statistical reporting is critical in oncology research, where data-driven conclusions impact clinical decision-making. Despite standardized guidelines such as the Statistical Analyses and Methods in the Published Literature (SAMPL), adherence remains inconsistent. This study evaluates the performance of Gemini Advanced 2.0 Flash, an AI model, in assessing compliance with SAMPL guidelines in oncology research articles.

METHODS: A total of 100 original research articles published in four peer-reviewed oncology journals (October 2024-February 2025) were analyzed. Gemini Advanced 2.0 Flash assessed adherence to ten key SAMPL guidelines, categorizing each as “not met,” “partially met,” or “fully met.” AI evaluations were compared with independent assessments by a statistical editor, with agreement quantified using Cohen’s Kappa coefficient.
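The agreement statistic described above can be sketched in a few lines. The abstract does not say which weighting scheme or software was used, so this is only an illustration that assumes linear weights over the three ordered ratings; the function name `weighted_kappa` and the sample ratings are hypothetical, not taken from the study.

```python
# Ordered rating scale taken from the abstract.
CATEGORIES = ["not met", "partially met", "fully met"]


def weighted_kappa(rater_a, rater_b, categories=CATEGORIES):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale.

    Assumption: linear disagreement weights |i - j| / (k - 1); the study's
    actual weighting scheme is not stated in the abstract.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")

    k = len(categories)
    index = {label: i for i, label in enumerate(categories)}
    n = len(rater_a)

    # Joint distribution of the two raters' ratings (proportions).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal distribution of each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed disagreement and weighted chance disagreement.
    observed_dis = 0.0
    expected_dis = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_dis += weight * observed[i][j]
            expected_dis += weight * row[i] * col[j]

    if expected_dis == 0:
        # Both raters used a single identical category: kappa is undefined.
        return float("nan")
    return 1.0 - observed_dis / expected_dis


# Hypothetical ratings for four articles on one guideline.
ai_ratings = ["not met", "not met", "partially met", "fully met"]
editor_ratings = ["not met", "partially met", "partially met", "fully met"]
kappa = weighted_kappa(ai_ratings, editor_ratings)
```

With linear weights, a "not met" versus "partially met" disagreement counts half as much as "not met" versus "fully met", which is why a weighted coefficient suits a three-level ordinal scale like this one.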

RESULTS: The overall weighted Kappa coefficient was 0.77 (95% CI: 0.60-0.94), indicating substantial agreement between AI and manual assessment. Full agreement (Kappa = 1) was found for four guidelines, including naming statistical packages and reporting confidence intervals. High agreement was observed for specifying statistical methods (Kappa = 0.85) and confirming test assumptions (Kappa = 0.75). Moderate agreement was noted for summarizing non-normally distributed data (Kappa = 0.42) and specifying test directionality (Kappa = 0.43). The lowest agreement (Kappa = 0.37) was observed for multiple-comparison adjustments, due to missing justifications for post hoc tests.
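Verbal labels such as "substantial" and "moderate" are commonly assigned using the Landis and Koch bands. The abstract does not state which interpretation scale the authors applied, so the mapping below is an assumption, and the helper name `agreement_band` is hypothetical.

```python
def agreement_band(kappa):
    """Map a kappa value to a Landis and Koch style label.

    Assumption: the conventional cut-offs at 0.20, 0.40, 0.60, and 0.80;
    the abstract does not say which scale the authors used.
    """
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Kappa values reported in the abstract.
reported = {
    "overall (weighted)": 0.77,
    "non-normally distributed data": 0.42,
    "test directionality": 0.43,
    "multiple comparison adjustments": 0.37,
}
labels = {name: agreement_band(value) for name, value in reported.items()}
```

Under these cut-offs the overall value of 0.77 falls in the "substantial" band and the two values near 0.42 fall in "moderate", matching the wording of the abstract.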

CONCLUSION: AI-assisted evaluation showed substantial agreement with expert assessment, demonstrating its potential in statistical review. However, discrepancies in specific guidelines suggest human oversight remains essential for ensuring statistical rigor in oncology research. Further refinement of AI models may enhance their reliability in scientific publishing.

PMID:41004884 | DOI:10.1016/j.retram.2025.103544

By Nevin Manimala
