Machine learning-based prediction model for omental metastasis in right-sided colon cancer patients: a retrospective multicenter study

Int J Colorectal Dis. 2025 Nov 17;40(1):233. doi: 10.1007/s00384-025-05031-4.

ABSTRACT

PURPOSE: Current diagnostic modalities lack sufficient sensitivity for detecting omental metastasis (OM) and often underestimate metastatic burden. Unlike traditional statistical models, machine learning (ML) models are designed to detect subtle variable interactions and nonlinear patterns that traditional statistics overlook, which may enhance the reliability of OM risk evaluation in clinical practice. The aim of this study was to build an ML model for preoperatively predicting OM in right-sided colon cancer (RCC) patients using a multicenter dataset.

METHODS: This retrospective multicenter study included 1798 RCC patients: 1206 from Zhejiang Cancer Hospital (training set n = 804, test set n = 402) and 592 from the Second Affiliated Hospital of Harbin Medical University (validation set). OM status, tumor location, preoperative CEA level, preoperative CA19-9 level, tumor grade, histology, tumor size and patient age were recorded. Six ML models, namely extreme gradient boosting (XGB), artificial neural network (ANN), logistic regression (LR), random forest (RF), support vector machine (SVM) and decision tree (DT), were developed to predict OM in RCC. The area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, specificity, precision, F1 score and decision curve analysis (DCA) were used to assess predictive performance.
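
The evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0.5 classification threshold and the ten-patient toy data are assumptions made purely for illustration, and the net-benefit function uses the standard decision curve analysis definition rather than anything stated in the abstract.

```python
# Illustrative sketch only; toy data and names are hypothetical, not from the study.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 1 = OM, 0 = no OM."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, precision and F1 from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def net_benefit(y_true, scores, threshold):
    """Net benefit at one risk threshold, the quantity plotted in a DCA curve."""
    n = len(y_true)
    tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
    fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
    return tp / n - (fp / n) * (threshold / (1 - threshold))

# Hypothetical toy cohort: 10 patients, 3 with OM, with made-up predicted risks.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7, 0.3, 0.35, 0.05, 0.15]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

print(classification_metrics(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
print("Net benefit at 0.5:", net_benefit(y_true, scores, 0.5))
```

With an OM prevalence near 10%, accuracy alone can look high for a model that misses most positive cases, which is one reason sensitivity, precision and F1 are reported alongside it.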

RESULTS: The OM rates in the training set, test set and validation set were 10.4%, 9.5% and 10.0%, respectively. The XGB model outperformed the five other algorithms (ANN, RF, LR, SVM, and DT) in the training set (AUC = 0.924, 0.096 gain vs LR), internal test set (AUC = 0.868, 0.038 gain vs LR) and validation set (AUC = 0.766, 0.065 gain vs LR). Comparison of accuracy, sensitivity, specificity, precision and F1 score showed that the XGB model performed best. The DCA curve also suggested that XGB had better clinical decision-making capability than the other five models. Feature importance analysis highlighted preoperative CEA level and tumor location as key predictors.
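
The abstract does not report the LR AUCs, but they can be derived from the figures above. The short sketch below does that arithmetic, assuming each reported "gain vs LR" is an absolute AUC difference (XGB minus LR); the variable names are illustrative and the derived values are not reported results.

```python
# Reported XGB AUCs and gains over LR, copied from the RESULTS paragraph.
xgb_auc = {"training": 0.924, "internal test": 0.868, "validation": 0.766}
gain_vs_lr = {"training": 0.096, "internal test": 0.038, "validation": 0.065}

# Assumption: gain = AUC(XGB) - AUC(LR), so AUC(LR) = AUC(XGB) - gain.
implied_lr_auc = {
    name: round(xgb_auc[name] - gain_vs_lr[name], 3) for name in xgb_auc
}

# Drop in XGB discrimination from the training set to the validation set.
train_to_validation_drop = round(xgb_auc["training"] - xgb_auc["validation"], 3)

for name, auc in implied_lr_auc.items():
    print(f"{name}: implied LR AUC = {auc:.3f}")
print(f"XGB AUC drop, training to validation: {train_to_validation_drop:.3f}")
```

Under that assumption, the implied LR AUCs are 0.828, 0.830 and 0.701, and the XGB AUC falls by 0.158 between the training and validation sets.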

CONCLUSION: Our study developed and validated an XGB-based machine learning model that could accurately predict OM in RCC patients using routine preoperative variables. This model demonstrates strong discriminative ability and clinical utility, assisting personalized risk stratification and appropriate treatment decisions.

PMID:41242993 | DOI:10.1007/s00384-025-05031-4

By Nevin Manimala