BMC Med Imaging. 2025 Dec 6. doi: 10.1186/s12880-025-02094-1. Online ahead of print.
ABSTRACT
BACKGROUND: This study explores the feasibility and effectiveness of an interpretable machine learning model for assessing the pathological grading of pancreatic ductal adenocarcinoma (PDAC) using radiomics and topological features derived from contrast-enhanced CT habitat subregions.
METHODS: This retrospective study included 306 patients with PDAC from two hospitals: a training cohort (n = 176), a validation cohort (n = 76), and a test cohort (n = 54). K-means clustering was first used to segment portal venous phase CT images into three habitat regions. Radiomics features were then extracted from the whole-tumour region, and both radiomics and topological features were extracted from each habitat region. LASSO regression was applied for feature dimensionality reduction to construct the radiomics score (Rad-score) for the whole-tumour region and the habitat score (H-score) for each habitat region. Logistic regression was used to identify statistically significant predictors among clinical and semantic features. Five machine learning algorithms were used to construct Habitat-TDA models, and SHAP was used for interpretability analysis.
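The habitat segmentation step can be illustrated with a minimal sketch. It assumes that clustering uses voxel intensity alone and that habitats are numbered by ascending mean intensity; the abstract does not state which inputs the clustering used, and the function name `kmeans_1d` and the intensity values are hypothetical, not taken from the study.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's k-means on scalar intensities.

    Returns (labels, centroids). Labels are 1-based and ordered by
    ascending centroid, so label 1 is the lowest-intensity cluster.
    """
    distinct = sorted(set(values))
    if len(distinct) < k:
        raise ValueError("need at least k distinct intensity values")
    # Deterministic start: spread the k initial centroids across the sorted distinct values.
    centroids = [float(distinct[(2 * i + 1) * len(distinct) // (2 * k)]) for i in range(k)]

    def assign(cents):
        # Index of the nearest centroid for every value.
        return [min(range(k), key=lambda j: abs(v - cents[j])) for v in values]

    for _ in range(max_iter):
        labels = assign(centroids)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the previous centroid if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated

    labels = assign(centroids)
    # Renumber clusters so that habitat 1 has the lowest mean intensity.
    order = sorted(range(k), key=lambda j: centroids[j])
    rank = {old: new + 1 for new, old in enumerate(order)}
    return [rank[lab] for lab in labels], [centroids[j] for j in order]


# Made-up intensities for nine voxels inside a tumour mask (illustration only).
voxels = [110, -20, 45, -15, 100, 40, 120, -10, 50]
habitat_labels, habitat_means = kmeans_1d(voxels, k=3)
```

In practice a library implementation applied to the masked image volume would replace this loop; the sketch only shows how each voxel ends up assigned to one of three intensity-defined subregions.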
RESULTS: Total volume, diabetes, and M staging were identified as independent risk factors for the pathological grading of PDAC and were used to construct the Clinical model. Six radiomics features with non-zero coefficients were selected to calculate the Rad-score, which was then used to construct the WholeRad model. For the three habitat regions, 6, 5, and 6 topological and radiomics features, respectively, were retained to generate the H-scores. The logistic regression algorithm performed best in the validation and test cohorts and was selected as the classifier for the Habitat-TDA model. SHAP analysis showed that H-score1, derived from Habitat Region 1 (the habitat region with the lowest average CT value), had the largest average impact on the model output. The AUC values of the Habitat-TDA model in the training, validation, and test cohorts were 0.894, 0.872, and 0.829, respectively, all higher than those of the Clinical model (0.784, 0.765, 0.731) and the WholeRad model (0.817, 0.810, 0.773).
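For readers unfamiliar with the AUC metric reported above, the following sketch computes it directly from its pairwise definition: the proportion of (positive, negative) case pairs in which the positive case receives the higher predicted score, with ties counted as half. The function name `roc_auc` and the labels and scores are made up for illustration and are unrelated to the study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical grades (1 = positive class, 0 = negative class) and model scores.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the three models are compared.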
CONCLUSIONS: The Habitat-TDA model improves the accuracy and interpretability of preoperative predictions of PDAC grading, providing a promising tool for personalised management.
PMID:41353533 | DOI:10.1186/s12880-025-02094-1