J Cancer Res Clin Oncol. 2025 Dec 17;152(1):14. doi: 10.1007/s00432-025-06377-6.
ABSTRACT
BACKGROUND: Thyroid cancer (TC) is one of the most prevalent endocrine malignancies, and its recurrence is a major clinical challenge that can adversely affect patient prognosis and treatment outcomes. Despite progress in diagnostic methods, traditional statistical models remain limited in predicting TC recurrence accurately because of the intricate interactions between clinical and pathological factors.
METHODS: To address this challenge, the study presented a novel stacking ensemble learning framework for TC recurrence prediction. The dataset included 383 patients (108 recurrence and 275 non-recurrence cases) and was split, with stratification, into a training set (n = 268) and a testing set (n = 115) at a 70:30 ratio. The proposed stacking framework integrated three heterogeneous base learners, namely Stochastic Gradient Descent (SGD), Extra Trees (ET), and Decision Trees (DT), with eXtreme Gradient Boosting (XGBoost) as the meta learner. Hyperparameters of the learners were optimized through 5-fold cross-validation on the training set. Model performance was evaluated on the testing set using accuracy, precision, recall, F1-score, AUC, and Brier score (BS). To enhance the model’s interpretability, the Shapley Additive Explanations (SHAP) method was used to identify the most influential factors overall and to provide local interpretations of the model’s output for individual patients.
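The pipeline described in METHODS can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the features are synthetic (only the class counts match the abstract), the hyperparameter grid is a hypothetical one, scikit-learn's GradientBoostingClassifier stands in for XGBoost as the meta learner, and the use of 5-fold out-of-fold predictions inside the stack is an assumption.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, brier_score_loss, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cohort: 108 recurrence (1) and 275 non-recurrence (0) cases.
rng = np.random.default_rng(0)
y = np.concatenate([np.ones(108, dtype=int), np.zeros(275, dtype=int)])
X = rng.normal(size=(y.size, 8)) + 1.5 * y[:, None]  # hypothetical features

# Stratified 70:30 split, which yields 268 training and 115 testing samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Tune one base learner with 5-fold CV on the training set only (illustrative grid).
dt_search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4, 6]}, cv=5
)
dt_search.fit(X_train, y_train)

# Three heterogeneous base learners feed a boosted-tree meta learner.
stack = StackingClassifier(
    estimators=[
        ("sgd", make_pipeline(StandardScaler(), SGDClassifier(random_state=0))),
        ("et", ExtraTreesClassifier(n_estimators=200, random_state=0)),
        ("dt", dt_search.best_estimator_),
    ],
    final_estimator=GradientBoostingClassifier(random_state=0),  # stand-in for XGBoost
    cv=5,  # assumption: meta learner is trained on 5-fold out-of-fold predictions
)
stack.fit(X_train, y_train)

# Evaluate on the held-out testing set.
proba = stack.predict_proba(X_test)[:, 1]
pred = stack.predict(X_test)
metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "auc": roc_auc_score(y_test, proba),
    "brier": brier_score_loss(y_test, proba),
}
print(metrics)
```

Because the data here are synthetic, the printed values say nothing about the reported results; the sketch only shows how the split, tuning, stacking, and evaluation steps fit together.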
RESULTS: On the testing set, the proposed stacking model achieved an accuracy of 96.52%, a precision of 96.67%, a recall of 90.62%, an F1-score of 93.55%, and an AUC of 0.9921. The SHAP analysis identified the five most influential factors for TC recurrence: treatment response, age, N-stage, risk stratification, and adenopathy. Furthermore, an interactive and user-friendly prediction tool, TCCheck, was developed based on the optimized stacking model and is accessible online at https://tccheck-prediction-tool.streamlit.app/ .
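The reported classification metrics can be related to one another through a confusion matrix. The abstract does not report the counts, so the values below are an inference: one set of counts for the 115 testing cases that happens to reproduce the reported accuracy, precision, recall, and F1-score to two decimal places. Treat it as a consistency check rather than the authors' actual table.

```python
# Hypothetical confusion-matrix counts (not reported in the abstract),
# chosen so that they sum to the 115 testing cases.
tp, fp, fn, tn = 29, 1, 3, 82

n = tp + fp + fn + tn
accuracy = (tp + tn) / n                 # correct predictions over all cases
precision = tp / (tp + fp)               # predicted recurrences that were real
recall = tp / (tp + fn)                  # real recurrences that were detected
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

for name, value in [
    ("accuracy", accuracy),
    ("precision", precision),
    ("recall", recall),
    ("f1", f1),
]:
    print(f"{name}: {100 * value:.3f}%")
```

Under these assumed counts, recall is 29/32, so three recurrences in the testing set would have been missed, while precision of 29/30 corresponds to a single false alarm.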
CONCLUSION: The study presented an effective and interpretable stacking ensemble learning framework for predicting TC recurrence. Deployed as a web prediction tool, the framework enables explainable and individualized clinical decision support, enhancing its translational value in real-world settings. Furthermore, the framework serves as a methodological reference for recurrence prediction in other cancer types.
PMID:41408410 | DOI:10.1007/s00432-025-06377-6