
CrayStack: a simplified crayfish optimization driven stacking ensemble for prediction of machining quality characteristics under data scarcity

Sci Rep. 2026 Jun 18. doi: 10.1038/s41598-026-55016-8. Online ahead of print.

ABSTRACT

Intelligent manufacturing demands accurate prediction of machining quality characteristics that exhibit conflicting behaviour, which is challenging when experimental data are limited. Taguchi L27 experimental data sets were collected with 6 influencing variables (workpiece material: PA66, PA66 + GF30 or PA66 + MoS₂; tool approach angle; tool nose radius; cutting speed; feed rate; and depth of cut) and 8 machining quality characteristics (surface roughness, cutting force, temperature, amplitude of vibration, tool wear rate, specific cutting energy, material removal rate, and sound pressure level). All 27 experimental data sets were stratified by material into a 3-fold cross-validation protocol. The present work develops five base learners, namely Gaussian Process Regression (GPR), Least Squares Boosting (LSB), Support Vector Regression (SVR), Random Forest (RF) and Extreme Gradient Boosting (XGBoost), to predict the machining quality characteristics. All three metaheuristic algorithms (genetic algorithm (GA), particle swarm optimization (PSO), and crayfish optimization algorithm (COA)) determine identical weights for the three best-performing base learners (GPR, SVR, and LSB) used to build the ensemble model. The COA converges to a minimum composite cost in less computation time than GA and PSO. Therefore, the CrayStack ensemble model is constructed by combining GPR, SVR, and LSB with COA-optimized weights. The COA efficiently optimizes the adaptive fusion weights, assigning a higher weight fraction to GPR and LSB for nonlinear models. CrayStack Ensemble predictions outperform all individual learners (SVR, GPR, LSB, XGBoost, and RF) under material-stratified three-fold cross-validation across all eight outputs of the training data.
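The weighted fusion described above can be illustrated with a minimal sketch. It is not the authors' implementation: a coarse exhaustive grid search stands in for the COA, plain RMSE stands in for the composite cost (whose definition the abstract does not give), and the prediction values and the names `fuse` and `search_weights` are hypothetical.

```python
from itertools import product
from math import sqrt


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fuse(preds, weights):
    """Weighted sum of base-learner predictions, sample by sample."""
    return [sum(w * p for w, p in zip(weights, row)) for row in zip(*preds)]


def search_weights(y_true, preds, step=0.05):
    """Grid search over three non-negative weights that sum to 1.

    Assumption: this exhaustive search replaces the metaheuristic (COA)
    used in the paper; it only illustrates the objective being minimized.
    """
    n = round(1 / step)
    best_w, best_cost = None, float("inf")
    for i, j in product(range(n + 1), repeat=2):
        if i + j > n:
            continue  # third weight would be negative
        w = (i / n, j / n, (n - i - j) / n)
        cost = rmse(y_true, fuse(preds, w))
        if cost < best_cost:
            best_w, best_cost = w, cost
    return best_w, best_cost


# Hypothetical out-of-fold predictions for one response (not data from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
gpr = [1.1, 2.1, 2.9, 4.2]
svr = [0.7, 1.6, 3.5, 3.5]
lsb = [0.9, 1.9, 3.1, 3.8]

weights, cost = search_weights(y_true, [gpr, svr, lsb])
print("weights (GPR, SVR, LSB):", weights, "RMSE:", cost)
```

Because the grid includes the three single-learner corners, the fused cost on these predictions can never exceed that of the best individual learner; whether the gain carries over to unseen data is what the cross-validation protocol is meant to check.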
CrayStack Ensemble requires a total training cost of 16.13 s (base model training: 15.31 s, 94.92%; COA weight optimization: 0.30 s, 1.88%; bootstrap confidence interval estimation: 0.52 s, 3.20%), which supports its practical usefulness. CrayStack Ensemble achieves near-instantaneous inference (0.003 ms/sample; 290,592 samples/s), with speed-ups of 58.33, 79.33, 4899.33, 5630.67 and 5608.3 times over GPR, SVR, LSB, RF and XGBoost, respectively, making it suitable for real-time monitoring systems. The Wilcoxon signed-rank test confirmed that the improvements are statistically significant (at a preset significance level, p < 0.05) for 7 of the 8 responses, validating the practical utility of the developed models. CrayStack Ensemble showed superior prediction performance on nine randomly generated test cases, with a mean absolute percent error of 9.8%, followed by GPR, LSB, XGBoost, SVR, and RF with 12.69%, 13.81%, 15.39%, 21.52% and 42.73%, respectively, considering all responses. The results demonstrate that the intelligent ensemble stack ensures robustness and higher prediction accuracy for limited experimental datasets, offering a practical solution for industrial process optimization.
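For readers unfamiliar with the accuracy metric quoted above, the following sketch shows how a mean absolute percent error can be computed and used to rank models. The measured and predicted values and the model names are hypothetical, and the abstract does not state how the error is aggregated across the eight responses, so this covers a single response only.

```python
def mape(y_true, y_pred):
    """Mean absolute percent error, in percent; y_true must contain no zeros."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical measured values and model predictions (not data from the paper).
measured = [2.0, 4.0, 5.0, 10.0]
predicted = {
    "model_a": [2.2, 3.6, 5.5, 9.0],
    "model_b": [2.4, 3.0, 6.0, 8.0],
}

scores = {name: mape(measured, pred) for name, pred in predicted.items()}
ranking = sorted(scores, key=scores.get)  # lowest error first

for name in ranking:
    print(f"{name}: {scores[name]:.2f}%")
```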

PMID:42315862 | DOI:10.1038/s41598-026-55016-8

By Nevin Manimala
