Machine learning-based estimation of discharge coefficient for semicircular labyrinth weirs

Sci Rep. 2025 Sep 26;15(1):33002. doi: 10.1038/s41598-025-18230-4.

ABSTRACT

Accurately predicting the discharge coefficient (C_d) of weir structures is crucial for improving hydraulic designs and ensuring their safe operation. This study develops and tests advanced Machine Learning (ML) models to estimate C_d in Semicircular Labyrinth Weirs (SCLWs). The models explored include a Tabular Neural Network (TabNet) optimized with the Moth Flame Optimization algorithm (TabNet-MFO), an Extreme Learning Machine (ELM) enhanced with the Jaya and Firefly Algorithms (ELM-JFO), a Decision Tree (DT), and a Light Gradient Boosting Machine (LightGBM). One of the key innovations in this study is the introduction of the TabNet-MFO framework. Sensitivity analysis using the Explainable Boosting Machine (EBM) and SHapley Additive exPlanations (SHAP) showed that the ratio of upstream flow depth to weir height (h/P) is the most significant factor affecting C_d predictions. Other important factors include the number of weir cycles (N) and the ratio of crest length to weir height (l_C/P). The dataset was split into 75% for training and 25% for validation. Model performance was assessed using several statistical indicators: the coefficient of determination (R²), the Root Mean Square Error (RMSE), the symmetric Mean Absolute Percentage Error (sMAPE), the Scatter Index (SI), and the Weighted Mean Absolute Percentage Error (WMAPE), along with Taylor diagrams and the Performance Index (PI) for comparison. In the training phase, the ELM-JFO model delivered the best results in predicting C_d, with a PI of 166 and a normalized centered RMSE (E’) of 0.0052. The TabNet-MFO model also performed well, with a PI of 142 and an E’ of 0.0068. The LightGBM and DT models produced good results as well, with PIs of 89.45 and 89.36, respectively. In the testing phase, the TabNet-MFO model was the top performer (PI = 81.92, E’ = 0.0118), followed by ELM-JFO (PI = 69.71, E’ = 0.0139). LightGBM and DT showed lower accuracy, with PIs of 60.62 and 47.55 and E’ values of 0.0159 and 0.0199, respectively. The novelty of this research lies in combining interpretable and hybrid ML techniques for C_d estimation, offering a reliable alternative to traditional empirical and regression-based methods. These results show the potential of ML in improving flow prediction accuracy and supporting better hydraulic structure design.
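
The abstract names its error metrics but does not give their formulas. The sketch below shows how R², RMSE, sMAPE, SI, and WMAPE are commonly computed for observed versus predicted C_d values. The formulas are widely used textbook definitions and are an assumption here, since the paper may define them differently (for example, R² as a squared correlation). The function name `regression_metrics` and the sample values are hypothetical and are not data from the study. PI and E’ are left out because the abstract does not define them.

```python
import math


def regression_metrics(observed, predicted):
    """Return common regression error metrics as a dict.

    Assumed definitions (not taken from the paper):
      R2    = 1 - SS_res / SS_tot
      RMSE  = sqrt(mean((o - p)^2))
      sMAPE = 100/n * sum(|o - p| / ((|o| + |p|) / 2))
      SI    = RMSE / mean(o)
      WMAPE = 100 * sum(|o - p|) / sum(|o|)
    Observed values must not all be equal and must have a non-zero mean.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [o - p for o, p in zip(observed, predicted)]

    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    rmse = math.sqrt(ss_res / n)

    smape = 100.0 / n * sum(
        abs(e) / ((abs(o) + abs(p)) / 2.0)
        for e, o, p in zip(errors, observed, predicted)
    )
    wmape = 100.0 * sum(abs(e) for e in errors) / sum(abs(o) for o in observed)

    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        "sMAPE": smape,
        "SI": rmse / mean_obs,
        "WMAPE": wmape,
    }


# Hypothetical C_d values, for illustration only (not from the study).
observed_cd = [0.50, 0.60, 0.70, 0.80]
predicted_cd = [0.52, 0.58, 0.71, 0.79]

for name, value in regression_metrics(observed_cd, predicted_cd).items():
    print(f"{name}: {value:.4f}")
```

Under these assumed definitions, lower RMSE, sMAPE, SI, and WMAPE and an R² closer to 1 indicate a better fit. SI and WMAPE are scale-normalized, which makes them easier to compare across datasets than RMSE.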

PMID:41006552 | DOI:10.1038/s41598-025-18230-4

By Nevin Manimala