
Development of soil surface wetness models using machine learning techniques in the selected sites in Punjab, North-Western India

Sci Rep. 2026 May 26. doi: 10.1038/s41598-026-50687-9. Online ahead of print.

ABSTRACT

Accurate prediction of soil surface wetness (SSW) is vital for effective land management and resource optimization, particularly in sensitive ecosystems like the Western Himalayas. The main objective of the present study is to improve the accuracy of SSW prediction using hybrid and bagging models at selected sites in Punjab, north-western India. The study uses ten machine-learning models: five base learners, namely random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), logistic model tree (LMT), and classification and regression tree (CART), and their corresponding AdaBoost-based hybrid variants, namely AdaBoost-RF, AdaBoost-XGBoost, AdaBoost-LightGBM, AdaBoost-LMT, and AdaBoost-CART. The SSW data set was obtained from the NASA POWER platform and Goddard Earth Observing System (GEOS)-derived data, covering the period from 1986 to 2021. For model development, we applied a monthly data-lagging technique to generate different model scenarios. For feature selection, we applied greedy stepwise and best-first algorithms to identify the most effective predictors and improve model efficiency. The models were evaluated using a variety of statistical indices. Among the base learners, the highest correlation coefficients (CC) during the testing period ranged from 0.64 to 0.76 in Moga (RF), 0.61 to 0.82 in Hoshiarpur (XGBoost), and 0.41 to 0.64 in Firozpur (LMT). Accordingly, the hybrid AdaBoost-LightGBM model was the best, with CC values of 0.61-0.76, 0.63-0.82, and 0.49-0.60 for Moga, Hoshiarpur, and Firozpur, respectively. Overall model performance was limited to moderate and varied across locations and scenarios; although AdaBoost-LMT showed the best relative performance, the results primarily support its use for reproducing temporal variability in GEOS-derived SSW rather than precise wetness estimation. The findings contribute to improving SSW estimation and support data-driven decision-making for sustainable land and water management at the selected sites in Punjab, north-western India.
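The abstract does not give the details of the monthly data-lagging step or of the CC computation, so the following is only a minimal sketch of the general idea, not the authors' method. The function names (`make_lagged_rows`, `pearson_cc`), the choice of lags, and the monthly values are all hypothetical and chosen purely for illustration.

```python
import math


def make_lagged_rows(series, lags):
    """Build lagged predictors from a monthly series (hypothetical helper).

    Each feature row holds series[t - lag] for every lag in `lags`;
    the matching target is series[t].
    """
    max_lag = max(lags)
    features, targets = [], []
    for t in range(max_lag, len(series)):
        features.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return features, targets


def pearson_cc(observed, predicted):
    """Pearson correlation coefficient (CC) between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Made-up monthly SSW values, NOT data from the study.
ssw = [0.42, 0.40, 0.35, 0.30, 0.28, 0.45, 0.70, 0.72, 0.60, 0.50, 0.46, 0.44]

# One possible "scenario": predict month t from months t-1 and t-2.
X, y = make_lagged_rows(ssw, lags=[1, 2])

# A naive persistence baseline (previous month's value) stands in for a model.
persistence = [row[0] for row in X]
cc = pearson_cc(y, persistence)
```

Different lag sets (for example `[1]`, `[1, 2]`, `[1, 2, 3]`) would correspond to different model scenarios in this sketch; any of the learners named in the abstract could then be fitted on `X` and `y` in place of the persistence baseline.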

PMID:42192144 | DOI:10.1038/s41598-026-50687-9

By Nevin Manimala
