
Assessing the onset of spring water-level rise in snowmelt-dominated rivers of northeastern Russia using machine learning

Sci Rep. 2026 May 26. doi: 10.1038/s41598-026-54492-2. Online ahead of print.

ABSTRACT

The timing of the initial spring water-level rise represents a key indicator of seasonal hydrological transition in snowmelt-dominated river systems of high-latitude regions. This study evaluates the capability of ensemble machine learning (ML) models to estimate the onset date of the spring water-level rise in Arctic-subarctic rivers of the Anadyr-Kolyma basin district in northeastern Russia using a station-year dataset for the period 2008-2022, combining hydrological observations with meteorological and basin-related predictors. Five regression algorithms were tested using grouped cross-validation by year. CatBoost achieved the highest predictive accuracy with an out-of-fold mean absolute error of 4.54 days, RMSE of 9.79 days, and [Formula: see text], slightly outperforming ExtraTrees (MAE 4.66 days) and RandomForest (MAE 4.70 days). Spatial analysis shows that most gauging stations exhibit prediction errors within 0.5-3 days, whereas errors exceeding 10 days occur mainly in small or topographically complex basins with limited observational coverage. Model interpretation using SHapley Additive exPlanations (SHAP) and partial dependence (PDP) analysis indicates that predictors describing thermal forcing during late winter and early spring dominate the model response, with positive degree days during March-April, the first thaw day, and indicators of rapid water-level rise providing the largest contributions. The onset of spring water-level rise in the studied Arctic-subarctic river systems is primarily associated with the interaction between temperature-driven snowmelt processes and the early hydrological response of the river network, whereas precipitation and spatial descriptors exhibit comparatively smaller contributions. These statistical relationships are conditioned on the 2008-2022 period and may vary under different climatic conditions or longer observational records, which should be considered when applying the model for prediction.
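The evaluation scheme described in the abstract, grouped cross-validation by year with out-of-fold MAE and RMSE, can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the per-station mean is a stand-in for CatBoost, the leave-one-year-out split is one possible form of year grouping, and the field names (`station`, `year`, `onset_doy`) and all data values are invented for the example.

```python
import math
from collections import defaultdict


def leave_one_year_out(records):
    """Yield (held_out_year, train, test), keeping each year in a single fold."""
    by_year = defaultdict(list)
    for rec in records:
        by_year[rec["year"]].append(rec)
    for year in sorted(by_year):
        train = [r for y, rs in by_year.items() if y != year for r in rs]
        yield year, train, by_year[year]


def fit_station_mean(train):
    """Placeholder model (assumption): mean onset day per station,
    falling back to the global mean for stations unseen in training."""
    sums = defaultdict(lambda: [0.0, 0])
    for r in train:
        sums[r["station"]][0] += r["onset_doy"]
        sums[r["station"]][1] += 1
    global_mean = sum(r["onset_doy"] for r in train) / len(train)
    means = {s: total / n for s, (total, n) in sums.items()}
    return lambda rec: means.get(rec["station"], global_mean)


def out_of_fold_errors(records):
    """Pool the prediction errors from every held-out year, then score them."""
    errors = []
    for _, train, test in leave_one_year_out(records):
        predict = fit_station_mean(train)
        errors.extend(predict(r) - r["onset_doy"] for r in test)
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


# Synthetic station-years (hypothetical values, onset as day of year).
records = [
    {"station": "A", "year": 2008, "onset_doy": 130},
    {"station": "A", "year": 2009, "onset_doy": 134},
    {"station": "A", "year": 2010, "onset_doy": 132},
    {"station": "B", "year": 2008, "onset_doy": 140},
    {"station": "B", "year": 2009, "onset_doy": 150},
    {"station": "B", "year": 2010, "onset_doy": 142},
]

mae, rmse = out_of_fold_errors(records)
print(f"out-of-fold MAE = {mae:.2f} days, RMSE = {rmse:.2f} days")
```

Grouping by year keeps all stations from a given spring in the same fold, so the model is never scored on a year whose weather it has already seen at a neighboring station. Because RMSE squares each error, a small number of large misses raises it well above MAE, which is consistent with the reported gap between 4.54 and 9.79 days alongside errors exceeding 10 days at some stations.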

PMID:42192198 | DOI:10.1038/s41598-026-54492-2

By Nevin Manimala
