Sci Rep. 2026 Jan 4. doi: 10.1038/s41598-025-34849-9. Online ahead of print.
ABSTRACT
The rapid evolution of urban demographics necessitates advanced predictive modeling to optimize rail transit capacity and reliability. This study presents a novel age-sensitive demand forecasting and anomaly detection framework for Istanbul’s urban rail network, utilizing a comprehensive dataset of 721,328 passenger-trip records collected between 2021 and 2023. Using eleven engineered spatiotemporal and transactional features, passengers are classified into four distinct age cohorts (<20, 20-30, 30-60, 60+) to capture diverse mobility behaviors. The methodological approach benchmarks four classical linear classifiers, three gradient-boosting decision trees, and a tabular deep learning model (SAINT) against a proposed two-stage hybrid ensemble. This hybrid architecture integrates the deep representational capability of the SAINT Transformer with the categorical robustness of CatBoost, employing a stacking strategy enriched with calibrated uncertainty meta-features (entropy and maximum confidence). Rigorous evaluation using a chronological hold-out protocol demonstrates that the proposed ensemble achieves the best performance among the benchmarked models, with a peak accuracy of 91.94% and a ROC-AUC of 0.9910, significantly surpassing the standalone SAINT (90.12%) and CatBoost (74.78%) baselines. The statistical significance of this improvement is confirmed via McNemar’s test (p < 0.001), while five-fold time-series cross-validation verifies generalization stability. Furthermore, an unsupervised anomaly detection mechanism is introduced, achieving a ROC-AUC of 0.77 in distinguishing irregular latent patterns through synthetic perturbation validation. Post-hoc SHAP analysis elucidates the model’s decision-making dynamics, revealing that cumulative usage frequency primarily drives predictions for the working-age population, whereas consistent solo travel behavior characterizes the senior demographic.
Consequently, this work delivers a robust, highly calibrated, and interpretable solution for intelligent transportation planning, offering actionable insights for real-time capacity management and operational resilience.
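The uncertainty meta-features named in the abstract (entropy and maximum confidence) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each base model outputs a probability vector over the four age cohorts, that entropy is the Shannon entropy in nats, and that the stacking input concatenates both probability vectors with their uncertainty features. The function names and the example probabilities are hypothetical.

```python
import math


def uncertainty_meta_features(probs):
    """Return (entropy, max_confidence) for one class-probability vector.

    Entropy is the Shannon entropy in nats (an assumption; the abstract
    does not state the logarithm base). Zero probabilities contribute 0.
    """
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy, max(probs)


def build_meta_row(saint_probs, catboost_probs):
    """Build one hypothetical stacking input row.

    Layout (assumed): SAINT probabilities, CatBoost probabilities,
    then (entropy, max confidence) for each base model in that order.
    """
    row = list(saint_probs) + list(catboost_probs)
    for probs in (saint_probs, catboost_probs):
        row.extend(uncertainty_meta_features(probs))
    return row


# Made-up probabilities over the four cohorts (<20, 20-30, 30-60, 60+).
saint_probs = [0.05, 0.10, 0.80, 0.05]      # confident prediction
catboost_probs = [0.25, 0.25, 0.25, 0.25]   # maximally uncertain prediction

meta_row = build_meta_row(saint_probs, catboost_probs)
```

In this sketch a confident prediction yields low entropy and high maximum confidence, while a uniform prediction yields the maximum entropy for four classes (ln 4), which gives a second-stage learner a signal about how much to trust each base model on a given trip record.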
PMID:41486208 | DOI:10.1038/s41598-025-34849-9