Nevin Manimala Statistics

Operational 24-h PM2.5 forecasting using ensemble methods and open-access satellite data: application to Dakar, Senegal

Environ Monit Assess. 2026 May 13;198(6):581. doi: 10.1007/s10661-026-15384-0.

ABSTRACT

Accurate short-term air quality forecasting is urgently needed in African megacities. Chronic PM2.5 pollution threatens public health in these cities. At the same time, ground-based monitoring infrastructure remains sparse. This study develops and evaluates an operational 24-h PM2.5 forecasting system for the low-income housing (HLM) district of Dakar, Senegal. The site represents a high-exposure urban-industrial environment. Daily PM2.5 concentrations from 2019 to 2024 (mean = 274.71 µg/m³) were predicted using four ensemble machine learning (ML) models: Random Forest (RF), Extra Trees Regression (ETR), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). Predictor variables included ERA5-Land meteorology, aerosol optical depth (AOD) from the Copernicus Atmosphere Monitoring Service (CAMS), and Sentinel-5P tropospheric columns (NO₂, CO, SO₂). CatBoost achieved the highest performance on the independent 2024 test set (R² = 0.931, RMSE = 11.50 µg/m³, MAE = 7.51 µg/m³). Shapley Additive exPlanations (SHAP) analysis identified AOD as the dominant predictor, followed by lagged PM2.5, relative humidity (RH), seasonality, and precipitation (PRECIP), reflecting the combined influence of Saharan dust transport, hygroscopic growth factor (HGF), wet deposition, and Harmattan-monsoon dynamics (see the “Physical interpretation of feature importance and SHAP analysis” section for full interpretation). External forcing dominated over local emission persistence, confirming that accurate 24-h predictions can be issued without any real-time ground-based PM2.5 input. Relying exclusively on open-access data and requiring minimal computational resources, the framework is scalable to other Sahelian urban contexts with similar monitoring constraints. These findings demonstrate that physically interpretable ML can transform sparse ground networks into actionable public health intelligence.
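
The evaluation design described in the abstract (a lagged PM2.5 predictor, a held-out 2024 test year, and R², RMSE, and MAE as scores) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the record layout, and the 1-day horizon are assumptions made for illustration, and no model is trained here.

```python
import math
from datetime import date

def make_supervised(records, horizon=1):
    """Pair each day's PM2.5 (used as a lagged predictor) with the value
    `horizon` days later (the forecast target).

    `records` is assumed to be a list of (date, pm25) tuples sorted by
    date with no missing days; this layout is hypothetical.
    """
    rows = []
    for i in range(len(records) - horizon):
        _, pm_today = records[i]
        target_day, pm_target = records[i + horizon]
        rows.append({"target_date": target_day,
                     "pm25_lag": pm_today,
                     "target": pm_target})
    return rows

def chronological_split(rows, test_year):
    """Train on all years before `test_year`, test on `test_year` only,
    so no future information leaks into training."""
    train = [r for r in rows if r["target_date"].year < test_year]
    test = [r for r in rows if r["target_date"].year == test_year]
    return train, test

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE for paired observations and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
    }
```

A persistence baseline (predicting tomorrow's value as today's) can be scored by passing the `pm25_lag` values of the test rows as `y_pred`; any trained model would be compared against that reference on the same held-out year.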

PMID:42120759 | DOI:10.1007/s10661-026-15384-0

By Nevin Manimala
