
Optimizing Air Pollution Exposure Assessment with Application to Cognitive Function

Res Rep Health Eff Inst. 2025 Aug;(228):1-117.

ABSTRACT

INTRODUCTION: Epidemiological studies often make use of exposure data that are collected in opportunistic and logistically convenient ways. Although exposure assessment is fundamental to environmental epidemiology, little is known about which exposure assessment study designs are optimal for health inference. The objective of this project was to advance our understanding of the design of exposure assessment measurement campaigns and evaluate their impact on estimating the associations between long-term average air pollution exposure and cognitive function. This feeds into the broader goal of advancing understanding of air pollution exposure assessment design for application to epidemiological inference.

METHODS: We leveraged data from the Adult Changes in Thought (ACT) Air Pollution study (ACT-AP) to characterize exposures for over 5,000 participants from the ongoing ACT cohort. This is a population-based cohort of urban and suburban elderly individuals in the greater Puget Sound region drawn from Group Health Cooperative, now Kaiser Permanente, starting in 1994. Participants were followed with routine biennial visits until incident dementia, dropout, or death. Extensive health, lifestyle, biological, and demographic data were also collected. The outcome measure used in this report is cognitive function at baseline based on the Cognitive Abilities Screening Instrument derived using Item Response Theory (CASI-IRT). The IRT transformation of the CASI score improves score accuracy, measures cognitive change with less bias, and accounts for missing test items. Health association analyses were based on 5,409 participants who had a valid CASI score and had lived in the mobile monitoring region during at least 95% of the 5 years prior to baseline. We used 5-year average exposures that accounted for residential history.
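As a rough illustration of how residential history can be incorporated into a long-term exposure average, the Python sketch below computes a time-weighted 5-year mean across a participant's addresses. The function, column names, and data layout are hypothetical and are not taken from the ACT-AP pipeline.

```python
import pandas as pd

def five_year_avg(address_history: pd.DataFrame, baseline: pd.Timestamp) -> float:
    """Time-weighted mean exposure over the 5 years before baseline.

    address_history is assumed to have one row per residence, with
    hypothetical columns 'move_in' and 'move_out' (timestamps) and
    'pred_conc' (a predicted long-term average concentration at that address).
    The study's 95% residence-coverage requirement is left to the caller.
    """
    window_start = baseline - pd.DateOffset(years=5)
    total_days = 0.0
    weighted_sum = 0.0
    for row in address_history.itertuples():
        # Overlap of this residence with the 5-year window before baseline.
        start = max(row.move_in, window_start)
        end = min(row.move_out, baseline)
        days = (end - start).days
        if days > 0:
            total_days += days
            weighted_sum += days * row.pred_conc
    # Undefined if the residential history does not overlap the window.
    return weighted_sum / total_days if total_days > 0 else float("nan")
```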

Exposure data came from two distinct exposure assessment campaigns carried out by the ACT-AP study: a campaign using low-cost sensors (2017+) that supplemented existing regulatory monitoring data for fine particles (PM2.5, 1978+) and nitrogen dioxide (NO2, 1996+), and a year-long multipollutant mobile monitoring campaign (2019-2020). The evaluation of the added value of low-cost sensor data relied on a combination of regulatory monitoring data and other high-quality data from research studies, calibrated 2-week low-cost sensor measurements from over 100 locations, which were mostly ACT cohort residences, and a snapshot campaign that measured NO2 using Ogawa samplers. Predictions were at a 2-week average time scale, used a suite of ~200 geographic covariates, and were obtained from a spatiotemporal model developed at the University of Washington. The Seattle mobile monitoring campaign collected a combination of stationary roadside and on-road measurements of ultrafine particles (UFPs, four instruments), black carbon (BC), NO2, carbon dioxide (CO2), and PM2.5. Visits were temporally balanced over 288 drive days such that all sites were visited during all seasons, days of the week, and most hours of the day (5 a.m. to 11 p.m.) approximately 29 times each. For the on-road measurements, we divided the driving route into 100-meter segments and assigned all measurements to the segment midpoint. Predictions used the same suite of geographic covariates in a spatial model fit using partial least squares (PLS) dimension reduction with universal kriging (UK-PLS) to capture the remaining spatial structure. We reported model performance metrics for both the spatial and spatiotemporal models as root mean squared error (RMSE) and mean squared error (MSE)-based R2. The reference observations for the spatiotemporal model were low-cost sensor measurements at home locations (with performance metrics averaged over their entire measurement period to approximate spatial contrasts), and for the spatial model, the reference observations were the all data long-term averages at stationary roadside locations.
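To make the reported performance metrics concrete, the sketch below computes RMSE and MSE-based R2 (one minus MSE divided by the variance of the reference observations) for a toy PLS mean model on simulated data. The universal kriging step of UK-PLS and the study's actual covariates are omitted, so this is only a simplified illustration.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def rmse_and_mse_r2(obs, pred):
    """RMSE and MSE-based R2 of predictions against reference observations.

    MSE-based R2 is 1 - MSE/Var(obs) rather than a squared correlation, so it
    can be negative when predictions perform worse than the observed mean.
    """
    mse = float(np.mean((np.asarray(obs) - np.asarray(pred)) ** 2))
    return np.sqrt(mse), 1.0 - mse / float(np.var(obs))

# Toy illustration of the mean-model step: reduce a large geographic covariate
# set to a few PLS scores. The kriging of residual spatial structure in UK-PLS
# is omitted from this sketch, and the data are simulated.
rng = np.random.default_rng(0)
X_train, X_test = rng.normal(size=(300, 200)), rng.normal(size=(100, 200))
beta = rng.normal(size=200)
y_train = X_train @ beta + rng.normal(size=300)
y_test = X_test @ beta + rng.normal(size=100)

pls = PLSRegression(n_components=3).fit(X_train, y_train)
y_hat = pls.predict(X_test).ravel()
print(rmse_and_mse_r2(y_test, y_hat))
```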

Using various approaches to sample data from these two exposure monitoring campaigns, we determined the impact on exposure prediction and on estimates of health associations, based on 5-year average exposure predictions for cohort members at baseline developed from the alternative campaigns and two confounder models. For the low-cost sensor data, we evaluated temporally or spatially reduced subsets of low-cost sensors, as well as a comparison of the low-cost sensor versus snapshot campaigns for NO2. For the mobile monitoring data, we considered designs focused on the stationary roadside and on-road data separately. We reduced the stationary roadside data temporally by restricting seasons, times of day, or days of week for the campaign, while also considering a reduced number of visits using balanced sampling, as well as a set of unbalanced visit designs. We also reduced the on-road data spatially and temporally to assess the importance of spatially or temporally balanced data collection. In addition, we considered the impact of incorporating temporal adjustment to account for temporally unbalanced sampling, as well as plume adjustment to account for on-road sources. For each design, we evaluated prediction model performance using the all data stationary roadside observations (mobile campaign) or the measurements at homes (low-cost sensor campaign) as reference observations to ensure consistency in reported performance metrics. We also used long-term average exposures estimated from these alternative campaigns in health association analyses under two different confounder models: Model 1 adjusted for age, calendar year, sex, and educational attainment; Model 2 included all Model 1 variables with the addition of race and socioeconomic status. Furthermore, using the stationary roadside data, we applied parametric and nonparametric bootstrap methods to account for Berkson-like and classical-like exposure measurement error for the UFP exposure in confounder model 1.
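A minimal sketch of the two confounder adjustments described above, written with statsmodels and hypothetical variable names; the study's actual covariate coding and modeling choices may differ.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical analysis dataset: one row per participant at baseline, with the
# 5-year average exposure prediction and the CASI-IRT outcome. All column and
# file names here are placeholders, not the study's actual variables.
df = pd.read_csv("act_baseline.csv")

# Confounder model 1: age, calendar year, sex, and educational attainment.
m1 = smf.ols(
    "casi_irt ~ exposure_5yr + age + C(calendar_year) + C(sex) + C(education)",
    data=df,
).fit()

# Confounder model 2: model 1 covariates plus race and socioeconomic status.
m2 = smf.ols(
    "casi_irt ~ exposure_5yr + age + C(calendar_year) + C(sex) + C(education)"
    " + C(race) + ses",
    data=df,
).fit()

# The coefficient on exposure_5yr is the health association of interest.
for label, model in [("Model 1", m1), ("Model 2", m2)]:
    est = model.params["exposure_5yr"]
    lo, hi = model.conf_int().loc["exposure_5yr"]
    print(f"{label}: {est:.3f} (95% CI {lo:.3f}, {hi:.3f})")
```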

In a separate methods-focused aim, we developed and applied advanced statistical methods using the stationary roadside mobile monitoring data. To evaluate possible improvements in exposure model performance, we applied tree-based machine learning algorithms that also account for residual spatial structure, and compared these to UK-PLS. This led to the development of a variable importance metric that uses a leave-one-out approach to evaluate the change in predictions across various user-specified quantiles. The variable importance metric produces covariate-specific averages that reflect how the predictions, on average, vary across different quantiles of each covariate. This serves as an intuitive measure of each covariate's contribution to the predicted outcome. A key idea in this variable importance approach is to reuse the trained mean model across all locations and to refit the covariance model in a leave-one-out manner. In separate work to address dimension reduction for multipollutant prediction, we extended classical principal component analysis (PCA) and a recently developed predictive PCA approach to optimize performance by balancing the representativeness in classical PCA with the predictive ability of predictive PCA. We called the new method representative and predictive PCA, or RapPCA.
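The quantile-contrast idea behind the variable importance metric can be sketched as follows: perturb one covariate at a time between two user-specified quantiles and average the resulting change in predictions across locations. This simplified version omits the leave-one-out refitting of the covariance model described above, and the function and argument names are illustrative only.

```python
import numpy as np
import pandas as pd

def quantile_importance(model, X: pd.DataFrame, quantiles=(0.1, 0.9)) -> pd.Series:
    """Covariate-specific importance as the mean change in predictions when a
    single covariate is moved between user-specified quantiles.

    `model` can be any fitted object with a .predict(X) method; the
    leave-one-out refitting of the covariance model is omitted from this
    simplified sketch.
    """
    scores = {}
    for col in X.columns:
        lo, hi = X[col].quantile(list(quantiles))
        X_lo, X_hi = X.copy(), X.copy()
        X_lo[col] = lo
        X_hi[col] = hi
        # Average over locations of the prediction change between quantiles.
        diff = np.asarray(model.predict(X_hi)) - np.asarray(model.predict(X_lo))
        scores[col] = float(np.mean(diff))
    return pd.Series(scores).sort_values(key=np.abs, ascending=False)
```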

Finally, we characterized the various exposure assessment campaigns in terms of the value of their information relative to their cost. We calculated costs, focusing predominantly on staff days of effort, for various exposure assessment designs and compared these to exposure model performance statistics.

RESULTS: We found that air pollution exposure assessment design is critical for exposure prediction, and also impacts health inference. We showed that a mobile monitoring study with stationary roadside sampling that has at least 12 visits per location in a balanced and temporally unrestricted design optimizes exposure model performance while also limiting costs. Relative to weaker alternatives, a balanced and temporally unrestricted design has improved accuracy and reduced variability of health inferences, particularly for confounder model 1. To address temporal balance, it is important that the exposure sampling in mobile monitoring campaigns cover all days of the week, most hours of the day, and at least two seasons. The popular temporally restricted business-hours sampling design had the poorest performance, which was not improved by adjusting for the temporally unbalanced sampling approach. We found similar patterns using on-road data, though the findings were weaker overall.

For the alternative exposure campaign that supplemented regulatory monitoring data with low-cost sensor data, while the exposure prediction model performances improved with the inclusion of the low-cost sensors, there was little notable impact on the health inferences, and the costs were steep. Given that the supplementary exposure assessment data were sparse relative to the existing regulatory monitoring data, and that the low-cost sensor data collection used a rotating approach due to the limited number of sensors (i.e., low-cost sensor measurements were not collected using a balanced design), it was much more challenging to develop deep insights from this exposure assessment approach.

Finally, we found that leveraging spatial ensemble-learning methods for prediction did not improve exposure prediction model performances or alter health inferences. The new multipollutant dimension-reduction method we developed, RapPCA, had the best predictive performance, minimizing prediction error in comparison with both classical and predictive PCA.

CONCLUSIONS: This project has shown that there should be greater attention to the design of the exposure data collection campaigns used in epidemiological inference. Based on the multiple investigations conducted, many of which focused on UFPs, we found that exposure predictions with better performance statistics resulted in health association estimates that were generally more consistent with those obtained using the “best” exposure model predictions (the model with all data included), although the pattern of health estimates was often less conclusive than the pattern of prediction model performances. Furthermore, we found that it is possible to design air pollution exposure assessment studies that achieve good exposure prediction model performance while controlling their relative cost.

We developed strong recommendations for mobile monitoring campaign design, thanks to the well-designed and comprehensive Seattle mobile monitoring campaign. Insights from supplementing regulatory monitoring data with low-cost sensor data were less compelling, driven predominantly by a data structure with sparse and temporally unbalanced supplementary data that may not have been sufficiently comprehensive to demonstrate the impacts of alternative designs. Broadly speaking, better exposure assessment design leads to better exposure prediction model performance, which in turn can benefit estimates of health associations.

We did not find that leveraging advanced statistical methods (specifically, spatial ensemble-learning methods for prediction) improved exposure prediction model performances. This finding is not consistent with the conclusions reached by other investigators. It may be explained by the already sophisticated UK-PLS approach we used by default, and in particular by the large number of covariates included in the PLS model: in the presence of such a large covariate set, each covariate can contribute an approximately linear association with the pollutant being modeled, so the potential added value of a spatial random forest approach is not observed in the model fit. Settings with a smaller number of available covariates may lead to different conclusions and suggest greater added value of a spatial random forest approach.

We based our approach on leveraging the extensive air pollution exposure assessment and outcome data available from the ACT-AP study. Thus, we sampled from the existing air pollution data to evaluate exposure assessment designs that were subsets of those data. Then, conditional on each of these designs, we evaluated subsequent health inferences, which focused on cognitive function at baseline using the CASI-IRT outcome. The magnitude and uncertainty of these health association estimates were dependent upon the associations evident in the ACT cohort, and the insights we were able to develop are conditional on the strengths and weaknesses of these data. Specifically, while we observed some larger impacts on health association estimates from more poorly performing exposure models relative to the complete all data exposure model, such as the business-hours design from a mobile monitoring campaign, many of the differences were small and did not deviate meaningfully from the health association estimate obtained from the “best” exposure model. The degree of impact on the epidemiological inference depended on the magnitude of the health association estimate from the “best” exposure model and the width of its confidence interval. Future investigations should replicate and expand upon these findings in other settings, including application to new cohorts and exposure assessment data, as well as in simulation studies, which provide an alternative to relying on real-world data to evaluate a constellation of exposure models. However, while knowledge of the assumed underlying truth is an important strength of simulation studies, it is challenging for them to capture real-world complexity meaningfully.

Our foray into applying advanced machine-learning methods to improve exposure predictions yielded the surprising result that our default UK-PLS approach for spatial prediction produced performance metrics similar to those of spatial ensemble-learning methods. Future evaluations that assess smaller subsets of exposure covariates will allow determination of the relative exposure model performance benefits of UK-PLS versus spatial ensemble-learning methods, and provide insights into the possible reasons that our conclusions differ from others in the literature.

PMID:41310253
