Benefit of the N-of-1 Approach Versus Aggregate Analysis in Tracking Individual Trajectories During Pregnancy: Comparison of Longitudinal Wearable Observational Studies

JMIR Form Res. 2026 Apr 28;10:e86203. doi: 10.2196/86203.

ABSTRACT

BACKGROUND: Personal digital health technologies (DHTs) enable real-time monitoring of physiological metrics and behavioral data, including heart rate variability (HRV), supporting analysis of pregnancy-related conditions and personalized care throughout the perinatal period. While recent studies demonstrate the utility of personal DHTs in tracking pregnancy-related symptoms, they often rely on aggregate statistical methods that overlook individual variability.

OBJECTIVE: This study aims to compare aggregate and individual-level analyses of DHT data for pregnancy-related conditions, using the comprehensive BUMP (Better Understanding the Metamorphosis of Pregnancy) dataset to highlight the importance of individual variability and data heterogeneity.

METHODS: We analyzed wearable and self-reported data from 256 participants enrolled in the BUMP study (January 2021 to May 2022), including HRV, sleep, and fatigue measured via Oura Rings and smartphone surveys. Individual-level (N-of-1) trajectories were evaluated and compared with aggregate results to uncover personal and collective trends. A statistical method was developed to assess the influence of adverse events and severe symptoms, while case studies explored confounding and modifying factors underlying heterogeneity. Comprehensive statistical analysis included the coefficient of determination, Kolmogorov-Smirnov tests, likelihood ratio tests, and Welch t tests, with interindividual variability flagged based on high-variability thresholds.
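Two of the statistics named above, the coefficient of determination and the Welch t test, can be sketched with the Python standard library alone. This is a minimal illustration, not the study's own code: the function names `r_squared` and `welch_t` and all sample values are hypothetical, and the sketch assumes that a per-participant R² is computed first and then compared between groups.

```python
from math import sqrt
from statistics import mean, variance


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom.

    Does not assume equal variances or equal group sizes.
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-participant R² values (NOT BUMP data),
# split by whether a pregnancy complication was reported.
r2_with = [0.42, 0.55, 0.31, 0.60, 0.47]
r2_without = [0.50, 0.38, 0.58, 0.44, 0.52, 0.41]
t_stat, dof = welch_t(r2_with, r2_without)
```

Converting `t_stat` and `dof` to a P value requires the Student t distribution, which the standard library does not provide; a statistics package would normally supply it.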

RESULTS: Substantial interindividual variability was observed across all features. Only 4.76% (12/256) of participants exhibited an HRV inflection at the aggregate week-33 inflection point, with a coefficient of variation of 14.24%. Individual fatigue troughs occurred at a median gestational week of 23 (IQR 8; range 8-38), differing from aggregate estimates. Distributional comparisons showed no statistically significant differences in individual-level model fit (R²) by pregnancy complications or age (P values ranging from .06 to .99 across all model fit comparisons). Case studies further highlighted both intraindividual and interindividual differences, emphasizing the importance of considering external factors, such as adverse events and severe symptoms.
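The summary statistics reported above can be illustrated with a short sketch. The abstract does not state which quantity the coefficient of variation was computed on; the example assumes it describes the spread of individual inflection weeks. The function names and the sample weeks are hypothetical and are not BUMP data.

```python
from statistics import mean, median, quantiles, stdev


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def share_at_week(individual_weeks, aggregate_week):
    """Percentage of participants whose own inflection falls on the aggregate week."""
    hits = sum(1 for w in individual_weeks if w == aggregate_week)
    return 100.0 * hits / len(individual_weeks)


# Hypothetical individual inflection weeks, one per participant.
weeks = [29, 31, 33, 35, 30, 33, 27, 36, 32, 34]

cv = coefficient_of_variation(weeks)
share = share_at_week(weeks, aggregate_week=33)

# Median, IQR, and range, in the same form as the fatigue-trough summary.
q1, _, q3 = quantiles(weeks, n=4)
summary = (median(weeks), q3 - q1, min(weeks), max(weeks))
```

A low share at the aggregate week alongside a wide range is the pattern the abstract describes: the pooled inflection point exists, but few individuals actually sit on it.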

CONCLUSIONS: Our findings show that aggregate wearable data often fail to generalize across populations, oversimplifying pregnancy-related physiological and subjective changes. This simplification can obscure individual trajectories, yielding generalized insights that may not reflect the experiences of many pregnant women. Our results highlight the impact of heterogeneity on pregnancy outcomes, emphasizing the need to move beyond one-size-fits-all models and leverage DHTs for personalized care.

PMID:42048637 | DOI:10.2196/86203

By Nevin Manimala