
Mixed effect gradient boosting for high-dimensional longitudinal data

Sci Rep. 2025 Aug 22;15(1):30927. doi: 10.1038/s41598-025-16526-z.

ABSTRACT

High-dimensional longitudinal data present significant analytical challenges due to intricate within-subject correlations and an overwhelming ratio of predictors to observations. To address these challenges, we introduce Mixed-Effect Gradient Boosting (MEGB), a novel R package that synergises gradient boosting with mixed-effects modelling to simultaneously account for population-level fixed effects and subject-specific random variability. MEGB provides a unified framework for analysing repeated measures data that accommodates complex covariance structures while harnessing gradient boosting’s inherent regularisation for robust feature selection and prediction. In comprehensive simulations spanning linear and nonlinear data-generating processes, MEGB achieved 35-76% lower mean squared error (MSE) compared to state-of-the-art alternatives like Mixed-Effect Random Forests (MERF) and REEMForest, while maintaining 55-70% true positive rates for variable selection in ultra-high-dimensional regimes (p = 2000). Demonstrating practical utility, we applied MEGB to maternal cell-free plasma RNA data (n = 12 subjects, p = 33,297 transcripts), where it identified 9 key placental transcripts driving fetal RNA dynamics across pregnancy trimesters.
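The abstract does not spell out the fitting algorithm, so the following is only a minimal sketch of the general idea of combining boosting with a random effect, not the MEGB package's implementation. It assumes a random-intercept model, a single predictor, stump-based boosting on squared error, and simplified moment-style variance updates. All function names (`fit_stump`, `boost`, `mixed_effect_boost`, `simulate`, `mse`) and the simulated data are hypothetical.

```python
import random

def fit_stump(x, r):
    """Best single-split regression stump on one feature for residuals r."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    total, n = sum(r), len(r)
    best = (float("inf"), x[order[0]], total / n, total / n)
    left_sum = 0.0
    for k in range(1, n):
        left_sum += r[order[k - 1]]
        if x[order[k]] == x[order[k - 1]]:
            continue  # cannot split between tied values
        lm, rm = left_sum / k, (total - left_sum) / (n - k)
        # Minimising squared error equals maximising k*lm^2 + (n-k)*rm^2.
        score = -(k * lm * lm + (n - k) * rm * rm)
        if score < best[0]:
            best = (score, 0.5 * (x[order[k]] + x[order[k - 1]]), lm, rm)
    return best[1:]

def boost(x, y, n_rounds=50, lr=0.1):
    """Gradient boosting with stumps on squared loss; returns fitted values."""
    pred = [sum(y) / len(y)] * len(y)
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        thr, lm, rm = fit_stump(x, resid)
        pred = [p + lr * (lm if xi <= thr else rm) for p, xi in zip(pred, x)]
    return pred

def mixed_effect_boost(x, y, subj, n_iter=10):
    """Alternate between a boosted fixed part and shrunken random intercepts."""
    ids = sorted(set(subj))
    b = {s: 0.0 for s in ids}
    var_b, var_e = 1.0, 1.0  # assumed starting values
    for _ in range(n_iter):
        # Fixed part: boost on the response with random intercepts removed.
        f = boost(x, [yi - b[s] for yi, s in zip(y, subj)])
        # Random part: shrunken per-subject mean residual (random-intercept BLUP).
        for s in ids:
            res = [yi - fi for yi, fi, si in zip(y, f, subj) if si == s]
            shrink = var_b / (var_b + var_e / len(res))
            b[s] = shrink * sum(res) / len(res)
        # Simplified moment updates (a full EM step would add variance terms).
        var_b = max(sum(v * v for v in b.values()) / len(ids), 1e-8)
        err = [yi - fi - b[s] for yi, fi, s in zip(y, f, subj)]
        var_e = max(sum(e * e for e in err) / len(err), 1e-8)
    return f, b

def simulate(n_subj=12, n_obs=8, seed=0):
    """Toy longitudinal data: step-function signal plus subject intercepts."""
    rng = random.Random(seed)
    x, y, subj = [], [], []
    for s in range(n_subj):
        b_true = rng.gauss(0, 2.0)
        for _ in range(n_obs):
            xi = rng.random()
            x.append(xi)
            y.append((3.0 if xi > 0.5 else 0.0) + b_true + rng.gauss(0, 0.3))
            subj.append(s)
    return x, y, subj

def mse(y, pred):
    return sum((a - p) ** 2 for a, p in zip(y, pred)) / len(y)

x, y, subj = simulate()
f, b = mixed_effect_boost(x, y, subj)
print("boosting only  :", mse(y, boost(x, y)))
print("with intercepts:", mse(y, [fi + b[s] for fi, s in zip(f, subj)]))
```

The comparison is in-sample and on toy data, so it illustrates only why ignoring subject-level variability leaves structured error behind; it says nothing about the benchmark results reported in the abstract.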

PMID:40847064 | DOI:10.1038/s41598-025-16526-z

By Nevin Manimala
