J Imaging Inform Med. 2026 Apr 2. doi: 10.1007/s10278-026-01934-y. Online ahead of print.
ABSTRACT
This study proposes a Residual Conditional Variational Autoencoder model (ResCVAE-Harmonizer) that integrates batch information and clinical covariates for multi-center feature harmonization, and systematically evaluates its harmonization performance. A total of 806 cases were collected from 9 centers. After preprocessing, three types of features were extracted from PET and CT images: low-dimensional radiomic features, high-dimensional radiomic features, and deep learning features based on 3D-DenseNet-121. Each feature type was harmonized using ComBat, CovBat, and the proposed ResCVAE-Harmonizer. Both harmonized and original features were assessed in an evaluation framework comprising variance homogeneity analysis, a multi-center classification test, and an evaluation of downstream task effectiveness. The ResCVAE-Harmonizer significantly improved cross-center feature consistency. Levene’s test showed a general reduction in -log10(p) values after harmonization, with more pronounced improvements in low- and high-dimensional radiomic features. In center classification tasks, ResCVAE-harmonized features demonstrated greater stability across four classifiers and outperformed the original features. For the downstream survival prediction task, PET deep learning features processed by ResCVAE achieved the highest C-index (0.8920, 95% CI 0.8514-0.9325), surpassing the original features (0.8765), ComBat (0.8909), and CovBat (0.8455). Similarly, the C-index for CT deep features improved to 0.8296 (95% CI 0.7715-0.8877). Kaplan-Meier survival stratification based on ResCVAE features showed clearer separation between high- and low-risk groups, with statistically significant log-rank test results.
While slightly inferior to ComBat in linear variance consistency, ResCVAE-Harmonizer effectively eliminated both linear and nonlinear batch effects and significantly enhanced survival prediction performance, demonstrating strong research potential.
PMID:41927822 | DOI:10.1007/s10278-026-01934-y
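The survival results above are reported as a C-index (concordance index). As a reading aid, the following is a minimal sketch of Harrell's C-index in plain Python. It is not the authors' implementation: the function name `concordance_index` and the toy data are hypothetical, and pairs with tied survival times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (illustrative sketch, hypothetical helper).

    times  : observed follow-up time per subject
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk failed first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data invented for illustration only (not from the study).
toy_times = [5, 8, 12, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # 0.8
```

In the toy data, five pairs are comparable and four of them are ranked correctly, giving 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.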