
Personalized echocardiographic segmentation via bidirectional encoder representations from transformers Y-shaped network with patient attributes

Med Phys. 2026 Jan;53(1):e70235. doi: 10.1002/mp.70235.

ABSTRACT

BACKGROUND: Accurate cardiac structure segmentation from echocardiography is essential for quantitative cardiac function assessment in clinical cardiology. However, traditional manual annotation is time-consuming and subjective, and existing automated methods often overlook inter-patient anatomical differences, particularly sex-related variability, limiting their generalizability.

PURPOSE: We propose BTY-Net (Bidirectional Encoder Representations from Transformers (BERT) Text-based Y-shaped Network), a novel automatic segmentation framework designed to incorporate patient-specific attributes, enabling personalized and anatomically adaptive segmentation of echocardiograms.

METHODS: BTY-Net is built upon a Unet3+ backbone combined with a Transformer encoder, incorporates a multi-layer denoising filter to enhance image quality, and employs a pre-trained BERT model to encode patient demographic and acquisition context as natural language embeddings. Experiments were conducted on the Cardiac Acquisitions for Multi-structure Ultrasound segmentation dataset (500 biplane cases, 400/50/50 train/validation/test split). We benchmarked eight state-of-the-art models (e.g., variants of Unet and generative adversarial networks). The Dice similarity coefficient (Dice) and Hausdorff Distance (HD) served as the primary metrics. Statistical significance was assessed via the Wilcoxon signed-rank test with p < 0.05 as the threshold, and Holm-Bonferroni correction (α = 0.05) was applied for multiple comparisons. Hedges’ g effect sizes were calculated to quantify the magnitude of the differences.
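The multiple-comparison correction and effect-size steps named in the methods can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names `holm_adjust` and `hedges_g` are hypothetical, the example p-values and scores are made up, and the independent-samples form of Hedges' g with a pooled standard deviation is an assumption, since the abstract does not say which variant was used.

```python
import math
from statistics import mean, variance


def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by (m - k + 1).
        scaled = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, scaled)
        adjusted[idx] = running_max
    return adjusted


def hedges_g(a, b):
    """Hedges' g for two samples (assumed independent, pooled SD)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    cohen_d = (mean(a) - mean(b)) / math.sqrt(pooled_var)
    # Small-sample bias correction factor.
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    return cohen_d * correction


if __name__ == "__main__":
    # Made-up raw p-values from three hypothetical model comparisons.
    raw_p = [0.01, 0.04, 0.03]
    print([p <= 0.05 for p in holm_adjust(raw_p)])
    # Made-up per-case Dice scores for two hypothetical models.
    print(hedges_g([0.93, 0.91, 0.94, 0.92], [0.90, 0.89, 0.92, 0.91]))
```

A comparison counts as significant at α = 0.05 when its adjusted p-value is at most 0.05. The raw p-values themselves would come from a paired test such as the Wilcoxon signed-rank test, which this sketch does not implement.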

RESULTS: BTY-Net achieved the highest Dice coefficients across the three cardiac structures (LV endocardium: 0.9316 ± 0.027; LV myocardium: 0.8617 ± 0.050; left atrium: 0.8703 ± 0.086) and the lowest HD values (8.14 ± 3.44, 10.97 ± 7.41, and 11.13 ± 8.16 mm, respectively). Compared with the strongest baseline, BTY-Net improved Dice by up to 0.02-0.03 and reduced HD by approximately 1.1-1.5 mm, with Holm-adjusted p < 0.05 and small-to-medium Hedges’ g effect sizes. Across all test cases, BTY-Net yielded the highest agreement with reference ejection fraction (correlation = 0.9119, MAE = 3.40%). Sex-stratified analyses further confirmed stable performance in both male and female subgroups, indicating robust adaptation to anatomical diversity.
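For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how Dice and HD can be computed for a pair of binary masks. It is illustrative only: the function names and the `spacing` parameter are hypothetical, pixel spacing is assumed isotropic, and the distance is taken over all foreground pixels rather than extracted contours, which may differ from the evaluation protocol used in the study.

```python
import math


def _foreground(mask):
    """Collect (row, col) coordinates of non-zero pixels in a nested-list mask."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def dice(pred, ref):
    """Dice coefficient: 2 * |P ∩ R| / (|P| + |R|)."""
    p, q = _foreground(pred), _foreground(ref)
    if not p and not q:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * len(p & q) / (len(p) + len(q))


def hausdorff(pred, ref, spacing=1.0):
    """Symmetric Hausdorff distance, scaled by an assumed isotropic pixel spacing."""
    p, q = _foreground(pred), _foreground(ref)
    if not p or not q:
        raise ValueError("Hausdorff distance is undefined for an empty mask")

    def directed(a, b):
        # Largest distance from any point in a to its nearest point in b.
        return max(min(math.dist(x, y) for y in b) for x in a)

    return spacing * max(directed(p, q), directed(q, p))


if __name__ == "__main__":
    # Toy 2x3 masks, shifted by one pixel relative to each other.
    predicted = [[1, 1, 0], [0, 0, 0]]
    reference = [[0, 1, 1], [0, 0, 0]]
    print(dice(predicted, reference))
    print(hausdorff(predicted, reference, spacing=0.5))
```

The brute-force nearest-point search is quadratic in the number of foreground pixels, which is acceptable for a toy example but slow for full-resolution echocardiographic masks.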

CONCLUSIONS: BTY-Net offers an effective and interpretable approach to personalized echocardiographic analysis. By fusing patient information with image data, it improves segmentation accuracy and provides clinically meaningful attention maps, yielding a multimodal, sex-robust solution for routine echocardiographic analysis.

PMID:41423714 | DOI:10.1002/mp.70235

By Nevin Manimala
