Med Phys. 2026 Mar;53(3):e70385. doi: 10.1002/mp.70385.
ABSTRACT
BACKGROUND: The integration of artificial intelligence into image-guided intraoperative interventions holds considerable promise for deriving 3D geometric information from 2D imaging. 2D/3D registration establishes the spatial relationship between preoperative computed tomography (CT) and intraoperative X-rays. However, existing methods are often limited by the image domain gap and imprecise feature extraction, causing coarse registration to provide inadequate initial poses and subsequent fine registration to fall into local optima, thereby reducing accuracy.
PURPOSE: We aim to develop a robust single-view lumbar spine 2D/3D registration framework that balances high clinical accuracy with intraoperative efficiency requirements by aligning preoperative CT with intraoperative X-rays.
METHODS: We propose utilizing vertebral body edges in X-rays as novel semantic features to guide 2D/3D registration. For robust edge extraction, we develop ESegMamba, an efficient U-shaped Mamba network incorporating Group multi-axis Hadamard Product Attention (GHPA) and Group Aggregation Concatenation (GAC) modules. Experiments for semantic edge extraction were performed on a dataset of 710 images (comprising X-rays and Digitally Reconstructed Radiographs) derived from 10 patients. The dataset was partitioned using a 4:1 patient-specific split, resulting in 568 training and 142 test images. The training set was further utilized via 5-fold cross-validation for network fine-tuning. ESegMamba was benchmarked against SegMamba, SwinUNETR, and UNETR using Dice and mIoU metrics. For 2D/3D registration, experiments were conducted separately on 300 simulated samples and 90 real clinical samples, following the same patient-specific split. The proposed framework was compared with landmark-based, intensity-based, and learning-based methods using mean Target Registration Error (mTRE). Statistical significance was assessed using the Wilcoxon signed-rank test with a significance level of 0.05, applying Bonferroni correction for multiple comparisons.
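As a rough illustration of the segmentation metrics named above (Dice and mIoU), the following sketch computes them for binary masks. The function names and array shapes here are our own illustration, not taken from the paper's code:

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient: 2|A∩B| / (|A|+|B|) for binary masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def iou(pred, gt):
    """Intersection over Union: |A∩B| / |A∪B| for binary masks.
    mIoU averages this value over classes (or images)."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

# Toy example: 2x2 masks overlapping in one pixel.
pred = np.array([[1, 1], [0, 0]], dtype=bool)
gt = np.array([[1, 0], [0, 0]], dtype=bool)
print(dice(pred, gt))  # → 0.666...
print(iou(pred, gt))   # → 0.5
```

In practice these would be averaged over the 142 test images to yield the reported Dice and mIoU percentages.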
RESULTS: ESegMamba outperforms representative networks while using fewer parameters (99.18 M), achieving 90.36% Dice and 85.49% mIoU on the test set. Compared to the strong baseline SegMamba, ESegMamba demonstrated a large effect size (Cohen's d) in Dice improvement. For 2D/3D registration, the proposed method outperformed representative benchmarks, including Xreg and PSSS, with large practical improvements in mTRE. On real clinical data, the method achieved a mean in-plane translation error of approximately 1.5 mm and an average registration time of approximately 10 s.
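The registration metric reported above, mean Target Registration Error (mTRE), is conventionally the mean Euclidean distance between anatomical target points mapped by the estimated pose and by the ground-truth pose. A minimal sketch, assuming 4x4 homogeneous rigid transforms (the function name and interface are our assumptions, not the paper's code):

```python
import numpy as np

def mtre(points_3d, T_est, T_gt):
    """Mean Target Registration Error.

    points_3d : (N, 3) array of target points in CT coordinates.
    T_est, T_gt : (4, 4) homogeneous rigid transforms (estimated vs.
    ground truth). Returns the mean point-wise Euclidean distance.
    """
    pts_h = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    # Map points with both transforms and compare the Cartesian parts.
    diff = (T_est @ pts_h.T - T_gt @ pts_h.T)[:3]
    return np.linalg.norm(diff, axis=0).mean()

# Toy example: estimated pose off by a 1 mm translation along x.
points = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]])
T_gt = np.eye(4)
T_est = np.eye(4)
T_est[0, 3] = 1.0
print(mtre(points, T_est, T_gt))  # → 1.0
```

A pure translation offset shifts every target point equally, so here the mTRE equals the 1 mm offset regardless of the points chosen.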
CONCLUSIONS: The proposed method, empowered by ESegMamba, yields statistically significant improvements over intensity-based benchmarks. The achieved sub-2 mm accuracy and approximately 10 s processing time on clinical data confirm its efficacy for intraoperative spinal navigation. The code for the proposed method is available at github.com/shenao1995/lineReg.
PMID:41833531 | DOI:10.1002/mp.70385