Categories
Nevin Manimala Statistics

From dictation to diagnosis: enhancing radiology reporting with integrated speech recognition in multimodal large language models

Eur Radiol. 2025 Aug 15. doi: 10.1007/s00330-025-11929-y. Online ahead of print.

ABSTRACT

OBJECTIVES: This study evaluates the efficiency, accuracy, and cost-effectiveness of radiology reporting using audio multimodal large language models (LLMs) compared to conventional reporting with speech recognition software. We hypothesized that providing minimal audio input would enable a multimodal LLM to generate complete radiological reports.

MATERIALS AND METHODS: A total of 480 reports were generated from 80 retrospective multimodal imaging studies by two board-certified radiologists using three workflows: a conventional workflow (C-WF), in which speech recognition software was used to generate findings and impressions separately, and LLM-based workflows (LLM-WF) using the state-of-the-art LLMs GPT-4o and Claude Sonnet 3.5. Outcome measures included reporting time, corrections, and personnel cost per report. Two radiologists assessed formal structure and report quality. Statistical analysis used ANOVA and Tukey’s post hoc tests (p < 0.05).

RESULTS: LLM-WF significantly reduced reporting time (GPT-4o/Sonnet 3.5: 38.9 s ± 22.7 s vs. C-WF: 88.0 s ± 60.9 s, p < 0.01), required fewer corrections (GPT-4o: 1.0 ± 1.1, Sonnet 3.5: 0.9 ± 1.0 vs. C-WF: 2.4 ± 2.5, p < 0.01), and lowered costs (GPT-4o: $2.3 ± $1.4, Sonnet 3.5: $2.4 ± $1.4 vs. C-WF: $3.0 ± $2.1, p < 0.01). Reports generated with Sonnet 3.5 were rated highest in quality, while GPT-4o and conventional reports showed no difference.

CONCLUSION: Multimodal LLMs can generate high-quality radiology reports based solely on minimal audio input, with greater speed, fewer corrections, and reduced costs compared to conventional speech-based workflows. However, future implementation may involve licensing costs, and generalizability to broader clinical contexts warrants further evaluation.

KEY POINTS: Question: Comparing time, accuracy, cost, and report quality of reporting using audio input functionality of GPT-4o and Claude Sonnet 3.5 to conventional reporting with speech recognition. Findings: Large language models enable radiological reporting via minimal audio input, reducing turnaround time and costs without quality loss compared to conventional reporting with speech recognition. Clinical relevance: Large language model-based reporting from minimal audio input has the potential to improve efficiency and report quality, supporting more streamlined workflows in clinical radiology.

PMID:40815310 | DOI:10.1007/s00330-025-11929-y

How best to combine DWI and T2WI to predict pathologic complete response: a multi-center study on interpreting MRI following chemoradiotherapy of rectal cancer

Eur Radiol. 2025 Aug 15. doi: 10.1007/s00330-025-11927-0. Online ahead of print.

ABSTRACT

OBJECTIVES: To explore the different criteria of integrating diffusion-weighted imaging (DWI) for predicting pathologic complete response (pCR) of rectal cancer on post-chemoradiotherapy (CRT) MRI.

MATERIALS AND METHODS: In this multi-center retrospective study, five radiologists reviewed pre- and post-CRT MRIs of patients with rectal cancer diagnosed in 2017-2021. In addition to mrTRG, three criteria were assessed: “AND” criterion (mrTRG 1-2 and absence of DWI restriction considered as CR), “OR” criterion (mrTRG 1-2 or absence of restriction), and a modified MR tumor regression grade (modMR-TRG). A crossed random effects model was used to pool sensitivity and specificity across five radiologists. F1 score and positive predictive value (PPV) were analyzed across varying pCR rates.

RESULTS: In 146 patients (median age [IQR], 63 [57-70] years; 87 men), the AND criterion yielded higher specificity (77.4% [63.3-80.0%] vs 75.3% [60.5-79.0%], p = 0.001) without significant difference in sensitivity (63.9% [42.8-75.3%] vs 67.5% [45.3-76.0%], p = 0.063) compared with mrTRG. OR criterion yielded higher sensitivity (86.1% [65.3-89.3%]; p < 0.001) but lower specificity (49.5% [36.2-62.6%]; p < 0.001). The modMR-TRG demonstrated similar effects to the OR criterion. Assuming a 20% pCR rate, PPV and F1 score of the AND criterion (point estimate of 41.4% and 50.3%, respectively) were higher than those of the OR criterion (PPV, 29.9%; F1 score, 44.4%), although the difference diminished with increasing pCR rate.
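The dependence of PPV and F1 score on the assumed pCR rate follows from Bayes' rule applied to the pooled sensitivity and specificity. The sketch below is illustrative only and is not the authors' analysis code: the function names are hypothetical, and it uses only the pooled point estimates quoted in the abstract (AND: 63.9% sensitivity, 77.4% specificity; OR: 86.1% sensitivity, 49.5% specificity).

```python
# Illustrative sketch: PPV and F1 as functions of the assumed pCR rate.
# Function names are hypothetical; inputs are the pooled point estimates
# quoted in the abstract, not patient-level data.

def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def f1_score(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    precision = ppv(sensitivity, specificity, prevalence)
    return 2.0 * precision * sensitivity / (precision + sensitivity)


# (sensitivity, specificity) pairs taken from the abstract.
CRITERIA = {"AND": (0.639, 0.774), "OR": (0.861, 0.495)}


def summarize(prevalence: float) -> dict:
    """Return {criterion: (PPV %, F1 %)} rounded to one decimal place."""
    return {
        name: (
            round(100.0 * ppv(se, sp, prevalence), 1),
            round(100.0 * f1_score(se, sp, prevalence), 1),
        )
        for name, (se, sp) in CRITERIA.items()
    }


if __name__ == "__main__":
    for rate in (0.10, 0.20, 0.30, 0.40):
        print(f"pCR rate {rate:.0%}: {summarize(rate)}")
```

At an assumed pCR rate of 20%, this reproduces the point estimates reported above (AND: PPV 41.4%, F1 50.3%; OR: PPV 29.9%, F1 44.4%).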

CONCLUSION: The AND criterion, which uses DWI complementarily to further exclude patients with residual tumors after initial screening on T2WI, should be preferred over other criteria that place greater emphasis on DWI.

KEY POINTS: Question: How should diffusion-weighted images be combined with T2-weighted images in predicting complete tumor response of rectal cancer on MRI following CRT? Findings: Compared to mrTRG, the AND criterion yielded higher specificity without a significant difference in sensitivity, while the OR and modMR-TRG criteria yielded higher sensitivity but lower specificity. Clinical relevance: Our study explores practical strategies for integrating DWI with T2WI that can be applied in daily practice. The AND criterion, which uses DWI conservatively, is preferred over the OR criterion, which yields disproportionately more additional false-positives than additional true-positives.

PMID:40815309 | DOI:10.1007/s00330-025-11927-0

Integrating Gender-Based Violence Services Into HIV Care: Insights From Malawi

Glob Health Sci Pract. 2025 Aug 14;13(1):e2400177. doi: 10.9745/GHSP-D-24-00177. Print 2025 Aug 14.

ABSTRACT

INTRODUCTION: Gender-based violence (GBV) not only poses significant public health and human rights challenges but is also closely associated with HIV. GBV acts as a barrier to HIV prevention, testing, and treatment adherence, and fear of GBV inhibits disclosure of HIV status to sexual partners. In Malawi, where the prevalence of both GBV and HIV is high, integrating GBV services into HIV care is crucial. We describe the integration of GBV services into Lighthouse Trust’s HIV testing and treatment clinics in Malawi, including screening, documentation, intervention implementation, outcomes, and lessons learned.

METHODS: We conducted a retrospective analysis from January 2020 to June 2024. Data on cases identified, post-GBV services, and perpetrator demographics were collected from the GBV register. We used descriptive statistics to describe the intervention outcomes.

RESULTS: We documented 9,045 reported GBV cases among males and females from January 2020 to June 2024. Adolescent girls aged 10-19 years constituted a significant proportion of survivors. Psychosocial services were the most common type of service that was offered to GBV survivors (25%), followed by HIV testing (19%) and sexually transmitted infection screening (18%). Perpetrators were mostly known to survivors.

CONCLUSION: We successfully integrated GBV services into the Lighthouse Trust HIV clinics in close collaboration with the one-stop centers in Malawi. Training health care providers enhanced support for GBV survivors, with a focus on increasing awareness, especially for children and adolescents. Recommended actions include improving access to GBV services, enhancing documentation, and promoting multi-sectoral collaboration to ensure comprehensive care aimed at creating a safer, more dignified health care environment for all, particularly GBV survivors.

PMID:40813243 | DOI:10.9745/GHSP-D-24-00177

Clinical Validation of Deep Learning for Image Restoration of Ultra-Low-Count [18F]FDG PET for Dementia Diagnostics

J Nucl Med. 2025 Aug 14:jnumed.124.269234. doi: 10.2967/jnumed.124.269234. Online ahead of print.

ABSTRACT

Deep learning (DL) represents a promising technique for image restoration. We explored its ability to restore ultra-low-count [18F]FDG PET studies of the brain in subjects with dementia and in healthy subjects to allow for reduced scan durations or administered activities without compromising diagnostic performance.

Methods: Various DL models using the content aware image restoration approach of CSBDeep toolbox (3D U-nets) were trained with subvolumes of 1,000 random subjects. On the basis of 10-min list-mode PET data after injection of 208 ± 10 MBq of [18F]FDG, we reconstructed reduced scan durations of 2 min, 1 min, 30 s, 20 s, and 10 s. The resulting models were applied to [18F]FDG PET scans of subjects with Alzheimer disease (n = 15), frontotemporal dementia (n = 14), and healthy controls (n = 13). We explored the effect of reduced scan times on individual regional measures in diagnostically relevant regions and on voxel-based group contrasts. Three independent readers rated all datasets with regard to assessability, diagnosis, and diagnostic confidence.

Results: Individual mean regional [18F]FDG uptake remained largely unchanged. The SD strongly increased with shorter scan duration without application of DL (mean increase ≤ 48%), whereas it slightly decreased with DL (≥-7%). In group contrasts, the number of significant voxels strongly decreased with shorter scan time without DL (≥-41%), which was partially offset by DL (≥-27%). On visual reads, the fraction of assessable images steeply fell to only 4% (10-s scan) for scan durations below 2 min without DL, whereas every single image restored with DL was assessable. The diagnostic confidence continuously declined with shorter scan durations without DL, whereas diagnostic confidence only negligibly changed with DL (intermediate-to-high confidence ratings: 0%-54% vs. 80%-84%; 83% for the 10-min scan). The diagnostic accuracy of PET reads dropped from 90% to 4% without and remained high with DL (90%-93%; 90% for the 10-min scan).

Conclusion: Our study demonstrates the compelling performance of DL to restore cerebral [18F]FDG PET datasets with ultra-low-count statistics for quantitative regional, voxel-based group, and clinical visual analyses. Consequently, DL enables a dramatic reduction of scan durations or administered activities (e.g., 10-min scan with 3.5 MBq, equivalent to ∼60 µSv) for [18F]FDG PET in patients with dementia and possibly other indications.
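The equivalence between a shortened scan and a reduced administered activity can be checked with simple arithmetic. The sketch below is a rough illustration, not the authors' method: it assumes detected counts scale linearly with the product of activity and scan duration, and it ignores radioactive decay during acquisition and count-rate effects. The function name is hypothetical.

```python
# Rough sketch, assuming counts are proportional to activity x scan duration
# (decay during the acquisition and count-rate effects are ignored).

def count_equivalent_activity(injected_mbq: float,
                              full_scan_s: float,
                              short_scan_s: float) -> float:
    """Activity that would give, over the full scan duration, about the same
    number of counts as the injected activity gives over the short scan."""
    return injected_mbq * short_scan_s / full_scan_s


if __name__ == "__main__":
    injected = 208.0   # MBq, mean injected activity from the abstract
    full_scan = 600.0  # s, the 10-min reference acquisition
    for short in (120.0, 60.0, 30.0, 20.0, 10.0):
        activity = count_equivalent_activity(injected, full_scan, short)
        print(f"{short:5.0f} s scan ~ 10-min scan with {activity:.1f} MBq")
```

Under these assumptions, the 10-s scan corresponds to roughly 3.5 MBq over 10 min, in line with the example quoted in the conclusion.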

PMID:40813236 | DOI:10.2967/jnumed.124.269234

Lymphoma Therapy Response Assessment with Low-Dose [18F]FDG Total-Body PET/CT

J Nucl Med. 2025 Aug 14:jnumed.124.268841. doi: 10.2967/jnumed.124.268841. Online ahead of print.

ABSTRACT

The improved sensitivity of total-body (TB) PET/CT offers the possibility of reducing injected activities. The aim of our study was to define a lower limit of reduced injected activities in [18F]FDG TB PET/CT for interim and end-of-treatment assessment of patients with lymphoma at 2 acquisition times.

Methods: Twenty-four consecutive patients with lymphoma who were undergoing interim and end-of-treatment TB PET/CT were prospectively enrolled in this study. An [18F]FDG activity of 3.0 MBq/kg served as the reference standard (RS). Images simulating low doses of 1.0, 0.5, 0.25, and 0.125 MBq/kg were reconstructed at 1 and 2 h after injection. The coefficient of variation of the liver was assessed. Lymphoma lesions were segmented and semiquantitatively compared with the RS using the SUV. Additionally, metabolic tumor volume (MTV) for each lesion, patient-based total MTV, and total-lesion glycolysis (TLG) were analyzed. Semiquantitative parameters were normalized to the liver and blood pool by tumor-to-background ratios (TBRs) and contrast-to-noise ratios. Therapy response was assessed using Deauville criteria.

Results: Overall, 191 lymphoma lesions were analyzed. SUVmax demonstrated a trend toward a statistically significant increase in scans with reduced activity at 1 h after injection (6.28 ± 5.87 for RS vs. 7.76 ± 6.69 for 0.125 MBq/kg; P = 0.07) and 2 h after injection (7.14 ± 7.16 for RS vs. 8.67 ± 7.62 for 0.125 MBq/kg; P = 0.13). SUVpeak, SUVmean, MTV, and TLG did not significantly differ between the reduced injected activities and the RS. The coefficient of variation for the liver increased significantly with decreasing injected activities (P < 0.01). The TBR for the liver did not differ significantly, whereas the TBR for the blood pool was significantly higher only for the lowest injected activity (P < 0.01) at 2 h after injection. The contrast-to-noise ratio significantly decreased with reduced activities. Deauville scores did not differ significantly, up to a dose of 0.25 MBq/kg at 1 h after injection and a dose of 1.0 MBq/kg at 2 h after injection. Below this limit, we noted significantly lower Deauville scores for reduced injected activities (P < 0.01).

Conclusion: Reduction of injected activities with [18F]FDG TB PET/CT for therapy response assessment in patients with lymphoma may be possible and does not result in significant differences in MTV, TBR, or TLG. SUVmax and Deauville scores were comparable to the RS to a lower limit of 0.25 MBq/kg at 1 h after injection and 1.0 MBq/kg at 2 h after injection.
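To give a feel for why the liver coefficient of variation rises as the simulated activity falls, the sketch below relates each simulated activity to the 3.0 MBq/kg reference. It rests on assumptions not stated in the abstract: that a lower activity is emulated by keeping a proportional fraction of the detected counts, and that image noise follows pure Poisson counting statistics, so relative noise grows with the inverse square root of that fraction. Reconstruction and filtering effects are ignored, and the function names are hypothetical.

```python
import math

# Sketch under stated assumptions: reduced activity is emulated by keeping a
# proportional fraction of counts, and noise is purely Poisson. This is an
# idealized scaling, not the study's simulation procedure.

REFERENCE_MBQ_PER_KG = 3.0  # reference standard activity from the abstract


def count_fraction(target_mbq_per_kg: float,
                   reference_mbq_per_kg: float = REFERENCE_MBQ_PER_KG) -> float:
    """Fraction of reference counts corresponding to the target activity."""
    return target_mbq_per_kg / reference_mbq_per_kg


def poisson_noise_factor(target_mbq_per_kg: float,
                         reference_mbq_per_kg: float = REFERENCE_MBQ_PER_KG) -> float:
    """Expected multiplicative increase in relative noise (e.g. CV)."""
    return math.sqrt(reference_mbq_per_kg / target_mbq_per_kg)


if __name__ == "__main__":
    for activity in (1.0, 0.5, 0.25, 0.125):
        print(f"{activity:5.3f} MBq/kg: "
              f"{count_fraction(activity):.1%} of counts, "
              f"noise x{poisson_noise_factor(activity):.2f}")
```

Under this idealized model the lowest simulated activity keeps about 4% of the reference counts; measured coefficients of variation will differ because reconstruction settings also shape image noise.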

PMID:40813234 | DOI:10.2967/jnumed.124.268841

Three-dimensional cephalometric analysis of morphological characteristics in children with bilateral craniofacial microsomia

J Craniomaxillofac Surg. 2025 Aug 13:S1010-5182(25)00247-1. doi: 10.1016/j.jcms.2025.07.024. Online ahead of print.

ABSTRACT

PURPOSE: Craniofacial microsomia (CFM), the second most common congenital craniofacial anomaly, is poorly characterized in bilateral cases because conventional cephalometry cannot accurately assess facial asymmetry. This study aims to characterize craniofacial morphology in children with bilateral CFM using three-dimensional (3D) cephalometric analysis.

MATERIALS AND METHODS: A retrospective 3D cephalometric analysis was conducted on 8 bilateral CFM patients and 10 age-/sex-matched normal patients as controls. A coordinate system was established with three reference planes: the Frankfurt Horizontal Plane (FHP), the Midsagittal Plane (MSP) and the Nasion Perpendicular Plane (CP). Fifteen linear and angular measurements assessed maxillary, mandibular, chin, and occlusal parameters. Subgroups were stratified by bilateral mandibular deficiency severity according to Pruzansky-Kaban classification (Group A: similar; Group B: different). Statistical comparisons utilized independent t-tests (CFM vs. controls) and Mann-Whitney U tests (subgroups), with Pearson’s correlation analysis exploring variable relationships.

RESULTS: Bilateral CFM patients exhibited significant reductions in ramal height (Co-Go: p < 0.001) and mandibular body length (Go-Me: p < 0.001), a posteriorly inclined occlusal plane (OP-FHP: 27.63° ± 5.50° vs. 8.13° ± 3.33°, p < 0.001), and pronounced chin retrusion (Me-NP: 46.08 ± 6.66 mm vs. 8.28 ± 7.71 mm, p < 0.001) and lateral deviation (Me-MSP: 5.84 ± 4.64 mm vs. 1.73 ± 0.93 mm, p < 0.05). Me-NP and Me-MSP differed significantly in subgroup analyses. Pearson correlation analysis revealed strong associations between Me-NP and Me-MSP and OP-MSP, posterior maxillary height (U6-FHP) and Co-Go.

CONCLUSION: Bilateral CFM is mainly characterized by a posteriorly inclined occlusal plane and pronounced mandibular retrognathia. The occlusal plane and chin will consistently deviate toward the more severely affected side. When bilateral mandibular involvement is similar in extent, chin deviation tends to be mild, resulting in less severe facial asymmetry.

PMID:40813223 | DOI:10.1016/j.jcms.2025.07.024

Structured clinical evaluation for rapid identification of temporomandibular joint closed lock

Int J Oral Maxillofac Surg. 2025 Aug 13:S0901-5027(25)01398-0. doi: 10.1016/j.ijom.2025.07.009. Online ahead of print.

ABSTRACT

Temporomandibular joint (TMJ) closed lock, corresponding to Wilkes stage 3 internal derangement, is a common cause of restricted mouth opening and functional impairment. Early diagnosis is essential but challenging, particularly when imaging is unavailable. This study analyzed 40 consecutive patients diagnosed with TMJ closed lock. Patients were routinely evaluated using a standardized 11-test assessment format as part of regular practice, and these records were retrospectively analyzed. Statistical analysis compared findings between the affected and non-affected joints and the muscles of mastication. The most frequent clinical findings included tenderness in the affected joint on passive stretch (75%), on palpation (67.5%), on contralateral movement (57.5%), and on contralateral loading (47.5%). On average, the affected joint was tender in 4.1 of 11 tests, while the contralateral joint was positive in only 0.2 tests (P < 0.001). Passive stretch increased mouth opening from 27.1 mm to 31.7 mm, with a hard end feel observed in 85% of patients. Muscular tenderness was observed in 32.5% of patients, most commonly in the ipsilateral masseter (25%) and temporalis (12.5%). These findings support structured clinical evaluation for early recognition of TMJ closed lock, improving diagnostic accuracy and enabling timely intervention.

PMID:40813221 | DOI:10.1016/j.ijom.2025.07.009

Prevertebral Hematoma: A Potential Biomarker for the Severity of Upper Cervical Spine Trauma and a Predictor for the Need for Surgical Intervention

AJNR Am J Neuroradiol. 2025 Aug 14. doi: 10.3174/ajnr.A8849. Online ahead of print.

ABSTRACT

BACKGROUND AND PURPOSE: Upper cervical spine trauma (UCST) can lead to severe morbidity and mortality, particularly when associated with craniocervical junction (CCJ) injuries. Previous studies suggest that identifying prevertebral hematomas in patients with cervical trauma may have clinical significance. However, the association between prevertebral hematomas and clinical outcomes in patients with UCST has not yet been firmly established. The purpose of this study is to investigate the association of prevertebral hematomas with the severity of upper cervical spine injury and with clinical outcomes in patients with UCST.

MATERIALS AND METHODS: This retrospective study analyzed patients with UCST admitted to a level I trauma center in a 3-year period. Inclusion criteria included cervical spine trauma confirmed via imaging and MRI performed within 7 days of admission. Prevertebral hematomas were assessed for size and location and correlated with injury patterns and clinical outcomes. Statistical analysis was performed by using χ2 tests and logistic regression models.

RESULTS: One hundred sixty-five patients (mean age, 40.6 ± 19.9 years; 103 men) were evaluated. Prevertebral hematomas were identified in 88 of 165 patients (53.3%). Hematomas were significantly associated with CCJ dislocations (17/23 patients; 73.9%, P = .03) and subaxial disco-ligamentous injuries (48/71 patients; 67.6%, P = .001). Surgical intervention was more likely in patients with prevertebral hematomas (39/60 patients; 65%, P = .02), with an OR of 2.08 (95% CI, 1.08-4.01). While 75 of 148 patients with neurologic disability at discharge had prevertebral hematomas (50.6%, P = .06), this association did not reach statistical significance.
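For readers who want to see how an odds ratio of this size arises, the sketch below computes a crude odds ratio with a Wald 95% confidence interval from a 2x2 table. The table is an assumption inferred from the abstract: 39 of 60 surgical patients had a hematoma, and since 88 of 165 patients had a hematoma overall, 49 of the 105 non-surgical patients would have had one. The reported value came from the study's own logistic regression analysis, so this crude estimate is close to it but not identical. The function name is hypothetical.

```python
import math

# Sketch: crude odds ratio and Wald 95% CI from a 2x2 table. The cell counts
# are inferred from the abstract's totals and are an assumption, not data
# supplied by the authors.

def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96):
    """a, b: surgery with / without hematoma; c, d: no surgery with / without.
    Returns (odds ratio, lower bound, upper bound)."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # 39/60 surgical and (88 - 39)/(165 - 60) non-surgical patients with hematoma.
    estimate, lower, upper = odds_ratio_wald(39, 21, 49, 56)
    print(f"crude OR = {estimate:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

The crude estimate lands slightly above the reported OR of 2.08 (95% CI, 1.08-4.01), which is consistent with the reported figure being model-based.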

CONCLUSIONS: Prevertebral hematomas were significantly associated with CCJ dislocations, subaxial disco-ligamentous injuries, and an increased likelihood of surgical intervention in patients with UCST. These findings suggest that prevertebral hematomas can serve as a useful marker for identifying patients with more severe injury patterns.

PMID:40813212 | DOI:10.3174/ajnr.A8849

Compressed Sensing Technology Accelerated 3D-FLAIR MRI Sequence for Endolymphatic Hydrops at 3T

AJNR Am J Neuroradiol. 2025 Aug 14. doi: 10.3174/ajnr.A8864. Online ahead of print.

ABSTRACT

BACKGROUND AND PURPOSE: The 3D-FLAIR sequence contributes substantially to the depiction of endolymphatic hydrops (EH) in Meniere disease (MD), but its clinical application is limited by its long acquisition time. We investigated whether 3D-FLAIR combined with compressed sensing (CS) technology (3D-FLAIR-CS) can shorten the scan time while maintaining image quality and diagnostic efficiency for EH.

MATERIALS AND METHODS: This prospective study included 50 patients with unilateral definite MD who underwent 3T MR imaging 4 hours after gadolinium injection using traditional 3D-FLAIR (10 minutes 35 seconds) and 3D-FLAIR-CS (5 minutes 25 seconds). Image quality was assessed using quantitative (the contrast-to-noise ratio [CNR], SNR, and signal intensity ratio [SIR]) and qualitative methods. The chi-square test compared the diagnostic efficacy of the sequences, paired t tests analyzed quantitative differences, and intra-/interobserver agreement was evaluated using the weighted kappa statistic.

RESULTS: Among 50 patients (23 men, 27 women; 27 left ears, 23 right ears), no significant differences were found between the 2 sequences in image quality or diagnosing EH (P >.05). There were no statistically significant differences in CNR (affected side: P = .09; asymptomatic side: P = .07), SNR (affected side: P = .12; asymptomatic side: P = .10), and SIR (affected side: P = .13; asymptomatic side: P = .45) between traditional 3D-FLAIR and 3D-FLAIR-CS, and both sequences exhibited excellent intra- and interobserver agreement (kappa >0.80).

CONCLUSIONS: Acquisition time for the 3D-FLAIR-CS sequence is reduced by a factor of about 2 compared to traditional 3D-FLAIR, while image quality and diagnostic efficacy in the assessment of EH are the same.

PMID:40813211 | DOI:10.3174/ajnr.A8864

Outcomes and recurrence patterns following resection of T1 ampullary carcinomas: single centre experience of 92 cases

HPB (Oxford). 2025 Jul 12:S1365-182X(25)00664-1. doi: 10.1016/j.hpb.2025.07.005. Online ahead of print.

ABSTRACT

BACKGROUND: Resected early-stage (T1) ampullary carcinomas (ACs) have the best overall survival (OS) but can have higher postoperative morbidity compared to higher-stage ACs and other periampullary cancers.

METHOD: A retrospective analysis of resected T1-ACs at Tata Memorial Centre, Mumbai, from January 2012 to December 2022 was performed. Perioperative and long-term outcomes were assessed.

RESULTS: A total of 92 patients underwent resection for T1-ACs, with a significant morbidity rate (Clavien-Dindo ≥3) of 38%, and a clinically relevant postoperative pancreatic fistula rate of 22.5%. The node positivity rate in resected T1-ACs was 25%. The 3- and 5-year OS rates were 77.9% and 74.5%, while recurrence-free survival (RFS) rates were 81.8% and 78.4%, respectively. There were 18 (19.6%) recurrences (2 local, 16 distant) during a median follow-up of 66.7 months. The 3- and 5-year OS rates after recurrence were 29.6% and 14.8%, respectively. Lymph node metastasis was the sole significant factor affecting OS (HR 2.815, 95% CI: 1.114-7.112, p = 0.029); its association with RFS did not reach statistical significance (HR 2.54, 95% CI: 0.978-6.595, p = 0.056).

CONCLUSION: T1-ACs have excellent survival after resection; however, about 20% of patients develop recurrence. Lymph node metastasis remains the most important factor affecting long-term survival.

PMID:40813200 | DOI:10.1016/j.hpb.2025.07.005