Categories
Nevin Manimala Statistics

ESR Essentials: common performance metrics in AI-practice recommendations by the European Society of Medical Imaging Informatics

Eur Radiol. 2025 Aug 3. doi: 10.1007/s00330-025-11890-w. Online ahead of print.

ABSTRACT

This article provides radiologists with practical recommendations for evaluating AI performance in radiology, ensuring alignment with clinical goals and patient safety. It outlines key performance metrics, including overlap metrics for segmentation, test-based metrics (e.g., sensitivity, specificity, and area under the receiver operating characteristic curve), and outcome-based metrics (e.g., precision, negative predictive value, F1 score, Matthews correlation coefficient, and area under the precision-recall curve). Key recommendations emphasize local validation using independent datasets, selecting task-specific metrics, and considering deployment context to ensure real-world performance matches claimed efficacy. Common pitfalls, such as overreliance on a single metric, misinterpretation in low-prevalence settings, and failure to account for clinical workflow, are addressed with mitigation strategies. Additional guidance is provided on threshold selection, prevalence-adjusted evaluation, and AI-generated image quality assessment. This guide equips radiologists to critically evaluate both commercially available and in-house developed AI tools, ensuring their safe and effective integration into clinical practice.

CLINICAL RELEVANCE STATEMENT: This review provides guidance on selecting and interpreting AI performance metrics in radiology, ensuring clinically meaningful evaluation and safe deployment of AI tools. By addressing common pitfalls and promoting standardized reporting, it supports radiologists in making informed decisions, ultimately improving diagnostic accuracy and patient outcomes.

KEY POINTS: Radiologists must evaluate performance metrics as they reflect acceptable performance in specific datasets but do not guarantee clinical utility. Independent evaluation tailored to the clinical setting is essential. Performance metrics must align with the intended task of the AI application (segmentation, detection, or classification) and be selected based on domain knowledge and clinical context. Sensitivity, specificity, area under the ROC curve, and accuracy must be interpreted with prevalence-dependent metrics (e.g., precision, F1 score, and Matthews correlation coefficient) calculated for the target population to ensure safe and effective clinical use.
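The point about prevalence-dependent metrics can be illustrated with a minimal sketch. It is not taken from the article: the function name and the example sensitivity, specificity, and prevalence values are hypothetical, chosen only to show how precision, negative predictive value, F1 score, and Matthews correlation coefficient shift with prevalence while sensitivity and specificity stay fixed.

```python
import math

def prevalence_adjusted_metrics(sensitivity, specificity, prevalence):
    """Derive prevalence-dependent metrics from sensitivity and specificity.

    Works on expected confusion-matrix proportions for a target population
    with the given disease prevalence (hypothetical helper, not from the paper).
    """
    tp = sensitivity * prevalence                  # true-positive fraction
    fn = (1 - sensitivity) * prevalence            # false-negative fraction
    tn = specificity * (1 - prevalence)            # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)      # false-positive fraction

    ppv = tp / (tp + fp) if (tp + fp) > 0 else float("nan")  # precision
    npv = tn / (tn + fn) if (tn + fn) > 0 else float("nan")
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den > 0 else float("nan")
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den > 0 else float("nan")
    return {"ppv": ppv, "npv": npv, "f1": f1, "mcc": mcc}

# Hypothetical tool with 90% sensitivity and 90% specificity,
# evaluated at a balanced and at a low (1%) prevalence.
balanced = prevalence_adjusted_metrics(0.9, 0.9, 0.5)
screening = prevalence_adjusted_metrics(0.9, 0.9, 0.01)
print(balanced)
print(screening)
```

With these assumed inputs, precision falls from 0.9 in the balanced setting to roughly 0.08 at 1% prevalence, which is the kind of low-prevalence misinterpretation the abstract warns about.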

PMID:40753524 | DOI:10.1007/s00330-025-11890-w

Prospective effect of mesenchymal exosomes versus nanocurcumin-loaded mesenchymal exosomes on induced periodontitis in albino rats

Odontology. 2025 Aug 3. doi: 10.1007/s10266-025-01161-x. Online ahead of print.

ABSTRACT

This study provides insights into the effect of mesenchymal exosomes and nanocurcumin (NCUR)-loaded exosomes on periodontitis. To induce periodontitis, 42 rats were injected with 3 µL of 10 mg/mL lipopolysaccharide (LPS) for 4 weeks and divided into 3 categories: untreated periodontitis, exosome-treated (single dose of 200 µg exosomes), and NCUR-loaded exosome-treated (200 µg exosomes loaded with 200 µg NCUR). In addition, 14 rats were injected with PBS to serve as controls. Rats were sacrificed between 2 and 4 weeks. Rats were allocated to 7 groups: Group I (Control), Group II (Periodontitis 2 weeks), Group III (Periodontitis + exosomes 2 weeks), Group IV (Periodontitis + loaded exosomes 2 weeks), Group V (Periodontitis 4 weeks), Group VI (Periodontitis + exosomes 4 weeks), and Group VII (Periodontitis + loaded exosomes 4 weeks). The specimens were prepared for histological, histochemical, and ELISA analysis. Histological examination of Group I showed normal structure of the periodontium. Groups II and V showed a reduction in periodontal ligament and variable staining of cementum and bone. Group III revealed disordered periodontal fibers and irregular outlines of cementum and bone. The periodontal fibers in Groups IV and VII were obliquely oriented, and the cementum and bone were consistently stained. Group VI showed a dense periodontal ligament, and the cementum and alveolar bone showed regular outlines. Statistical analysis of Masson’s trichrome staining showed that Group VI had the highest mean, and Group V had the highest mean IL-1β. NCUR-loaded exosomes were found to be more effective in decreasing inflammation and stimulating tissue regeneration in experimental periodontitis.

PMID:40753523 | DOI:10.1007/s10266-025-01161-x

Comparison of surgical site infection in pediatric patients using NSQIP-P data during COVID-19 pandemic and non-pandemic periods

Pediatr Surg Int. 2025 Aug 3;41(1):241. doi: 10.1007/s00383-025-06147-y.

ABSTRACT

INTRODUCTION: Surgical site infections (SSIs) remain a significant post-operative complication, with rates varying across populations. The COVID-19 pandemic led to heightened infection control measures, which were expected to lower SSI rates. However, existing studies mainly focus on adult populations, leaving a gap in understanding the pandemic’s impact on pediatric surgeries.

METHODS: We used the National Surgical Quality Improvement Program in Pediatric Surgery (NSQIP-P) database to analyze SSI rates and lengths of stay (LOS) for pediatric patients from 2018 to 2021. We compared data from pre-pandemic (2018-2019) and pandemic (2020-2021) periods, adjusting for confounding variables, such as patient demographics, comorbidities, and surgical specialties.

RESULTS: Among 472,581 cases analyzed, SSI rates increased from 2.5% pre-pandemic to 2.88% during the pandemic. While the percentage of patients with LOS exceeding 2 days slightly decreased, SSI rates for those with prolonged LOS increased, highlighting a strong association between extended hospitalization and SSI risk. Pediatric Otolaryngology had the highest adjusted odds ratio (OR) for SSI (1.393), while pediatric surgery had the lowest (1.097).
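As a rough illustration of how the two overall SSI rates relate to an odds ratio, the sketch below computes a crude (unadjusted) odds ratio from the reported 2.5% and 2.88%. This is an assumption-laden back-of-envelope check, not the study's method: the adjusted odds ratios reported by specialty come from models with covariates that are not reproduced here, and the function name is hypothetical.

```python
def crude_odds_ratio(rate_exposed, rate_reference):
    """Unadjusted odds ratio from two event proportions (hypothetical helper)."""
    odds_exposed = rate_exposed / (1 - rate_exposed)
    odds_reference = rate_reference / (1 - rate_reference)
    return odds_exposed / odds_reference

# Overall SSI rates quoted in the abstract: pandemic vs. pre-pandemic.
pandemic_rate = 0.0288
pre_pandemic_rate = 0.025

or_crude = crude_odds_ratio(pandemic_rate, pre_pandemic_rate)
print(f"Crude OR (pandemic vs. pre-pandemic): {or_crude:.3f}")
```

Because SSIs are rare in both periods, the crude odds ratio is close to the simple ratio of the two rates.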

DISCUSSION: Despite enhanced infection control protocols, SSI rates in pediatric surgeries increased during the COVID-19 pandemic. These findings emphasize that infection control measures alone may have been insufficient to mitigate SSIs in pediatric populations, even with efforts to reduce LOS. Further research is needed to explore the pandemic’s broader impact on pediatric surgical outcomes and the relationship between LOS and SSIs.

LEVELS OF EVIDENCE: Observational study, Level III.

PMID:40753520 | DOI:10.1007/s00383-025-06147-y

Synergistic Effects of Body Mass Index in Early Adulthood and Recent Weight Gain in Reducing Mortality Risk Among Cancer Survivors

Nutr Cancer. 2025 Aug 3:1-9. doi: 10.1080/01635581.2025.2538266. Online ahead of print.

ABSTRACT

BACKGROUND: Cancer survivors face an elevated risk of mortality, and changes in body mass index (BMI) may play a critical prognostic role. This study examined BMI variations during early adulthood and recent years in relation to cancer-specific mortality and all-cause mortality.

METHODS: Data were drawn from the National Health and Nutrition Examination Survey. Statistical models were applied to evaluate associations, dose-response relationships, and threshold effects.

RESULTS: Among 2,024 cancer survivors, recent BMI increases were significantly associated with reduced cancer and all-cause mortality, whereas earlier BMI changes showed weaker associations. Compared with those in the lowest tertile, those with greater recent BMI increases had a 24%-44% lower risk of cancer mortality (P for trend = 0.016) and a 34%-45% lower risk of all-cause mortality (P for trend < 0.001). A non-linear association was identified, with a 5% BMI increase as the threshold; each 1% gain below this threshold was linked to a 4% mortality risk reduction (p < 0.001). Joint analysis revealed that a high early BMI combined with a ≥ 5% recent BMI increase significantly reduced mortality risk.

CONCLUSIONS: Moderate recent weight gain may improve survival among cancer survivors, underscoring the importance of individualized weight management strategies.

PMID:40753512 | DOI:10.1080/01635581.2025.2538266

Modeling the Impact of Multicancer Early Detection Tests: A Review of Natural History of Disease Models

Med Decis Making. 2025 Aug 3:272989X251351639. doi: 10.1177/0272989X251351639. Online ahead of print.

ABSTRACT

INTRODUCTION: The potential for multicancer early detection (MCED) tests to detect cancer at earlier stages is currently being evaluated in screening clinical trials. Once trial evidence becomes available, modeling will be necessary to predict the effects on final outcomes (benefits and harms), account for heterogeneity in determining clinical and cost-effectiveness, and explore alternative screening program specifications. The natural history of disease (NHD) component will use statistical, mathematical, or calibration methods. This work aims to identify, review, and critically appraise the existing literature for alternative modeling approaches proposed for MCED that include an NHD component.

METHODS: Modeling approaches for MCED screening that include an NHD component were identified from the literature, reviewed, and critically appraised. Purposively selected (non-MCED) cancer-screening models were also reviewed. The appraisal focused on the scope, data sources, evaluation approaches, and the structure and parameterization of the models.

RESULTS: Five different MCED models incorporating an NHD component were identified and reviewed, alongside 4 additional (non-MCED) models. The critical appraisal highlighted several features of this literature. In the absence of trial evidence, MCED effects are based on predictions derived from test accuracy. These predictions rely on simplifying assumptions with unknown impacts, such as the stage-shift assumption used to estimate mortality impacts from predicted stage shifts. None of the MCED models fully characterized uncertainty in the NHD or examined uncertainty in the stage-shift assumption.

CONCLUSION: There is currently no modeling approach for MCEDs that can integrate clinical study evidence. In support of policy, it is important that efforts are made to develop models that make the best use of data from the large and costly clinical studies being designed and implemented across the globe.

HIGHLIGHTS: In the absence of trial evidence, published estimates of the effects of multicancer early detection (MCED) tests are based on predictions derived from test accuracy. These predictions rely on simplifying assumptions, such as the stage-shift assumption used to estimate mortality effects from predicted stage shifts. The effects of such simplifying assumptions are mostly unknown. None of the existing MCED models fully characterize uncertainty in the natural history of disease; none examine uncertainty in the stage-shift assumption. Currently, there is no modeling approach that can integrate clinical study evidence.

PMID:40753481 | DOI:10.1177/0272989X251351639

Are Open-Ended Question Assessments an Emerging Trend in US Medical Education?

Teach Learn Med. 2025 Aug 3:1-10. doi: 10.1080/10401334.2025.2538051. Online ahead of print.

ABSTRACT

There is a growing amount of literature on the benefits of using open-ended questions (OEQs) to assess knowledge in medical education. However, it is unknown how many US medical schools include OEQs in their assessment toolkits and how they are being used. The purpose of this study was to determine if OEQ assessments are an emerging trend in US medical education. We distributed an online survey to assessment leadership at all 156 US accredited allopathic medical schools between September 2022 and April 2024. Questions focused on the use or future interest of OEQs to assess medical knowledge in the pre-clerkship and clerkship curriculum. We calculated descriptive statistics for prevalence and use rates, and completed a conventional content analysis for open-ended comments. Seventy-eight US medical schools completed the survey (50% response rate). Forty schools (51%) reported using OEQs for medical knowledge assessment. OEQs were used during the pre-clerkship (28 schools), clerkship (two schools) or both parts of the curriculum (10 schools). On average, OEQs accounted for 20% of the pre-clerkship and 11% of the clerkship assessments at each school. Schools used OEQs to assess students’ understanding, assess certain types of knowledge, and develop students’ deeper learning. Representatives at schools not currently using OEQs reported considering using them in the future but expressed concerns about the amount of time needed to implement them. Numerous schools are using OEQs to assess medical knowledge, suggesting that this assessment format is feasible. Institutions can be innovative in their assessments by extending beyond multiple-choice questions and incorporating other question formats, such as OEQs, to fit their educational needs. This study provides a foundation for future research to explore the utility of OEQs and how to overcome the challenges of implementing OEQ assessments.

PMID:40753474 | DOI:10.1080/10401334.2025.2538051

Efficacy of Family Health Conversations on Mental Health, Family Wellbeing, and Family Functioning for Parents of Infants Requiring Mechanical Respiratory Support During Neonatal Intensive Care

J Fam Nurs. 2025 Aug 3:10748407251357216. doi: 10.1177/10748407251357216. Online ahead of print.

ABSTRACT

Having an infant requiring care in a neonatal intensive care unit (NICU) is challenging for parents. The aim was to investigate the effects of the Family Health Conversation (FamHC) model on self-reported mental health, family wellbeing, and family functioning in parents of infants requiring mechanical respiratory support during NICU care. This interventional study included 147 parents (72, intervention group; 75, control group). All participants received a study-specific questionnaire at three time points. The intervention trended toward positive effects on mental health, family wellbeing, and family functioning. However, all measurements showed considerable variation, and the estimated effects were not statistically significant at the 0.05 level. Regardless of the intervention, mental health symptoms decreased over time, whereas family wellbeing and functioning remained stable. To conclude, although the intervention trended favorable for all outcomes, no significant differences were observed between groups. Potential effects might be better identified using qualitative methodology or self-reporting measures in a larger sample.

PMID:40753473 | DOI:10.1177/10748407251357216

Blurred Identity, Rising Distress: A Serial Mediation Approach to Social Media and Depression

J Psychol. 2025 Aug 3:1-22. doi: 10.1080/00223980.2025.2534801. Online ahead of print.

ABSTRACT

This study examines a serial mediation framework to gain a deeper understanding of how social media use affects mental health. Many young people experience a sense of emotional overload from constant connectivity (i.e., digital stress), which may be one of the earliest signs of psychological strain, and the impact on self-concept clarity may further compound these effects. Thus, we examined how digital stress and self-concept clarity may serially mediate the relation between social media use and depressive symptoms. The study sample consisted of 995 Romanian participants aged 17 to 79 (M = 25.05, SD = 9.52; 63.22% female). Results suggested a positive association between digital stress and social media use and a negative association between self-concept clarity, digital stress, and depressive symptoms. Results also indicated a significant link between prolonged social media usage and digital stress, as well as a correlation between elevated digital stress levels and low self-concept clarity scores, which in turn, seemed to contribute to the development of depressive symptoms. However, the relation between digital stress and self-concept clarity did not fully account for the positive correlation between social media usage time and depressive symptoms. Thus, the mediation effect was incomplete, as the direct relationship between social media use and depressive symptoms persisted, remaining positive and statistically significant. We discuss these findings in terms of their practical implications for mitigating the effects of social media use on individuals’ mental health, with a focus on the relationship between digital stress and self-concept clarity.
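For readers unfamiliar with serial mediation, the structure described above can be written out as a generic two-mediator regression system. The notation is an assumption for illustration and is not taken from the article: $X$ is social media use, $M_1$ is digital stress, $M_2$ is self-concept clarity, $Y$ is depressive symptoms, and the coefficient names follow a common textbook convention.

```latex
\begin{align}
M_1 &= i_1 + a_1 X + e_1 \\
M_2 &= i_2 + a_2 X + d_{21} M_1 + e_2 \\
Y   &= i_3 + c' X + b_1 M_1 + b_2 M_2 + e_3 \\[4pt]
\underbrace{c}_{\text{total effect}}
    &= \underbrace{c'}_{\text{direct}}
     + \underbrace{a_1 b_1}_{\text{via } M_1}
     + \underbrace{a_2 b_2}_{\text{via } M_2}
     + \underbrace{a_1 d_{21} b_2}_{\text{serial path}}
\end{align}
```

Under this sketch, "incomplete" mediation corresponds to the direct effect $c'$ remaining nonzero after the indirect paths, including the serial path $a_1 d_{21} b_2$, are accounted for.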

PMID:40753470 | DOI:10.1080/00223980.2025.2534801

Clinical profiles and mortality risk factors in pediatric pulmonary hemorrhage: a single-center study in Saudi Arabia

Ann Saudi Med. 2025 Jul-Aug;45(4):235-242. doi: 10.5144/0256-4947.2025.235. Epub 2025 Aug 7.

ABSTRACT

BACKGROUND: Pulmonary hemorrhage (PH) is a rare, life-threatening event characterized by bleeding into the airways and lung parenchyma.

OBJECTIVES: To explore the clinical characteristics of PH patients and investigate mortality-related risk factors, providing a holistic understanding of patient outcomes in this population.

DESIGN: A retrospective cohort study.

SETTINGS: The Pediatric Intensive Care Unit (PICU) at King Faisal Specialist Hospital and Research Centre (KFSHRC), Riyadh, Saudi Arabia.

PATIENTS AND METHODS: Pediatric patients with PH episodes (aged 1 month to 14 years) who were admitted from January 2014 to September 2019.

MAIN OUTCOMES MEASURES: Clinical characteristics, outcomes, and mortality-related risk factors.

SAMPLE SIZE: 80 children.

RESULTS: The cohort had a sex ratio of 1:1 and a median age of 24 months [interquartile range: 9-78]. Medical histories included bone marrow transplant (51.3%), oncology cases (40.0%), chemotherapy (61.3%), chest infection (86.3%), and immunosuppressant use (71.3%). Additionally, most patients (87.5%) had acute respiratory distress syndrome during the PH episode. The overall PICU mortality rate was 82.5% (66/80), and was associated with thrombocytopenia, sepsis, renal impairment, liver dysfunction, multiorgan dysfunction, and altered code status in univariable analysis (all P <.05). Multivariate analysis identified sepsis, multiorgan dysfunction, and altered code status as key predictors of PICU mortality (P <.05).
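To give a sense of the precision of a mortality proportion estimated from 80 patients, the sketch below computes a Wilson score interval for 66 deaths out of 80. The abstract does not report a confidence interval, so this is an illustrative calculation under the assumption of a simple binomial model; the function name is hypothetical.

```python
import math

def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# PICU mortality reported in the abstract: 66 of 80 patients.
deaths, patients = 66, 80
low, high = wilson_interval(deaths, patients)
print(f"Mortality {deaths / patients:.1%}, approx. 95% CI {low:.1%} to {high:.1%}")
```

The Wilson interval is used here instead of the simpler Wald interval because it behaves better for proportions near 0 or 1 with modest sample sizes.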

CONCLUSION: The high mortality rate reported emphasizes the need for tailored interventions and heightened vigilance, particularly in immunocompromised children. Future research will expand on these findings to refine current management protocols and further improve patient care in pediatric PH.

LIMITATIONS: Retrospective study, single-center.

PMID:40753460 | DOI:10.5144/0256-4947.2025.235

Length of stay in hospital and rehabilitation centers after stroke in Arab countries and Saudi Arabia: a systematic review and meta-analysis

Ann Saudi Med. 2025 Jul-Aug;45(4):256-269. doi: 10.5144/0256-4947.2025.256. Epub 2025 Aug 7.

ABSTRACT

BACKGROUND: Stroke rehabilitation is a vital component of post-stroke care, and the length of stay (LOS) in hospitals and rehabilitation centers varies across healthcare systems. This systematic review and meta-analysis assessed LOS among stroke survivors in Arab countries.

METHODS: A comprehensive literature search from inception until March 2025 identified studies reporting LOS in stroke rehabilitation.

RESULTS: A total of 18 publications (25 datasets) involving 12 690 individuals were included in the meta-analysis. The pooled mean LOS was 25.67 days [95% confidence interval (CI): 16.22-35.11]. Subgroup analyses showed a longer LOS in Saudi Arabia (37.03 days, 95% CI: 24.11-49.95) compared to other Arab countries (8.87 days, 95% CI: 4.90-12.84), and in rehabilitation centers (46.71 days, 95% CI: 33.18-60.24) compared to acute hospital settings (9.07 days, 95% CI: 5.27-12.86). LOS varies widely across Arab countries and care settings.
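As a quick consistency check on the pooled estimate, the sketch below recovers the midpoint and an approximate standard error from the reported 95% confidence interval. It assumes a symmetric normal-approximation interval (estimate ± 1.96 × SE), which the abstract does not state explicitly, and the function name is hypothetical.

```python
def se_from_ci(lower, upper, z=1.96):
    """Midpoint and standard error implied by a symmetric confidence interval."""
    midpoint = (lower + upper) / 2
    standard_error = (upper - lower) / (2 * z)
    return midpoint, standard_error

# Pooled mean LOS interval quoted in the abstract (days).
midpoint, se = se_from_ci(16.22, 35.11)
print(f"Implied midpoint: {midpoint:.2f} days, implied SE: {se:.2f} days")
```

The implied midpoint agrees with the reported pooled mean of 25.67 days to within rounding, which is what a symmetric interval would produce.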

CONCLUSION: These findings highlight the need to examine whether differences in LOS are associated with functional recovery and healthcare efficiency. However, substantial heterogeneity across studies and a lack of outcome data limit the interpretability of the results.

PMID:40753459 | DOI:10.5144/0256-4947.2025.256