Categories
Nevin Manimala Statistics

Bochum Burn Survival (BoBS) score – A novel machine learning-based burn survival prediction score developed with data from the German Burn Registry

Burns. 2025 Jul 14;51(8):107614. doi: 10.1016/j.burns.2025.107614. Online ahead of print.

ABSTRACT

BACKGROUND: Burn mortality prediction remains a critical aspect in burn medicine. Established scores, such as the ABSI or Baux score, experience continuous revision and improvement due to advances in critical care and surgical procedures. However, these scores often rely on predefined variables and limited statistical models. This study aimed to create a new prediction score that is based solely on machine learning techniques and to assess its performance against established traditional scoring systems.

METHODS: Data from the German Burn Registry, encompassing over 10,000 cases, were analyzed with several machine learning methods to identify the factors most relevant to mortality and to create a new prediction score. The prediction model was constructed using algorithms such as random forests and gradient boosting. Internal validation was conducted using cross-validation to ensure robustness and reproducibility.
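The abstract does not say how the cross-validation was set up, so the following is only a minimal sketch of how k-fold index splitting works in general. The function name `k_fold_indices`, the choice of k = 5, and the seed are all assumptions made for illustration; they are not taken from the study.

```python
import random


def k_fold_indices(n_cases, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation.

    Hypothetical helper: k and the seed are illustrative assumptions,
    not the settings used for the BoBS score.
    """
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


for train, validation in k_fold_indices(10, k=5):
    # Each case appears in exactly one validation fold across the loop.
    print(len(train), len(validation))
```

Each case is used for validation exactly once and for training in the remaining folds, which is what makes the averaged performance estimate less dependent on a single lucky or unlucky split.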

RESULTS: The Bochum Burn Survival (BoBS) score demonstrated strong predictive performance, with an accuracy of 93.1% and a ROC AUC of 92.4%, surpassing traditional scores. Factors such as TBSA and age showed the strongest correlation with mortality, while comorbidities and treatment-specific variables contributed to model refinement. However, further adjustments and external validation beyond the German Burn Registry are still needed.
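Accuracy and ROC AUC measure different things: accuracy depends on a chosen cut-off, while AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes both on toy data. The labels, scores, and the 0.5 threshold are invented for illustration and have no connection to the registry data.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly at the given (assumed) threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy data only: 1 = died, 0 = survived; scores are predicted mortality risks.
labels = [0, 0, 0, 0, 1, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.60, 0.55, 0.80, 0.30, 0.90]

print(f"accuracy: {accuracy(labels, scores):.3f}")
print(f"ROC AUC:  {roc_auc(labels, scores):.3f}")
```

When deaths are much rarer than survivals, accuracy can look high even for a weak model, which is one reason both figures are usually reported together.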

DISCUSSION: The BoBS score represents a paradigm shift in burn mortality prediction, leveraging the potential of machine learning to analyze complex, high-dimensional datasets. Compared to traditional models, the BoBS score offers improved accuracy while providing insights into underexplored variables that might impact patient outcomes. However, challenges remain in integrating such models into clinical workflows and validating them across diverse populations.

CONCLUSION: This score represents a significant advancement in burn mortality prediction by providing an interpretable, machine learning-based scoring system developed using multicenter data from the German Burn Registry. Its application has the potential to enhance decision-making in burn care, marking a significant step forward in personalized medicine for critically injured burn patients.

PMID:40700784 | DOI:10.1016/j.burns.2025.107614

Comparative biomechanical analysis of combined lower and middle trapezius tendon transfer vs. isolated lower trapezius tendon transfer in irreparable posterosuperior massive rotator cuff tears

Clin Biomech (Bristol). 2025 Jul 15;128:106621. doi: 10.1016/j.clinbiomech.2025.106621. Online ahead of print.

ABSTRACT

BACKGROUND: Posterosuperior massive rotator cuff tears remain challenging to manage. While lower trapezius transfer restores posterior cuff function, it lacks the superior cuff’s biomechanical role. Middle trapezius tendon transfer has shown efficacy in addressing superior cuff deficiencies with dynamic joint-centering and spacer effects. This study aimed to compare the biomechanical effects of lower trapezius transfer alone versus combined lower and middle trapezius transfer for posterosuperior massive rotator cuff tears.

METHODS: Eight cadaveric shoulders were tested under four conditions: intact, posterosuperior cuff tear, lower trapezius transfer, and combined lower and middle trapezius transfer. Superior translation, subacromial contact pressure, and rotational range of motion were measured at multiple abduction and external rotation positions. Statistical analysis was performed using a linear mixed-effects model.

FINDINGS: Both lower trapezius and combined lower and middle trapezius transfers significantly reduced superior humeral head translation versus the tear condition (p < .041). The combined transfer restored translation to intact levels and was more effective than lower trapezius transfer alone at 0° and 20° abduction (p < .031). Subacromial contact pressure decreased significantly with both transfers at 20° and 40° abduction (p < .030), and with combined transfer also at 0° abduction and 30° ER (p < .042). Total rotational range of motion was preserved in all conditions.

INTERPRETATION: Combined lower and middle trapezius transfer offers superior biomechanical restoration of glenohumeral joint stability compared to lower trapezius transfer alone without compromising range of motion. These findings support the potential of dual tendon transfer in addressing both posterior and superior cuff deficiencies, warranting further clinical evaluation.

PMID:40700779 | DOI:10.1016/j.clinbiomech.2025.106621

Data Submission to the American Spine Registry for Advanced Disease-Specific Certification in Spine Surgery

Orthop Nurs. 2025 Jul-Aug 01;44(4):231-235. doi: 10.1097/NOR.0000000000001137. Epub 2025 Jul 18.

ABSTRACT

In this study, the authors examine the establishment and significance of spine surgery registries, particularly the American Spine Registry (ASR). The historical development of spine registries in the United States is outlined along with the role of registries in evaluating clinical outcomes and improving healthcare quality. The ASR aims to collect comprehensive data on spine surgeries, including patient-reported outcomes and performance metrics, to enhance quality standards and clinical practice guidelines. Studies from European registries demonstrate the utility of registry data in identifying trends, assessing cost-effectiveness, and improving patient care. This paper also discusses the economic implications of participation in spine registries and emphasizes the potential cost savings and benefits for healthcare organizations. Ethical and legal considerations, data security, and patient confidentiality are addressed, along with the challenges associated with registry participation such as resource allocation. Increased transparency, collaboration, and clarity are needed to promote broader engagement in spine surgery registries.

PMID:40700763 | DOI:10.1097/NOR.0000000000001137

Explaining alerts from a pediatric risk prediction model using clinical text

J Am Med Inform Assoc. 2025 Jul 23:ocaf121. doi: 10.1093/jamia/ocaf121. Online ahead of print.

ABSTRACT

OBJECTIVE: Risk prediction models are used in hospitals to identify pediatric patients at risk of clinical deterioration, enabling timely interventions and rescue. The objective of this study was to develop a new explainer algorithm that uses a patient’s clinical notes to generate text-based explanations for risk prediction alerts.

MATERIALS AND METHODS: We conducted a retrospective study of 39 406 patient admissions to the American Family Children’s Hospital at the University of Wisconsin-Madison (2009-2020). The validated pediatric Calculated Assessment of Risk and Triage (pCART) risk prediction model was used to identify children at risk for deterioration. A transformer model was trained to use clinical notes from the 12-hour period preceding each pCART score to predict whether a patient was flagged as at risk. Then, label-aware attention highlighted the text phrases most important to an at-risk alert. The study cohort was randomly split into derivation (60%) and validation (20%) data, and a separate test set (20%) was used to evaluate the explainer’s performance.
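The 60/20/20 random split described above can be sketched as follows. This is an illustration only: the function name `three_way_split` and the seed are assumptions, and the abstract does not state whether the split was made at the patient or the admission level, so the example simply splits a list of admission identifiers.

```python
import random


def three_way_split(ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly split ids into derivation, validation, and test lists.

    Hypothetical helper; the seed and the unit of splitting are assumptions.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_derivation = round(len(shuffled) * fractions[0])
    n_validation = round(len(shuffled) * fractions[1])
    derivation = shuffled[:n_derivation]
    validation = shuffled[n_derivation:n_derivation + n_validation]
    held_out = shuffled[n_derivation + n_validation:]  # the test set gets the remainder
    return derivation, validation, held_out


# 39 406 admissions, as reported in the abstract; the ids here are just 0..39405.
derivation, validation, held_out = three_way_split(range(39406))
print(len(derivation), len(validation), len(held_out))
```

With 39 406 admissions this gives 23 644, 7 881, and 7 881 records. If one patient can have several admissions, splitting by patient rather than by admission avoids leaking information between the sets.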

RESULTS: Our pCART Explainer algorithm performed well in discriminating at-risk pCART alert vs no alert (c-statistic 0.805). Sample explanations from pCART Explainer revealed clinically important phrases such as “rapid breathing,” “fall risk,” “distension,” and “grunting,” thereby demonstrating excellent face validity.

DISCUSSION: The pCART Explainer could quickly orient clinicians to the patient’s condition by drawing attention to key phrases in notes, potentially enhancing situational awareness and guiding decision-making.

CONCLUSION: We developed pCART Explainer, a novel algorithm that highlights text within clinical notes to provide medically relevant context about deterioration alerts, thereby improving the explainability of the pCART model.

PMID:40700686 | DOI:10.1093/jamia/ocaf121

Psychometric Evaluation of the Patient Experience Colonoscopy Scale

J Eval Clin Pract. 2025 Aug;31(5):e70220. doi: 10.1111/jep.70220.

ABSTRACT

RATIONALE, AIMS AND OBJECTIVES: Colonoscopy, though common, can be uncomfortable, necessitating routine assessment of patient experience per European guidelines. Positive patient experiences are crucial as they influence willingness for repeat procedures. Patient-reported experience measures (PREMs) effectively capture patient perspectives through surveys, empowering patients to influence healthcare quality. These surveys identify areas for improvement and inform research, enhancing healthcare and its quality. The Patient Experience Colonoscopy Scale (PECS) is a colonoscopy-specific PREM that measures adult patient experience after an elective colonoscopy. It consists of items derived from the patient’s perspective and has been found to be content valid. The PECS is multidimensional and divided into five constructs: health motivation, discomfort, information, a caring relationship, and understanding. The current study aims to evaluate the measurement properties of the PECS, a new PREM, with regard to reliability and construct validity.

METHOD: The sample comprised 331 adult patients who had undergone an elective colonoscopy at a University Hospital in Sweden. The PECS was evaluated using intraclass correlation coefficients, confirmatory factor analysis, and multi- and unidimensional Rasch analyses.

RESULTS: The test-retest reliability was acceptable, with an average intraclass correlation coefficient of 0.72. Construct validity was tested with three different techniques. The confirmatory factor analysis revealed that the theoretical bifactor model containing the five constructs was supported. The multi- and unidimensional Rasch analyses showed that approximately 60% of the items had acceptable values. Some violation of local independence and some evidence of differential item functioning with respect to age and gender were identified, but they all made subject matter sense. The PECS is well-targeted to patients with less positive experiences. The overall evaluation of the construct validity showed the PECS has acceptable measurement properties.

CONCLUSION: The PECS is a reliable and valid 30-item colonoscopy-specific PREM that can play an important role in gathering data for research and quality improvement initiatives that seek to incorporate patient perspectives on colonoscopy experiences. Some potential areas for improvement were found, but the PECS is ready to be utilised in clinical practice for the purpose of collecting patient experiences.

PMID:40700682 | DOI:10.1111/jep.70220

Automatic Abstraction of Computed Tomography Imaging Indication Using Natural Language Processing for Evaluation of Surveillance Patterns in Long-Term Lung Cancer Survivors

JCO Clin Cancer Inform. 2025 Jul;9:e2400279. doi: 10.1200/CCI-24-00279. Epub 2025 Jul 23.

ABSTRACT

PURPOSE: Despite its routine use to monitor patients with lung cancer (LC), real-world evaluations of the impact of computed tomography (CT) surveillance on overall survival (OS) have been inconsistent. A major confounder is the absence of imaging indications because patients undergo CT scans for purposes beyond surveillance, like symptom evaluations (eg, cough) linked to poor survival. We propose a novel natural language processing model to predict CT imaging indications (surveillance v others).

METHODS: We used electronic health records of 585 long-term LC survivors (≥5 years) at Stanford, followed for up to 22 years. Their 3,362 post-5-year CT reports (including 1,672 manually annotated) were used for modeling by integrating structured variables (eg, CT intervals) with key-phrase analysis of radiology reports. Naïve analysis compared OS in patients with CT for any indications (including symptoms) versus those without post-5-year CT, as in previous studies. Using model-predicted indications, we conducted exploratory analyses to compare OS between those with post-5-year surveillance CT and those without.

RESULTS: The model showed high discrimination (AUC, 0.86), with key predictors including a longer interval (≥6 months) from the previous CT (odds ratio [OR], 5.50; P < .001) and surveillance-related key phrases (OR, 1.37; P = .03). Propensity-adjusted survival analysis indicated better OS for patients with any post-5-year surveillance CT versus those without (adjusted hazard ratio, 0.60; P = .016). By contrast, no significant survival difference was found (P = .53) between patients with any CT versus those without post-5-year CT.

CONCLUSION: Our model abstracted CT indications from real-world data with high discrimination. Exploratory analyses revealed the obscured imaging-OS association when considering indications, highlighting the model’s potential for future real-world studies.

PMID:40700679 | DOI:10.1200/CCI-24-00279

Extraction of Social Determinants of Health From Electronic Health Records Using Natural Language Processing

JCO Clin Cancer Inform. 2025 Jul;9:e2400317. doi: 10.1200/CCI-24-00317. Epub 2025 Jul 23.

ABSTRACT

PURPOSE: Social Determinants of Health (SDoH) have a significant effect on health outcomes and inequalities. SDoH can be extracted from electronic health records (EHR) to aid policy development and research to improve population health. Automated extraction using artificial intelligence (AI) can improve efficiency and cost-effectiveness. The focus of this study was to autonomously extract comprehensive SDoH details from EHR using a natural language processing (NLP)-based AI pipeline.

MATERIALS AND METHODS: A curated set of 1,000 BC Cancer clinical documents with concentrated SDoH information served as the reference standard for training and evaluating NLP models. Two pipelines were used: an open-source pipeline trained on the annotated medical documents and an industrial pretrained solution used as a benchmark. Three experiments optimized the first pipeline’s performance, assessing the effect of including subtype word positions during training. The superior open-source pipeline was then used to extract SDoH information from 13,258 oncology documents.

RESULTS: The open-source pipeline achieved an average F1 score of 0.88 on the validation data set for extracting 13 SDoH factors, surpassing the benchmark by 5%. It excelled in detailed subtype extraction, while the benchmark performed better in identifying rarely annotated SDoH information in the BC Cancer data set. Overall, 60,717 SDoH factors and associated details were extracted from BC Cancer EHR oncology documents. The most frequently extracted SDoH factors included tobacco use, employment status, marital status, alcohol consumption, and living status, each occurring between 8,000 and 12,000 times.
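For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall, and an "average F1" over several factors is commonly an unweighted (macro) mean of the per-factor values. The abstract does not state which averaging scheme was used, so macro-averaging is an assumption here, and the counts below are invented purely to show the arithmetic.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 from raw counts; returns 0.0 when precision and recall are both undefined or zero."""
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(counts_by_factor):
    """Unweighted mean of per-factor F1 (assumed averaging scheme)."""
    scores = [f1_score(*counts) for counts in counts_by_factor.values()]
    return sum(scores) / len(scores)


# Hypothetical (true positive, false positive, false negative) counts per SDoH factor.
toy_counts = {
    "tobacco_use": (90, 10, 10),
    "employment_status": (80, 20, 0),
}

print(f"macro F1: {macro_f1(toy_counts):.3f}")
```

Macro-averaging weights rare and common factors equally, so a few poorly extracted rare factors can pull the average down even when the frequent ones are extracted well.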

CONCLUSION: This study demonstrates the potential of an NLP pipeline to extract SDoH factors from clinical notes, with strong performance on limited data, although data set-specific adjustments are needed for broader application across institutions.

PMID:40700678 | DOI:10.1200/CCI-24-00317

Relationship between monocyte-to-lymphocyte ratio and anemia: a NHANES analysis

Hematology. 2025 Dec;30(1):2535817. doi: 10.1080/16078454.2025.2535817. Epub 2025 Jul 23.

ABSTRACT

BACKGROUND: Growing evidence supports the significant role of inflammatory factors in anemia. This paper aims to ascertain the potential link between the monocyte-to-lymphocyte ratio (MLR) and anemia and to explore potential mediators.

METHODS: Our analysis used data from the National Health and Nutrition Examination Survey (NHANES) from 2005 to 2018, with weighted logistic regression models to assess the link between MLR and anemia. Restricted cubic spline analyses were implemented to evaluate the nonlinear MLR-anemia relationship. Threshold effect analysis identified a critical inflection point. To ensure robustness, we conducted extensive subgroup analyses stratified by demographic and clinical factors. The mediating role of serum albumin in the link between MLR and anemia was investigated through mediation analysis.

RESULTS: A total of 28,616 participants were enrolled, of whom 2,655 (9.28%) had anemia. After adjustment for all covariates, log2-transformed MLR (log2MLR) was linked with an increased risk of anemia (OR: 1.49, 95% CI: 1.33-1.65, P < 0.001). When log2MLR was categorized into quartiles, the trend remained consistent (P < 0.001). A nonlinear positive link was noted between log2MLR and anemia, with an inflection point at -2.812. No statistical interactions were found in any subgroup analyses except for gender and diabetes (interaction P < 0.05). Serum albumin partially mediated this association, accounting for 15.39% of the total effect.
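Because the exposure is log2-transformed, a one-unit increase in log2MLR corresponds to a doubling of the raw ratio, and the reported inflection point can be converted back to the ratio scale. The short sketch below does that arithmetic. The function name and the example cell counts are illustrative assumptions, not values from the study.

```python
import math


def log2_mlr(monocytes, lymphocytes):
    """log2 of the monocyte-to-lymphocyte ratio (both counts in the same units)."""
    return math.log2(monocytes / lymphocytes)


# Inflection point reported in the abstract, on the log2 scale.
INFLECTION_LOG2 = -2.812

# Back-transform to the raw ratio scale: roughly 0.14.
inflection_ratio = 2 ** INFLECTION_LOG2
print(f"inflection point as raw MLR: {inflection_ratio:.3f}")

# Hypothetical participant: monocytes 0.5 and lymphocytes 2.0 (same units).
example = log2_mlr(0.5, 2.0)
print(f"example log2MLR: {example:.1f}, above inflection: {example > INFLECTION_LOG2}")
```

Read this way, the reported odds ratio of 1.49 is the change in the odds of anemia per doubling of the MLR in the adjusted model, not per unit of the raw ratio.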

CONCLUSION: This study presents groundbreaking findings on the role of MLR in anemia and the mediating effect of serum albumin, offering new perspectives on potential inflammatory pathways underlying hematological disorders.

PMID:40700677 | DOI:10.1080/16078454.2025.2535817

Deep-Learning Model for Real-Time Prediction of Recurrence in Early-Stage Non-Small Cell Lung Cancer: A Multimodal Approach (RADAR CARE Study)

JCO Precis Oncol. 2025 Jul;9:e2500172. doi: 10.1200/PO-25-00172. Epub 2025 Jul 23.

ABSTRACT

PURPOSE: The surveillance protocol for early-stage non-small cell lung cancer (NSCLC) is not contingent upon individualized risk factors for recurrence. This study aimed to use comprehensive data from clinical practice to develop a deep-learning model for practical longitudinal monitoring.

METHODS: A multimodal deep-learning model with transformers was developed for real-time recurrence prediction using baseline clinical, pathological, and molecular data with longitudinal laboratory and radiologic data collected during surveillance. Patients with NSCLC (stage I to III) who underwent surgery with curative intent between January 2008 and September 2022 were included. The primary outcome was predicting recurrence within 1 year after the monitoring point. The study assessed the timely provision of risk scores (RADAR score) and determined thresholds and the corresponding AUC.

RESULTS: A total of 14,177 patients were enrolled (10,262 with stage I, 2,380 with stage II, and 1,703 with stage III). The model incorporated 64 clinical-pathological-molecular factors at baseline, along with longitudinal laboratory and computed tomography imaging interpretation data. The mean baseline RADAR score was 0.324 (standard deviation [SD], 0.256) in stage I, 0.660 (SD, 0.210) in stage II, and 0.824 (SD, 0.140) in stage III. The AUC for predicting relapse within 1 year of the monitoring point was 0.854 across all stages, with a sensitivity of 86.0% and a specificity of 71.3% (AUC = 0.872 in stage I, AUC = 0.737 in stage II, and AUC = 0.724 in stage III).
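Sensitivity and specificity, unlike AUC, depend on the threshold applied to the risk score. The sketch below shows how the two are computed at a given cut-off. The labels, scores, and the 0.5 threshold are toy values chosen for illustration; the abstract does not report the threshold actually used for the RADAR score.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when score >= threshold counts as a positive call."""
    true_pos = false_neg = true_neg = false_pos = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                true_pos += 1
            else:
                false_neg += 1
        else:
            if predicted_positive:
                false_pos += 1
            else:
                true_neg += 1
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Toy data only: 1 = recurrence within a year, 0 = no recurrence.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.3, 0.8]

sens, spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
print(f"sensitivity: {sens:.2f}, specificity: {spec:.2f}")
```

Lowering the threshold raises sensitivity at the cost of specificity, so a surveillance tool that prioritizes not missing relapses would typically accept more false alarms.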

CONCLUSION: This pilot study introduces a deep-learning model that uses multimodal data from routine clinical practice to predict relapses in early-stage NSCLC. It demonstrates the timely provision of RADAR risk scores to clinicians for recurrence prediction, potentially guiding risk-adapted surveillance strategies and aggressive adjuvant systemic treatment.

PMID:40700672 | DOI:10.1200/PO-25-00172

MIPS Under Scrutiny: Exploring the Association Between Providers With Fraudulent Practices and Quality Metrics Within MACRA’s Framework

J Eval Clin Pract. 2025 Aug;31(5):e70217. doi: 10.1111/jep.70217.

ABSTRACT

IMPORTANCE: Identifying how fraudulent practices affect quality performance metrics is crucial for enhancing healthcare delivery and maintaining the integrity of the Medicare system.

OBJECTIVE: To examine the association between fraud and abuse perpetrator providers (FAPs) and their performance on quality metrics within the Merit-Based Incentive Payment System (MIPS) under the Medicare Access and CHIP Reauthorization Act (MACRA).

DESIGN: A retrospective observational study using exact matching and propensity score matching to balance comparison groups.
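As a rough illustration of propensity score matching, the sketch below pairs each exposed provider with the unexposed provider whose propensity score is closest, within a caliper. It is a generic greedy 1:1 matcher, not the study's procedure: the abstract does not describe the matching ratio, caliper, or how exact matching was combined with it, and every identifier and score below is hypothetical.

```python
def greedy_match(exposed, unexposed, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity score, without replacement.

    exposed, unexposed: dicts mapping an id to an already estimated propensity score.
    Returns a dict mapping each matched exposed id to its unexposed partner.
    The caliper value is an illustrative assumption.
    """
    available = dict(unexposed)
    matches = {}
    for exposed_id, score in sorted(exposed.items(), key=lambda item: item[1]):
        if not available:
            break
        nearest = min(available, key=lambda cid: abs(available[cid] - score))
        if abs(available[nearest] - score) <= caliper:
            matches[exposed_id] = nearest
            del available[nearest]  # each comparison provider is used at most once
    return matches


# Hypothetical propensity scores for exposed (T*) and comparison (C*) providers.
toy_exposed = {"T1": 0.30, "T2": 0.62, "T3": 0.95}
toy_unexposed = {"C1": 0.28, "C2": 0.33, "C3": 0.60, "C4": 0.70}

print(greedy_match(toy_exposed, toy_unexposed))
```

Here T3 stays unmatched because no comparison provider falls within the caliper, which is the usual trade-off: tighter calipers give better balance but discard more exposed units.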

SETTING: Analysis of Medicare Quality Payment Program (QPP) data from 2017 to 2021.

PARTICIPANTS: A total of 12,364 physician-year observations, including 1,300 provider-year level FAPs identified between 2020 and 2023 and 11,064 matched non-FAPs.

EXPOSURES: Provider status as fraud and abuse perpetrators based on inclusion in the List of Excluded Individuals and Entities from the Office of Inspector General.

MAIN OUTCOMES AND MEASURES: MIPS scores across key categories: Final score, Quality score, Promoting Interoperability (PI) score, Improvement Activities (IA) score, and Cost score.

RESULTS: FAPs scored significantly lower than non-FAPs in Final score, Quality score, PI score, and IA score (all p < 0.05). The negative impact of FAP status was more pronounced among individual practitioners, while FAPs participating in Advanced Alternative Payment Models exhibited higher scores on certain metrics. No significant differences were observed in Cost scores between FAPs and non-FAPs.

CONCLUSIONS AND RELEVANCE: Fraudulent practices are associated with lower performance on quality-related metrics under MACRA’s MIPS framework, particularly among individual practitioners. While lower quality scores align with expectations for providers committing fraud, the absence of significant differences in Cost scores highlights potential shortcomings in the MIPS scoring system, suggesting that cost metrics may not be sufficiently sensitive to fraudulent practices. These findings underscore the need for continuous refinement of both quality and cost measures to enhance the integrity and effectiveness of healthcare delivery.

PMID:40700659 | DOI:10.1111/jep.70217