Categories
Nevin Manimala Statistics

Evaluating gender bias in large language models in long-term care

BMC Med Inform Decis Mak. 2025 Aug 11;25(1):274. doi: 10.1186/s12911-025-03118-0.

ABSTRACT

BACKGROUND: Large language models (LLMs) are being used to reduce the administrative burden in long-term care by automatically generating and summarising case notes. However, LLMs can reproduce bias in their training data. This study evaluates gender bias in summaries of long-term care records generated with two state-of-the-art, open-source LLMs released in 2024: Meta’s Llama 3 and Google’s Gemma.

METHODS: Gender-swapped versions were created of long-term care records for 617 older people from a London local authority. Summaries of male and female versions were generated with Llama 3 and Gemma, as well as benchmark models from Meta and Google released in 2019: T5 and BART. Counterfactual bias was quantified through sentiment analysis alongside an evaluation of word frequency and thematic patterns.
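The gender-swapping step described above can be illustrated with a minimal sketch. This is not the study's published code: the swap table `SWAPS` and the function `swap_gender` are hypothetical names, and the word list is deliberately tiny. A real pipeline would need a much fuller vocabulary (names, titles, kinship terms) and manual checking of ambiguous words such as "her".

```python
import re

# Hypothetical, deliberately small swap table (an assumption for illustration).
SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her",
    "her": "his",  # ambiguous: "her" can also correspond to "him"
    "mr": "mrs", "mrs": "mr",
    "man": "woman", "woman": "man",
    "son": "daughter", "daughter": "son",
}

# Whole-word, case-insensitive match on any key of the swap table.
PATTERN = re.compile(r"\b(" + "|".join(SWAPS) + r")\b", re.IGNORECASE)


def swap_gender(text: str) -> str:
    """Return a counterfactual copy of `text` with gendered words swapped."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        swapped = SWAPS[word.lower()]
        # Preserve a leading capital so sentence starts and titles still read well.
        return swapped.capitalize() if word[0].isupper() else swapped

    return PATTERN.sub(repl, text)


original = "Mr Smith lives alone. He needs help with his meals."
counterfactual = swap_gender(original)
```

Each original record and its counterfactual would then be summarised by the same model, so that any systematic difference between the paired summaries can be attributed to gender alone.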

RESULTS: The benchmark models exhibited some variation in output on the basis of gender. Llama 3 showed no gender-based differences across any metrics. Gemma displayed the most significant gender-based differences. Male summaries focused more on physical and mental health issues. Language used for men was more direct, with women’s needs downplayed more often than men’s.

CONCLUSION: Care services are allocated on the basis of need. If women’s health issues are underemphasised, this may lead to gender-based disparities in service receipt. LLMs may offer substantial benefits in easing administrative burden. However, the findings highlight the variation in state-of-the-art LLMs, and the need for evaluation of bias. The methods in this paper provide a practical framework for quantitative evaluation of gender bias in LLMs. The code is available on GitHub.

PMID:40784946 | DOI:10.1186/s12911-025-03118-0

Publisher Correction: Leveraging Multi-National Observational Study in Post-Marketing Safety Assessment: Challenges and Strategies

Ther Innov Regul Sci. 2025 Aug 10. doi: 10.1007/s43441-025-00858-z. Online ahead of print.

NO ABSTRACT

PMID:40784934 | DOI:10.1007/s43441-025-00858-z

SARS-CoV-2 seroprevalence and COVID-19 vaccination coverage in two states of Nigeria from a population based household survey

Sci Rep. 2025 Aug 10;15(1):29272. doi: 10.1038/s41598-025-14253-z.

ABSTRACT

SARS-CoV-2 population-based seroprevalence surveys are useful for estimating the extent of SARS-CoV-2 infections, which may be underestimated by COVID-19 case counts. Surveys conducted in October 2020 in four Nigerian states showed that SARS-CoV-2 seroprevalence ranged from 9.3% in Gombe (northeast) to 25.2% in Enugu (southeast) after the first COVID-19 wave, more than 100 and 700 times higher than the official number of COVID-19 cases in these two states, respectively. We conducted a serosurvey after the second COVID-19 wave to evaluate the extent of SARS-CoV-2 infections, attitudes to COVID-19 vaccines, and COVID-19 vaccination coverage in two regions of Nigeria. Using the World Health Organization (WHO) Unity protocol, 34 enumeration areas (EAs) each in the Federal Capital Territory (FCT) (Northcentral Zone) and Kano State (Northwest Zone) were sampled in June 2021, using probability proportional to estimated size; 20 households in one EA were randomly selected. All consenting and assenting members of a household were asked about risk behaviors; adults who were 18 years and above (the eligible population for COVID-19 vaccination in Nigeria) responded to questions on COVID-19 vaccine attitudes and receipt. Blood and nasal/oropharyngeal samples were taken from all consenting and assenting household members. Blood samples collected were tested with the Luminex xMAP® SARS-CoV-2 Multi-Antigen IgG Assay and swabs by reverse-transcriptase-PCR (RT-PCR). Overall response rates were 76.8% in the FCT (n = 1,505 blood draws) and 80.4% in Kano State (n = 2,178 blood draws). Following the second COVID-19 wave in Nigeria, more than 40% of residents in the FCT (40.3%, 95% CI: 34.7-45.9) and Kano State (42.6%, 95% CI: 39.4-45.8) had evidence of prior SARS-CoV-2 infection. There were no active SARS-CoV-2 infections detected by RT-PCR in either the FCT or Kano State. 
In the FCT and Kano State, 3.4% and 1.6% of people surveyed reported receipt of any COVID-19 vaccine, three months after vaccines were available in the country. In the FCT, 77.5% of adults were aware of COVID-19 vaccines, of whom 46.9% reported willingness to receive them. In Kano State, 48.7% of adults were aware of COVID-19 vaccines, of whom 61.1% were willing to receive them. In both regions, about 84% of those reporting unwillingness to accept COVID-19 vaccines cited concerns over vaccine safety. Serosurvey findings revealed that SARS-CoV-2 infection was far more widespread in both the Federal Capital Territory and Kano State than indicated by reported case numbers. Despite high awareness, COVID-19 vaccine uptake remained low, primarily due to concerns about vaccine safety. These results highlight the urgent need for targeted risk communication to address vaccine hesitancy and improve coverage. Serosurveys provide valuable insights that can guide public health interventions and future pandemic preparedness in Nigeria.
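The two-stage design in this abstract (enumeration areas drawn with probability proportional to estimated size, then 20 households per selected area) can be sketched as follows. This is an illustrative sketch only: the sampling frame is randomly generated, `pps_systematic` is a hypothetical helper implementing the standard systematic PPS procedure, and the survey's actual selection software is not described in the abstract.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """Systematic probability-proportional-to-size selection of n unit indices."""
    cumulative = list(accumulate(sizes))
    step = cumulative[-1] / n          # sampling interval
    start = rng.random() * step        # random start within the first interval
    picks = []
    for k in range(n):
        point = start + k * step
        # First unit whose cumulative size reaches the selection point.
        index = min(bisect_left(cumulative, point), len(sizes) - 1)
        picks.append(index)
    return picks


rng = random.Random(2021)
# Hypothetical frame: 200 enumeration areas with 50-500 households each.
ea_sizes = [rng.randint(50, 500) for _ in range(200)]

# Stage 1: 34 enumeration areas, larger areas being more likely to be chosen.
selected_eas = pps_systematic(ea_sizes, 34, rng)

# Stage 2: simple random sample of 20 households within each selected area.
households = {ea: sorted(rng.sample(range(ea_sizes[ea]), 20)) for ea in selected_eas}
```

A unit larger than the sampling interval can be selected more than once under this scheme. In the made-up frame above no area is that large, so the 34 selections are distinct.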

PMID:40784905 | DOI:10.1038/s41598-025-14253-z

Development and validation of a machine learning model for predicting vulnerable carotid plaques using routine blood biomarkers and derived indicators: insights into sex-related risk patterns

Cardiovasc Diabetol. 2025 Aug 10;24(1):326. doi: 10.1186/s12933-025-02867-6.

ABSTRACT

BACKGROUND: Early detection of vulnerable carotid plaques is critical for stroke prevention. This study aimed to develop a machine learning model based on routine blood tests and derived indices to predict plaque vulnerability and assess sex-specific risk patterns across biomarker value ranges.

METHODS: We retrospectively included 1701 hospitalized patients from Suzhou Municipal Hospital (2019-2020), selected from an initial cohort of 10,028 individuals. All patients underwent carotid ultrasound, with vulnerable plaques identified using predefined imaging criteria. A total of 30 laboratory variables, including blood count, coagulation, and biochemistry, were extracted, alongside derived indices such as the triglyceride-glucose index (TyG), atherogenic index of plasma (AIP), neutrophil-to-lymphocyte ratio (NLR) and others. Features were standardized and selected based on statistical and clinical relevance. Five machine learning models were trained using a 7:3 train-test split and evaluated by cross-validation. Model performance was assessed using AUC, sensitivity, and specificity. The best model was interpreted using SHapley Additive exPlanations (SHAP) analysis. Sex differences were explored using Mann-Whitney U tests and restricted cubic spline (RCS) modeling across value intervals.
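The derived indices named above are simple transformations of routine laboratory values. The sketch below uses their commonly published definitions. The abstract does not state the exact formulas or units the authors applied, so the units in the function signatures are assumptions, and the function names are hypothetical.

```python
import math


def tyg_index(triglycerides_mg_dl: float, glucose_mg_dl: float) -> float:
    """Triglyceride-glucose index: ln(TG [mg/dL] x fasting glucose [mg/dL] / 2)."""
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2)


def atherogenic_index(triglycerides_mmol_l: float, hdl_mmol_l: float) -> float:
    """Atherogenic index of plasma: log10(TG / HDL-C), both in mmol/L."""
    return math.log10(triglycerides_mmol_l / hdl_mmol_l)


def nlr(neutrophils: float, lymphocytes: float) -> float:
    """Neutrophil-to-lymphocyte ratio (both counts in the same unit)."""
    return neutrophils / lymphocytes


# Made-up example patient, for illustration only.
example = {
    "TyG": tyg_index(150.0, 100.0),
    "AIP": atherogenic_index(1.7, 1.2),
    "NLR": nlr(4.0, 2.0),
}
```

Indices like these would be computed per patient and appended to the raw laboratory variables before standardization and model training.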

RESULTS: The Random Forest model showed the highest predictive performance (AUC = 0.847; 95% CI 0.791-0.895; specificity = 89.4%; sensitivity = 64.2%). SHAP analysis identified gender, age, fibrinogen, NLR, creatinine, fasting blood glucose, uric acid to high-density lipoprotein ratio (UHR), TyG, systemic inflammation response index (SIRI), and lymphocyte count as top predictors. Significant sex-specific differences in SHAP values were observed for key biomarkers, including age, UHR, TyG, SIRI, and others. RCS modeling further revealed distinct sex-related patterns in plaque vulnerability across biomarker value ranges.

CONCLUSION: A Random Forest model integrating routine blood markers and derived indices accurately predicted vulnerable carotid plaques. The results underscore the importance of sex-specific risk assessment, highlighting differential effects of key biomarkers across genders and value intervals.

PMID:40784899 | DOI:10.1186/s12933-025-02867-6

Calibrating multiplex serology for Helicobacter pylori

Diagn Progn Res. 2025 Aug 11;9(1):17. doi: 10.1186/s41512-025-00202-x.

ABSTRACT

BACKGROUND: Helicobacter pylori (H. pylori) is a bacterium that colonizes the stomach and is a major risk factor for gastric cancer, with an estimated 89% of non-cardia gastric cancer cases worldwide attributable to H. pylori. Prospective studies provide reliable evidence for quantifying the association between gastric cancer and H. pylori, as they circumvent the risk of a false negative due to possible reduction in antibody levels before cancer development.

METHODS: In a large-scale prospective study within the China Kadoorie Biobank, H. pylori infection is being analysed as a risk factor for gastric cancer. The presence of infection is typically determined by serological tests. The immunoblot test, although well established, is more labour intensive and uses a larger amount of plasma than the alternative high-throughput multiplex serology test. Immunoblot outputs a binary positive/negative serostatus classification, while multiplex outputs a vector of continuous antigen measurements. When mapping such multidimensional continuous measurements onto a binary classification, statistical challenges arise in defining classification cut-offs and accounting for the differences in infection evidence provided by different antigens. We discuss these challenges and propose a novel solution to optimize the translation of the continuous measurements from multiplex serology into probabilities of H. pylori infection, using classification algorithms (Bayesian additive regression trees (BART), multidimensional monotone BART, logistic regression, random forest and elastic net). We (i) calibrate and apply classification models to predict probabilities of H. pylori infection given multiplex measurements, (ii) compare the predictive performance of the models using immunoblot as reference, (iii) discuss reasons for the differences in predictive performance and (iv) apply the calibrated models to gain insights on the relative strengths of infection evidence provided by the various antigens.

RESULTS: All models showed high discriminative ability with at least 95% area under the curve (AUC) estimates on the training and test data. There was no substantial difference between the performance of models on the training and test data.
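The AUC reported here measures how often a model assigns a higher predicted infection probability to an immunoblot-positive sample than to an immunoblot-negative one. A minimal sketch of that rank-based definition follows. The scores and labels are made up for illustration, and `auc` is a hypothetical helper rather than the authors' code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))


# Made-up predicted probabilities and immunoblot reference labels (1 = seropositive).
train_auc = auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
test_auc = auc([0.20, 0.30, 0.70, 0.90], [0, 0, 1, 1])
```

Comparing the value on training data with the value on held-out test data, as the abstract does, is a basic check for overfitting: a large drop on the test set would suggest the calibration does not generalise.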

CONCLUSIONS: Classification algorithms can be used to calibrate the H. pylori multiplex serology test to the immunoblot test in the China Kadoorie Biobank. This study furthers our understanding of the applicability of classification algorithms to the context of serologic tests.

PMID:40784889 | DOI:10.1186/s41512-025-00202-x

Gait analysis and functional assessment of conservatively treated calcaneal fractures

Jt Dis Relat Surg. 2025 Jul 8;36(3):702-710. doi: 10.52312/jdrs.2025.2264. Epub 2025 Jul 8.

ABSTRACT

OBJECTIVES: This study aims to compare the functional scores and gait analysis data of patients undergoing conservative treatment after calcaneal fractures with those of healthy individuals and to evaluate both the success of conservative treatment and the applicability and effectiveness of a novel smartphone-based gait analysis method in assessing post-fracture mobility.

PATIENTS AND METHODS: Between January 2017 and December 2022, a total of 30 patients (10 females, 20 males; mean age: 48.6±12.6 years; range, 19 to 65 years) who underwent conservative treatment due to calcaneal fractures and 30 healthy controls (12 females, 18 males; mean age: 45.3±12.7 years; range, 21 to 63 years) were retrospectively analyzed. Patients with completed fracture union and mobilized by full weight bearing on the fractured extremity were evaluated with ankle joint range of motion (ROM), American Orthopaedic Foot and Ankle Society (AOFAS), Short Form-36 (SF-36), Visual Analog Scale (VAS) functional scoring and gait analysis using the smartphone-based Gait Analyzer application, and the results were compared with the control group.

RESULTS: After conservative treatment, there was no statistically significant difference between the patient and control groups in ankle ROM values (dorsiflexion p=0.359, plantarflexion p=0.240), AOFAS (p=0.211), or SF-36 scores (physical function p=0.188, pain p=0.483, health change p=0.894). The mean VAS score of the patient group was 2.83±1.80, significantly higher than that of the control group (p=0.035). There was a statistically significant difference between the groups in all gait parameters (gait velocity p=0.010, step time p<0.001, step length p<0.001, cadence p<0.001, step time symmetry p<0.001, step length symmetry p<0.001, vert-COM p<0.001).

CONCLUSION: Although the functionality and gait patterns of the patients may be affected after conservative treatment of calcaneal fractures, the absence of a significant difference in ankle ROM, AOFAS, and SF-36 scores between the patient and control groups indicates that this treatment method can be preferred in this group of patients, particularly in extraarticular and Sanders type 1 intra-articular fractures, with appropriate rehabilitation.

PMID:40784003 | DOI:10.52312/jdrs.2025.2264

Outcomes of conservatively treated midshaft clavicle fractures with butterfly fragment

Jt Dis Relat Surg. 2025 Jul 21;36(3):666-674. doi: 10.52312/jdrs.2025.2251. Epub 2025 Jul 21.

ABSTRACT

OBJECTIVES: The aim of this study was to evaluate whether fracture shortening, displacement, and the length of butterfly fragments were reliable radiographic indicators of secondary healing failure in displaced midshaft clavicle fractures with butterfly fragments and to determine whether these radiographic parameters were effective in predicting healing disorders and could be utilized as prognostic factors.

PATIENTS AND METHODS: Between January 2015 and January 2020, a total of 31 adult patients (29 males, 2 females; mean age: 43.6±13.2 years; range, 21 to 74 years) who presented with a closed displaced clavicle shaft fracture with butterfly fragments and were treated conservatively using figure-of-eight bandages were retrospectively analyzed. Shortening, displacement, and butterfly fragment length were measured radiographically at diagnosis. The patients were evaluated at Weeks 4, 6, 12, and 24 after injury. The patients were divided into three groups: patients with united fractures, patients with delayed union, and patients with nonunion. In patients where radiographic union was not observed after four to six weeks, the figure-of-eight bandage treatment was continued. Delayed union was defined as the absence of radiographic signs of fracture consolidation within 12 weeks, and nonunion as the absence of fracture consolidation within 24 weeks.

RESULTS: Fractures in 13 (42%) patients healed within 12 weeks, 10 (32.2%) patients had delayed healing between 12 and 24 weeks, and eight (25.8%) patients had nonunion. The median shortening was 18.37 (range, 3 to 42.9) mm, while median displacement ratio and butterfly fragment length were 125% (range, 83 to 93%) and 21.7 (range, 12 to 47.2) mm, respectively. No statistically significant difference in shortening was observed among the three groups (p=0.71). There was a significant difference in the amount of displacement between the healed fractures and delayed union groups (p=0.006) and the healed fractures and nonunion groups (p=0.002). There was also a significant difference in the butterfly fragment length between the healed fractures and nonunion groups (p=0.008). For each 1% increase in displacement, the relative risk of delayed union increased by 8%, and the risk of nonunion increased by 10%. A cut-off value of 125% optimally distinguished healed from unhealed fractures (area under the curve [AUC]=0.874). For differentiating delayed union from nonunion, the optimal threshold was 142.5% (AUC=0.713), indicating moderate diagnostic performance.
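An "optimal" cut-off on a continuous predictor such as displacement is often chosen by maximising Youden's J (sensitivity + specificity - 1) over candidate thresholds. The abstract does not say which criterion the authors used, so the sketch below is an assumption about method, and the displacement values and outcomes are entirely made up. `best_cutoff` is a hypothetical helper.

```python
def best_cutoff(values, outcomes):
    """Return (cutoff, J, sensitivity, specificity) maximising Youden's J.

    outcomes: 1 = unhealed, 0 = healed; a case is predicted unhealed
    when its value is greater than or equal to the cutoff.
    """
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(v >= cutoff and y == 1 for v, y in zip(values, outcomes))
        fn = sum(v < cutoff and y == 1 for v, y in zip(values, outcomes))
        tn = sum(v < cutoff and y == 0 for v, y in zip(values, outcomes))
        fp = sum(v >= cutoff and y == 0 for v, y in zip(values, outcomes))
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Made-up displacement ratios (%) and healing outcomes, for illustration only.
displacement = [85, 95, 105, 115, 120, 125, 135, 145, 155]
unhealed = [0, 0, 0, 1, 0, 1, 1, 1, 1]
cutoff, j, sens, spec = best_cutoff(displacement, unhealed)
```

With so few patients per group, a threshold found this way is sensitive to individual cases, which is one reason such cut-offs are usually reported together with the AUC.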

CONCLUSION: In adult clavicle shaft fractures with butterfly fragments, butterfly fragment length and clavicle shortening did not affect bone healing. In contrast, displacement was the only significant predictor of impaired bone healing.

PMID:40783999 | DOI:10.52312/jdrs.2025.2251

The role of preoperative nutritional status in predicting surgical outcomes after total knee arthroplasty: A CONUT-based analysis

Jt Dis Relat Surg. 2025 Jul 21;36(3):604-611. doi: 10.52312/jdrs.2025.2412. Epub 2025 Jul 21.

ABSTRACT

OBJECTIVES: This study aims to investigate the association between the preoperative Controlling Nutritional Status (CONUT) score and two important postoperative outcomes, surgical site infection (SSI) and prolonged hospital stay, in patients aged 60 years and older undergoing total knee arthroplasty (TKA).

PATIENTS AND METHODS: Between February 2019 and December 2023, a total of 268 patients (54 males, 214 females; mean age: 68.2±5.9 years; range, 60 to 87 years) aged ≥60 years who underwent elective primary TKA were retrospectively analyzed. The nutritional status was assessed using the CONUT score, and patients were categorized as at nutritional risk (CONUT ≥2) or normal (CONUT 0-1). Primary outcomes were postoperative infection and length of hospitalization. Multivariate logistic regression was used to adjust for confounding variables including age, body mass index (BMI), American Society of Anesthesiologists (ASA) score, Visual Analog Scale (VAS), hemoglobin, C-reactive protein (CRP), and surgery duration.
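The CONUT score combines serum albumin, total lymphocyte count, and total cholesterol into a single value from 0 to 12. The abstract does not list the scoring bands, so the thresholds below follow the commonly cited definition of the score and should be read as an assumption. Only the risk grouping (0-1 normal, 2 or more at nutritional risk) is taken from the abstract. Function names are hypothetical.

```python
def conut_score(albumin_g_dl: float, lymphocytes_per_mm3: float,
                cholesterol_mg_dl: float) -> int:
    """CONUT score using the commonly cited bands (assumed, not from the abstract)."""
    if albumin_g_dl >= 3.5:
        albumin_pts = 0
    elif albumin_g_dl >= 3.0:
        albumin_pts = 2
    elif albumin_g_dl >= 2.5:
        albumin_pts = 4
    else:
        albumin_pts = 6

    if lymphocytes_per_mm3 >= 1600:
        lymph_pts = 0
    elif lymphocytes_per_mm3 >= 1200:
        lymph_pts = 1
    elif lymphocytes_per_mm3 >= 800:
        lymph_pts = 2
    else:
        lymph_pts = 3

    if cholesterol_mg_dl >= 180:
        chol_pts = 0
    elif cholesterol_mg_dl >= 140:
        chol_pts = 1
    elif cholesterol_mg_dl >= 100:
        chol_pts = 2
    else:
        chol_pts = 3

    return albumin_pts + lymph_pts + chol_pts


def at_nutritional_risk(score: int) -> bool:
    """Grouping used in the abstract: CONUT >= 2 is 'at nutritional risk'."""
    return score >= 2
```

For example, a made-up patient with albumin 3.2 g/dL, 1000 lymphocytes/mm3, and cholesterol 150 mg/dL would score 2 + 2 + 1 = 5 and fall into the at-risk group.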

RESULTS: Of the patients, 27.2% (n=73) were at nutritional risk. These patients had significantly higher rates of postoperative infection (11% vs. 3.1%, p=0.010) and longer hospital stays (5.5±1.7 vs. 1.5±0.5 days, p<0.001). A higher CONUT score was independently associated with increased risk of infection (adjusted odds ratio [OR]=4.12; 95% confidence interval [CI]: 1.33-12.7; p=0.014) and prolonged hospitalization (adjusted OR=4.03; 95% CI: 3.75-4.30; p<0.001).

CONCLUSION: The CONUT score is a valuable tool for preoperative risk assessment in TKA. High CONUT scores are associated with an increased risk of postoperative infection and prolonged hospitalization. Routine nutritional assessment using the CONUT score prior to surgery in older adults may help improve surgical outcomes, reduce complications and lower healthcare costs.

PMID:40783992 | DOI:10.52312/jdrs.2025.2412

Hidden blood loss in anterior cervical discectomy and fusion with zero-profile anchored spacer for the treatment of cervical radiculopathy

Jt Dis Relat Surg. 2025 Jul 21;36(3):555-561. doi: 10.52312/jdrs.2025.2371. Epub 2025 Jul 21.

ABSTRACT

OBJECTIVES: This study aims to evaluate the hidden blood loss (HBL) and its possible risk factors after anterior cervical discectomy and fusion (ACDF) with zero-profile anchored spacer (ZPAS) in patients with cervical radiculopathy.

PATIENTS AND METHODS: Between January 2017 and January 2024, a total of 92 patients (44 males, 48 females; mean age: 73.2±10.0 years; range, 44 to 85 years) who underwent ACDF with ZPAS were retrospectively analyzed. Data collection encompassed baseline demographics including age, sex, height, weight, body mass index (BMI), disease duration, symptomatic laterality, and comorbidities and perioperative parameters such as the American Society of Anesthesiologists (ASA) score, operative levels, surgical time, intraoperative blood loss, and postoperative drainage volume. The HBL was quantified using the Sehat formula. Subsequent multivariate linear regression modeling was employed to identify independent predictors of HBL.
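Hidden blood loss is typically obtained by estimating total blood loss from the perioperative fall in hematocrit and subtracting the loss that was actually measured. The sketch below follows the approach commonly used in the HBL literature (Nadler's estimate of blood volume combined with the Gross formula). The abstract does not spell out the exact calculation, so treat this as an assumption. Function and parameter names are hypothetical.

```python
def blood_volume_l(height_m: float, weight_kg: float, sex: str) -> float:
    """Nadler estimate of patient blood volume in litres."""
    if sex == "male":
        return 0.3669 * height_m ** 3 + 0.03219 * weight_kg + 0.6041
    return 0.3561 * height_m ** 3 + 0.03308 * weight_kg + 0.1833


def hidden_blood_loss_ml(height_m: float, weight_kg: float, sex: str,
                         hct_pre: float, hct_post: float,
                         measured_loss_ml: float) -> float:
    """HBL = estimated total blood loss - measured (visible) loss, in mL.

    Hematocrit values are fractions (e.g. 0.40). Measured loss is assumed to be
    intraoperative blood loss plus postoperative drainage.
    """
    volume_ml = blood_volume_l(height_m, weight_kg, sex) * 1000
    hct_mean = (hct_pre + hct_post) / 2
    total_loss_ml = volume_ml * (hct_pre - hct_post) / hct_mean
    return total_loss_ml - measured_loss_ml


# Made-up patient: 1.60 m, 60 kg woman, Hct 0.40 -> 0.36, 70 mL measured loss.
example_hbl = hidden_blood_loss_ml(1.60, 60.0, "female", 0.40, 0.36, 70.0)
```

Because the estimate depends on the hematocrit difference, the timing of the postoperative sample and any transfusion or fluid shifts affect the result, which is why such calculations are usually reported with their measurement protocol.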

RESULTS: The mean surgical time was 152.6±27.6 min. The mean total blood loss (TBL) and HBL were 334.6±67.7 mL and 268.1±69.0 mL, respectively. Correlation analyses revealed significant associations between HBL and symptomatic laterality, hematocrit (Hct) loss, surgical levels, and surgical time (p<0.05). Multivariate linear regression further confirmed Hct loss, surgical levels, and surgical time as positive predictors of HBL (p<0.05).

CONCLUSION: Patients with cervical radiculopathy who underwent ACDF with ZPAS had significant perioperative HBL. Greater Hct loss, more surgical levels, and longer surgical time were independent risk factors for increased HBL.

PMID:40783987 | DOI:10.52312/jdrs.2025.2371

Hidden blood loss of percutaneous vertebroplasty in the treatment of spinal metastases of breast cancer

Jt Dis Relat Surg. 2025 Jul 21;36(3):535-542. doi: 10.52312/jdrs.2025.2393. Epub 2025 Jul 21.

ABSTRACT

OBJECTIVES: The aim of this study was to evaluate hidden blood loss (HBL) and to identify its possible risk factors after percutaneous vertebroplasty (PVP) in patients with spinal metastases from breast cancer.

PATIENTS AND METHODS: Between January 2020 and January 2024, a total of 54 female patients (mean age: 65.3±7.9 years, range, 47 to 79 years) with breast cancer and vertebral metastases who underwent PVP were retrospectively analyzed. Patient data were collected including demographic characteristics, oncological profiles, laboratory parameters, particularly pre- and postoperative hematocrit (Hct) levels, and clinical variables. The Sehat equation was employed to quantify HBL based on Hct alterations. To identify significant predictors of HBL, a multiple linear regression analysis of potential risk factors was carried out.

RESULTS: The mean surgical time was 32.0±8.5 min. Cement leakage occurred in 44.4% of cases. The mean hemoglobin (Hb) loss and Hct loss were 0.9±0.4 g/dL and 2.8±0.6%, respectively. The mean HBL was 287.2±57.4 mL. Multiple linear regression analysis showed that HBL was positively correlated with bone metastasis (p=0.010), surgical time (p=0.009), number of punctures (p=0.036), cement leakage (p=0.026), Hct loss (p=0.020), and total blood loss (TBL) (p<0.001), while it was negatively correlated with postoperative Hct (p=0.024).

CONCLUSION: Bone metastasis, surgical time, number of punctures, cement leakage, Hct loss, and TBL are independent risk factors for HBL. Therefore, HBL warrants clinical attention in patients with spinal metastases from breast cancer undergoing PVP, particularly those with these risk factors.

PMID:40783985 | DOI:10.52312/jdrs.2025.2393