Categories
Nevin Manimala Statistics

Longitudinal Synthetic Data Generation by Artificial Intelligence to Accelerate Clinical and Translational Research in Breast Cancer

JCO Clin Cancer Inform. 2025 Nov;9:e2500033. doi: 10.1200/CCI-25-00033. Epub 2025 Nov 6.

ABSTRACT

PURPOSE: Real-world data (RWD) are critical for breast cancer (BC) research but are limited by privacy concerns, missing information, and data fragmentation. This study explores synthetic data (SD) generated through advanced generative models to address these challenges and create harmonized longitudinal data sets.

METHODS: A data set of 1052 patients with human epidermal growth factor receptor 2-positive and triple-negative BC from the Informatics for Integrating Biology and the Bedside (i2b2) platform was used. Advanced generative models, including generative adversarial networks (GANs), variational autoencoders (VAEs), and language models (LMs), were applied to generate synthetic longitudinal data sets replicating disease progression, treatment patterns, and clinical outcomes. The Synthetic Validation Framework (SAFE) powered by Train was used to evaluate the fidelity, utility, and privacy. SD were tested across three settings: (1) integration with i2b2 for privacy-preserving data sets; (2) multistate disease modeling to predict clinical outcomes; and (3) generation of synthetic control groups for clinical trials.

RESULTS: The synthetic data sets exhibited high fidelity (score 0.94) and ensured privacy, with temporal patterns validated through time-series analyses and Uniform Manifold Approximation and Projection embeddings. In setting A, SD accurately mirrored RWD on the i2b2 platform while maintaining privacy. In setting B, incorporating SD improved the predictive performance of a multistate disease progression model, increasing the C-index by up to 10%. In setting C, SD replicated the end points of the APT trial, demonstrating its feasibility for generating synthetic control arms with preserved statistical properties of the real data set.

CONCLUSION: AI-generated longitudinal SD effectively address key challenges in RWD use in BC. This approach can improve translational research and clinical trial design while ensuring robust privacy protection. Integration with platforms such as i2b2 highlights their scalability and potential for broader applications in oncology.

PMID:41197110 | DOI:10.1200/CCI-25-00033
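
The RESULTS above report model performance as a C-index. As a reading aid, here is a minimal sketch of Harrell's concordance index for right-censored data. The abstract does not say how the C-index was computed for the multistate model, so the pairwise definition, the function name, and the toy inputs are all assumptions for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    times       -- observed follow-up time per patient
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; higher means earlier expected event
    (Hypothetical helper, not taken from the study.)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (assumed): four patients, one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.5, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives some context for a reported improvement of up to 10%.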

Recurrence rate of premalignant and early malignant lesions of the gastrointestinal tract following endoscopic submucosal dissection: a single-centre cohort

N Z Med J. 2025 Nov 7;138(1625):13-19. doi: 10.26635/6965.7129.

ABSTRACT

AIM: Endoscopic submucosal dissection (ESD) has become a well-established treatment option for premalignant and early malignant lesions of the gastrointestinal tract. This study aimed to evaluate the recurrence rate following ESD in a single tertiary centre cohort of patients.

METHODS: All consecutive patients who received ESD treatment for premalignant or early malignant lesions by a single endoscopist (AS) at Middlemore Hospital from 11 February 2019 to 6 October 2023 were included in this retrospective cohort study. The primary outcome was recurrence rate of premalignant and early malignant lesions of the gastrointestinal tract following ESD. Recurrence was defined as confirmed neoplasm on histopathology on first follow-up surveillance endoscopy. The target recurrence rate was less than or equal to 5%. Secondary outcome was recurrence stratified by location of the lesion, lesion size, en bloc resection status, R0 resection status and histopathological type of lesion.

RESULTS: A total of 119 ESD procedures were completed during the study time frame, with 91 having a surveillance endoscopy with a median time of 231 days. Twenty-eight cases did not have surveillance endoscopy completed. Three (3.3%) had recurrence of disease, of which two were oesophageal squamous cell carcinoma and one was rectal sessile serrated adenoma. We were unable to ascertain any statistically significant associations with regard to our secondary outcome variables.

CONCLUSION: This study supports the efficacy of ESD in our centre as a curative treatment modality for premalignant and early malignant gastrointestinal lesions, demonstrating a recurrence rate within the acceptable international benchmark.

PMID:41197092 | DOI:10.26635/6965.7129
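
The recurrence rate above comes from 3 recurrences among 91 surveilled procedures and is compared with a 5% target. The abstract reports no confidence interval, so the following is only a sketch of how one might be attached, using the Wilson score interval. The choice of interval and the function name are assumptions.

```python
import math
from statistics import NormalDist


def wilson_interval(events, total, confidence=0.95):
    """Wilson score interval for a binomial proportion (hypothetical helper)."""
    if total <= 0 or not 0 <= events <= total:
        raise ValueError("need 0 <= events <= total and total > 0")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    p = events / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total + z ** 2 / (4 * total ** 2)
    ) / denom
    return centre - half_width, centre + half_width


# Counts taken from the RESULTS paragraph: 3 recurrences in 91 procedures.
lower, upper = wilson_interval(3, 91)
target = 0.05  # benchmark stated in METHODS
upper_bound_meets_target = upper <= target
```

Comparing the upper bound with the target, rather than the point estimate alone, shows how much uncertainty a cohort of this size leaves.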

Global prevalence of stroke-associated pneumonia: a systematic review and meta-analysis of cross-sectional studies

Top Stroke Rehabil. 2025 Nov 6:1-16. doi: 10.1080/10749357.2025.2585118. Online ahead of print.

ABSTRACT

BACKGROUND: Neurological damage resulting from stroke can impair the respiratory system, leaving stroke patients susceptible to pulmonary complications, such as stroke-associated pneumonia (SAP). Due to the substantial mortality risk associated with SAP and its complications, we conducted a systematic review and meta-analysis to estimate the global prevalence of SAP.

METHOD: A comprehensive search of MEDLINE via PubMed, EMBASE, Scopus, and Web of Science was conducted from 1 January 2014 to 7 August 2024. Cross-sectional studies were selected. The primary study outcome was the prevalence of SAP. Subgroup analyses were performed, and a random-effects model was used for the meta-analysis. The Joanna Briggs Institute tool for prevalence studies was used to assess risk of bias in the included studies.

RESULT: The present systematic review and meta-analysis integrated data from 24 studies, encompassing 4,272,805 participants. The prevalence of SAP was 18.3% (95% CI: 13.7-23.0). Prevalence estimates showed significant heterogeneity among studies (I² = 99.74%, P < 0.001). Subgroup analysis by country revealed a substantial reduction in heterogeneity within one specific subgroup (I² = 0.00, P = 0.67). Begg's and Egger's tests showed no statistically significant evidence of publication bias (P = 0.9013 and P = 0.8398, respectively). The trim-and-fill analysis did not impute additional studies, suggesting that publication bias is unlikely. A leave-one-out sensitivity analysis revealed that excluding the study by Asgedom et al. resulted in the largest change in the overall prevalence of SAP.

CONCLUSION: Our results emphasized the need to effectively identify and manage risk factors to reduce the likelihood of SAP.

PMID:41197074 | DOI:10.1080/10749357.2025.2585118
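
The pooled prevalence and I² statistic above come from a random-effects meta-analysis. The sketch below shows one common way to compute both, using the DerSimonian-Laird estimator on raw proportions. The abstract names neither the estimator nor any transformation of the proportions, so both choices, the function name, and the toy inputs are assumptions.

```python
import math


def pool_prevalence_random_effects(studies):
    """DerSimonian-Laird pooling of raw proportions (hypothetical helper).

    studies -- list of (cases, sample_size) tuples; every proportion must be
               strictly between 0 and 1 so that its variance is non-zero.
    Returns (pooled prevalence, (ci_low, ci_high), I-squared in percent).
    """
    props = [cases / n for cases, n in studies]
    variances = [p * (1 - p) / n for p, (_, n) in zip(props, studies)]
    weights = [1 / v for v in variances]  # fixed-effect weights

    fixed = sum(w * p for w, p in zip(weights, props)) / sum(weights)
    q = sum(w * (p - fixed) ** 2 for w, p in zip(weights, props))  # Cochran's Q
    df = len(studies) - 1

    # Between-study variance, truncated at zero.
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)

    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * p for w, p in zip(re_weights, props)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Toy inputs (assumed, not the 24 included studies).
toy_studies = [(10, 100), (50, 100)]
pooled, ci, i2 = pool_prevalence_random_effects(toy_studies)
```

With heterogeneity as high as reported here, the random-effects weights become nearly equal across studies, so very large studies no longer dominate the pooled estimate.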

HIV Screening Practices Among Youth Tested for Other Sexually Transmitted Infections in Pediatric Primary Care

WMJ. 2025;124(4):371-374.

ABSTRACT

INTRODUCTION: HIV remains a significant public health concern. In Wisconsin, new cases increased by 36% during 2020 through 2022, and 22% were 13 to 24 years old. Despite recommendations for routine HIV screening, youth testing remains inadequate. This study aimed to understand HIV screening practices among youth receiving care in pediatric primary care clinics in southeastern Wisconsin.

METHODS: Clinic HIV testing rates were measured in patients aged 12 to 26 undergoing gonorrhea and/or chlamydia testing at pediatric primary care clinics affiliated with a not-for-profit children’s hospital.

RESULTS: Youth HIV testing rates at all clinic sites were low (median 19.7%) ranging from 13.2% to 36.1%. Higher rates were seen in clinics with higher rates of sexually transmitted infections.

CONCLUSIONS: Interventions are needed to enhance HIV testing rates in pediatric primary care clinics.

PMID:41197057

The Relative Impact of Risk Factors for Homelessness, Housing Barriers, and Health Care Barriers on Mental Health Outcomes: A Single-Center Study

WMJ. 2025;124(4):357-363.

ABSTRACT

BACKGROUND: Housing and health care both play crucial roles in overall health. Though housing and health care barriers negatively affect health, little is known about the relative influence of each. This study sought to understand the relationship between housing circumstance, barriers to care, and mental health outcomes among low-income, uninsured patients seen at a free clinic in Milwaukee, Wisconsin. This includes investigating the relative impact of risk factors for homelessness, housing barriers, and health care barriers on mental health.

METHODS: Surveys were administered to clinic patients (n = 94) from June to December 2023. Surveys assessed patient demographics, housing and health care barriers, and mental health outcomes, primarily measured by the Patient Health Questionnaire-2 (PHQ-2), Generalized Anxiety Disorder-2 (GAD-2) questionnaire, modified loneliness scale, and individuals’ subjective mental health rating.

RESULTS: Increased health care barriers and socioenvironmental risk factors for homelessness significantly predicted worse PHQ-2 score, GAD-2 score, loneliness, and mental health rating. Despite significant associations, increased housing barriers did not significantly predict any of the 4 mental health metrics. Furthermore, neither housing barriers nor health care barriers significantly predicted recreational drug use, whereas socioenvironmental risk factors for homelessness both significantly predicted and were significantly predicted by increased recreational drug use. The most frequently reported mental health care barriers were insurance coverage, financial barriers, and transportation issues. In addition, there was significantly lower patient trust in mental health care providers than in general medical providers, which may reflect increased stigma.

CONCLUSIONS: Compared to housing barriers, increased health care barriers significantly predicted worse mental health outcomes. This study emphasizes the importance of addressing health care barriers to improve mental health.

PMID:41197054

Sarcopenic obesity and risk of falls: findings in Middle-aged and older Chinese population from CHARLS

Aging Clin Exp Res. 2025 Nov 6;37(1):314. doi: 10.1007/s40520-025-03215-0.

ABSTRACT

BACKGROUND: Sarcopenic obesity (SO) is increasingly recognized as a significant health concern, particularly among older populations. Existing literature indicates that SO elevates the risk for various adverse health outcomes such as cardiovascular diseases, fractures, and higher all-cause mortality. However, evidence regarding its impact on the risk of falls remains limited and inconclusive. Our study aimed to investigate the association between SO and fall incidents.

METHODS: A total of 10,905 participants were enrolled from the baseline survey of the China Health and Retirement Longitudinal Study (CHARLS) 2015 wave. Participants were categorized into four groups according to sarcopenia and obesity status, with the neither sarcopenia nor obesity group serving as the reference. Logistic regression was used to evaluate the cross-sectional association between SO and falls. Furthermore, we tracked fall incidents reported in follow-up surveys conducted in the CHARLS 2018 and 2020 waves. Cox regression analysis was performed to explore how SO affected the risk of falls. Stratified Cox analyses by age (<60 vs. ≥60 years) were also performed.

RESULTS: In the cross-sectional analysis (2015), the SO group [OR (95% CI): 1.84 (1.42 ~ 2.37), P < 0.01] showed a higher risk of falls compared to the reference group; however, this association was not statistically significant after adjusting for potential confounding factors. In the longitudinal analysis (2015-2020), the SO group [HR (95%CI): 2.78 (2.03 ~ 3.80), P < 0.01] had a significantly increased risk of falls. The results remained similar after adjusting for age, sex [HR (95%CI): 1.44 (1.02 ~ 2.04), P < 0.05], and additional covariates [HR (95%CI): 1.43 (1.00 ~ 2.03), P < 0.05]. Notably, stratified Cox models showed that SO was significantly associated with fall risk in both age groups, with a stronger effect observed in participants under 60 years [HR (95%CI): 2.85 (1.13 ~ 7.17), P < 0.05] than in those aged 60 and above [HR (95%CI): 1.64 (1.08 ~ 2.50), P < 0.05].

CONCLUSION: Sarcopenic obesity is associated with an increased risk of falls among middle-aged and older adults, especially in longitudinal analyses. Age-stratified results suggest that the impact of SO on falls may be more pronounced in the middle-aged group. Our findings support the need for early identification and targeted interventions for individuals with SO to mitigate fall-related risks in aging populations.

PMID:41196484 | DOI:10.1007/s40520-025-03215-0

DeepIMB: Imputation of non-biological zero counts in microbiome data

Genes Genomics. 2025 Nov 6. doi: 10.1007/s13258-025-01693-0. Online ahead of print.

ABSTRACT

BACKGROUND: The high prevalence of non-biological zero counts, arising from low sequencing depth and sampling variation, presents a significant challenge in microbiome data analysis. These zeros can distort taxon abundance distributions and hinder the identification of true biological signals, complicating downstream analyses.

OBJECTIVE: To address the challenges of non-biological zeros in microbiome datasets, we propose DeepIMB, a deep learning-based imputation method for microbiome data, specifically designed to accurately identify and impute non-biological zero counts while preserving biological integrity.

METHODS: DeepIMB operates in two main phases. First, it identifies non-biological zeros using a gamma-normal mixture model applied to the normalized, log-transformed taxon count matrix. Second, it imputes these zeros with a deep neural network model that integrates diverse sources of information, including taxon abundances, sample covariates, and phylogenetic distances, thereby learning complex, nonlinear relationships within microbiome data.

RESULTS: By leveraging integrated information from multiple data types, DeepIMB accurately imputes non-biological zeros while preserving true biological signals. In our two simulation studies, DeepIMB outperformed existing imputation methods in terms of mean squared error, Pearson correlation coefficient, and Wasserstein distance.

CONCLUSION: DeepIMB effectively addresses the challenges posed by non-biological zeros in microbiome data. By improving the quality of the data and the reliability of downstream analyses, DeepIMB represents a significant advancement in microbiome research methodologies.

PMID:41196474 | DOI:10.1007/s13258-025-01693-0
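
The RESULTS above compare imputation methods by mean squared error, Pearson correlation coefficient, and Wasserstein distance. The sketch below shows how these three metrics can be computed for a pair of equal-length vectors of true and imputed values. The abstract does not say over which entries or on what scale the metrics were evaluated, so the inputs, function names, and the equal-sample-size form of the Wasserstein distance are assumptions.

```python
import math


def mean_squared_error(truth, imputed):
    """Average squared difference between paired values."""
    return sum((t - i) ** 2 for t, i in zip(truth, imputed)) / len(truth)


def pearson_correlation(x, y):
    """Pearson r between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def wasserstein_1d(x, y):
    """1-Wasserstein distance between two empirical samples of equal size.

    For equal-sized samples this reduces to the mean absolute difference
    between the sorted values.
    """
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(x), sorted(y))) / len(x)


# Toy log-abundances (assumed values, not from the study).
truth = [0.0, 1.0, 2.0, 3.0]
imputed = [0.5, 1.0, 2.5, 3.0]
scores = {
    "mse": mean_squared_error(truth, imputed),
    "pearson": pearson_correlation(truth, imputed),
    "wasserstein": wasserstein_1d(truth, imputed),
}
```

The three metrics capture different things: entry-wise error, preservation of linear structure, and agreement of the overall abundance distribution.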

Investigating the predictive role of inflammatory indices in cancer metastasis

Clin Transl Oncol. 2025 Nov 6. doi: 10.1007/s12094-025-04093-8. Online ahead of print.

ABSTRACT

BACKGROUND: Early detection of metastasis in cancer patients plays a pivotal role in improving treatment outcomes and increasing patient survival. This study aimed to evaluate the predictive role of inflammatory indices, including neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), monocyte-to-lymphocyte ratio (MLR), systemic immune inflammation index (SII), and systemic inflammation response index (SIRI), in identifying metastatic status.

METHODS: In this study, 60 cancer patients were enrolled between December 2023 and June 2024. Clinicopathological data and complete blood counts (CBCs) were collected prior to treatment initiation. The Receiver Operating Characteristic (ROC) curve was used to determine the optimal cutoff values of different baseline inflammatory indices for the metastatic status analysis.

RESULTS: The levels of inflammatory indices were greater in metastatic patients than in nonmetastatic patients; however, only the SIRI was significantly different (1.04 [0.76-1.69] vs. 0.71 [0.45-1.07]; P = 0.044). ROC curve analysis revealed that the area under the curve (AUC) for the SIRI was 0.652 (95% CI 0.507-0.797). Furthermore, broader combinations of the SIRI and MLR, either individually or in conjunction with the NLR, PLR, and/or SII, yielded multi-index models with greater discriminatory power and maintained statistical significance (p < 0.05).

CONCLUSION: The findings indicate that the SIRI, in combination with the MLR, plays a significant role in predicting the metastatic status of cancer patients.

PMID:41196459 | DOI:10.1007/s12094-025-04093-8
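
The indices evaluated above are simple ratios of complete blood count components, and the analysis rests on ROC curves. The sketch below computes the five indices from their standard definitions, the AUC as a rank statistic, and a cutoff by Youden's J. The abstract does not state which criterion defined the "optimal" cutoff, so Youden's J is an assumption, as are the function names and toy values.

```python
def inflammatory_indices(neutrophils, lymphocytes, monocytes, platelets):
    """Standard index definitions; all counts in the same units (e.g. 10^9/L)."""
    return {
        "NLR": neutrophils / lymphocytes,
        "PLR": platelets / lymphocytes,
        "MLR": monocytes / lymphocytes,
        "SII": platelets * neutrophils / lymphocytes,
        "SIRI": neutrophils * monocytes / lymphocytes,
    }


def auc(positive_values, negative_values):
    """AUC as the probability that a positive case outranks a negative one."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positive_values
        for n in negative_values
    )
    return wins / (len(positive_values) * len(negative_values))


def youden_cutoff(positive_values, negative_values):
    """Cutoff maximising sensitivity + specificity - 1 (assumed criterion).

    A value >= cutoff is classified as positive (here: metastatic).
    """
    best_cut, best_j = None, -1.0
    for cut in sorted(set(positive_values) | set(negative_values)):
        sens = sum(p >= cut for p in positive_values) / len(positive_values)
        spec = sum(n < cut for n in negative_values) / len(negative_values)
        j = sens + spec - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Toy SIRI values (assumed, not patient data).
metastatic = [1.0, 1.7, 0.8, 1.2]
nonmetastatic = [0.5, 0.7, 1.1, 0.4]
siri_auc = auc(metastatic, nonmetastatic)
siri_cut, siri_j = youden_cutoff(metastatic, nonmetastatic)
```

An AUC of 0.5 indicates no discrimination, which helps put the reported SIRI value of 0.652 and its confidence interval in context.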

Proteome-wide Mendelian randomization and colocalization analysis uncovers druggable targets for lung cancer across multiple phenotypes and complications

Discov Oncol. 2025 Nov 6;16(1):2048. doi: 10.1007/s12672-025-03910-4.

ABSTRACT

BACKGROUND: Lung cancer remains a leading cause of cancer-related mortality, necessitating novel therapeutic targets. The plasma proteome represents a key source for such targets.

METHODS: Proteome-wide Mendelian randomization (MR) and colocalization analyses were conducted to assess the causal effects of plasma proteins on lung cancer subtypes and complications. Genetic instruments (cis-pQTLs) for 2,090 proteins were derived from plasma proteome data (54,306 UK Biobank and 35,559 Icelandic participants). Lung cancer phenotype data were obtained from FinnGen R10.

RESULTS: MR identified seven plasma proteins showing significant causal associations with specific lung cancer phenotypes: high GGT1 increased non-small cell lung cancer (NSCLC) risk (OR 1.27, 95% CI 1.10-1.46; PFDR = 0.0261), GFRA2 increased the SCLC risk (OR 1.65, 95% CI 1.24-2.21; PFDR = 0.0462), and higher advanced glycosylation end-product specific receptor reduced the squamous cell carcinoma risk (OR 0.338 per SD increase, 95% CI 0.209-0.548; PFDR = 0.0138). Fifteen proteins showed associations with lung cancer complications. Colocalization strongly supported causal roles for eight proteins: FKBP1B (OR 1.15, 95% CI 1.09-1.22; PFDR = 0.00264), F11 (OR 1.01, 95% CI 1.01-1.01; PFDR = 1.47 × 10⁻²³), ABO (OR 1.11, 95% CI 1.06-1.21; PFDR = 5.82 × 10⁻⁹), F2 (OR 3.04, 95% CI 1.74-5.31; PFDR = 0.0102), and VSIG10L (OR 1.006, 95% CI 1.00-1.01; PFDR = 0.0159).

CONCLUSION: This study reveals causal proteins for various lung cancer phenotypes and complications, emphasizing causal pathways and potential therapeutic targets for lung cancer and providing new insights into its etiology, prevention, treatment, and therapy.

PMID:41196451 | DOI:10.1007/s12672-025-03910-4
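
The estimates above are reported as odds ratios with FDR-adjusted P values. The sketch below illustrates two building blocks commonly used in such analyses: the Wald ratio for a single genetic instrument and the Benjamini-Hochberg adjustment. The abstract names neither the MR estimator nor the FDR procedure, so both are assumptions, as are the function names and toy numbers.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument MR estimate (assumed estimator).

    beta_exposure -- variant effect on the protein level (pQTL effect)
    beta_outcome  -- variant effect on the log-odds of disease
    se_outcome    -- standard error of beta_outcome
    Returns (log odds ratio, first-order standard error, odds ratio).
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)  # ignores uncertainty in beta_exposure
    return estimate, se, math.exp(estimate)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy inputs (assumed, not study data).
log_or, se, odds_ratio = wald_ratio(0.5, 0.1, 0.02)
p_fdr = benjamini_hochberg([0.001, 0.04, 0.03, 0.5])
```

With roughly two thousand proteins tested across several phenotypes, multiple-testing adjustment of this kind largely determines which associations are reported as significant.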

Economic and demographic influences on health expenditures: robust approaches for income and aging effects

Health Econ Rev. 2025 Nov 6;15(1):95. doi: 10.1186/s13561-025-00631-w.

ABSTRACT

BACKGROUND: Health expenditure is influenced by complex interactions between economic, demographic, and social factors, with significant variations across countries. This study aims to investigate the determinants of health expenditures by employing robust regression methods, which offer a more flexible and reliable approach to dealing with outliers and high data variation.

METHODS: This study employs robust regression methods, Weighted Least Squares (WLS) and MM-estimator regression, to examine the determinants of health expenditures. The analyses were conducted in RStudio using data from 179 countries for the year 2021.

RESULTS: The findings indicate that income and ageing are significant determinants of health expenditures, and sixteen outliers were identified. In contrast, education level, public health expenditure, and disease patterns showed no significant effect.

CONCLUSION: This study fills a gap in the literature by using robust regression methods to account for outliers and provides new insights into the role of economic and demographic factors in health expenditures.

PMID:41196444 | DOI:10.1186/s13561-025-00631-w
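
The METHODS above name Weighted Least Squares as one of the two estimators. The sketch below shows the closed-form WLS fit for a single predictor, to illustrate how down-weighting an outlying observation changes the fitted line. The study's actual model, weights, and covariates are not given in the abstract, and the analysis was run in R, so this Python version, its function name, and the toy data are assumptions. It does not implement the MM-estimator.

```python
def weighted_least_squares(x, y, weights):
    """Closed-form WLS for y = intercept + slope * x (hypothetical helper)."""
    total_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_w
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy data (assumed): the last point is an outlier.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 100.0]

ols_fit = weighted_least_squares(x, y, [1.0, 1.0, 1.0, 1.0])  # equal weights
wls_fit = weighted_least_squares(x, y, [1.0, 1.0, 1.0, 0.0])  # outlier ignored
```

With equal weights the outlier pulls the slope sharply upward, while giving it zero weight recovers the line through the remaining points. Robust estimators such as the MM-estimator choose such weights from the data rather than by hand.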