Categories
Nevin Manimala Statistics

Dynamic flexibility of the murine gut microbiota to morphine disturbance enables escape from the stable dysbiosis associated with addiction-like behavior

bioRxiv [Preprint]. 2025 Jun 1:2025.06.01.657215. doi: 10.1101/2025.06.01.657215.

ABSTRACT

Although opioids are effective analgesics, they can lead to problematic drug use behaviors that underlie opioid use disorder (OUD). Opioids also drive gut microbiota dysbiosis, which is linked to altered opioid responses tied to OUD. To interrogate the role of the gut microbiota in a mouse model of OUD, we used a longitudinal paradigm of voluntary oral morphine self-administration to capture multiple facets of drug seeking and preserve both individual behavioral response and gut microbiota variation to examine associations between these two variables. After prolonged morphine consumption, only a subset of mice transitioned to a state we define statistically as compulsive. In compulsive mice, morphine fragmented the microbiota networks, which subsequently reorganized to form robust novel connections. In contrast, the communities of non-compulsive mice also changed but were highly interconnected during morphine disturbance and maintained more continuity post-morphine, suggesting dynamic flexibility. Compulsive mice displayed a greater loss of functional diversity and a shift towards a new stable state dominated by potential pathobionts, whereas non-compulsive mice better preserved genera associated with gut health and broader functional diversity. These findings highlight how persistent and stable gut microbiota dysbiosis aligns with long-term behavioral changes underlying OUD, potentially contributing to relapse.

PMID:40501972 | PMC:PMC12154951 | DOI:10.1101/2025.06.01.657215

Chevreul: An R Bioconductor Package for Exploratory Analysis of Full-Length Single Cell Sequencing

bioRxiv [Preprint]. 2025 Jun 1:2025.05.27.656486. doi: 10.1101/2025.05.27.656486.

ABSTRACT

Chevreul is an open-source R Bioconductor package and interactive R Shiny app for processing and visualization of single cell RNA sequencing (scRNA-seq) data. It differs from other scRNA-seq analysis packages in its ease of use, its capacity to analyze full-length RNA sequencing data for exon coverage and transcript isoform inference, and its support for batch correction. Chevreul enables exploratory analysis of scRNA-seq data using Bioconductor SingleCellExperiment or Seurat objects. Simple processing functions with sensible default settings enable batch integration, quality control filtering, read count normalization and transformation, dimensionality reduction, clustering at a range of resolutions, and cluster marker gene identification. Processed data can be visualized in an interactive R Shiny app with dynamically linked plots. Expression of gene or transcript features can be displayed on PCA, tSNE, and UMAP embeddings, heatmaps, or violin plots while differential expression can be evaluated with several statistical tests without extensive programming. Existing analysis tools do not provide specialized tools for isoform-level analysis or alternative splicing detection. By enabling isoform-level expression analysis for differential expression, dimensionality reduction and batch integration, Chevreul empowers researchers without prior programming experience to analyze full-length scRNA-seq data.

DATA AVAILABILITY: A test dataset formatted as a SingleCellExperiment object can be found at https://github.com/cobriniklab/chevreuldata .

AVAILABILITY & IMPLEMENTATION: Chevreul is implemented in R and the R package and integrated Shiny application are freely available at https://github.com/cobriniklab/chevreul .

PMID:40501968 | PMC:PMC12154678 | DOI:10.1101/2025.05.27.656486

Robust statistical assessment of Oncogenotype to Organotropism translation in xenografted zebrafish

bioRxiv [Preprint]. 2025 Jun 1:2025.05.28.656734. doi: 10.1101/2025.05.28.656734.

ABSTRACT

Organotropism results from the functional versatility of metastatic cancer cells to survive and proliferate in diverse microenvironments. This adaptivity can originate in clonal variation of the spreading tumor and is often empowered by epigenetic and molecular reprogramming of cell regulatory circuits. Related to organotropic colonization of metastatic sites are environmentally-sensitive, differential responses of cancer cells to therapeutic attack. Accordingly, understanding the organotropic profile of a cancer and probing the underlying driver mechanisms are of high clinical importance. However, determining systematically the organotropism of one cancer versus the organotropism of another cancer, potentially with the granularity of comparing the same cancer type between patients or tracking the evolution of a cancer in a single patient for the purpose of personalized treatment, has remained very challenging. It requires a host organism that allows observation of the spreading pattern over relatively short experimental times. Moreover, organotropic patterns often tend to be statistically weak and superimposed by experimental variation. Thus, an assay for organotropism must give access to statistical powers that can separate ‘meaningful heterogeneity’, i.e., heterogeneity that determines organotropism, from ‘meaningless heterogeneity’, i.e., heterogeneity that causes experimental noise. Here we describe an experimental workflow that leverages the physiological properties of zebrafish larvae for an imaging-based assessment of organotropic patterns over a time-frame of 3 days. The workflow incorporates computer vision pipelines to automatically integrate the stochastic spreading behavior of a particular cancer xenograft in tens to hundreds of larvae allowing subtle trends in the colonization of particular organs to emerge above random cell depositions throughout the host organism. 
We validate our approach with positive control experiments comparing the spreading patterns of a metastatic sarcoma against non-transformed fibroblasts and the spreading patterns of two melanoma cell lines with previously established differences in metastatic propensity. We then show that integration of the spreading pattern of xenografts in 40 – 50 larvae is necessary and sufficient to generate a Fish Metastatic Atlas page that is representative of the organotropism of a particular oncogenotype and experimental condition. Finally, we apply the power of this assay to determine the function of the EWSR1::FLI1 fusion oncogene and its transcriptional target SOX6 as plasticity factors that enhance the adaptive capacity of metastatic Ewing sarcoma.

PMID:40501949 | PMC:PMC12154647 | DOI:10.1101/2025.05.28.656734

Allele Specific Expression Quality Control Fills Critical Gap in Transcriptome Assisted Rare Variant Interpretation

bioRxiv [Preprint]. 2025 Jun 8:2025.05.30.657086. doi: 10.1101/2025.05.30.657086.

ABSTRACT

Allele-specific expression (ASE) captures the functional impact of genetic variation on transcription, offering a high-resolution view of cis-regulatory effects, but its quality can be diminished by technical, biological, and analysis artifacts. We introduce aseQC, a statistical framework that quantifies sample-level ASE quality in terms of the overall expected extra-binomial variation, so that uncharacteristically noisy samples in a cohort can be excluded to improve the robustness of downstream analyses. Applying aseQC to a dataset of rare Mendelian muscular disorders successfully identified previously annotated low-quality cases, demonstrating clinical genomic utility. When applied to 15,253 samples in extensively quality-controlled GTEx project data, aseQC uncovered 563 low-quality samples that exhibit excessive allelic imbalance. We find that these samples are associated with specific processing dates but are not otherwise adequately described by any other quality control measures or metadata available in GTEx data. We show that these low-quality samples lead to 23.6- and 31.6-fold increases in ASE and splicing outliers, respectively, degrading the performance of transcriptome analysis for rare variant interpretation. In contrast, we did not observe any adverse effect associated with inclusion of these samples in common-variant analysis using quantitative trait loci mapping. By enabling quick and reliable assessment of sample quality, aseQC provides a key step for identifying subtle quality issues that remain critical for a successful analysis of rare variant effects using transcriptome data.
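
The idea of scoring a sample by its extra-binomial variation can be illustrated with a much simpler statistic than the one the abstract describes. The sketch below is not the aseQC method: it computes a Pearson dispersion of reference-allele read counts against a binomial null with an assumed reference fraction of 0.5. The function names, the cutoff of 2.0, and the toy read counts are all hypothetical.

```python
def ase_dispersion(counts, p=0.5):
    """Pearson dispersion of reference-allele counts against a binomial null.

    counts: iterable of (ref_reads, total_reads) pairs, one per heterozygous site.
    p: assumed reference-allele fraction under the null (an assumption here).
    Values near 1 are consistent with binomial sampling; larger values
    indicate extra-binomial variation.
    """
    terms = []
    for ref, total in counts:
        if total <= 0:
            continue  # skip sites with no coverage
        expected = total * p
        variance = total * p * (1 - p)
        terms.append((ref - expected) ** 2 / variance)
    if not terms:
        raise ValueError("no sites with coverage")
    return sum(terms) / len(terms)


def flag_noisy_samples(samples, threshold=2.0):
    """Return names of samples whose dispersion exceeds a hypothetical cutoff."""
    return [name for name, counts in samples.items()
            if ase_dispersion(counts) > threshold]


# Hypothetical toy data: (reference reads, total reads) per site.
samples = {
    "balanced": [(52, 100), (47, 100), (50, 100), (55, 100)],
    "noisy": [(90, 100), (15, 100), (80, 100), (20, 100)],
}

flagged = flag_noisy_samples(samples)
```

A real analysis would also have to account for genuine biological imbalance, reference mapping bias, and coverage differences between sites, none of which this sketch handles.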

PMID:40501944 | PMC:PMC12157414 | DOI:10.1101/2025.05.30.657086

MARLOWE: Taxonomic Characterization of Unknown Samples for Forensics Using De Novo Peptide Identification

bioRxiv [Preprint]. 2025 Jun 2:2024.09.30.615220. doi: 10.1101/2024.09.30.615220.

ABSTRACT

We present a computational tool, MARLOWE, for source organism characterization of unknown, forensic biological samples. The intent of MARLOWE is to address a gap in applying proteomics data analysis to forensic applications. MARLOWE produces a list of potential source organisms given confident peptide tags derived from de novo peptide sequencing and a statistical approach to assign peptides to organisms in a probabilistic manner, based on a broad sequence database. In this way, the algorithm assumes no a priori knowledge of potential sources, and the probabilistic way peptides are taxonomically assigned and then scored enables results to be unbiased (within the constraints of the sequence database). In a proof-of-concept study, we examined MARLOWE’s performance on two datasets, the Biodiversity dataset and the Bacillus cereus superspecies dataset. Not only did MARLOWE demonstrate successful characterization to true contributors in single source and binary mixtures in the Biodiversity dataset, but also provided sufficient specificity to distinguish species within a bacterial superspecies group. We also compared MARLOWE’s results to those of MiCId, a leading microbial identification/characterization tool based on proteomics database search. Comparison of the two tools using 225 mass spectrometry data files yielded comparable performance, with slightly higher accuracy and specificity for MiCId. At the species level, MARLOWE achieved a specificity of 91.4% at 5% FDR. These results suggest that MARLOWE is suitable for candidate- or lead-generation identification of single-organism and binary samples that can generate forensic leads and aid in selecting appropriate follow-on analyses in a forensic context.

PMID:40501933 | PMC:PMC12157597 | DOI:10.1101/2024.09.30.615220

Comparing phenotypic manifolds with Kompot: Detecting differential abundance and gene expression at single-cell resolution

bioRxiv [Preprint]. 2025 Jun 7:2025.06.03.657769. doi: 10.1101/2025.06.03.657769.

ABSTRACT

Kompot is a statistical framework for holistic comparison of multi-condition single-cell datasets, supporting both differential abundance and differential expression. Differential abundance captures changes in how cells populate the phenotypic manifold across conditions, while differential expression identifies condition-specific changes in gene regulation that may be localized to particular regions of that manifold. Kompot models the distribution of cells and gene expression as continuous functions over a low-dimensional representation of cell states, enabling single-cell resolution inference with calibrated uncertainty estimates. Applying Kompot to aging murine bone marrow, we identified a continuum of shifts in hematopoietic stem cell and mature cell states, transcriptional remodeling of monocytes independent of compositional changes, and divergent regulation of oxidative stress response genes across cell types. By capturing both global and cell-state-specific effects of perturbation, Kompot reveals how aging reshapes cellular identity and regulatory programs across the hematopoietic landscape. This framework is broadly applicable to dissecting condition-specific effects in complex single-cell landscapes.

PMID:40501932 | PMC:PMC12157388 | DOI:10.1101/2025.06.03.657769

Machine Learning for Missing Data Imputation in Alzheimer’s Research: Predicting Medial Temporal Lobe Flexibility

bioRxiv [Preprint]. 2025 May 27:2025.05.22.655574. doi: 10.1101/2025.05.22.655574.

ABSTRACT

BACKGROUND: Alzheimer’s disease (AD) begins years before symptoms appear, making early detection essential. The medial temporal lobe (MTL) is one of the earliest regions affected, and its network flexibility, a dynamic measure of brain connectivity, may serve as a sensitive biomarker of early decline. Cognitive (acquisition, generalization), genetic (APOE, ABCA7), and biochemical (P-tau217) markers may predict MTL dynamic flexibility. Given the high rate of missing data in AD research, this study uses machine learning with advanced imputation methods to predict MTL dynamic flexibility from multimodal predictors in an aging cohort.

METHODS: In an ongoing study at Rutgers’s Aging and Brain Health Alliance, data from 656 participants are utilized, including cognitive assessments, genetic and blood-derived biomarkers, and demographics. Due to MRI-related constraints, only 34.15% of participants had measurable MTL dynamic flexibility from resting-state fMRI. To estimate MTL dynamic flexibility from available data, we evaluated four missing data handling methods (case deletion, MICE, MissForest, and GAIN), and trained five regression models: Ridge, k-NN, SVR, regression trees (bagging, random forest, boosting), and ANN. Hyperparameters were optimized via grid search with 3-fold cross-validation. Model performance was assessed using mean absolute error (MAE), root mean squared error (RMSE), and runtime through 5-fold cross-validation repeated 25 times to ensure robustness in clinical data settings.

RESULTS: A total of 1,866 missing values (25.86%) were identified in the dataset, with only 42 complete cases (6.40%) remaining after listwise deletion, highlighting the need for effective imputation. In the initial analysis using only complete cases, support vector regression (SVR) achieved the lowest mean absolute error (MAE = 0.184), though overall performance was limited due to small sample size. In the second phase, three imputation techniques were applied, significantly improving model accuracy. MissForest combined with Random Forest produced the best results (MAE = 0.083), representing a 54.7% improvement over case deletion. Statistical analysis confirmed significant differences in performance across imputation methods (p < 0.001), with MissForest outperforming GAIN and MICE. GAIN was the fastest imputation method.

DISCUSSION: The findings underscore the importance of using robust imputation strategies to maximize data utility and model reliability in studies with high missingness. Further research is needed, particularly incorporating additional neuroimaging measures, to localize the brain regions most affected by biomarker-driven changes and to refine predictive models for clinical applications.
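
The evaluation scheme named in METHODS (MAE and RMSE averaged over 5-fold cross-validation repeated 25 times) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's code: the helper names, the fixed seed, the toy target values, and the mean-predictor baseline are hypothetical stand-ins for the imputation and regression models listed in the abstract.

```python
import math
import random


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def repeated_kfold(n, k=5, repeats=25, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k near-equal folds
        for i in range(k):
            test = folds[i]
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, test


def evaluate(y, fit_predict, k=5, repeats=25):
    """Average MAE and RMSE of `fit_predict(train_idx, test_idx)` over all splits."""
    maes, rmses = [], []
    for train, test in repeated_kfold(len(y), k, repeats):
        preds = fit_predict(train, test)
        truth = [y[i] for i in test]
        maes.append(mae(truth, preds))
        rmses.append(rmse(truth, preds))
    return sum(maes) / len(maes), sum(rmses) / len(rmses)


# Hypothetical target values standing in for MTL dynamic flexibility.
y = [0.1 * i for i in range(20)]


def mean_baseline(train, test):
    """Placeholder model: predict the training mean for every test case."""
    mean = sum(y[i] for i in train) / len(train)
    return [mean] * len(test)


avg_mae, avg_rmse = evaluate(y, mean_baseline)
```

To avoid leaking information from held-out data, any imputation step would normally be fitted on the training indices of each split only. The abstract does not say how the study handled this.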

PMID:40501856 | PMC:PMC12154674 | DOI:10.1101/2025.05.22.655574

Comparative effect of protein dosage on nitrogen balance and health outcomes in critically ill patients

J Pak Med Assoc. 2025 May;75(5):699-703. doi: 10.47391/JPMA.20753.

ABSTRACT

OBJECTIVE: To compare the impact of two different doses of proteins on nitrogen balance and clinical outcomes in critically ill patients.

METHODS: The randomised clinical trial was conducted from November 2020 to May 2021 at the intensive care unit of Shifa International Hospital, Islamabad, Pakistan, and comprised critically ill adult patients of either gender at nutritional risk. They were divided into Group I receiving 1g per kilogramme body weight of protein, and Group II receiving 2g per kilogramme body weight of protein. Sequential Organ Failure Assessment scores were calculated for each case. Data was analysed using SPSS 23.

RESULTS: Of the 88 patients, 45(51.13%) were in Group I; 28(62.2%) males and 17(37.8%) females with mean age 61±3.5 years. There were 43(48.86%) patients in Group II; 30(69.8%) males and 13(30.2%) females with mean age 64.4±11.6 years (p>0.05). There was no significant difference in nitrogen balance between the groups on day 1 (p=0.381). However, by the discharge day, nitrogen balance was significantly improved in Group II compared to Group I (p=0.001). There was a statistically weak negative relationship between nitrogen balance and Sequential Organ Failure Assessment score (r=-0.131). Nitrogen balance had no significant relationship with the number of ventilated days (r=-0.002), intensive care unit days (r=0.043) and length of hospital stay (r=0.089).

CLINICAL TRIAL REGISTRATION: ClinicalTrials.gov, NCT04468503.

CONCLUSIONS: Nitrogen balance was significantly better in the critically ill patients who received 2g protein per kilogramme body weight compared to those receiving 1g protein per kilogramme body weight.
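
To put the reported correlation coefficients in context, the sketch below computes a correlation coefficient by hand and converts the reported r = -0.131 into a share of explained variance. It assumes the reported r values are Pearson coefficients, which the abstract does not state. The function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Reported coefficient between nitrogen balance and SOFA score.
reported_r = -0.131
shared_variance = reported_r ** 2  # about 0.017, i.e. under 2% of variance
```

Under that assumption, nitrogen balance and the Sequential Organ Failure Assessment score share under 2% of their variance, which is consistent with the abstract calling the relationship weak.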

PMID:40500809 | DOI:10.47391/JPMA.20753

Laboratory-confirmed respiratory syncytial virus (RSV) hospitalizations: a national all ages cross-section evaluation, 2020-2024

Isr J Health Policy Res. 2025 Jun 11;14(1):36. doi: 10.1186/s13584-025-00693-5.

ABSTRACT

BACKGROUND: New vaccines and monoclonal antibody (mAb) against respiratory syncytial virus (RSV) were recently approved for adults and infants, respectively. However, their inclusion in national vaccination programs has been slow. Accurate assessment of RSV disease burden among all ages is essential for the global introduction of these agents.

METHODS: We evaluated all-ages burden of RSV hospitalizations, from 2020 to 2024, based on data collected by a new national laboratory-based hospital surveillance system. RSV-positive respiratory samples from patients hospitalized in general hospitals nationwide were reported. Data were analyzed by RSV circulation periods and age-group to determine hospitalization rates and 30-day mortality (30-DM) rates. We compared the laboratory-confirmed hospitalization rates with rates previously calculated based on ICD-9 codes.

RESULTS: RSV-confirmed hospitalizations were reported for all age-groups. The highest RSV hospitalization rates were found among patients < 1 year old. Patients ≥ 60 years old had the highest RSV hospitalization rates among patients ≥ 5 years old, and their 30-DM rates reached 14.7%, exceeding those of influenza. During the COVID-19 pandemic, lower rates of RSV-confirmed hospitalizations were reported among patients ≥ 60 years old, probably due to higher adherence to social distancing measures. We found higher numbers and rates of laboratory-confirmed hospitalizations among all age-groups ≥ 1 year old than those previously reported by our group based on ICD-9 codes.

CONCLUSIONS: Laboratory-confirmation of RSV is paramount for optimal assessment of RSV hospitalization burden, particularly beyond infancy, and for the global adoption of newly developed vaccines and mAb.

PMID:40500807 | DOI:10.1186/s13584-025-00693-5

Wastewater-based epidemiological study on helminth egg detection in untreated sewage sludge from Brazilian regions with unequal income

Infect Dis Poverty. 2025 Jun 11;14(1):46. doi: 10.1186/s40249-025-01314-8.

ABSTRACT

BACKGROUND: Helminthiases are neglected diseases that affect billions of people worldwide, particularly those with inadequate sanitation, poor hygiene practices, and limited access to clean water. Due to frequent underreporting, wastewater-based epidemiology has emerged as a valuable tool for monitoring parasitic infections at population-level. This study aimed to detect and quantify helminth eggs in untreated sewage sludge from eight wastewater treatment plants located in different Brazilian socioeconomic regions.

METHODS: The study was conducted from June 2021 to December 2023 in Goiás and Federal District, the Brazilian federative unit with the highest income inequality. Samples were collected bimonthly (n = 121). Helminth eggs were recovered using centrifugation and flotation with a ZnSO4 solution (d = 1.30 g/ml). After 21-28 days of incubation in sulfuric acid, viable eggs were identified and counted using a Sedgewick-Rafter Chamber under an optical microscope. Statistical analyses included One-way analysis of variance (ANOVA) followed by Tukey’s multiple comparisons test to evaluate differences in helminth egg counts between low-, medium- and high-income regions.

RESULTS: Twelve helminth genera were identified, revealing significant differences in prevalence and diversity across socioeconomic strata. Cestode eggs, particularly Hymenolepis spp. (44.28%), were the most prevalent overall. Trematode eggs were less frequent but exhibited greater taxonomic diversity. Sludge from low-income areas had the highest egg concentration [16.61 ± 3.02 eggs per gram of dry mass (eggs/g DM)], nearly five times greater than in high-income areas such as Brasília Norte (3.56 ± 0.55 eggs/g DM; P = 8.8 × 10⁻⁹). Ascaris spp. (19.27%) and Trichuris spp. (7.90%) predominated in low-income areas. Medium-income regions showed intermediate values, with notable regional variation.

CONCLUSIONS: Our results demonstrate that helminth egg diversity and concentration in sewage sludge are closely related to the socioeconomic characteristics of the served population. These findings may inform prevention and control strategies in vulnerable areas and support the development of public health and sanitation policies that address social and environmental inequalities in Brazil’s Central-Western region.
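
The one-way ANOVA named in METHODS compares variation between income groups with variation within them. The sketch below computes only the F statistic and its degrees of freedom. It omits the p-value and the Tukey follow-up test, which need the F and studentized range distributions (available in statistical libraries such as SciPy). The function name and the toy egg counts are hypothetical and are not data from the study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("F is undefined when all groups have zero variance")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical eggs/g DM values for high-, medium- and low-income regions.
high = [1.0, 2.0, 3.0]
medium = [2.0, 3.0, 4.0]
low = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([high, medium, low])
```

A large F only indicates that at least one group mean differs. Identifying which pairs of income strata differ is the role of the Tukey multiple comparisons test mentioned in the abstract.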

PMID:40500806 | DOI:10.1186/s40249-025-01314-8