Categories
Nevin Manimala Statistics

Stacked regressions and structured variance partitioning for interpretable brain maps

bioRxiv. 2023 Apr 24:2023.04.23.537988. doi: 10.1101/2023.04.23.537988. Preprint.

ABSTRACT

Relating brain activity associated with a complex stimulus to different properties of that stimulus is a powerful approach for constructing functional brain maps. However, when stimuli are naturalistic, their properties are often correlated (e.g., visual and semantic features of natural images, or different layers of a convolutional neural network that are used as features of images). Correlated properties can act as confounders for each other and complicate the interpretability of brain maps, and can impact the robustness of statistical estimators. Here, we present an approach for brain mapping based on two proposed methods: stacking different encoding models and structured variance partitioning. Our stacking algorithm combines encoding models that each use as input a feature space that describes a different stimulus attribute. The algorithm learns to predict the activity of a voxel as a linear combination of the outputs of different encoding models. We show that the resulting combined model can predict held-out brain activity better than, or at least as well as, the individual encoding models. Further, the weights of the linear combination are readily interpretable; they show the importance of each feature space for predicting a voxel. We then build on our stacking models to introduce structured variance partitioning, a new type of variance partitioning that takes into account the known relationships between features. Our approach constrains the size of the hypothesis space and allows us to ask targeted questions about the similarity between feature spaces and brain regions even in the presence of correlations between the feature spaces. We validate our approach in simulation, showcase its brain mapping potential on fMRI data, and release a Python package. Our methods can be useful for researchers interested in aligning brain activity with different layers of a neural network, or with other types of correlated feature spaces.
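The stacking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' released package: the function names are hypothetical, and it assumes the weights are constrained to be non-negative and to sum to one, and that the predictions come from data not used to fit the individual encoding models. Under that constraint the mean squared error of the combined prediction equals w^T P w, where P is the matrix of mean products of the per-model residuals, so the weights can be found by projected gradient descent on the simplex.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # keeps the value for the largest valid j
    return [max(x - theta, 0.0) for x in v]


def stacking_weights(y, predictions, n_iter=500):
    """Weights (one per encoding model) minimising the squared error of
    the weighted sum of predictions for a single voxel.

    y           : observed responses of one voxel
    predictions : one list of predicted responses per feature space
    """
    n, k = len(y), len(predictions)
    # Residual of each individual model.
    residuals = [[y[i] - p[i] for i in range(n)] for p in predictions]
    # P[r][c] = mean product of residuals of models r and c.
    P = [[sum(a * b for a, b in zip(residuals[r], residuals[c])) / n
          for c in range(k)] for r in range(k)]
    w = [1.0 / k] * k
    trace = sum(P[j][j] for j in range(k))
    if trace == 0.0:  # every model is already perfect
        return w
    step = 1.0 / (2.0 * trace)  # safe step: gradient Lipschitz constant <= 2 * trace
    for _ in range(n_iter):
        grad = [2.0 * sum(P[r][c] * w[c] for c in range(k)) for r in range(k)]
        w = project_to_simplex([w[r] - step * grad[r] for r in range(k)])
    return w
```

Read this way, a weight close to one for a feature space means that its encoding model alone accounts for most of what the stacked model predicts for that voxel.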

PMID:37163111 | PMC:PMC10168225 | DOI:10.1101/2023.04.23.537988

A benchmark study on current GWAS models in admixed populations

bioRxiv. 2023 Apr 30:2023.04.27.538299. doi: 10.1101/2023.04.27.538299. Preprint.

ABSTRACT

OBJECTIVE: The performance of popular genome-wide association study (GWAS) models has not yet been examined in a consistent manner under genetic admixture, which introduces several challenges, such as heterogeneity of minor allele frequency (MAF), a wide spectrum of case-control ratios, and varying effect sizes.

METHODS: We generated a cohort of synthetic individuals (N=19,234) that simulates 1) a large sample size; 2) two-way admixture [Native American-European ancestry]; and 3) a binary phenotype. We then examined the inflation factors produced by three popular GWAS tools: GMMAT, SAIGE, and Tractor. We also performed power calculations under different MAFs, case-control ratios, and ancestry percentages. Then, we employed a cohort of Peruvians (N=249) to further examine the performance of the tested models on 1) real genetic data and 2) small sample sizes. Finally, we validated these findings using an independent Peruvian cohort (N=109) included in the 1000 Genomes Project (1000G).
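For readers unfamiliar with inflation factors, the following is a minimal sketch of the usual genomic inflation factor (lambda), assuming each variant yields a 1-degree-of-freedom chi-square statistic. The abstract does not state how the factors were computed, so the function name and this definition are assumptions rather than the authors' pipeline.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Ratio of the observed median test statistic to its expected
    median under the null; values well above 1 suggest inflation."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
```

A value near 1 indicates well-calibrated statistics, which is the sense in which "severely inflated statistics" are reported for small samples.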

RESULTS: In the synthetic cohort, SAIGE performed better than GMMAT and Tractor in terms of type I error rate, especially under severely unbalanced case-control ratios. In contrast, power analysis identified Tractor as the best method for pinpointing ancestry-specific causal variants, although its power decreased when no adequate heterogeneity of the true effect sizes was simulated between ancestries. The real Peruvian data showed that Tractor is severely affected by small sample sizes and produced severely inflated statistics, a finding we replicated in the 1000G Peruvian cohort.

DISCUSSION: The current study illustrates the limitations of available GWAS tools under different scenarios of genetic admixture. We urge caution when interpreting results under complex population scenarios.

PMID:37163101 | PMC:PMC10168347 | DOI:10.1101/2023.04.27.538299

Genome-wide analysis of CRISPR perturbations indicates that enhancers act multiplicatively and without epistatic-like interactions

bioRxiv. 2023 Apr 27:2023.04.26.538501. doi: 10.1101/2023.04.26.538501. Preprint.

ABSTRACT

A single gene may be regulated by multiple enhancers, but how they work in concert to regulate transcription is poorly understood. Prior studies have mostly examined enhancers at single loci and have reached inconsistent conclusions about whether epistatic-like interactions exist between them. To analyze enhancer interactions throughout the genome, we developed a statistical framework for CRISPR regulatory screens that utilizes negative binomial generalized linear models that account for variable guide RNA (gRNA) efficiency. We reanalyzed a single-cell CRISPR interference experiment that delivered random combinations of enhancer-targeting gRNAs to each cell and interrogated interactions between 3,808 enhancer pairs. We found that enhancers act multiplicatively with one another to control gene expression, but our analysis provides no evidence for interaction effects between pairs of enhancers regulating the same gene. Our findings illuminate the regulatory behavior of multiple enhancers and our statistical framework provides utility for future analyses studying interactions between enhancers.
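The distinction between multiplicative action and an interaction effect can be made concrete with a small sketch. It assumes a log link, as is common for negative binomial GLMs, so that log(mean expression) = b0 + bA*xA + bB*xB + bAB*xA*xB. Under that assumption, purely multiplicative enhancers correspond to bAB = 0. The function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def interaction_log_fold(baseline, single_a, single_b, double):
    """Compare the observed expression with both enhancers perturbed
    against the expectation under a purely multiplicative model.

    Returns (expected_double, interaction), where interaction is the
    log ratio of observed to expected and plays the role of bAB.
    """
    effect_a = single_a / baseline  # fold change from perturbing enhancer A
    effect_b = single_b / baseline  # fold change from perturbing enhancer B
    expected_double = baseline * effect_a * effect_b
    return expected_double, math.log(double / expected_double)


# Hypothetical mean expression levels: unperturbed, A only, B only, both.
expected, interaction = interaction_log_fold(100.0, 50.0, 80.0, 40.0)
```

Here the double perturbation matches the product of the single effects, so the interaction term is zero; a nonzero value would indicate an epistatic-like deviation.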

PMID:37163096 | PMC:PMC10168320 | DOI:10.1101/2023.04.26.538501

Occupational exposure to whole-body vibration and neck pain in the Swedish general population

Ergonomics. 2023 May 10:1-14. doi: 10.1080/00140139.2023.2210792. Online ahead of print.

ABSTRACT

The primary aim of this study was to determine if occupational exposure to whole-body vibration (WBV) was associated with reporting neck pain. A cross-sectional study was conducted on a sample of the general population living in northern Sweden, aged 24 to 76 years. Data were retrieved through a digital survey that collected information on exposure to WBV and biomechanical exposures as well as subjectively reported neck pain. The study included 5,017 participants (response rate 44%). Neck pain was reported by 269 men (11.8%) and 536 women (20.2%). There was a statistically significant association between reporting occupational exposure to WBV half the time or more (adjusted OR 1.91; 95% CI 1.22-3.00) and reporting neck pain. In gender-stratified analyses, the same pattern was observed in men, while there were too few women to determine any association. We conclude that occupational exposure to whole-body vibration was associated with neck pain in men.
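As a reading aid for the odds ratio and confidence interval quoted above, here is a minimal sketch of an unadjusted odds ratio with a Wald 95% interval from a 2x2 table. The study reports an adjusted OR from a regression model, so this is a simplified stand-in; the function name and any counts passed to it are hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval
    computed on the log scale."""
    a, b = exposed_cases, exposed_noncases
    c, d = unexposed_cases, unexposed_noncases
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```

An interval that excludes 1, as the reported 1.22-3.00 does, is what makes the association statistically significant at the 5% level.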

PMID:37161844 | DOI:10.1080/00140139.2023.2210792

Study on the explicitation of implicit knowledge and the construction of knowledge graph on moxibustion in medical case records of ZHOU Mei-sheng’s Jiusheng

Zhongguo Zhen Jiu. 2023 May 12;43(5):584-90.

ABSTRACT

This study explored methods for making implicit knowledge explicit and for constructing a knowledge graph on moxibustion from the medical case records in ZHOU Mei-sheng’s Jiusheng. The medical case records of Jiusheng were collected; frequency statistics were computed with Python 3.8.6, complex network analysis was performed using Gephi 9.2 software, community analysis was performed with the ancient and modern medical case cloud platform V2.3.5, and the correlation graph and weight graph were analyzed and verified using the Neo4j 3.5.25 graph database. The disease systems with a frequency ≥ 10% were surgery, ophthalmology and otorhinolaryngology, locomotor, digestive, and respiratory systems. The main diseases within these systems were carbuncle, arthritis, lumbar disc herniation, and headache. The commonly used moxibustion methods were fumigating moxibustion, blowing moxibustion, direct moxibustion, and warming acupuncture. The core prescription of points obtained by complex network analysis included Yatong point, Zhiyang (GV 9), Sanyinjiao (SP 6), Dazhui (GV 14), Zusanli (ST 36), Lingtai (GV 10), Xinshu (BL 15), Zhijian point, and Hegu (LI 4), which were basically consistent with the high-frequency points. A total of 6 communities were obtained by community analysis, corresponding to different diseases. Analysis of the correlation graph yielded 13 pairs of points with strong association rules. The correlations between Zhiyang (GV 9) and Dazhui (GV 14) and between Yatong point and Lingtai (GV 10) were the strongest. The acupoints most highly correlated with Yatong point were Zhiyang (GV 9), Lingtai (GV 10), Dazhui (GV 14), Zusanli (ST 36), and Sanyinjiao (SP 6). In the weight graph of the high-frequency disease systems, the highest-weighted relationship for surgical diseases was fumigating moxibustion-carbuncle-Yatong point, and the highest-weighted relationship for ophthalmology and otorhinolaryngology diseases was blowing moxibustion-laryngitis-Hegu (LI 4). The results of the correlation graph and weight graph are consistent with the results of data mining, and this approach can serve as an effective way to study the knowledge base of moxibustion diagnosis and treatment in the future.
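The association rules between acupoints mentioned above can be illustrated with a minimal sketch of support, confidence, and lift for one ordered pair. The abstract does not say which measures or thresholds defined a "strong" rule, so these three standard measures, the function name, and the toy records are all assumptions.

```python
def pair_rule(records, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent.

    records : list of sets, each holding the acupoints used in one case
    """
    n = len(records)
    both = sum(1 for r in records if antecedent in r and consequent in r)
    has_antecedent = sum(1 for r in records if antecedent in r)
    has_consequent = sum(1 for r in records if consequent in r)
    support = both / n
    confidence = both / has_antecedent
    lift = confidence / (has_consequent / n)
    return support, confidence, lift


# Toy case records, invented purely for illustration.
toy_records = [
    {"Zhiyang", "Dazhui"},
    {"Zhiyang", "Dazhui"},
    {"Zhiyang"},
    {"Hegu"},
]
support, confidence, lift = pair_rule(toy_records, "Zhiyang", "Dazhui")
```

A lift above 1 means the two points appear together more often than their individual frequencies would predict.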

PMID:37161813

Development of a stand-alone precalculated Monte Carlo code to calculate the dose by alpha and beta emitters from the Ra-224 decay chain

Med Phys. 2023 May 10. doi: 10.1002/mp.16446. Online ahead of print.

ABSTRACT

BACKGROUND: Recent developments in alpha and beta emitting radionuclide therapy highlight the importance of developing efficient methods for patient-specific dosimetry. Traditional tabulated methods such as Medical Internal Radiation Dose (MIRD) estimate the dose at the organ level while more recent numerical methods based on Monte Carlo (MC) simulations are able to calculate dose at the voxel level. A precalculated MC (PMC) approach was developed in this work as an alternative to time-consuming fully simulated MC. Once the spatial distribution of alpha and beta emitters is determined using imaging and/or numerical methods, the PMC code can be used to achieve an accurate voxelized 3D distribution of the deposited energy without relying on full MC calculations.

PURPOSE: To implement the PMC method to calculate energy deposited by alpha and beta particles emitted from the Ra-224 decay chain.

METHODS: The GEANT4 (version 10.7) MC toolkit was used to generate databases of precalculated tracks to be integrated in the PMC code as well as to benchmark its output. In this regard, energy spectra of alpha and beta particles emitted by the Ra-224 decay chain were generated using GAMOS (version 6.2.0) and imported into GEANT4 macro files. Either alpha or beta emitting sources were defined at the center of a homogeneous phantom filled with various materials such as soft tissue, bone, and lung where particles were emitted either mono-directionally (for database generation) or isotropically (for benchmarking). Two heterogeneous phantoms were used to demonstrate PMC code compatibility with boundary crossing events. Each precalculated database was generated step-by-step by storing particle track information from GEANT4 simulations followed by its integration in a PMC code developed in MATLAB. For a user-defined number of histories, one of the tracks in a given database was selected randomly and rotated randomly to reflect an isotropic emission. Afterward, deposited energy was divided between voxels based on step length in each voxel using a ray-tracing approach. The radial distribution of deposited energy was benchmarked against fully simulated MC calculations using GEANT4. The effect of the GEANT4 parameter StepMax on the accuracy and speed of the code was also investigated.
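The step of dividing deposited energy between voxels according to the step length in each voxel can be sketched as follows. This is an illustration in Python rather than the authors' MATLAB code; it assumes a regular grid of cubic voxels aligned with the axes and energy deposited uniformly along a straight step. The function name and parameters are hypothetical.

```python
import math
from collections import defaultdict


def split_step_energy(p0, p1, energy, voxel_size=1.0):
    """Distribute the energy of one straight step from p0 to p1 over
    the voxels it crosses, in proportion to the path length in each."""
    # Parametric positions (0..1) where the step crosses a voxel plane.
    ts = {0.0, 1.0}
    for axis in range(3):
        d = p1[axis] - p0[axis]
        if d == 0.0:
            continue
        lo, hi = sorted((p0[axis], p1[axis]))
        k = math.floor(lo / voxel_size) + 1
        while k * voxel_size < hi:
            ts.add((k * voxel_size - p0[axis]) / d)
            k += 1
    ts = sorted(ts)
    deposits = defaultdict(float)
    for t0, t1 in zip(ts, ts[1:]):
        tm = 0.5 * (t0 + t1)  # midpoint identifies the voxel of this segment
        voxel = tuple(
            math.floor((p0[a] + tm * (p1[a] - p0[a])) / voxel_size)
            for a in range(3)
        )
        # The segment's share of the step length equals t1 - t0.
        deposits[voxel] += energy * (t1 - t0)
    return dict(deposits)
```

Because the split depends only on geometry, a stored track can be rotated and replayed without rerunning the underlying physics simulation, which is where the speed advantage of a precalculated approach comes from.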

RESULTS: In the case of alpha decay, primary alpha particles show the highest contribution (>99%) in deposited energy compared to their secondary particles. In most cases, protons act as the main secondary particles in the deposition of energy. However, for a lung phantom, using a range cutoff parameter of 10 µm on primary alpha particles yields a higher contribution of secondary electrons than protons. Differences between deposited energy calculated by PMC and fully simulated MC are within 2% for all alpha and beta emitters in homogeneous and heterogeneous phantoms. Additionally, statistical uncertainties are less than 1% for voxels with doses higher than 5% of the maximum dose. Moreover, optimization of the parameter StepMax is necessary to achieve the best tradeoff between code accuracy and speed.

CONCLUSIONS: The PMC code performs well in calculating the dose deposited by alpha and beta emitters. As a stand-alone algorithm, it is suitable for integration into clinical treatment planning systems.

PMID:37161766 | DOI:10.1002/mp.16446

Detecting the skewness of data from the five-number summary and its application in meta-analysis

Stat Methods Med Res. 2023 May 10:9622802231172043. doi: 10.1177/09622802231172043. Online ahead of print.

ABSTRACT

For clinical studies with continuous outcomes, when the data are potentially skewed, researchers may choose to report the whole or part of the five-number summary (the sample median, the first and third quartiles, and the minimum and maximum values) rather than the sample mean and standard deviation. In the recent literature, it is often suggested to transform the five-number summary back to the sample mean and standard deviation, which can be subsequently used in a meta-analysis. However, if a study contains skewed data, this transformation and hence the conclusions from the meta-analysis are unreliable. Therefore, we introduce a novel method for detecting the skewness of data using only the five-number summary and the sample size, and propose a new flow chart for handling skewed studies differently. We further show by simulations that our skewness tests are able to control the type I error rates and provide good statistical power, followed by a simulated meta-analysis and a real data example that illustrate the usefulness of our new method in meta-analysis and evidence-based medicine.
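To give a sense of how skewness can be read from summary quantiles alone, here is a minimal sketch of a quantile-based skewness coefficient. With the quartiles it is the classical Bowley coefficient; applying the same formula to the minimum and maximum is a simple analogue. It is a descriptive measure only, not the authors' test, which also uses the sample size to set critical values. The function name is hypothetical.

```python
def quantile_skewness(lower, median, upper):
    """(upper + lower - 2 * median) / (upper - lower).

    Zero when the median sits midway between lower and upper,
    positive when the upper side is stretched (right skew),
    negative when the lower side is stretched (left skew).
    """
    return (upper + lower - 2.0 * median) / (upper - lower)


# Hypothetical five-number summary: min, Q1, median, Q3, max.
minimum, q1, med, q3, maximum = 1.0, 2.0, 3.0, 8.0, 20.0
bowley = quantile_skewness(q1, med, q3)
range_based = quantile_skewness(minimum, med, maximum)
```

Both values are positive for this invented summary, the pattern expected of right-skewed data, where back-transforming to a mean and standard deviation would be questionable.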

PMID:37161735 | DOI:10.1177/09622802231172043

Incorporating biological knowledge in analyses of environmental mixtures and health

Stat Med. 2023 May 10. doi: 10.1002/sim.9765. Online ahead of print.

ABSTRACT

A key goal of environmental health research is to assess the risk posed by mixtures of pollutants. As epidemiologic studies of mixtures can be expensive to conduct, it behooves researchers to incorporate prior knowledge about mixtures into their analyses. This work extends the Bayesian multiple index model (BMIM), which assumes the exposure-response function is a nonparametric function of a set of linear combinations of pollutants formed with a set of exposure-specific weights. The framework is attractive because it combines the flexibility of response-surface methods with the interpretability of linear index models. We propose three strategies to incorporate prior toxicological knowledge into construction of indices in a BMIM: (a) imposing directional homogeneity constraints on the weights, (b) structuring index weights by exposure transformations, and (c) placing informative priors on the index weights. We propose a novel prior specification that combines spike-and-slab variable selection with an informative Dirichlet distribution based on relative potency factors often derived from previous toxicological studies. In simulations we show that the proposed priors improve inferences when prior information is correct and can protect against misspecification suffered by naïve toxicological models when prior information is incorrect. Moreover, different strategies may be mixed-and-matched for different indices to suit available information (or lack thereof). We demonstrate the proposed methods on an analysis of data from the National Health and Nutrition Examination Survey and incorporate prior information on relative chemical potencies obtained from toxic equivalency factors available in the literature.
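The idea of an informative Dirichlet prior on index weights can be sketched as follows, assuming the weights within one index are non-negative and sum to one and that the prior mean is proportional to the relative potency factors. The spike-and-slab selection component is omitted, and the function names, the concentration parameter, and the potency values are hypothetical rather than taken from the paper.

```python
import random


def dirichlet_prior_from_potency(potency, concentration=10.0):
    """Dirichlet parameters whose mean equals the normalised potencies.
    A larger concentration expresses more confidence in the potencies."""
    total = sum(potency)
    mean = [p / total for p in potency]
    alpha = [concentration * m for m in mean]
    return mean, alpha


def sample_dirichlet(alpha, rng):
    """Draw one weight vector by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [g / total for g in draws]


def index_value(weights, exposures):
    """Linear index: weighted combination of pollutant exposures."""
    return sum(w * x for w, x in zip(weights, exposures))


# Hypothetical relative potency factors for three pollutants.
mean, alpha = dirichlet_prior_from_potency([1.0, 0.1, 0.01], concentration=20.0)
weights = sample_dirichlet(alpha, random.Random(0))
```

Lowering the concentration spreads prior mass away from the potency-based weights, which is one simple way to hedge against incorrect prior information.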

PMID:37161723 | DOI:10.1002/sim.9765

Impact of a Remote Virtual Reality Curriculum Pilot on Clinician Conflict Communication Skills

Hosp Pediatr. 2023 May 10:e2022006990. doi: 10.1542/hpeds.2022-006990. Online ahead of print.

ABSTRACT

OBJECTIVES: Conflict management skills are essential for interprofessional team functioning; however, existing training is time and resource intensive. We hypothesized that a curriculum incorporating virtual reality (VR) simulations would enhance providers’ interprofessional conflict communication skills and increase self-efficacy.

METHODS: We conducted a randomized controlled pilot study of the Conflict Instruction through Virtual Immersive Cases (CIVIC) curriculum among inpatient clinicians at a pediatric satellite campus. Participants viewed a 30-minute didactic presentation on conflict management and subsequently completed CIVIC (intervention group) or an alternative VR curriculum on vaccine counseling (control group), both of which allowed for verbal interactions with screen-based avatars. Three months following VR training, all clinicians participated in a unique VR simulation focused on conflict management that was recorded and scored using a rubric of observable conflict management behaviors and a Global Entrustment Scale (GES). Differences between groups were evaluated using generalized linear models. Self-efficacy was also assessed immediately pre, post, and 3 months postcurriculum. Differences within and between groups were assessed with paired and independent 2-sample t-tests, respectively.

RESULTS: Forty of 51 participants (78%) completed this study. The intervention group (n = 17) demonstrated better performance on the GES (P = .003) and specific evidence-based conflict management behaviors, including summarizing team members’ concerns (P = .02) and checking for acceptance of the plan (P = .02), as well as statistically significant improvements in 5 self-efficacy measures compared with controls.

CONCLUSIONS: Participants exposed to CIVIC demonstrated enhanced conflict communication skills and reported increased self-efficacy compared with controls. VR may be an effective method of conflict communication training.

PMID:37161716 | DOI:10.1542/hpeds.2022-006990

Epidemiology of Salmonellosis Among Infants in the United States: 1968-2015

Pediatrics. 2023 May 10:e2021056140. doi: 10.1542/peds.2021-056140. Online ahead of print.

ABSTRACT

OBJECTIVES: To describe characteristics of gastroenteritis, bacteremia, and meningitis caused by nontyphoidal Salmonella among US infants.

METHODS: We analyzed national surveillance data during 1968-2015 and active, sentinel surveillance data during 1996-2015 for culture-confirmed Salmonella infections by syndrome, year, serotype, age, and race.

RESULTS: During 1968-2015, 190 627 culture-confirmed Salmonella infections among infants were reported, including 165 236 (86.7%) cases of gastroenteritis, 6767 (3.5%) bacteremia, 371 (0.2%) meningitis, and 18 253 (9.7%) with other or unknown specimen sources. Incidence increased during the late 1970s-1980s, declined during the 1990s-early 2000s, and has gradually increased since the mid-2000s. Infants’ median age was 4 months for gastroenteritis and bacteremia and 2 months for meningitis. The most frequently reported serotypes were Typhimurium (35 468; 22%) for gastroenteritis and Heidelberg for bacteremia (1954; 29%) and meningitis (65; 18%). During 1996-2015 in sentinel site surveillance, median annual incidence of gastroenteritis was 120, bacteremia 6.2, and meningitis 0.25 per 100 000 infants. Boys had a higher incidence of each syndrome than girls in both surveillance systems, but most differences were not statistically significant. Overall, hospitalization and fatality rates were 26% and 0.1% for gastroenteritis, 70% and 1.6% for bacteremia, and 96% and 4% for meningitis. During 2004-2015, invasive salmonellosis incidence was higher for Black (incidence rate ratio, 2.7; 95% confidence interval, 2.6-2.8) and Asian (incidence rate ratio, 1.8; 95% confidence interval, 1.7-1.8) than white infants.
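The incidence figures and rate ratios above follow a standard calculation, sketched here under simple assumptions: incidence is expressed per 100 000 infants, and the rate ratio interval uses the common large-sample approximation SE(log IRR) = sqrt(1/cases_1 + 1/cases_2). The abstract does not describe the exact method used, and the function names and any counts passed to them are hypothetical.

```python
import math


def incidence_per_100k(cases, population):
    """Cases per 100 000 population."""
    return cases * 100000 / population


def rate_ratio_ci(cases_1, population_1, cases_2, population_2, z=1.96):
    """Incidence rate ratio (group 1 vs group 2) with an approximate
    confidence interval computed on the log scale."""
    ratio = (cases_1 / population_1) / (cases_2 / population_2)
    se_log = math.sqrt(1 / cases_1 + 1 / cases_2)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper
```

With large case counts the standard error is small, which is consistent with the narrow intervals reported for the Black and Asian rate ratios.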

CONCLUSIONS: Salmonellosis causes substantial infant morbidity and mortality; serotype Heidelberg caused the most invasive infections. Infants with meningitis were younger than those with bacteremia or gastroenteritis. Research into risk factors for infection and invasive illness could inform prevention efforts.

PMID:37161700 | DOI:10.1542/peds.2021-056140