Categories
Nevin Manimala Statistics

Reproducibility and robustness of economics and political science research

Nature. 2026 Apr;652(8108):151-156. doi: 10.1038/s41586-026-10251-x. Epub 2026 Apr 1.

ABSTRACT

Science aspires to be cumulative. Reproducibility efforts strengthen science by testing the reliability of published findings, promoting self-correction, and informing policy-making [1]. Computational reproductions, whereby independent researchers reproduce the results of published studies, are an essential diagnostic tool [2-10]. Such efforts should have greater visibility [11-16]. However, little social science reproduction and robustness has been conducted at scale [10,13,17-23]. Here we reproduced original analyses and conducted robustness checks of 110 articles that were published in leading economics and political science journals with mandatory data and code sharing policies [17,18]. We found that more than 85% of published claims were computationally reproducible. In robustness checks, our reanalyses showed that 72% of statistically significant estimates remain significant and in the same direction, and the median reproduced effect size is nearly the same as the originally published effect size (that is, 99% of the published effect size). Additionally, 6 independent research teams examined 12 pre-specified hypotheses about determinants of robustness. Research teams with more experience found lower levels of robustness, and robustness did not correlate with author characteristics or data availability.

PMID:41922705 | DOI:10.1038/s41586-026-10251-x

Investigating the analytical robustness of the social and behavioural sciences

Nature. 2026 Apr;652(8108):135-142. doi: 10.1038/s41586-025-09844-9. Epub 2026 Apr 1.

ABSTRACT

The same dataset can be analysed in different justifiable ways to answer the same research question, potentially challenging the robustness of empirical science [1-3]. In this crowd initiative, we investigated the degree to which research findings in the social and behavioural sciences are contingent on analysts’ choices. We examined a stratified random sample of 100 studies published between 2009 and 2018, in which, for one claim per study, at least five reanalysts independently reanalysed the original data. The statistical appropriateness of the reanalyses was assessed in peer evaluations, and the robustness indicators were inspected along a range of research characteristics and study designs. We found that 34% of the independent reanalyses yielded the same result (within a tolerance region of ±0.05 Cohen’s d) as the original report; with a four times broader tolerance region, this indicator increased to 57%. Of the reanalyses conducted, 74% reached the same conclusion as the original investigation, 24% yielded no effects or inconclusive results and 2% reported the opposite effect. This exploratory study indicates that the common single-path analyses in social and behavioural research should not be simply assumed to be robust to alternative analyses [4]. Therefore, we recommend the development and use of practices to explore and communicate this neglected source of uncertainty.

PMID:41922703 | DOI:10.1038/s41586-025-09844-9
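
The tolerance-region indicator described in the abstract above can be illustrated with a short sketch. It assumes that a reanalysis counts as "the same result" when its Cohen's d lies within the stated tolerance of the original estimate; the effect sizes and function names below are hypothetical and are not taken from the study.

```python
def within_tolerance(original_d, reanalysis_d, tol):
    """True if the reanalysis effect lies within +/- tol of the original (Cohen's d)."""
    return abs(reanalysis_d - original_d) <= tol


def share_within(original_d, reanalysis_ds, tol):
    """Fraction of reanalyses whose effect falls inside the tolerance region."""
    hits = sum(within_tolerance(original_d, d, tol) for d in reanalysis_ds)
    return hits / len(reanalysis_ds)


# Hypothetical example: one original claim and five independent reanalyses.
original = 0.40
reanalyses = [0.42, 0.31, 0.58, 0.36, 0.05]

narrow = share_within(original, reanalyses, tol=0.05)  # the +/-0.05 region
wide = share_within(original, reanalyses, tol=0.20)    # four times broader
print(f"within +/-0.05: {narrow:.0%}, within +/-0.20: {wide:.0%}")
```

Widening the region can only add matches, which is consistent with the indicator rising from 34% to 57% in the abstract.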

Investigating the replicability of the social and behavioural sciences

Nature. 2026 Apr;652(8108):143-150. doi: 10.1038/s41586-025-10078-y. Epub 2026 Apr 1.

ABSTRACT

Pursuing replicability – independent evidence for previous claims – is important for creating generalizable knowledge [1,2]. Here we attempted replications of 274 claims of positive results from 164 quantitative papers published from 2009 to 2018 in 54 journals in the social and behavioural sciences. Replications were high powered on average to detect the original effect size (median of 99.6%), used original materials when relevant and available, and were peer reviewed in advance through a standardized internal protocol. Replications showed statistically significant results in the original pattern for 151 of 274 claims (55.1% (95% confidence interval (CI) 49.2-60.9%)) and for 80.8 of 164 papers (49.3% (95% CI 43.8-54.7%)), weighted for replicating multiple claims per paper. We observed modest variation in replication rates across disciplines (42.5-63.1%), although some estimates had high uncertainty. The median Pearson’s r effect size was 0.25 (95% CI 0.21-0.27) for original studies and 0.10 (95% CI 0.09-0.13) for replication studies, an 82.4% (95% CI 67.8-88.2%) reduction in shared variance. Thirteen methods for evaluating replication success provided estimates ranging from 28.6% to 74.8% (median of 49.3%). Some decline in effect size and significance is expected based on power to detect original effects and regression to the mean because we replicated only positive results. We observe that challenges for replicability extend across social-behavioural sciences, illustrating the importance of identifying conditions that promote or inhibit replicability [3,4].

PMID:41922700 | DOI:10.1038/s41586-025-10078-y
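
Two figures in the abstract above can be checked with simple arithmetic. The sketch below assumes a Wilson score interval for the claim-level replication rate, which happens to reproduce the reported bounds; the abstract does not say which interval method the authors used. It also assumes that "shared variance" means r squared. With the rounded medians this gives a reduction of about 84%, close to the reported 82.4%, which was presumably computed from unrounded values.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def shared_variance_reduction(r_original, r_replication):
    """Relative drop in r squared from the original to the replication."""
    return 1 - r_replication**2 / r_original**2


low, high = wilson_interval(151, 274)
print(f"replication rate: {151 / 274:.1%} ({low:.1%} to {high:.1%})")

reduction = shared_variance_reduction(0.25, 0.10)
print(f"reduction in shared variance from rounded medians: {reduction:.0%}")
```

The interval comes out at roughly 49.2% to 60.9%, matching the claim-level figures in the abstract.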

Investigating the reproducibility of the social and behavioural sciences

Nature. 2026 Apr;652(8108):126-134. doi: 10.1038/s41586-026-10203-5. Epub 2026 Apr 1.

ABSTRACT

Published claims should be reproducible, yielding the same result when the same analysis is applied to the same data [1,2]. Here we assess reproducibility in a stratified random sample of 600 papers published from 2009 to 2018 in 62 journals spanning the social and behavioural sciences. The authors of 144 (24.0%, 95% confidence interval (CI) = 20.8-27.6%) papers made data available to assess reproducibility and, for 38 others, we obtained source data to reconstruct the dataset. We assessed 143 out of the 182 available datasets and found that 76.6 (53.6%, 95% CI = 45.8-60.7%) papers were rated as precisely reproducible and 105.0 (73.5%, 95% CI = 66.4-80.0%) were rated as at least approximately reproducible (within 15% of the original effects or within 0.05 of original P values) after inverse weighting each of the 551 claims by the number of claims per paper. We observed higher reproducibility for papers from political science and economics compared with other fields, for more recent papers compared with older papers and for papers from journals that require data sharing. Implementation of measures to verify that research is reproducible is needed to support trustworthiness in the complex enterprise of knowledge production [3,4].

PMID:41922699 | DOI:10.1038/s41586-026-10203-5
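
The fractional paper counts in the abstract above (for example, 76.6 papers) follow from weighting each claim by the inverse of the number of claims in its paper. A minimal sketch of that weighting is shown here, assuming each paper contributes the fraction of its assessed claims that reproduced; the claim data and names are hypothetical.

```python
from collections import defaultdict


def paper_weighted_count(claims):
    """Sum per-paper reproduction fractions.

    claims: iterable of (paper_id, reproduced) pairs. Each claim carries
    weight 1/k, where k is the number of claims assessed for its paper.
    Returns (weighted number of reproducible papers, number of papers).
    """
    per_paper = defaultdict(list)
    for paper_id, reproduced in claims:
        per_paper[paper_id].append(reproduced)
    weighted = sum(sum(flags) / len(flags) for flags in per_paper.values())
    return weighted, len(per_paper)


# Hypothetical data: four papers with one to three claims each.
claims = [
    ("A", True), ("A", False),
    ("B", True),
    ("C", True), ("C", True), ("C", False),
    ("D", False),
]
count, n_papers = paper_weighted_count(claims)
print(f"{count:.1f} of {n_papers} papers ({count / n_papers:.1%})")

# The abstract's own figures: 76.6 of 143 assessed papers.
reported_share = 76.6 / 143
print(f"reported share: {reported_share:.1%}")
```

Weighting this way stops papers with many claims from dominating the paper-level rate.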

Mitochondrial hyperoxidation contributes to warm ischemia-reperfusion injury in rat and pig livers

Commun Med (Lond). 2026 Apr 1. doi: 10.1038/s43856-026-01551-4. Online ahead of print.

ABSTRACT

BACKGROUND: Mitochondrial dysfunction is a critical factor in several diseases, but current in situ assessment methods are severely limited. Non-invasive monitoring of mitochondrial redox state using resonance Raman Spectroscopy (RRS) offers a promising solution. This study aims to demonstrate RRS utility with liver models of warm ischemia-reperfusion injury in organ transplantation.

METHODS: Lewis rat (female) and Yorkshire pig (both sexes) livers were evaluated during reperfusion by subnormothermic machine perfusion, with 3-6 replicates per study group, and statistical comparisons used unpaired two-tailed Student’s t-tests with Welch’s correction for potentially unequal variance. RRS provides in situ quantification of the overall mitochondrial redox state and is further refined here to resolve the redox states of complexes III and IV individually.

RESULTS: Here we show that RRS can differentiate non-viable rat livers (3 h warm ischemia, WI) from viable 1 h WI and fresh controls as early as 30 min into reperfusion. RRS also identifies dysfunction at complex III characterized by hyperoxidation during reperfusion. This guides us to test methylene blue, which acts as an alternate electron donor to bypass complex III, as a treatment to rescue mitochondria from WI-induced reperfusion injury. When tested on pig marginal livers with extended WI (30-45 min), our RRS-guided treatment enables recovery of hemodynamics and oxygen/lactate values that approached controls without WI.

CONCLUSIONS: RRS assessment and guided treatment with methylene blue provide two lines of evidence indicating that mitochondrial hyperoxidation, specifically at complex III, is a critical mechanism underlying warm ischemia-reperfusion injury. This study demonstrates the potential of RRS for transplantation and broader applications.

PMID:41922695 | DOI:10.1038/s43856-026-01551-4
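
The METHODS section above names unpaired t-tests with Welch's correction. The sketch below shows how the Welch statistic and its Welch-Satterthwaite degrees of freedom are computed, using only the standard library. The two groups are hypothetical values with sizes inside the stated range of 3-6 replicates; they are not data from the study.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a's mean
    vb = variance(b) / len(b)  # squared standard error of group b's mean
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df


# Hypothetical readouts for two study groups with unequal variances.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 6.0, 8.0, 10.0]

t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

If SciPy is available, `scipy.stats.ttest_ind(group_a, group_b, equal_var=False)` performs the same test and also returns the two-sided p-value.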

Obstacle-aware inverse kinematics of variable-length continuum robots via teaching-learning-based optimization with experimental validation

Sci Rep. 2026 Apr 1. doi: 10.1038/s41598-026-46132-6. Online ahead of print.

ABSTRACT

Continuum robots offer high dexterity and compliance, which makes them attractive for tasks in confined, hazardous, and hard-to-reach environments. Despite this potential, inverse kinematics (IK) for multi-section continuum robots remains challenging due to strong nonlinearities and redundancy, and the problem becomes more demanding when each section can actively change its backbone length. This paper addresses obstacle-aware IK for a cable-driven variable-length continuum robot by formulating IK as a constrained optimization problem built on a constant-curvature forward kinematic model. A teaching-learning-based optimization (TLBO) algorithm is adopted to search for section bending angles, orientation angles, and section lengths that minimize end-effector tracking error while avoiding static obstacles through a capsule-based penalty constraint handling strategy that accounts for the robot’s physical radial dimension. The approach is evaluated through multiple three-dimensional MATLAB simulations, including linear and circular trajectory tracking with and without obstacle avoidance, and is benchmarked against particle swarm optimization (PSO), a real-coded genetic algorithm (GA), and differential evolution (DE) over 30 independent runs. Statistical analysis shows that TLBO achieves the best or near-best tracking accuracy (mean error [Formula: see text] mm, best [Formula: see text] mm) while requiring no algorithm-specific tuning parameters. The method is further validated experimentally on a Continuum Bionic Handling Assistant (CBHA) platform by comparing the IK-derived cable-length profiles with potentiometer-based measurements. The results demonstrate accurate trajectory tracking in simulation and good agreement with experimental cable-length measurements, supporting the feasibility of TLBO for constrained IK of variable-length continuum robots.

PMID:41922682 | DOI:10.1038/s41598-026-46132-6
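
The teaching-learning-based optimization (TLBO) loop named in the abstract above can be sketched in a few lines. This is a generic TLBO with greedy acceptance, not the authors' implementation. The objective is a deliberately simplified stand-in: the decision vector is a 3D point rather than bending angles and section lengths, and the obstacle is a sphere with a linear penalty rather than the paper's capsule-based constraint. All names, bounds, and constants are assumptions.

```python
import math
import random

TARGET = (1.0, 2.0, 3.0)             # hypothetical end-effector target
OBSTACLE_CENTRE = (0.0, 0.0, 0.0)    # hypothetical spherical keep-out zone
OBSTACLE_RADIUS = 0.5


def toy_cost(point):
    """Tracking error plus a penalty proportional to obstacle penetration."""
    error = math.dist(point, TARGET)
    depth = OBSTACLE_RADIUS - math.dist(point, OBSTACLE_CENTRE)
    return error + 1000.0 * max(depth, 0.0)


def tlbo(objective, bounds, pop_size=20, iterations=200, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]

    def try_replace(i, candidate):
        # Clamp to the bounds, then keep the candidate only if it improves.
        candidate = [min(max(v, lo), hi) for v, (lo, hi) in zip(candidate, bounds)]
        c = objective(candidate)
        if c < cost[i]:
            pop[i], cost[i] = candidate, c

    for _ in range(iterations):
        # Teacher phase: move learners toward the best solution, away from the mean.
        teacher = pop[min(range(pop_size), key=cost.__getitem__)]
        centre = [sum(x[d] for x in pop) / pop_size for d in range(dim)]
        for i in range(pop_size):
            tf = rng.choice((1, 2))  # teaching factor
            try_replace(i, [pop[i][d] + rng.random() * (teacher[d] - tf * centre[d])
                            for d in range(dim)])
        # Learner phase: move toward a better peer or away from a worse one.
        for i in range(pop_size):
            j = rng.choice([k for k in range(pop_size) if k != i])
            sign = 1.0 if cost[i] < cost[j] else -1.0
            try_replace(i, [pop[i][d] + sign * rng.random() * (pop[i][d] - pop[j][d])
                            for d in range(dim)])

    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


best_point, best_cost = tlbo(toy_cost, bounds=[(-5.0, 5.0)] * 3)
print(best_point, best_cost)
```

Apart from population size and iteration count, the loop has no tuning parameters, which is the property the abstract highlights when comparing TLBO with PSO, GA, and DE.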

Model-based economic analysis under uncertainty for PFAS treatment by granular activated carbon and ion exchange technologies

J Environ Manage. 2026 Mar 31;404:129407. doi: 10.1016/j.jenvman.2026.129407. Online ahead of print.

ABSTRACT

Recent drinking water regulations have imposed remediation for per- and polyfluoroalkyl substances (PFAS). In response, treatment facilities may be required to retrofit existing treatment schemes to treat PFAS below maximum contaminant levels (MCLs). Adsorption technologies such as granular activated carbon (GAC) and ion exchange (IX) have been demonstrated to be effective; however, there are limited techno-economic metrics available that provide guidance on technology selection and design for diverse PFAS-containing source water conditions. Process systems engineering (PSE) tools that traditionally perform these analyses are hindered by limited data availability, model validity, and understanding of treatment phenomena for emerging contaminants. This work employs published data regressions, statistical models, process models, techno-economic analyses, and other process systems tools in a model-based uncertainty framework to consider the limitations of emerging contaminant research. Through this analysis framework, economic results are provided as probabilistic distributions based on the uncertainty of the models and diverse conditions that treatment facilities experience. Regressed parameter distributions and model predictive performance trends for each technology are identified based on PFAS structure and chain length. GAC systems are evaluated at consistently lower levelized costs of water (LCOWs) with less economic risk than IX systems considering uncertainty across most design conditions and PFAS species. Both technologies are evaluated to have comparable adsorbent usage intensity on a volume basis, indicative of similar sustainability.

PMID:41921266 | DOI:10.1016/j.jenvman.2026.129407

Supersensitive and robust disease monitoring in oropharyngeal cancer patients by circulating tumor HPV-DNA sequencing (ctHPV-DNAseq)

Transl Oncol. 2026 Mar 31;67:102744. doi: 10.1016/j.tranon.2026.102744. Online ahead of print.

ABSTRACT

BACKGROUND: Post-treatment disease monitoring of HPV-positive oropharyngeal squamous cell carcinoma (OPSCC) is challenging. Liquid biopsies could improve disease monitoring, but the variety in methods hampers clinical implementation. In this study, target-enrichment sequencing to detect circulating tumor HPV DNA (ctHPV-DNA) was applied in liquid biopsies of HPV-positive OPSCC patients, and robust statistical readouts were determined. Next, it was investigated whether longitudinal plasma monitoring could accurately diagnose residual and recurrent disease.

METHODS: The target-enrichment panel included 29 cancer genes and high-risk HPV genomes. The assay was tested on plasma from 30 non-cancer controls and 33 patients with HPV-positive tumors, 15 of whom had residual or recurrent disease, and 18 who remained disease-free. Samples were analyzed from baseline to 24 months after treatment.

RESULTS: By determining and applying robust statistical cut-off values, ctHPV-DNA could be detected in plasma of all patients with HPV-positive OPSCC at baseline, and was absent in plasma of all non-cancer controls. In OPSCC patients who remained disease-free, post-treatment plasma samples were negative for ctHPV-DNA. In contrast, ctHPV-DNA was detected in plasma of all OPSCC patients with recurrent disease up to a year before clinical diagnosis. Cases suspected of residual disease in the neck, but found after resection to have a necrotic metastasis without vital tumor, correctly tested negative for ctHPV-DNA in plasma.

CONCLUSIONS: Target-enrichment sequencing of plasma shows 100% accurate detection of ctHPV-DNA at baseline. Longitudinal monitoring enables early recurrence detection and correct diagnosis of non-vital residual disease. The data indicate that liquid biopsy could improve post-treatment follow-up in HPV-positive OPSCC patients.

PMID:41921264 | DOI:10.1016/j.tranon.2026.102744

Baseline symptom severity and response to multisession cathodal transcranial direct current stimulation in adolescents and young adults with autism: an exploratory analysis

J Psychiatr Res. 2026 Mar 26;198:203-209. doi: 10.1016/j.jpsychires.2026.03.043. Online ahead of print.

ABSTRACT

BACKGROUND: Identifying baseline clinical characteristics associated with treatment response is crucial for optimising intervention strategies in people with autism spectrum disorder (ASD). This study aimed to distinguish responders from non-responders to a multisession prefrontal transcranial direct current stimulation (tDCS) protocol and to explore how baseline individual characteristics relate to treatment outcomes.

METHODS: Using complementary inferential statistics and predictive modelling approaches, we analysed baseline clinical characteristics of adolescents and young adults with ASD who underwent cathodal tDCS (1.5 mA) over the left dorsolateral prefrontal cortex, coupled with cognitive remediation training. Baseline symptom severity was assessed using the Autism Diagnostic Interview-Revised (ADI-R). Normalised feature importance and Shapley additive explanations (SHAP) were used to identify features associated with restricted and repetitive behaviour (RRB) reduction, and a model-derived classification threshold was estimated to stratify potential responders and non-responders within this protocol.

RESULTS: Inferential statistics revealed that responders had significantly lower social interaction symptom severity than non-responders (p < .001; Cohen’s d = 0.41). Predictive modelling further indicated that lower baseline verbal communication severity was associated with greater RRB reduction (averaged SHAP value across models = -0.017). A model-derived classification threshold of 19.6 on the ADI-R verbal communication sub-score was identified to differentiate responders and non-responders within this protocol.

CONCLUSION: Lower baseline verbal communication severity is associated with greater behavioural improvement following this multisession prefrontal tDCS protocol. These findings should be interpreted as exploratory and hypothesis-generating. Replication in larger, independent cohorts is required to assess the robustness and generalisability of these findings.

TRIAL REGISTRATION: ClinicalTrials.gov (ID: NCT05035511).

PMID:41921246 | DOI:10.1016/j.jpsychires.2026.03.043

Interactions between mental health predictors on post-concussive depressive symptoms among service members and veterans with concussion

J Psychiatr Res. 2026 Mar 17;198:194-202. doi: 10.1016/j.jpsychires.2026.03.019. Online ahead of print.

ABSTRACT

PURPOSE: US service members and veterans (SMVs) are at an increased risk for both concussion and mental health disorders such as depression and post-traumatic stress disorder (PTSD). Although depression history has been shown to be associated with elevated post-concussive depressive symptoms, it is unclear whether this relationship changes in the presence of other mental health conditions such as PTSD. This study evaluated whether the relationship between depression history and the level of post-concussive depressive symptoms varied by pre-injury PTSD.

METHODS: Data from 427 SMVs with concussion history from a US military medical center were used for this cross-sectional study. Concussion, pre-injury depression, and PTSD were assessed through medical record review and self-report, and the level of post-injury depressive symptoms was measured using the Center for Epidemiologic Studies Depression Scale. Poisson regression with robust error variance was utilized to evaluate the association of pre-injury depression with clinically-elevated depressive symptoms post-injury, and interaction by pre-injury PTSD.

RESULTS: Participants with (vs. without) pre-injury depression were significantly more likely to have clinically-elevated depressive symptoms post-injury, but only in the presence of pre-injury PTSD (PR = 2.02, CI = 1.45, 2.81) and not without (PR = 1.12, CI = 0.84, 1.50). Interaction by pre-injury PTSD was statistically significant (p < 0.001).

CONCLUSIONS: Depression history has been shown to elevate post-concussive depressive symptoms; however, the findings of this study suggest that this association may exist only in the presence of pre-injury PTSD. Identification of SMVs with concomitant depression and PTSD history may further inform concussion treatment for those likely to have clinically-elevated post-concussive depressive symptoms.

PMID:41921245 | DOI:10.1016/j.jpsychires.2026.03.019
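
The prevalence ratios (PRs) in the RESULTS above come from a log-link model, so their confidence intervals should be symmetric on the log scale. The sketch below checks this for the reported figures and recovers the implied standard errors. It assumes the intervals are 95% intervals (the abstract gives no level) and that interaction is read on the multiplicative scale as the ratio of the two stratum-specific PRs; neither assumption is confirmed by the abstract.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Point estimate and log-scale standard error implied by a ratio CI."""
    point = math.sqrt(lower * upper)  # geometric midpoint of the bounds
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return point, se


# Reported intervals for pre-injury depression, by pre-injury PTSD status.
pr_with_ptsd, se_with_ptsd = log_scale_summary(1.45, 2.81)
pr_without_ptsd, se_without_ptsd = log_scale_summary(0.84, 1.50)

# Ratio of the reported stratum-specific PRs.
ratio_of_prs = 2.02 / 1.12

print(f"with PTSD:    PR {pr_with_ptsd:.2f}, log-scale SE {se_with_ptsd:.3f}")
print(f"without PTSD: PR {pr_without_ptsd:.2f}, log-scale SE {se_without_ptsd:.3f}")
print(f"ratio of PRs: {ratio_of_prs:.2f}")
```

Both geometric midpoints round to the reported PRs of 2.02 and 1.12, so the published intervals are internally consistent under these assumptions.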