Categories
Nevin Manimala Statistics

Scientific software – Quality not always good

Computational tools are indispensable in almost all scientific disciplines. Especially where large amounts of research data are generated and must be processed quickly, reliable, carefully developed software is crucial for analyzing and correctly interpreting such data. Nevertheless, scientific software can have quality deficiencies. To evaluate software quality in an automated way, computer scientists have designed the SoftWipe tool.

Agreement between laboratory-based and non-laboratory-based Framingham risk score in Southern Iran

Sci Rep. 2021 May 24;11(1):10767. doi: 10.1038/s41598-021-90188-5.

ABSTRACT

The Framingham 10-year cardiovascular disease risk is measured by laboratory-based and non-laboratory-based models. This study aimed to determine the agreement between these two models in a large population in Southern Iran. In this study, the baseline data of 8138 individuals who participated in the Pars cohort study were used. The participants had no history of cardiovascular disease or stroke. For the laboratory-based risk model, scores were determined based on age, sex, current smoking, diabetes, systolic blood pressure (SBP) and treatment status, total cholesterol, and High-Density Lipoprotein. For the non-laboratory-based risk model, scores were determined based on age, sex, current smoking, diabetes, SBP and treatment status, and Body Mass Index. The agreement between these two models was determined by Bland Altman plots for agreement between the scores and the kappa statistic for agreement across the risk groups. Bland Altman plots showed that the limits of agreement were reasonable for females < 60 years old (95% CI: -2.27 to 4.61%), but of concern for females ≥ 60 years old (95% CI: -3.45 to 9.67%), males < 60 years old (95% CI: -2.05 to 8.91%), and males ≥ 60 years old (95% CI: -3.01 to 15.23%). The limits of agreement were wider for males ≥ 60 years old than for the other groups. According to the risk groups, the agreement was better in females than in males: it was moderate for females < 60 years old (kappa = 0.57) and females ≥ 60 years old (kappa = 0.51), fair for males < 60 years old (kappa = 0.39), and slight for males ≥ 60 years old (kappa = 0.14). Across all participants, the agreement between the two risk scores was moderate according to risk grouping. Therefore, our results suggest that the non-laboratory-based risk model can be used in resource-limited settings where individuals cannot afford laboratory tests and extensive laboratories are not available.
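The two agreement measures named in this abstract can be illustrated with a minimal sketch. It is not the study's analysis code: the function names, the example scores, and the 10% and 20% risk-group cut-offs are assumptions made only for illustration. The limits of agreement are taken as the mean paired difference plus or minus 1.96 standard deviations, and kappa is the unweighted Cohen's kappa.

```python
import math
from collections import Counter


def bland_altman_limits(a, b):
    """Return (mean difference, lower limit, upper limit) for paired scores."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample standard deviation of the paired differences.
    sd = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    return mean_diff, mean_diff - 1.96 * sd, mean_diff + 1.96 * sd


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two categorical ratings of the same people."""
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Agreement expected by chance from the marginal category frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


def risk_group(score):
    """Hypothetical cut-offs (assumption): <10% low, 10 to <20% intermediate, else high."""
    if score < 10:
        return "low"
    if score < 20:
        return "intermediate"
    return "high"


# Hypothetical 10-year risk scores (%) for six people; not data from the study.
lab = [4.0, 8.5, 12.0, 21.0, 6.5, 15.0]
non_lab = [5.0, 10.5, 15.5, 27.0, 6.0, 18.5]

print(bland_altman_limits(non_lab, lab))
print(cohen_kappa([risk_group(s) for s in lab], [risk_group(s) for s in non_lab]))
```

Wide limits of agreement indicate that individual scores from the two models can differ substantially even when the risk-group classification (summarised by kappa) agrees reasonably well.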

PMID:34031448 | DOI:10.1038/s41598-021-90188-5

Power determination in vitamin D randomised control trials and characterising factors affecting it through a novel simulation-based tool

Sci Rep. 2021 May 24;11(1):10804. doi: 10.1038/s41598-021-90019-7.

ABSTRACT

Thousands of observational studies have linked vitamin D deficiency with numerous diseases, but randomised controlled trials (RCTs) often fail to show benefit of supplementation. Population characteristics and trial design have long been suspected to undermine power but were not systematically investigated. We propose a flexible generative model to characterise benefit of vitamin D supplementation at the individual level, and use this to quantify power in RCTs. The model can account for seasonality and population heterogeneity. In a simulated 1-year trial with 1000 participants per arm and assuming a 25-hydroxyvitamin D (25OHD) increase of 20 nmol/L due to the intervention, with baseline 25OHD in the population of 15, 35, 50, 60 and 75 nmol/L, the power to detect intervention effect was 77%, 99%, 95%, 68% and 19%, respectively. The number of participants required per arm to achieve 80% power according to baseline 25OHD of 15-60 nmol/L was 1200, 400, 600 and 1400, respectively. As expected, larger increases in 25OHD due to supplementation improved power in certain scenarios. For a population baseline of 50 nmol/L, with 1500 participants in each arm, there was 100% power to detect a 20 nmol/L 25OHD increase while it was 76% for a 10 nmol/L increase. Population characteristics and trial design, including temporal considerations, have a dramatic impact on power and required sample size in vitamin D RCTs.
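The general idea of simulation-based power estimation can be sketched as follows. This is not the authors' generative model: it ignores seasonality, baseline 25OHD and population heterogeneity, and simply assumes a normally distributed outcome with a fixed treatment effect. The function name and all parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate_power(n_per_arm, effect, sd, n_trials=500, seed=0):
    """Estimate power as the fraction of simulated trials that reject the null.

    Each simulated trial draws a control and a treated arm from normal
    distributions and applies a two-sided z-test at the 5% level.
    """
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% critical value of the standard normal
    rejections = 0
    for _ in range(n_trials):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        se = math.sqrt(
            statistics.variance(control) / n_per_arm
            + statistics.variance(treated) / n_per_arm
        )
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / n_trials


# Hypothetical scenario: 200 participants per arm, effect of 0.2 standard deviations.
print(simulate_power(n_per_arm=200, effect=0.2, sd=1.0))
```

Repeating the call over a grid of sample sizes and effect sizes gives the kind of power-versus-design table the abstract reports, under whatever outcome model replaces the simple normal draws used here.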

PMID:34031451 | DOI:10.1038/s41598-021-90019-7

Addressing the COVID-19 transmission in inner Brazil by a mathematical model

Sci Rep. 2021 May 24;11(1):10760. doi: 10.1038/s41598-021-90118-5.

ABSTRACT

In 2020, the world experienced its very first pandemic of the globalized era. A novel coronavirus, SARS-CoV-2, is the causative agent of severe pneumonia and has rapidly spread through many nations, crashing health systems and leading a large number of people to death. In Brazil, the emergence of local epidemics in major metropolitan areas has always been a concern. In a vast and heterogeneous country, with regional disparities and climate diversity, several factors can modulate the dynamics of COVID-19. What should be the scenario for inner Brazil, and what can we do to control infection transmission in each of these locations? Here, a mathematical model is proposed to simulate disease transmission among individuals in several scenarios, differing by abiotic factors, socioeconomic factors, and effectiveness of mitigation strategies. Disease control relies on all individuals maintaining social distancing and on detecting and then isolating infected ones. The model reinforces social distancing as the most efficient method to control disease transmission. Moreover, it also shows that improving the detection and isolation of infected individuals can allow this mitigation strategy to be loosened. Finally, the effectiveness of control may be different across the country, and understanding it can help set up public health strategies.
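The interplay between social distancing and detection followed by isolation can be illustrated with a generic compartmental sketch. The abstract does not specify the model's equations, so this is not the authors' model: a discrete-time SIR model is assumed here, with distancing scaling down the transmission rate and detection adding an extra removal rate for infected individuals. All names and parameter values are hypothetical.

```python
def simulate_sir(beta, gamma, distancing, detection, days=200, n=100000.0, i0=10.0):
    """Discrete-time SIR sketch with two control levers (both assumptions).

    distancing: fraction by which contacts are reduced (0 = none, 1 = full).
    detection:  extra daily fraction of infected people who are found and isolated.
    Assumes gamma + detection <= 1 so daily removals never exceed the infected pool.
    """
    s, i, r = n - i0, i0, 0.0
    peak = i
    for _ in range(days):
        # New infections shrink with distancing; capped by remaining susceptibles.
        new_infections = min(beta * (1.0 - distancing) * s * i / n, s)
        # Recovery and isolation both remove people from the infectious pool.
        removed = (gamma + detection) * i
        s -= new_infections
        i += new_infections - removed
        r += removed
        peak = max(peak, i)
    return {"peak_infected": peak, "total_infected": n - s}


# Hypothetical scenarios: no control, distancing only, distancing plus detection.
for distancing, detection in [(0.0, 0.0), (0.5, 0.0), (0.3, 0.1)]:
    print(distancing, detection, simulate_sir(0.3, 0.1, distancing, detection))
```

In a sketch like this, raising the detection rate lowers the effective reproduction number in the same way as stronger distancing, which mirrors the abstract's point that better detection and isolation can allow distancing to be relaxed.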

PMID:34031456 | DOI:10.1038/s41598-021-90118-5

Total nitrogen is the main soil property associated with soil fungal community in karst rocky desertification regions in southwest China

Sci Rep. 2021 May 24;11(1):10809. doi: 10.1038/s41598-021-89448-1.

ABSTRACT

Karst rocky desertification (KRD) is a type of land deterioration that results in degraded soil and a delicate ecosystem. Previous studies focused on the influence of KRD on animals and plants, whereas the impact of KRD on microorganisms, especially soil fungi, remains to be discovered. This study reveals the change in the soil fungal community in response to KRD progression in southwest China. Illumina HiSeq was used to survey the soil fungal community. Results showed that the soil fungal community in the severe KRD (SKRD) was noticeably different from that in other KRD areas. Statistical analyses suggested that soil TN was the primary factor associated with the fungal community, followed by pH. Phylum Ascomycota was significantly abundant in non-degraded soils, whereas Basidiomycota predominated in SKRD. The ratio of Ascomycota/Basidiomycota significantly decreased along with KRD progression, which might be used as an indicator of KRD severity. Phylum Basidiomycota was sensitive to changes in all the soil properties but AP. Genus Sebacina might have the potential to promote vegetation and land restoration in KRD areas. This study fills a gap in knowledge on changes in soil fungal communities in accordance with KRD progression.

PMID:34031439 | DOI:10.1038/s41598-021-89448-1

Improved methods for RNAseq-based alternative splicing analysis

Sci Rep. 2021 May 24;11(1):10740. doi: 10.1038/s41598-021-89938-2.

ABSTRACT

The robust detection of disease-associated splice events from RNAseq data is challenging due to the potential confounding effect of gene expression levels and the often limited number of patients with relevant RNAseq data. Here we present a novel statistical approach to splicing outlier detection and differential splicing analysis. Our approach tests for differences in the percentages of sequence reads representing local splice events. We describe a software package called Bisbee which can predict the protein-level effect of splice alterations, a key feature lacking in many other splicing analysis resources. We leverage Bisbee’s prediction of protein level effects as a benchmark of its capabilities using matched sets of RNAseq and mass spectrometry data from normal tissues. Bisbee exhibits improved sensitivity and specificity over existing approaches and can be used to identify tissue-specific splice variants whose protein-level expression can be confirmed by mass spectrometry. We also applied Bisbee to assess evidence for a pathogenic splicing variant contributing to a rare disease and to identify tumor-specific splice isoforms associated with an oncogenic mutation. Bisbee was able to rediscover previously validated results in both of these cases and also identify common tumor-associated splice isoforms replicated in two independent melanoma datasets.

PMID:34031440 | DOI:10.1038/s41598-021-89938-2

Mapping the transcriptomics landscape of post-traumatic stress disorder symptom dimensions in World Trade Center responders

Transl Psychiatry. 2021 May 24;11(1):310. doi: 10.1038/s41398-021-01431-6.

ABSTRACT

Gene expression has provided promising insights into the pathophysiology of post-traumatic stress disorder (PTSD); however, specific regulatory transcriptomic mechanisms remain unknown. The present study addressed this limitation by performing transcriptome-wide RNA-Seq of whole-blood samples from 226 World Trade Center responders. The investigation focused on differential expression (DE) at the gene, isoform, and for the first time, alternative splicing (AS) levels associated with the symptoms of PTSD: total burden, re-experiencing, avoidance, numbing, and hyperarousal subdimensions. These symptoms were associated with 76, 1, 48, 15, and 49 DE genes, respectively (FDR < 0.05). Moreover, they were associated with 103, 11, 0, 43, and 32 AS events. Avoidance differed the most from other dimensions with respect to DE genes and AS events. Gene set enrichment analysis (GSEA) identified pathways involved in inflammatory and metabolic processes, which may have implications in the treatment of PTSD. Overall, the findings shed a novel light on the wide range of transcriptomic alterations associated with PTSD at the gene and AS levels. The results of DE analysis associated with PTSD subdimensions highlight the importance of studying PTSD symptom heterogeneity.
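The "FDR < 0.05" threshold refers to a false discovery rate correction across the many genes tested. The abstract does not say which procedure was used; the sketch below assumes the Benjamini-Hochberg adjustment, a common choice in differential expression analysis. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes; not values from the study.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, significant)
```

A gene counts as differentially expressed at FDR < 0.05 when its adjusted value falls below 0.05, so fewer genes pass than would with an uncorrected p < 0.05 cut-off.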

PMID:34031375 | DOI:10.1038/s41398-021-01431-6

Mining mutation contexts across the cancer genome to map tumor site of origin

Nat Commun. 2021 May 24;12(1):3051. doi: 10.1038/s41467-021-23094-z.

ABSTRACT

The vast preponderance of somatic mutations in a typical cancer are either extremely rare or have never been previously recorded in available databases that track somatic mutations. These constitute a hidden genome that contrasts with the relatively small number of mutations that occur frequently, the properties of which have been studied in depth. Here we demonstrate that this hidden genome contains much more accurate information than common mutations for the purpose of identifying the site of origin of primary cancers in settings where this is unknown. We accomplish this using a projection-based statistical method that achieves a highly effective signal condensation, by leveraging DNA sequence and epigenetic contexts using a set of meta-features that embody the mutation contexts of rare variants throughout the genome.

PMID:34031376 | DOI:10.1038/s41467-021-23094-z

CRISPECTOR provides accurate estimation of genome editing translocation and off-target activity from comparative NGS data

Nat Commun. 2021 May 24;12(1):3042. doi: 10.1038/s41467-021-22417-4.

ABSTRACT

Controlling off-target editing activity is one of the central challenges in making CRISPR technology accurate and applicable in medical practice. Current algorithms for analyzing off-target activity do not provide statistical quantification, are not sufficiently sensitive in separating signal from noise in experiments with low editing rates, and do not address the detection of translocations. Here we present CRISPECTOR, a software tool that supports the detection and quantification of on- and off-target genome-editing activity from NGS data using paired treatment/control CRISPR experiments. In particular, CRISPECTOR facilitates the statistical analysis of NGS data from multiplex-PCR comparative experiments to detect and quantify adverse translocation events. We validate the observed results and show independent evidence of the occurrence of translocations in human cell lines, after genome editing. Our methodology is based on a statistical model comparison approach leading to better false-negative rates in sites with weak yet significant off-target activity.

PMID:34031394 | DOI:10.1038/s41467-021-22417-4

Electroencephalography-detected neurophysiology of internet addiction disorder and internet gaming disorder in adolescents – A review

Med J Malaysia. 2021 May;76(3):401-413.

ABSTRACT

INTRODUCTION: Internet Addiction Disorder (IAD) is an umbrella term for various types of Internet-based behavioural addiction, whereas Internet Gaming Disorder (IGD) addresses a specific type of IAD that is postulated to be due to a lack of control in impulse inhibition. IGD is an area of concern in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), which can be objectively assessed by dysfunctional behaviour and the increasing time of being online, particularly during the COVID-19 pandemic. Electroencephalography (EEG) identifies amplitude changes in the evoked response potential (ERP) among IGDs, correlated with underlying comorbidities.

MATERIALS AND METHODS: A scoping review was performed to elaborate on the research regarding resting-state EEG and task-based EEG, particularly for Go/No-go paradigms pertaining to subjects with IAD or specifically IGD. The role of EEG was identified in its diagnostic capability to identify the salient changes that occurred in the response to reward network and the executive control network, using resting-state and task-based EEG. The implication of using EEG in monitoring the therapy for IAD and IGD was also reviewed.

RESULTS: EEG generally revealed reduced beta waves and increased theta waves in addicts. IGD with depression demonstrated increased theta and decreased alpha waves. By contrast, increased P300, a late cognitive ERP component, was frequently associated with impaired excessive allocation of attentional resources in IAD towards addiction-specific cues. IGD had increased whole brain delta waves at baseline, which showed significant reduction post therapy.

CONCLUSION: EEG can identify distinct neurophysiological changes among Internet Addiction Disorder and Internet Gaming Disorder that are akin to substance abuse disorders.

PMID:34031341