Categories
Nevin Manimala Statistics

msmsEDA & msmsTests: Label-Free Differential Expression by Spectral Counts

Methods Mol Biol. 2023;2426:197-242. doi: 10.1007/978-1-0716-1967-4_10.

ABSTRACT

msmsTests is an R/Bioconductor package providing functions for statistical tests on label-free LC-MS/MS data by spectral counts. These functions aim at discovering differentially expressed proteins between two biological conditions. Three tests are available: Poisson GLM regression, quasi-likelihood GLM regression, and the negative binomial of the edgeR package. The three models admit blocking factors to control for nuisance variables. To ensure a good level of reproducibility, a post-test filter is available, in which (1) a minimum effect size considered biologically relevant and (2) a minimum expression in the most abundant condition may be set. A companion package, msmsEDA, provides functions to explore datasets based on MS/MS spectral counts. The provided graphics help in identifying outliers and possible batch factors, and in checking the effects of different normalization strategies. This protocol illustrates the use of both packages on two examples: a pure spike-in experiment of 48 human proteins in a standard yeast cell lysate, and a cancer cell-line secretome dataset requiring a biological normalization.
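The post-test filter described in the abstract can be illustrated with a minimal Python sketch. This is not the msmsTests API (the package is written in R); the function name, the record fields, the thresholds, and the 0.5 pseudocount are all assumptions chosen for illustration.

```python
import math

def post_test_filter(results, alpha=0.05, min_log2_fc=1.0, min_counts=2.0):
    """Return the proteins that pass significance, effect-size, and abundance cuts.

    Hypothetical record fields: 'protein', 'mean_a', 'mean_b' (mean spectral
    counts per condition) and 'adj_p' (multiplicity-adjusted p-value).
    """
    kept = []
    for r in results:
        mean_a, mean_b = r["mean_a"], r["mean_b"]
        # Assumed pseudocount of 0.5 so that zero counts do not break the ratio.
        log2_fc = math.log2((mean_b + 0.5) / (mean_a + 0.5))
        significant = r["adj_p"] <= alpha
        large_effect = abs(log2_fc) >= min_log2_fc
        # Abundance is checked in the most abundant of the two conditions.
        abundant = max(mean_a, mean_b) >= min_counts
        if significant and large_effect and abundant:
            kept.append(r["protein"])
    return kept

# Made-up example data, not taken from the protocol.
results = [
    {"protein": "P1", "mean_a": 2.0, "mean_b": 12.0, "adj_p": 0.001},
    {"protein": "P2", "mean_a": 0.2, "mean_b": 1.0, "adj_p": 0.01},    # too few counts
    {"protein": "P3", "mean_a": 20.0, "mean_b": 24.0, "adj_p": 0.03},  # effect too small
    {"protein": "P4", "mean_a": 15.0, "mean_b": 3.0, "adj_p": 0.20},   # not significant
]
print(post_test_filter(results))
```

Only a protein that clears all three thresholds is reported, which is how such a filter trades some sensitivity for reproducibility.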

PMID:36308691 | DOI:10.1007/978-1-0716-1967-4_10

Statistical Analysis of Quantitative Peptidomics and Peptide-Level Proteomics Data with Prostar

Methods Mol Biol. 2023;2426:163-196. doi: 10.1007/978-1-0716-1967-4_9.

ABSTRACT

Prostar is a software tool dedicated to the processing of quantitative data resulting from mass spectrometry-based label-free proteomics. Practically, once biological samples have been analyzed by bottom-up proteomics, the raw mass spectrometer outputs are processed by bioinformatics tools, so as to identify peptides and quantify them, notably by means of precursor ion chromatogram integration. From that point, the classical workflows aggregate these pieces of peptide-level information to infer protein-level identities and amounts. Finally, protein abundances can be statistically analyzed to identify proteins that are significantly differentially abundant between compared conditions. Prostar's original workflow was developed based on this strategy. However, recent works have demonstrated that processing peptide-level information is often more accurate when searching for differentially abundant proteins, as the aggregation step tends to hide some of the data variabilities and biases. As a result, Prostar has been extended with workflows that manage peptide-level data, and this protocol details their use. The first one, termed “peptidomics,” implies that the differential analysis is conducted at peptide level, independently of the peptide-to-protein relationship. The second workflow proposes to aggregate the peptide abundances after their preprocessing (i.e., after filtering, normalization, and imputation), so as to minimize the amount of protein-level preprocessing prior to differential analysis.

PMID:36308690 | DOI:10.1007/978-1-0716-1967-4_9

Towards a More Accurate Differential Analysis of Multiple Imputed Proteomics Data with mi4limma

Methods Mol Biol. 2023;2426:131-140. doi: 10.1007/978-1-0716-1967-4_7.

ABSTRACT

Imputing missing values is a common practice in label-free quantitative proteomics. Imputation replaces a missing value by a user-defined one. However, the imputation itself is not optimally considered downstream of the imputation process. In particular, imputed datasets are considered as if they had always been complete. The uncertainty due to the imputation is not properly taken into account. Hence, the mi4p package provides a more accurate statistical analysis of multiple-imputed datasets. A rigorous multiple imputation methodology is implemented, leading to a less biased estimation of parameters and their variability, thanks to Rubin’s rules. The imputation-based variance estimator of the peptide intensities is then moderated using Bayesian hierarchical models. This estimator is finally included in moderated t-test statistics to provide differential analysis results.
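As an illustration of the Rubin's rules mentioned above, the following Python sketch pools one parameter estimated on several imputed datasets. It shows the standard pooling formulas only; it is not the mi4p API, the function name is hypothetical, and the Bayesian moderation step described in the abstract is not included.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the squared standard error of each estimate
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)      # average of the m estimates
    within = mean(variances)      # average within-imputation variance
    between = variance(estimates) # sample variance across imputations (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total

# Made-up estimates from three imputed datasets.
pooled, total = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
print(pooled, total)
```

The between-imputation term is what carries the uncertainty due to imputation; treating a single imputed dataset as complete amounts to dropping it.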

PMID:36308688 | DOI:10.1007/978-1-0716-1967-4_7

Left-Censored Missing Value Imputation Approach for MS-Based Proteomics Data with GSimp

Methods Mol Biol. 2023;2426:119-129. doi: 10.1007/978-1-0716-1967-4_6.

ABSTRACT

Missing values caused by the limit of detection or quantification (LOD/LOQ) are widely observed in mass spectrometry (MS)-based omics studies and can be recognized as missing not at random (MNAR). MNAR leads to biased statistical estimations and jeopardizes downstream analyses. Although a wide range of missing value imputation methods have been developed for omics studies, only a limited number were designed appropriately for the MNAR situation. To facilitate MS-based omics studies, we introduce GSimp, a Gibbs sampler-based missing value imputation approach, to deal with left-censored missing values in MS-proteomics datasets. In this book, we explain MNAR and elucidate the usage of GSimp for MNAR in detail.

PMID:36308687 | DOI:10.1007/978-1-0716-1967-4_6

Integrating Identification and Quantification Uncertainty for Differential Protein Abundance Analysis with Triqler

Methods Mol Biol. 2023;2426:91-117. doi: 10.1007/978-1-0716-1967-4_5.

ABSTRACT

Protein quantification for shotgun proteomics is a complicated process where errors can be introduced in each of the steps. Triqler is a Python package that estimates and integrates errors of the different parts of the label-free protein quantification pipeline into a single Bayesian model. Specifically, it weighs the quantitative values by the confidence we have in the correctness of the corresponding PSM. Furthermore, it treats missing values in a way that reflects their uncertainty relative to observed values. Finally, it combines these error estimates in a single differential abundance FDR that not only reflects the errors and uncertainties in quantification but also in identification. In this tutorial, we show how to (1) generate input data for Triqler from quantification packages such as MaxQuant and Quandenser, (2) run Triqler and what the different options are, (3) interpret the results, (4) investigate the posterior distributions of a protein of interest in detail, and (5) verify that the hyperparameter estimations are sensible.

PMID:36308686 | DOI:10.1007/978-1-0716-1967-4_5

Validation of MS/MS Identifications and Label-Free Quantification Using Proline

Methods Mol Biol. 2023;2426:67-89. doi: 10.1007/978-1-0716-1967-4_4.

ABSTRACT

In the proteomics field, the production and publication of reliable mass spectrometry (MS)-based label-free quantitative results is a major concern. Due to the intrinsic complexity of bottom-up proteomics experiments (requiring aggregation of data relating to both precursor and fragment peptide ions into protein information, and matching this data across samples), inaccuracies and errors can occur throughout the data-processing pipeline. In a classical label-free quantification workflow, the validation of identification results is critical since errors made at this first stage of the workflow may have an impact on the following steps and therefore on the final result. Although false discovery rate (FDR) of the identification is usually controlled by using the popular target-decoy method, it has been demonstrated that this method can sometimes lead to inaccurate FDR estimates. This protocol shows how Proline can be used to validate identification results by using the method based on the Benjamini-Hochberg procedure and then quantify the identified ions and proteins in a single software environment providing data curation capabilities and computational efficiency.
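The Benjamini-Hochberg procedure named in the abstract can be sketched in a few lines of Python. This is the generic step-up adjustment on a list of p-values, not Proline's own implementation; the function name and the example p-values are assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values for four identifications.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
accepted = [i for i, q in enumerate(adjusted) if q <= 0.05]
print(adjusted, accepted)
```

Keeping the identifications whose adjusted value is at or below a chosen level controls the expected proportion of false discoveries at that level, under the usual assumptions of the procedure.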

PMID:36308685 | DOI:10.1007/978-1-0716-1967-4_4

Antecedents of self-protective behavior during the COVID-19 pandemic in Bangladesh

WHO South East Asia J Public Health. 2022 Jan-Jun;11(1):32-41. doi: 10.4103/WHO-SEAJPH.WHO-SEAJPH_172_21.

ABSTRACT

CONTEXT: Self-protective behavior (SPB) plays a significant role in controlling the spread of infection of a pandemic like coronavirus disease (COVID-19). Little research has been conducted to examine critical factors influencing SPB, especially in a developing country like Bangladesh.

AIMS: This study aimed to develop and test a theoretical model based on the extended information-motivation-behavior (IMB) skills model to investigate factors associated with SPB among Bangladeshi people.

METHODS: An online, cross-sectional survey was conducted on Bangladesh citizens (18 years and older) from June 1 to July 31, 2020. A total of 459 responses were used to assess the proposed model’s overall fit and test the hypothesized relationships among the model constructs.

STATISTICAL ANALYSIS USED: The data were analyzed using partial least squares structural equation modeling to identify relationships among model variables.

RESULTS: Health information-seeking behavior, health motivation, self-efficacy, and health consciousness (HC) (P < 0.05) had a significant impact on SPB among Bangladeshi people. The results identified the consequences of various degrees of HC on SPB in the COVID-19 outbreak.

CONCLUSIONS: This study confirms the IMB model’s applicability for analyzing SPB among people in developing countries like Bangladesh. The findings of this study could guide policymakers to develop and implement targeted strategies to ensure timely and transparent information for motivating people to improve SPB during the COVID-19 pandemic and in case of a future epidemic outbreak.

PMID:36308271 | DOI:10.4103/WHO-SEAJPH.WHO-SEAJPH_172_21

Predictors and causes of in-hospital maternal deaths within 120 h of admission at a tertiary hospital in South-Western, Nigeria: A retrospective cohort study

Niger Postgrad Med J. 2022 Oct-Dec;29(4):325-333. doi: 10.4103/npmj.npmj_180_22.

ABSTRACT

BACKGROUND: An efficient, comprehensive emergency obstetrics care (CEMOC) can considerably reduce the burden of maternal mortality (MM) in Nigeria. Information about the risk of maternal death within 120 h of admission can reflect the quality of CEMOC offered.

AIM: This study aims to determine the predictors and causes of maternal death within 120 h of admission at the Lagos University Teaching Hospital (LUTH), Lagos, South-Western Nigeria.

METHODS: We conducted a retrospective cohort study amongst consecutive maternal deaths at a hospital in South-Western Nigeria, from 1 January 2007 to 31 December 2017, using data from patients’ medical records. We compared participants that died within 120 h to participants that survived beyond 120 h. Survival life table analysis, Kaplan-Meier plots and multivariable Cox proportional hazard regression were conducted to evaluate the factors affecting survival within 120 h of admission. Stata version 16 statistical software (StataCorp, USA) was used for analysis.

RESULTS: Of the 430 maternal deaths, 326 had complete records. The mean age of the deceased was 30.7 (±5.9) years and median time to death was 24 (5-96) h. Two hundred and sixty-eight (82.2%) women out of 326 died within 120 h of admission. Almost all maternal deaths from uterine rupture (95.2%) and most deaths from obstetric haemorrhage (87.3%), induced miscarriage (88.9%), sepsis (82.9%) and hypertensive disorders of pregnancy (77.9%) occurred within 120 h of admission. Admission to the intensive care unit (P = 0.007), cadre of admitting doctor (P < 0.001), cause of death (P = 0.036) and mode of delivery (P = 0.012) were independent predictors of hazard of death within 120 h.

CONCLUSION: The majority (82.2%) of maternal deaths occurred within 120 h of admission. Investment in the prevention and acute management of uterine rupture, obstetric haemorrhage, sepsis and hypertensive disorders of pregnancy can help to reduce MM within 120 h in our environment.
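The methods of this study mention Kaplan-Meier plots. As a minimal sketch of the estimator behind such plots, the Python function below computes the survival curve from follow-up times and event indicators. The data are made up (hours since admission, 1 = death, 0 = censored) and are not the study's records; the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times:  follow-up time of each subject
    events: 1 if the event (death) was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set.
        at_risk -= removed
    return curve

# Hypothetical follow-up of six patients.
curve = kaplan_meier([6, 24, 24, 48, 96, 120], [1, 1, 1, 0, 1, 0])
print(curve)
```

Censored subjects contribute to the risk set until they leave it, which is what distinguishes this estimate from a simple proportion of deaths.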

PMID:36308262 | DOI:10.4103/npmj.npmj_180_22

Socio-demographic factors influencing measures of cognitive function of early adolescent students in Abuja, Nigeria

Niger Postgrad Med J. 2022 Oct-Dec;29(4):317-324. doi: 10.4103/npmj.npmj_157_22.

ABSTRACT

BACKGROUND: The brain in the early adolescent period undergoes enhanced changes with the radical reorganisation of the neuronal network leading to improvement in cognitive capacity. A complex interplay exists between environment and genetics that influences the outcome of intellectual capability. We, therefore, aimed to evaluate the relationship between socio-demographic variables and measures of cognitive function (intelligence quotient [IQ] and academic performance) of early adolescents.

METHODS: The study was a descriptive cross-sectional study of early adolescents aged 10-14 years. Raven’s Standard Progressive Matrices was used to assess the IQ and academic performance was assessed by obtaining the average of all the subjects’ scores in the last three terms that made up an academic year. A confidence interval of 95% was assumed and a value of P < 0.05 was considered statistically significant.

RESULTS: The overall mean (standard deviation) age of the study population was 11.1 years (±1.3) with male-to-female ratio of 1:1. Female sex was associated with better academic performance with P = 0.004. The students with optimal IQ performance were more likely (61.7%) to perform above average than those with sub-optimal IQ performance (28.6%). As the mother’s age increased, the likelihood of having optimal IQ performance increased 1.04 times (odds ratio [OR] = 1.04; 95% confidence interval [CI] = 1.01-1.07). Students in private schools were nearly three times more likely to have optimal IQ performance than those from public schools (OR = 2.79; 95% CI = 1.65-4.71).

CONCLUSION: The present study demonstrated that students’ IQ performance and the female gender were associated with above-average academic performance. The predictors of optimal IQ performance found in this study were students’ age, maternal age and school type.
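To make the odds ratios and 95% confidence intervals reported above more concrete, here is a minimal Python sketch that computes an unadjusted odds ratio with a Wald interval from a 2x2 table. The counts are hypothetical and are not the study's data, and the study's own estimates may come from a regression model rather than from this simple calculation.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 for an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR); all cells must be non-zero.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts: private vs public school, optimal vs sub-optimal IQ.
odds_ratio, lower, upper = odds_ratio_ci(60, 40, 35, 65)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
```

An interval that excludes 1 corresponds to an association that is significant at the matching level.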

PMID:36308261 | DOI:10.4103/npmj.npmj_157_22

Acceptability, appropriateness and feasibility of webinar in strengthening research capacity in COVID-19 era in Nigeria

Niger Postgrad Med J. 2022 Oct-Dec;29(4):288-295. doi: 10.4103/npmj.npmj_167_22.

ABSTRACT

INTRODUCTION: The challenges posed by the COVID-19 pandemic have necessitated the increasing use of online virtual training platforms. The objectives of the study were to assess the acceptability, appropriateness and feasibility of virtual space in strengthening the research capacity in Nigeria.

MATERIALS AND METHODS: Data were collected through an adapted online questionnaire from participants following a 2-day webinar. Both descriptive and inferential (bivariate and multivariate) analyses were done.

RESULTS: The findings of the study revealed that 55.2% of participants (n = 424) were males and 66.0% (n = 424) were early career researchers. Two hundred and thirty-six participants (55.7%) (n = 424) reported very good acceptability, 67.9% (n = 424) reported very good appropriateness while 54.7% (n = 424) reported good feasibility of webinar for research capacity strengthening. The rating of knowledge obtained from the webinar as ‘excellent’ increased the odds of acceptability (odds ratio [OR] = 38.30; P < 0.001), appropriateness (OR = 15.65; P < 0.05), and feasibility (OR = 20.85; P < 0.05). Furthermore, the preference for Zoom and other online platforms for learning increased the odds of acceptability of the webinar (OR = 2.29; confidence interval [CI]: 0.97-57.39; P < 0.05), appropriateness (OR = 2.55; CI: 1.10-5.91; P < 0.05) and feasibility (OR = 2.34; CI: 0.96-5.74; P < 0.05).

CONCLUSION: The study concluded that webinar was acceptable, appropriate and feasible for strengthening research capacity, although poor internet connectivity and cost of data were the major challenges in Nigeria. However, a learner-centred approach in contents’ delivery that ensures optimal learning has the potential of enhancing research capacity strengthening via virtual space.

PMID:36308257 | DOI:10.4103/npmj.npmj_167_22