Categories
Nevin Manimala Statistics

Dysregulation of alternative splicing contributes to multiple myeloma pathogenesis

Br J Cancer. 2023 Jan 2. doi: 10.1038/s41416-022-02124-7. Online ahead of print.

ABSTRACT

BACKGROUND: Dysregulation of alternative splicing (AS) triggers many tumours. Understanding the roles of splicing events during tumorigenesis would open new avenues for therapies and prognosis in multiple myeloma (MM).

METHODS: Molecular, genetic, bioinformatic and statistical approaches are used to determine the mechanism of the candidate splicing factor (SF) in myeloma cell lines, myeloma xenograft models and MM patient samples.

RESULTS: GSEA reveals a significant difference in the expression pattern of the alternative splicing pathway genes, notably enriched in MM patients. Upregulation of the splicing factor SRSF1 is observed in the progression of plasma cell dyscrasias and predicts poor prognosis in MM patients. The c-indices of the Cox model indicate that SRSF1 improves the prognostic stratification of MM patients. Moreover, SRSF1 knockdown exerts a broad anti-myeloma activity in vitro and in vivo. The upregulation of SRSF1 is caused by the transcription factor YY1, which also functions as an oncogene in myeloma cells. Through RNA-Seq, we systematically verify that SRSF1 promotes the tumorigenesis of myeloma cells by switching AS events.

CONCLUSION: Our results emphasise the importance of AS for promoting tumorigenesis of MM. The candidate SF might be considered as a valuable therapeutic target and a potential prognostic biomarker for MM.

PMID:36593359 | DOI:10.1038/s41416-022-02124-7

Computational method of the cardiovascular diseases classification based on a generalized nonlinear canonical decomposition of random sequences

Sci Rep. 2023 Jan 2;13(1):59. doi: 10.1038/s41598-022-27318-0.

ABSTRACT

Decision support systems can substantially help medical doctors in the diagnosis of different diseases, especially in complicated cases. This article is devoted to recognizing and diagnosing heart disease based on automatic computer processing of the electrocardiograms (ECG) of patients. In the general case, the change of the ECG parameters can be presented as a random sequence of the signals under processing. Developing new computational methods for such signal processing is an important research problem in creating efficient medical decision support systems. The authors consider the possibility of increasing the diagnostic accuracy of cardiovascular diseases by implementing the proposed computational method of information processing. This method is based on the generalized nonlinear canonical decomposition of a random sequence of the change of cardiogram parameters. The use of a nonlinear canonical model makes it possible to significantly simplify the maximum likelihood criterion for classifying diseases. This simplification is provided by the transition from a multi-dimensional distribution density of cardiogram parameters to a product of one-dimensional distribution densities of independent random coefficients of a nonlinear canonical decomposition. The absence of any restrictions on the class of random sequences under study makes it possible to achieve maximum accuracy in diagnosing cardiovascular diseases. Functional diagrams for implementing the proposed method reflecting the features of its application are presented. The quantitative parameters of the core of the computational diagnostic procedure can be determined in advance based on the preliminary statistical data of the ECGs for different heart diseases. That is why the developed method is quite simple in terms of computation (computing complexity, accuracy, computing time, etc.) and can be implemented in medical computer decision systems for monitoring cardiovascular diseases and for their diagnosis in real time. The results of the numerical experiment confirm the high accuracy of the developed method for classifying cardiovascular diseases.
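
The simplification described in this abstract, replacing a multi-dimensional density with a product of one-dimensional densities of independent coefficients, can be illustrated with a small sketch. This is not the authors' implementation. It assumes the decomposition coefficients have already been computed, models each coefficient with a Gaussian density (an assumption; the abstract does not name the densities), and uses hypothetical class labels and parameters.

```python
import math

def log_gaussian(x, mean, std):
    """Log of a one-dimensional Gaussian density (assumed form, for illustration)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def classify(coeffs, class_params):
    """Pick the class with the highest log-likelihood.

    Because the coefficients are treated as independent, the joint density
    is a product of 1-D densities, so the log-likelihood is a simple sum.
    """
    best_label, best_ll = None, -math.inf
    for label, params in class_params.items():
        ll = sum(log_gaussian(x, m, s) for x, (m, s) in zip(coeffs, params))
        if ll > best_ll:
            best_label, best_ll = label, ll
    return best_label

# Hypothetical per-class (mean, std) pairs for two coefficients.
CLASS_PARAMS = {
    "healthy": [(0.0, 1.0), (0.0, 1.0)],
    "disease_a": [(2.0, 1.0), (1.5, 0.5)],
}
```

Calling `classify([0.1, -0.2], CLASS_PARAMS)` evaluates two one-dimensional densities per class instead of one two-dimensional density, which is the computational saving the abstract refers to.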

PMID:36593356 | DOI:10.1038/s41598-022-27318-0

Circulating vitamin D and breast cancer risk: an international pooling project of 17 cohorts

Eur J Epidemiol. 2023 Jan 3. doi: 10.1007/s10654-022-00921-1. Online ahead of print.

ABSTRACT

Laboratory and animal research support a protective role for vitamin D in breast carcinogenesis, but epidemiologic studies have been inconclusive. To examine comprehensively the relationship of circulating 25-hydroxyvitamin D [25(OH)D] to subsequent breast cancer incidence, we harmonized and pooled participant-level data from 10 U.S. and 7 European prospective cohorts. Included were 10,484 invasive breast cancer cases and 12,953 matched controls. Median age (interdecile range) was 57 (42-68) years at blood collection and 63 (49-75) years at breast cancer diagnosis. Prediagnostic circulating 25(OH)D was either newly measured using a widely accepted immunoassay and laboratory or, if previously measured by the cohort, calibrated to this assay to permit using a common metric. Study-specific relative risks (RRs) for season-standardized 25(OH)D concentrations were estimated by conditional logistic regression and combined by random-effects models. Circulating 25(OH)D increased from a median of 22.6 nmol/L in consortium-wide decile 1 to 93.2 nmol/L in decile 10. Breast cancer risk in each decile was not statistically significantly different from risk in decile 5 in models adjusted for breast cancer risk factors, and no trend was apparent (P-trend = 0.64). Compared to women with sufficient 25(OH)D based on Institute of Medicine guidelines (50- < 62.5 nmol/L), RRs were not statistically significantly different at either low concentrations (< 20 nmol/L, 3% of controls) or high concentrations (100- < 125 nmol/L, 3% of controls; ≥ 125 nmol/L, 0.7% of controls). RR per 25 nmol/L increase in 25(OH)D was 0.99 [95% confidence interval (CI) 0.95-1.03]. Associations remained null across subgroups, including those defined by body mass index, physical activity, latitude, and season of blood collection. Although none of the associations by tumor characteristics reached statistical significance, suggestive inverse associations were seen for distant and triple negative tumors. Circulating 25(OH)D, comparably measured in 17 international cohorts and season-standardized, was not related to subsequent incidence of invasive breast cancer over a broad range in vitamin D status.
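
Combining study-specific relative risks with a random-effects model can be sketched as below. The abstract does not say which estimator was used; this sketch assumes the common DerSimonian-Laird approach, and any input values passed to it are hypothetical rather than taken from the study.

```python
import math

def pool_random_effects(log_rrs, std_errs):
    """DerSimonian-Laird random-effects pooling of log relative risks.

    Returns (pooled RR, lower 95% limit, upper 95% limit).
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_rrs)) / total_w
    # Cochran's Q measures between-study heterogeneity.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_rrs))
    k = len(log_rrs)
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance tau2.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errs]
    pooled = sum(w * y for w, y in zip(re_weights, log_rrs)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    return (
        math.exp(pooled),
        math.exp(pooled - 1.96 * se_pooled),
        math.exp(pooled + 1.96 * se_pooled),
    )
```

When the studies agree closely, the estimated between-study variance is zero and the result matches a fixed-effect inverse-variance pool.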

PMID:36593337 | DOI:10.1007/s10654-022-00921-1

Bee species perform distinct foraging behaviors that are best described by different movement models

Sci Rep. 2023 Jan 2;13(1):71. doi: 10.1038/s41598-022-26858-9.

ABSTRACT

In insect-pollinated plants, the foraging behavior of pollinators affects their pattern of movement. If distinct bee species vary in their foraging behaviors, different models may best describe their movement. In this study, we quantified and compared the fine scale movement of three bee species foraging on patches of Medicago sativa. Bee movement was described using distances and directions traveled between consecutive racemes. Bumble bees and honey bees traveled shorter distances after visiting many flowers on a raceme, while the distance traveled by leafcutting bees was independent of flower number. Transition matrices and vectors were calculated for bumble bees and honey bees to reflect their directionality of movement within foraging bouts; leafcutting bees were as likely to move in any direction. Bee species varied in their foraging behaviors, and for each bee species, we tested four movement models that differed in how distances and directions were selected, and identified the model that best explained the movement data. The fine-scale, within-patch movement of bees could not always be explained by a random movement model, and a general model of movement could not be applied to all bee species.
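
A transition matrix of movement directions, as mentioned in this abstract, can be estimated from an observed sequence of moves. The sketch below is only an illustration: the four direction categories are hypothetical, since the abstract does not state how directions were binned.

```python
from collections import Counter

# Hypothetical direction categories; the study's actual binning is not given.
DIRECTIONS = ("forward", "left", "right", "backward")

def transition_matrix(sequence, states=DIRECTIONS):
    """Estimate P(next direction | current direction) from one foraging bout."""
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = {}
    for current in states:
        row_total = sum(counts[(current, nxt)] for nxt in states)
        matrix[current] = {
            nxt: (counts[(current, nxt)] / row_total if row_total else 0.0)
            for nxt in states
        }
    return matrix
```

A species that moves in any direction with equal likelihood would show roughly uniform rows, while strong directionality shows up as rows concentrated on one or two columns.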

PMID:36593317 | DOI:10.1038/s41598-022-26858-9

Work-related head injury and industry sectors in Finland: causes and circumstances

Int Arch Occup Environ Health. 2023 Jan 3. doi: 10.1007/s00420-022-01950-9. Online ahead of print.

ABSTRACT

OBJECTIVE: Despite the continuous development of occupational safety, the prevalence of work-related head injuries is excessive. To promote prevention, we conducted a study evaluating the risks and pathways that precede head injuries in different economic activity sectors.

METHODS: In Finland, more than 90% of employees are covered by inclusive statutory workers’ compensation. We obtained data on occupational head injuries in 2010-2017 from an insurance company database. The European Statistics on Accidents at Work (ESAW) variables represented the characteristics of the accidents and the injury. We analysed the risk factors, contributing events and injury mechanisms in 20 industry sectors, based on the Statistical Classification of Economic Activities in the European Community (NACE).

RESULTS: In the 32,898 cases, the most commonly affected area was the eyes (49.6%). The highest incidence of head injuries was in construction (15.7 per 1000 insurance years). Construction, manufacturing, and human health and social work activities stood out due to their distinctive ESAW category counts. ‘Working with hand-held tools’ [risk ratio (RR) 2.23, 95% confidence interval (CI) 2.14-2.32] in construction and ‘operating machines’ (RR 3.32, 95% CI 3.01-3.66) and ‘working with hand-held tools’ (1.99, 1.91-2.07) in manufacturing predicted head injury. The risk related to parameters of violence and threats in health and social work activities was nearly ninefold the risk of other sectors.

CONCLUSION: The risks and pathways preceding head injuries varied considerably. The highest head injury rates were in construction and manufacturing. Violence emerged as a major risk factor in human health and social work activities.
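
For readers unfamiliar with the risk ratio and confidence interval notation used in the results, the following sketch shows one standard way to compute them from a two-by-two table (the log method). It is an assumption that a comparable calculation underlies the reported figures; the function name and any counts passed to it are hypothetical.

```python
import math

def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk ratio with a 95% confidence interval via the log method."""
    rr = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    # Standard error of log(RR) for cumulative-incidence data.
    se = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - 1.96 * se)
    upper = math.exp(math.log(rr) + 1.96 * se)
    return rr, lower, upper
```

An interval that excludes 1, as in the construction and manufacturing results above, indicates that the activity is associated with a different head injury risk than the comparison group.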

PMID:36593301 | DOI:10.1007/s00420-022-01950-9

A comparison of machine learning algorithms and traditional regression-based statistical modeling for predicting hypertension incidence in a Canadian population

Sci Rep. 2023 Jan 2;13(1):13. doi: 10.1038/s41598-022-27264-x.

ABSTRACT

Risk prediction models are frequently used to identify individuals at risk of developing hypertension. This study evaluates different machine learning algorithms and compares their predictive performance with the conventional Cox proportional hazards (PH) model to predict hypertension incidence using survival data. This study analyzed 18,322 participants on 24 candidate features from the large Alberta’s Tomorrow Project (ATP) to develop different prediction models. To select the top features, we applied five feature selection methods, including two filter-based: a univariate Cox p-value and C-index; two embedded-based: random survival forest and least absolute shrinkage and selection operator (Lasso); and one constraint-based: the statistically equivalent signature (SES). Five machine learning algorithms were developed to predict hypertension incidence: penalized regression Ridge, Lasso, Elastic Net (EN), random survival forest (RSF), and gradient boosting (GB), along with the conventional Cox PH model. The predictive performance of the models was assessed using the C-index. The performance of the machine learning algorithms was similar to that of the conventional Cox PH model. Average C-indexes were 0.78, 0.78, 0.78, 0.76, 0.76, and 0.77 for Ridge, Lasso, EN, RSF, GB and Cox PH, respectively. Important features associated with each model were also presented. Our study findings demonstrate little predictive performance difference between machine learning algorithms and the conventional Cox PH regression model in predicting hypertension incidence. In a moderately sized dataset with a reasonable number of features, conventional regression-based models perform similarly to machine learning algorithms, with good predictive accuracy.
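
The C-index used to compare the models measures how often a model assigns the higher risk to the participant who develops the outcome sooner. A minimal sketch of Harrell's version for right-censored data follows; it is an illustration of the metric, not the study's code, and the quadratic pair loop is chosen for clarity rather than speed.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times: follow-up times; events: 1 if the event occurred, 0 if censored;
    risk_scores: higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i had the event strictly before j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported averages of 0.76 to 0.78 indicate similar discrimination across all six models.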

PMID:36593280 | DOI:10.1038/s41598-022-27264-x

Impact of coronavirus disease 2019 on pediatric intestinal intussusception in the United States

Pediatr Radiol. 2023 Jan 3. doi: 10.1007/s00247-022-05572-8. Online ahead of print.

ABSTRACT

BACKGROUND: Masking and social distancing to mitigate the spread of the SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) virus curbed the spread of other viruses. Given a potential link between viral illnesses and ileocolic intussusception, the purpose of this study is to characterize trends in incidence, diagnosis and management of pediatric intussusception in the United States in the context of the coronavirus disease 2019 (COVID-19) pandemic.

MATERIALS AND METHODS: This cross-sectional retrospective study used the Pediatric Hospital Information System and included children (ages 0-17 years) with a primary diagnosis of intussusception (ICD-10 [International Classification of Diseases, Tenth Revision]: K56.1) from January 2018 to December 2021. Descriptive statistics and chi-square analyses were used to characterize and compare proportions pre-COVID (2018 and 2019) to 2020 and 2021.

RESULTS: Eight thousand one hundred forty-three encounters met inclusion criteria. Intussusception diagnoses declined in 2020 (n = 1,480) compared to 2019 (n = 2,321) and 2018 (n = 2,171) but returned to pre-COVID levels in 2021 (n = 2,171). Patient age was similar across years (mean age in years: 2018: 2.3; 2019: 2.1; 2020: 2.3; 2021: 2.3). There was no significant change in the proportion of patients who underwent imaging in 2020 (96% [1,415/1,480]) compared to the other years in the study (2018: 96% [2,093/2,171], P = 0.21; 2019: 97% [2,253/2,321], P = 0.80; 2021: 96% [1,415/1,480], P = 0.85). There was a statistically significant but minimal increase in the proportion of cases treated with surgery in 2020 compared to 2019 (2020: 17.8% vs. 2019: 15%, P = 0.02); however, this was not replicated in the pairwise comparison of 2020 to 2018 (2020: 17.8% vs. 2018: 16.4%, P = 0.23). There was a statistically significant increase in the proportion of cases treated with surgery in 2020 compared to 2021 (2020: 17.8% vs. 2021: 14%, P = 0.001).

CONCLUSION: Pediatric intussusception diagnoses decreased at a national level in 2020 compared to previous years, with a rebound increase in 2021. This may reflect a secondary benefit of public health interventions imposed to curb the spread of COVID-19.
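
The pairwise comparisons of proportions reported in the results rely on chi-square tests. The sketch below shows a Pearson chi-square test for a two-by-two table without continuity correction; whether the authors applied a correction is not stated, so results from this sketch may differ slightly from the published P values.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Table layout: group 1 = (a with outcome, b without),
                  group 2 = (c with outcome, d without).
    Returns (chi-square statistic, p-value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a nonzero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Imaging in 2020 (1,415 of 1,480) versus 2018 (2,093 of 2,171), from the abstract.
chi2, p = chi_square_2x2(1415, 1480 - 1415, 2093, 2171 - 2093)
```

With these counts the test gives a non-significant result, consistent with the abstract's statement that the imaging proportion did not change.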

PMID:36593279 | DOI:10.1007/s00247-022-05572-8

Two phase feature-ranking for new soil dataset for Coxiella burnetii persistence and classification using machine learning models

Sci Rep. 2023 Jan 2;13(1):29. doi: 10.1038/s41598-022-26956-8.

ABSTRACT

Coxiella burnetii (Cb) is a hardy, stealthy bacterial pathogen that is lethal for humans and animals. Its tremendous resistance to the environment, ease of propagation, and incredibly low infectious dosage make it an attractive organism for biowarfare. Current research on the classification of Coxiella and features influencing its presence in the soil is generally confined to statistical techniques. Machine learning, as an alternative to traditional approaches, can help improve epidemiological modeling for this soil-based pathogen of public significance. We developed a two-phase feature-ranking technique for the pathogen on a new soil feature dataset. The feature ranking applies methods such as ReliefF (RLF), OneR (ONR), and correlation (CR) in the first phase and a combination of techniques utilizing weighted scores to determine the final soil attribute ranks in the second phase. Different classification methods such as Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Logistic Regression (LR), and Multi-Layer Perceptron (MLP) have been utilized to classify the soil attribute dataset into Coxiella-positive and Coxiella-negative soils. The feature-ranking methods established that potassium, chromium, cadmium, nitrogen, organic matter, and soluble salts are the most significant attributes, while manganese, clay, phosphorous, copper, and lead are the least contributing soil features for the prevalence of the bacteria. Overall, potassium is the most influential feature, and manganese is the least significant soil feature. The attribute ranking using RLF generates the most promising results among the ranking methods, with an accuracy of 80.85% for MLP, 79.79% for LR, and 79.8% for LDA. Overall, SVM and MLP are the best-performing classifiers: SVM yields an accuracy of 82.98% and 81.91% for attribute ranking by CR and RLF, and MLP generates an accuracy of 76.60% for ONR. Thus, machine learning models can help us better understand the environmental conditions that assist the prevalence of the bacteria and decrease the chances of false classification. Subsequently, this can assist in controlling epidemics and alleviating the devastating effect on the socio-economics of society.
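
The second phase of the ranking, which combines the first-phase rankings through weighted scores, can be sketched as follows. The abstract does not give the scoring rule or the weights, so this example assumes a simple weighted positional score; the example orderings in the usage note are hypothetical and not taken from the study.

```python
def combine_rankings(rankings, weights):
    """Merge several feature rankings into one using weighted positional scores.

    rankings: dict mapping method name -> list of features, best first.
    weights: dict mapping method name -> weight (assumed, not from the paper).
    A feature in position p of an n-item list earns weight * (n - p).
    """
    scores = {}
    for method, ordered in rankings.items():
        n = len(ordered)
        for position, feature in enumerate(ordered):
            scores[feature] = scores.get(feature, 0.0) + weights[method] * (n - position)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))
```

For instance, given three hypothetical first-phase orderings of potassium, nitrogen, and manganese from RLF, ONR, and CR with equal weights, the function returns a single consensus ordering of those three attributes.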

PMID:36593267 | DOI:10.1038/s41598-022-26956-8

Bacterial nanocellulose production using Cantaloupe juice, statistical optimization and characterization

Sci Rep. 2023 Jan 2;13(1):51. doi: 10.1038/s41598-022-26642-9.

ABSTRACT

Bacterial nanocellulose has been used in a wide range of biomedical applications, including carriers for drug delivery, blood vessels, artificial skin and wound dressing. A total of ten morphologically different bacterial strains were screened for their potential to produce bacterial nanocellulose (BNC). Among these isolates, Bacillus sp. strain SEE-3 exhibited a potent ability to produce bacterial nanocellulose. The crystallinity, particle size and morphology of the purified biosynthesized nanocellulose were characterized. The cellulose nanofibers possess a negatively charged surface of -14.7 mV. The SEM images of the bacterial nanocellulose confirm the formation of fiber-shaped particles with diameters of 20.12‒47.36 nm. The TEM images show needle-shaped particles with diameters of 30‒40 nm and lengths of 560‒1400 nm. X-ray diffraction shows that the obtained bacterial nanocellulose has a degree of crystallinity of 79.58%. FTIR spectra revealed the characteristic bands of the cellulose crystalline structure. The thermogravimetric analysis revealed high thermal stability. Optimization of bacterial nanocellulose production was achieved using Plackett-Burman and face-centered central composite designs. Using the desirability function, the optimum conditions for maximum bacterial nanocellulose production were determined theoretically and verified experimentally. Maximum BNC production (20.31 g/L) by Bacillus sp. strain SEE-3 was obtained under the following conditions: medium volume, 100 mL per 250 mL conical flask; inoculum size, 5% (v/v); citric acid, 1.5 g/L; yeast extract, 5 g/L; temperature, 37 °C; Na2HPO4, 3 g/L; initial pH, 5; Cantaloupe juice concentration, 81.27%; and peptone, 11.22 g/L.

PMID:36593253 | DOI:10.1038/s41598-022-26642-9

Prediction of tide level based on variable weight combination of LightGBM and CNN-BiGRU model

Sci Rep. 2023 Jan 2;13(1):9. doi: 10.1038/s41598-022-26213-y.

ABSTRACT

Accurate tide level prediction is crucial to human activities in coastal areas. Many practical applications show that, compared with traditional harmonic analysis, long short-term memory (LSTM), gated recurrent units (GRUs) and other neural networks, along with ensemble learning models such as light gradient boosting machine (LightGBM) and eXtreme gradient boosting (XGBoost), can achieve extremely high prediction accuracy in relatively stationary time series. Building on this research, this paper proposes a variable weight combination model based on LightGBM and CNN-BiGRU. It uses the variable weight combination method to weight and synthesize the prediction results of the two base models so that the combination model has a stronger ability to capture time series features and fits the data well. The experimental results show that, compared with the base model LightGBM, the RMSE and MAE of the combination model are reduced by 43.2% and 44.7%, respectively; compared with the base model CNN-BiGRU, the RMSE and MAE of the combination model are reduced by 35.3% and 39.1%, respectively. This means that the variable weight combination model can greatly improve the accuracy of tide level prediction. In addition, we use tidal data from different geographical environments to further verify the good universality of the model. This study provides a new idea and method for tide prediction.
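
A variable weight combination of two base models, together with the RMSE and MAE metrics used to evaluate it, can be sketched as below. The abstract does not state the weighting rule, so this example assumes that each model's weight at a time step is inversely related to its absolute error at the previous step. The function names are hypothetical, and the inputs stand in for the LightGBM and CNN-BiGRU predictions.

```python
import math

def variable_weight_combine(pred_a, pred_b, actual, eps=1e-9):
    """Combine two forecasts with weights updated at every time step.

    Assumed rule: the model with the smaller absolute error at step t-1
    receives the larger weight at step t. Equal weights are used at t = 0.
    """
    combined = []
    for t in range(len(pred_a)):
        if t == 0:
            w_a = 0.5
        else:
            err_a = abs(actual[t - 1] - pred_a[t - 1])
            err_b = abs(actual[t - 1] - pred_b[t - 1])
            w_a = (err_b + eps) / (err_a + err_b + 2 * eps)
        combined.append(w_a * pred_a[t] + (1.0 - w_a) * pred_b[t])
    return combined

def rmse(pred, actual):
    """Root mean squared error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def mae(pred, actual):
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)
```

Because the weights depend only on errors that are already observed, the combination can be computed online as new tide measurements arrive.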

PMID:36593233 | DOI:10.1038/s41598-022-26213-y