Categories
Nevin Manimala Statistics

Identifying and analyzing sepsis states: A retrospective study on patients with sepsis in ICUs

PLOS Digit Health. 2022 Nov 10;1(11):e0000130. doi: 10.1371/journal.pdig.0000130. eCollection 2022 Nov.

ABSTRACT

Sepsis accounts for more than 50% of hospital deaths, and the associated cost ranks the highest among hospital admissions in the US. Improved understanding of disease states, progression, severity, and clinical markers has the potential to significantly improve patient outcomes and reduce cost. We develop a computational framework that identifies disease states in sepsis and models disease progression using clinical variables and samples in the MIMIC-III database. We identify six distinct patient states in sepsis, each associated with different manifestations of organ dysfunction. We find that patients in different sepsis states are statistically significantly composed of distinct populations with disparate demographic and comorbidity profiles. Our progression model accurately characterizes the severity level of each pathological trajectory and identifies significant changes in clinical variables and treatment actions during sepsis state transitions. Collectively, our framework provides a holistic view of sepsis, and our findings provide the basis for future development of clinical trials, prevention, and therapeutic strategies for sepsis.

PMID:36812596 | DOI:10.1371/journal.pdig.0000130

Functional connectivity based machine learning approach for autism detection in young children using MEG signals

J Neural Eng. 2023 Feb 22. doi: 10.1088/1741-2552/acbe1f. Online ahead of print.

ABSTRACT

OBJECTIVE: Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder, and identifying early autism biomarkers plays a vital role in improving detection and subsequent life outcomes. This study aims to reveal hidden biomarkers in the patterns of functional brain connectivity as recorded by the neuro-magnetic brain responses in children with ASD.

APPROACH: We recorded resting-state MEG signals from thirty children with ASD (4-7 years) and thirty age, gender-matched typically developing (TD) children. We used a complex coherency-based functional connectivity analysis to understand the interactions between different brain regions of the neural system. The work characterizes the large-scale neural activity at different brain oscillations using functional connectivity analysis and assesses the classification performance of coherence-based (COH) measures for autism detection in young children. A comparative study has also been carried out on COH-based connectivity networks both region-wise and sensor-wise to understand frequency-band-specific connectivity patterns and their connections with autism symptomatology. We used Artificial Neural Network (ANN) and Support Vector Machine (SVM) classifiers in the machine learning framework with a 5-fold cross-validation technique.

MAIN RESULTS: To classify ASD from TD children, the COH connectivity feature yields the highest classification accuracy of 91.66% in the high gamma (50-100 Hz) frequency band. In region-wise connectivity analysis, the second highest performance is in the delta band (1-4 Hz) after the gamma band. Combining the delta and gamma band features, we achieved a classification accuracy of 95.03% and 93.33% in the ANN and SVM classifiers, respectively. Using classification performance metrics and further statistical analysis, we show that ASD children demonstrate significant hyperconnectivity.

SIGNIFICANCE: Our findings support the weak central coherency theory in autism detection. Further, we show that region-wise coherence analysis outperforms sensor-wise connectivity analysis despite its lower complexity. Altogether, these results demonstrate that functional brain connectivity patterns are an appropriate biomarker of autism in young children.
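The 5-fold cross-validation mentioned in the approach can be illustrated with a minimal sketch. It assumes a plain shuffled split of the 60 recordings (30 ASD, 30 TD); the abstract does not say whether the folds were stratified by group, so `k_fold_indices` and its `seed` parameter are hypothetical and not the authors' implementation.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Assumption: a simple shuffled split without stratification.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (near-)equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            j for f, fold in enumerate(folds) if f != i for j in fold
        )
        yield train_idx, test_idx


# 60 recordings in total, as reported in the abstract.
splits = list(k_fold_indices(60, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Each recording appears in exactly one test fold, so every subject is scored once by a classifier that never saw it during training.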

PMID:36812588 | DOI:10.1088/1741-2552/acbe1f

A proposed de-identification framework for a cohort of children presenting at a health facility in Uganda

PLOS Digit Health. 2022 Aug 24;1(8):e0000027. doi: 10.1371/journal.pdig.0000027. eCollection 2022 Aug.

ABSTRACT

Data sharing has enormous potential to accelerate and improve the accuracy of research, strengthen collaborations, and restore trust in the clinical research enterprise. Nevertheless, there remains reluctance to openly share raw data sets, in part due to concerns regarding research participant confidentiality and privacy. Statistical data de-identification is an approach that can be used to preserve privacy and facilitate open data sharing. We have proposed a standardized framework for the de-identification of data generated from cohort studies in children in a low-and-middle income country. We applied a standardized de-identification framework to a data set comprising 241 health related variables collected from a cohort of 1750 children with acute infections from Jinja Regional Referral Hospital in Eastern Uganda. Variables were labeled as direct and quasi-identifiers based on conditions of replicability, distinguishability, and knowability with consensus from two independent evaluators. Direct identifiers were removed from the data set, while a statistical risk-based de-identification approach using the k-anonymity model was applied to quasi-identifiers. Qualitative assessment of the level of privacy invasion associated with data set disclosure was used to determine an acceptable re-identification risk threshold and the corresponding k-anonymity requirement. A de-identification model using generalization followed by suppression was applied using a logical stepwise approach to achieve k-anonymity. The utility of the de-identified data was demonstrated using a typical clinical regression example. The de-identified data set was published on the Pediatric Sepsis Data CoLaboratory Dataverse, which provides moderated data access. Researchers are faced with many challenges when providing access to clinical data. We provide a standardized de-identification framework that can be adapted and refined based on specific context and risks. This process will be combined with moderated access to foster coordination and collaboration in the clinical research community.
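The generalization-then-suppression route to k-anonymity described above can be sketched on toy data. Everything in this example is hypothetical: the records, the quasi-identifier names (`age`, `district`), the 5-year age band, and the threshold k = 2 are illustrative assumptions, not the variables or settings used in the study.

```python
from collections import Counter


def equivalence_classes(records, quasi_identifiers):
    """Count records sharing each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_anonymity(records, quasi_identifiers):
    """Return the achieved k: the size of the smallest equivalence class."""
    return min(equivalence_classes(records, quasi_identifiers).values())


def generalize_age(record, band=5):
    """Generalization step: replace an exact age with a coarser age band."""
    low = (record["age"] // band) * band
    return {**record, "age": f"{low}-{low + band - 1}"}


def suppress_small_classes(records, quasi_identifiers, k):
    """Suppression step: drop records whose class is still smaller than k."""
    classes = equivalence_classes(records, quasi_identifiers)
    return [
        r for r in records
        if classes[tuple(r[q] for q in quasi_identifiers)] >= k
    ]


# Hypothetical toy records; not data from the study.
records = [
    {"age": 1, "district": "A"},
    {"age": 2, "district": "A"},
    {"age": 3, "district": "A"},
    {"age": 4, "district": "B"},
    {"age": 6, "district": "B"},
    {"age": 7, "district": "B"},
]
qi = ["age", "district"]

raw_k = k_anonymity(records, qi)
generalized = [generalize_age(r) for r in records]
released = suppress_small_classes(generalized, qi, k=2)
print(raw_k, k_anonymity(generalized, qi), k_anonymity(released, qi))
```

In this toy case generalization alone leaves one record unique, so that record is suppressed to reach k = 2. Generalizing first keeps more records than suppressing on the raw values would.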

PMID:36812586 | DOI:10.1371/journal.pdig.0000027

Predictability and stability testing to assess clinical decision instrument performance for children after blunt torso trauma

PLOS Digit Health. 2022 Aug 8;1(8):e0000076. doi: 10.1371/journal.pdig.0000076. eCollection 2022 Aug.

ABSTRACT

OBJECTIVE: The Pediatric Emergency Care Applied Research Network (PECARN) has developed a clinical-decision instrument (CDI) to identify children at very low risk of intra-abdominal injury. However, the CDI has not been externally validated. We sought to vet the PECARN CDI with the Predictability Computability Stability (PCS) data science framework, potentially increasing its chance of a successful external validation.

MATERIALS & METHODS: We performed a secondary analysis of two prospectively collected datasets: PECARN (12,044 children from 20 emergency departments) and an independent external validation dataset from the Pediatric Surgical Research Collaborative (PedSRC; 2,188 children from 14 emergency departments). We used PCS to reanalyze the original PECARN CDI along with new interpretable PCS CDIs developed using the PECARN dataset. External validation was then measured on the PedSRC dataset.

RESULTS: Three predictor variables (abdominal wall trauma, Glasgow Coma Scale Score <14, and abdominal tenderness) were found to be stable. A CDI using only these three variables would achieve lower sensitivity than the original PECARN CDI with seven variables on internal PECARN validation but achieve the same performance on external PedSRC validation (sensitivity 96.8% and specificity 44%). Using only these variables, we developed a PCS CDI which had a lower sensitivity than the original PECARN CDI on internal PECARN validation but performed the same on external PedSRC validation (sensitivity 96.8% and specificity 44%).

CONCLUSION: The PCS data science framework vetted the PECARN CDI and its constituent predictor variables prior to external validation. We found that the 3 stable predictor variables represented all of the PECARN CDI’s predictive performance on independent external validation. The PCS framework offers a less resource-intensive method than prospective validation to vet CDIs before external validation. We also found that the PECARN CDI will generalize well to new populations and should be prospectively externally validated. The PCS framework offers a potential strategy to increase the chance of a successful (costly) prospective validation.
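The sensitivity and specificity figures reported above follow from a confusion matrix. The sketch below assumes a rule that flags a child as not very low risk when any of the three stable predictors is present; that combination logic, the field names, and the toy patients are illustrative assumptions, not the published PECARN or PCS instrument.

```python
def three_variable_rule(patient):
    """Hypothetical rule: flag the child if any stable predictor is present."""
    return (
        patient["abdominal_wall_trauma"]
        or patient["gcs"] < 14
        or patient["abdominal_tenderness"]
    )


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) from paired boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    return tp / (tp + fn), tn / (tn + fp)


def make_patient(wall_trauma, gcs, tenderness, injury):
    """Build one toy patient record (all values are invented)."""
    return {
        "abdominal_wall_trauma": wall_trauma,
        "gcs": gcs,
        "abdominal_tenderness": tenderness,
        "injury": injury,
    }


patients = [
    make_patient(True, 15, False, True),    # flagged, injured
    make_patient(False, 13, False, True),   # flagged, injured
    make_patient(False, 15, False, True),   # missed injury
    make_patient(False, 15, True, False),   # flagged, no injury
    make_patient(False, 15, False, False),  # correctly low risk
    make_patient(False, 15, False, False),  # correctly low risk
    make_patient(False, 14, False, False),  # GCS 14 is not below 14
]

y_true = [p["injury"] for p in patients]
y_pred = [three_variable_rule(p) for p in patients]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(round(sens, 3), round(spec, 3))
```

For a rule meant to identify very low risk children, sensitivity matters most, because a missed injury costs far more than an unnecessary scan.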

PMID:36812570 | DOI:10.1371/journal.pdig.0000076

Population analysis of mortality risk: Predictive models from passive monitors using motion sensors for 100,000 UK Biobank participants

PLOS Digit Health. 2022 Oct 20;1(10):e0000045. doi: 10.1371/journal.pdig.0000045. eCollection 2022 Oct.

ABSTRACT

Many studies have utilized physical activity for predicting mortality risk, using measures such as participant walk tests and self-reported walking pace. The rise of passive monitors to measure participant activity without requiring specific actions opens the possibility for population level analysis. We have developed novel technology for this predictive health monitoring, using limited sensor inputs. In previous studies, we validated these models in clinical experiments with carried smartphones, using only their embedded accelerometers as motion sensors. Using smartphones as passive monitors for population measurement is critically important for health equity, since they are already ubiquitous in high-income countries and increasingly common in low-income countries. Our current study simulates smartphone data by extracting walking window inputs from wrist worn sensors. To analyze a population at national scale, we studied 100,000 participants in the UK Biobank who wore activity monitors with motion sensors for 1 week. This national cohort is demographically representative of the UK population, and this dataset represents the largest such available sensor record. We characterized participant motion during normal activities, including the daily living equivalent of timed walk tests. We then computed walking intensity from sensor data as input to survival analysis. Simulating passive smartphone monitoring, we validated predictive models using only sensors and demographics. This resulted in a C-index of 0.76 for 1-year risk, decreasing to 0.73 for 5-year risk. A minimum set of sensor features achieves a C-index of 0.72 for 5-year risk, which is similar in accuracy to other studies using methods not achievable with smartphone sensors. The smallest minimum model uses average acceleration, which has predictive value independent of the demographics of age and sex, similar to physical measures of gait speed. Our results show that passive measures with motion sensors can achieve similar accuracy to active measures of gait speed and walk pace, which utilize physical walk tests and self-reported questionnaires.
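The C-index quoted above measures how well predicted risk orders participants by survival time. A minimal sketch of Harrell's concordance index for right-censored data follows; the abstract does not state which C-index variant was used, so this definition and the four toy participants are assumptions made for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when participant i had an observed event
    strictly before time j. Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored participant cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data: follow-up in years, 1 = died, 0 = censored, model risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]

c_index = concordance_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported values of 0.72 to 0.76 in context.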

PMID:36812566 | DOI:10.1371/journal.pdig.0000045

A pilot study of the Earable device to measure facial muscle and eye movement tasks among healthy volunteers

PLOS Digit Health. 2022 Jun 30;1(6):e0000061. doi: 10.1371/journal.pdig.0000061. eCollection 2022 Jun.

ABSTRACT

The Earable device is a behind-the-ear wearable originally developed to measure cognitive function. Since Earable measures electroencephalography (EEG), electromyography (EMG), and electrooculography (EOG), it may also have the potential to objectively quantify facial muscle and eye movement activities relevant in the assessment of neuromuscular disorders. As an initial step to developing a digital assessment in neuromuscular disorders, a pilot study was conducted to determine whether the Earable device could be utilized to objectively measure facial muscle and eye movements intended to be representative of Performance Outcome Assessments (PerfOs), with tasks designed to model clinical PerfOs, referred to as mock-PerfO activities. The specific aims of this study were: to determine whether the Earable raw EMG, EOG, and EEG signals could be processed to extract features describing these waveforms; to determine Earable feature data quality, test re-test reliability, and statistical properties; to determine whether features derived from Earable could be used to determine the difference between various facial muscle and eye movement activities; and to determine what features and feature types are important for mock-PerfO activity level classification. A total of N = 10 healthy volunteers participated in the study. Each study participant performed 16 mock-PerfO activities, including talking, chewing, swallowing, eye closure, gazing in different directions, puffing cheeks, chewing an apple, and making various facial expressions. Each activity was repeated four times in the morning and four times at night. A total of 161 summary features were extracted from the EEG, EMG, and EOG bio-sensor data. Feature vectors were used as input to machine learning models to classify the mock-PerfO activities, and model performance was evaluated on a held-out test set. Additionally, a convolutional neural network (CNN) was used to classify low-level representations of the raw bio-sensor data for each task, and model performance was correspondingly evaluated and compared directly to feature classification performance. The prediction accuracy of each model was quantitatively assessed as a measure of the Earable device's classification ability. Study results indicate that Earable can potentially quantify different aspects of facial and eye movements and may be used to differentiate mock-PerfO activities. Specifically, Earable was found to differentiate talking, chewing, and swallowing tasks from other tasks with observed F1 scores >0.9. While EMG features contribute to classification accuracy for all tasks, EOG features are important for classifying gaze tasks. Finally, we found that analysis with summary features outperformed a CNN for activity classification. We believe Earable may be used to measure cranial muscle activity relevant for neuromuscular disorder assessment. Classification performance of mock-PerfO activities with summary features enables a strategy for detecting disease-specific signals relative to controls, as well as the monitoring of intra-subject treatment responses. Further testing is needed to evaluate the Earable device in clinical populations and clinical development settings.
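The per-task F1 scores mentioned above can be computed one task at a time against all other tasks. This is a minimal sketch under that one-vs-rest assumption; the activity labels and predictions are invented for illustration and are not study data.

```python
def f1_one_vs_rest(y_true, y_pred, positive):
    """F1 score for one activity label treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positives: precision and recall are zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical held-out labels and classifier outputs.
y_true = ["talking", "talking", "chewing", "chewing", "swallowing", "gaze_left"]
y_pred = ["talking", "talking", "chewing", "swallowing", "swallowing", "gaze_left"]

scores = {
    label: f1_one_vs_rest(y_true, y_pred, label)
    for label in sorted(set(y_true))
}
print(scores)
```

F1 is the harmonic mean of precision and recall, so a task scores above 0.9 only when it is rarely missed and rarely confused with other tasks.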

PMID:36812552 | DOI:10.1371/journal.pdig.0000061

A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study

PLOS Digit Health. 2022 Jun 13;1(6):e0000062. doi: 10.1371/journal.pdig.0000062. eCollection 2022 Jun.

ABSTRACT

Risk scores are widely used for clinical decision making and commonly generated from logistic regression models. Machine-learning-based methods may work well for identifying important predictors to create parsimonious scores, but such ‘black box’ variable selection limits interpretability, and variable importance evaluated from a single model can be biased. We propose a robust and interpretable variable selection approach using the recently developed Shapley variable importance cloud (ShapleyVIC) that accounts for variability in variable importance across models. Our approach evaluates and visualizes overall variable contributions for in-depth inference and transparent variable selection, and filters out non-significant contributors to simplify model building steps. We derive an ensemble variable ranking from variable contributions across models, which is easily integrated with an automated and modularized risk score generator, AutoScore, for convenient implementation. In a study of early death or unplanned readmission after hospital discharge, ShapleyVIC selected 6 variables from 41 candidates to create a well-performing risk score, which had similar performance to a 16-variable model from machine-learning-based ranking. Our work contributes to the recent emphasis on interpretability of prediction models for high-stakes decision making, providing a disciplined solution to detailed assessment of variable importance and transparent development of parsimonious clinical risk scores.

PMID:36812536 | DOI:10.1371/journal.pdig.0000062

Effectiveness of a tailored web app on sun protection intentions and its implications for skin cancer prevention: A randomized controlled trial

PLOS Digit Health. 2022 May 12;1(5):e0000032. doi: 10.1371/journal.pdig.0000032. eCollection 2022 May.

ABSTRACT

Skin cancers related to sun exposure are rising globally, yet are largely preventable. Digital solutions enable individually tailored prevention and may play a crucial role in reducing disease burden. We developed SUNsitive, a theory-guided web app to facilitate sun protection and skin cancer prevention. The app collected relevant information through a questionnaire and provided tailored feedback on personal risk, adequate sun protection, skin cancer prevention, and overall skin health. SUNsitive’s effect on sun protection intentions and a set of secondary outcomes was evaluated with a two-arm randomized controlled trial (n = 244). At 2 weeks post-intervention, we did not find any statistical evidence for the intervention’s effect on the primary outcome or any of the secondary outcomes. However, both groups reported improved sun protection intentions compared to their baseline values. Furthermore, our process outcomes suggest that approaching sun protection and skin cancer prevention with a digital tailored “questionnaire-feedback” format is feasible, well perceived, and well accepted. Trial registration: Protocol registration: ISRCTN registry (ISRCTN10581468).

PMID:36812525 | DOI:10.1371/journal.pdig.0000032

Spatial aggregation choice in the era of digital and administrative surveillance data

PLOS Digit Health. 2022 Jun 3;1(6):e0000039. doi: 10.1371/journal.pdig.0000039. eCollection 2022 Jun.

ABSTRACT

Traditional disease surveillance is increasingly being complemented by data from non-traditional sources like medical claims, electronic health records, and participatory syndromic data platforms. As non-traditional data are often collected at the individual-level and are convenience samples from a population, choices must be made on the aggregation of these data for epidemiological inference. Our study seeks to understand the influence of spatial aggregation choice on our understanding of disease spread with a case study of influenza-like illness in the United States. Using U.S. medical claims data from 2002 to 2009, we examined the epidemic source location, onset and peak season timing, and epidemic duration of influenza seasons for data aggregated to the county and state scales. We also compared spatial autocorrelation and tested the relative magnitude of spatial aggregation differences between onset and peak measures of disease burden. We found discrepancies in the inferred epidemic source locations and estimated influenza season onsets and peaks when comparing county and state-level data. Spatial autocorrelation was detected across more expansive geographic ranges during the peak season as compared to the early flu season, and there were greater spatial aggregation differences in early season measures as well. Epidemiological inferences are more sensitive to spatial scale early on during U.S. influenza seasons, when there is greater heterogeneity in timing, intensity, and geographic spread of the epidemics. Users of non-traditional disease surveillance should carefully consider how to extract accurate disease signals from finer-scaled data for early use in disease outbreaks.
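Spatial autocorrelation of the kind compared above is often summarized with Moran's I. The abstract does not name the statistic the authors used, so Moran's I, the binary adjacency weights, and the four-region toy map below are assumptions made purely for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for region values and a spatial weight matrix.

    weights[i][j] > 0 marks regions i and j as neighbors; the diagonal
    is expected to be zero.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    variance_term = sum(d * d for d in dev)
    return (n / w_sum) * cross / variance_term


# Four hypothetical regions in a row; each neighbors the adjacent ones.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Onset week rising smoothly from west to east: neighbors are similar.
gradient = morans_i([1, 2, 3, 4], adjacency)
# Onset week alternating between neighbors: neighbors are dissimilar.
alternating = morans_i([1, 4, 1, 4], adjacency)
print(round(gradient, 3), round(alternating, 3))
```

Positive values indicate that neighboring regions resemble each other, and negative values indicate the opposite. Aggregating counties into states changes both the values and the neighbor structure, which is one way the choice of scale can shift the inferred pattern.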

PMID:36812505 | DOI:10.1371/journal.pdig.0000039

Prognostic significance of epidermal growth factor receptor and programmed cell death-ligand 1 co-expression in esophageal squamous cell carcinoma

Aging (Albany NY). 2023 Feb 20;15. doi: 10.18632/aging.204535. Online ahead of print.

ABSTRACT

Our study aimed to observe the correlation between epidermal growth factor receptor (EGFR) and programmed cell death-ligand 1 (PD-L1) expression and to evaluate the prognostic potential of their co-expression in esophageal squamous cell carcinoma (ESCC) patients. EGFR and PD-L1 expression were evaluated by immunohistochemical analysis. We found a positive correlation between EGFR and PD-L1 expression in ESCC (P = 0.004). According to the positive relationship between EGFR and PD-L1, all patients were divided into four groups: EGFR (+)/PD-L1 (+), EGFR (+)/PD-L1 (-), EGFR (-)/PD-L1 (+), and EGFR (-)/PD-L1 (-). In 57 ESCC patients without surgery, we found that EGFR and PD-L1 co-expression was statistically correlated with a lower objective response rate (ORR) (p = 0.029) and with poorer overall survival (OS) (p = 0.018) and progression-free survival (PFS) (p = 0.045) than in patients positive for one or neither protein. Furthermore, PD-L1 expression had a significant positive correlation with the infiltration level of 19 immune cells, and EGFR expression was significantly correlated with the infiltration level of 12 immune cells. The infiltration levels of CD8 T cells and B cells were negatively correlated with EGFR expression. In contrast to EGFR, the infiltration levels of CD8 T cells and B cells were positively correlated with PD-L1 expression. In conclusion, EGFR and PD-L1 co-expression could predict poor ORR and survival in ESCC without surgery, indicating a subset of patients who may benefit from a combination of targeted therapy against EGFR and PD-L1, which may expand the population benefiting from immunotherapy and reduce the occurrence of hyperprogressive disease.

PMID:36812484 | DOI:10.18632/aging.204535