Categories
Nevin Manimala Statistics

Preliminary Evaluation of Fine-Tuning the OpenDeID Deidentification Pipeline Across Multi-Center Corpora

Stud Health Technol Inform. 2024 Aug 22;316:719-723. doi: 10.3233/SHTI240515.

ABSTRACT

Automatic deidentification of Electronic Health Records (EHR) is a crucial step in their secondary use for biomedical research. This study evaluates a hybrid deidentification strategy intended to protect patient privacy in secondary use of EHR. Specifically, it assesses automatic deidentification with the OpenDeID pipeline across diverse corpora. Three distinct corpora were used: the OpenDeID v2 corpus containing pathology reports from Australian hospitals, the 2014 i2b2/UTHealth deidentification corpus with clinical narratives from the USA, and the 2016 CEGS N-GRID deidentification corpus comprising psychiatric notes. The OpenDeID pipeline employs a hybrid approach based on deep learning and contextual rules. Pre-processing involved harmonizing the data and addressing encoding and format issues. Precision, recall, and F-measure were used to assess performance. Of the models evaluated, the Discharge Summary BioBERT model performed best. Trained on the three corpora with a total of 4,038 reports, it achieved micro-averaged F1-scores of 0.9248 and 0.9692 in the strict and relaxed settings, respectively. These results offer insight into the model’s efficacy and its potential role in safeguarding patient privacy in secondary use of EHR.
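
As an illustration of the reported metrics, the following is a minimal sketch of micro-averaged precision, recall, and F1 over annotated spans. It is not the OpenDeID evaluation code. It assumes that "strict" means exact agreement on boundaries and label and that "relaxed" means any character overlap with the same label; the abstract does not define either setting, and the names `Span` and `micro_prf` are hypothetical.

```python
from typing import List, Set, Tuple

# Hypothetical span representation: (start, end, label), end exclusive.
Span = Tuple[int, int, str]


def _overlaps(a: Span, b: Span) -> bool:
    """True if two spans share a label and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def micro_prf(gold_docs: List[Set[Span]], pred_docs: List[Set[Span]],
              relaxed: bool = False) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 pooled over all documents.

    Assumption: strict = exact span and label match,
    relaxed = same label with any character overlap.
    """
    hit_pred = hit_gold = n_pred = n_gold = 0
    for gold, pred in zip(gold_docs, pred_docs):
        n_gold += len(gold)
        n_pred += len(pred)
        if relaxed:
            # Predicted spans that touch some gold span (for precision).
            hit_pred += sum(any(_overlaps(p, g) for g in gold) for p in pred)
            # Gold spans that are touched by some prediction (for recall).
            hit_gold += sum(any(_overlaps(g, p) for p in pred) for g in gold)
        else:
            exact = len(gold & pred)
            hit_pred += exact
            hit_gold += exact
    precision = hit_pred / n_pred if n_pred else 0.0
    recall = hit_gold / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = [{(0, 5, "NAME"), (10, 14, "DATE")}]
    pred = [{(0, 5, "NAME"), (11, 14, "DATE")}]  # DATE boundary is off by one
    print(micro_prf(gold, pred, relaxed=False))
    print(micro_prf(gold, pred, relaxed=True))
```

Pooling the counts across documents before dividing is what makes the scores micro-averaged. A boundary error such as the one in the usage example lowers only the strict score, which is one plausible reason for a gap like the one reported between the two settings.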

PMID:39176896 | DOI:10.3233/SHTI240515

Synthetic Generation of Patient Service Utilization Data: A Scalability Study

Stud Health Technol Inform. 2024 Aug 22;316:705-709. doi: 10.3233/SHTI240511.

ABSTRACT

To address privacy and ethical issues in using health data for machine learning, we evaluate the scalability of advanced synthetic data generation methods such as GANs, VAEs, copulaGAN, and transformer models, specifically for patient service utilization data. Our study examines five models on data from a Canadian health authority, focusing on training and generation efficiency, data resemblance, and practical utility. Our findings indicate that statistical models excel in efficiency, while most models produce synthetic data that closely mirrors real data and is also useful for real-world applications.

PMID:39176892 | DOI:10.3233/SHTI240511

Term Candidate Generation to Enrich Clinical Terminologies with Large Language Models

Stud Health Technol Inform. 2024 Aug 22;316:695-699. doi: 10.3233/SHTI240509.

ABSTRACT

Annotated language resources derived from clinical routine documentation form an intriguing asset for secondary use case scenarios. In this investigation, we report on how such a resource can be leveraged to identify additional term candidates for a chosen set of ICD-10 codes. We conducted a log-likelihood analysis, considering the co-occurrence of approximately 1.9 million de-identified ICD-10 codes alongside corresponding brief textual entries from problem lists in German. This analysis aimed to identify potential candidates with statistical significance set at p < 0.01, which were then used in a second step as seed terms to harvest additional candidates by querying a large language model. The proposed approach can identify additional term candidates at suitable performance values: hypernyms MAP@5 = 0.801, synonyms MAP@5 = 0.723 and hyponyms MAP@5 = 0.507. The re-use of existing annotated clinical datasets, in combination with large language models, presents an interesting strategy to bridge the lexical gap between standardized clinical terminologies and real-world jargon.
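
The two quantities named in this abstract can be sketched as follows. This is an illustrative sketch only. It assumes that the log-likelihood analysis is a G² (log-likelihood ratio) test on a 2x2 co-occurrence table of term versus ICD-10 code, and that MAP@5 is the mean of per-query average precision truncated at rank 5; the abstract specifies neither, and the function names are hypothetical.

```python
import math
from typing import Sequence, Set

# Chi-square critical value for 1 degree of freedom at p = 0.01.
CRITICAL_P01 = 6.635


def g2(k11: int, k12: int, k21: int, k22: int) -> float:
    """Log-likelihood ratio statistic for a 2x2 contingency table.

    k11: entries with the term and the code, k12: term without the code,
    k21: code without the term, k22: neither (assumed table layout).
    """
    observed = ((k11, k12), (k21, k22))
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    total = rows[0] + rows[1]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing
                expected = rows[i] * cols[j] / total
                stat += o * math.log(o / expected)
    return 2.0 * stat


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str],
                           k: int = 5) -> float:
    """Average precision over the top-k candidates (one common variant)."""
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    return score / min(k, len(relevant))


def map_at_k(runs, k: int = 5) -> float:
    """Mean of average_precision_at_k over (ranked, relevant) pairs."""
    return sum(average_precision_at_k(r, rel, k) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    stat = g2(50, 5, 5, 50)
    print(stat, stat > CRITICAL_P01)
    print(average_precision_at_k(["a", "x", "b"], {"a", "b"}))
```

Under the stated assumption, a term would be kept as a seed when its statistic exceeds `CRITICAL_P01`. Other normalizations of average precision exist, so the MAP@5 values in the abstract may have been computed differently.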

PMID:39176890 | DOI:10.3233/SHTI240509

Development of a Framework for Establishing ‘Gold Standard’ Outbreak Data from Submitted SARS-CoV-2 Genome Samples

Stud Health Technol Inform. 2024 Aug 22;316:1962-1966. doi: 10.3233/SHTI240818.

ABSTRACT

Submitted genomic data for respiratory viruses reflect the emergence and spread of new variants. Although delays in submission limit the utility of these data for prospective surveillance, they may be useful for evaluating other surveillance sources. However, few studies have investigated the use of these data for evaluating aberration detection in surveillance systems. Our study used a Bayesian online change point detection algorithm (BOCP) to detect increases in the number of submitted genome samples as a means of establishing ‘gold standard’ dates of outbreak onset in multiple countries. We compared models using different data transformations and parameter values. The change points detected by BOCP were not sensitive to different parameter settings. We also found that data transformations were essential prior to change point detection. Our study presents a framework for using global genomic submission data to develop ‘gold standard’ dates for the onset of outbreaks due to new viral variants.
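
To make the idea of Bayesian online change point detection concrete, here is a minimal sketch of the run-length recursion with a constant hazard rate. It is not the study's model. It assumes a Gaussian observation model with known variance and a Normal prior on the segment mean, applied to an already transformed series; the abstract does not state the observation model, transformations, or parameter values, and all names and defaults are hypothetical.

```python
import math
from typing import List, Sequence


def bocpd_map_run_lengths(xs: Sequence[float], hazard: float = 0.02,
                          mu0: float = 0.0, kappa0: float = 1.0,
                          sigma2: float = 1.0) -> List[int]:
    """Return the most probable run length after each observation.

    Assumed model: x ~ Normal(mean, sigma2) within a segment, with prior
    mean ~ Normal(mu0, sigma2 / kappa0). A sharp drop in the returned
    run length suggests a change point.
    """
    probs = [1.0]        # posterior over run lengths 0..t
    mus = [mu0]          # posterior mean of the segment mean, per run length
    kappas = [kappa0]    # pseudo-count of observations, per run length
    map_run_lengths = []
    for x in xs:
        # Predictive density of x under each current run-length hypothesis.
        pred = []
        for mu, kappa in zip(mus, kappas):
            var = sigma2 * (1.0 + 1.0 / kappa)
            dens = math.exp(-0.5 * (x - mu) ** 2 / var)
            pred.append(dens / math.sqrt(2.0 * math.pi * var))
        # Either the run grows by one, or a change point resets it to zero.
        growth = [p * q * (1.0 - hazard) for p, q in zip(probs, pred)]
        change = sum(p * q * hazard for p, q in zip(probs, pred))
        joint = [change] + growth
        total = sum(joint)
        probs = [v / total for v in joint]
        # Conjugate update; run length 0 restarts from the prior.
        mus = [mu0] + [(k * m + x) / (k + 1.0) for m, k in zip(mus, kappas)]
        kappas = [kappa0] + [k + 1.0 for k in kappas]
        map_run_lengths.append(max(range(len(probs)), key=probs.__getitem__))
    return map_run_lengths


if __name__ == "__main__":
    series = [0.0] * 30 + [8.0] * 30  # synthetic level shift, not real data
    run_lengths = bocpd_map_run_lengths(series)
    print(run_lengths[28:34])
```

In this sketch the `hazard` argument plays the role of a tunable parameter: it encodes the prior probability of a change at each step, which is the kind of setting whose influence a sensitivity comparison would examine.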

PMID:39176877 | DOI:10.3233/SHTI240818

Challenges in Daily Computerized Assessment of Cognitive Functions of Post-COVID Patients

Stud Health Technol Inform. 2024 Aug 22;316:1950-1954. doi: 10.3233/SHTI240815.

ABSTRACT

While it would be quite helpful to learn more about the daily fluctuations of fatigue and cognitive impairments of post-COVID patients, their condition can make investigating these especially challenging. By discussing these issues with post-COVID patients and clinical practitioners, we identified six challenges that specifically apply to daily computerized assessment of cognitive functions of post-COVID patients. We proposed solutions for each of the challenges which can be summarized as offering a carefully planned and flexible study design to participants and monitoring their well-being throughout the assessments. We argue that when the proposed precautions are taken, it is feasible to conduct a study that will generate valuable insights into the trajectories of (cognitive) post-COVID symptoms.

PMID:39176874 | DOI:10.3233/SHTI240815

Impact of Terrorism on the Use of Healthcare Services in Burkina Faso Between 2015 and 2022

Stud Health Technol Inform. 2024 Aug 22;316:1938-1942. doi: 10.3233/SHTI240812.

ABSTRACT

Burkina Faso has been facing a security crisis due to terrorism since 2015. This study aims to assess the impact of the attacks on the use of healthcare services. This is a secondary study on data from the country’s health data warehouse and the ACLED security data warehouse. After a descriptive analysis, generalized additive models were used to assess the impact of attacks on the use of health services. Between January 2015 and December 2022, 2449 kidnap/disappearance attacks, armed attacks, bombings and landmine explosions were perpetrated, causing 4965 deaths. The Sahel region was the most targeted (36.37% of attacks and 50.57% of deaths). Only population density had a significant impact on the use of health services (p < 0.05). The models were valid. Our study has shown that, despite the persistent insecurity in Burkina Faso, people are resilient and, above all, continue to seek out the most important healthcare services. It is therefore important to work to maintain the supply of these services.

PMID:39176871 | DOI:10.3233/SHTI240812

TikTok and YouTube Shorts by Autistic Individuals for Increasing Autism Awareness

Stud Health Technol Inform. 2024 Aug 22;316:1891-1895. doi: 10.3233/SHTI240802.

ABSTRACT

INTRODUCTION: Autistic individuals, parents, organizations, and healthcare systems worldwide are actively sharing content aimed at increasing awareness about autism. This study aims to analyze the types of content presented in TikTok and YouTube Shorts videos under the hashtag #actuallyautistic and their potential to increase autism awareness.

METHODS: A sample of 60 videos was downloaded and analyzed (n=30 from TikTok and n=30 from YouTube Shorts). Video content was analyzed using both thematic analysis and the AFINN sentiment analysis tool. The understandability and actionability of the videos were assessed with the Patient Education Materials Assessment Tool for Audiovisual Materials (PEMAT A/V).

RESULTS: The content of these videos covered five main themes: Stigmatization; Sensory difficulties; Masking; Stimming; and Communication difficulties. No statistically significant differences were found in the sentiment expressed in videos from the two platforms. TikTok videos received significantly more views, comments, and likes than videos on YouTube Shorts. The PEMAT A/V tool showed a high level of understandability, but little reference to actionability.

DISCUSSION: Videos by autistic people spread valid and reliable information in the hope of normalizing difficulties and providing hope and comfort to others in similar situations.

CONCLUSIONS: Social media videos posted by autistic individuals provide accurate portrayals of autism but lack information on actionability. These shared personal stories can help increase public literacy about autism, dispel autism stigmas and emphasize individuality.

PMID:39176861 | DOI:10.3233/SHTI240802

Nationwide Electronic Prescription Services in Finland in 2010-2023

Stud Health Technol Inform. 2024 Aug 22;316:1884-1888. doi: 10.3233/SHTI240800.

ABSTRACT

This research followed, over a 14-year period (2010-2023), the entries of public and private healthcare service organizations and community pharmacies to, and their exits from, the centralized, interoperable and shared electronic Prescription Services in Finland. Our material comprised the official Social Welfare and Healthcare Organization Registry and the official Pharmacy Registry; their data were extracted in January 2024. Outcomes were continuous registration in the services or registered exit from the services. In addition, we used information from the Kanta Services to present monthly and annual numbers of electronic prescriptions and medicine dispensations at the national level. In 2010-2023, a total of 838 entries to and 24 exits from the nationwide Prescription Services took place among community pharmacies and their subsidiary pharmacies, and 814 pharmacy outlets had the Prescription Services in production in 2023. Among public and private healthcare service organizations, a total of 1980 entries to and 494 exits from the Prescription Services took place, and 1486 organizations had the Prescription Services in production in 2023. Healthcare service organizations recorded a total of 303.8 million electronic prescriptions in the Prescription Services. Recorded numbers were lower during the Covid-19 epidemic in Finland in 2020-2021, and we observed seasonal effects in the time series. Pharmacies recorded a total of 660.4 million medicine dispensations (purchases) in the Prescription Services, with an increasing trend year after year. We also observed seasonal effects in the dispensation time series.

PMID:39176859 | DOI:10.3233/SHTI240800

Decision Support in Cardiac Surgery: Early Exploration of Requirements with Cardiac Anesthetists and Surgeons

Stud Health Technol Inform. 2024 Aug 22;316:1827-1831. doi: 10.3233/SHTI240786.

ABSTRACT

Successful implementation of clinical decision support tools is rare, the key barrier being the lack of user involvement during development. Following the idea, development, exploration, assessment, long-term follow-up (IDEAL) framework, this study aims to provide early insights into the current challenges, clinical processes, and priorities when developing new decision support tools in cardiac surgery. Using a qualitative approach, semi-structured interviews were conducted with cardiac anesthetists and surgeons from three Scottish cardiac centers. Thematic analysis identified adverse postoperative outcomes, ageing cardiac patient population and changing surgical procedures to be the main challenges in cardiac surgery. Existing risk prediction tools were largely not used due to a perceived lack of utility and validation. This study underscores the need to shift focus towards predicting postoperative complications, instead of mortality. It emphasizes the importance of early collaboration with clinical experts and stakeholders in developing decision support systems that are fit for purpose. By identifying the priorities of cardiac clinicians, the study lays the groundwork for developing clinically meaningful prediction models.

PMID:39176846 | DOI:10.3233/SHTI240786

Comparison of Imputation Methods for Categorical Real-World Prostate Cancer Data with Natural Order

Stud Health Technol Inform. 2024 Aug 22;316:1800-1804. doi: 10.3233/SHTI240780.

ABSTRACT

Missing values (NA) often occur in cancer research, for reasons such as data protection, data loss, or missing follow-up data. Such incomplete patient information can affect prediction models and other data analyses. Imputation methods are a tool for dealing with NA. Cancer data is often presented in an ordered categorical form, such as tumour grading and staging, which requires special methods. This work compares mode imputation, k nearest neighbour (knn) imputation, and, in the context of Multiple Imputation by Chained Equations (MICE), a logistic regression model with proportional odds (mice_polr) and random forest (mice_rf) on a real-world prostate cancer dataset provided by the Cancer Registry of Rhineland-Palatinate in Germany. Our dataset contains relevant information for the risk classification of patients and the time between date of diagnosis and date of death. For the imputation comparison, we use Rubin’s (1976) Missing Completely At Random (MCAR) mechanism to remove 10%, 20%, 30%, and 50% of observations. The results are evaluated and ranked based on the accuracy per patient. For each percentage of NA, mice_rf performs significantly best, followed by knn, and mice_polr performs significantly worst. Furthermore, our findings indicate that the accuracy of imputation methods increases with a lower number of categories, a relatively even proportion of patients in the categories, or a majority of patients in a particular category.
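
The evaluation loop described here (remove values completely at random, impute, compare with the truth) can be sketched for the simplest of the compared methods, mode imputation. This is an illustration under assumptions, not the study's code: it handles a single variable, scores accuracy over the masked cells rather than per patient, and uses synthetic grades; the function names are hypothetical.

```python
import random
from collections import Counter
from typing import List, Optional, Sequence, Set, Tuple


def mask_mcar(values: Sequence[str], fraction: float,
              seed: int = 0) -> Tuple[List[Optional[str]], Set[int]]:
    """Set a fixed fraction of entries to None, chosen uniformly at random.

    Every entry has the same chance of removal regardless of its value,
    which is what the MCAR mechanism requires.
    """
    rng = random.Random(seed)
    n_missing = round(len(values) * fraction)
    missing = set(rng.sample(range(len(values)), n_missing))
    masked = [None if i in missing else v for i, v in enumerate(values)]
    return masked, missing


def impute_mode(masked: Sequence[Optional[str]]) -> List[str]:
    """Replace every missing entry with the most frequent observed category."""
    observed = [v for v in masked if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in masked]


def imputation_accuracy(truth: Sequence[str], imputed: Sequence[str],
                        missing: Set[int]) -> float:
    """Share of masked entries whose imputed value equals the true value."""
    return sum(truth[i] == imputed[i] for i in missing) / len(missing)


if __name__ == "__main__":
    # Synthetic ordered categories, not registry data.
    grades = ["G1"] * 2 + ["G2"] * 16 + ["G3"] * 2
    for fraction in (0.1, 0.2, 0.3, 0.5):
        masked, missing = mask_mcar(grades, fraction, seed=1)
        imputed = impute_mode(masked)
        print(fraction, imputation_accuracy(grades, imputed, missing))
```

Because the synthetic example is dominated by one category, mode imputation scores well on it, which is in line with the abstract's remark that accuracy rises when most patients fall into a particular category.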

PMID:39176840 | DOI:10.3233/SHTI240780