Categories
Nevin Manimala Statistics

Educational Utility of Clinical Vignettes Generated in Japanese by ChatGPT-4: Mixed Methods Study

JMIR Med Educ. 2024 Aug 13;10:e59133. doi: 10.2196/59133.

ABSTRACT

BACKGROUND: Evaluating the accuracy and educational utility of artificial intelligence-generated medical cases, especially those produced by large language models such as ChatGPT-4 (developed by OpenAI), is crucial yet underexplored.

OBJECTIVE: This study aimed to assess the educational utility of ChatGPT-4-generated clinical vignettes and their applicability in educational settings.

METHODS: Using a convergent mixed methods design, a web-based survey was conducted from January 8 to 28, 2024, to evaluate 18 medical cases generated by ChatGPT-4 in Japanese. The survey used 6 main question items to evaluate the quality of the generated clinical vignettes and their educational utility: information quality, information accuracy, educational usefulness, clinical match, terminology accuracy (TA), and diagnosis difficulty. Feedback was solicited from physicians specializing in general internal medicine or general medicine and experienced in medical education. Chi-square and Mann-Whitney U tests were performed to identify differences among cases, and linear regression was used to examine trends associated with physicians’ experience. Thematic analysis of qualitative feedback was performed to identify areas for improvement and confirm the educational utility of the cases.

RESULTS: Of the 73 invited participants, 71 (97%) responded. The respondents, primarily male (64/71, 90%), spanned a broad range of practice years (from 1976 to 2017) and represented diverse hospital sizes throughout Japan. The majority deemed the information quality (mean 0.77, 95% CI 0.75-0.79) and information accuracy (mean 0.68, 95% CI 0.65-0.71) to be satisfactory, with these responses being based on binary data. The average scores assigned were 3.55 (95% CI 3.49-3.60) for educational usefulness, 3.70 (95% CI 3.65-3.75) for clinical match, 3.49 (95% CI 3.44-3.55) for TA, and 2.34 (95% CI 2.28-2.40) for diagnosis difficulty, based on a 5-point Likert scale. Statistical analysis showed significant variability in content quality and relevance across the cases (P<.001 after Bonferroni correction). Participants suggested improvements in generating physical findings, using natural language, and enhancing medical TA. The thematic analysis highlighted the need for clearer documentation, clinical information consistency, content relevance, and patient-centered case presentations.
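
The reported intervals for the two binary items can be roughly reproduced with a normal-approximation (Wald) interval. The sketch below is only an illustration, not the authors' stated method. It assumes that each of the 71 respondents rated all 18 cases (1278 ratings per item), which the abstract does not confirm; the function name `wald_ci` is hypothetical.

```python
import math

def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) 95% CI for a proportion."""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Assumption: 71 respondents x 18 cases = 1278 binary ratings per item.
n_ratings = 71 * 18

quality_ci = wald_ci(0.77, n_ratings)   # information quality
accuracy_ci = wald_ci(0.68, n_ratings)  # information accuracy
print([round(x, 2) for x in quality_ci])
print([round(x, 2) for x in accuracy_ci])
```

Under that assumption the intervals round to 0.75-0.79 and 0.65-0.71, in line with the reported values.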

CONCLUSIONS: ChatGPT-4-generated medical cases written in Japanese possess considerable potential as resources in medical education, with recognized adequacy in quality and accuracy. Nevertheless, there is a notable need for enhancements in the precision and realism of case details. This study emphasizes ChatGPT-4’s value as an adjunctive educational tool in the medical field, requiring expert oversight for optimal application.

PMID:39137031 | DOI:10.2196/59133

Understanding Health Care Students’ Perceptions, Beliefs, and Attitudes Toward AI-Powered Language Models: Cross-Sectional Study

JMIR Med Educ. 2024 Aug 13;10:e51757. doi: 10.2196/51757.

ABSTRACT

BACKGROUND: ChatGPT was not intended for use in health care, but it has potential benefits that depend on end-user understanding and acceptability, which is where health care students become crucial. There is still a limited amount of research in this area.

OBJECTIVE: The primary aim of our study was to assess the frequency of ChatGPT use, the perceived level of knowledge, the perceived risks associated with its use, and the ethical issues, as well as attitudes toward the use of ChatGPT in the context of education in the field of health. In addition, we aimed to examine whether there were differences across groups based on demographic variables. The second part of the study aimed to assess the association between the frequency of use, the level of perceived knowledge, the level of risk perception, and the level of perception of ethics as predictive factors for participants’ attitudes toward the use of ChatGPT.

METHODS: A cross-sectional survey was conducted from May to June 2023 encompassing students of medicine, nursing, dentistry, nutrition, and laboratory science across the Americas. The study used descriptive analysis, chi-square tests, and ANOVA to assess statistical significance across different categories. The study used several ordinal logistic regression models to analyze the impact of predictive factors (frequency of use, perception of knowledge, perception of risk, and ethics perception scores) on attitude as the dependent variable. The models were adjusted for gender, institution type, major, and country. Stata was used to conduct all the analyses.

RESULTS: Of 2661 health care students, 42.99% (n=1144) were unaware of ChatGPT. The median score of knowledge was “minimal” (median 2.00, IQR 1.00-3.00). Most respondents (median 2.61, IQR 2.11-3.11) regarded ChatGPT as neither ethical nor unethical. Most participants (median 3.89, IQR 3.44-4.34) “somewhat agreed” that ChatGPT (1) benefits health care settings, (2) provides trustworthy data, (3) is a helpful tool for clinical and educational medical information access, and (4) makes the work easier. In total, 70% (7/10) of people used it for homework. As the perceived knowledge of ChatGPT increased, there was a stronger tendency with regard to having a favorable attitude toward ChatGPT. Higher ethical consideration perception ratings increased the likelihood of considering ChatGPT as a source of trustworthy health care information (odds ratio [OR] 1.620, 95% CI 1.498-1.752), beneficial in medical issues (OR 1.495, 95% CI 1.452-1.539), and useful for medical literature (OR 1.494, 95% CI 1.426-1.564; P<.001 for all results).
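
As a reading aid for the odds ratios above: when a 95% CI is built symmetrically on the log scale (as Wald-type intervals from logistic models usually are), the point estimate is the geometric mean of the bounds and the standard error of log(OR) can be recovered from the interval width. The sketch below applies this to the reported values; it assumes Wald-type intervals, which the abstract does not state, and the function names are hypothetical.

```python
import math

def log_scale_se(ci_lower, ci_upper, z=1.96):
    """Standard error of log(OR) implied by a log-symmetric 95% CI."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z)

def implied_point_estimate(ci_lower, ci_upper):
    """Geometric mean of the bounds; equals the OR for a Wald-type CI."""
    return math.sqrt(ci_lower * ci_upper)

# Values reported in the abstract: (OR, lower bound, upper bound)
reported = [(1.620, 1.498, 1.752), (1.495, 1.452, 1.539), (1.494, 1.426, 1.564)]
for odds_ratio, lower, upper in reported:
    print(odds_ratio,
          round(implied_point_estimate(lower, upper), 3),
          round(log_scale_se(lower, upper), 4))
```

The implied point estimates agree with the reported odds ratios to within rounding, which is consistent with intervals constructed on the log scale.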

CONCLUSIONS: Over 40% of American health care students (1144/2661, 42.99%) were unaware of ChatGPT despite its extensive use in the health field. Our data revealed the positive attitudes toward ChatGPT and the desire to learn more about it. Medical educators must explore how chatbots may be included in undergraduate health care education programs.

PMID:39137029 | DOI:10.2196/51757

The Association Between Physical Distancing Behaviors to Avoid COVID-19 and Health-Related Quality of Life in Immunocompromised and Nonimmunocompromised Individuals: Patient-Informed Protocol for the Observational, Cross-Sectional EAGLE Study

JMIR Res Protoc. 2024 Aug 13;13:e52643. doi: 10.2196/52643.

ABSTRACT

BACKGROUND: Immunocompromised individuals are known to respond inadequately to SARS-CoV-2 vaccines, placing them at high risk of severe or fatal COVID-19. Thus, immunocompromised individuals and their caregivers may still practice varying degrees of social or physical distancing to avoid COVID-19. However, the association between physical distancing to avoid COVID-19 and quality of life has not been comprehensively evaluated in any study.

OBJECTIVE: We aim to measure physical distancing behaviors among immunocompromised individuals and the association between those behaviors and person-centric outcomes, including health-related quality of life (HRQoL) measures, health state utilities, anxiety and depression, and work and school productivity impairment.

METHODS: A patient-informed protocol was developed to conduct the EAGLE Study, a large cross-sectional, observational study, and this paper describes that protocol. EAGLE is designed to measure distancing behaviors and outcomes in immunocompromised individuals, including children (aged ≥6 mo) and their caregivers, and nonimmunocompromised adults in the United States and United Kingdom who report no receipt of passive immunization against COVID-19. We previously developed a novel self- and observer-reported instrument, the Physical Distancing Scale for COVID-19 Avoidance (PDS-C19), to measure physical distancing behavior levels cross-sectionally and retrospectively. Using an interim or a randomly selected subset of the study population, the PDS-C19 psychometric properties will be assessed, including structural validity, internal consistency, known-group validity, and convergent validity. Associations (correlations) will be assessed between the PDS-C19 and validated HRQoL-related measures and utilities. Structural equation modeling and regression will be used to assess these associations, adjusting for potential confounders. Participant recruitment and data collection took place from December 2022 to June 2023 using direct-to-patient channels, including panels, clinician referral, patient advocacy groups, and social media, with immunocompromising diagnosis confirmation collected and assessed for a randomly selected 25% of immunocompromised participants. The planned total sample size is 3718 participants and participant-caregiver pairs. Results will be reported by immunocompromised status, immunocompromising condition category, country, age group, and other subgroups.
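
Internal consistency of a multi-item scale is commonly summarized with Cronbach's alpha. The abstract does not name the statistic planned for the PDS-C19, so the following is a minimal sketch under that assumption, using hypothetical item responses rather than EAGLE data.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: list of items, each a list of scores (one per respondent)."""
    k = len(item_scores)
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1.0 - item_var_sum / variance(totals))

# Hypothetical responses: 3 items x 5 respondents (not EAGLE data).
hypothetical_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(hypothetical_items)
print(round(alpha, 3))
```

Values close to 1 indicate that the items move together; perfectly correlated items give an alpha of exactly 1.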

RESULTS: All data analyses and reporting were planned to be completed by December 2023. Results are planned to be submitted for publication in peer-reviewed journals in 2024-2025.

CONCLUSIONS: This study will quantify immunocompromised individuals’ physical distancing behaviors to avoid COVID-19 and their association with HRQoL as well as health state utilities.

INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR1-10.2196/52643.

PMID:39137022 | DOI:10.2196/52643

Time structures of proton pencil beam scanning delivery on a microsecond scale measured with a pixelated semiconductor detector Timepix3

J Appl Clin Med Phys. 2024 Aug 13:e14486. doi: 10.1002/acm2.14486. Online ahead of print.

ABSTRACT

PURPOSE: The time structures of proton spot delivery in proton pencil beam scanning (PBS) radiation therapy are essential in many clinical applications. This study aims to characterize the time structures of proton PBS delivered by both synchrotron and synchrocyclotron accelerators using a non-invasive technique based on scattered particle tracking.

METHODS: A pixelated semiconductor detector, AdvaPIX-Timepix3, with a temporal resolution of 1.56 ns, was employed to measure time of arrival of secondary particles generated by a proton beam. The detector was placed laterally to the high-flux area of the beam in order to allow for single particle detection and not interfere with the treatment. The detector recorded counts of radiation events, their deposited energy and the timestamp associated with the single events. Individual recorded events and their temporal characteristics were used to analyze beam time structures, including energy layer switch time, magnet switch time, spot switch time, and the scanning speeds in the x and y directions. All the measurements were repeated 30 times on three dates, reducing statistical uncertainty.
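
One simple way to turn per-event timestamps into switch times is to look for pauses in the event stream that exceed a threshold. The sketch below illustrates that idea only; the function name, the threshold, and the timestamps are hypothetical, and the authors' actual analysis is not described at this level of detail.

```python
def find_beam_gaps(timestamps_ns, min_gap_ns):
    """Return (gap_start, gap_end, duration) for pauses longer than min_gap_ns.

    timestamps_ns: sorted event arrival times in nanoseconds.
    """
    gaps = []
    for previous, current in zip(timestamps_ns, timestamps_ns[1:]):
        duration = current - previous
        if duration > min_gap_ns:
            gaps.append((previous, current, duration))
    return gaps

# Hypothetical timestamps: two bursts of events separated by a 2 ms pause.
events = [0, 500, 1_000, 1_500, 2_001_500, 2_002_000, 2_002_500]
gaps = find_beam_gaps(events, min_gap_ns=100_000)
print(gaps)
```

In practice the threshold would have to be chosen separately for spot, magnet, and energy layer switches, since these occur on different time scales.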

RESULTS: The uncertainties of the measured energy layer switch time, magnet switch time, and spot switch time were all within 1% of the average values. The scanning speed uncertainties were within 1.5%, which is more precise than previously reported results. The measurements also revealed continuous sub-millisecond proton spills at a low dose rate for the synchrotron accelerator and radiofrequency pulses at 7 µs and 1 ms repetition time for the synchrocyclotron accelerator.

CONCLUSION: The AdvaPIX-Timepix3 detector can be used to directly measure and monitor time structures of PBS proton beam delivery on a microsecond scale. This method yielded results with high precision and is completely independent of the machine log files.

PMID:39137008 | DOI:10.1002/acm2.14486

Race adjustments in clinical algorithms can help correct for racial disparities in data quality

Proc Natl Acad Sci U S A. 2024 Aug 20;121(34):e2402267121. doi: 10.1073/pnas.2402267121. Epub 2024 Aug 13.

ABSTRACT

Despite ethical and historical arguments for removing race from clinical algorithms, the consequences of removal remain unclear. Here, we highlight a largely undiscussed consideration in this debate: varying data quality of input features across race groups. For example, family history of cancer is an essential predictor in cancer risk prediction algorithms but is less reliably documented for Black participants and may therefore be less predictive of cancer outcomes. Using data from the Southern Community Cohort Study, we assessed whether race adjustments could allow risk prediction models to capture varying data quality by race, focusing on colorectal cancer risk prediction. We analyzed 77,836 adults with no history of colorectal cancer at baseline. The predictive value of self-reported family history was greater for White participants than for Black participants. We compared two cancer risk prediction algorithms: a race-blind algorithm, which included standard colorectal cancer risk factors but not race, and a race-adjusted algorithm, which additionally included race. Relative to the race-blind algorithm, the race-adjusted algorithm improved predictive performance, as measured by goodness of fit in a likelihood ratio test (P-value: <0.001) and area under the receiver operating characteristic curve among Black participants (P-value: 0.006). Because the race-blind algorithm underpredicted risk for Black participants, the race-adjusted algorithm increased the fraction of Black participants among the predicted high-risk group, potentially increasing access to screening. More broadly, this study shows that race adjustments may be beneficial when the data quality of key predictors in clinical algorithms differs by race group.
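
A likelihood ratio test compares two nested models through twice the difference in their log-likelihoods. The sketch below shows the calculation for the simplest case, where the larger model adds a single parameter; the log-likelihood values are hypothetical, and the number of parameters the study's race adjustment actually adds is not given in the abstract.

```python
import math

def likelihood_ratio_test_1df(loglik_restricted, loglik_full):
    """LRT for nested models that differ by one parameter.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * (loglik_full - loglik_restricted)
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value

# Hypothetical log-likelihoods (not taken from the study).
stat, p = likelihood_ratio_test_1df(loglik_restricted=-1050.0, loglik_full=-1042.0)
print(stat, p)
```

With more than one added parameter, the p-value would need the chi-square distribution for the corresponding degrees of freedom rather than the closed form used here.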

PMID:39136986 | DOI:10.1073/pnas.2402267121

Range expansions across landscapes with quenched noise

Proc Natl Acad Sci U S A. 2024 Aug 20;121(34):e2411487121. doi: 10.1073/pnas.2411487121. Epub 2024 Aug 13.

ABSTRACT

When biological populations expand into new territory, the evolutionary outcomes can be strongly influenced by genetic drift, the random fluctuations in allele frequencies. Meanwhile, spatial variability in the environment can also significantly influence the competition between subpopulations vying for space. Little is known about the interplay of these intrinsic and extrinsic sources of noise in population dynamics: When does environmental heterogeneity dominate over genetic drift or vice versa, and what distinguishes their population genetics signatures? Here, in the context of neutral evolution, we examine the interplay between a population’s intrinsic, demographic noise and an extrinsic, quenched random noise provided by a heterogeneous environment. Using a multispecies Eden model, we simulate a population expanding over a landscape with random variations in local growth rates and measure how this variability affects genealogical tree structure, and thus genetic diversity. We find that, for strong heterogeneity, the genetic makeup of the expansion front is to a great extent predetermined by the set of fastest paths through the environment. The landscape-dependent statistics of these optimal paths then supersede those of the population’s intrinsic noise as the main determinant of evolutionary dynamics. Remarkably, the statistics for coalescence of genealogical lineages, derived from those deterministic paths, strongly resemble the statistics emerging from demographic noise alone in uniform landscapes. This cautions interpretations of coalescence statistics and raises new challenges for inferring past population dynamics.
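
The "fastest paths" picture can be illustrated with a deliberately simplified toy model. The sketch below is not the multispecies Eden model used in the paper: it drops demographic noise entirely and grows a population deterministically over a lattice with quenched random crossing times, so that each site is colonized by whichever neighbor reaches it first. All names and parameter values are hypothetical.

```python
import heapq
import random

def expand(width, height, disorder, seed=0):
    """Grow from the bottom row; return the founder label of each top-row site.

    Each site gets a quenched random crossing time in [1, 1 + disorder].
    A site is colonized by the neighbor that reaches it first and
    inherits that neighbor's founder label.
    """
    rng = random.Random(seed)
    delay = [[1.0 + disorder * rng.random() for _ in range(width)]
             for _ in range(height)]
    label = [[None] * width for _ in range(height)]
    # Heap entries: (arrival_time, row, col, founder_label)
    heap = [(0.0, 0, col, col) for col in range(width)]
    heapq.heapify(heap)
    while heap:
        time, row, col, founder = heapq.heappop(heap)
        if label[row][col] is not None:
            continue  # already colonized by an earlier arrival
        label[row][col] = founder
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n_row, n_col = row + d_row, (col + d_col) % width  # periodic in x
            if 0 <= n_row < height and label[n_row][n_col] is None:
                heapq.heappush(
                    heap, (time + delay[n_row][n_col], n_row, n_col, founder))
    return label[height - 1]

front = expand(width=64, height=64, disorder=5.0)
surviving_lineages = len(set(front))
print(surviving_lineages)
```

With `disorder=0.0` every founder lineage reaches the front unchanged; with disorder switched on, lineages are lost purely because of the landscape, which is the deterministic counterpart of the coalescence the abstract describes.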

PMID:39136984 | DOI:10.1073/pnas.2411487121

Ocular Adverse Effects of Over-the-Counter Cosmetics and Personal Care Products Reported to the Food and Drug Administration

Ophthalmic Plast Reconstr Surg. 2024 Aug 13. doi: 10.1097/IOP.0000000000002718. Online ahead of print.

ABSTRACT

PURPOSE: Personal care and cosmetic products can cause periocular and ocular adverse effects (AEs), for example, ocular surface disease, trauma, and hypersensitivity. The publicly available Food and Drug Administration (FDA) Center for Food Safety and Applied Nutrition Adverse Event Reporting System (CAERS) database includes AE reports by consumers, healthcare practitioners, and manufacturers. The purpose of this study was to characterize ophthalmic AEs associated with cosmetics and personal care products reported to the FDA CAERS database.

METHODS: AEs related to the eye or ocular adnexa from cosmetics submitted by consumers, healthcare practitioners, and manufacturers from January 2004 to June 2022 were identified after filtering using the Medical Dictionary for Regulatory Activities coding system. Demographic information, case outcome, and categories of product and AE were included. Chi-square analysis, with statistical significance at α = 0.05, was performed to ascertain variation in ocular, periocular, and general outcomes by product category.
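
For readers unfamiliar with the chi-square analysis mentioned here, the sketch below computes the Pearson chi-square statistic for a contingency table of product category by outcome. The counts are hypothetical and are not CAERS data; the function name is an assumption for illustration.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof

# Hypothetical counts: rows = product categories, columns = outcome yes/no.
hypothetical_table = [[10, 20], [20, 10]]
chi2, dof = pearson_chi_square(hypothetical_table)
print(round(chi2, 3), dof)
```

For 1 degree of freedom the critical value at the 0.05 significance level is about 3.84, so a statistic above that would count as significant in this toy table.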

RESULTS: Reports of ophthalmic AEs related to cosmetics per year increased from 2006 to 2018, reaching a maximum of 161 reports in 2018, then decreased from 2018 to 2021. In total, 959 and 1382 unique periocular and ocular AEs were reported. There were 1711 total incidences of reported periocular AEs and 2485 ocular AEs. The most reported periocular AEs were inflammation (770/1711) and hypersensitivity (331/1711). The most reported ocular effects were discomfort (946/2485) and inflammation (709/2485). Ocular, periocular, and general outcomes significantly varied by product category.

CONCLUSIONS: Consumers, healthcare practitioners, and manufacturers should be made aware of potential ophthalmic AEs and outcomes associated with cosmetics and personal care products.

PMID:39136955 | DOI:10.1097/IOP.0000000000002718

Use of Generative AI to Identify Helmet Status Among Patients With Micromobility-Related Injuries From Unstructured Clinical Notes

JAMA Netw Open. 2024 Aug 1;7(8):e2425981. doi: 10.1001/jamanetworkopen.2024.25981.

ABSTRACT

IMPORTANCE: Large language models (LLMs) have potential to increase the efficiency of information extraction from unstructured clinical notes in electronic medical records.

OBJECTIVE: To assess the utility and reliability of an LLM, ChatGPT-4 (OpenAI), to analyze clinical narratives and identify helmet use status of patients injured in micromobility-related accidents.

DESIGN, SETTING, AND PARTICIPANTS: This cross-sectional study used publicly available, deidentified 2019 to 2022 data from the US Consumer Product Safety Commission’s National Electronic Injury Surveillance System, a nationally representative stratified probability sample of 96 hospitals in the US. Unweighted estimates of e-bike, bicycle, hoverboard, and powered scooter-related injuries that resulted in an emergency department visit were used. Statistical analysis was performed from November 2023 to April 2024.

MAIN OUTCOMES AND MEASURES: Patient helmet status (wearing vs not wearing vs unknown) was extracted from clinical narratives using (1) a text string search using researcher-generated text strings and (2) the LLM by prompting the system with low-, intermediate-, and high-detail prompts. The level of agreement between the 2 approaches across all 3 prompts was analyzed using Cohen κ test statistics. Fleiss κ was calculated to measure the test-retest reliability of the high-detail prompt across 5 new chat sessions and days. Performance statistics were calculated by comparing results from the high-detail prompt to classifications of helmet status generated by researchers reading the clinical notes (ie, a criterion standard review).
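
A text string search of the kind described in approach (1) can be sketched with regular expressions. The patterns below are hypothetical, since the researcher-generated strings are not listed in the abstract, and the function name `helmet_status` is an assumption.

```python
import re

# Hypothetical patterns; the study's researcher-generated strings are not listed.
NOT_WEARING = re.compile(
    r"\b(?:no|not wearing(?: a)?|without(?: a)?|w/o)\s+helmet\b"
    r"|\b(?:un|non-?)helmeted\b",
    re.IGNORECASE,
)
WEARING = re.compile(r"\bhelmet(?:ed)?\b", re.IGNORECASE)

def helmet_status(narrative):
    """Classify a clinical narrative as 'not wearing', 'wearing', or 'unknown'."""
    # Check negations first: "no helmet" would otherwise match WEARING.
    if NOT_WEARING.search(narrative):
        return "not wearing"
    if WEARING.search(narrative):
        return "wearing"
    return "unknown"

# Hypothetical narratives for illustration only.
print(helmet_status("FELL OFF E-BIKE, NO HELMET"))
print(helmet_status("Helmeted rider fell from powered scooter"))
print(helmet_status("Fell off hoverboard, wrist pain"))
```

A fixed pattern list like this will miss phrasings it does not anticipate (for example, "denies helmet use" would be classified as wearing), which is one reason a comparison against manual review is needed.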

RESULTS: Among 54 569 clinical notes, moderate (Cohen κ = 0.74 [95% CI, 0.73-0.75]) and weak (Cohen κ = 0.53 [95% CI, 0.52-0.54]) agreement were found between the text string-search approach and the LLM for the low- and intermediate-detail prompts, respectively. The high-detail prompt had almost perfect agreement (κ = 1.00 [95% CI, 1.00-1.00]) but required the greatest amount of time to complete. The LLM did not perfectly replicate its analyses across new sessions and days (Fleiss κ = 0.91 across 5 trials; P < .001). The LLM often hallucinated and was consistent in replicating its hallucinations. It also showed high validity compared with the criterion standard (n = 400; κ = 0.98 [95% CI, 0.96-1.00]).
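
Cohen κ corrects the raw agreement between two raters for the agreement expected by chance from their marginal label frequencies. A minimal sketch of the calculation follows, using hypothetical classifications rather than study data.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from the product of the two raters' marginals.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical classifications from two methods (not study data).
string_search = ["wearing", "wearing", "not wearing",
                 "not wearing", "unknown", "unknown"]
llm_output = ["wearing", "wearing", "not wearing",
              "unknown", "unknown", "unknown"]
kappa = cohen_kappa(string_search, llm_output)
print(round(kappa, 2))
```

Here the two methods agree on 5 of 6 notes (83%), but after subtracting the one-third agreement expected by chance, κ is 0.75.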

CONCLUSIONS AND RELEVANCE: This study’s findings suggest that although there are efficiency gains for using the LLM to extract information from clinical notes, the inadequate reliability compared with a text string-search approach, hallucinations, and inconsistent performance significantly hinder the potential of the currently available LLM.

PMID:39136946 | DOI:10.1001/jamanetworkopen.2024.25981

Incidence of Cancer and Cardiovascular Disease After Bariatric Surgery in Older Patients

JAMA Netw Open. 2024 Aug 1;7(8):e2427457. doi: 10.1001/jamanetworkopen.2024.27457.

ABSTRACT

IMPORTANCE: Bariatric surgery is associated with decreased risk of obesity-related cancer and cardiovascular disease but is typically reserved for patients younger than 60 years. Whether these associations hold for patients who undergo surgery at older ages is uncertain.

OBJECTIVE: To determine whether bariatric surgery is associated with a decreased risk of obesity-related cancer and cardiovascular disease in patients who underwent surgery at age 60 years or older.

DESIGN, SETTING, AND PARTICIPANTS: Population-based cohort study of patients from Denmark, Finland, and Sweden who underwent bariatric surgery at age 60 years or older without previous malignant neoplasm or cardiovascular disease between 1989 and 2019. Each patient who underwent surgery was exactly matched to 5 patients with nonoperative treatment for obesity of the same country, sex, and age at the date of surgery. Data were analyzed in December 2023.

EXPOSURE: Receiving treatment for obesity, including bariatric surgery and nonoperative treatments.

MAIN OUTCOMES AND MEASURES: The main outcome was obesity-related cancer, defined as a composite outcome of breast, endometrial, esophageal, colorectal, and kidney cancer, identified from the national cancer registries. The secondary outcome was cardiovascular disease, defined as a composite outcome of myocardial infarction, ischemic stroke, and cerebral hemorrhage, identified from the patient registries. Multivariable Cox regression provided hazard ratios (HR) with 95% CIs adjusted for diabetes, hypertension, peripheral vascular disease, chronic obstructive pulmonary disease, kidney disease, and frailty.

RESULTS: In total, 15 300 patients (median [IQR] age, 63 [61-65] years; 10 152 female patients [66.4%]) were included, of whom 2550 (16.7%) had bariatric surgery at age 60 or older and 12 750 (83.3%) had nonoperative treatment. During a median (IQR) of 5.8 (2.8-8.5) person-years of follow-up, 658 (4.3%) developed obesity-related cancer and 1436 (9.4%) developed cardiovascular disease. The risks of obesity-related cancer (HR, 0.81; 95% CI, 0.64-1.03) and cardiovascular disease (HR, 0.86; 95% CI, 0.74-1.01) were similar among those who underwent surgery and those who did not. Gastric bypass (1930 patients) was associated with a decreased risk of obesity-related cancer (71 patients [3.7%]; HR, 0.74; 95% CI, 0.56-0.97) and cardiovascular disease (159 patients [8.2%]; HR, 0.82; 95% CI, 0.69-0.99) compared with matched controls (9650 patients; obesity-related cancer: 442 patients [4.6%]; cardiovascular disease: 859 patients [8.9%]).
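
The event counts reported for gastric bypass allow a quick unadjusted comparison. The sketch below computes a crude risk ratio with a log-scale 95% CI; it is not the study's analysis, because it ignores the matching, the follow-up time, and the covariate adjustment used in the Cox models, so it should not be expected to reproduce the reported hazard ratios.

```python
import math

def crude_risk_ratio(events_exposed, n_exposed, events_control, n_control, z=1.96):
    """Unadjusted risk ratio with a log-scale (Katz) 95% CI."""
    rr = (events_exposed / n_exposed) / (events_control / n_control)
    se_log = math.sqrt(1 / events_exposed - 1 / n_exposed
                       + 1 / events_control - 1 / n_control)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Counts reported in the abstract for obesity-related cancer:
# gastric bypass 71/1930 vs matched controls 442/9650.
rr, lower, upper = crude_risk_ratio(71, 1930, 442, 9650)
print(round(rr, 2), round(lower, 2), round(upper, 2))
```

The crude ratio comes out near 0.80, somewhat closer to 1 than the adjusted HR of 0.74, which illustrates how much the time-to-event modeling and adjustment can shift an estimate.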

CONCLUSIONS AND RELEVANCE: This cohort study found that bariatric surgery in older patients is not associated with lower rates of obesity-related cancer and cardiovascular events, but there was evidence that gastric bypass may be associated with lower risk of both outcomes.

PMID:39136945 | DOI:10.1001/jamanetworkopen.2024.27457

The effect of economic growth, investment, and unemployment on renewable energy transition: evidence from OECD countries

Environ Sci Pollut Res Int. 2024 Aug 13. doi: 10.1007/s11356-024-34143-7. Online ahead of print.

ABSTRACT

In today’s world, where the dramatic effects of climate change continue to increase, it is critical to turn from fossil fuels to renewable energy sources to achieve the CO2 emission reduction targets that countries committed to under the Paris Climate Agreement and at the COP 27 conference. This study analyzes the effects of macroeconomic factors, including economic growth, investment, and unemployment, on the transition to renewable energy in OECD countries. Long-run relationships between the variables were examined for the period from 1996 to 2020 using panel data analysis, second-generation panel unit root tests, cross-sectional dependence tests, and panel cointegration tests. According to the panel CCEMG and AMG estimators, in the long run economic growth enhances the renewable energy transition, whereas investment has no statistically significant effect on it. The renewable energy transition increases with unemployment. Moreover, the role of the considered variables in the renewable energy transition varies across countries. The results indicate that, before determining policies for the renewable energy transition, it is necessary to lay the economic groundwork to increase economic growth and investment and to reduce unemployment.
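
The abstract does not say which cross-sectional dependence test was applied. One widely used choice is the Pesaran CD statistic, which scales the sum of pairwise correlations between units by the panel dimensions. The sketch below assumes that statistic and a balanced panel, and uses hypothetical series; in applied work it is normally computed on regression residuals.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def pesaran_cd(panel):
    """panel: list of N equal-length series (T observations per country)."""
    n_units, n_periods = len(panel), len(panel[0])
    rho_sum = sum(pearson(a, b) for a, b in combinations(panel, 2))
    return math.sqrt(2.0 * n_periods / (n_units * (n_units - 1))) * rho_sum

# Hypothetical balanced panel: 3 countries x 5 years (not OECD data).
hypothetical_panel = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 2.5, 3.5, 4.0, 6.0],
    [5.0, 4.0, 3.5, 2.0, 1.0],
]
cd = pesaran_cd(hypothetical_panel)
print(round(cd, 3))
```

Under the null hypothesis of no cross-sectional dependence the statistic is approximately standard normal, so large absolute values point toward the second-generation unit root and cointegration tests mentioned above.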

PMID:39136924 | DOI:10.1007/s11356-024-34143-7