Categories
Nevin Manimala Statistics

Evaluation Framework of Large Language Models in Medical Documentation: Development and Usability Study

J Med Internet Res. 2024 Nov 20;26:e58329. doi: 10.2196/58329.

ABSTRACT

BACKGROUND: The advancement of large language models (LLMs) offers significant opportunities for health care, particularly in the generation of medical documentation. However, challenges related to ensuring the accuracy and reliability of LLM outputs, coupled with the absence of established quality standards, have raised concerns about their clinical application.

OBJECTIVE: This study aimed to develop and validate an evaluation framework for assessing the accuracy and clinical applicability of LLM-generated emergency department (ED) records, aiming to enhance artificial intelligence integration in health care documentation.

METHODS: We organized the Healthcare Prompt-a-thon, a competitive event designed to explore the capabilities of LLMs in generating accurate medical records. The event involved 52 participants who generated 33 initial ED records using HyperCLOVA X, a Korean-specialized LLM. We applied a dual evaluation approach. First, clinical evaluation: 4 medical professionals evaluated the records using a 5-point Likert scale across 5 criteria: appropriateness, accuracy, structure/format, conciseness, and clinical validity. Second, quantitative evaluation: We developed a framework to categorize and count errors in the LLM outputs, identifying 7 key error types. Statistical methods, including Pearson correlation and intraclass correlation coefficients (ICC), were used to assess consistency and agreement among evaluators.

RESULTS: The clinical evaluation demonstrated strong interrater reliability, with ICC values ranging from 0.653 to 0.887 (P<.001), and a test-retest reliability Pearson correlation coefficient of 0.776 (P<.001). Quantitative analysis revealed that invalid generation errors were the most common, constituting 35.38% of total errors, while structural malformation errors had the most significant negative impact on the clinical evaluation score (Pearson r=-0.654; P<.001). A strong negative correlation was found between the number of quantitative errors and clinical evaluation scores (Pearson r=-0.633; P<.001), indicating that higher error rates corresponded to lower clinical acceptability.
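
The correlation statistic reported above can be illustrated with a minimal sketch. The function below computes a sample Pearson r from first principles; the names `pearson_r`, `error_counts`, and `clinical_scores` are assumptions made for this illustration, and the numbers are hypothetical, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical illustration only: error counts per record and the
# corresponding mean clinical scores. These are NOT values from the study.
error_counts = [0, 1, 2, 4, 5, 7]
clinical_scores = [4.8, 4.5, 4.1, 3.2, 3.0, 2.1]

r = pearson_r(error_counts, clinical_scores)
print(f"r = {r:.3f}")
```

A negative r, as in the abstract, means that records with more counted errors tended to receive lower clinical scores.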

CONCLUSIONS: Our research provides robust support for the reliability and clinical acceptability of the proposed evaluation framework. It underscores the framework’s potential to mitigate clinical burdens and foster the responsible integration of artificial intelligence technologies in health care, suggesting a promising direction for future research and practical applications in the field.

PMID:39566044 | DOI:10.2196/58329

Rank Ordered Design Attributes for Health Care Dashboards Including Artificial Intelligence: Usability Study

Online J Public Health Inform. 2024 Nov 20;16:e58277. doi: 10.2196/58277.

ABSTRACT

BACKGROUND: On average, people in the United States visit a doctor 4 times a year, and many of them have chronic illnesses. Because of the increased use of technology, people frequently rely on the internet to access health information and statistics. People use health care information to make better-educated decisions for themselves and others. Health care dashboards should provide pertinent and easily understood data, such as information on timely cancer screenings, so the public can make better-informed decisions. In order to enhance health outcomes, effective dashboards should provide precise data in an accessible and easily digestible manner.

OBJECTIVE: This study identifies the top 15 attributes of a health care dashboard. The objective of this research is to enhance health care dashboards to benefit the public by making better health care information available for more informed decisions by the public and to improve population-level health care outcomes.

METHODS: The authors conducted a survey of health care dashboards with 218 individuals identifying the best practices to consider when creating a public health care dashboard. The data collection was conducted from June 2023 to August 2023. The analyses performed were descriptive statistics, frequencies, and a comparison to a prior study.

RESULTS: From May 2023 to June 2023, we collected 3259 responses in multiple different states around the United States from 218 people aged 18 years or older. The features ranking in descending order of importance are as follows: (1) easy navigation, (2) historical data, (3) simplicity of design, (4) high usability, (5) use of clear descriptions, (6) consistency of data, (7) use of diverse chart types, (8) compliance with the Americans with Disabilities Act, (9) incorporated user feedback, (10) mobile compatibility, (11) comparison data with other entities, (12) storytelling, (13) predictive analytics with artificial intelligence, (14) adjustable thresholds, and (15) charts with tabulated data.

CONCLUSIONS: Future studies can extend the research to other types of dashboards such as bioinformatics, financial, and managerial dashboards as well as confirm these top 15 best practices for medical dashboards with further evidentiary support. The medical informatics community may benefit from standardization to improve efficiency and effectiveness as dashboards can communicate vital information to patients worldwide on critically prominent issues. Furthermore, health care professionals should use these best practices to help increase population health care outcomes by informing health care consumers to make better decisions with better data.

PMID:39566038 | DOI:10.2196/58277

Associations between objectively measured nighttime sleep duration, sleep timing, and sleep quality and body composition in toddlers in the Guelph Family Health Study

Appl Physiol Nutr Metab. 2024 Nov 20. doi: 10.1139/apnm-2024-0244. Online ahead of print.

ABSTRACT

The prevalence of child obesity is a worldwide public health concern. Good sleep hygiene is associated with reduced adiposity in older children and adults. More research is needed in younger children to help mitigate risk of obesity. As well, we aimed to address limitations found in previous studies, such as relying on subjective measures, including only one sleep parameter, using only one body composition parameter, and/or not adjusting for relevant covariates. This cross-sectional study examined baseline data from 48 toddlers aged 1 to <3 years enrolled in the Guelph Family Health Study. Nighttime sleep duration, sleep timing (time the child went to sleep and awoke), and sleep quality were measured using 24-hour accelerometry for seven consecutive days. Height, body weight, and waist circumference were measured, and BMI z-scores and waist-to-height ratios were calculated. Percent fat mass and fat mass index were calculated using bioelectrical impedance analysis. Linear regression models were used to estimate associations between sleep parameters and body composition outcomes, with adjustments for relevant covariates (age, sex, household income, screen time, energy intake, physical activity, household stress). Nighttime sleep onset time was positively associated with waist-to-height ratio (β̂=0.004, p=0.04). Sleep offset time was negatively associated with BMI z-score (β̂=-0.48, p=0.02). Total sleep time and wake after sleep onset were not associated with any body composition outcome. Building healthy sleep habits may prevent childhood obesity; longitudinal research in a larger sample is warranted. This study was registered on ClinicalTrials.gov (NCT02939261).

PMID:39566036 | DOI:10.1139/apnm-2024-0244

Differing conceptual maps of skills for implementing evidence-based interventions held by community-based organization practitioners and academics: A multidimensional scaling comparison

Transl Behav Med. 2024 Nov 20:ibae051. doi: 10.1093/tbm/ibae051. Online ahead of print.

ABSTRACT

Community-based organizations (CBOs) are critical for delivering evidence-based interventions (EBIs) to address cancer inequities. However, a lack of consensus on the core skills needed for this work often hinders capacity-building strategies to support EBI implementation. The disconnect is partly due to differing views of EBIs and related skills held by those typically receiving versus developing capacity-building interventions (here, practitioners and academics, respectively). Our team of implementation scientists and practice-based advisors used group concept mapping to engage 34 CBO practitioners and 30 academics with experience addressing cervical cancer inequities implementing EBIs. We created group-specific maps of skills using multidimensional scaling and hierarchical cluster analysis, then compared them using Procrustes comparison permutations. The 98 skills were sorted into six clusters by CBO practitioners and five by academics. The groups generated maps with statistically comparable underlying structures but also statistically significant divergence. Some skill clusters had high concordance across the two maps, e.g. “managing funding and external resources.” Other skill clusters, e.g. “adapting EBIs” from the CBO practitioner map and “selecting and adapting EBIs” from the academic map, did not overlap as much. Across groups, key clusters of skills included connecting with community members, understanding the selected EBI and community context, adapting EBIs, building diverse and equitable partnerships, using data and evaluation, and managing funding and external resources. There is a significant opportunity to combine CBO practitioners’ systems/community frames with the EBI-focused frame of academics to promote EBI utilization and address cancer and other health inequities.

PMID:39566021 | DOI:10.1093/tbm/ibae051

Analysis of Concordance Between Next-Generation Sequencing Assessment of Microsatellite Instability and Immunohistochemistry-Mismatch Repair From Solid Tumors

JCO Precis Oncol. 2024 Oct;8:e2300648. doi: 10.1200/PO.23.00648. Epub 2024 Nov 20.

ABSTRACT

PURPOSE: The new CAP guideline published in August 2022 recommends using immunohistochemistry (IHC) to test for mismatch repair defects in gastroesophageal (GE), small bowel (SB), or endometrial carcinoma (EC) cancers over next-generation sequencing assessment of microsatellite instability (NGS-MSI) for immune checkpoint inhibitor (ICI) therapy eligibility and states there is a preference to use IHC over NGS-MSI in colorectal carcinoma (CRC).

METHODS: We assessed the concordance of NGS-MSI and IHC-MMR from a very large cohort across the spectrum of solid tumors.

RESULTS: Of the over 190,000 samples with both NGS-MSI and IHC-MMR about 1,160 were initially flagged as discordant. Of those samples initially flagged as discordant, 50.9% remained discordant after being reviewed by an additional pathologist. This resulted in a final discordance rate of 0.31% (590/191,767). Among CRC, GE, SB and EC, 55.4% of mismatch repair proficient/MSI high (MMRp/MSI-H) tumors had at least one somatic pathogenic mutation in an MMR gene or POLE. Mismatch repair deficient/microsatellite stable (MMRd/MSS) tumors had a significantly lower rate of high tumor mutational burden than MMRp/MSI-H tumors. Across all solid tumors, MMRd/MSI-H tumors had significantly longer overall survival (OS; hazard ratio [HR], 1.47, P < .001) and post-ICI survival (HR, 1.82, P < .001) as compared with MMRp/MSS tumors. The OS for the MMRd/MSS group was slightly worse compared to the MMRp/MSI-H tumors, but this difference was not statistically significant (HR, 0.73, P = .058), with a similar pattern when looking at post-ICI survival (HR, 0.43, P = .155).
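
The discordance figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the RESULTS paragraph; the variable names are assumptions made for this illustration, and "about 1,160" is treated as exactly 1,160.

```python
# Figures quoted in the RESULTS paragraph above.
flagged = 1160              # "about 1,160" samples initially flagged as discordant
remained_fraction = 0.509   # share still discordant after review by a second pathologist
final_discordant = 590      # numerator of the reported final discordance rate
total_samples = 191_767     # denominator of the reported final discordance rate

# About 1,160 x 50.9% should land close to the reported 590 samples.
expected_remaining = flagged * remained_fraction

# Final discordance rate as a percentage of all paired samples.
discordance_rate_pct = 100 * final_discordant / total_samples

print(f"expected remaining discordant: {expected_remaining:.0f}")  # prints 590
print(f"final discordance rate: {discordance_rate_pct:.2f}%")      # prints 0.31%
```

Both results agree with the abstract: roughly half of the flagged samples stayed discordant, and 590 of 191,767 is about 0.31%.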

CONCLUSION: This study demonstrates that NGS-MSI is noninferior to IHC-MMR and can identify MSI-H tumors that IHC-MMR is unable to detect and conversely IHC-MMR can identify MMRd tumors that NGS-MSI misses.

PMID:39565978 | DOI:10.1200/PO.23.00648

Periodontitis associated with brain function impairment in middle-aged and elderly individuals with normal cognition

J Periodontol. 2024 Nov 20. doi: 10.1002/JPER.24-0264. Online ahead of print.

ABSTRACT

BACKGROUND: The present study aimed to investigate changes in intranetwork functional connectivity (FC) and internetwork FC in middle-aged and elderly individuals with normal cognition (NC) and varying degrees of periodontitis to determine the effects of periodontitis on brain function.

METHODS: Periodontal findings and resting-state functional magnetic resonance imaging data were acquired from 51 subjects with NC. Independent component analysis and correlation analysis were used for the statistical analysis of the data.

RESULTS: Differences in intranetwork FC were observed among groups in the anterior default-mode network (aDMN), dorsal attention network and dorsal sensorimotor network (dSMN). Compared with the nonperiodontitis (NP) group or the mild-periodontitis group, the analysis of internetwork FC showed increased FC between the auditory network and the ventral attention network (VAN), between the aDMN and the salience network (SN), and between the SN and the VAN and decreased FC between the posterior default-mode network and the right frontoparietal network in the moderate-to-severe periodontitis group. Additionally, internetwork FC between the dSMN and the VAN was also increased in the moderate-to-severe periodontitis group compared to the NP group. The altered intra- and internetwork FC were significantly correlated with the periodontal clinical index.

CONCLUSION: Our results confirmed that periodontitis was associated with both intra- and internetwork FC changes even in NC. The present study indicates that periodontitis might be a potential risk factor for brain damage and provides a theoretical clue and a new treatment target for the early prevention of Alzheimer disease.

PLAIN LANGUAGE SUMMARY: Recent research has proposed that periodontitis is a potential risk factor for Alzheimer disease (AD). However, the relationship between periodontitis and the brain function of middle-aged and elderly individuals with normal cognition (NC) remains unclear. Analyzing the effect of periodontitis on brain function in the NC stage can provide clues to AD development and help achieve early prevention of dementia. The present study aimed to investigate changes in brain functional connectivity (FC) in NC with different severity of periodontitis to determine the effects of periodontitis on brain function. Both changed intranetwork FC and internetwork FC were found in the moderate-to-severe periodontitis group, and periodontitis was associated with brain network function impairment in NC. The present study indicates that periodontitis might be a potential risk factor for brain damage even in NC stage, and provides a theoretical clue and a new treatment target for the early prevention of AD.

PMID:39565645 | DOI:10.1002/JPER.24-0264

High-molecular-weight oligomer tau (HMWoTau) species are dramatically increased in Braak-stage dependent manner in the frontal lobe of human brains, demonstrated by a novel oligomer Tau ELISA with a mouse monoclonal antibody (APNmAb005)

FASEB J. 2024 Nov 30;38(22):e70160. doi: 10.1096/fj.202401704R.

ABSTRACT

An assay system for disease-specific tau oligomers is awaited in Alzheimer disease (AD) to elucidate their etiological roles. We developed, for the first time, a highly sensitive and selective ELISA for high-molecular-weight oligomer tau (HMWoTau) with an LLOQ of 0.3 pg/well, using a novel mouse monoclonal antibody, APNmAb005. The target molecule was identified as HMWoTau with a minimum size of circa 2000 kD, together with more oligomerized species (>5000 kD), by combined analysis with Size-Exclusion-Chromatography and Sucrose-Density-Gradient-Centrifugation of both recombinant human (rh) Tau-derived aggregates and AD brain-lysates in PBS(-). HMWoTau was labeled by Thioflavin S and visualized as a homogeneous globular particle (about 30 nm in diameter) by two different technologies, atomic force microscopy and dSTORM-Nanoimager. Specific quantitation was also confirmed by immune-absorption, rhHMWoTau-spiked, and cross-reactivity studies. After treatment with DTT/SDS, APNmAb005 failed to detect the HMWoTau signal, whereas the pan-tau antibody was unaffected, indicating conformation-specific recognition. APNmAb005-ELISA showed AD-specific and statistically significant ELISA signals from 1 ng brain lysate protein/well. Analysis of the frontal neocortex (N = 40, Braak stage I-VI) by ELISA revealed detection-limit levels of HMWoTau species at stage I-III, and drastic and statistically significant increases at stage V/VI (AD). By contrast, total Tau and p181 Tau showed 1/4-1/5 levels of AD even at Stage I, while both tau species also showed a statistically significant increase in AD. In sum, our novel APNmAb005-ELISA clarified the disease-specific increase in HMWoTau species and will be useful not only for further etiological elucidation but also for potential diagnostics in AD and relevant tauopathy.

PMID:39565643 | DOI:10.1096/fj.202401704R

Use of Biologic or Targeted Synthetic Disease-Modifying Antirheumatic Drugs and Cancer Risk

JAMA Netw Open. 2024 Nov 4;7(11):e2446336. doi: 10.1001/jamanetworkopen.2024.46336.

ABSTRACT

IMPORTANCE: The Oral Rheumatoid Arthritis Trial Surveillance demonstrated an increased cancer risk among patients with rheumatoid arthritis (RA) taking tofacitinib compared with those taking tumor necrosis factor inhibitors (TNFis). Although international cohort studies have compared cancer outcomes between TNFis, non-TNFi drugs, and Janus kinase inhibitors (JAKis), their generalizability to US patients with RA is limited.

OBJECTIVE: To assess the comparative safety of TNFis, non-TNFi drugs, and JAKis among US patients with RA (ie, the cancer risk associated with the use of these drugs among these patients).

DESIGN, SETTING, AND PARTICIPANTS: This retrospective cohort study used US administrative claims data from Merative Marketscan Research Databases from November 1, 2012, to December 31, 2021. Follow-up occurred up to 2 years after initiation of biologic or targeted synthetic disease-modifying antirheumatic drugs (DMARDs). Participants included individuals aged 18 to 64 years with RA, identified using at least 2 RA International Classification of Diseases, Ninth Revision or International Statistical Classification of Diseases and Related Health Problems, Tenth Revision diagnostic codes on or before the date of TNFi, non-TNFi, or JAKi initiation (“index date”). Statistical analysis took place from June 2022 to September 2024.

EXPOSURES: New initiations of TNFis, abatacept, interleukin 6 inhibitors (IL-6is), rituximab, or JAKis. Individuals could contribute person-time to more than 1 treatment exposure if treatment escalation mimicked typical clinical practice but were censored if they switched to a previously trialed medication class.

MAIN OUTCOMES AND MEASURES: Incident cancer, excluding nonmelanoma skin cancer, after at least 90 days and within 2 years of initiation of biologic or targeted synthetic DMARDs. Outcomes were associated with the most recent drug exposure.

RESULTS: Of the 25 305 individuals who initiated treatment and who met the inclusion criteria, most were female (19 869 [79%]), had a median age of 50 years (IQR, 42-56 years), and were from the South US (12 516 [49%]). Of a total 27 661 drug exposures, drug initiations consisted of 20 586 TNFi exposures (74%), 2570 JAKi exposures (9%), 2255 abatacept exposures (8%), 1182 rituximab exposures (4%), and 1068 IL-6i exposures (4%). Multivariable Cox proportional hazards regression analysis showed that rituximab was associated with a higher risk of incident cancer compared with TNFis (hazard ratio [HR], 1.91; 95% CI, 1.17-3.14), followed by abatacept (HR, 1.47; 95% CI, 1.03-2.11), and JAKis (HR, 1.36; 95% CI, 0.94-1.96).
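
The exposure shares and hazard ratio intervals above can be cross-checked with a short sketch. It uses only the counts and confidence limits quoted in the RESULTS paragraph; the variable names are assumptions made for this illustration. A 95% CI that includes 1 is conventionally read as not statistically significant at the 0.05 level.

```python
# Drug exposure counts quoted in the RESULTS paragraph above.
total_exposures = 27_661
exposures = {
    "TNFi": 20_586,
    "JAKi": 2_570,
    "abatacept": 2_255,
    "rituximab": 1_182,
    "IL-6i": 1_068,
}

# The five exposure counts should add up to the reported total.
assert sum(exposures.values()) == total_exposures

# Share of each drug class, rounded to whole percentages as in the abstract.
shares = {drug: round(100 * n / total_exposures) for drug, n in exposures.items()}

# Hazard ratios versus TNFis: (point estimate, lower 95% limit, upper 95% limit).
hazard_ratios = {
    "rituximab": (1.91, 1.17, 3.14),
    "abatacept": (1.47, 1.03, 2.11),
    "JAKi": (1.36, 0.94, 1.96),
}

# True when the 95% CI lies entirely on one side of 1.
ci_excludes_one = {
    drug: not (low <= 1.0 <= high) for drug, (_, low, high) in hazard_ratios.items()
}

print(shares)
print(ci_excludes_one)
```

The rounded shares reproduce the reported 74%, 9%, 8%, 4%, and 4%. The intervals for rituximab and abatacept exclude 1, while the interval for JAKis (0.94-1.96) includes it.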

CONCLUSIONS AND RELEVANCE: In this cohort study of individuals with RA and new biologic or targeted synthetic DMARD exposures, individuals initiating rituximab, abatacept, and JAKis demonstrated higher incidence rates and statistically significantly increased risks of incident cancers compared with those initiating TNFis in the first 2 years after initiation of biologic or targeted synthetic DMARDs. Given the limitations of administrative claims data and confounding by indication, it is likely that these patients may have a higher disease burden, resulting in channeling bias. To better understand these associations, larger studies with longer follow-up time are needed.

PMID:39565623 | DOI:10.1001/jamanetworkopen.2024.46336

American College of Surgeons Operative Standards and Breast Cancer Outcomes

JAMA Netw Open. 2024 Nov 4;7(11):e2446345. doi: 10.1001/jamanetworkopen.2024.46345.

ABSTRACT

IMPORTANCE: The American College of Surgeons (ACS) operative standards were established to detail critical elements of cancer surgery, reduce technical variation, and improve outcomes. Two of the 6 operative standards target adequate axillary surgery for breast cancer. The potential association of the operative standards with short-term oncologic outcomes, such as nodal yield and nodal positivity rates, is currently unknown.

OBJECTIVE: To evaluate the potential association of the ACS operative standards with short-term oncologic outcomes in breast cancer.

DESIGN, SETTING, AND PARTICIPANTS: A cohort study was performed using data on 1 201 317 women 18 years or older who underwent sentinel lymph node biopsy (SLNB) or axillary lymph node dissection (ALND) for invasive breast cancer from January 1, 2012, to December 31, 2020. Patients were identified using the National Cancer Database (NCDB), a clinical oncology database encompassing approximately 70% of new cancer diagnoses, sourced from hospital registry data from 1317 facilities. Statistical analysis was performed from October 2023 to June 2024.

EXPOSURE: Sentinel lymph node biopsy or ALND.

MAIN OUTCOMES AND MEASURES: Reliability-adjusted facility-level lymph node yield and nodal positivity rate for each procedure were calculated using generalized linear mixed models, Poisson regression, and logistic regression with facility-level random intercepts.

RESULTS: The cohort included 1 201 317 women with a median age of 62 years (IQR, 53-70 years). Facility-level nodal yield ranged from 1 to 6 for SLNB and from 6 to 22 for ALND. Median facility-level nodal yield for SLNB was 2.6 (IQR, 2.3-3.0) and the nodal positivity rate for SLNB was 12.2% (IQR, 11.0%-13.7%), with rates ranging from 6% to 21%. A weak correlation between facility-level lymph node yield and nodal positivity was observed (Spearman correlation coefficient, 0.17). Median nodal upstaging rate (≥4 positive nodes) for ALND was 30.5% (IQR, 26.5%-35.0%), with rates ranging from 11% to 54%; median nodal yield was 12.2 (IQR, 10.9-13.6). A strong correlation between nodal yield and nodal upstaging rates was observed (Spearman correlation coefficient, 0.53).

CONCLUSIONS AND RELEVANCE: In this cohort study of women undergoing axillary surgery for invasive breast cancer, facility-level variation in lymph node yield was present for both SLNB and ALND, which could potentially be improved through the ACS operative standards. However, this variation had mixed associations with nodal positivity and upstaging rates, suggesting the association of the ACS operative standards with oncologic outcomes may be mixed.

PMID:39565622 | DOI:10.1001/jamanetworkopen.2024.46345

Inpatient Dermatology referrals: What is the burden? A retrospective review of 14 years of dermatology inpatient referrals

Clin Exp Dermatol. 2024 Nov 20:llae498. doi: 10.1093/ced/llae498. Online ahead of print.

ABSTRACT

BACKGROUND: The lack of dermatological knowledge by non-dermatologists is exposed by the increasing number of requests made for inpatient dermatological consultations. Patients have been commenced on inappropriate treatment because of poor dermatology training.

OBJECTIVES: To determine the burden and accuracy of inpatient dermatology referrals.

METHODS: A retrospective cohort study using paper inpatient dermatology referrals from one Health Board between June 2007 and July 2021. Data analysis included timing of referrals; referring speciality; diagnosis and treatment. Descriptive statistics, using Excel, were used for analyses.

RESULTS: The average number of referrals per year was 106 (79-166). The most frequent day of referral was Monday (26%). Most referrals were from medical teams (73%). A differential diagnosis was suggested by the referring team in 59% of referrals. The dermatology team agreed with the differential diagnosis in only 29% of referrals. There was discrepancy in the correctness of diagnosis in all categories; however, the paediatricians were most likely to offer a correct differential (44%). In 44% of referrals, treatment was commenced by the referring team, most commonly antibiotics.

CONCLUSIONS: There is an extra burden on dermatology teams to cover inpatients. Our figures highlight two important issues: the need for better dermatological education in medical schools to improve diagnostic accuracy and management of conditions, and the need for an inpatient dermatology service to review inpatient referrals, advise on diagnosis and management of dermatology cases on the wards, and protect the service from being uncoupled from the main hospital.

PMID:39565592 | DOI:10.1093/ced/llae498