Categories
Nevin Manimala Statistics

Usability and Usefulness of Machine Learning-Based Clinical Decision Support Software in Primary Care: Survey of Users in a Prospective Observational Study

JMIR Med Inform. 2026 May 27;14:e80527. doi: 10.2196/80527.

ABSTRACT

BACKGROUND: Decision support systems promise to enhance the quality of care. However, the successful implementation of a clinical decision support system (CDSS) depends on user acceptance and adoption. A machine learning (ML)-based CDSS to assist primary care professionals treating urinary tract infections (UTIs) was implemented, and its usability and usefulness were assessed through a questionnaire.

OBJECTIVE: This study aimed to assess the system’s usability by examining users’ experiences with the software. A secondary goal was to assess users’ attitudes toward evidence-based practice and innovation in health care.

METHODS: In collaboration with the Netherlands Institute for Health Services Research (NIVEL) and Leiden University Medical Center (LUMC), Pacmed Ltd developed the CDSS. The cohort was mostly recruited at the care group level; practices within participating care groups were required to participate. Health insurers partly funded the research. Practitioners participated in the implementation study for 4 months. A survey based on the Unified Theory of Acceptance and Use of Technology (UTAUT) was sent to 263 general practitioners and assistants shortly after the implementation period. Furthermore, usage data were analyzed.

RESULTS: Of the 34 participating practices that used the software, 30 (88%) submitted at least one survey response, with a mean of 2.23 responses per practice (SD 1.43). The CDSS was used throughout the pilot period, and 31 practices continued using the tool, with 9% dropping out during the first 8 weeks. Sixty-seven percent of respondents trusted the tool’s output, and 73% found it understandable how the algorithm came to predictions. Sixty-five percent of respondents indicated that the information provided was useful in addition to the available guidelines, and 52% agreed that it supported their decision-making. However, many respondents were uncertain whether the tool improved patient care (46%) or patient outcomes (66%). Forty-eight percent of respondents found the software easy to integrate into their clinical workflow.

CONCLUSIONS: The CDSS was perceived as trustworthy and easy to use. However, users were unable to determine whether the CDSS improved patient outcomes. In addition, the CDSS development could have benefited from involving assistants, as well as general practitioners, more closely in the design phase of the software. Because assistants play an important role in UTI care, designing the software to better fit existing workflows may reduce the perceived time investment associated with using the tool. Finally, respondents reported strong motivation to contribute to further research in this field and indicated willingness to embrace change in health care delivery, which may also reflect selection bias in our sample.

PMID:42202295 | DOI:10.2196/80527

Multimodal Prediction of Renal Tumor Malignancy From Radiology Reports and Structured Electronic Health Records: Retrospective Cohort Study

JMIR Med Inform. 2026 May 27;14:e84396. doi: 10.2196/84396.

ABSTRACT

BACKGROUND: Accurate preoperative prediction of renal tumor malignancy is critical for guiding decisions and reducing overtreatment, as a substantial proportion of renal masses prove benign. Although radiology assessments and structured electronic health record (EHR) data are routinely used, many tumor-specific descriptors remain embedded in free-text radiology reports and are underused due to extraction challenges.

OBJECTIVE: This study aimed to develop and evaluate a multimodal pipeline that integrates structured EHR variables with natural language processing features from computed tomography (CT) radiology reports, including large language model (LLM)-extracted abnormality characteristics and transformer-based report embeddings, to improve malignancy prediction.

METHODS: We conducted a retrospective cohort study using University of Florida Health Integrated Data Repository Observational Medical Outcomes Partnership-mapped EHR data from December 2011 to August 2024. Adults with renal tumors were included if they had longitudinal diagnostic documentation consistent with a renal mass and at least 1 preoperative renal CT report; final benign or malignant status served as the outcome. Structured features included demographics, comorbidities, medications, vital signs, and laboratory measurements. From the most recent preindex CT report, an on-premises LLM isolated kidney-specific findings and extracted abnormality characteristics. Four locally deployed LLMs were evaluated against manual annotations of 500 reports. Kidney-specific text was encoded using pretrained biomedical transformer models, including radiology Bidirectional Encoder Representations from Transformers (BERT) variants. We evaluated unimodal baselines and multimodal early, middle, and late fusion strategies. Model development used 5-fold cross-validation within the 80% training partition; each fold-specific model was evaluated on the same independent 20% held-out test set, with performance reported as mean and SD across the 5 held-out test evaluations. The primary metric was area under the receiver operating characteristic curve (AUC).
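
The evaluation protocol described above (several fold-specific models scored on one shared held-out test set, then summarized as mean and SD of AUC) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `roc_auc` and `summarize_folds`, the toy labels, and the toy scores are all hypothetical, and the use of the sample SD is an assumption because the abstract does not say which SD was reported.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize_folds(test_labels, fold_scores):
    """Score every fold-specific model on the same held-out labels; return (mean, SD) of AUC."""
    aucs = [roc_auc(test_labels, scores) for scores in fold_scores]
    # Sample SD is an assumption; the abstract does not specify the estimator.
    return mean(aucs), stdev(aucs)


# Hypothetical toy data: 1 = malignant, 0 = benign; one score list per fold-specific model.
test_labels = [1, 1, 0, 0]
fold_scores = [
    [0.9, 0.8, 0.2, 0.1],
    [0.9, 0.2, 0.8, 0.1],
]
mean_auc, sd_auc = summarize_folds(test_labels, fold_scores)
```

Because every fold-specific model is scored on the same test set, the SD here reflects variation due to the training folds only, not variation across different test samples.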

RESULTS: The final cohort included 967 patients (n=712, 73.6% malignant). In extraction evaluation, Qwen2.5-32B achieved 88.3% overall accuracy with a 100% extraction success rate and was selected for downstream feature generation. Among unimodal models, the structured clinical variable model achieved an AUC of 0.758 (SD 0.012), kidney-specific text with radiology BERT achieved an AUC of 0.746 (SD 0.058), and abnormality characteristics alone achieved an AUC of 0.716 (SD 0.015). Multimodal fusion models achieved higher descriptive performance than unimodal models. Early fusion achieved the highest AUC (mean 0.813, SD 0.008) and F1-score (mean 0.809, SD 0.030), while late fusion achieved an AUC of 0.805 (SD 0.016). Ablation and interpretability analyses suggested complementary predictive information from structured clinical variables and kidney-specific text embeddings.

CONCLUSIONS: Integrating unstructured radiology report text with structured EHR variables achieved higher mean predictive performance than unimodal approaches in descriptive comparisons. Multimodal fusion, particularly early fusion incorporating radiology BERT-derived kidney-specific text embeddings, achieved the strongest discrimination, suggesting potential value of natural language processing-enabled multimodal EHR pipelines for informing preoperative risk stratification.

PMID:42202288 | DOI:10.2196/84396

Developing Customized Personas to Capture Intrinsic Capacity Profiles and Digital Monitoring Intentions in Older Adults: Mixed Methods Study

JMIR Aging. 2026 May 27;9:e82867. doi: 10.2196/82867.

ABSTRACT

BACKGROUND: Integrated Care for Older People (ICOPE), focused on monitoring and optimizing the intrinsic capacity (IC) of older adults, is a new model of geriatric care that is currently being accelerated globally. Digital health technologies are recommended for longitudinal IC monitoring to provide precise and timely interventions. However, little is known about the psychological intentions of engaging in digital monitoring of IC according to the profile heterogeneity of IC among older adults.

OBJECTIVE: This study aims to map a set of customized personas to capture the profiles of IC and match psychological intentions that support personalized digital IC monitoring.

METHODS: An explanatory sequential mixed methods study was conducted at 16 sites in Beijing, China. Older adults aged ≥60 years (n=481) were selected to complete the quantitative survey. Latent profile analysis, descriptive statistics, and logistic regression analyses were performed to cluster subgroups using Mplus (Muthén & Muthén) and SPSS (IBM Corp). A subsample of participants from each profile (n=25) was purposively sampled for semistructured interviews. An inductive-deductive content analysis was used to identify similar attributes and to affirm the personas gradually. A joint statistical and thematic visualization method was used to integrate the customized personas.

RESULTS: Three profiles of IC patterns emerged: “multisubdomain decline-IC imbalance group,” “multisubdomain moderate-sensory deficit group,” and “multisubdomain robust-whole balance group.” The distribution of latent profiles was influenced by age, education, monthly per capita household income, self-rated health, and number of chronic diseases, while positively impacting older adults’ functional ability. The following customized personas were captured regarding established themes: “affects my mood-anxious evader,” characterized by avoidance and anxiety, low digital interest, and perceived social isolation; “capitalize on what comes-accommodative adopter,” pragmatically oriented toward disease detection, with moderate digital openness but limited self-efficacy; and “more autonomy-active improver,” who exhibited proactive engagement, high digital literacy, and motivation rooted in self-management and social participation.

CONCLUSIONS: This study is the first to integrate latent profile analysis with customized qualitative personas to link the heterogeneity of IC with the psychological intentions underlying digital monitoring. The resulting personas model provides an actionable framework for tailoring digital IC monitoring strategies in community-based integrated care. The findings emphasize the need to align monitoring approaches with older adults’ IC characteristics, psychological readiness, digital literacy, and social support to enhance engagement in digital IC monitoring.

PMID:42202287 | DOI:10.2196/82867

Nonlinear kernel-based high-dimensional inference for set-based genetic association studies

Brief Bioinform. 2026 May 4;27(3):bbag275. doi: 10.1093/bib/bbag275.

ABSTRACT

Nonlinear genetic architectures, including epistasis and threshold effects, are increasingly recognized as contributors to complex disease risk, yet most existing SNP-set association tests rely on linear modeling assumptions, resulting in reduced power and unstable inference when genetic effects are nonlinear or heterogeneously distributed across variants. To address this limitation, we propose a nonlinear high-dimensional inference framework for set-based genetic association analysis that integrates scalable kernel representations with valid statistical inference. The framework combines distance correlation-based sure independence screening to reduce ultra-high dimensional predictors, kernel principal component analysis with Nyström approximation for nonlinear feature extraction, and de-sparsified LASSO to enable asymptotically valid hypothesis testing in high dimensions, together with a two-stage omnibus testing strategy that adaptively aggregates evidence across complementary signal models. Extensive simulation studies demonstrate that the proposed method maintains well-calibrated Type I error and consistently achieves higher power than established set-based approaches, including Sequence Kernel Association Test and adaptive Sum of Powered Score test, particularly under nonlinear and heterogeneous genetic effect scenarios, while remaining competitive in linear settings. Application to Alzheimer’s Disease Neuroimaging Initiative data identifies gene-level associations with brain regional volumes that converge on neuronal excitability, calcium signaling, and cytoskeletal regulation, biological processes centrally implicated in neurodegeneration. Together, this work provides a robust and scalable framework for nonlinear set-based inference in genome-wide studies, expanding the analytical toolbox for dissecting complex genetic contributions to disease.
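
The first stage of the framework, distance correlation-based screening, can be illustrated with a small sketch. It computes the standard sample distance correlation from double-centered pairwise distance matrices and keeps the top-ranked predictors. The function names, the `keep` parameter, and the toy data are hypothetical; the later stages (kernel PCA with Nyström approximation, de-sparsified LASSO, omnibus testing) are not shown, and nothing here is taken from the authors' software.

```python
from math import sqrt


def _centered_distances(values):
    """Pairwise absolute distances, double-centered by row mean, column mean, and grand mean."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; returns 0.0 if either variable is constant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    denom = sqrt(dvar_x * dvar_y)
    return sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


def screen(features, y, keep):
    """Rank predictors by distance correlation with y and keep the top `keep` names."""
    ranked = sorted(features, key=lambda name: distance_correlation(features[name], y),
                    reverse=True)
    return ranked[:keep]


# Hypothetical toy data: the outcome depends on snp_a only through its square.
snp_a = [-2, -1, 0, 1, 2]
outcome = [v * v for v in snp_a]
features = {"snp_a": snp_a, "snp_b": [3, 3, 3, 3, 3]}
selected = screen(features, outcome, keep=1)
```

In this toy example the Pearson correlation between `snp_a` and `outcome` is exactly zero, while the distance correlation is clearly positive, which is the property that makes it suitable for screening under nonlinear effects.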

PMID:42202283 | DOI:10.1093/bib/bbag275

ceQTL: a co-expression QTL model to detect a variant that affects transcription factor binding and its target regulation

Brief Bioinform. 2026 May 4;27(3):bbag258. doi: 10.1093/bib/bbag258.

ABSTRACT

Expression quantitative trait locus (eQTL) mapping is used to identify the functional link between a genomic variant and a gene’s expression. A significant eQTL association does not imply a causal relationship or mechanism, and further investigation is needed to understand how a single-nucleotide polymorphism (SNP) impacts gene expression. One of the most plausible explanations for an eQTL is that a genomic variant affects transcription factor (TF) binding and thus alters the TF’s regulation of its target genes (TGs). However, the current eQTL model does not provide information on the TF or on how its regulation is mediated by the SNP’s genotypes. Here, we propose a new method, differential co-expression QTL (ceQTL) analysis among different alleles, which uses Chow statistics to specifically detect eQTLs that are bound by a particular TF. We start by building a trio of a TF, its TG, and a related SNP, and then test for a significant coefficient difference among the genotypes of the SNP. We applied this ceQTL model to simulated data and to lung tissue datasets from the Genotype-Tissue Expression (GTEx) project. The simulation results showed that the model robustly detected true ceQTLs across sample sizes and minor allele frequencies, as measured by area under the curve. Our tool also performed a TF binding affinity analysis to add another layer of evidence for functional interpretation. In summary, ceQTL analysis provides more interpretable biological insight into the mechanism of eQTLs and transcriptomic regulation, which should help us better understand how genomic variants affect phenotypes and diseases.
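
A Chow-type test for a coefficient difference across genotype groups can be sketched as below. This is the classical Chow F statistic generalized to several groups, applied to a simple regression of TG expression on TF expression; it is an assumption about the general idea and not the authors' ceQTL implementation. The function names and the toy expression values are hypothetical, and the p-value (which would come from an F distribution with the returned degrees of freedom) is left to a statistics library.

```python
def _rss(x, y):
    """Residual sum of squares of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((u - mean_x) ** 2 for u in x)
    if sxx == 0:
        raise ValueError("x must vary within each fitted group")
    slope = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return sum((v - intercept - slope * u) ** 2 for u, v in zip(x, y))


def chow_f(groups, k=2):
    """Chow-type F statistic for equal regression coefficients across groups.

    groups: one (tf_expression, tg_expression) pair of lists per genotype.
    k: coefficients per regression (intercept and slope, so 2 here).
    Returns (F, numerator df, denominator df).
    """
    pooled_x = [u for x, _ in groups for u in x]
    pooled_y = [v for _, y in groups for v in y]
    rss_pooled = _rss(pooled_x, pooled_y)          # one line for all genotypes
    rss_split = sum(_rss(x, y) for x, y in groups)  # one line per genotype
    df1 = (len(groups) - 1) * k
    df2 = len(pooled_x) - len(groups) * k
    return ((rss_pooled - rss_split) / df1) / (rss_split / df2), df1, df2


# Hypothetical toy data: TF expression and a fixed perturbation standing in for noise.
tf = [1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.1, -0.1, 0.05, -0.05, 0.0]


def make_group(slope):
    return tf, [slope * t + e for t, e in zip(tf, noise)]


f_same, _, _ = chow_f([make_group(1.0), make_group(1.0), make_group(1.0)])
f_diff, df1, df2 = chow_f([make_group(1.0), make_group(2.0), make_group(3.0)])
```

When the TF-TG relationship is identical in every genotype group, the pooled and per-group fits coincide and the statistic is near zero; when the slope changes with genotype, the per-group fits reduce the residual error sharply and the statistic becomes large.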

PMID:42202282 | DOI:10.1093/bib/bbag258

The impact of transcriptome assembly algorithms on downstream quantification in RNA-seq data analysis

Brief Bioinform. 2026 May 4;27(3):bbag267. doi: 10.1093/bib/bbag267.

ABSTRACT

Transcriptome assembly and quantification are crucial steps in the differential expression analysis of RNA-seq data. As transcriptome assembly precedes quantification, its results inevitably influence the outcomes of quantification. This study investigates the impact of transcriptome assembly algorithms on quantification outcomes in next-generation RNA-seq data analysis. From the perspective of quantification results, we evaluate the performance of transcriptome assembly algorithms. We assess the assembly quality and stability of three commonly used transcriptome assemblers (StringTie2, Scallop, and Cufflinks) on both simulated and real datasets. Our evaluation provides references for downstream analyses and identifies the most effective and stable pipeline, namely the combination of HISAT2 (for transcriptome alignment) and StringTie2 (for assembly). Furthermore, we compare simulated data generated by RNA-seq data simulation tools with real RNA-seq data and reveal that simulated data fails to fully capture the complexity of real data. Through this analysis, we identify transcript features associated with poor assembly and quantification performance, specifically highlighting two extreme cases: long, low-expression transcripts that are often overlooked and short transcripts that are prone to quantification errors. These findings offer valuable insights into future software development directions.

PMID:42202281 | DOI:10.1093/bib/bbag267

Detection of Microbehavior Intervals for Predicting Mental Health: Clinically Relevant and Advanced Multimodal Temporal Analysis

J Med Internet Res. 2026 May 27;28:e87049. doi: 10.2196/87049.

ABSTRACT

BACKGROUND: Health care workers (HCWs) face sustained psychological demands that place them at heightened risk for burnout and posttraumatic stress disorder (PTSD). However, assessing psychological distress in this population remains challenging because of stigma, underreporting, and the limitations of self-report tools. Although nonverbal behaviors such as facial expressions and gaze hold diagnostic promise, most approaches overlook the fine-grained, temporal fluctuations in these signals. In this study, we focused on microbehavior intervals (brief, involuntary changes in multimodal nonverbal signals) that emerge during emotion-eliciting interviews.

OBJECTIVE: This study aimed to determine whether microbehavior intervals improve the discrimination of psychological distress profiles among HCWs with symptoms of burnout and PTSD.

METHODS: HCWs participated in a semistructured interview that included 5 work-related, emotionally charged questions and that was recorded via Webex (online video platform). Participants also completed validated questionnaires for burnout (Maslach Burnout Inventory General Survey 9-item) and PTSD (PTSD checklist for Diagnostic and Statistical Manual, 5th edition). Recordings were analyzed with computer vision models to generate time-series data of facial expressions, head movement, gaze, body posture, and hand gestures. An unsupervised anomaly detection model (MOMENT [a Family of Open Time-Series Foundation Models]) isolated microbehavior intervals without requiring manual labels. Features derived from these intervals were used to train a deep learning classifier that predicted 4 symptom classes of psychological distress: “moderate-severe burnout,” “subthreshold-provisional PTSD,” “burnout+PTSD,” and “resilient.” We conducted an ablation study by systematically removing one behavioral data stream at a time. Finally, we conducted an explainability analysis to characterize the features driving model predictions.

RESULTS: We analyzed 258 interview recordings from 151 HCWs. Per interview, an average of 19.65 (SD 6.01) microbehavior intervals were detected, each lasting an average of 1.31 (SD 1.10) seconds. The classifier demonstrated robust performance across classes, achieving a macro F1-score of 0.75 and a macro area under the receiver operating characteristic curve of 0.80 on held-out data. Ablation analysis showed that excluding gaze or arousal-valence signals caused the largest performance declines, particularly in recall and F1-score. The explainability analysis revealed distinct temporal patterns across symptom classes, with irregularity and variability in microbehaviors emerging as key predictors.
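
For readers unfamiliar with the macro F1-score reported above, the following minimal sketch shows how it is computed for a multiclass problem: a per-class F1 is calculated from true positives, false positives, and false negatives, and the per-class values are averaged with equal weight. The function name, the shortened class labels, and the toy predictions are hypothetical and unrelated to the study data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical toy labels using shortened class names.
y_true = ["burnout", "burnout", "ptsd", "resilient"]
y_pred = ["burnout", "ptsd", "ptsd", "resilient"]
score = macro_f1(y_true, y_pred)
```

Because each class contributes equally regardless of its size, the macro average penalizes poor performance on rare symptom classes more than an accuracy figure would.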

CONCLUSIONS: Focusing on microbehavior intervals yields a scalable, interpretable, and annotation-free framework for detecting psychological distress from nonverbal signals. By moving from whole-video features to fine-grained multimodal temporal modeling, we successfully captured subtle, involuntary fluctuations in nonverbal responses to emotion-eliciting questions. This multimodal approach enables an objective, robust, and explainable assessment of psychological distress and offers a promising complement to conventional psychometric assessments.

PMID:42202278 | DOI:10.2196/87049

Cross-national differences in stroke management in the Baltic states: analysis within the Stroke Action Plan for Europe framework

Eur Stroke J. 2026 May 6;11(5):aakag050. doi: 10.1093/esj/aakag050.

ABSTRACT

INTRODUCTION: Although epidemiological studies often group the Baltic states together, they differ significantly in national stroke care legislation and infrastructure. Our study aimed to explore and compare the current state of stroke care in Lithuania, Latvia and Estonia.

PATIENTS AND METHODS: We analysed the Stroke Action Plan for Europe (SAP-E) Stroke Service Tracker data from 2022, including data from the respective National Health Insurance Funds and direct centre-level queries. Geographic Information System-based modelling assessed population access to stroke-ready hospitals within 1 h. Key metrics, including hospitalised stroke incidence, stroke unit admission, recanalisation therapy and in-hospital as well as 30-day mortality, were compared using Z-tests for proportions.
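
A two-sample Z-test for proportions of the kind named above can be sketched as follows, using the pooled standard error and a two-sided p-value from the standard normal distribution. The abstract does not state which variant of the test was used, so the pooled form is an assumption; the function name and the counts are hypothetical and are not taken from the SAP-E data.

```python
from math import erfc, sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z-test for proportions; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")
    z = (p1 - p2) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return z, erfc(abs(z) / sqrt(2))


# Hypothetical counts: 60 of 100 events in one country versus 40 of 100 in another.
z, p_value = two_proportion_z(60, 100, 40, 100)
```

With national registry sizes in the thousands, even small differences in proportions yield very small p-values, so the effect size is usually more informative than the p-value alone.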

RESULTS: The hospitalised stroke incidence per 100,000 inhabitants was similar in Lithuania (353) and Latvia (354), but lower in Estonia (246), despite similar population structures. Lithuania had the highest proportion of its population (94.0%) with access to a stroke-ready hospital within 1 h, followed by Latvia (87.1%) and Estonia (84.7%, P < .001). Estonia had the highest stroke unit admission rate and the lowest mortality rates for ischaemic stroke: 9.6% (in-hospital) and 15.0% (30-day). Endovascular treatment was most frequent in Lithuania (8.6% of all strokes, P < .001), while Estonia had the highest rate of intravenous thrombolysis (29.0%, P < .001).

CONCLUSIONS: Despite broadly comparable populations and formal SAP-E alignment, the Baltic states exhibit marked differences in stroke access, treatment and outcomes. High stroke unit admissions and high recanalisation rates in Estonia may be associated with lower ischaemic stroke mortality, underscoring the importance of system design beyond geographic coverage alone.

PMID:42202277 | DOI:10.1093/esj/aakag050

Exploring the Feasibility of an Examiner-Worn Neck-Mounted Camera for Objective Structured Clinical Examination Assessment: Pilot Feasibility Study

JMIR Med Educ. 2026 May 27;12:e87483. doi: 10.2196/87483.

ABSTRACT

BACKGROUND: The Objective Structured Clinical Examination (OSCE) is a prevalent method for evaluating clinical competence in medical education. As OSCEs become increasingly standardized and resource intensive, alternative evaluation methods are being explored, particularly because of the limited availability of certified examiners. However, few studies have investigated whether wearable technologies can support OSCE assessment. Wearable devices may provide a means of recording clinical skills from the examiner’s perspective.

OBJECTIVE: This pilot study, conducted in 2024, aimed to investigate the feasibility of using an examiner-worn neck-mounted camera for recording OSCE scenarios and to assess whether clinical performance could be evaluated from the recorded footage.

METHODS: In total, 9 experienced medical educators participated in a simulated OSCE scenario involving electrocardiogram lead placement. All participants completed the initial live assessment and the postuse questionnaire, while 8 of 9 (89%) participants completed the subsequent video-based reassessment. Video recordings from both a fixed camera and a neck-mounted camera (THINKLET) were used to assess the evaluability of each OSCE item. Following a washout period, evaluators reassessed the neck-mounted camera recordings by using the original checklist, while fixed-camera recordings were used to judge the evaluability of each item. Agreement between live and video-based scoring was analyzed using percent agreement and the Cohen κ coefficient. A postevaluation questionnaire captured evaluators’ experiences with the wearable device.
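
The two agreement measures named above, percent agreement and the Cohen κ coefficient, can be computed as in the sketch below. κ corrects the observed agreement for the agreement expected by chance given each rating's marginal category frequencies. The function names and the checklist scores are hypothetical (1 = item performed, 0 = not performed) and do not come from the study.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which the two ratings coincide."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = len(a)
    observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when both ratings use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical checklist scores for one evaluator: live assessment versus video reassessment.
live = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
video = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
agreement = percent_agreement(live, video)
kappa = cohen_kappa(live, video)
```

In this toy case the ratings agree on 80% of items, but because half of that agreement is expected by chance, κ is lower than the raw agreement, which is why both figures are usually reported together.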

RESULTS: Cohen κ ranged from 0.258 to 0.913 (mean 0.67, SD 0.20). Across checklist observations, more items were judged to be evaluable in the neck-mounted camera recordings than in the fixed-camera recordings, particularly for tasks requiring observation of fine motor skills. Evaluators reported generally positive experiences with the device, although some noted issues related to audio quality, comfort, posture restriction, and limited visibility at low angles.

CONCLUSIONS: Although further investigation is needed, this pilot study suggests that an examiner-worn neck-mounted camera may be a valuable supplementary assessment tool for selected OSCE tasks. Further work is needed to refine the device, standardize recording protocols, and clarify how it can best support review and verification alongside live evaluation.

PMID:42202275 | DOI:10.2196/87483

Reducing Mis-triage in Emergency Departments (RemEDy): Protocol for Improving Triage Accuracy Through Real-time Evaluation and Artificial Intelligence

JMIR Res Protoc. 2026 May 27;15:e92264. doi: 10.2196/92264.

ABSTRACT

BACKGROUND: Mis-triage represents a global concern, with reported rates ranging from 15% to 33%. Understanding its causes and contributing factors is essential for ensuring patient safety. Currently available studies have mainly focused on evaluating triage systems rather than investigating the human factors affecting triage performance. A major limitation in triage evaluation studies is the lack of standardized criteria to assess patient acuity and the absence of a clear consensus on how to measure triage accuracy. Most studies rely on retrospective data, which often fail to capture real-life clinical complexity. Therefore, the underlying causes and consequences of mis-triage remain only partially understood.

OBJECTIVE: This study aims to improve triage by defining the optimal triage evaluation process and identifying clinician-, patient-, and system-level factors that compromise its accuracy and safety.

METHODS: Reducing Mis-Triage in Emergency Departments (RemEDy) will be a 4-phase, mixed methods project conducted across 7 Swiss emergency departments. The first phase will focus on developing a standardized triage evaluation instrument, combining evidence from a scoping review of triage evaluation processes, workshops with triage clinicians using design thinking methodology, and a modified RAND/UCLA (Research and Development/University of California, Los Angeles) Delphi involving international experts and patient representatives. The second phase will prospectively implement this instrument in real time within a multicenter observational cohort study to evaluate triage performance; quantify mis-triage; and identify predictors at the patient level (eg, demographics), clinician level (eg, training), and system level (eg, crowding and length of stay). The third phase will focus on designing and validating an artificial intelligence-based decision support tool, applying multimodal models that integrate real-time triage data to enhance acuity prediction and minimize human error. The fourth phase will develop and evaluate a targeted training program, guided by the Capability, Opportunity, Motivation, and Behavior model, to strengthen triage accuracy and mitigate cognitive biases.

RESULTS: The project was funded by the Swiss National Science Foundation in March 2025 (grant 10004535). At submission, the scoping review is ongoing and expected to be completed in early 2026. Development and piloting of the triage evaluation instrument will take place in 2026. A multicenter cohort study is planned between October 2026 and June 2027. The intervention study is scheduled between October 2027 and December 2028. Final results are expected in 2029.

CONCLUSIONS: The RemEDy project addresses key limitations of current triage research, including the lack of standardized evaluation methods. By combining expert and clinician consensus; real-time assessment; and multilevel analysis of patient-, clinician-, and emergency department-level factors, RemEDy is expected to provide a more comprehensive understanding of mis-triage and its causes. RemEDy will establish a novel framework for real-time triage evaluation and inform the development of targeted training programs with the potential to improve triage accuracy, safety, and equity.

INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): PRR1-10.2196/92264.

PMID:42202274 | DOI:10.2196/92264