Categories
Nevin Manimala Statistics

Lung nodule detection and potential impact on guideline-based management: a retrospective post-market evaluation of three commercial software systems

Eur Radiol. 2026 Jun 24. doi: 10.1007/s00330-026-12702-5. Online ahead of print.

ABSTRACT

OBJECTIVES: To evaluate three commercial AI software tools for pulmonary nodule detection and segmentation and to assess their impact on guideline-based management recommendations.

MATERIALS AND METHODS: A total of 740 CT and PET-CT studies from clinical routine were analyzed using three software tools (S1, S2, S3). We compared the total number of detected nodules and “actionable” nodules (per British Thoracic Society (BTS) definition). We further evaluated how measurement variations between tools affected hypothetical management according to Fleischner Society and BTS guidelines for incidental nodules.

RESULTS: The tools differed significantly in the total number of detections (S1: 1336; S2: 1060; S3: 1536; p < 0.001) and wrong findings (S1: 965; S2: 720; S3: 1169; p < 0.001). However, the detection of actionable nodules was comparable across all tools (S1: 375; S2: 341; S3: 373; p = 0.73). While no statistically significant differences were found in mean diameter or volume measurements, small absolute variations led to significant differences in management. Specifically, S2 triggered significantly more 1-year follow-up recommendations than S3 under BTS guidelines (p < 0.001). No significant management differences were observed when applying Fleischner Society guidelines.

CONCLUSION: While the three included AI tools show comparable performance in detecting actionable nodules, minor measurement variations significantly impact downstream management when using guidelines with narrow thresholds, such as the BTS criteria. Fleischner Society guidelines appear more robust to these inter-software variations.

KEY POINTS: Question: How do commercial software tools for pulmonary nodule detection perform in real-world settings and impact hypothetical management under BTS and Fleischner guidelines? Findings: Detection of actionable nodules was comparable across all tools, but small absolute measurement variations triggered significantly more 1-year follow-up recommendations under BTS guidelines. Clinical relevance: AI software can cause inconsistent BTS-based management due to narrow thresholds, while Fleischner criteria appear more stable. Frequent detection of benign lesions potentially poses a risk of overdiagnosis and overtreatment in standalone AI-based reporting.

PMID:42343062 | DOI:10.1007/s00330-026-12702-5

From Friends to Lovers: Understanding Motivations and Barriers in AI Companionship

Arch Sex Behav. 2026 Jun 24. doi: 10.1007/s10508-025-03375-0. Online ahead of print.

ABSTRACT

Artificial intelligence (AI) companions are increasingly used for social, emotional, and sometimes romantic fulfillment, raising questions about how people perceive and engage with these technologies. This study explored attitudes toward AI companionship and whether interviewer type (AI, human, or unmoderated) affects disclosure or engagement. A mixed-methods design was employed with 135 adult participants, who completed structured interviews including Likert scale items and open-ended questions about their views on AI companions. Most participants (55.6%) expressed hesitancy or resistance toward emotionally significant relationships with AI, although several reported openness to non-intimate or functional companionship such as friendship or mentoring. Motivations for engaging with AI included its non-judgment and trustworthiness, as well as its ability to provide emotional support and optimize daily life. Key barriers involved its lack of physicality, lack of humanity, incapacity to form an emotional connection, and privacy concerns. Participants in the unmoderated condition rated AI as a friend significantly higher than those in the human interviewer condition, and rated AI as equivalent to a human companion and as a potential romantic partner significantly higher than those in both the chatbot and human interviewer conditions. No statistically significant differences emerged across conditions in self-reported honesty. Participants’ engagement metrics were generally higher in the human and unmoderated interviews compared to AI. These findings offer methodological insights for research on sensitive topics and highlight the complex, context-dependent ways people relate to AI companionship. Implications for policy, clinical practice, and the design of future AI companion systems are discussed.

PMID:42343026 | DOI:10.1007/s10508-025-03375-0

How is Bias Learned in Medical Image Analysis Models? An Exploration of the Encoding of Demographic Information in Deep Learning Models Trained to Detect Abnormalities on Chest X-Rays

J Imaging Inform Med. 2026 Jun 24. doi: 10.1007/s10278-026-02073-0. Online ahead of print.

ABSTRACT

Deep learning models achieve strong diagnostic performance in medical imaging, yet often exhibit systematic performance disparities across demographic subgroups. Although prior work has shown that attributes such as age, sex and race are encoded within internal representations, it remains unclear how the structure of these representations contributes to subgroup-level differences in prediction behaviour. This study aims to examine how demographic information is embedded in chest X-ray classifiers and how latent-space structure relates to observed sensitivity disparities. We analysed two large-scale chest X-ray datasets, CheXpert and MIMIC-CXR, using DenseNet-121 models trained for multi-label disease classification. In addition to standard output-level evaluation, we conducted representation-level analyses using linear probes, embedding statistics and geometric measures to characterise subgroup differences in activation strength, latent-space proximity and model confidence. Disparities were assessed across age, race and sex by jointly examining feature encodings, logits, energy scores and true-positive rates. Demographic attributes showed limited direct association with disease labels and low standalone predictive utility, yet were strongly encoded within internal features. Younger and Black/African American patients consistently exhibited higher feature norms, greater separation in latent space and lower joint logit energy, despite comparable overall discrimination performance. These representational patterns persisted after accounting for label configuration and were associated with larger sensitivity gaps, consistent with structural suppression in which certain subgroups occupy sparser, lower-activation regions of the representation space. Sex-based differences were comparatively modest across representational and performance metrics. 
Subgroup disparities in chest X-ray classification are closely linked to how demographic groups are positioned and activated within latent space, rather than to directional misalignment alone. Representation-level diagnostics based on activation magnitude, density and energy provide mechanistic insight into model behaviour and highlight limitations of mitigation strategies that focus solely on feature removal or post hoc thresholding. These findings support the use of representation-level analysis as a principled component of fairness evaluation and mitigation design in clinical AI systems.

PMID:42343005 | DOI:10.1007/s10278-026-02073-0

Hierarchical endpoints and win statistics for geromedicine trials

Nat Aging. 2026 Jun 24. doi: 10.1038/s43587-026-01158-3. Online ahead of print.

ABSTRACT

Geroscience has advanced rapidly, yet its clinical translation remains limited. A central barrier is the lack of trial outcomes that capture the multidimensional effects of geroprotective interventions while meeting clinical and regulatory standards. Mortality is objective and regulatorily salient but often impractical. By contrast, surrogate measures of healthspan improve feasibility and may better reflect the quality of extended life, but they are generally considered soft endpoints that require further validation. Here, we propose hierarchical composite endpoints using time-to-worst-event analysis as a pragmatic and scientifically sound compromise. Participant pairs are compared using win statistics according to a prespecified clinical hierarchy, in which more severe and objective clinical events are prioritized, while health surrogates and biomarkers contribute information at lower tiers. When outcome selection, ordering and tie rules are clinically and mechanistically justified and agreed with regulators, this approach may improve geromedicine trial efficiency and allow overall treatment effects to be captured without compromising clinical priorities.

PMID:42342910 | DOI:10.1038/s43587-026-01158-3
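
The pairwise comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the tier names, tie margins, and participant data below are hypothetical, censoring is ignored for simplicity, and the three summaries (win ratio, win odds, net benefit) are the standard win statistics rather than anything specified in the abstract.

```python
from itertools import product

# Hypothetical hierarchy, most severe tier first:
# (field name, higher_is_better, tie margin). None of these come from the paper.
TIERS = [
    ("months_to_death", True, 0.0),
    ("months_to_hospitalization", True, 0.0),
    ("gait_speed_change", True, 0.05),
]


def compare_pair(treated, control, tiers=TIERS):
    """Return 1 if the treated participant wins, -1 if they lose, 0 if tied on every tier."""
    for field, higher_is_better, margin in tiers:
        diff = treated[field] - control[field]
        if not higher_is_better:
            diff = -diff
        if diff > margin:
            return 1
        if diff < -margin:
            return -1
        # Tied on this tier: fall through to the next, less severe tier.
    return 0


def win_statistics(treated_group, control_group, tiers=TIERS):
    """Compare every treated/control pair and summarise wins, losses, and ties."""
    wins = losses = ties = 0
    for treated, control in product(treated_group, control_group):
        result = compare_pair(treated, control, tiers)
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
        else:
            ties += 1
    total = wins + losses + ties
    odds_denominator = losses + 0.5 * ties
    return {
        "wins": wins,
        "losses": losses,
        "ties": ties,
        "win_ratio": wins / losses if losses else float("inf"),
        "win_odds": (wins + 0.5 * ties) / odds_denominator if odds_denominator else float("inf"),
        "net_benefit": (wins - losses) / total,
    }


# Hypothetical participants followed for 24 months (24 = no event observed).
treated_group = [
    {"months_to_death": 24, "months_to_hospitalization": 24, "gait_speed_change": 0.10},
    {"months_to_death": 24, "months_to_hospitalization": 12, "gait_speed_change": 0.00},
]
control_group = [
    {"months_to_death": 18, "months_to_hospitalization": 10, "gait_speed_change": 0.00},
    {"months_to_death": 24, "months_to_hospitalization": 24, "gait_speed_change": 0.08},
]

stats = win_statistics(treated_group, control_group)
print(stats)
```

Because a lower tier is consulted only when every higher tier is tied, a biomarker difference can never override a difference in mortality, which is the property the abstract emphasizes.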

Predicting the strength of waste aggregate concrete blocks using novel hybrid machine learning models and graphical user interface deployment

Sci Rep. 2026 Jun 25. doi: 10.1038/s41598-026-58662-0. Online ahead of print.

ABSTRACT

Concrete blocks made from waste aggregates have become a promising way to reduce waste and conserve natural resources while still offering good mechanical performance in both solid and hollow concrete blocks. This is especially relevant as more sustainable construction projects increasingly use recycled and alternative materials. This research develops a series of novel hybrid machine learning (ML) models to accurately predict compressive strength, using a dataset of 544 concrete samples from various sources. The six novel hybrid ML frameworks are Hybrid Stacked Ensemble (HSE), Hybrid Residual Learning (HRL), Hybrid Weighted Ensemble (HWE), Hybrid Meta-Learning (HML), Hybrid Bayesian Stacking (HBS), and Hybrid Feature Fusion (HFF). Results show that the novel HBS model delivers excellent predictive accuracy across all evaluation metrics, with R² = 0.998 and RMSE = 0.665 during training and R² = 0.987 and RMSE = 1.836 during testing. Furthermore, Individual Conditional Expectation (ICE) and SHapley Additive exPlanations (SHAP) analyses identified important input features and their effects on compressive strength. A graphical user interface (GUI) was developed to make the predictive models accessible for practical engineering.

PMID:42342892 | DOI:10.1038/s41598-026-58662-0
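
For readers unfamiliar with the two metrics reported above, the following minimal sketch computes R² and RMSE from their standard definitions. The strength values are made up for illustration and are unrelated to the 544-sample dataset; the function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_residual = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_total = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_residual / ss_total


# Hypothetical compressive strengths (measured vs. predicted), for illustration only.
measured = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 30.0]

print(r_squared(measured, predicted), rmse(measured, predicted))
```

RMSE is expressed in the units of the target variable, while R² is unitless, so a gap between training and testing values is easier to judge when both are reported together, as in the abstract.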

Patient-reported outcomes of laser hair removal for hidradenitis suppurativa: an exploratory cross-sectional survey

Lasers Med Sci. 2026 Jun 25;41(1):131. doi: 10.1007/s10103-026-04893-6.

ABSTRACT

PURPOSE: Hidradenitis suppurativa (HS) is a chronic, debilitating skin disease often requiring multimodal therapy. Laser hair removal (LHR) is an emerging treatment option, yet patient-centered data are limited. This study aimed to assess patient perspectives on the effectiveness, safety, motivations, and barriers associated with LHR for HS.

METHODS: An anonymous cross-sectional online survey was administered via REDCap (July-December 2024) to adults with HS living in the United States. Respondents reported prior treatments, LHR parameters, outcomes, adverse effects, and barriers. Descriptive statistics were used.

RESULTS: Of 110 participants who completed the survey (110/132, 83%), 24 (22%) had used LHR and comprised the analytic cohort, with a median of 8 LHR sessions (IQR 6-12). Leading motivations included reducing inflammation (92%), relieving pain (75%), and seeking durable treatment (71%). Highest median improvements (score 4, IQR 4-5) were in lumps/abscesses, swelling, and flare frequency. Other symptoms, including pain, odor, and quality of life, also showed moderate improvement. Half reported benefits lasting over 12 months. While biologics were perceived as most effective (median 4.5, IQR 3.5-5), LHR received one of the highest median scores among non-biologic options (3.5, IQR 3-5). Barriers included cost, insurance limitations, and low awareness; 63% paid over $1,000, and 38% discontinued early. Common adverse effects included discomfort (71%) and transient erythema (46%).

CONCLUSION: Most patients perceived LHR as beneficial for HS, but affordability and awareness remain barriers. Findings highlight the need for payer advocacy and additional trials defining LHR’s role in HS management.

PMID:42342886 | DOI:10.1007/s10103-026-04893-6

Impact of personalized coaching on the use of digital health interventions for movement therapy in rheumatology: a randomized controlled trial

Sci Rep. 2026 Jun 24;16(1):19582. doi: 10.1038/s41598-026-59770-7.

ABSTRACT

Spondyloarthropathies (SpA) are characterized by low back pain and limited mobility. Physical activity (PA) is therefore an essential part of treatment and has positive effects on clinical symptoms. Digital health applications (DHAs) present new opportunities to improve clinical outcomes; however, their long-term effectiveness is often limited by low adherence and high dropout rates. This study investigates whether integrating personalized or AI-driven coaching enhances the therapeutic benefits of DHAs in patients with SpA. Patients with SpA were randomized into one of three groups. They were instructed to exercise at least 2-3 times per week for 6 months with the DHA assigned to their group (intervention groups: ViViRA (with personal coaching) or Kaia Health (with AI-based coaching); control group: ViViRA (without coaching)). Personal coaching consisted of a one-time, 30-min online coaching session prior to using the DHA, while the AI coaching consisted of video-based AI integrated into the DHA to provide movement guidance during each session. Sociodemographic data, questionnaires, and mobility were assessed at baseline and after 3 and 6 months. Data from 78 participants were analyzed (mean age 51 years; 68% female). All three digital interventions showed a significant improvement in mobility (Bath Ankylosing Spondylitis Metrology Index (BASMI), range: 0-10, lower scores = better mobility; baseline to 3 months: mean BASMI change -0.6 to -0.7; all p < 0.001). Pain intensity decreased substantially in all arms (PainDETECT, neuropathic pain, range: 0-38, higher scores = more severe pain; baseline to 6 months: mean change -4.6 to -6.6 points; all p ≤ 0.006). PAHCO (Physical Activity-related Health Competence) control competence increased over time and reached statistical significance only in the ViViRA + coaching group (PAHCO: higher scores = better physical activity-related health competence; baseline to 6 months: +1.02, p = 0.013) but did not exceed the other interventions in a direct comparison.
Overall, none of the coaching strategies showed significant superiority over the stand-alone digital therapy. Adherence was the same in all groups after 3 months (DHA use 2-3 times per week). Digital movement therapy with a DHA improves mobility and pain independently of coaching in patients with SpA. In contrast, personal coaching improved health-related skills, which could indicate potential benefits for self-management and long-term treatment adherence.

Trial registration: The study is registered in the German clinical trial registry (DRKS) under the following ID: DRKS00035191, https://www.drks.de/search/de/trial/DRKS00035191/details, Registration date: 01.10.2024.

PMID:42342871 | DOI:10.1038/s41598-026-59770-7

Maternal and obstetric determinants of prematurity and term low birth weight in Afghanistan: a hospital-based case-control study

Sci Rep. 2026 Jun 24. doi: 10.1038/s41598-026-58012-0. Online ahead of print.

ABSTRACT

Prematurity and term low birth weight (LBW) are important contributors to neonatal morbidity and mortality in Afghanistan. Evidence from hospital-based studies in Herat remains limited, particularly studies that distinguish prematurity from term LBW. This study aimed to identify maternal, socioeconomic, obstetric, and pregnancy-related factors associated with prematurity and term LBW among newborns delivered in Herat, western Afghanistan. An unmatched hospital-based case-control study was conducted at Herat Midwifery Hospital from June 15 to September 15, 2023. The study included 176 premature infants, 84 term LBW infants, and 290 full-term normal-birth-weight controls. Prematurity was defined as birth before 37 completed weeks of gestation, and term LBW was defined as birth at ≥ 37 completed weeks with birth weight < 2500 g. Data were collected from hospital records and maternal interviews. Separate adjusted binary logistic regression models were used to estimate odds ratios (ORs) and 95% confidence intervals (CIs) for prematurity and term LBW compared with full-term normal-birth-weight controls. Statistical significance was set at p < 0.05. Prematurity was associated with medium perceived economic status (OR = 2.21, 95% CI: 1.04-4.73), bad perceived economic status (OR = 5.16, 95% CI: 2.06-12.91), preeclampsia (OR = 5.98, 95% CI: 1.72-20.76), pregnancy-related health problems (OR = 13.76, 95% CI: 5.32-35.61), substance use during pregnancy (OR = 2.88, 95% CI: 1.11-7.45), and cesarean section (OR = 3.37, 95% CI: 1.99-5.73). Term LBW was associated with medium perceived economic status (OR = 3.98, 95% CI: 1.29-12.30), bad perceived economic status (OR = 19.62, 95% CI: 5.09-75.63), pregnancy-related health problems (OR = 8.88, 95% CI: 2.46-32.02), and cesarean section (OR = 9.76, 95% CI: 4.85-19.65). Poor perceived economic status and pregnancy-related health problems were associated with both prematurity and term LBW. 
Preeclampsia and substance use were associated mainly with prematurity. Cesarean section should be interpreted as a marker of high-risk obstetric conditions rather than as a direct causal factor. These findings support strengthening antenatal risk detection, management of pregnancy complications, and targeted maternal health interventions in Herat.

PMID:42342870 | DOI:10.1038/s41598-026-58012-0
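
The odds ratios and confidence intervals above come from adjusted logistic regression, which cannot be reproduced from the abstract. As a rough illustration only, the sketch below shows the unadjusted 2x2-table calculation (Woolf's log method) on hypothetical counts, plus a consistency check that can be applied to any reported interval: because the interval is symmetric on the log scale, the point estimate should be close to the geometric mean of its bounds. The function names are illustrative.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Unadjusted odds ratio with a 95% CI from a 2x2 table (Woolf's log method)."""
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - Z_95 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + Z_95 * se_log_or)
    return odds_ratio, lower, upper


def point_estimate_from_ci(lower, upper):
    """Geometric mean of the CI bounds; should approximate the reported OR."""
    return math.sqrt(lower * upper)


# Hypothetical counts, not taken from the study.
odds_ratio, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))

# Consistency check on two intervals reported in the abstract.
print(round(point_estimate_from_ci(1.72, 20.76), 2))  # preeclampsia, reported OR = 5.98
print(round(point_estimate_from_ci(4.85, 19.65), 2))  # cesarean section (term LBW), reported OR = 9.76
```

Both reported intervals pass the check to within rounding, which is what one would expect from Wald-type intervals in a logistic model.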

Progression of hindfoot valgus and its association with foot- and ankle-related quality of life in patients with rheumatoid arthritis: a retrospective study from KURAMA cohort

Sci Rep. 2026 Jun 24. doi: 10.1038/s41598-026-55004-y. Online ahead of print.

ABSTRACT

This study aimed to clarify the impact of lower limb and hindfoot alignment, and of changes in that alignment, on foot- and ankle-related quality of life (QOL) over a 4-year period in patients with rheumatoid arthritis (RA). A total of 258 RA patients (516 feet) who underwent plain X-ray examination with hip-to-calcaneal (HC) view at baseline and at a 4-year follow-up, and who had Self-Administered Foot Evaluation Questionnaire (SAFE-Q) data at follow-up, were analyzed after excluding patients with prior lower limb surgery or severe ankle destruction (Larsen classification ≥ III or Takakura-Tanaka classification ≥ IIIa). Radiographic parameters representing lower limb and hindfoot alignment were measured using the HC view, including hip-knee-ankle angle (HKA), tibio-calcaneal angle (TCA), talar tilt angle (TTA), and the changes in these angles. Clinical and laboratory factors collected included age, sex, BMI, autoantibody titer and positivity, methotrexate (MTX) use, biologic and targeted synthetic disease-modifying antirheumatic drug (b/tsDMARD) use, cumulative glucocorticoid dose, and Clinical Disease Activity Index. The primary outcome was the association between clinical and radiographic factors and ankle-related QOL. A generalized linear mixed model (GLMM) was used for statistical analysis. The mean age was 62.4 years, 87.2% were female, and 89.9% were seropositive. Over 4 years, hindfoot valgus (TCA) progressed from 4.3° to 6.0°. The GLMM showed that age and cumulative glucocorticoid dose negatively affected QOL, while male sex, methotrexate dose, and b/tsDMARD use were positively associated. Among radiographic parameters, valgus progression of TCA was significantly associated with poorer SAFE-Q outcomes in the “Shoe-related” and “General Health Perception” domains. Baseline HKA predicted valgus progression of TCA, whereas higher BMI, male sex, and larger baseline TCA predicted varus progression.
Progressive hindfoot valgus deformity over 4 years, rather than static alignment, negatively impacts foot- and ankle-related QOL in RA patients, particularly in shoe-related function and general health perception. Baseline knee varus deformity predicts longitudinal hindfoot valgus progression.

PMID:42342815 | DOI:10.1038/s41598-026-55004-y

Processing efficiency predicts cognitive performance in aging

Sci Rep. 2026 Jun 24. doi: 10.1038/s41598-026-59021-9. Online ahead of print.

ABSTRACT

Cognitive decline is a central challenge of aging, with subtle early changes laying the foundation for broader difficulties later in life. One domain that is particularly challenging to capture with standard assessments is processing efficiency. Previous research has shown age-related differences in processing efficiency using redundant-target detection tasks, but it remains unclear whether individual differences in cognitive ability among older adults are associated with differences in processing efficiency. In the present study, 65 cognitively healthy older adults (aged 60-79) completed the Montreal Cognitive Assessment (MoCA) and a color-shape redundant-target detection task, from which we estimated resilience capacity (Rz), a processing efficiency metric that quantifies how well a system maintains its target processing speed in the presence of distractors, using Systems Factorial Technology (SFT). MoCA scores were significantly and positively correlated with the standardized resilience capacity summary, Rz (r = 0.35, 95% CI [0.12, 0.55], p = 0.004). This significant association persisted in a partial correlation analysis that controlled for age as a covariate (partial r = 0.35, p = 0.004). In a direct model comparison of four candidate processing-efficiency metrics (inverse efficiency scores (IES), redundancy gains (RG), mean RT of correct responses, and Rz), Rz was the strongest predictor of MoCA. Functional principal component analysis (fPCA) of R(t) identified a temporal component (PC2) on which individuals at the higher end of the MoCA distribution showed a later, more controlled rise in capacity, whereas those at the lower end showed earlier but less efficient processing.
Together, these findings indicate that processing efficiency, and specifically resilience capacity under distractor interference, is continuously related to cognitive performance in older adults and may reflect aspects of cognitive reserve not captured by global screening scores or summary-statistic-based efficiency measures.

PMID:42342801 | DOI:10.1038/s41598-026-59021-9
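
The correlation statistics in this abstract can be checked with two textbook formulas. The sketch below assumes the reported interval is a Fisher z-transformation interval (the abstract does not say how it was computed) and shows the first-order partial correlation formula used when controlling for a single covariate such as age. The correlations involving age are not reported, so the values passed to `partial_correlation` are placeholders.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson correlation via the Fisher z-transformation."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z from both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Reported values: r = 0.35 with n = 65 participants.
ci_lower, ci_upper = fisher_ci(0.35, 65)
print(round(ci_lower, 2), round(ci_upper, 2))

# Placeholder age correlations (not reported in the abstract).
print(partial_correlation(0.35, 0.0, 0.0))
```

Under that assumption the computed interval rounds to the reported [0.12, 0.55]. The partial correlation formula also shows why the adjusted and unadjusted coefficients can coincide: when the covariate is nearly uncorrelated with the two variables, the correction terms vanish.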