
Using AI-Based Virtual Simulated Patients for Training in Psychopathological Interviewing: Cross-Sectional Observational Study

JMIR Med Educ. 2025 Dec 23;11:e78857. doi: 10.2196/78857.

ABSTRACT

BACKGROUND: Virtual simulated patients (VSPs) powered by generative artificial intelligence (GAI) offer a promising tool for training clinical interviewing skills; yet, little is known about how different system- and user-level variables shape students’ perceptions of these interactions.

OBJECTIVE: We aimed to study psychology students’ perceptions of GAI-driven VSPs and to examine how demographic factors, system parameters, and interaction characteristics influence these perceptions.

METHODS: We recorded a total of 1832 interactions between 156 psychology students and 13 GAI-generated VSPs configured with varying temperature settings (0.1, 0.5, 0.9). For each student, we collected age and sex; for each interview, we recorded interview length (total number of question-answer turns), number of connectivity failures, the specific VSP consulted, and the model temperature. After every interview, students provided a global rating on a 1-10 scale and open-ended comments on strengths and areas for improvement. At the end of the training sequence, they also reported perceived improvement in diagnostic ability. Statistical analyses assessed the influence of demographics, interaction-level data, and GAI temperature setting on global ratings. Sentiment analysis was conducted to evaluate the VSPs’ clinical realism.

RESULTS: Female students rated the tool significantly higher (mean rating 9.25/10) than male students (mean rating 8.94/10; Kruskal-Wallis test, H=8.7; P=.003). In contrast, no significant correlation was found between global rating and age (r=0.02, 95% CI -0.03 to 0.06; P=.42), interview length (r=0.04, 95% CI -0.2 to 0.10; P=.18), or frequency of participation (Kruskal-Wallis test, H=4.62; P=.20). A moderate negative correlation emerged between connectivity failures and ratings (r=-0.26, 95% CI -0.41 to -0.10; P=.002). Temperature settings significantly influenced ratings (Kruskal-Wallis test, H=6.93; P=.03; η²=0.02), with higher scores at temperature 0.9 compared with 0.1 (Dunn’s test, P=.04). Regarding learning outcomes, self-perceived improvement in diagnostic ability was reported by 94% (94/100) of students; however, final practical examination scores (mean 6.67, SD 1.42) did not differ significantly from those of the previous cohort without VSP training (mean 6.42, SD 1.56). Sentiment analysis indicated predominantly negative sentiment in GAI responses (median negativity 0.8903, IQR 0.306-0.961), consistent with clinical realism.
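
For readers unfamiliar with the Kruskal-Wallis test and its η² effect size reported above, the following is a minimal sketch in plain Python of how such values can be computed. The ratings are hypothetical and are not the study's data; the function names are illustrative; and the effect size formula (H − k + 1)/(n − k) is one common rank-based definition, assumed here because the abstract does not state which formula the authors used.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    counts = Counter(pooled)

    # Assign each distinct value the mean of the rank positions it occupies.
    rank_of = {}
    position = 0
    for value in sorted(counts):
        c = counts[value]
        rank_of[value] = position + (c + 1) / 2
        position += c

    # H = 12 / (n (n + 1)) * sum(R_j^2 / n_j) - 3 (n + 1)
    rank_sum_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    tie_term = sum(c ** 3 - c for c in counts.values())
    return h / (1 - tie_term / (n ** 3 - n))


def eta_squared_h(h, k, n):
    """Rank-based eta squared from H (assumed formula): (H - k + 1) / (n - k)."""
    return (h - k + 1) / (n - k)


# Hypothetical 1-10 ratings grouped by temperature setting (NOT the study's data).
ratings_by_temperature = {
    0.1: [7, 8, 8, 9, 7, 8],
    0.5: [8, 9, 8, 9, 9, 8],
    0.9: [9, 10, 9, 10, 9, 10],
}

groups = list(ratings_by_temperature.values())
h = kruskal_wallis_h(groups)
eta2 = eta_squared_h(h, k=len(groups), n=sum(len(g) for g in groups))
print(f"H = {h:.2f}, eta squared = {eta2:.2f}")
```

A significant H only indicates that at least one group differs; a post hoc procedure such as Dunn's test, as used in the study, is then needed to identify which pairs of temperature settings differ.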

CONCLUSIONS: GAI-driven VSPs were well received by psychology students, with student gender and system-level variables (particularly temperature settings and connection stability) shaping user evaluations. Although participants perceived the training as beneficial for their diagnostic skills, objective examination performance did not differ significantly from that of the previous cohort. However, the lack of randomization limits the generalizability of these results, and further experiments are required.

PMID:41433050 | DOI:10.2196/78857

By Nevin Manimala
