Anat Sci Educ. 2025 Sep 7. doi: 10.1002/ase.70120. Online ahead of print.
ABSTRACT
Educational materials that advocate whole-body donation must be accurate, easy to read, and transparent. Such materials are one potential response to a donation supply that is not keeping pace with educational demand, a shortfall that disrupts anatomy education programs. The use of AI technologies to supplement communications with prospective donors and next of kin deserves investigation to determine whether approaches based on large language models (LLMs) meet the common requirements for effective communication. This study contributes to the limited literature on LLM-supported communications by presenting a comparative quantitative benchmark and an adaptable evaluation framework. Five LLMs (ChatGPT-4o, Grok3.0, Claude4Sonnet, Gemini2.5 Flash, DeepSeekR1) were used to generate responses to six frequently asked questions about body donation in Turkish. Four anatomists evaluated accuracy, quality, readability, and vocabulary diversity. Differences between models were statistically analyzed. The two top-performing models, ChatGPT-4o and Grok3.0, achieved mean quality scores of 21.7 ± 2.8 and 21.0 ± 5.1 on a 25-point checklist, and 4.58 ± 0.88 and 4.25 ± 1.03 on a 5-point global quality scale, significantly outperforming the remaining three systems (p < 0.037). Both maintained a below-secondary-school reading level on two validated readability indices (scores ≥67.8 and ≥40.2). LLM-produced body donation materials (e.g., informational texts and FAQs) may help promote whole-body donation by providing accessible and reliable information, potentially streamlining the creation of first drafts and reducing staff workload. Given the sensitivity of donation decisions, ethical transparency, cultural sensitivity, and continuous human oversight are essential safeguards. LLM use for such purposes should therefore be subject to clear governance frameworks, regular expert audits, and publicly disclosed quality metrics.
PMID:40916067 | DOI:10.1002/ase.70120