Comput Inform Nurs. 2025 Aug 1. doi: 10.1097/CIN.0000000000001351. Online ahead of print.
ABSTRACT
Artificial intelligence, including chatbots that produce outputs in response to user prompts, is poised to revolutionize nursing education and program evaluation. This article summarizes a pilot evaluation of three core features of a chatbot system: (1) defining a persona or system prompt, (2) creating specific task prompts, and (3) developing an improvement mechanism. First, an “act as” system prompt was written to generate “Future-FLO” based on professional values. Second, a task prompt was written to generate outputs specific to program evaluation. Third, a separate ImproverBot was created to provide structured assessments of chatbot outputs. Human raters, knowledgeable about grant goals, provided feedback on AI-generated outputs produced under the FLO and non-FLO system prompts. The ImproverBot rated the same outputs as the human raters on accuracy, completeness, and usefulness. Statistical tests were used to compare ratings across AI versions and rater types. The custom prompt provided no added benefit over the standard model, as rated by both human raters and the ImproverBot. The ImproverBot gave significantly higher ratings than the human raters did. Comments indicated that outputs were error-filled and unreliable. Engaging with artificial intelligence to support the work of grant evaluation will require ongoing efforts to develop chatbots and evaluate their outputs.
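For illustration, the three features described in the abstract (persona/system prompt, task prompt, and a separate rating bot) map onto a short prompt-chaining script. The sketch below is a minimal, hypothetical reconstruction assuming an OpenAI-style chat-completions API; the prompt texts, model name, rating scale, and function names are placeholders supplied here for clarity, not the study's actual materials.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
MODEL = "gpt-4o"   # assumed model; the abstract does not name one

# (1) Persona / "act as" system prompt (wording is a placeholder).
FLO_SYSTEM_PROMPT = (
    "Act as Future-FLO, a nursing program-evaluation assistant grounded "
    "in professional nursing values. Respond precisely and note limitations."
)

# (3) A separate improver persona that returns structured ratings on the
# same three criteria the human raters used.
IMPROVER_SYSTEM_PROMPT = (
    "Act as ImproverBot. Rate the text you are given on accuracy, "
    "completeness, and usefulness (1-5 each) and briefly justify each score."
)

def run_task(task_prompt: str, system_prompt: str | None = None) -> str:
    """Run a task prompt, optionally under a custom system prompt."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": task_prompt})
    resp = client.chat.completions.create(model=MODEL, messages=messages)
    return resp.choices[0].message.content

# (2) Task prompt specific to program evaluation (placeholder wording).
task = "Draft an evaluation plan outline for a nursing workforce grant."

flo_output = run_task(task, FLO_SYSTEM_PROMPT)  # custom-persona condition
base_output = run_task(task)                    # standard-model condition

# (3) ImproverBot rates both outputs, paralleling the human raters.
for label, output in [("FLO", flo_output), ("non-FLO", base_output)]:
    rating = run_task(f"Rate this output:\n\n{output}", IMPROVER_SYSTEM_PROMPT)
    print(label, "->", rating)

Keeping the ImproverBot as a separate system prompt, rather than folding rating criteria into the task prompt, mirrors the article's separation of output generation from output assessment.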
PMID:40763202 | DOI:10.1097/CIN.0000000000001351