Three Sources of Validation Evidence Needed to Evaluate the Quality of Generated Test Items for Medical Licensure

Teach Learn Med. 2022 Sep 14:1-11. doi: 10.1080/10401334.2022.2119569. Online ahead of print.

ABSTRACT

Issue: Automatic item generation is a method for creating medical items using an automated, technological solution. Automatic item generation is a contemporary method that can scale the item development process for production of large numbers of new items, support building of multiple forms, and allow rapid responses to changing medical content guidelines and threats to test security. The purpose of this analysis is to describe three sources of validation evidence that are required when producing high-quality medical licensure test items to ensure evidence for valid test score inferences, using the automatic item generation methodology for test development. Evidence: Generated items are used to make inferences about examinees’ medical knowledge, skills, and competencies. We present three sources of evidence required to evaluate the quality of the generated items that is necessary to ensure the generated items measure the intended knowledge, skills, and competencies. The sources of evidence we present here relate to the item definition, the item development process, and the item quality review. An item is defined as an explicit set of properties that include the parameters, constraints, and instructions used to elicit a response from the examinee. This definition allows for a critique of the input used for automatic item generation. The item development process is evaluated using a validation table, whose purpose is to support verification of the assumptions related to model specification made by the subject-matter expert. This table provides a succinct summary of the content and constraints that were used to create new items. The item quality review is used to evaluate the statistical quality of the generated items, which often focuses on the difficulty and the discrimination of the correct and incorrect options. Implications: Automatic item generation is an increasingly popular item development method. The generated items from this process must be bolstered by evidence to ensure the items measure the intended knowledge, skills, and competencies. The purpose of this analysis is to describe these sources of evidence that can be used to evaluate the quality of the generated items. The important role of medical expertise in the development and evaluation of the generated items is highlighted as a crucial requirement for producing validation evidence.

PMID:36106359 | DOI:10.1080/10401334.2022.2119569

By Nevin Manimala