
Use of Multiple-Select Multiple-Choice Items in a Dental Undergraduate Curriculum: Retrospective Application of Different Scoring Methods

JMIR Med Educ. 2023 Feb 25. doi: 10.2196/43792. Online ahead of print.

ABSTRACT

BACKGROUND: Scoring and awarding credit on multiple-select items are more complex than on single-choice items. Forty-one different scoring methods were retrospectively applied to two multiple-select multiple-choice item types (Pick-N and Multiple-True-False [MTF] items) from existing exam data.
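The abstract does not enumerate the 41 scoring methods, so the following sketch only illustrates, under assumed rules, two common ways of scoring a multiple-select item: all-or-nothing (dichotomous) scoring and proportional partial credit. The functions and the example item are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: two common scoring rules for a multiple-select item.
# These definitions are assumptions for demonstration, not the study's methods.

def score_all_or_nothing(selected: set, correct: set) -> float:
    """Full credit only if the marked options exactly match the answer key."""
    return 1.0 if selected == correct else 0.0

def score_partial_credit(selected: set, correct: set, options: set) -> float:
    """Proportion of options judged correctly (marked if true, unmarked if false)."""
    judged_correctly = sum(
        1 for opt in options if (opt in selected) == (opt in correct)
    )
    return judged_correctly / len(options)

# Example: an MTF-style item with 4 options and key {A, C};
# the examinee marks one false option (B) in error.
options = {"A", "B", "C", "D"}
correct = {"A", "C"}
marked = {"A", "B", "C"}

print(score_all_or_nothing(marked, correct))            # 0.0
print(score_partial_credit(marked, correct, options))   # 0.75
```

Under all-or-nothing scoring a single wrong mark forfeits all credit, whereas partial credit still rewards the correctly judged options, which is one reason different scoring methods can yield widely different examination scores for the same responses.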

OBJECTIVE: This study aimed to calculate and compare mean scores for both item types by applying different scoring methods, and to investigate the effect of item quality on mean raw scores and the likelihood of resulting scores at or above pass-level (≥0.6).

METHODS: Items and responses from examinees (ie, marking events) were retrieved from previous examinations. Different scoring methods were retrospectively applied to the existing exam data to calculate corresponding examination scores. In addition, item quality was assessed using a validated checklist. Statistical analysis was performed using the Kruskal-Wallis test, the Wilcoxon rank-sum test, and multiple logistic regression analysis (P<.05).
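As a rough illustration of how such analyses could be run, the sketch below applies a Kruskal-Wallis test, a Wilcoxon rank-sum test, and a logistic regression to a synthetic table of marking events in Python (SciPy and statsmodels). The column names (score, scoring_method, has_cues, item_type, passed) and the data are assumptions for demonstration only, not the study's dataset or model specification.

```python
# Minimal sketch of the reported analyses on hypothetical, randomly generated data.
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "score": rng.uniform(0, 1, 300),                      # raw item score per marking event
    "scoring_method": rng.choice(["A", "B", "C"], 300),   # assumed method labels
    "has_cues": rng.choice([0, 1], 300),                  # item contains cues (yes/no)
    "item_type": rng.choice(["PickN", "MTF"], 300),
})
df["passed"] = (df["score"] >= 0.6).astype(int)           # score at or above pass level

# Kruskal-Wallis test: do scores differ across scoring methods?
groups = [g["score"].values for _, g in df.groupby("scoring_method")]
print(stats.kruskal(*groups))

# Wilcoxon rank-sum test: items with vs without cues
print(stats.ranksums(df.loc[df.has_cues == 1, "score"],
                     df.loc[df.has_cues == 0, "score"]))

# Multiple logistic regression: odds of a score at or above the pass level
model = smf.logit("passed ~ C(item_type) + has_cues", data=df).fit(disp=0)
print(np.exp(model.params))   # exponentiated coefficients as odds ratios
```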

RESULTS: In total, 1931 marking events on 48 Pick-N items and 828 marking events on 18 MTF items were analyzed. For both item types, scoring results differed widely between scoring methods (min 0.02, max 0.98; P<.001). Both the use of an inappropriate item type (n=34 items) and the presence of cues (n=30 items) affected the scoring results: Pick-N items used inappropriately resulted in lower mean raw scores (0.88 vs 0.93; P<.001), whereas inappropriately used MTF items resulted in higher mean raw scores (0.88 vs 0.85; P=.001). MTF items with cues resulted in higher mean raw scores than items without cues (0.91 vs 0.80; P<.001), whereas mean raw scores from Pick-N items with and without cues did not differ (0.89 vs 0.90; P=.09). Item quality also affected the likelihood of scores at or above the pass level (OR≤6.977).

CONCLUSIONS: Educators should exercise care when using multiple-select multiple-choice items and select the most appropriate item type. Different item types, scoring methods, and the presence of cues are likely to affect examinees’ scores and overall examination results.

PMID:36841970 | DOI:10.2196/43792
