A modified expectation-maximization algorithm for accelerated item response theory model estimation with large datasets

Behav Res Methods. 2026 Apr 21;58(5):133. doi: 10.3758/s13428-026-02996-0.

ABSTRACT

The expectation-maximization (EM) algorithm is widely used for parameter estimation in item response theory (IRT) modeling. However, when applied to datasets with large numbers of individuals and items, the standard EM algorithm can be slow to converge, with computationally expensive E-steps. We propose a modified EM algorithm to accelerate estimation for unidimensional two-parameter logistic IRT models. The modified algorithm uses a two-stage structure with partial-step updating over data subsets to reduce convergence time, while maintaining comparable accuracy and precision. The first two simulation studies evaluated its performance relative to standard EM, focusing on convergence time, parameter recovery, and standard error estimation across varying subset sizes and item counts. The third study demonstrated its scalability and runtime advantage in a large-scale testing scenario involving one million respondents and 100 items. The fourth study evaluated robustness under departures from unidimensionality. The proposed algorithm showed time reductions under smaller subsets with 36 items, consistent reductions across all subset sizes with 54 items, and the largest reduction (60%) with 40-item forms constructed from 100 items, while maintaining comparable estimation performance. These results highlight the algorithm’s potential for large-scale applications involving tens of thousands of respondents and moderate-to-large item pools, with modifications that can be integrated into existing EM routines. We conclude with directions for operational use and potential extensions to multidimensional IRT estimation.
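
To make the cost of the E-step concrete, the following is a minimal Python sketch of a standard Bock-Aitkin style E-step for a unidimensional two-parameter logistic model, evaluated once on a full simulated sample and once on a random subset of respondents. It is not the authors' modified algorithm: the abstract does not specify the two-stage structure or the partial-step update rule, so none of that is reproduced here. The function name `e_step_2pl`, the 21-point quadrature grid, the sample size, the subset size, and the starting values are all assumptions chosen for illustration.

```python
import numpy as np

def e_step_2pl(responses, a, b, nodes, weights):
    """Standard E-step for a unidimensional 2PL model (illustrative sketch).

    responses: (N, J) array of 0/1 item scores
    a, b:      (J,) current discrimination and difficulty estimates
    nodes:     (Q,) quadrature points on the latent trait
    weights:   (Q,) prior weights at those points (sum to 1)
    Returns the expected number of respondents at each node, n_q (Q,),
    and the expected number of correct responses per item and node, r_jq (J, Q).
    """
    # p[j, q]: probability of a correct response to item j at node q
    logits = a[:, None] * (nodes[None, :] - b[:, None])
    p = 1.0 / (1.0 + np.exp(-logits))
    p = np.clip(p, 1e-12, 1.0 - 1e-12)  # guard the logarithms

    # Log-likelihood of every response pattern at every node: (N, Q).
    # This matrix product is the part whose cost grows with N * J * Q.
    loglik = responses @ np.log(p) + (1.0 - responses) @ np.log1p(-p)

    # Posterior weight of each node for each respondent, computed stably
    log_post = loglik + np.log(weights)[None, :]
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    n_q = post.sum(axis=0)      # expected counts per node
    r_jq = responses.T @ post   # expected correct counts per item and node
    return n_q, r_jq

# Simulated data (all values below are assumptions for illustration)
rng = np.random.default_rng(0)
N, J, Q = 2000, 36, 21
a_true = rng.uniform(0.8, 2.0, J)
b_true = rng.normal(0.0, 1.0, J)
theta = rng.normal(0.0, 1.0, N)
prob = 1.0 / (1.0 + np.exp(-a_true * (theta[:, None] - b_true)))
responses = (rng.random((N, J)) < prob).astype(float)

# Equally spaced grid with normalized standard-normal weights
nodes = np.linspace(-4.0, 4.0, Q)
weights = np.exp(-0.5 * nodes**2)
weights /= weights.sum()

# Starting values for the item parameters
a0, b0 = np.ones(J), np.zeros(J)

# E-step on the full sample versus on a random subset of respondents
n_q, r_jq = e_step_2pl(responses, a0, b0, nodes, weights)
subset = rng.choice(N, size=500, replace=False)
n_q_sub, r_jq_sub = e_step_2pl(responses[subset], a0, b0, nodes, weights)
```

The work in each E-step is dominated by the (N, J) by (J, Q) matrix products, so a pass over 500 respondents costs roughly a quarter of a pass over 2,000. That proportionality is the general reason subset-based updating can shorten run time; how the proposed algorithm schedules and combines such subset passes is described in the paper itself, not in this sketch.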

PMID:42014633 | DOI:10.3758/s13428-026-02996-0

By Nevin Manimala