Psychometrika. 2026 May 12:1-22. doi: 10.1017/psy.2026.10109. Online ahead of print.
ABSTRACT
Large-scale assessment data typically include numerous variables, often affected by missing values. Motivated by the challenges arising in this framework, we extend the knockoffs method for selecting predictors to settings with missing values. Our proposal relies on a preliminary phase of multiple imputation (MI) of missing values. Each imputed dataset is then processed using a suitable knockoff filter. We evaluate the proposed method through simulation studies; it performs satisfactorily, with results consistent with those of a recently advocated state-of-the-art method. We apply the method to large-scale assessment data collected by INVALSI on test scores of Italian students in grade 5, including many background variables. This case study is challenging because most predictors have unordered categories, a setting not considered by traditional knockoff methods. In addition, some of the key predictors are affected by missing values. Our proposal to implement the knockoffs method within an MI framework is feasible, flexible, and effective.
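As an illustration of the two-phase idea described above (impute first, then filter each imputed dataset), the following is a minimal sketch and not the authors' implementation. It assumes that knockoff statistics W_j have already been computed separately on each imputed dataset, applies the standard knockoff+ threshold to each vector, and then pools the per-imputation selections by a simple majority vote. The function names, the `min_share` parameter, the toy W values, and the majority-vote pooling rule are all assumptions made for illustration; the abstract does not state how selections are combined across imputations.

```python
import numpy as np


def knockoff_plus_threshold(W, q):
    """Knockoff+ threshold: the smallest t among the nonzero |W_j| such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q."""
    W = np.asarray(W, dtype=float)
    candidates = np.unique(np.abs(W[W != 0]))  # np.unique returns sorted values
    for t in candidates:
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return float(t)
    return np.inf  # no threshold meets the target level: select nothing


def select_predictors(W, q):
    """Boolean mask of predictors selected at target FDR level q."""
    W = np.asarray(W, dtype=float)
    return W >= knockoff_plus_threshold(W, q)


def pool_selections(W_per_imputation, q, min_share=0.5):
    """ASSUMPTION (hypothetical pooling rule): keep a predictor if it is
    selected in at least `min_share` of the imputed datasets."""
    masks = np.array([select_predictors(W, q) for W in W_per_imputation])
    share = masks.mean(axis=0)  # selection frequency of each predictor
    return share, share >= min_share


# Toy statistics for 8 predictors on 3 imputed datasets (made-up numbers).
W_imputed = [
    [5.0, 4.0, 3.0, 2.5, 2.0, -0.5, 0.3, -0.2],
    [5.0, 4.0, 3.0, -0.1, 2.0, 1.5, 0.3, 1.0],
    [1.0, -1.0, 0.5, -0.5, 0.2, -0.2, 0.1, -0.1],
]
share, keep = pool_selections(W_imputed, q=0.25)
print("selection frequency:", np.round(share, 2))
print("pooled selection:   ", keep)
```

Note that a majority vote over per-imputation selections does not by itself carry over the false discovery rate guarantee of a single knockoff filter; it is shown here only to make the workflow concrete.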
PMID:42117181 | DOI:10.1017/psy.2026.10109