Kernel-smoothed permutation for extreme p-value estimation in genetic association studies

Genetics. 2026 May 11:iyag119. doi: 10.1093/genetics/iyag119. Online ahead of print.

ABSTRACT

In genetic association studies, permutation tests are a cornerstone of p-value estimation, because researchers may design new test statistics that lack a known closed-form null distribution, or the assumptions of a well-established test may not hold. However, the number of permutations required grows in inverse proportion to the p-value being estimated. In genome-wide association studies, where multiple-testing corrections are routinely applied, the p-values of interest are extremely small, so the required number of permutations can exceed the available computational resources. Existing methods that reduce the required number of permutations all assume a specific form of the test statistic in order to exploit its statistical properties. We propose Kernel-smoothed permutation, a model-free method applicable to any statistic. Our tool forms the null distribution of the test statistic using a kurtosis-driven transformation followed by kernel density estimation (KDE). We compared Kernel-smoothed permutation with Naïve permutation using statistics with known closed-form null distributions. For three test statistics frequently used in association studies, namely the t-test, the sequence kernel association test (SKAT), and the chi-squared test, our method reduced the required number of permutations by an order of magnitude with similar or higher accuracy. In a real-world genome-wide association study (GWAS) analysis of a Crohn’s disease cohort, we further confirmed that our method substantially outperforms Naïve permutation.
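
The contrast between the two estimators can be illustrated with a minimal Python sketch. It is not the authors' implementation: the kurtosis-driven transformation is omitted because the abstract does not specify it, the function names are hypothetical, and the smoothing step is assumed to be a plain Gaussian KDE with a normal-reference bandwidth. The naive estimate can never fall below 1/(B+1) for B permutations, whereas the smoothed estimate integrates the fitted density beyond the observed statistic and so can report smaller tail probabilities.

```python
import math
import random
import statistics


def mean_difference(values, labels):
    """Illustrative test statistic: absolute difference in group means."""
    group1 = [v for v, g in zip(values, labels) if g == 1]
    group0 = [v for v, g in zip(values, labels) if g == 0]
    return abs(statistics.fmean(group1) - statistics.fmean(group0))


def permuted_statistics(values, labels, n_perm, rng):
    """Recompute the statistic under n_perm random shuffles of the labels."""
    shuffled = list(labels)
    stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        stats.append(mean_difference(values, shuffled))
    return stats


def naive_p_value(observed, null_stats):
    """Empirical tail fraction with the usual +1 correction."""
    exceed = sum(s >= observed for s in null_stats)
    return (exceed + 1) / (len(null_stats) + 1)


def kde_p_value(observed, null_stats):
    """Upper-tail mass of a Gaussian KDE fitted to the permuted statistics.

    Assumption: normal-reference bandwidth h = 1.06 * sd * n^(-1/5).
    The paper's kurtosis-driven transformation is NOT applied here.
    """
    n = len(null_stats)
    h = 1.06 * statistics.stdev(null_stats) * n ** (-1 / 5)
    # Each kernel contributes its Gaussian survival function at `observed`.
    tail = sum(
        0.5 * math.erfc((observed - s) / (h * math.sqrt(2)))
        for s in null_stats
    )
    return tail / n
```

A typical call would compute `null = permuted_statistics(values, labels, 1000, random.Random(0))` once and pass it to both `naive_p_value` and `kde_p_value`. How well the smoothed tail extrapolates depends heavily on the shape of the null distribution, which is presumably what the transformation step in the paper is meant to address.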

PMID:42114112 | DOI:10.1093/genetics/iyag119

By Nevin Manimala