Enhanced adaptive permutation test with negative binomial distribution in genome-wide omics datasets

Genes Genomics. 2024 Nov 6. doi: 10.1007/s13258-024-01584-w. Online ahead of print.

ABSTRACT

BACKGROUND: The permutation test has been widely used to provide the p-values of statistical tests when the standard test statistics do not follow parametric null distributions. However, the permutation test may require huge numbers of iterations, especially when the detection of very small p-values is required for multiple testing adjustments in the analysis of datasets with a large number of features.

OBJECTIVE: To overcome this computational burden, we suggest a novel enhanced adaptive permutation test that estimates p-values using the negative binomial (NB) distribution. By the method, the number of permutations are differently determined for individual features according to their potential significance.

METHODS: In detail, the permutation procedure stops, when test statistics from the permuted dataset exceed the observed statistics from the original dataset by a predefined number of times. We showed that this procedure reduced the number of permutations especially when there were many insignificant features. For significant features, we enhanced the reduction with Stouffer’s method after splitting datasets.

RESULTS: From the simulation study, we found that the enhanced adaptive permutation test dramatically reduced the number of permutations while keeping the precision of the permutation p-value within a small range, when compared to the ordinary permutation test. In real data analysis, we applied the enhanced adaptive permutation test to a genome-wide single nucleotide polymorphism (SNP) dataset of 327,872 features.

CONCLUSION: We found the analysis with the enhanced adaptive permutation took a feasible time for genome-wide omics datasets, and successfully identified features of highly significant p-values with reasonable confidence intervals.

PMID:39503929 | DOI:10.1007/s13258-024-01584-w

By Nevin Manimala