A general framework of nonparametric feature selection in high-dimensional data

Biometrics. 2022 Mar 22. doi: 10.1111/biom.13664. Online ahead of print.

ABSTRACT

Nonparametric feature selection for high-dimensional data is an important and challenging problem in statistics and machine learning. Most existing feature selection methods focus on parametric or additive models, which may suffer from model misspecification. In this paper, we propose a new framework to perform nonparametric feature selection for both regression and classification problems. Under this framework, we learn prediction functions through empirical risk minimization over a reproducing kernel Hilbert space (RKHS). The space is generated by a novel tensor product kernel that depends on a set of parameters determining the importance of the features. Computationally, we minimize the empirical risk with a penalty to estimate the prediction function and the kernel parameters simultaneously. The solution can be obtained by iteratively solving convex optimization problems. We study the theoretical properties of the kernel feature space and prove the oracle selection property and Fisher consistency of our proposed method. Finally, we demonstrate the superior performance of our approach compared to existing methods via extensive simulation studies and applications to two real studies. This article is protected by copyright. All rights reserved.
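
The abstract does not give the form of the kernel, the penalty, or the update rules, so the following is only a rough illustration of the general idea, not the authors' method. It assumes a Gaussian product kernel with one nonnegative weight per feature, squared-error loss, a ridge penalty on the prediction function, and an L1 penalty on the weights. It alternates a kernel ridge solve (convex in the coefficients) with a single proximal gradient step on the weights, which stands in for the paper's convex subproblems. All names (`weighted_kernel`, `fit_weights`, `lam_ridge`, `lam_l1`, `step`) are hypothetical.

```python
import numpy as np


def weighted_kernel(X, Z, w):
    """Product over features of exp(-w_j * (x_j - z_j)^2).

    Assumed form, not taken from the paper. A weight w_j = 0 makes the
    kernel ignore feature j entirely.
    """
    diff2 = (X[:, None, :] - Z[None, :, :]) ** 2  # shape (n, m, p)
    return np.exp(-diff2 @ w)                     # shape (n, m)


def fit_weights(X, y, lam_ridge=0.1, lam_l1=0.02, step=0.5, n_iter=50):
    """Alternate between the coefficients alpha and the feature weights w."""
    n, p = X.shape
    w = np.full(p, 0.5)                           # arbitrary starting weights
    diff2 = (X[:, None, :] - X[None, :, :]) ** 2  # shape (n, n, p)
    for _ in range(n_iter):
        K = np.exp(-diff2 @ w)
        # Step 1: with w fixed, kernel ridge regression is convex in alpha.
        alpha = np.linalg.solve(K + n * lam_ridge * np.eye(n), y)
        # Step 2: gradient of (1/n)||K alpha - y||^2 + lam_ridge * alpha' K alpha
        # with respect to w, holding alpha fixed.
        resid = K @ alpha - y
        G = (2.0 / n) * np.outer(resid, alpha) + lam_ridge * np.outer(alpha, alpha)
        grad = -np.einsum("ab,ab,abj->j", G, K, diff2)
        # Proximal step for the L1 penalty under the constraint w >= 0.
        w = np.maximum(w - step * grad - step * lam_l1, 0.0)
    return w, alpha
```

Under these assumptions, features whose weights are driven to exactly zero drop out of the kernel, which is how selection would be read off. New points could be predicted with `weighted_kernel(X_new, X, w) @ alpha`. The tuning values here are placeholders and would need to be chosen by cross-validation or a similar procedure.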

PMID:35318639 | DOI:10.1111/biom.13664

By Nevin Manimala
