Brief Bioinform. 2023 Sep 22;24(6):bbad369. doi: 10.1093/bib/bbad369.
Inference of gene regulatory networks (GRNs) from gene expression profiles has been a central problem in systems biology and bioinformatics over the past decades. The rapid emergence of single-cell RNA sequencing (scRNA-seq) data brings new opportunities and challenges for GRN inference: in particular, the extensive dropouts and complicated noise structure may degrade the performance of contemporary gene regulatory models. Thus, there is an urgent need for more accurate GRN inference methods for single-cell data that also account for the noise structure. In this paper, we extend the traditional structural equation modeling (SEM) framework with a flexible noise modeling strategy: we use Gaussian mixtures to approximate the complex stochastic nature of a biological system, since a Gaussian mixture can arguably serve as a universal approximator for any continuous distribution. The proposed non-Gaussian SEM framework, called NG-SEM, can be optimized by iterating between the Expectation-Maximization algorithm and the weighted least-squares method. Moreover, the Akaike Information Criterion is adopted to select the number of components of the Gaussian mixture. To probe the accuracy and stability of the proposed method, we design a comprehensive variety of control experiments to systematically investigate the performance of NG-SEM under various conditions, on both simulated and real biological data sets. Results on synthetic data demonstrate that this strategy can improve on the performance of the traditional Gaussian SEM, and results on real biological data sets verify that NG-SEM outperforms five other state-of-the-art methods.
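To make the optimization loop concrete, the following is a minimal sketch of the idea for a single target gene, not the authors' implementation. It assumes a linear model y = Xb + e whose noise e follows a K-component Gaussian mixture, alternates an EM update of the mixture with a weighted least-squares update of b, and picks K by the Akaike Information Criterion. The function names (`fit_gene`, `select_by_aic`), the quantile-based initialization, the variance floor of 1e-6, and the parameter count d + 3K - 1 are all assumptions made for illustration.

```python
import numpy as np


def _e_step(resid, pi, mu, sigma2):
    """Return (log-likelihood, responsibilities) of residuals under the mixture."""
    diff = resid[:, None] - mu[None, :]
    log_p = np.log(pi) - 0.5 * np.log(2.0 * np.pi * sigma2) - 0.5 * diff**2 / sigma2
    top = log_p.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    log_norm = top[:, 0] + np.log(np.exp(log_p - top).sum(axis=1))
    return log_norm.sum(), np.exp(log_p - log_norm[:, None])


def fit_gene(X, y, n_components, n_iter=300, tol=1e-8):
    """Fit y = X b + e with Gaussian-mixture noise (hypothetical helper).

    X holds the candidate regulators (no intercept column: the mixture
    means absorb any offset), y holds the target gene's expression.
    """
    n, d = X.shape
    k = n_components
    b = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least-squares start
    resid = y - X @ b
    pi = np.full(k, 1.0 / k)
    mu = np.quantile(resid, (np.arange(k) + 0.5) / k)  # spread means over residuals
    sigma2 = np.full(k, resid.var() + 1e-6)
    prev = -np.inf
    for _ in range(n_iter):
        resid = y - X @ b
        loglik, r = _e_step(resid, pi, mu, sigma2)  # E-step
        if abs(loglik - prev) < tol:
            break
        prev = loglik
        # M-step for the mixture parameters, given the current residuals.
        nk = r.sum(axis=0) + 1e-12
        pi = nk / nk.sum()
        mu = (r * resid[:, None]).sum(axis=0) / nk
        sigma2 = (r * (resid[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        # Weighted least squares for b: each cell is weighted by the expected
        # precision of its noise component, and y is shifted by the expected mean.
        w = (r / sigma2).sum(axis=1)
        shift = (r * mu / sigma2).sum(axis=1)
        b = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y - shift))
    loglik, _ = _e_step(y - X @ b, pi, mu, sigma2)
    n_params = d + 3 * k - 1  # d coefficients, k means, k variances, k-1 weights
    return {
        "coef": b,
        "weights": pi,
        "means": mu,
        "variances": sigma2,
        "loglik": loglik,
        "aic": 2 * n_params - 2 * loglik,
        "n_components": k,
    }


def select_by_aic(X, y, max_components=3):
    """Fit K = 1..max_components and keep the fit with the lowest AIC."""
    fits = [fit_gene(X, y, k) for k in range(1, max_components + 1)]
    return min(fits, key=lambda fit: fit["aic"])
```

With K = 1 the sketch reduces to an ordinary Gaussian regression, so comparing its AIC against larger K gives a rough check of whether the mixture noise model is warranted for a given gene. Repeating the fit with each gene in turn as the target would yield one row of regulatory coefficients per gene; how those coefficients are thresholded into network edges is left open here.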