ATTED-II v11: A Plant Gene Coexpression Database Using a Sample Balancing Technique by Subagging of Principal Components

Plant Cell Physiol. 2022 Mar 30:pcac041. doi: 10.1093/pcp/pcac041. Online ahead of print.

ABSTRACT

ATTED-II (https://atted.jp) is a gene coexpression database for nine plant species based on publicly available RNAseq and microarray data. One of the challenges in constructing condition-independent coexpression data from publicly available gene expression data is managing the inherent sampling bias. Here, we report ATTED-II version 11, in which we adopted a coexpression calculation methodology that balances the samples using principal component analysis and ensemble calculation. This approach has two advantages. First, omitting principal components with low contribution rates removes the main contributors of noise. Second, balancing the large differences in contribution rates allows the full range of sample conditions to be taken into account. In addition, based on the RNAseq- and microarray-based coexpression data, we provide species-representative, integrated coexpression information to make inter-species comparison of the coexpression data more efficient. These coexpression data are provided as standardized z-scores to facilitate integrated analysis with different data sources. We believe that with these improvements, ATTED-II is more valuable and powerful for supporting inter-species comparative studies and integrated analyses using heterogeneous data.
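The abstract does not give implementation details, so the following is only a rough sketch of how the two ideas might fit together, not the ATTED-II pipeline itself. It assumes a genes x samples expression matrix, uses an SVD for the principal component analysis, drops the low-contribution components, treats the retained components with equal weight, and averages gene-gene Pearson correlations over random subsets of components (subagging). The function name, the parameters `keep_fraction`, `n_rounds` and `subsample_fraction`, and the choice to standardize over all off-diagonal gene pairs are assumptions made for illustration.

```python
import numpy as np


def subagged_pc_coexpression(expr, keep_fraction=0.5, n_rounds=20,
                             subsample_fraction=0.5, seed=0):
    """Illustrative sketch only (hypothetical helper, not ATTED-II code).

    expr: 2-D array, genes in rows and samples in columns.
    Returns a genes x genes matrix of z-scored coexpression values
    with NaN on the diagonal.
    """
    rng = np.random.default_rng(seed)

    # Center each gene across samples before the decomposition.
    centered = expr - expr.mean(axis=1, keepdims=True)

    # PCA via SVD: centered = U @ diag(s) @ Vt, components sorted by s.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)

    # Assumption: drop the components with the lowest contribution rates.
    n_keep = max(2, int(np.ceil(keep_fraction * len(s))))

    # Assumption: use U without the singular values, so every retained
    # component carries equal weight regardless of its contribution rate.
    scores = u[:, :n_keep]

    # Subagging: average correlations over random subsets of components.
    n_sub = max(2, int(np.ceil(subsample_fraction * n_keep)))
    n_genes = expr.shape[0]
    total = np.zeros((n_genes, n_genes))
    for _ in range(n_rounds):
        idx = rng.choice(n_keep, size=n_sub, replace=False)
        total += np.corrcoef(scores[:, idx])
    mean_corr = total / n_rounds

    # Assumption: standardize against all off-diagonal gene pairs.
    off_diagonal = ~np.eye(n_genes, dtype=bool)
    values = mean_corr[off_diagonal]
    z = (mean_corr - values.mean()) / values.std()
    np.fill_diagonal(z, np.nan)
    return z
```

Calling `subagged_pc_coexpression(expr)` on a small matrix returns a symmetric table in which larger z-scores indicate stronger coexpression relative to the other gene pairs. Putting every dataset on such a common scale is the kind of property that makes values from different data sources easier to combine.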

PMID:35353884 | DOI:10.1093/pcp/pcac041

By Nevin Manimala