Metabolomics. 2023 Nov 30;20(1):2. doi: 10.1007/s11306-023-02065-z.
INTRODUCTION: In metabolomics, investigating associations between the metabolome and a trait of interest is a key research question. However, statistical analyses of such associations are often challenging. Statistical tools that enable robust verification and clear presentation of results are therefore highly desirable.
OBJECTIVES: Our aim is to contribute to the statistical analysis of metabolomics data by offering a widely applicable open-source statistical workflow that accounts for the intrinsic complexity of such data.
METHODS: We combined selected R packages tailored to the properties of heterogeneous metabolomics datasets, in which metabolite parameters typically (i) are analyzed in different matrices, (ii) are measured on different analytical platforms with different precision, (iii) are analyzed by targeted as well as non-targeted methods, (iv) are on different scales, (v) show heterogeneous variances, (vi) may be correlated, (vii) may have only a few values or values below a detection limit, or (viii) may be incomplete.
RESULTS: The complete code is freely available. The workflow output consists of a table of metabolites associated with a trait of interest and a compact plot for high-quality visualization of the results. The output and its utility are demonstrated by applying the workflow to two previously published datasets: one from our own lab and one taken from the MetaboLights repository.
CONCLUSION: The robustness and benefits of the statistical workflow were clearly demonstrated, and anyone can directly reuse it to analyze their own data.