Bioinformatics. 2025 Nov 9:btaf617. doi: 10.1093/bioinformatics/btaf617. Online ahead of print.
ABSTRACT
MOTIVATION: Studies of microbial communities, represented by the relative abundances of taxa at various taxonomic levels, have underscored the significance of microbiota in numerous aspects of human health and disease. A pivotal challenge in microbiome research lies in pinpointing microbial taxa associated with disease outcomes, which could play crucial roles in prevention, detection, and treatment of various health conditions. Alongside these relative abundance data, taxonomic information sometimes offers a unique lens to explore the impact of shared evolutionary histories on patterns of microbial abundance.
RESULTS: We use the taxonomic tree structure to identify taxa associated with disease outcomes more flexibly. To improve selection accuracy, we introduce auxiliary knockoff copies of the microbiome features, which serve as noise. These copies make it possible to assess false positives during selection and to refine the selection accordingly. Extensive numerical simulations demonstrate that our methodology outperforms several existing methods in terms of selection accuracy. We further demonstrate the practicality of our approach by applying it to a widely used gut microbiome dataset, identifying microbial taxa linked to body mass index.
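The abstract does not spell out how the knockoff copies are turned into a false-positive assessment. As a rough illustration only, the sketch below shows the generic "knockoff+" thresholding rule from the general knockoff-filter literature. It is not taken from the TCVS code and may differ from what the authors do. The names `W` (one statistic per feature, large and positive when the real feature beats its knockoff copy), `q` (target false discovery rate), and both function names are assumptions introduced for this example.

```python
import numpy as np

def knockoff_plus_threshold(W, q):
    """Return the smallest t whose estimated false discovery proportion is <= q.

    Hypothetical helper: W[j] > 0 means the real feature j looked more
    important than its knockoff copy; W[j] < 0 means the knockoff won.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.unique(np.abs(W[W != 0]))  # np.unique returns sorted values
    for t in candidates:
        # Knockoff wins at level t estimate the number of false positives.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf  # no threshold meets the target, so nothing is selected

def knockoff_select(W, q):
    """Indices of features whose statistic reaches the knockoff+ threshold."""
    W = np.asarray(W, dtype=float)
    return np.flatnonzero(W >= knockoff_plus_threshold(W, q))

# Toy statistics, invented for illustration (not data from the paper).
W_demo = [5.0, 4.0, 3.5, 3.0, -0.5, 2.5, 0.2, -0.1]
print(knockoff_select(W_demo, q=0.25).tolist())  # prints [0, 1, 2, 3, 5]
```

In this toy run the threshold settles at 2.5: no knockoff statistic is at or below -2.5 and five real features are at or above 2.5, so the estimated false discovery proportion is (1 + 0) / 5 = 0.2, which is within the 0.25 target.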
AVAILABILITY AND IMPLEMENTATION: TCVS R code is available at https://github.com/Yicong1225/TCVS.
PMID:41206954 | DOI:10.1093/bioinformatics/btaf617