
Practical utility of sequence-to-omics models for improving the reproducibility of genetic fine-mapping

bioRxiv [Preprint]. 2026 Feb 6:2026.02.04.703796. doi: 10.64898/2026.02.04.703796.

ABSTRACT

Recent advances in deep learning have led to the development of sequence-to-omics (S2O) models that predict molecular phenotypes directly from DNA sequences. Here, we systematically evaluate the utility of these models, including AlphaGenome, Borzoi, Enformer, and Sei, for improving the reproducibility of genetic fine-mapping across expression quantitative trait loci (eQTL) datasets from the Genotype-Tissue Expression (GTEx), Trans-Omics Precision Medicine (TOPMed), and Multi-Ancestry Analysis of Gene Expression (MAGE) projects. We show that purely statistical fine-mapping often yields high replication failure rates (RFRs), whereas integrating S2O model predictions substantially reduces RFRs and improves the accuracy with which SNPs that replicate in other consortia are prioritized. We describe a generalized framework for functionally informed fine-mapping that combines traditional posterior inclusion probabilities (PIPs) from statistical fine-mapping methods with scores from S2O models to generate functionally informed PIPs (fiPIPs) that improve reproducibility. Our findings demonstrate that S2O models, particularly newer ones such as AlphaGenome and Borzoi, enable robust identification of replicated variants across consortia, highlighting their promise for scalable, functionally aware genetic mapping.
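
The abstract does not give the formula used to combine PIPs with S2O scores, nor the exact definition of the RFR. The following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that S2O scores are turned into prior weights with a softmax, that fiPIPs are obtained by reweighting PIPs within a locus and renormalizing to the locus's original total mass, and that a replication failure is a variant with PIP at or above 0.9 in the discovery dataset but below 0.1 in the replication dataset. The function names, the `temperature` parameter, and both thresholds are hypothetical.

```python
import math


def functionally_informed_pips(pips, s2o_scores, temperature=1.0):
    """Reweight the PIPs of one locus by S2O-derived prior weights.

    Assumption (not from the source): weights are a softmax of the S2O
    scores, and the reweighted values are rescaled so that the locus keeps
    its original total posterior mass.
    """
    if len(pips) != len(s2o_scores):
        raise ValueError("pips and s2o_scores must have the same length")
    if not pips:
        return []
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(s2o_scores)
    weights = [math.exp((s - top) / temperature) for s in s2o_scores]
    reweighted = [p * w for p, w in zip(pips, weights)]
    total = sum(reweighted)
    if total == 0:
        return list(pips)
    scale = sum(pips) / total
    return [r * scale for r in reweighted]


def replication_failure_rate(discovery, replication, high=0.9, low=0.1):
    """Fraction of confident discovery variants that fail to replicate.

    discovery, replication: dicts mapping variant ID to PIP.
    Assumption (not from the source): only variants present in both
    datasets are counted; thresholds `high` and `low` are illustrative.
    """
    confident = [v for v, p in discovery.items()
                 if p >= high and v in replication]
    if not confident:
        return float("nan")
    failures = sum(1 for v in confident if replication[v] < low)
    return failures / len(confident)
```

Under these assumptions, two variants with equal PIPs are separated only by their S2O scores, and comparing `replication_failure_rate` computed on PIPs with the same quantity computed on fiPIPs would indicate whether the functional reweighting helps in a given pair of datasets.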

PMID:41676556 | PMC:PMC12889643 | DOI:10.64898/2026.02.04.703796

By Nevin Manimala
