Comput Biol Chem. 2026 Mar 27;123:109039. doi: 10.1016/j.compbiolchem.2026.109039. Online ahead of print.
ABSTRACT
Disruption of transcription termination (DoTT) occurs when RNA polymerase II reads past a gene’s normal 3′ end, generating downstream “readthrough” RNA. DoTT has been reported under stresses such as viral infection and metabolic perturbation. However, many existing detection tools analyze samples one at a time or rely on rigid downstream windows, limiting direct condition-to-condition testing. We present DoTT-ML, a condition-aware pipeline for detecting transcription termination disruption from conventional short-read RNA-seq. The pipeline extends gene annotations downstream by a tunable window, applies an optional gap to reduce termination-proximal noise, and performs differential analysis between conditions using a robust statistical workflow. An optional machine learning step provides post hoc prioritization when curated reference annotations are available. We benchmarked DoTT-ML against ARTDeco and DoGFinder across three public datasets: influenza A virus total RNA-seq, HSV-1 nascent 4sU-RNA, and HSV-1 Z-RNA RIP-seq. DoTT-ML performed comparably to, or better than, existing tools (high ROC AUC across datasets). Finally, in an in-house mouse high-carbohydrate diet (HCD) liver model, DoTT-ML identified diet-associated readthrough events at metabolic genes. Experimental validation confirmed a stable readthrough transcript at the Scd1 locus under dietary stress, serving as a proof of principle for the pipeline’s biological relevance. Overall, DoTT-ML provides a practical framework for condition-aware readthrough detection and comparison across diverse RNA-seq assays.
PMID:41930502 | DOI:10.1016/j.compbiolchem.2026.109039
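
The annotation-extension step described in the abstract (a tunable downstream window, offset by an optional gap) can be illustrated with a minimal, strand-aware sketch. This is not the DoTT-ML implementation: the names `Gene`, `Region`, and `downstream_window`, the 0-based half-open coordinate convention, and the default window of 5000 bp are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are 0-based, half-open (BED-like)."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


class Region(NamedTuple):
    """Hypothetical downstream region in the same coordinate convention."""
    chrom: str
    start: int
    end: int
    strand: str


def downstream_window(
    gene: Gene,
    window: int = 5000,            # assumed default, not taken from the paper
    gap: int = 0,                  # optional offset past the annotated 3' end
    chrom_length: Optional[int] = None,
) -> Optional[Region]:
    """Return the region `gap` bp past the gene's 3' end, spanning `window` bp.

    Returns None when clipping at a chromosome boundary leaves no bases.
    """
    if window <= 0 or gap < 0:
        raise ValueError("window must be positive and gap non-negative")

    if gene.strand == "+":
        # The 3' end is the right edge, so downstream lies at higher coordinates.
        start = gene.end + gap
        end = start + window
        if chrom_length is not None:
            end = min(end, chrom_length)
    elif gene.strand == "-":
        # The 3' end is the left edge, so downstream lies at lower coordinates.
        end = gene.start - gap
        start = max(end - window, 0)
    else:
        raise ValueError(f"unknown strand: {gene.strand!r}")

    if end <= start:
        return None
    return Region(gene.chrom, start, end, gene.strand)


if __name__ == "__main__":
    plus_gene = Gene("geneA", "chr1", 1000, 2000, "+")
    minus_gene = Gene("geneB", "chr1", 10000, 12000, "-")
    print(downstream_window(plus_gene, window=5000, gap=500))
    print(downstream_window(minus_gene, window=5000, gap=500))
```

In a sketch like this, reads would then be counted in each downstream region per sample and passed to a differential analysis between conditions. A real pipeline would likely also need to handle windows that overlap neighboring genes, which this example does not attempt.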