Bioinformatics. 2025 Jul 14:btaf397. doi: 10.1093/bioinformatics/btaf397. Online ahead of print.
ABSTRACT
MOTIVATION: DNA methylation plays important roles in various cellular physiological processes in bacteria. Nanopore sequencing can identify different types of DNA methylation directly from individual bacteria. However, existing methods for identifying bacterial methylomes perform inconsistently across methylation motifs and do not fully exploit the information that nanopore signals carry at different scales.
RESULTS: We propose a deep-learning method, called Nanoident, for de novo detection of DNA methylation types and methylated base positions in bacteria using nanopore sequencing. For each targeted motif sequence, Nanoident uses five different features, including statistical features extracted from both the nanopore raw signals and the basecalling results of the motif. All five features are processed by a multi-scale neural network, which extracts information from different receptive fields of the features. Leave-One-Out Cross Validation (LOOCV) on a dataset containing 7 bacterial samples with 46 methylation motifs shows that Nanoident achieves a ∼10% improvement in accuracy over the previous method. Furthermore, Nanoident achieves a ∼13% improvement in accuracy on an independent dataset containing 12 methylation motifs. Additionally, we optimize the pipeline for de novo methylation motif enrichment, enabling the discovery of novel methylation motifs.
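The LOOCV protocol mentioned above can be illustrated with a minimal Python sketch. It is not Nanoident's code. It assumes that the held-out unit is one bacterial sample per fold, which the abstract does not state explicitly, and it substitutes a toy majority-class classifier for the multi-scale network. The names leave_one_group_out, loocv_accuracy, fit_majority and predict_majority, and the toy records, are hypothetical.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx), holding out each distinct group once."""
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        train_idx = sorted(
            i for group, idxs in by_group.items() if group != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def loocv_accuracy(records, fit, predict):
    """records: list of (group, features, label). Return per-fold and mean accuracy."""
    groups = [rec[0] for rec in records]
    per_fold = {}
    for held_out, train_idx, test_idx in leave_one_group_out(groups):
        # Train only on the groups that are not held out in this fold.
        model = fit([records[i] for i in train_idx])
        correct = sum(predict(model, records[i][1]) == records[i][2] for i in test_idx)
        per_fold[held_out] = correct / len(test_idx)
    mean_acc = sum(per_fold.values()) / len(per_fold)
    return per_fold, mean_acc


def fit_majority(train):
    """Toy stand-in model (assumption): remember the most frequent training label."""
    counts = defaultdict(int)
    for _, _, label in train:
        counts[label] += 1
    return max(sorted(counts), key=counts.get)


def predict_majority(model, features):
    """Ignore the features and return the remembered label."""
    return model


# Hypothetical toy data: (sample name, feature vector, methylation type).
records = [
    ("sample_A", [0.1], "6mA"),
    ("sample_A", [0.2], "6mA"),
    ("sample_B", [0.3], "5mC"),
    ("sample_B", [0.4], "6mA"),
    ("sample_C", [0.5], "6mA"),
]

per_fold, mean_acc = loocv_accuracy(records, fit_majority, predict_majority)
print(per_fold, round(mean_acc, 3))
```

Grouping by sample keeps motifs from the held-out bacterium out of training, so each fold's accuracy reflects performance on an unseen organism. If the held-out unit were instead a single motif, the same functions would apply with the motif identifier used as the group key.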
AVAILABILITY AND IMPLEMENTATION: The source code of Nanoident is freely available at https://github.com/cz-csu/Nanoident and https://doi.org/10.6084/m9.figshare.29252264.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
PMID:40658463 | DOI:10.1093/bioinformatics/btaf397