Data Mining Meets Machine Learning: A Novel ANN-based Multi-Body Interaction Docking Scoring Function (MBI-Score) based on Utilizing Frequent Geometric and Chemical Patterns of Interfacial Atoms in Native Protein-Ligand Complexes

Mol Inform. 2022 Feb 9. doi: 10.1002/minf.202100248. Online ahead of print.

ABSTRACT

Accurate prediction of binding poses is crucial to structure-based drug design. We employ two powerful artificial intelligence (AI) approaches, data mining and machine learning, to design an artificial neural network (ANN)-based pose-scoring function. It is a simple machine-learning-based statistical function that employs frequent geometric and chemical patterns of interacting atoms at protein-ligand interfaces. The patterns are derived by mining the interfaces of “native” protein-ligand complexes. Each interface is represented by a graph in which nodes are atoms and edges connect protein-ligand interfacial atoms located within a certain cutoff distance of each other. Applying frequent subgraph mining to these interfaces yields “native” frequent patterns of interacting atoms. Subsequently, given a pose for a protein-ligand complex of interest, the pose-scoring function (the information-processing unit, or neuron) calculates the degree of matching between the interaction patterns present at the pose’s interface and the native frequent patterns. The pose-scoring function takes into account the frequency of occurrence of the matching native patterns, the size of the match, and the degree of geometrical similarity between the pose-specific and matching native frequent patterns. This novel “multi-body interaction” pose-scoring function (MBI-Score) was validated using two databases, PDBbind and Astex-85, and it outperformed seven commonly used commercial scoring functions. MBI-Score is available at www.khashanlab.org/mbi-score.
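
The interface graph described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the atom records, the 4.0 Å cutoff, the function names (`interface_edges`, `edge_patterns`, `toy_score`), and the native frequency table are all assumptions made for illustration. In particular, the sketch replaces frequent subgraph mining with the simplest possible pattern, a single edge labelled by its two atom types, and it ignores the match-size and geometric-similarity terms that the abstract mentions.

```python
from collections import Counter
from itertools import product
from math import dist

# Assumed cutoff in angstroms; the abstract does not state the value used.
CUTOFF = 4.0


def interface_edges(protein, ligand, cutoff=CUTOFF):
    """Return (protein_index, ligand_index, distance) for each
    protein-ligand atom pair that lies within the cutoff.

    Atoms are hypothetical (atom_type, (x, y, z)) tuples.
    """
    edges = []
    for (i, (_, p_xyz)), (j, (_, l_xyz)) in product(
        enumerate(protein), enumerate(ligand)
    ):
        d = dist(p_xyz, l_xyz)
        if d <= cutoff:
            edges.append((i, j, d))
    return edges


def edge_patterns(protein, ligand, cutoff=CUTOFF):
    """Count interface edges by their (protein type, ligand type) label.

    A stand-in for frequent subgraph mining: only one-edge patterns.
    """
    return Counter(
        (protein[i][0], ligand[j][0])
        for i, j, _ in interface_edges(protein, ligand, cutoff)
    )


def toy_score(pose_patterns, native_frequencies):
    """Sum the native frequency of every pattern occurrence in the pose."""
    return sum(
        count * native_frequencies.get(pattern, 0.0)
        for pattern, count in pose_patterns.items()
    )


# Made-up coordinates, for illustration only.
protein = [("N", (0.0, 0.0, 0.0)), ("O", (3.0, 0.0, 0.0)), ("C", (10.0, 0.0, 0.0))]
ligand = [("O", (0.0, 3.0, 0.0)), ("C", (3.0, 3.5, 0.0))]

# Made-up native frequencies, for illustration only.
native_frequencies = {("N", "O"): 0.5, ("O", "C"): 0.25}

pose_patterns = edge_patterns(protein, ligand)
score = toy_score(pose_patterns, native_frequencies)
```

In this toy pose only two atom pairs fall within the cutoff, so the score is the sum of the two corresponding native frequencies. A closer approximation of the described method would mine multi-atom subgraphs from many native complexes and weight each match by its size and geometric similarity.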

PMID:35142086 | DOI:10.1002/minf.202100248

By Nevin Manimala
