Benchmarking ML in ADMET predictions: the practical impact of feature representations in ligand-based models

J Cheminform. 2025 Jul 21;17(1):108. doi: 10.1186/s13321-025-01041-0.

ABSTRACT

This study focuses on predicting Absorption, Distribution, Metabolism, Excretion, and Toxicology (ADMET) properties and addresses key challenges of ML models trained on ligand-based representations. We propose a structured approach to feature selection, going a step beyond the conventional practice of combining different representations without systematic reasoning. We also strengthen model evaluation by integrating cross-validation with statistical hypothesis testing, which adds a layer of reliability to model assessments. Our final evaluations include a practical scenario in which models trained on one data source are evaluated on a different one. This approach aims to bolster the reliability of ADMET predictions by providing more dependable and informative model evaluations.

Scientific contribution

This study provides a structured approach to feature selection. We improve model evaluation by combining cross-validation with statistical hypothesis testing, making results more reliable. The methodology can be generalized beyond feature selection, boosting confidence in the selected models, which is crucial in a noisy domain such as ADMET prediction. Additionally, we assess how well models trained on one dataset perform on another, offering practical insights for using external data in drug discovery.
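To make the idea of pairing cross-validation with a statistical hypothesis test concrete, here is a minimal Python sketch. It is an illustration only, not the authors' procedure: the abstract does not say which test was used, so an exact paired sign-flip permutation test is assumed here, and the function name `paired_sign_flip_test` and the per-fold scores are hypothetical.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_test(scores_a, scores_b):
    """Exact two-sided paired permutation (sign-flip) test on per-fold scores.

    Assumption: both models were scored on the same cross-validation folds,
    so the scores can be compared fold by fold.
    Returns (mean difference, p-value).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must be scored on the same folds")

    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(mean(diffs))

    # Under the null hypothesis (no difference between the models), the sign
    # of each per-fold difference is arbitrary. Enumerate all 2**n sign
    # assignments, which is feasible for the small n typical of k-fold CV.
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        flipped = abs(mean([s * d for s, d in zip(signs, diffs)]))
        if flipped >= observed - 1e-12:  # small tolerance for float noise
            extreme += 1
        total += 1

    return mean(diffs), extreme / total


# Hypothetical per-fold scores for two feature representations (made-up values).
fold_scores_a = [0.81, 0.79, 0.83, 0.80, 0.82]
fold_scores_b = [0.78, 0.77, 0.80, 0.79, 0.78]

mean_diff, p_value = paired_sign_flip_test(fold_scores_a, fold_scores_b)
print(f"mean difference = {mean_diff:.3f}, p = {p_value:.4f}")
```

With only five folds, the smallest two-sided p-value this exact test can return is 2/2^5 = 0.0625, so a single 5-fold run cannot reach the usual 0.05 threshold however consistent the improvement is. Repeated cross-validation or more folds would be needed for that, with the caveat that folds share training data and are therefore not fully independent samples.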

PMID:40691635 | DOI:10.1186/s13321-025-01041-0

By Nevin Manimala
