Expert Opin Drug Discov. 2023 Nov 3:1-7. doi: 10.1080/17460441.2023.2277342. Online ahead of print.
ABSTRACT
INTRODUCTION: Modern drug discovery incorporates various tools and data, heralding the beginning of the data-driven drug design (DD) era. Understanding the distributions of the chemical and physical data used for Artificial Intelligence (AI)/Machine Learning (ML) and to drive DD, and using them effectively, has therefore become highly important.
AREAS COVERED: The authors perform a comprehensive exploration of the statistical distributions driving the data-intensive era of drug discovery, including Benford’s Law in AI/ML-based DD.
EXPERT OPINION: As the relevance of data-driven discovery escalates, we anticipate meticulous scrutiny of datasets utilizing principles like Benford’s Law to enhance data integrity and guide efficient resource allocation and experimental planning. In this data-driven era of the pharmaceutical and medical industries, addressing critical aspects such as bias mitigation, algorithm effectiveness, data stewardship, effects, and fraud prevention is essential. Harnessing Benford’s Law and other distributions and statistical tests in DD provides a potent strategy to detect data anomalies, fill data gaps, and enhance dataset quality. Benford’s Law offers a fast check of the integrity and quality of datasets, which are the backbone of AI/ML and other modeling approaches, and it proves very useful in the design process.
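ILLUSTRATIVE SKETCH: The abstract does not describe a specific procedure, so the following is only a minimal sketch of what a first-digit Benford check on a numeric dataset could look like. It assumes the standard first-digit form of Benford's Law, P(d) = log10(1 + 1/d) for d = 1..9, and compares observed counts with that expectation using a Pearson chi-square statistic. The function names (`benford_expected`, `leading_digit`, `benford_chi_square`) and the choice of the chi-square test are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter

# Critical value of the chi-square distribution with 8 degrees of freedom
# at the 0.05 significance level (9 digit classes minus 1).
CHI2_CRIT_DF8_P05 = 15.507


def benford_expected():
    """Expected first-digit probabilities under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Return the first significant digit of x, or None if x has none."""
    x = abs(x)
    if x == 0 or not math.isfinite(x):
        return None
    # Scientific notation always starts with the first significant digit.
    return int(f"{x:.15e}"[0])


def benford_chi_square(values):
    """Pearson chi-square statistic of observed vs. Benford first digits."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    if not digits:
        raise ValueError("no usable (nonzero, finite) values")
    counts = Counter(digits)
    n = len(digits)
    stat = 0.0
    for d, p in benford_expected().items():
        expected = n * p
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Hypothetical inputs: powers of 2 span many orders of magnitude and
    # are known to follow Benford's Law closely; a flat range does not.
    benford_like = [2.0 ** n for n in range(1, 1001)]
    flat = list(range(100, 1000))
    for name, data in [("powers of 2", benford_like), ("100..999", flat)]:
        stat = benford_chi_square(data)
        flag = "deviates" if stat > CHI2_CRIT_DF8_P05 else "consistent"
        print(f"{name}: chi2 = {stat:.2f} ({flag} at the 0.05 level)")
```

A large statistic only flags a dataset for closer inspection. Many legitimate datasets, for example measurements confined to a narrow range, do not follow Benford's Law, so a deviation is not by itself evidence of error or fraud.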
PMID:37921672 | DOI:10.1080/17460441.2023.2277342