Modelling and estimation of chemical reaction yields from high-throughput experiments

Commun Chem. 2026 Jan 3. doi: 10.1038/s42004-025-01866-8. Online ahead of print.

ABSTRACT

Machine learning (ML) and artificial intelligence (AI) techniques are transforming the way chemical reactions are studied today. Datasets from high-throughput experimentation (HTE) are generated to better understand the reaction conditions crucial for outcomes such as yields and selectivities. However, it is often overlooked that datasets from such designed experiments possess a specific structure, which can be captured by a statistical model. Ignoring these data structures when applying ML/AI algorithms can result in misleading conclusions. In contrast, leveraging knowledge about the data-generating process yields reliable, interpretable, and comprehensive insights into reaction mechanisms. A particularly complex dataset is available for the Buchwald-Hartwig amination. Using this dataset, a statistical model for such HTE-generated chemical data is introduced, and a parameter estimation algorithm is developed. Based on the estimated model, new insights into the Buchwald-Hartwig amination are discussed. Our approach is applicable to a wide range of HTE-generated data for chemical reactions and beyond.

PMID: 41484279 | DOI: 10.1038/s42004-025-01866-8

By Nevin Manimala
