
Beyond Heuristics: A Model-Agnostic Framework for Uncertainty Quantification in QSAR via Adaptive Conformal Prediction

Chem Res Toxicol. 2026 Jun 22. doi: 10.1021/acs.chemrestox.6c00065. Online ahead of print.

ABSTRACT

Reliable quantification of uncertainty is critical for the interpretation and regulatory use of QSAR models. Applicability domain (AD) assessment was introduced precisely for this purpose (the original OECD guidance defines AD in terms of prediction reliability), yet in practice AD metrics output heuristic similarity scores without statistically guaranteed confidence estimates. We present conformal prediction (CP) as a calibration layer that retrofits any QSAR model into a confidence predictor, producing prediction intervals for regression and prediction sets for classification at a user-specified nominal confidence level (e.g., 90%). Coverage is statistically guaranteed and no retraining is needed: the procedure uses only model predictions and a calibration set. The guarantee holds under the exchangeability assumption, namely that calibration and test compounds are drawn from the same input space, and follows as a mathematical consequence of the rank-based calibration procedure. When the assumption is violated, coverage may fall below the nominal level, which is signaled by widening intervals and shrinking singleton rates. The framework uses auxiliary models trained on molecular fingerprints as nonconformity scores, a role that most existing AD indices can equally fulfill; a novel ordinal distance strategy extends the approach to hard-label classifiers by generating pseudoprobabilities compatible with standard conformal methods. Applied to over 100 VEGA QSAR models spanning physicochemical properties, toxicity, and environmental endpoints (https://www.epa.gov/pesticide-science-and-assessing-pesticide-risks/technical-overview-ecological-risk-assessment-risk), the framework consistently achieves nominal coverage across all models and endpoint types.
Conformal efficiency metrics (prediction interval width for regression and singleton rate for classification) correlate strongly with AD indices, demonstrating that CP formalizes and quantifies what AD heuristics approximate: the relationship between structural novelty and prediction reliability. The framework thereby transforms heuristic chemical similarity into statistically valid prediction intervals or label sets. Large-scale application to the EPA CompTox chemical inventory demonstrates practical deployment at a regulatory scale. An open-source pipeline facilitates application to any QSAR/QSPR platform, enabling improved transparency and reliability assessment.
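The rank-based calibration step described in the abstract can be illustrated with a minimal split-conformal sketch in Python. This is not the authors' pipeline: the function names (`conformal_quantile`, `regression_intervals`, `classification_sets`) are hypothetical, and the optional `cal_scale`/`test_scale` arguments are an assumed stand-in for a per-compound difficulty estimate such as an auxiliary model output or an AD index. Only model predictions and a held-out calibration set are used, with no retraining.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Rank-based threshold: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration compounds to support this confidence level.
        return np.inf
    return np.sort(scores)[k - 1]

def regression_intervals(cal_pred, cal_true, test_pred, alpha=0.1,
                         cal_scale=None, test_scale=None):
    """Prediction intervals from absolute residuals on a calibration set.

    cal_scale / test_scale are hypothetical per-compound difficulty
    estimates; when omitted, every interval has the same width.
    """
    cal_scale = np.ones(len(cal_pred)) if cal_scale is None else cal_scale
    test_scale = np.ones(len(test_pred)) if test_scale is None else test_scale
    scores = np.abs(cal_true - cal_pred) / cal_scale  # nonconformity scores
    q = conformal_quantile(scores, alpha)
    return test_pred - q * test_scale, test_pred + q * test_scale

def classification_sets(cal_proba, cal_labels, test_proba, alpha=0.1):
    """Prediction sets as a boolean mask of shape (n_test, n_classes)."""
    # Nonconformity score: one minus the probability of the true class.
    scores = 1.0 - cal_proba[np.arange(len(cal_labels)), cal_labels]
    q = conformal_quantile(scores, alpha)
    return (1.0 - test_proba) <= q

# Usage on synthetic data (assumed, not taken from the paper).
rng = np.random.default_rng(0)
y_cal = rng.normal(size=500)
pred_cal = y_cal + rng.normal(scale=0.5, size=500)  # imperfect "model"
lower, upper = regression_intervals(pred_cal, y_cal, np.array([0.0, 1.0]))
```

Under exchangeability, intervals built this way cover the true value with probability at least 1 - alpha on average over calibration and test draws. A row of the classification mask with exactly one `True` entry is a singleton prediction, the quantity behind the singleton rate mentioned above.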

PMID:42324899 | DOI:10.1021/acs.chemrestox.6c00065

By Nevin Manimala
