SAR QSAR Environ Res. 2025 Aug 5:1-28. doi: 10.1080/1062936X.2025.2535606. Online ahead of print.
ABSTRACT
Existing QSAR approaches for mammalian acute toxicity have been limited in scope, often relying on small or narrowly focused datasets and on classification endpoints. In contrast, our work uses a large curated dataset (9843 rat oral and 2323 intravenous LD50 values) to build regression models of acute toxicity. The best-performing QSAR models, developed using 2D RDKit descriptors and the CatBoost method, achieve Q²(test) = 0.66 on the test sets, with at least 77% of the data falling within the applicability domain (AD). All models were rigorously validated according to the OECD QSAR principles, with clearly defined endpoints, explicit algorithms, and a well-characterized AD. The best QSAR models are integrated into the ToxAI_assistant web platform (https://tox-ai-assistant.streamlit.app/), which predicts toxicity levels in terms of World Health Organization (WHO) categories while taking the AD into account. We also provide mechanistic insight by identifying key toxicophores (substructural features statistically associated with high toxicity), thereby offering a structural interpretation. In sum, these elements (large and diverse data, regression modelling, WHO-based categorization, detailed fragment analysis, and AD assessment) together address the gaps of earlier studies and constitute the core novelty of our approach.
PMID:40762068 | DOI:10.1080/1062936X.2025.2535606
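
The abstract reports a test-set Q² together with the share of data inside the AD. As a minimal sketch of how these two figures relate, the Python snippet below computes an external Q² restricted to compounds flagged as inside the AD, alongside the resulting coverage. It assumes the common Q²(F1) form (residuals compared against deviations from the training-set mean), which the abstract does not specify. The function names, the precomputed `in_ad` flags, and the toy values are hypothetical and are not taken from the paper.

```python
def q2_test(y_obs, y_pred, y_train_mean):
    """External Q2 in the Q2_F1 form (an assumed variant):
    1 - PRESS / sum of squared deviations from the training-set mean."""
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss = sum((o - y_train_mean) ** 2 for o in y_obs)
    return 1.0 - press / ss


def ad_coverage(in_ad):
    """Fraction of test compounds flagged as inside the applicability domain."""
    return sum(in_ad) / len(in_ad)


def q2_within_ad(y_obs, y_pred, in_ad, y_train_mean):
    """Q2 computed only on in-AD compounds, returned with the AD coverage."""
    kept = [(o, p) for o, p, flag in zip(y_obs, y_pred, in_ad) if flag]
    obs, pred = zip(*kept)
    return q2_test(obs, pred, y_train_mean), ad_coverage(in_ad)


# Toy values for illustration only (not data from the study).
y_obs = [2.0, 3.0, 4.0, 5.0]          # observed endpoint values
y_pred = [2.5, 3.0, 3.5, 9.0]         # model predictions
in_ad = [True, True, True, False]     # hypothetical AD flags
q2, coverage = q2_within_ad(y_obs, y_pred, in_ad, y_train_mean=3.0)
```

In this toy case the poorly predicted fourth compound lies outside the AD, so excluding it raises Q² at the cost of lower coverage. That trade-off is why the two numbers are reported together.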