J Chem Inf Model. 2025 Sep 8. doi: 10.1021/acs.jcim.5c01101. Online ahead of print.
ABSTRACT
To assess environmental fate, transport, and exposure for PFAS (per- and polyfluoroalkyl substances), predictive models are needed to fill experimental data gaps for physicochemical properties. In this work, quantitative structure-property relationship (QSPR) models for octanol-water partition coefficient, water solubility, vapor pressure, boiling point, melting point, and Henry’s law constant are presented. Over 200,000 experimental property value records were extracted from publicly available data sources. Global models generated from data for diverse chemical classes resulted in more accurate property value predictions for PFAS than local models generated from a PFAS-only data set, with an average 11% reduction in mean absolute error (MAE). The global models achieved strong performance on test data across all property endpoints (R² = 0.76-0.89 for all chemical classes). The test set MAE for PFAS was about 33% higher than the value for all chemicals in the test set (when averaged over the six data sets). The new global models yielded superior PFAS prediction statistics relative to those for existing Toxicity Estimation Software Tool (T.E.S.T.) models, with an average 13% reduction in MAE. A nearest neighbor-based measure of model applicability domain (AD) was shown to exclude poor predictions while maintaining a relatively high fraction (∼95%) of chemicals inside the AD. In addition, most test set PFAS were outside the AD when the model was generated without PFAS in the training set.
PMID:40921046 | DOI:10.1021/acs.jcim.5c01101
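
The abstract does not specify how the nearest neighbor-based applicability domain (AD) is computed, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes chemicals are represented as numeric descriptor vectors, uses the mean Euclidean distance to the k nearest training chemicals as the AD measure, and sets the cutoff so that about 95% of training chemicals fall inside the AD. The function names, the choice of k = 3, and the toy descriptor grid are all hypothetical.

```python
import math


def knn_mean_distance(query, reference, k, skip_index=None):
    """Mean Euclidean distance from `query` to its k nearest reference points.

    `skip_index` excludes one reference point (used for leave-one-out).
    """
    dists = sorted(
        math.dist(query, ref)
        for i, ref in enumerate(reference)
        if i != skip_index
    )
    return sum(dists[:k]) / k


def fit_ad_threshold(train, k=3, fraction_inside=0.95):
    """Cutoff that keeps roughly `fraction_inside` of training points inside the AD.

    Assumption: the cutoff is a percentile of leave-one-out k-NN distances
    computed on the training set itself.
    """
    loo = sorted(
        knn_mean_distance(x, train, k, skip_index=i) for i, x in enumerate(train)
    )
    idx = min(len(loo) - 1, math.ceil(fraction_inside * len(loo)) - 1)
    return loo[idx]


def inside_ad(query, train, threshold, k=3):
    """True if the query's k-NN distance to the training set is within the cutoff."""
    return knn_mean_distance(query, train, k) <= threshold


# Toy descriptor space: a 5 x 5 grid standing in for training chemicals.
train = [(float(i), float(j)) for i in range(5) for j in range(5)]
threshold = fit_ad_threshold(train, k=3)

print(inside_ad((2.5, 2.5), train, threshold))    # query surrounded by training points
print(inside_ad((10.0, 10.0), train, threshold))  # query far from all training points
```

Under these assumptions, a query chemical with no close analogs in the training set gets a large k-NN distance and is flagged as outside the AD, which is consistent with the reported behavior of test set PFAS when PFAS are left out of training.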