
Derivation of explicit mathematical equations for gypsum solubility in aqueous electrolyte solutions using GP, GEP, and GMDH techniques

Sci Rep. 2025 Sep 30;15(1):34086. doi: 10.1038/s41598-025-14641-5.

ABSTRACT

The accumulation of mineral deposits on industrial equipment surfaces poses a major concern in a variety of processes. Gypsum (CaSO4·2H2O) is one of the most widely produced minerals in both natural and industrial environments. Currently, intelligent white-box models can serve as a suitable alternative to time-consuming and high-priced experiments, enabling the identification of possible gypsum scaling issues in the chemical and petroleum industries. In this regard, the current study focused on the development of robust mathematical correlations to estimate the solubility of gypsum in aqueous electrolyte solutions. For this purpose, three rigorous techniques of Genetic Programming (GP), Gene Expression Programming (GEP), and Group Method of Data Handling (GMDH) were implemented on two distinct data banks, including 2288 experimental data points taken from previously published literature. Solution temperature (T), solution molecular weight (MW), and molal concentrations of monovalent, divalent, and trivalent compounds (mI, mII, and mIII) were the input/independent variables employed in the first data bank, whereas solution temperature (T), solution molecular weight (MW), and solution ionic strength (I) were included in the second data bank. The performance and accuracy of correlations were evaluated using various statistical indicators such as Mean Bias Error (MBE), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Coefficient of Determination (R²). Following multiple statistical and graphical analyses on the novel correlations’ outcomes, it was found that the correlation established by implementing the GMDH technique onto the first data bank (i.e., GMDH-1) performed significantly better than all other correlations, with MAE = 0.01095, RMSE = 0.01482, and R² = 0.8508. The correlations obtained by applying the GEP and GMDH techniques to the second data bank (i.e., GEP-2 and GMDH-2) also revealed a satisfactory level of performance. By comparing the new correlations developed in this study with models reported in previous studies, a reasonable level of agreement was found.
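
For readers who want to reproduce the four statistical indicators named in the abstract (MBE, MAE, RMSE, and the coefficient of determination), the following is a minimal sketch in plain Python. It is not the authors' code. The function name `error_metrics`, the sign convention for MBE (predicted minus measured), and the use of the common 1 - SS_res/SS_tot form for the coefficient of determination are assumptions; the paper may define them differently. The sample values are made up for illustration and are not taken from the study's data banks.

```python
import math


def error_metrics(measured, predicted):
    """Return MBE, MAE, RMSE and R^2 for paired measured/predicted values.

    Assumptions (not confirmed by the source abstract):
      - MBE is the mean of (predicted - measured).
      - R^2 is computed as 1 - SS_res / SS_tot.
    """
    n = len(measured)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    residuals = [p - m for m, p in zip(measured, predicted)]

    mbe = sum(residuals) / n                    # signed average error (bias)
    mae = sum(abs(r) for r in residuals) / n    # average error magnitude
    ss_res = sum(r * r for r in residuals)      # residual sum of squares
    rmse = math.sqrt(ss_res / n)                # penalizes large errors more

    mean_measured = sum(measured) / n
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    # R^2 is undefined when all measured values are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MBE": mbe, "MAE": mae, "RMSE": rmse, "R2": r2}


if __name__ == "__main__":
    # Hypothetical solubility values, for illustration only.
    measured = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.5, 2.0, 2.5, 4.0]
    for name, value in error_metrics(measured, predicted).items():
        print(f"{name}: {value:.4f}")
```

Note that MBE can be close to zero even when individual errors are large, because positive and negative residuals cancel; this is why it is usually reported alongside MAE and RMSE, as the abstract does.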

PMID:41028254 | DOI:10.1038/s41598-025-14641-5

By Nevin Manimala
