J Basic Microbiol. 2026 Feb;66(2):e70148. doi: 10.1002/jobm.70148.
ABSTRACT
Cordycepin, a nucleoside analog derived from Cordyceps militaris, is a bioactive compound with potent pharmacological properties and growing relevance in functional food and pharmaceutical industries. However, its production is highly variable depending on cultivation conditions, making real-time and scalable prediction essential for efficient process control. This study aimed to develop a machine learning-based predictive model to estimate cordycepin content based on measurable cultivation parameters. Three machine learning algorithms (XGBoost, Random Forest, and Support Vector Machine) were trained using experimental data encompassing environmental and nutritional factors. Model validation was conducted using Tropsha’s statistical criteria, and model explainability was achieved through SHAP analysis. A user-friendly GUI was also developed for real-time prediction and application. Among the models, XGBoost demonstrated the highest performance with a cross-validated Q² of 0.9087 and an R² of 0.9544, satisfying all statistical requirements for reliability. SHAP analysis identified light wavelength and carbon/nitrogen ratio as the most influential factors in cordycepin biosynthesis. The developed GUI enables end-users to input cultivation conditions and receive immediate predictions, facilitating data-driven decision-making. This approach offers a scalable and interpretable framework for optimizing bioactive compound production in edible fungi, with potential application in smart bioprocessing and precision fermentation.
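The validation step described above can be illustrated with a minimal sketch. It assumes the commonly cited form of the Golbraikh-Tropsha criteria (Q² > 0.5, R² > 0.6, through-origin slope between 0.85 and 1.15, and through-origin R² close to R²); the abstract does not state which thresholds or software the authors used, so the function names and cutoffs here are illustrative assumptions, not the study's implementation.

```python
from statistics import mean


def pearson_r2(y, yhat):
    """Squared Pearson correlation between observed and predicted values."""
    my, mp = mean(y), mean(yhat)
    cov = sum((a - my) * (b - mp) for a, b in zip(y, yhat))
    vy = sum((a - my) ** 2 for a in y)
    vp = sum((b - mp) ** 2 for b in yhat)
    return cov * cov / (vy * vp)


def q2(y, yhat_cv):
    """Cross-validated Q2 = 1 - PRESS / SS, given out-of-fold predictions."""
    my = mean(y)
    press = sum((a - b) ** 2 for a, b in zip(y, yhat_cv))
    ss = sum((a - my) ** 2 for a in y)
    return 1.0 - press / ss


def _through_origin(target, regressor):
    """Slope and R0^2 for regressing `target` on `regressor` through the origin."""
    k = sum(t * r for t, r in zip(target, regressor)) / sum(r * r for r in regressor)
    mt = mean(target)
    ss_res = sum((t - k * r) ** 2 for t, r in zip(target, regressor))
    ss_tot = sum((t - mt) ** 2 for t in target)
    return k, 1.0 - ss_res / ss_tot


def tropsha_checks(y, yhat, q2_value):
    """Evaluate the assumed criteria; thresholds are the commonly cited ones."""
    r2 = pearson_r2(y, yhat)
    k, r0 = _through_origin(y, yhat)        # observed vs. predicted
    k_p, r0_p = _through_origin(yhat, y)    # predicted vs. observed
    fit_a = (r2 - r0) / r2 < 0.1 and 0.85 <= k <= 1.15
    fit_b = (r2 - r0_p) / r2 < 0.1 and 0.85 <= k_p <= 1.15
    return {
        "q2_gt_0.5": q2_value > 0.5,
        "r2_gt_0.6": r2 > 0.6,
        "through_origin_fit": fit_a or fit_b,
        "r0_difference_lt_0.3": abs(r0 - r0_p) < 0.3,
    }


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.1, 1.9, 3.2, 3.8, 5.1]
    print(tropsha_checks(observed, predicted, q2(observed, predicted)))
```

In practice the `yhat_cv` argument would hold out-of-fold predictions from the trained model, so that Q² reflects cross-validated rather than fitted performance.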
PMID:41636097 | DOI:10.1002/jobm.70148