Nevin Manimala Statistics

Morphometric trait analysis and machine learning-based yield modeling in wood apple (Feronia limonia L.)

BMC Plant Biol. 2025 Dec 28. doi: 10.1186/s12870-025-07978-6. Online ahead of print.

ABSTRACT

BACKGROUND: Wood apple is a hardy yet underutilized fruit tree of the Indian subcontinent, valued for its nutritional, medicinal, and ecological significance. Despite its potential as a climate-resilient fruit species, the determinants of yield variability remain poorly characterized. This study aimed to quantify how morphometric descriptors of canopy architecture, floral traits, and fruit traits explain yield variation across 62 wood apple genotypes. By integrating multivariate statistics with explainable machine-learning models (Random Forest + SHAP), we provide the first data-driven framework for identifying trait combinations that govern productivity in this underutilized tree species. The approach offers a novel, interpretable path toward ideotype selection and precision orchard design.

RESULTS: Extensive morphometric variability was observed across the 62 genotypes for vegetative, foliar, floral, fruit, and seed traits, indicating a broad genetic base. Yield per tree ranged widely from 35 to 127 kg, with a mean of 75 kg tree⁻¹. Principal Component Analysis (PCA) showed that canopy architecture, branch traits, and leaf-fruit attributes collectively explained 31.1% of the total variation. Correlation analysis revealed positive associations of yield with tree shape, pulp colour, and fruit-bearing tendency, whereas ornamental fruit traits and excessive spine density were negatively related. The optimized Random Forest (RF) model achieved strong predictive performance on the test dataset (R² = 0.84; RMSE = 9.45 kg; MAE = 7.12 kg), significantly outperforming Multiple Linear Regression (R² = 0.62), Support Vector Regression (R² = 0.76), and the Deep Learning (MLP) model (R² = 0.71). RF identified tree shape (16%), open flower colour (11.3%), and pulp colour (9.0%) as the most influential predictors of yield. SHAP analysis further clarified the non-linear and interactive effects among traits, highlighting the combined influence of canopy vigour, reproductive efficiency, and fruit-quality attributes on productivity. Hierarchical clustering grouped the genotypes into three clusters. Cluster 2, characterized by compact canopies, superior reproductive traits, and desirable pulp features, showed the highest and most stable yield (mean 84.6 kg tree⁻¹). Cluster 0 displayed moderate-to-high yields (79.7 kg tree⁻¹) but with greater variability, while Cluster 1 comprised the lowest-yielding genotypes (70.4 kg tree⁻¹). These findings confirm that productivity in wood apple is jointly regulated by architectural and reproductive traits through coordinated source-sink dynamics.
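For readers less familiar with the three test-set metrics reported above (R², RMSE, MAE), the following is a minimal sketch of how they are conventionally computed from observed and predicted yields. It is not the authors' code: the function name `regression_metrics` and the yield values are hypothetical and invented purely for illustration, and the abstract does not state which software was used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    r2 = 1.0 - ss_res / ss_tot                  # share of variance explained
    rmse = math.sqrt(ss_res / n)                # penalizes large errors more
    mae = sum(abs(r) for r in residuals) / n    # average absolute error
    return r2, rmse, mae


# Hypothetical yields (kg per tree); NOT data from the study.
observed = [40.0, 60.0, 80.0, 100.0, 120.0]
predicted = [45.0, 55.0, 85.0, 95.0, 125.0]

r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.2f} kg MAE={mae:.2f} kg")
```

RMSE and MAE are expressed in the units of the response (here kg per tree), which is why the abstract reports them in kg, while R² is unitless.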

CONCLUSIONS: Wood apple yield is governed by an integrated suite of architectural and reproductive traits, rather than single descriptors. Genotypes with compact canopies, regular bearing habit, and consumer-preferred pulp characteristics emerge as promising ideotypes for high productivity and orchard efficiency. By combining Random Forest and SHAP, this study demonstrates the practical value of explainable machine-learning tools in identifying actionable trait combinations and providing a robust, trait-based framework to support data-driven breeding and climate-smart orchard design in underutilized perennial fruit crops.

PMID:41457267 | DOI:10.1186/s12870-025-07978-6

By Nevin Manimala
