PLoS One. 2025 Oct 9;20(10):e0333379. doi: 10.1371/journal.pone.0333379. eCollection 2025.
ABSTRACT
We propose TIC-FusionNet, a trend-aware multimodal deep learning framework for time series forecasting with integrated visual signal analysis, aimed at addressing the limitations of unimodal and short-range dependency models in noisy financial environments. The architecture combines Exponential Moving Average (EMA) decomposition for denoising and trend extraction, a lightweight Linear Transformer for efficient long-sequence temporal modeling, and a spatial-channel CNN with CBAM attention to capture morphological patterns from candlestick chart images. A gated fusion mechanism adaptively integrates the numerical and visual modalities based on context relevance, enabling dynamic feature weighting under varying market conditions. We evaluate TIC-FusionNet on six real-world stock datasets covering major Chinese and U.S. companies (Amazon, Tesla, Apple, Kweichow Moutai, Ping An Insurance, and China Vanke), spanning diverse market sectors and volatility patterns. The model is compared against a broad range of baselines, including statistical models (ARIMA), classical machine learning methods (Random Forest, SVR), recurrent and convolutional neural networks (LSTM, TCN, CNN-only), and recent Transformer-based architectures (Informer, Autoformer, Crossformer, iTransformer). Experimental results demonstrate that TIC-FusionNet achieves consistently superior predictive accuracy and generalization, outperforming state-of-the-art baselines across all datasets. Extensive ablation studies verify the critical role of each architectural component, while attention-based interpretability analysis highlights the dominant technical indicators under different volatility regimes. These findings not only confirm the effectiveness of multimodal integration in capturing complementary temporal-visual cues, but also provide valuable insights into model decision-making.
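The abstract does not include code, and the paper's exact smoothing parameters are not stated here; the EMA decomposition step it describes can be sketched in NumPy as follows. The function name `ema_decompose` and the smoothing factor `alpha=0.2` are illustrative assumptions, not values from the paper: the series is split into a smooth EMA trend and a residual (noise) component, which the numerical branch can then model separately.

```python
import numpy as np

def ema_decompose(prices, alpha=0.2):
    """Split a price series into an EMA trend and a residual component.

    `alpha` is an illustrative smoothing factor (not from the paper);
    larger alpha tracks the raw series more closely, smaller alpha
    produces a smoother trend.
    """
    prices = np.asarray(prices, dtype=float)
    trend = np.empty_like(prices)
    trend[0] = prices[0]  # initialize the EMA at the first observation
    for t in range(1, len(prices)):
        # standard recursive EMA update
        trend[t] = alpha * prices[t] + (1.0 - alpha) * trend[t - 1]
    residual = prices - trend  # high-frequency / noise component
    return trend, residual
```

By construction `trend + residual` reconstructs the input exactly, so no information is lost; the decomposition only reorganizes the signal into a low-frequency trend and a high-frequency residual.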
The proposed framework offers a robust, scalable, and interpretable solution for multimodal temporal prediction tasks, with strong potential for deployment in intelligent forecasting, sensor fusion, and risk-aware decision-making systems.
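The gated fusion mechanism described above is not specified in detail in the abstract; a minimal NumPy sketch of one common formulation is given below, assuming a learned sigmoid gate over the concatenated modality embeddings. The names `gated_fusion`, `h_num`, `h_vis`, and the parameters `W_g`, `b_g` are hypothetical placeholders, not identifiers from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(h_num, h_vis, W_g, b_g):
    """Adaptively blend a numerical embedding and a visual embedding.

    A context-dependent gate g in (0, 1) is computed from both
    embeddings; the fused feature is an elementwise convex combination,
    so each output dimension lies between the two modality values.
    """
    context = np.concatenate([h_num, h_vis])  # joint context vector
    g = sigmoid(context @ W_g + b_g)          # per-dimension gate in (0, 1)
    return g * h_num + (1.0 - g) * h_vis
```

With untrained (zero) weights the gate defaults to 0.5, averaging the two modalities; training would move `g` toward whichever modality is more informative in the current market regime.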
PMID:41066756 | DOI:10.1371/journal.pone.0333379