JMIR Med Inform. 2025 Apr 23;13:e57530. doi: 10.2196/57530.
ABSTRACT
BACKGROUND: Consuming large amounts of foods or beverages with high levels of saturated fats, salt, or sugar (HFSS) can be harmful to health. Many snacks fall into this category (HFSS snacks). However, the palatability of these snacks means that people can struggle to reduce their intake. Machine learning algorithms could help in predicting the likely occurrence of HFSS snacking so that just-in-time adaptive interventions can be deployed. However, HFSS snacking data have certain characteristics, such as sparseness and incompleteness, which make snacking prediction a challenge for machine learning approaches. Previous attempts have employed several potential predictor variables and have achieved considerable success. Nevertheless, collecting information from several dimensions requires several potentially burdensome user questionnaires, and thus, this approach may be less acceptable for the general public.
OBJECTIVE: Our aim was to assess the capacity of standard machine learning algorithms (ie, algorithms not modified in any way to tailor them to the specific learning problem) to predict HFSS snacking based on the following minimal data, which can be collected in a mostly automated way: day of the week, time of the day (divided into time bins), and location (divided into work, home, and other).
METHODS: A total of 111 participants in the United Kingdom were asked to record HFSS snacking occurrences and the location category over a period of 28 days, and this was considered the UK dataset. Data collection was facilitated by a purpose-specific app (Snack Tracker). Additionally, a similar dataset from the Netherlands was used (Dutch dataset). Both datasets were analyzed using machine learning methods, including random forest regressor, Extreme Gradient Boosting regressor, feed forward neural network, and long short-term memory. We additionally employed 2 baseline statistical models for prediction. In all cases, the prediction problem was the time to the next HFSS snack from the current one, and the evaluation metric was the mean absolute error.
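The prediction task and evaluation metric described in the methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the event timestamps are invented, the 4-hour bin width is an assumption (the abstract does not state the bin sizes), and the global-mean predictor is a hypothetical stand-in for the 2 baseline models, which the abstract does not specify. The sketch only shows how the three minimal features and the time-to-next-snack target could be derived from a log of snacking events and how the mean absolute error is computed.

```python
from datetime import datetime
from statistics import mean

# Location categories named in the abstract.
LOCATIONS = ("work", "home", "other")


def time_bin(ts, bin_hours=4):
    """Map a timestamp to a time-of-day bin (bin width is an assumption)."""
    return ts.hour // bin_hours


def build_examples(events, bin_hours=4):
    """Turn one participant's time-sorted (datetime, location) events into
    (features, target) pairs, where the target is minutes to the next snack."""
    examples = []
    for (t0, loc), (t1, _) in zip(events, events[1:]):
        features = (t0.weekday(), time_bin(t0, bin_hours), LOCATIONS.index(loc))
        target = (t1 - t0).total_seconds() / 60.0
        examples.append((features, target))
    return examples


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(a - b) for a, b in zip(y_true, y_pred))


# Invented snack log for illustration only.
events = [
    (datetime(2024, 1, 1, 10, 0), "work"),
    (datetime(2024, 1, 1, 15, 30), "work"),
    (datetime(2024, 1, 1, 21, 0), "home"),
    (datetime(2024, 1, 2, 10, 30), "work"),
    (datetime(2024, 1, 2, 16, 0), "other"),
]

examples = build_examples(events)
train, test = examples[:3], examples[3:]

# Hypothetical baseline: always predict the mean interval seen in training.
baseline = mean(target for _, target in train)
mae = mean_absolute_error([target for _, target in test], [baseline] * len(test))
print(f"baseline prediction: {baseline:.0f} min, MAE: {mae:.0f} min")
```

Any of the regressors named above could replace the baseline by fitting on the feature tuples in `train`; the mean absolute error would be computed in the same way.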
RESULTS: The ability of machine learning methods to predict the time of the next HFSS snack was assessed. The quality of the prediction depended on the dataset, temporal resolution, and machine learning algorithm employed. In some cases, the mean absolute error was as low as 17 minutes. In general, machine learning methods outperformed the baseline models, but no machine learning method was clearly better than the others. The feed forward neural network showed only a marginal advantage.
CONCLUSIONS: The prediction of HFSS snacking using sparse data is possible with reasonable accuracy. Our findings offer a foundation for further exploring how machine learning methods can be used in health psychology and provide directions for further research.
PMID:40267467 | DOI:10.2196/57530