Exploring supportive care needs of lung cancer patients in China and predicting with machine learning models

Support Care Cancer. 2025 Jun 13;33(7):573. doi: 10.1007/s00520-025-09619-y.

ABSTRACT

PURPOSE: This study aimed to assess the level of supportive care needs among hospitalized lung cancer patients in China, identify the key influencing factors, and use machine learning (ML) to develop models that predict the level of those needs.

METHODS: This cross-sectional study collected data on the supportive care needs, demographics, and clinical information of 486 hospitalized lung cancer patients. Univariate and multivariate analyses identified factors associated with these needs. Predictive models were developed using six machine learning methods (logistic regression, linear regression, k-nearest neighbors, support vector machine, random forest, and adaptive boosting), their performance was compared, and feature importance was then visualized. The code used for model development and analysis is publicly available at https://github.com/zimengcc/predict_cancer_scn.

RESULTS: Age, education level, occupation, tumor stage, and household per capita monthly income were significantly associated with supportive care needs scores. Multiple linear regression analysis revealed that education level and household per capita monthly income were statistically significant predictors of supportive care needs scores. In the predictive tasks, the random forest model performed best, with a mean absolute error (MAE) of 4.45 for predicting the total supportive care needs score. Furthermore, in predicting the dimension with the highest level of supportive care needs, the model achieved an accuracy of 88.42%, an F1 score of 87.49%, and an ROC-AUC of 0.9061.
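
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how MAE, accuracy, and F1 are computed. It is not taken from the authors' repository. The function names are illustrative, and macro averaging of F1 across classes is an assumption, since the abstract does not state which averaging scheme was used.

```python
from typing import Hashable, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between true and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 (ASSUMPTION: macro averaging)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    classes = set(y_true) | set(y_pred)
    per_class = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)
```

An MAE of 4.45 therefore means that predicted total scores differed from the observed scores by 4.45 points on average, in the units of the needs scale.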

CONCLUSION: Our study explored the factors influencing the level of supportive care needs among hospitalized lung cancer patients. While the machine learning models demonstrate promising predictive performance, it is important to note that all results were derived solely through cross-validation. Therefore, potential overfitting and overestimation of model performance should be considered when interpreting these findings. Nevertheless, these models may serve as a foundation for developing tools to support personalized care planning in clinical settings.
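
The cross-validation caveat can be illustrated with a short sketch of k-fold evaluation. This is an assumption-laden illustration, not the authors' pipeline: the number of folds, the shuffling seed, and the trivial mean-of-training-set predictor standing in for a real model are all hypothetical choices. The point is that every sample is scored only by a model fitted without it, yet all samples still come from the same cohort, so the estimate says nothing about performance on an external dataset.

```python
import random
from typing import List, Sequence


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    if k < 2 or n < k:
        raise ValueError("need k >= 2 and n >= k")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_mae(y: Sequence[float], k: int = 5, seed: int = 0) -> float:
    """Mean of per-fold MAE for a baseline that predicts the training mean.

    ASSUMPTION: the baseline predictor is a placeholder for a fitted model.
    """
    fold_maes = []
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [y[i] for i in range(len(y)) if i not in held]
        prediction = sum(train) / len(train)  # "fit" on training folds only
        fold_mae = sum(abs(y[i] - prediction) for i in held_out) / len(held_out)
        fold_maes.append(fold_mae)
    return sum(fold_maes) / k
```

Validating on an independent cohort would replace the held-out folds with data collected separately, which is the check the conclusion notes is still missing.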

PMID:40512392 | DOI:10.1007/s00520-025-09619-y

By Nevin Manimala