Traffic Inj Prev. 2025 Aug 8:1-10. doi: 10.1080/15389588.2025.2530074. Online ahead of print.
ABSTRACT
OBJECTIVE: This study aims to enhance the accuracy and stability of traffic accident loss prediction in China by utilizing machine learning techniques. Specifically, it explores the application of the Extra Trees model combined with feature importance screening to predict key accident indicators such as the number of accidents, deaths, injuries, and property losses.
METHODS: Relevant transportation industry indicators were collected from national statistical sources. A two-step feature screening approach, based first on average importance and then on importance ratio, was used to reduce dimensionality and improve model performance. The Extra Trees algorithm was used for prediction modeling, and prediction accuracy was evaluated across multiple experimental runs to assess stability. Additionally, correlation, regression effects, and global importance scores were calculated to quantify the influence of each indicator. Polynomial fitting was conducted to explore the relationship between key indicators and predicted values.
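The screening-then-prediction workflow described above might be sketched as follows. This is a minimal illustration using scikit-learn's `ExtraTreesRegressor`, not the authors' implementation: the abstract does not define the two screening criteria, so the mean-importance cutoff, the `ratio_threshold` value, the helper names `screen_features` and `mean_relative_error`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor


def screen_features(X, y, n_runs=5, ratio_threshold=0.5):
    """Hypothetical two-step screening (criteria are assumptions).

    Step 1: average impurity-based importances over several Extra Trees
            fits and keep features at or above the mean importance.
    Step 2: keep only features whose averaged importance is at least
            ratio_threshold times the largest averaged importance.
    """
    importances = np.zeros(X.shape[1])
    for seed in range(n_runs):
        model = ExtraTreesRegressor(n_estimators=100, random_state=seed)
        model.fit(X, y)
        importances += model.feature_importances_
    importances /= n_runs

    step1 = importances >= importances.mean()
    step2 = importances >= ratio_threshold * importances.max()
    return np.flatnonzero(step1 & step2), importances


def mean_relative_error(y_true, y_pred):
    """Mean absolute relative error, as a fraction (0.05 means 5%)."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))


# Synthetic stand-in data: 8 candidate indicators, only two are informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 100.0 + 5.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Screen on the training rows, then refit on the retained indicators only.
selected, importances = screen_features(X[:150], y[:150])
model = ExtraTreesRegressor(n_estimators=100, random_state=0)
model.fit(X[:150][:, selected], y[:150])
error = mean_relative_error(y[150:], model.predict(X[150:][:, selected]))
print(selected, round(error, 4))
```

Averaging importances over several seeds is one way to damp the run-to-run variation of randomized trees before applying a cutoff; repeating the final fit with different seeds would give a comparable view of prediction stability.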
RESULTS: The proposed feature screening approach improved both the accuracy and interpretability of the prediction model. The average prediction errors for the number of accidents, deaths, injuries, and property losses were 4.66%, 1.92%, 10.03%, and 5.01%, respectively. Among all targets, the number of deaths showed the highest predictive accuracy. Polynomial fitting confirmed a strong relationship between selected indicators and predicted values, with a quadratic fit achieving an R² of 0.957. The analysis identified 30 influential indicators, of which 12 had multi-target effects. Highway mileage, grade highway mileage, and average freight distance emerged as the most impactful indicators, with global importance scores exceeding 9.5%. Furthermore, the study demonstrated that prediction stability could be maintained across different data intervals, with error fluctuations remaining within acceptable bounds.
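For readers unfamiliar with how a quadratic fit and its coefficient of determination are obtained, the calculation can be sketched with NumPy. The helper name `quadratic_fit_r2` and the sample data are hypothetical and chosen only for illustration; the study's own indicators and values are not reproduced here.

```python
import numpy as np


def quadratic_fit_r2(x, y):
    """Fit y = a*x^2 + b*x + c by least squares and return (coeffs, R^2)."""
    coeffs = np.polyfit(x, y, deg=2)          # highest-degree term first
    y_hat = np.polyval(coeffs, x)
    ss_res = np.sum((y - y_hat) ** 2)         # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)    # total sum of squares
    return coeffs, 1.0 - ss_res / ss_tot


# Illustrative data lying exactly on a parabola, so the fit is perfect.
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x**2 - 3.0 * x + 1.0
coeffs, r2 = quadratic_fit_r2(x, y)
print(coeffs, r2)
```

With noisy real-world data the same function returns an R² below 1; a value such as the reported 0.957 would mean the quadratic curve accounts for about 95.7% of the variance in the fitted values.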
CONCLUSIONS: This study confirms the effectiveness of integrating feature importance screening with the Extra Trees model for predicting traffic accident losses. The methodology not only enhances prediction accuracy but also ensures stable performance across different accident indicators. The quantitative assessment of indicator importance offers valuable insights into the factors contributing to accident severity and provides a data-driven foundation for traffic safety policy and planning.
PMID:40779748 | DOI:10.1080/15389588.2025.2530074