Feature-Driven Road Accident Risk Prediction Using Gradient Boosting Regression

Authors

  • Yasir Ul Hassan, Computer Science, Quaid-e-Awam University of Engineering, Sciences & Technology, Nawabshah, Pakistan
  • Shamshad Lakho, Computer Science, Quaid-e-Awam University of Engineering, Sciences & Technology, Nawabshah, Pakistan
  • Imran Ali Memon, Shaheed Benazir Bhutto University, SBA Nawabshah, Pakistan
  • Faryal Arshad, Computer Science, Quaid-e-Awam University of Engineering, Sciences & Technology, Nawabshah, Pakistan
  • Mubashir Ul Hassan, Data Science, Quaid-e-Awam University of Engineering, Sciences & Technology, Nawabshah, Pakistan
  • Muhammad Bilal Qazi, Data Science, Quaid-e-Awam University of Engineering, Sciences & Technology, Nawabshah, Pakistan

Keywords

Accident Risk Prediction, Artificial Intelligence, Gradient Boosting Regressor (GBR), Predictive Modeling, Machine Learning, Intelligent Transportation Systems (ITS)

Abstract

Road traffic accidents remain a major global cause of fatalities and economic loss, posing significant challenges to public safety and urban mobility. Despite advances in vehicle safety systems and road infrastructure, accurately predicting accident risk remains a complex task that requires advanced analytical techniques. This study develops a predictive framework using a Gradient Boosting Regressor (GBR), an ensemble machine learning algorithm, to estimate road accident risk from environmental and infrastructural features. The analysis incorporates road type, number of lanes, road curvature, speed limits, lighting conditions, weather patterns, road signage, and historical accident records. The dataset, obtained from the Kaggle Playground Series, contains more than 517,000 road condition observations with 13 predictive features. During preprocessing, duplicate records were removed to ensure data quality, numerical variables were normalized using min-max scaling, and categorical variables were encoded for model compatibility. Experimental results show that the proposed GBR model achieves an accuracy of approximately 88.57% in estimating accident risk levels across diverse road conditions. The findings highlight the significant influence of environmental and infrastructural factors on accident risk and demonstrate the potential of machine learning–based approaches in transportation safety analysis. The proposed framework can assist transportation authorities and policymakers in identifying high-risk road segments and implementing targeted safety interventions to reduce accident occurrences.
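The preprocessing and modeling steps named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the data are synthetic, the three feature names (curvature, speed limit, lighting) and the risk formula are assumptions made up for the example, and the regressor is a deliberately simplified gradient boosting scheme built from one-split decision stumps rather than a library model.

```python
import random

def min_max_scale(column):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]

def fit_stump(X, residuals):
    """Pick the one-feature threshold split with the lowest squared error."""
    mean = sum(residuals) / len(residuals)
    best = (float("inf"), 0, 0.0, mean, mean)  # fallback if no split exists
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for a, b in zip(values, values[1:]):
            t = (a + b) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]  # (feature index, threshold, left value, right value)

def predict_stump(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value

def fit_gbr(X, y, n_rounds=60, learning_rate=0.1):
    """Squared-error gradient boosting: each stump fits the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict_gbr(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)

# Synthetic stand-in data (hypothetical columns, not the Kaggle dataset).
random.seed(0)
LIGHTING = {"daylight": 0, "dim": 1, "night": 2}  # ordinal encoding (assumed)
raw = []
for _ in range(150):
    curvature = round(random.random(), 2)
    speed = random.choice([30, 50, 80, 110])
    lighting = random.choice(list(LIGHTING))
    risk = (0.1 + 0.4 * curvature + 0.002 * speed
            + 0.1 * LIGHTING[lighting] + random.gauss(0, 0.02))
    raw.append((curvature, speed, lighting, round(risk, 4)))
raw += raw[:10]                  # inject exact duplicates
rows = list(dict.fromkeys(raw))  # drop duplicates, keep first occurrence

# Encode the categorical column, then min-max scale every feature.
X = [list(f) for f in zip(
    min_max_scale([r[0] for r in rows]),
    min_max_scale([r[1] for r in rows]),
    min_max_scale([LIGHTING[r[2]] for r in rows]),
)]
y = [r[3] for r in rows]

split = int(0.8 * len(X))
model = fit_gbr(X[:split], y[:split])
train_mean = sum(y[:split]) / split
mse_model = sum((predict_gbr(model, row) - t) ** 2
                for row, t in zip(X[split:], y[split:])) / (len(X) - split)
mse_baseline = sum((train_mean - t) ** 2 for t in y[split:]) / (len(X) - split)
print(f"held-out MSE: model={mse_model:.5f}, mean baseline={mse_baseline:.5f}")
```

Comparing the held-out error against a predict-the-mean baseline gives a quick sanity check that the boosted stumps have learned something. Because tree splits depend only on the ordering of feature values, min-max scaling does not change what this kind of model can fit; it is included here only to mirror the preprocessing the abstract describes.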

Published

2025-11-29

How to Cite

Ul Hassan, Yasir, Shamshad Lakho, Imran Ali Memon, Faryal Arshad, Mubashir Ul Hassan, & Muhammad Bilal Qazi. (2025). Feature-Driven Road Accident Risk Prediction Using Gradient Boosting Regression. International Journal of Innovations in Science & Technology, 7(10), 68–82. Retrieved from https://journal.50sea.com/index.php/IJIST/article/view/1699