A Data Mining Framework to Identify Important Factors of Fatigue and Drowsiness Accidents

Document Type: Research Article


1 Civil Engineering Department, Iran University of Science and Technology, Tehran, Iran.

2 Civil Engineering Department, Iran University of Science and Technology, Tehran, Iran.


Fatigue and drowsiness are major contributors to traffic accidents worldwide. According to statistics, 20 to 40 percent of traffic accidents in Iran are due to driver fatigue. This study aims to identify the most important variables affecting the occurrence of fatigue and drowsiness accidents using the classification and regression tree (CART) method. First, 859,378 police crash records from Tehran, Fars, and Mazandaran provinces over seven years (2011-2018) were segmented into homogeneous groups using the two-step clustering algorithm. Next, an oversampling technique was applied to address the class imbalance in the crash data. Finally, CART was combined with a boosting algorithm to increase the accuracy of the models. The classification trees showed that the main variables affecting the occurrence of fatigue and drowsiness accidents were road type, time of day, road traffic direction, local land use, shoulder type, vehicle type, control type, and collision type. Road type was the only significant factor in residential suburban areas of Mazandaran and Fars provinces, while time of day was the common variable in residential urban areas of all three provinces. It was concluded that combining the CART algorithm with oversampling and boosting increased the accuracy of the models. Identifying the influential factors in fatigue and drowsiness accidents in these three provinces could help improve engineering and executive measures and guide appropriate educational programs.
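To make the oversampling and tree-splitting steps more concrete, the following is a minimal sketch and not the study's implementation. It assumes plain random oversampling (the abstract does not say which oversampling technique was used) and evaluates a single Gini-based split of the kind CART performs at each node, simplified to "one category versus the rest". The clustering and boosting stages are omitted. The records, field names (`road_type`, `time_of_day`, `fatigue`), and function names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, label_key, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(rows, features, label_key):
    """Return (feature, value, weighted_gini) for the 'feature == value'
    split with the lowest weighted Gini impurity, or None if no split exists."""
    best = None
    n = len(rows)
    for feature in features:
        for value in sorted({row[feature] for row in rows}):
            left = [r[label_key] for r in rows if r[feature] == value]
            right = [r[label_key] for r in rows if r[feature] != value]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feature, value, score)
    return best


# Hypothetical, hand-made records: 2 fatigue crashes (1) versus 6 others (0).
records = [
    {"road_type": "highway", "time_of_day": "night", "fatigue": 1},
    {"road_type": "highway", "time_of_day": "day", "fatigue": 1},
    {"road_type": "street", "time_of_day": "day", "fatigue": 0},
    {"road_type": "street", "time_of_day": "day", "fatigue": 0},
    {"road_type": "street", "time_of_day": "day", "fatigue": 0},
    {"road_type": "street", "time_of_day": "night", "fatigue": 0},
    {"road_type": "street", "time_of_day": "night", "fatigue": 0},
    {"road_type": "street", "time_of_day": "night", "fatigue": 0},
]

balanced = random_oversample(records, "fatigue")
class_counts = Counter(row["fatigue"] for row in balanced)
split = best_binary_split(balanced, ["road_type", "time_of_day"], "fatigue")
print(class_counts)
print(split)
```

In this toy data the split on `road_type` separates the two classes perfectly, so it is preferred over `time_of_day`. A full CART grows the tree by repeating this search recursively on each resulting subset, and boosting would then combine many such trees.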


