Modelling of Road Traffic Accidents: A Multi-state Markov Approach

A myriad of statistical techniques have been used to analyze road traffic accident data for causes, and the results have served as a guide for policies aimed at achieving safer roads. However, relatively little is known about the progression and survival probabilities of road traffic accident victims admitted for treatment in health care institutions. Since the primary goal of safer roads is to save lives, this work takes that goal one step further by analyzing the event history of road accident victims using a multi-state Markov model. Data on road traffic accident victims in 2014 at Akure, Nigeria were collected from both the Federal Road Safety Corps of Nigeria (FRSCN) and the State Specialist Hospital, Akure. Applying the multi-state model, it was found that progression to the injury state is 5 times more likely than death, and that injured victims are 6% more likely to recover from the injury than to die. The transition probabilities that a victim will die 1, 7 and 14 days after the accident were obtained as 0.08, 0.39 and 0.61 respectively. Based on this, it is concluded that efforts towards achieving the UN Decade of Action targets should be intensified. The post-accident treatment of road accident victims should also be improved, as victims become more likely to die the longer they stay in the hospital.


Introduction
In 2010, the United Nations General Assembly declared the period 2011-2020 the Decade of Action for Road Safety, with the intention of stabilizing the forecast level of road traffic fatalities around the world and then reducing it to the barest minimum (Waeg, 2010). Most agencies responsible for ensuring safer roads have devised various ways of preventing road traffic accidents. In Nigeria, for example, stringent penalties are imposed on motorists who break traffic laws, and large numbers of road safety officials are deployed on motorways during periods of high traffic intensity such as festive periods. A good number of studies have investigated and forecast road traffic accidents on the basis of various causes. While some researchers are interested in devising and examining vehicle systems that may reduce vehicular crashes (Aga & Okada, 2003), others investigate dangerous behaviors, conditions and factors that may lead to road traffic accidents, and ways of ameliorating them.
From the methodological perspective, almost all of the statistical tools utilized aim to understand the causes of road traffic accidents and so achieve safer roads. Predominant among these tools are time series, regression and correlation. Lassarre (1986) successfully employed multivariate time series approaches in developing a predictive model of the severity of accidents. Jones and Whitfield (1988) modeled severity risk as a function of anthropometric measures, car mass, restraint system use and age of driver using logistic regression. Logistic regression was also utilized by Lui, McGee, Rhodes, and Pallock (1988) in modelling the probability of fatality conditional on the occurrence of an accident. Other studies have utilized bivariate probit analysis (Hutchinson, 1986), discriminant analysis (Shao, 1987), double-pair comparison (Evans, 1990), and a sequential estimation procedure using a simple multinomial model (Shankar, Mannering, & Barfield, 1996). Golob and Recker (2003) studied the relationships among accident types, flow of traffic, weather and lighting conditions using linear and nonlinear multivariate statistical analysis. Kadilar (2014) examined the factors that appear to have potential for serious injury or death of drivers in traffic accidents using conditional logistic regression analysis. Given the rich literature in accident data research, especially on the development of methods for real-time prediction as a function of current traffic and roadway conditions, a systematic review and meta-analysis was carried out by Roshandel, Zheng, and Washington (2015).
Despite the numerous works on road traffic accidents, there has not been serious consideration of the survivability of road traffic accident victims. Amongst the few authors who have investigated the survivability of road accident victims is Sevitt (1973). A multi-state Markov model considers several discernible states of a process. It is a modelling framework that allows one to compute transition probabilities by integrating certain functions of the transition intensities, to obtain transition rates and the mean sojourn time (mean time spent in each state), and to assess the dependence of the transition intensities on certain covariates. In spite of the many advantages of multi-state models, they are applied less often than classical survival analysis. Meira-Machado, de Uña-Álvarez, Cadarso-Suárez, and Andersen (2009) give the reasons for this as daunting mathematical theory and lack of available software. As a multi-state process evolves over time, a history is naturally generated. This history contains information on the previous states visited, the times of entry into those states and the length of stay in each state. The simplest and most widely applied multi-state model assumes that the transition to a future state depends only on the present state (the Markov property) and that the transition intensities are constant over time (i.e. time-homogeneous transition rates). These transition intensities provide the hazards for movement from one state to another and can also be used to determine the mean sojourn time in a given state. Covariates may be incorporated in the model through the transition intensities to explain differences among individuals in the course of treatment. The complexity of a multi-state model depends greatly on the number of states and on the possible transitions (Zhao, 2009). A general model for accident data can be constructed in such a way that several intermediate states, such as several stages of treatment in the hospital, are incorporated into the model.
There are other multi-state models which are less restrictive but more difficult to implement. The best data source for such a comprehensive analysis consists of event histories, i.e. records that contain the dates of all events of interest that happen to the victims. Such data are expensive to collect but, as suggested by Ogurtsova (2014), other data sources such as panel data or cross-sectional data can be relied upon. This work considers a simple multi-state structure based on three states: an accident victim is either healthy or injured, with death as the third and only absorbing state. The study follows the progression of road accident victims from the time of the accident until they recover or experience the final critical event, death, so as to estimate model parameters that provide insight into the dynamics of the process. These parameters include the transition intensities and transition probabilities of events, the hazard rates of death after the accident and of death during injury-based treatment, and the mean sojourn time in the injury state. The effect of covariates on the model parameters is studied and the goodness-of-fit of the model is considered.

Methodology
A multi-state process is a stochastic process $(X(t), t \in T)$ with finite state space $\mathcal{S} = \{1, 2, 3, \dots, N\}$, where $T = [0, \tau]$ is the period of observation (Meira-Machado et al., 2009). As the process evolves over time, a history $\mathcal{H}_{t^-}$, consisting of the observation of the process over the interval $[0, t)$, is generated. $\mathcal{H}_{t^-}$ can be explained as the history of the process just before time $t$. As mentioned previously, this history contains information on previous states visited, length of stay in the current and previous states, time of transition into previous states and other related information. The two quantities which completely characterize a multi-state process are the transition probabilities and the transition intensities. The transition probabilities may be defined by

$$p_{ij}(s, t) = P\left(X(t) = j \mid X(s) = i, \mathcal{H}_{s^-}\right), \quad s \le t, \tag{1}$$

which can be interpreted as the probability of being in state $j$ at time $t$, conditional on being in state $i$ at time $s$ and on the history of the process prior to time $s$. The transition probabilities in equation (1) form a transition probability matrix

$$P(s, t) = \left[p_{ij}(s, t)\right]_{i, j \in \mathcal{S}}, \tag{2}$$

whose rows sum to unity. The transition intensity

$$q_{ij}(t) = \lim_{\Delta t \to 0} \frac{P\left(X(t + \Delta t) = j \mid X(t) = i, \mathcal{H}_{t^-}\right)}{\Delta t}, \quad i \ne j, \tag{3}$$

is the instantaneous rate or hazard of making a transition from state $i$ to state $j$ at time $t$. Thus, the transition intensity matrix is defined as

$$Q(t) = \left[q_{ij}(t)\right]_{i, j \in \mathcal{S}}, \tag{4}$$

with the property that $q_{ii}(t) = -\sum_{j \ne i} q_{ij}(t)$ for all $i \in \mathcal{S}$.

Some characteristics of the victim, such as age group and gender, are of particular interest in this study. One may wish to know how susceptible each age group or gender is to road traffic accidents. These explanatory variables are included at each level of the model through a generalized regression as implemented by Jackson, Sharples, Thompson, Duffy, and Couto (2003). A proportional hazards model described by Marshall and Jones (1995), in which the transition intensity matrix elements $q_{ij}(t)$ are related to the covariates $z(t)$ at time $t$ using

$$q_{ij}(z(t)) = q_{ij}^{(0)} \exp\left(\beta_{ij}^{\top} z(t)\right), \tag{5}$$

can be used to incorporate the covariates into each level of the model.
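The row-sum property of the intensity matrix and the proportional hazards scaling of the off-diagonal intensities can be illustrated with a minimal Python sketch. All numbers here are hypothetical and chosen only for illustration: the baseline intensities, the coefficients in `beta`, and the single 0/1 covariate (for example, an adult indicator) are assumptions, not estimates from this study, and the function names are invented for this sketch.

```python
import math

def make_intensity_matrix(off_diag):
    """Return Q with each diagonal set to minus the sum of the other row entries."""
    n = len(off_diag)
    q = [row[:] for row in off_diag]
    for i in range(n):
        q[i][i] = -sum(q[i][j] for j in range(n) if j != i)
    return q

def apply_covariates(baseline, beta, z):
    """Scale each off-diagonal baseline intensity by exp(beta_ij . z)."""
    n = len(baseline)
    scaled = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and baseline[i][j] > 0:
                linear_predictor = sum(b * x for b, x in zip(beta[i][j], z))
                scaled[i][j] = baseline[i][j] * math.exp(linear_predictor)
    return make_intensity_matrix(scaled)

# Hypothetical baseline intensities per day; states are ordered
# healthy, injured, dead (indices 0, 1, 2). Death is absorbing.
baseline = [
    [0.0, 0.10, 0.02],
    [0.06, 0.0, 0.06],
    [0.0, 0.0, 0.0],
]

# Hypothetical log hazard ratios, one coefficient per transition
# for a single 0/1 covariate.
beta = [
    [[0.0], [0.5], [0.2]],
    [[-0.3], [0.0], [0.4]],
    [[0.0], [0.0], [0.0]],
]

q_reference = apply_covariates(baseline, beta, [0.0])  # covariate = 0
q_adult = apply_covariates(baseline, beta, [1.0])      # covariate = 1
```

With the covariate set to zero the baseline intensities are recovered unchanged, and for any covariate value each row of the resulting matrix sums to zero.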

Maximum Likelihood
To simultaneously estimate the progression rates and the effect of mismeasurement and/or covariates, the method of maximum likelihood may be employed. Generally, in a road traffic accident model, it is assumed that the transition times between states and the time of death of the victim are known, though the state of the victim before death may not be known, depending on the model being considered. Suppose the transition at time $t_{j+1}$ is to the absorbing state, i.e. death, say $S$, and the state occupied just before death is unknown, the last observed state being $X(t_j)$. Then the contribution to the likelihood according to Jackson (2011) is given by

$$L_j = \sum_{m \ne S} p_{X(t_j)\, m}(t_{j+1} - t_j)\, q_{m S}, \tag{6}$$

where the sum is over all possible states $m$ which can be visited between $X(t_j)$ and $S$. Equation (6) contains transition probabilities, which can be obtained using (Cox & Miller, 1965)

$$P(t) = \exp(tQ), \tag{7}$$

where $P(t)$ and $Q$ are defined in equations (2) and (4) respectively, with $Q$ constant over time. If the exact transition times between the states are $(t_1, t_2, \dots, t_n)$, with no transition between these times, the contribution to the likelihood is given by

$$L_j = \exp\left(q_{X(t_j) X(t_j)}\,(t_{j+1} - t_j)\right) q_{X(t_j) X(t_{j+1})}. \tag{8}$$

Substituting equation (5) for the intensities in equation (8) gives the maximum likelihood estimates of the effect of the covariates.
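For a time-homogeneous model, the transition probability matrix over an interval of length t is the matrix exponential of t times the intensity matrix. The following is a minimal pure-Python sketch of that computation, using scaling and squaring with a truncated Taylor series. The intensity matrix `Q_EXAMPLE` is hypothetical and is not the matrix estimated in this study; in practice a library routine for the matrix exponential would normally be preferred.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, squarings=10, terms=20):
    """Approximate exp(m) by scaling and squaring with a Taylor series."""
    n = len(m)
    scale = 2 ** squarings
    small = [[x / scale for x in row] for row in m]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes small^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def transition_probabilities(q, t):
    """Return P(t) = exp(t * Q) for a constant intensity matrix Q."""
    return mat_exp([[x * t for x in row] for row in q])

# Hypothetical intensities per day; states ordered healthy, injured, dead.
Q_EXAMPLE = [
    [-0.12, 0.10, 0.02],
    [0.06, -0.12, 0.06],
    [0.0, 0.0, 0.0],
]

P1 = transition_probabilities(Q_EXAMPLE, 1.0)
P7 = transition_probabilities(Q_EXAMPLE, 7.0)
P14 = transition_probabilities(Q_EXAMPLE, 14.0)
```

Each row of the result sums to one, the row for the absorbing state stays at (0, 0, 1), and the probability of having reached the absorbing state grows with t.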
A set of initial values is required, since the maximum likelihood estimates based on equation (8) above can only be obtained using numerical methods. Following the method described by Jackson (2011), a crude initial estimate can be obtained using

$$\hat{q}_{ij} = \frac{n_{ij}}{T_i}, \tag{9}$$

where $n_{ij}$ is the number of observed transitions from state $i$ to state $j$ and $T_i$ is the number of days spent in state $i$.
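The crude initial estimate (observed transitions divided by days spent in the originating state) can be sketched in Python as follows. The transition counts are those reported for this data set (891 healthy to injured, 185 healthy to dead, 458 injured to healthy, 433 injured to dead). The totals in `days_in_state` are not given in the text and are purely hypothetical placeholders, as is the function name.

```python
def crude_intensities(counts, days_in_state):
    """Crude estimate q_ij = n_ij / T_i, with each diagonal set so rows sum to zero."""
    n = len(counts)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if days_in_state[i] > 0:
            for j in range(n):
                if i != j:
                    q[i][j] = counts[i][j] / days_in_state[i]
        # The diagonal entry is still 0.0 here, so this is minus the off-diagonal sum.
        q[i][i] = -sum(q[i])
    return q

# Observed transitions; states ordered healthy, injured, dead.
counts = [
    [0, 891, 185],
    [458, 0, 433],
    [0, 0, 0],
]

# HYPOTHETICAL total days spent in each state (not taken from the study).
days_in_state = [4000.0, 7000.0, 0.0]

q_init = crude_intensities(counts, days_in_state)

# These ratios depend only on the counts, not on the assumed days.
injury_vs_death = q_init[0][1] / q_init[0][2]
recovery_excess_pct = 100 * (q_init[1][0] - q_init[1][2]) / q_init[1][2]
```

Because both intensities in a row share the same denominator, the two ratios are unaffected by the assumed days; they come out close to the roughly five-fold and 6% figures quoted in the abstract.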

Application to road traffic accident data
The multi-state model is applied to data obtained from records of road traffic accident victims in Akure, Nigeria from January to December 2014. The data were obtained from the records of both the Federal Road Safety Corps of Nigeria (FRSCN) and the State Specialist Hospital, Akure, to which accident victims are referred. The FRSCN records contain the dates of the accidents, the number of people involved, the number that died and the number that were injured. The records of the injured victims were traced to the State Specialist Hospital, where the numbers of victims that eventually died and that eventually recovered after treatment were obtained. Victims who were injured but considered their injuries minor and left are treated as unaffected by the accident. The demographic information collected about the victims, namely their gender and age group, is used as covariates. Age group is classified as above or below 18 years, referred to in this work as adult and non-adult respectively. The records provide the time of arrival and the time of departure. Arrivals refer to admissions only, while departures refer to discharges, which include recovery and death during treatment. The difference between these two times gives the length of stay in the hospital before eventual death or full recovery for each individual, with time measured in days. Thus, the model considered is a three-state model with states 1, 2 and 3 referred to respectively as healthy, injury and death, as shown in Fig. 2 below.
The data contain 2077 individuals who were completely monitored from the time the accident occurred until they were discharged. The observed state transitions are shown in Table 1, which can be regarded as a "snapshot" of the process. The table indicates that 891 individuals were injured, 185 individuals died without treatment and 1001 individuals who were unaffected after the accident are classed as healthy. Of the injured individuals, 458 recovered while 433 died while being treated. Individuals who were injured before or after the observation period are not included.

Results and Discussion
The transition intensity matrix is calculated by assuming exact transition times. An initial estimate of the transition rate matrix is used to compute the maximum likelihood estimates of $q_{ij}$, where $i \ne j$ and $i, j = 1, 2, 3$, which are given in Table 2. From the table, progression to the injury state is 5 times more likely than death (that is, $q_{12}/q_{13}$). This implies that for every 5 people who sustain injury in an accident, one person is likely to die. In other words, a high percentage of the accidents that occurred during the period under study were fatal. Also, victims are 6% more likely to recover from the injury sustained than to die from it (that is, $100(q_{21} - q_{23})/q_{23}$). Injured victims spend an average of 8 days (that is, $-1/q_{22}$) in the hospital before discharge or death.

Sometimes, accident victims may begin to feel the effect of the crash only a few days after the accident. A suitable tool for measuring a probable change of state by a victim after some days is the transition probability, which gives the probability that a victim in state $i$ will be in state $j$ after a given time. Table 3 shows the transition probabilities after 1, 7 and 14 days. From the table, the probabilities of transition from state 1 to state 3 after 1, 7 and 14 days are 0.08, 0.39 and 0.61 respectively. This implies that an accident victim has a probability of 0.08 of being dead a day after the accident, 0.39 a week after and 0.61 after two weeks. The probabilities that the victim is classified as injured after 1, 7 and 14 days are 0.31, 0.52 and 0.34 respectively. This implies that the probability that the effect of the crash is established a day after the accident is 0.31, rising to 0.52 a week after and falling to 0.34 after two weeks. After two weeks, the probability that a victim is confirmed dead is greater than both the probability that the victim is healthy and the probability that the victim is established as injured.
The transition intensities with the effect of the covariates are shown in Table 4. There are increases of 124% and 178% in the transition intensity $q_{12}$ (from the healthy to the injury state) when age-group and gender respectively are included as covariates. Also, there is an increase of 1586% in the transition intensity $q_{23}$ when age-group is included as a covariate, and an increase of 1693% when gender is included. This shows that the two covariates have a considerable impact on the model. The observed and expected numbers of victims in each state over time are compared in fig. 1 below. This is a rough indication of the goodness of fit of the model. From the plot, it is clear that the model overestimates the number of victims in state 1 from about day 2 onwards, though the fit improves as the days increase. The model also underestimates the number of victims in state 2 from about day 2 to day 10. The expected number of victims that died is almost correctly estimated by the model until about day 10, when the model begins to underestimate it.

Conclusion
An investigation has been carried out into the progression of accident victims from the moment the accident occurred to the time they fully recovered or died, using data collected from 2077 road accident victims in 2014 at Akure, Nigeria. The data contained 891 injured individuals, 185 who died without treatment and 1001 who were seemingly unaffected by the accident. Of the 891 injured, 433 died while being treated and 458 eventually recovered. Using the multi-state Markov model approach, it was observed that the probability of a victim progressing from the injury state to the dead state increases as the days progress. This indicates that more effort should be put into post-accident care. The estimates also show that a large share of the accidents that occurred in the period under consideration were fatal; one out of every 6 victims died. In order to achieve the targets of the Decade of Action, several preventive measures should be put in place to reduce the incidence of road traffic accidents.

Table column headings: Estimates (No Covariates); Estimates (Age-Group as Covariate); Estimates (Gender as Covariate)