Generalized Linear Multilevel Models for Ordinal Categorical Responses: Methods and Application to Medical Data

Statistical modeling of multilevel data has been in discussion for several years and many developments have been made in this aspect. However the field of multilevel modeling for discrete categorical responses is relatively new, with markedly few applications in the areas of ordinal categorical response modeling. Most of these applications are focused in the area of educational data. The basis of this paper is to explore the use of Generalized Linear Multilevel Models for modeling a multilevel ordinal categorical response, in the field of medicine, which is somewhat of a novel application, as these methods have seldom been utilized in modeling medical data. The application focuses on analysing the factors that affect the severity of respiratory infections diagnosed in family practice and is based on data collected at 13 family practices in Sri Lanka. The data consisted of individual patient records, clustered within the practices and thus required a multilevel modeling approach. The explanatory variables pertaining to this study were: Age, Gender, Duration and most prevailing Symptom of the patients, while the ordinal categorical response indicating the severity of the diagnosis made was termed Diagnosis. Two main approaches of the Generalized Linear Multilevel Model; namely the Proportional Odds Model and the Non-Proportional Odds Model have been applied to the data and the models compared using suitable diagnostic tests. The variables Symptom and Duration provided significant main effects while the Symptom-Gender interaction also proved to be significant. Based on the DIC diagnostic, the Non-Proportional Odds model proves to be the better of the two models. DOI: http://dx.doi.org/10.4038/sljastats.v12i0.4969 Sri Lankan Journal of Applied Statistics Vol.12 2011 pp.83-99


Introduction
Multilevel or Hierarchical data are a commonly encountered phenomenon, in many data structures, especially in the fields of Medical, Biological and Social Sciences. A hierarchical structure in data refers to two or more distinct levels of data within a data set. Statistical modeling of multilevel data has been in discussion for many years and many developments have been made in this aspect Aitkin et al. [1], Goldstein [2], Hedeker and Gibbons [3]. As most of the early developments are concentrated in the area of continuous response variables, the field of multilevel modeling for discrete categorical responses is a relatively new approach Goldstein [4], Rashbash et al. [5], Fielding and Yang [6].
The basis of this paper is the modeling of an ordinal categorical response in the presence of a multilevel data structure. Ordered categorical responses are often encountered, especially in social and medical data Kuruppumullage and Sooriyarachchi [7]. The standard procedure for dealing with ordered category scales was to assign a score to these categories and to treat it as a continuous variable. However there are many questionable aspects to this method Goldstein [8]. Multilevel models for ordered categorical responses, which is somewhat of a recent development in this field, tries to avoid the arbitrariness of assumptions involved when assigning these numerical scores through the use of Cumulative Response Probabilities in place of response probabilities for each category. Fielding et al. [9] presents an application of this method to an educational dataset. However this model is somewhat of a novel application in the field of medical data analysis as it has not yet been frequently utilized for the modeling of medical data. Thus the main focus of this paper is to present the application of Generalized Linear Multilevel Models (also referred to as Generalized Multilevel Ordinal Models), for analysing multilevel ordinal categorical responses, in the field of medicine.
The data used in this study deals with individual patient records, clustered within 13 family practices. The explanatory variables pertaining to this study are: Age, Gender, Duration and Symptoms of the patients, while the ordinal categorical response 'Diagnosis' indicates the severity of the diagnosis made.
This paper aims to compare two approaches of the Generalized Multilevel Ordinal Model (GMOM); namely the Proportional Odds Model and the Non-Proportional Odds Model. Section 2 presents a brief summary of relevant literature pertaining to GMOM followed by a description of the data used in Section 3. Section 4 will briefly explain the theory pertaining to the two modeling approaches mentioned above. Section 5 presents the application of the Proportional Odds and Non-Proportional Odds models for the data, along with the analysis of odds ratios and interpretation of results followed by a comparison of the two models. Finally section 6 concludes with a discussion of the outcome of the two models and their appropriateness in handling the data.

A Review of Literature
The birth of hierarchical data modeling can be traced back to the mid 1980's when many research breakthroughs in the field of statistical analysis of hierarchical data were made. The influential works of Aitkin et al. [10] and, Aitkin and Longford [1], initiated a succession of developments in the area of modeling hierarchical data, which in turn resulted in the development of a variety of techniques, and software.
Initial multilevel techniques were mainly confined to continuous response variables. However the early 1990's showed the extension of multilevel theory and its implementation in software, to ably handle different types of outcomes such as binary, nominal scale multicategorical and ordered categorical. Various estimation techniques were also proposed for handling models with discrete responses. Some marked methods among them were an improved 2nd order approximation proposed by Goldstein in 1995, Gauss-Hermite quadrature approximations to Maximum Likelihood, proposed by Hedeker and Gibbons [3] and Pinheiro and Bates [11] (1995) and a higher order Laplace transformation proposed by Raudenbush et al. [12]. These developments together with the development of more sophisticated software such as MLwiN and STATA have brought multilevel modeling to new heights of application.

Generalized Multilevel Ordinal Models
The basic multilevel ordinal model based on generalized linear models uses the cumulative probabilities of response categories as the dependent variables. The development of these models and a comparison to the normal model for examination grades is presented by Fielding et al.(2003). Initial works on these models were presented by Hedeker and Gibbons [3] and Fielding [13]. Popularly referred to as the Random Effects Ordinal Regression Model, these models were based on the concept of a latent variable, which was arbitrarily scaled. This variable was assumed to underlie the response categories and vary continuously along the real line. Fielding et al. [9] bases the distribution of the latent variable to be governed by a multilevel linear model, instead of an arbitrary distribution.

The Data
The data for this study was provided by the "Primary Care Respiratory Group, Sri Lanka", a member of the International Primary Care Respiratory Group-UK (International Primary Care Respiratory Group (IPCRG) Member Associations, [14]. Thirteen different family physicians participated in this study, where each physician collected the relevant data during a stipulated time period (3 months). The data set consists of 7 variables spread across two main levels. The 2 nd level unit of the dataset can be identified as the practice of each physician, while the 1 st level unit comprises of individual patients. The level two variables comprise of the qualification of the physician (qualification with regard to family medicine) and the number of years in service. Level 1 variables comprise of the patients age (in years), gender, most prevalent symptom, duration of the symptom (in days) and the severity of the disease diagnosed by the physician. The age of patients, most prevalent symptom and the duration were categorized in to an appropriate number of categories for modeling purposes. The categorization was done on a logical basis and on medical grounds. The response variable of interest is the severity of the respiratory infection diagnosed by the physician at the end of the examination. Though the initial data classified symptoms and diagnosis according to the ICHPPC (International Classification of Health Problems in Primary Care) classification, the data has been re-categorized on a medical basis to suit the statistical modeling procedures. 2966 patient's records were used for the study. Table 1 gives the potential analytical factors, their levels and the base categories considered in modeling.

Generalized Linear Multilevel Model: Theory and Extensions
This section will briefly present the theory, on which multilevel models for ordinal categorical responses are built. Since the multilevel model is an extension of the single level model, it may be of relevance to first present the single level model for ordinal responses. Rashbash et al. [5] clearly presents the theory for Generalized Multilevel Ordinal models as follows.

Single-Level Model
Consider the response variable has t categories indexed by s(s=1,2,...t) and that category t is chosen as the reference category. Also let us consider that the probability of subject i having a response variable value of s is ) ( s i π . In order to exploit the ordering of the categories the model uses cumulative probabilities in place of the response probabilities for each category. The cumulative response probabilities are as follows.
) ( s i y represents the observed cumulative proportions for the ith subject.
Expressing the category probabilities in terms of the cumulative probabilities, The proportional odds model with a logit-link is the most frequently used model in these cases.
Assuming an underlying multinomial distribution for the category probabilities, the covariance matrix of the cumulative proportions takes the form,

Two-Level Generalized Multilevel Ordinal Model
The following model extends the single-level model to two levels, with j representing the higher level and i representing the lower level. The following explanation is provided in Fielding et al. [9]. The following represents the Generalized Multilevel Ordinal Proportional Odds Model for a single random effect and no covariates or factors.
It is assumed that the random effect ) The above model can be extended to fit fixed/random covariates/factors and several random components as follows.
Generally Z variables are a subset of X variables. β is a vector of fixed effects coefficients associated with the covariates/factors included in ij X .The elements of T j u are random variables at the 2nd level and are assumed to have a dependent multivariate normal distribution with a zero expectation Fielding et al., [9]. Also considering that the probability of The theory explained above relates to the most basic multilevel model specified for ordinal responses. Many different extensions to this basic model have been proposed by Fielding et al. [9], Hedeker and Gibbons [3], Raman and Hedeker [15] etc. An important extension of the Proportional odds model is the Generalized Multilevel Ordinal Non-Proportional Odds Model, which is widely used in situations where there is reasonable doubt to suggest that the effect of certain variables do not behave proportionally across response categories. In such situations the cumulative proportions will be modeled as follows.
Here ij t refers to the variables suspected to vary non-proportionally across logits. The terms ) ( s ω depict the estimated coefficients that vary across the logits. Fielding et al. [13] includes a detailed explanation regarding these models and their application to educational data.

Model Fitting and Interpretation
The software used for modeling of the data is MLwiN version 2.19. MLwiN is software that specializes in multilevel modeling techniques. It is geared to handle several different types of responses spanning from continuous to binary to categorical. Hence the software provides the functionality to model the dataset under consideration for this study.

5.1.
Variable Selection and Estimation Techniques The MLwiN software was used to fit both proportional and nonproportional GMOM and the Markov Chain Monte Carlo (MCMC) estimation procedure was used for obtaining parameter estimates Browne [16]. Variable selection was done using Wald tests. The choice of MCMC as the estimation technique over the more commonly used PQL(Penalized Quazi-Likelihood) and MQL (Marginal Quazi-Likelihood) techniques was to avoid the linearization of the logit functions, which tend to provide unreliable results, which would render the use of the Wald Test for assessing the significance of parameters unsuitable Omar and Thompson [17]. The use of MCMC estimation overcomes these problems and thus allows the Wald Test to be effectively used for variable selection. Hence a forward selection methodology based on then Wald test was used for the selection of variables under both modeling approaches.
The response category having 3 levels (Mild, Moderate and Severe), both the Proportional and Non-Proportional Odds models fit two logits, for the <=Mild and <=Moderate categories. Let refer to the number of response in category i (i=1 for <=Mild,i=2 for <=Moderate) made by the jth patient clustered within practice k. For this study, = 1 or 0 for all patients, since each patient only gives a single response. Thus the cumulative probabilities of the three response categories are indicated as, -Probability of patient j in practice k being diagnosed with a 'mild' condition.
-Probability of patient j in practice k being diagnosed with a 'mild or moderate' condition.
-Probability of patient j in practice k being diagnosed with a 'mild, moderate or severe' condition. (Hence =1)

Generalized Ordinal Proportional Odds Model
Of the 7 variables used in the study, only Symptom, Duration and Gender proved to be significant in the final model. The variable Gender, though not significant as a main effect, showed a significant interaction with Symptom at a 10% level of significance. The formulation of the fitted model is given in equation (9 Table 2 shows the results of the final model fitted. In table 2 the random component having the value of 1.595 indicates the between-practice variance component, and the 95% Bayesian confidence interval (0.601, 3.907) clearly indicates that this between-practice variance is significant. Table 3 presents the odds ratios and their 95% confidence intervals for each of the fitted effects. The odds are the same for both logits. Accordingly it is clear that males showing Not Frequently Encountered symptoms were more than twice more likely to be diagnosed with a less serious disease as opposed to a more serious disease (when considering both logits), than those showing Frequently Encountered symptoms while females show more than 3 times this odds. With regard to duration, patients showing symptoms for a long duration (more than 5 days) were more likely to be diagnosed with a more serious disease as opposed to a less serious disease, compared to patients showing symptoms for a shorter duration (less than or equal to 5 days). Though the 95% confidence interval depicts no significant difference in the effect of females and males with regard to the severity of the diagnosis for both symptom categories, the 90% confidence interval indicates that there is some evidence to suggest that a significant difference between females and males exist, with regard to the severity of the diagnosis in the presence of Not Frequently Encountered symptoms.

Generalized Ordinal Non-Proportional Odds Model
The Generalized Ordinal Non-Proportional Odds Model is used in situations where there is reasonable evidence to suggest that at least one explonatory variable does not behave proportionally across the logits (Fielding et al. [9]. Such behaviour can be easily detected by simple graphical and descriptive procedures. In the present study, preliminary analysis presented tentative evidence for the non proportional behavior of the variable 'Symptom' across the logits. Thus it will be of interest to fit a Non-Proportional Odds Model to the data and compare the results with the model fitted in section 5.2. The formulation of this model is given in equation 10 Table 4 presents the results of the Generalized Ordinal Non-Proportional Odds Model fitted to the data. As in section 5.2, Symptom, Duration and Gender will be fitted as main effects together with the Symptom-Gender interaction. Since the main effect Symptom is fitted with separate coefficients for the two logits the Symptom-Gender interaction is also fitted seperately across the two logits.  In table 4 the between-practice variance is indicated as 1.436 and the 95% Bayesian confidence interval (0.585, 3.306) clearly indicates that this between-practice variance is significant. Table 5 presents the odds ratios and their 95% confidence intervals for each of the main effects and interaction terms. Two odds ratios each will be calculated for the variable Symptom and the Symptom-Gender interaction since separate coefficients are fitted for each. The Non-Proportional Odds model being usually more complex, the odds ratios calculated above should be interpreted with care. According to the results presented above, males showing Not Frequently Encountered symptoms are nearly 5 times more likely to be diagnosed with a mild or moderate disease as opposed to males showing Frequently Encountered Symptoms, while females show more than 10 times these odds. The significant difference between males and females with respect to the effect of symptom on the severity of the diagnosis is noticeable. The corresponding odds of males with Not Frequently Encountered symptoms being diagnosed with a Mild condition as opposed to a Moderate or Severe condition is only approximately twice as those showing Frequently Encountered symptoms while in females it is nearly three times. The effect Duration shows a similar behavior to the Proportional Odds model while Gender with regard to the severity of the diagnosis shows no significant difference for both Symptom categories at both 5% and 10% levels of significance. On an overall basis, both models clearly indicate that Frequently Encountered symptoms may lead to more serious respiratory conditions while symptoms lasting for over 5 days could also lead to the same.

Model Comparison
Having fitted both the proportional odds and non-proportional odds GMOM, it will be of interest to compare these two models. The MCMC estimation technique provides a diagnostic named Deviance Information Criteria (DIC), which is a generalization of the Akaike's Information Criterion (AIC) Spiegelhalter et al. [18] 2002). Table 6 presents DIC values for the two models fitted.

Discussion
Multilevel models can be identified as very flexible models that allow the easy inclusion of factors and covariates at each level of the data hierarchy. Additionally it also provides the ability to include interactions between variables. Although multilevel data structures are commonly encountered in social and medical research, the modeling of such data has been confined mostly to continuous responses. Hence, modeling multilevel data in the presence of an ordinal categorical response especially for medical data is somewhat of a novel application. A complete analysis of the data was carried out by fitting a GMOM using the accepted methodology. The modeling technique used in this study though used extensively in educational data modeling, has not been widely used for modeling medical data with hierarchical structures. Hence the contribution made in this paper would be invaluable for medical researchers and statisticians, alike. MLwiN version 2.19 was used to fit both proportional and non-proportional GMOM and the MCMC estimation procedure was used for obtaining parameter estimates. Both Proportional Odds and Non-Proportional Odds models were fitted and the results were presented in Section 5.
The variables selected for the final model were Symptom, Duration and Gender. The main effects of both Symptom and Duration were highly significant in both the models. The main effect of Gender though not significant, the interaction between Gender and Symptom proved to be significant in the Proportional Odds model at a 10% level of significance. However both the main effect Gender and the interaction between Gender and Symptom were insignificant in the Non-Proportional Odds model. The interaction effect was however maintained in both models for comparison purpose. Thus on an overall basis it can be concluded that of the 7 explanatory variables considered only the Symptom shown by patients, the Duration of the symptoms and the Gender of the patient for different levels of Symptom significantly influenced the severity of the respiratory infection diagnosed.
Section 5 concludes with the presentation of the odds ratios calculated under both models and a brief comparison of the two models based on the DIC diagnostic. It will be of interest at this point to compare the results of the odds ratios for the two models. Odds ratios require careful interpretation as they are often quite difficult to interpret. However the odds ratios can be subjectively compared to get a basic idea of the behaviour of variables. The variable Symptom shows the same pattern over both the models with both male and female patients showing Not Frequently Encountered symptoms having a higher chance of being diagnosed with a low severity disease as opposed to a more severe disease. However the odds ratios produced by the non-proportional odds model, clearly indicates that the effect of the variable Symptom behaves non-proportionally across the two logits. It should also be noted that while showing the same overall pattern, the above mentioned odds show a considerable variation between males and females. The effect of Duration in both the models also shows the same pattern with patients showing symptoms for over 5 days being less prone to be diagnosed with a low severity disease. Though the 95% confidence intervals for the odds ratios for Gender in both models show an insignificant effect, the 90% confidence interval for the proportional odds model suggests that the odds of females showing Not Frequently Encountered symptoms, being diagnosed with a mild/moderate condition as opposed to a severe condition is significantly higher than the odds for males.
The model comparison phase indicates the Non-Proportional Model is to be preferred over the Proportional Odds model. However there is tradeoff between accuracy and simplicity of interpretation when it comes to selecting the best model. It should be noted that the Non-Proportional Odds model while providing more accurate results is much more difficult to interpret compared to the Proportional Odds model.