Rate of Decay of the Tail Dependence Coefficient for the Skew t Distribution

We examine the rate of decay to zero of the tail dependence coefficient of the bivariate skew t distribution obtained via a normal variance-mean mixture, in a case where there is no asymptotic tail dependence. Our result helps to explain the difference in performance in model fitting between this skew t distribution and the one based on variance mixing of the bivariate skew-normal distribution. The latter skew t distribution always displays asymptotic tail dependence, as happens in the symmetric case which is common to both models.

DOI: http://dx.doi.org/10.4038/sljastats.v12i0.4966
Sri Lankan Journal of Applied Statistics, Vol. 12 (2011), pp. 27-40


Introduction
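The skew t distribution, introduced in Demarta and McNeil [1], can be defined as the distribution of a random vector $X = (X_1, X_2)^T$ given by the normal variance-mean mixture

$$X = \theta V^{-1} + V^{-1/2} Z, \tag{1}$$

where $Z \sim N_2(0, R)$, a bivariate normal vector with mean $0$ and correlation matrix $R = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$, is distributed independently of $V \sim \Gamma(\eta/2, \eta/2)$ with $\eta > 0$, i.e. $V$ has a gamma distribution with density

$$f_V(y) = \frac{(\eta/2)^{\eta/2}}{\Gamma(\eta/2)}\, y^{\eta/2 - 1} e^{-\eta y/2}, \quad y > 0; \qquad f_V(y) = 0 \ \text{otherwise}. \tag{2}$$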
The shape parameter $\eta$ is usually termed the degrees of freedom of the t distribution. The vector $\theta = (\theta_1, \theta_2)^T \in \mathbb{R}^2$ controls the asymmetry of the distribution, and the symmetric case is obtained when $\theta = 0$.
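The mixture construction can be illustrated with a short simulation. The following is a minimal sketch, assuming the mixture (1) takes the form $X = \theta V^{-1} + V^{-1/2} Z$ with $V \sim \Gamma(\eta/2, \eta/2)$ in the shape-rate parametrisation; the function name `sample_skew_t` and its arguments are illustrative and do not come from the paper.

```python
import math
import random


def sample_skew_t(n, theta, rho, eta, seed=0):
    """Draw n pairs from the assumed mixture X = theta / V + Z / sqrt(V)."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    out = []
    for _ in range(n):
        # V ~ Gamma(shape = eta/2, rate = eta/2).
        # random.gammavariate takes a SCALE parameter, hence 2/eta.
        v = rng.gammavariate(eta / 2.0, 2.0 / eta)
        # Correlated standard normals from the Cholesky factor of R.
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1, z2 = g1, rho * g1 + s * g2
        root = math.sqrt(v)
        out.append((theta[0] / v + z1 / root, theta[1] / v + z2 / root))
    return out
```

Under this assumed form, $E[V^{-1}] = \eta/(\eta - 2)$ for $\eta > 2$, so the sample mean of each component should be close to $\theta_i\,\eta/(\eta - 2)$, which gives a quick sanity check on the sampler.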
The coefficient of lower tail dependence of a (bivariate) random vector $X = (X_1, X_2)^T$ with marginal inverse distribution functions $F_1^{-1}$ and $F_2^{-1}$ is defined as

$$\lambda_L = \lim_{u \to 0^+} \lambda_L(u) \tag{3}$$

if this limit exists, where

$$\lambda_L(u) = P\big(X_1 \le F_1^{-1}(u) \mid X_2 \le F_2^{-1}(u)\big). \tag{4}$$

The coefficient of asymptotic upper tail dependence of a random vector $X$ can be defined similarly as

$$\lambda_U = \lim_{u \to 1^-} P\big(X_1 > F_1^{-1}(u) \mid X_2 > F_2^{-1}(u)\big). \tag{5}$$

In this note, we will mainly focus on the former dependence coefficient (3). $X$ is said to have asymptotic lower tail dependence if the limit $\lambda_L$ exists and is positive. If $\lambda_L = 0$, then $X$ is said to be asymptotically independent in the lower tail. This quantity provides insight into the tendency of the distribution to generate joint extreme low values of $X_1, X_2$, since it measures the strength of dependence (or association) in the lower tails of a bivariate distribution.

If the marginal distributions of these random variables are continuous, then from (4) it follows that $\lambda_L(u)$ can be expressed in terms of the copula of $X$, $C(u_1, u_2)$, as

$$\lambda_L(u) = \frac{P\big(X_1 \le F_1^{-1}(u),\, X_2 \le F_2^{-1}(u)\big)}{P\big(X_2 \le F_2^{-1}(u)\big)} = \frac{C(u, u)}{u}. \tag{6}$$

The quantity $\lambda_L(u)$, and the limit $\lambda_L$ if it exists, are thus invariant under strictly increasing transformations of the marginal random variables, which is a necessary property of any measure of association. Nelsen [2] is a now-standard introduction to copulas.
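The copula form $C(u, u)/u$ suggests a simple rank-based estimate of $\lambda_L(u)$ from data. The sketch below is one possible empirical version, assuming continuous margins (no ties); the function name `empirical_lambda_lower` is hypothetical and the estimator is an illustration, not the procedure used in the paper.

```python
def empirical_lambda_lower(pairs, u):
    """Estimate lambda_L(u) = C(u, u) / u from a sample of (x1, x2) pairs."""
    n = len(pairs)
    k = max(1, int(u * n))  # number of points in each marginal lower tail
    # Empirical u-quantiles of each margin (k-th order statistics).
    q1 = sorted(x for x, _ in pairs)[k - 1]
    q2 = sorted(y for _, y in pairs)[k - 1]
    # Count points falling in the lower rectangle with apex (q1, q2).
    joint = sum(1 for x, y in pairs if x <= q1 and y <= q2)
    return joint / k
```

Because only the ranks within each margin matter, the estimate is unchanged by strictly increasing transformations of either coordinate, mirroring the invariance property noted above.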
The tail dependence coefficient for the symmetric t distribution, i.e. the case $\theta = 0$ in (1), was first found by Embrechts, McNeil and Straumann [4] and reproduced in Demarta and McNeil [1].
As noted by Banachewicz and van der Vaart [3], the introduction of skewness to the symmetric t distribution as in (1) leads to trivial values (0 or 1) of the limit in (3) in most cases. This was used in Fung and Seneta [5] to explain the difference in performance (in model fitting to simulated and real data) from another skew t distribution. This alternative bivariate skew t distribution is obtained by variance-mixing the bivariate skew-normal distribution, $Z \sim SN_2(\theta, R)$ (Azzalini and Dalla Valle [6]), inversely with a gamma random variable $V \sim \Gamma(\eta/2, \eta/2)$, i.e.

$$X = V^{-1/2} Z, \tag{7}$$
where $Z$ is distributed independently of $V$. This skew t distribution possesses nontrivial values of tail dependence under all conditions. The proof of this result can be found in Fung and Seneta [7].

However, it should also be noted that positivity of $\lambda_L$, as defined in (3), is too extreme a measure, as discussed in Fung and Seneta [8]: under independence of the marginals, the numerator in (6) is $u^2$ and the limit $\lambda_L$ in (3) is zero. Now, if the numerator is of the order of $u^{1+\omega}$ for some $\omega > 0$, the limit is also zero, but we are far from independence when $\omega$ is just above zero. Then we have no asymptotic lower tail dependence but substantial probability mass in the joint tail of the joint distribution (that is, in the lower rectangle with apex $(u, u)$), and so the behaviour of $C(u, u)/u$ as a regularly varying function of positive index as $u \to 0^+$ was foreshadowed. The authors demonstrated this in the simplest case of all, the bivariate normal (Fung and Seneta [8]), and the result is summarised in the following theorem.
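Theorem 2: Let $X = (X_1, X_2)^T$ have a bivariate normal distribution with correlation coefficient $\rho \in (-1, 1)$. Then

$$\lambda_L(u) = u^{\frac{1-\rho}{1+\rho}} L(u),$$

where $L(u)$ is a slowly varying function (SVF) at $0^+$.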
The regularly varying index of $\lambda_L(u)$ is $\frac{1-\rho}{1+\rho}$, which tends to zero as $\rho \to 1$. In other words, as the linear correlation $\rho$ increases, more and more weight accumulates in the tail of the joint distribution, and $\lambda_L(u)$ largely reflects the size of the linear correlation, but this does not contribute towards asymptotic tail dependence.
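The index $\frac{1-\rho}{1+\rho}$ can be examined numerically for the bivariate normal. The sketch below is an illustration under stated assumptions: it evaluates $C(u, u) = \int_{-\infty}^{y} \varphi(x)\, \Phi\big((y - \rho x)/\sqrt{1-\rho^2}\big)\, dx$ with $y = \Phi^{-1}(u)$ by Simpson's rule, then estimates the local slope of $\log \lambda_L(u)$ against $\log u$. The function names, the truncation width and the choice of the two values of $u$ are all assumptions made for the example.

```python
import math
from statistics import NormalDist


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function, via erfc for tail accuracy."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def normal_lambda_lower(u, rho, steps=4000, width=12.0):
    """lambda_L(u) = C(u, u) / u for the bivariate normal, by Simpson's rule."""
    y = NormalDist().inv_cdf(u)
    s = math.sqrt(1.0 - rho * rho)
    h = width / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        x = y - width + i * h  # integrate over [y - width, y]
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * phi(x) * Phi((y - rho * x) / s)
    return (total * h / 3.0) / u


def local_index(rho, u_small=1e-8, u_large=1e-6):
    """Slope of log lambda_L(u) against log u between two small values of u."""
    a = normal_lambda_lower(u_small, rho)
    b = normal_lambda_lower(u_large, rho)
    return (math.log(b) - math.log(a)) / (math.log(u_large) - math.log(u_small))
```

For $\rho = 0$ the slope is 1, as expected under independence. For $\rho = 0.5$ the computed slope is near, but not equal to, $1/3$: the slowly varying factor $L(u)$ makes the convergence of the local slope to the index slow.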
As a result, in order to explain the difference in tail dependence accurately, we need to find an index $\omega > 0$ for the skew t distribution defined in (1), when there is no asymptotic tail dependence, such that

$$\lambda_L(u) = u^{\omega} L(u)$$

as $u \to 0^+$ for some SVF $L(u)$ at $0^+$. In the present paper, we restrict ourselves to the case $\theta_1 = \theta_2 = \theta > 0$, and under this condition we can show that the regularly varying index coincides with that of the bivariate normal, which means that the tail dependence of the bivariate skew t decays at the same "polynomial" rate as that of the bivariate normal. This is an intriguing result, because one would expect the tail dependence of the skew t distribution to decay more slowly than that of the normal distribution, especially when the degrees of freedom $\eta$ is small, since the t distribution is renowned for its heavy tails. But the only asymptotic difference is in the nature of the slowly varying multiplier, $L(u)$.
The paper is organised as follows: in Section 2, we review the skew t and its related distributions. In Section 3, we provide our main result and find the regularly varying index of the tail dependence coefficient for the bivariate skew t distribution defined in (1).

Preliminaries
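For our quantitative development we need $K_\tau(\omega)$, the modified Bessel function of the third kind (Erdélyi et al. [9]) with index $\tau \in \mathbb{R}$, which for $\omega > 0$ can be represented as

$$K_\tau(\omega) = \frac{1}{2} \int_0^\infty t^{\tau - 1} \exp\Big\{-\frac{\omega}{2}\big(t + t^{-1}\big)\Big\}\, dt. \tag{8}$$

For $\tau \in \mathbb{R}$, and as $\omega \to \infty$, we have (see Jørgensen [?])

$$K_\tau(\omega) \sim \sqrt{\frac{\pi}{2\omega}}\, e^{-\omega}. \tag{9}$$

It is convenient for a unified exposition to introduce the notation $\bar{K}(\cdot, \cdot)$. For $\tau \in \mathbb{R}$ and $a, b \in \mathbb{R}$, with $a$ and $b$ not simultaneously 0, define $\bar{K}_\tau(a, b)$ as

$$\bar{K}_\tau(a, b) = \begin{cases} (|a|/|b|)^{\tau} K_\tau(|ab|), & a, b \ne 0, \\ |b|^{-2\tau}\, \Gamma(\tau)\, 2^{\tau - 1}, & \tau > 0,\ a = 0, \\ |a|^{2\tau}\, \Gamma(-\tau)\, 2^{-\tau - 1}, & \tau < 0,\ b = 0, \end{cases} \tag{10}$$

where $\Gamma(\cdot)$ is the gamma function and $K_\tau(\cdot)$ is defined in (8). That the second and third components of the definition (10) are appropriate follows by continuity.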
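The Bessel function and its large-argument behaviour can be checked numerically. The sketch below assumes the standard integral representation $K_\tau(\omega) = \frac{1}{2}\int_0^\infty t^{\tau-1} \exp\{-\frac{\omega}{2}(t + t^{-1})\}\,dt$, rewritten with $t = e^s$ as $\int_0^\infty \cosh(\tau s)\, e^{-\omega \cosh s}\, ds$, and the standard leading-order approximation $\sqrt{\pi/(2\omega)}\, e^{-\omega}$. The function names, the truncation point and the step count are illustrative choices.

```python
import math


def bessel_k(tau, omega, steps=4000):
    """K_tau(omega) as the integral of cosh(tau*s) * exp(-omega*cosh(s)) over s >= 0."""
    if omega <= 0:
        raise ValueError("omega must be positive")
    # Beyond s_max the factor exp(-omega*cosh(s)) is below exp(-omega - 60).
    s_max = math.acosh(1.0 + 60.0 / omega)
    h = s_max / steps
    total = 0.5 * math.exp(-omega)  # s = 0 endpoint carries half weight
    for i in range(1, steps + 1):
        s = i * h
        w = 0.5 if i == steps else 1.0
        total += w * math.cosh(tau * s) * math.exp(-omega * math.cosh(s))
    return total * h  # trapezoid rule


def bessel_k_asymptotic(omega):
    """Leading-order large-omega approximation, independent of tau."""
    return math.sqrt(math.pi / (2.0 * omega)) * math.exp(-omega)
```

For $\tau = 1/2$ the approximation is in fact exact, which gives a convenient correctness check on the quadrature; for other indices the ratio `bessel_k(tau, omega) / bessel_k_asymptotic(omega)` approaches 1 as `omega` grows.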
We shall also need the univariate skew generalised hyperbolic (GH) distribution for our analytical development. A random variable $X$ is said to have a univariate skew GH distribution, first introduced in Barndorff-Nielsen [?], when it admits the normal variance-mean mixture representation (11) with mixing variable $Y \sim GIG(p, a, b)$, that is, $Y$ has the generalised inverse Gaussian density (12) with $p \in \mathbb{R}$ and $a, b \ge 0$, where $a$ and $b$ are not simultaneously 0. Using (11), the density of $X$ is then given by (13).

We now turn our focus back to the skew t distribution defined by (1). The random vector $X = (X_1, X_2)^T$ is said to have a skew t distribution if it has the density (14). This implies that the marginal density of $X_1$ is given by (15). Using (15) with (9) and (10), the tail behaviour of $X_1$ as $x \to -\infty$ is given by (16). Consequently, if $\theta_1 \le 0$, then $X_1$ possesses a power tail, and likewise for $X_2$ if $\theta_2 \le 0$. Discussion of the tail behaviour of the skew t distribution can be found in Aas and Haff [10] and Banachewicz and van der Vaart [3].
Combining (14) with (15) allows us to find the conditional density of $X_2$ given $X_1 = x_1$, which is the density of a univariate skew GH distribution, so that the conditional variable $X_{2.1}$ can be represented as a normal mean-variance mixture.
As a result, in order to establish the regular variation of $\lambda_L(u)$ for the skew t distribution defined in (1) when $\theta_1 = \theta_2 = \theta > 0$, it is sufficient to find an $\omega > 0$ and a slowly varying function $L^*(u)$, with $(\omega + 1)L(u) \sim L^*(u)$, for which the corresponding asymptotic relation holds as $u \to 0^+$. We summarise our main result in the following theorem.
Theorem 3: For a random vector $X$ defined in (1) with $\theta_1 = \theta_2 = \theta > 0$, the function $\lambda_L(u)$ is regularly varying at $0^+$ with index $\frac{1-\rho}{1+\rho}$.

Proof: Using the classical bounds on the tail of $Z \sim N(0, 1)$, valid for $|y| \ge 1$ (see Feller [12], Chapter VII, Lemma 2), we obtain the two-sided bound (20). We consider the terms in (20) separately. Focusing on the upper bound first, we obtain (21). Similarly, the second term in the lower bound of (20) gives (22). Finally, the first term in the lower bound of (20) gives (23). Combining (21) to (23) with (20), using (10) and then (9) as $y \to -\infty$ in each term, and taking the limit as $y \to -\infty$, we arrive at (24).

To complete the proof, we need the asymptotic behaviour of $y = y(u) = F_1^{-1}(u)$ as $u \to 0^+$. Define $g(u)$ as in (25), so that $g(u) \to -\infty$ as $u \to 0^+$. It is clear that $-g(u)$ is a slowly varying function at $0^+$. Combining (25) and (16) with $\theta_1 = \theta > 0$, it follows from Theorem 1 of Fung and Seneta [8] that $F_1^{-1}(u) \sim g(u)$ as $u \to 0^+$. Back-substituting (25) as $y = y(u) = F_1^{-1}(u)$ into (24) gives the stated result. This completes the proof.
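The proof relies on bounds for the standard normal tail cited from Feller [12]. As an illustration, the sketch below checks numerically the classical inequality $(x^{-1} - x^{-3})\,\varphi(x) < P(Z > x) < x^{-1}\varphi(x)$ for $x > 0$, on the assumption that this is the lemma being cited; by symmetry the same bounds apply to $P(Z \le y)$ with $x = |y|$ for negative $y$. The function names are illustrative, and the exact form in which the bound enters (20) is not reproduced here.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_upper_tail(x):
    """P(Z > x) for Z ~ N(0, 1), via erfc to keep precision in the far tail."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def feller_bounds(x):
    """Return (lower, upper) bounds on P(Z > x), valid for x > 0."""
    pdf = normal_pdf(x)
    return (1.0 / x - 1.0 / x ** 3) * pdf, pdf / x
```

The ratio of the two bounds is $1 - x^{-2}$, so both become asymptotically sharp as $x \to \infty$, which is why the upper and lower estimates can be treated term by term in the limit.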
The regularly varying index of the tail dependence coefficient for the skew t distribution under the condition $\theta_1 = \theta_2 = \theta > 0$ is $\frac{1-\rho}{1+\rho}$, which depends on neither the skewness parameter $\theta$ nor the shape parameter $\eta$. This index coincides with that of the bivariate normal, as can be seen from Theorem 2. It is well established that the tails of the bivariate normal decay quickly, and therefore in this case the lower tail of the bivariate skew t distribution decays quickly as well. However, this is still just half the picture. Under the condition $\theta_1 = \theta_2 = \theta > 0$, the joint upper tail of the bivariate skew t distribution has complete tail dependence, i.e. the limit in (5) is 1, which indicates that the joint upper tail decays very slowly. The tails under the condition $\theta_1 = \theta_2 = \theta < 0$ behave similarly, since they are the mirror image of those for $\theta_1 = \theta_2 = \theta > 0$. As a result, in the case $\theta_1 = \theta_2 = \theta \ne 0$ considered here, this skew t distribution always has one tail which is very heavy and another which decays very quickly.

This inflexibility is a disadvantage in model fitting compared with the alternative skew t distribution defined by (7), which possesses nontrivial values of tail dependence under all conditions. This is likely the cause of the relatively poor performance of the classical skew t model when compared to the alternative skew t model in Fung and Seneta [5], where both were fitted to simulated and real bivariate data. If the goodness of fit to the marginals is equally good, as in Fung and Seneta [5], the model with the more flexible tail dependence structure is more likely to perform better.