
Model Comparison of Coordinate-Free Multivariate Skewed Distributions with an Application to Stochastic Frontiers

José T.A.S. Ferreira and Mark F.J. Steel∗
Department of Statistics, University of Warwick, UK

Abstract

We consider classes of multivariate distributions which can model skewness and are closed under orthogonal transformations. We review two classes of such distributions proposed in the literature and focus our attention on a particular, yet quite flexible, subclass of one of these classes. Members of this subclass are defined by affine transformations of univariate (skewed) distributions that ensure the existence of a set of coordinate axes along which there is independence and the marginals are known analytically. The choice of an appropriate m-dimensional skewed distribution is then restricted to the simpler problem of choosing m univariate skewed distributions. We introduce a Bayesian model comparison setup for selection of these univariate skewed distributions. The analysis does not rely on the existence of moments (allowing for any tail behaviour) and uses equivalent priors on the common characteristics of the different models. Finally, we apply this framework to multi-output stochastic frontiers using data from Dutch dairy farms.

Keywords: Coordinate-free distributions, dairy farm, multivariate skewness, orthogonal transformation, stochastic frontier.

JEL classification: C11; C16; C52

1 Introduction

Probability distributions that can model skewness have been the focus of considerable interest in recent years (see Genton, 2004 for a review). Some of the classes of multivariate skewed distributions in the literature introduce skewness along a pre-determined set of directions. Here, we are interested in classes that do not make such assumptions. We consider two classes of such distributions proposed in the literature and focus our attention on a particular subclass of one of them.

A class of multivariate distributions is defined to be coordinate-free if it is closed under orthogonal transformations. A simple example illustrates the importance of dealing with a coordinate-free class of distributions, say $S$. Suppose that a process is measured using an orthogonal set of coordinates $X = (x_1, \ldots, x_m)'$ (i.e. $x_i$ is perpendicular to $x_j$, $i \neq j$, $i, j = 1, \ldots, m$) and that the process can be described by a distribution $S_X \in S$. Now consider a change to a different orthogonal set of coordinates $Y = (y_1, \ldots, y_m)'$, spanning the same space. The class $S$ is coordinate-free if the process can also be described by a distribution $S_Y \in S$, for any set of coordinates $Y$.

∗ Address for correspondence: Mark Steel, Department of Statistics, University of Warwick, Coventry, CV4 7AL, U.K. Tel.: +44-24-7652 3369; Fax: +44-24-7652 4532; Email [email protected].

One of the many interesting features of elliptical distributions (Kelker, 1970) is that they are closed under orthogonal transformations (see Fang et al., 1990 for details). When going from elliptical distributions to skewed distributions, coordinate-free classes become even more valuable. For elliptical classes, the only characteristic that changes with direction is spread. For skewed distributions, both asymmetry and spread can vary with the direction. As a consequence, classes of skewed distributions that are not coordinate-free necessarily impose that skewness is manifested along particular directions. For example, the class of distributions introduced in Sahu et al. (2003) is not coordinate-free and introduces skewness into a symmetric elliptical distribution along the original coordinates.

We consider in some detail two main classes of multivariate skewed distributions that are closed under orthogonal transformations. The first is the class of skew-elliptical distributions, initially introduced through its special case of the multivariate skew-Normal distribution of Azzalini and Dalla-Valle (1996), generalised by Branco and Dey (2001), and extended by a number of authors (see Genton, 2004 for further details). The members of this class can be interpreted as generated by conditioning on an unobserved truncated variable, so they are multivariate “hidden truncation” distributions. More recently, a different class of coordinate-free distributions has been suggested in Ferreira and Steel (2004a), henceforth FS, based on affine transformations of multivariate random variables with independent components, each having a univariate skewed distribution. For reasons that will become clear in Section 2, the latter class of distributions is the main focus of our attention here.
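The idea of building a multivariate skewed distribution from independent univariate skewed components and an orthogonal change of coordinates can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the skewing mechanism (inverse scale factors applied to a standard Normal, in the spirit of Fernández and Steel, 1998), the bivariate setting, the parameter values and all function names are choices made here, not the authors' implementation. It shows that after a rotation by 45 degrees the skewness is no longer confined to a single measurement axis, and that rotating back recovers the coordinates along which the components are independent.

```python
import math
import random


def sample_inverse_scale_skewed(gamma, rng):
    """Draw from a standard Normal skewed by inverse scale factors.

    Positive values are scaled by gamma and negative values by 1/gamma,
    so gamma > 1 gives right skewness and gamma = 1 gives symmetry.
    (Hypothetical helper written for this illustration.)
    """
    z = abs(rng.gauss(0.0, 1.0))
    # The probability mass on the positive half-line is gamma^2 / (1 + gamma^2).
    if rng.random() < gamma ** 2 / (1.0 + gamma ** 2):
        return gamma * z
    return -z / gamma


def rotate(point, theta):
    """Apply the 2x2 rotation matrix (an orthogonal matrix) with angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * point[0] - s * point[1], s * point[0] + c * point[1])


def skewness(values):
    """Moment-based sample skewness, used here only as a quick diagnostic."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)
theta = math.pi / 4  # assumed rotation angle

# Independent components: the first is right-skewed, the second symmetric.
eps = [
    (sample_inverse_scale_skewed(2.0, rng), sample_inverse_scale_skewed(1.0, rng))
    for _ in range(20000)
]
eta = [rotate(e, theta) for e in eps]    # observed coordinates
back = [rotate(y, -theta) for y in eta]  # rotate back to the independent axes

skew_eps = [skewness([p[k] for p in eps]) for k in (0, 1)]
skew_eta = [skewness([p[k] for p in eta]) for k in (0, 1)]
skew_back = [skewness([p[k] for p in back]) for k in (0, 1)]

print("skewness along independent axes:", skew_eps)
print("skewness along rotated axes:    ", skew_eta)
```

In the rotated coordinates both components are skewed, although only one underlying component is asymmetric. A class that fixed the direction of skewness in advance could not represent both descriptions of the same process.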
FS allows for any non-singular affine transformation. In this article, we restrict the set of transformations by imposing that for any distribution there is one set of orthogonal coordinates along which the components are independent and have known univariate distributions. In the (rather different) context of bivariate symmetric distributions with different kurtosis, Hoggart et al. (2003) introduced a class of distributions with a similar characteristic.

In FS, the authors point out that the skewed distributions of the univariate components in the transformation can be freely chosen, but focus on distributions that are generated by transforming originally symmetric distributions through inverse scale factors in the positive and the negative orthant (Fernández and Steel, 1998). This method can be viewed as a particular example of a general mechanism for transforming univariate symmetric distributions (see Ferreira and Steel, 2004b). Here, in addition to distributions generated by inverse scale factors, we analyse others generated by three distinct methods: hidden truncation (see e.g. Azzalini, 1985 and Arnold and Beaver, 2002), order statistics (Jones, 2004) and a construct (Ferreira and Steel, 2004b). An alternative way of generating possibly skewed distributions is through the use of maximum entropy methods, as e.g. in Zellner (1996).

Given the flexibility of this class of multivariate distributions, and in particular the possibility of using different distributions for the univariate components, one important question is how to select appropriate forms for a specific problem. We analyse this issue for a general Bayesian regression setup. Prior specification is of special importance and we tackle the problem by using the same priors on common parameters. This requires, however, that these parameters share the same interpretation across models, which we ensure through normalisation of the skewed univariate distributions.
The latter normalisation is based on robust measures of location and spread and does not rely on moment existence. For parameters that are specific to particular models, the skewness parameters, we propose prior matching ideas, where the priors on the parameters are not elicited directly but through a prior on a quantity common to all models. We achieve this by specifying a prior on a measure of skewness and deriving equivalent priors on the skewness parameters for each model.

In addition to modelling skewness, it is often important to model tail behaviour of the distribution. We accommodate varying tail behaviour in our analysis by using two different types of heavy-tailed distributions that differ in whether they assume a common tail behaviour for all dimensions.

We specify Bayesian regression models using a proper prior structure. This implies we do not need conditions on moment characteristics to ensure posterior existence and, as such, we do not place any restriction on tail weight. Consequently, our analysis allows for distributions with (extremely) heavy tails. We take two different approaches to model comparison. We use Bayes factors and we also compare predictive quality using log predictive scores.

The regression framework is then applied to a multivariate stochastic frontier problem. Such problems are traditionally dealt with through a composed error framework (as introduced in Aigner et al., 1977 and Meeusen and van den Broeck, 1977) with separate measurement and inefficiency error terms, but here we use a skewed distribution to model the composed error directly. One important advantage of this approach is that it immediately generalises to the analysis of multi-output production, in contrast to the composed error framework. We apply this to a dataset of Dutch dairy farms with two outputs, milk production and non-milk outputs.

In view of the links mentioned above, let us briefly summarise how the present paper relates to and extends the existing literature.
There are important differences with the analysis of FS: in the present paper we use a restricted and more operational version of the general class of distributions defined in FS, we adopt a proper prior rather than the improper prior in FS, we use a variety of skewing mechanisms rather than the inverse scale factor model in FS, and we specifically compare them using both formal within-sample and predictive criteria. This comparison also requires a framework for eliciting matching or equivalent priors as well as for normalisation of the distributions, which is developed here.

The application of our methods to stochastic frontiers opens up a new approach to modelling. We provide a flexible and operational framework for modelling the error directly through a skewed distribution, allowing us to model multi-output production frontiers by using multivariate skewed distributions. Apart from brief mentions in some existing papers, relating only to the case with a single output, this has not been explored in a systematic and operational manner. As this is an ambitious project which generates many new modelling possibilities, this paper cannot provide more than a pilot study in this regard, which should lead to further examination of how best to exploit these new opportunities for stochastic frontier modelling.

Section 2 introduces the class of multivariate skewed distributions and presents a number of results for the class. In Section 3, we review four different alternatives for introducing skewness in the distribution of the univariate components and we describe the normalisation. Section 4 introduces the Bayesian regression models considered here. A discussion about the prior, including prior elicitation and sensitivity, can be found in Section 5. Equivalent priors on the skewness parameters for the different models are determined in Section 6. In Section 7, details about the model comparison procedures are presented.
Section 8 describes the application to the stochastic frontier problem. The final section provides some concluding remarks.
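The predictive comparison mentioned in this introduction can be illustrated with a small sketch. A log predictive score is commonly computed as minus the average log predictive density over held-out observations, so smaller values indicate better predictions. The code below is a simplified stand-in rather than the procedure of this paper: it uses plug-in densities with fixed parameters instead of Bayesian predictive densities, it takes the hidden-truncation skew-Normal density $2\phi(x)\Phi(\alpha x)$ as the skewed candidate, and the function names, sample sizes and the value $\alpha = 5$ are all assumptions made for the illustration.

```python
import math
import random


def skew_normal_logpdf(x, alpha):
    """Log density of the hidden-truncation skew-Normal, 2 * phi(x) * Phi(alpha * x)."""
    log_phi = -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)
    # Phi(t) = 0.5 * erfc(-t / sqrt(2)); erfc stays accurate in the lower tail.
    cdf = 0.5 * math.erfc(-alpha * x / math.sqrt(2.0))
    return math.log(2.0) + log_phi + math.log(cdf)


def normal_logpdf(x, mu, sigma):
    """Log density of a Normal distribution with mean mu and std. deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def sample_skew_normal(alpha, rng):
    """Draw from the skew-Normal via its standard stochastic representation."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return delta * abs(z0) + math.sqrt(1.0 - delta * delta) * z1


def log_predictive_score(log_density, held_out):
    """Minus the average log density over held-out points (smaller is better)."""
    return -sum(log_density(y) for y in held_out) / len(held_out)


rng = random.Random(1)
alpha = 5.0  # assumed skewness parameter of the data-generating process
train = [sample_skew_normal(alpha, rng) for _ in range(2000)]
held_out = [sample_skew_normal(alpha, rng) for _ in range(5000)]

# Symmetric candidate: Normal with location and scale estimated from the training sample.
mu = sum(train) / len(train)
sigma = math.sqrt(sum((y - mu) ** 2 for y in train) / len(train))

lps_normal = log_predictive_score(lambda y: normal_logpdf(y, mu, sigma), held_out)
# Skewed candidate: skew-Normal with alpha fixed at its true value (a simplification).
lps_skew = log_predictive_score(lambda y: skew_normal_logpdf(y, alpha), held_out)

print("log predictive score, Normal:     ", lps_normal)
print("log predictive score, skew-Normal:", lps_skew)
```

With skewed data the skewed candidate should obtain the smaller score. In a full Bayesian analysis the plug-in densities would be replaced by posterior predictive densities.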

2 Coordinate-Free Distributions

In this section we briefly review the complete class of distributions introduced in FS and define the subclass that will be the focus of our attention in this paper. We also briefly review the skew-elliptical class.

2.1 Complete Class of Distributions

The class introduced in FS is constructed using linear transformations of univariate skewed distributions. Let $m$ be the dimension of the random variable $\epsilon = (\epsilon_1, \ldots, \epsilon_m)' \in$