A. A. J. MARLEY

AGGREGATION THEOREMS AND MULTIDIMENSIONAL STOCHASTIC CHOICE MODELS

ABSTRACT. In many choice situations, the options are multidimensional. Numerous probabilistic models have been developed for such choices between multidimensional options and for the parallel choices determined by one or more components of such options. In this paper, it is assumed that a functional relation exists between the choice probabilities over the multidimensional options and the choice probabilities over the associated component unidimensional options. It is shown that if that function satisfies a marginalization property then it is essentially an arithmetic mean, and if the function satisfies a likelihood independence property then it is a weighted geometric mean. The results are related to those on the combination of expert opinion, and various probabilistic models in the choice literature are shown to have the geometric mean form.

Keywords: Aggregation theorems, stochastic choice, aggregation, choice.

1. INTRODUCTION

Luce (1959) developed his choice model on an axiomatic basis, showing that a probabilistic version of the independence of irrelevant alternatives condition is equivalent to an interesting unidimensional representation of choice probabilities on the subsets of some finite set. This choice model has been successfully applied in numerous areas, including preference and psychophysics (Luce, 1977). However, from the very beginning (Debreu, 1960), it was clear that there were certain situations, usually involving multidimensional objects of choice, where the choice model's predictions would not be expected to hold. This known limitation of the choice model motivated researchers to develop extensions of the model that could handle a larger range of phenomena. Perhaps surprisingly, most of these extensions did not focus on the unidimensional aspect of the model and attempt to develop a multidimensional version; instead, they showed that Luce's choice model could be represented as a random utility model (Luce and Suppes, 1965), and then developed more general random utility models (Marley, 1989).

Theory and Decision 30: 245-272, 1991. © 1991 Kluwer Academic Publishers. Printed in the Netherlands.

The major classes of random utility models that include the choice model as special cases are (Marley, 1989): Strauss' (1981) choice by features model, McFadden's (1978) generalized extreme value (GEV) model, and Tversky's (1972a, b) elimination by aspects (EBA) model. Each of these models is based on dependent multivariate extreme value distributions, and although they can all be interpreted in terms of multidimensional representations of the objects of choice (Marley, 1989), this aspect of these models is not usually emphasized. Also, of these models, Tversky's EBA is the only one that is given an axiomatic development.

Rotondo (1986) returns to the original counterexamples to Luce's choice axiom as given by Debreu (1960), and takes their multidimensional nature seriously. This leads him to introduce a fascinating generalization of the choice model that deals with multidimensional objects of choice in a natural manner, and equally naturally handles Debreu's (and others') counterexamples. In working with Rotondo's (multidimensional) generalization of the choice model, it became clear to me that there are two rather distinct aspects to his contribution. The first, and more obvious, is a general representation of choice probabilities that can (implicitly) deal with choices between multidimensional objects of choice; the second is a particular relationship between choice probabilities over the individual dimensions and choice probabilities over all the dimensions. (See Marley, 1991b, for discussion of when such choice probabilities over the individual dimensions can be obtained empirically.) It turns out that the solution that Rotondo (implicitly) proposes for this latter problem has a close relationship to recent work on combination of probability distributions and/or combinations of expert opinion (Genest and Zidek, 1986).
It is also interesting that the axiomatic motivation for this representation is a kind of multidimensional independence of irrelevant alternatives condition. The structure of this paper is therefore as follows: I first present two sets of independence conditions that imply particular relationships between unidimensional and multidimensional choice probabilities (see Note 1). I relate these results to work on combination of expert opinions, and I illustrate each of these relationships with example models from the literature on stochastic choice. I also suggest areas for further work.

2. MULTIDIMENSIONAL CHOICE MODELS

In a simple choice experiment, a person is asked to select one element from a set of available alternatives according to some specified criterion (preference, loudness, size, etc). In many such cases, the objects of choice are multidimensional, e.g. the loudness of a pure tone depends on both its amplitude and its frequency; the desirability of a house depends (among other things) on its price, location, and size. Although the final choice has to be made on the basis of some combination of the preferences on the individual dimensions, we assume that it is also possible for the person choosing to select the 'best' object on the individual dimensions. Such choices on individual dimensions may or may not depend on the whole context; in particular, they may depend on the values of the objects of choice on the currently 'irrelevant' dimensions. Obviously, such possible relations lead to complex empirical questions, but fortunately I am able to formulate the present theoretical conditions without having a complete answer to the empirical questions (see Marley (1991b) for further discussion of the empirical issues). Also, I assume that choices are probabilistic, even in the unidimensional case.

DEFINITION 1. A structure of choice probabilities is a pair $(R, P)$ where $R = \{x, y, \ldots\}$ is a finite set, and $P$ is a function from $R \times 2^R$ to $[0, 1]$ such that for $\emptyset \neq X \subseteq R$, (i) $P(x : X) = 0$ for $x \notin X$ and (ii) $\sum_{x \in X} P(x : X) = 1$.
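The two conditions of Definition 1 can be checked mechanically on a small set. The following is a minimal Python sketch, not part of the original development: the helper names `nonempty_subsets` and `is_choice_structure`, the tolerance, and the illustrative scale values are all assumptions introduced here.

```python
from itertools import combinations

def nonempty_subsets(R):
    """Yield every nonempty subset of R as a frozenset."""
    items = sorted(R)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def is_choice_structure(R, P, tol=1e-9):
    """Check conditions (i) and (ii) of Definition 1 for a function P(x, X)."""
    for X in nonempty_subsets(R):
        for x in R:
            p = P(x, X)
            if not (-tol <= p <= 1 + tol):
                return False  # values must lie in [0, 1]
            if x not in X and abs(p) > tol:
                return False  # condition (i): zero outside X
        if abs(sum(P(x, X) for x in X) - 1.0) > tol:
            return False      # condition (ii): probabilities sum to 1 on X
    return True

# Hypothetical example: probabilities proportional to illustrative scale values.
scale = {"a": 1.0, "b": 2.0, "c": 3.0}

def luce(x, X):
    return scale[x] / sum(scale[y] for y in X) if x in X else 0.0

print(is_choice_structure(set(scale), luce))
```

The exhaustive loop over subsets is only practical for small $R$; it is meant to make the definition concrete rather than to be efficient.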

For simplicity, and because many choice situations satisfy this condition, I restrict attention in this paper to finite sets $R$. However, the proofs do not depend in any significant way on the finiteness assumption - in fact, much of the related literature on combination of expert opinions deals with both finite and infinite sets, and in the infinite case the later existence assumptions M2 and L2 become much more realistic. For the general case, other recent work on combining probability distributions (somewhat outside the literature on combining expert opinion) is presented in Alsina (1989) and Marshall and Olkin (1988).

For each $x \in X \subseteq R$, $P(x : X)$ is interpreted as the probability of choosing $x$ as the 'best' element in $X$. When the options in $X$ are multidimensional, as discussed next, there are numerous ways in which the person can decide on the overall 'best' option. For instance, the person could simply attend to a particular dimension, and choose the 'best' option on that dimension as the overall 'best' option - this does not, in general, appear to be a sensible strategy. A slightly more general strategy would be to 'attend' to each dimension with a certain probability, and when 'attending' to a particular dimension choose as the overall 'best' option the option that is 'best' on that dimension; I show later that a similar process is implied by a quite natural simple marginalization property (Assumption M1 below).

I frequently need to consider distinct structures of choice probabilities on the same set $R$ - I normally use $(R, P)$, $(R, Q)$ to denote such structures; a class of such structures of choice probabilities $(R, P)$, $(R, Q)$, etc., on a fixed set $R$ (satisfying certain properties) is called a class of choice probabilities on the set $R$.

I now expand the notation of Definition 1 to explicitly mention the dimensional representation of the objects of choice. So let $\{1, \ldots, n\}$ be the set of dimensions and let $P_i(x : X)$ be the probability of selecting $x$ as the 'best' option in $X$ on dimension $i$. For notational completeness, I should now write $(R, P, P_1, \ldots, P_n)$ for such a structure of multidimensional and unidimensional choice probabilities. However, for notational simplicity, I continue to write $(R, P)$ with the understanding that $P_1, \ldots, P_n$ are also defined, and I continue to refer to such $(R, P)$, $(R, Q)$, etc., on a fixed set $R$ as a class of choice probabilities on the set $R$.

Remember, $P(x : X)$ is the (theoretical) probability that option $x$ is selected as 'best' (overall) in set $X$, whereas $P_i(x : X)$ is the (theoretical) probability that option $x$ is selected as 'best' in set $X$ on dimension $i$. It is very important to remember that there is no necessary relation between $P(x : X)$ and the $P_i(x : X)$, $i = 1, \ldots, n$ - it is the purpose of this paper to study and motivate such relations. Stated another way, empirical estimates of $P(x : X)$ and $P_i(x : X)$, $i = 1, \ldots, n$, could be obtained in different experiments, with different instructions; there are numerous possible such experimental designs (Marley, 1991b). For present purposes, it is sufficient to think of the multidimensional options being presented; when the overall choice probabilities are being estimated, the person is told to select the (overall) 'best' option; when the choice probabilities on dimension $i$ are being estimated, the person is told to select the option that is 'best' on dimension $i$, and to ignore the other currently irrelevant dimensions.

To further emphasize these points, and to indicate the nature of later results, I now introduce two processes that might be used by a person trying to select the 'best' overall option. Each process implies a different relation between $P(x : X)$ and the $P_i(x : X)$, $i = 1, \ldots, n$, supporting my point that there is no necessary relation between these quantities. First, consider the possibility that

$$P(x : X) = \sum_{i=1}^{n} W_X(i)\, P_i(x : X)$$

where $W_X(i) \in [0, 1]$ and $\sum_{i=1}^{n} W_X(i) = 1$. This relation would arise if the person in choosing the overall 'best' option 'attended' to dimension $i$ with probability $W_X(i)$, and when attending to dimension $i$ selected the 'best' option on that dimension as the overall 'best' option. This process (mentioned earlier) and representation is a special case of a general representation that is implied by a quite natural simple marginalization property (Assumption M1 below). A second possible relation is

$$P(x : X) = \frac{\prod_{i=1}^{n} P_i(x : X)}{\sum_{y \in X} \prod_{i=1}^{n} P_i(y : X)}.$$

This relation would arise if the person in choosing the overall 'best' option independently selected the 'best' option on each dimension; if the same option were selected as 'best' on every dimension, then that option is considered the overall 'best' option; otherwise, the person resamples over all the dimensions. Clearly, this is a very special procedure for reaching a decision, and there is no reason in general to expect a person to choose in this way. Nonetheless, we will see later that such a representation is a special case of a general representation that is implied by a quite natural likelihood independence property (Assumption L1 below).

In the above, I have mentioned and partially motivated two aggregation functions for relating the overall choice probabilities $P(x : X)$ to the component choice probabilities $P_i(x : X)$, $i = 1, \ldots, n$ - the first function being an arithmetic mean, the second a weighted geometric mean. The major purpose of this paper is to study and motivate these and possibly other 'reasonable' aggregation procedures. I will show that, under certain plausible conditions, slight generalizations of the arithmetic and weighted geometric means are the only possible such aggregation procedures.

In summary, the task is now to present plausible conditions relating the overall choice probabilities $P(x : X)$ to the component choice probabilities $P_i(x : X)$, $i = 1, \ldots, n$. I consider two sets of conditions, parallel to those suggested in other aggregation contexts, showing that the first set of conditions leads to an arithmetic mean combination rule, and the second leads to a weighted geometric mean combination rule.
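The two aggregation rules introduced in this section can be compared numerically. The sketch below is illustrative only: the function names `arithmetic_pool` and `geometric_pool` and the component probabilities `P1`, `P2` are assumptions made here, and the geometric rule is written with general exponents so that unit weights reproduce the second relation above.

```python
def arithmetic_pool(component_probs, weights):
    """P(x:X) = sum_i W(i) * P_i(x:X), with the weights summing to 1."""
    options = component_probs[0].keys()
    return {x: sum(w * p[x] for w, p in zip(weights, component_probs))
            for x in options}

def geometric_pool(component_probs, weights=None):
    """P(x:X) proportional to prod_i P_i(x:X)**W(i), normalized over X."""
    if weights is None:
        weights = [1.0] * len(component_probs)  # unit weights: plain product
    raw = {}
    for x in component_probs[0]:
        prod = 1.0
        for w, p in zip(weights, component_probs):
            prod *= p[x] ** w
        raw[x] = prod
    total = sum(raw.values())
    return {x: value / total for x, value in raw.items()}

# Hypothetical component choice probabilities on two dimensions.
P1 = {"a": 0.5, "b": 0.3, "c": 0.2}
P2 = {"a": 0.2, "b": 0.2, "c": 0.6}

arith = arithmetic_pool([P1, P2], [0.5, 0.5])
geom = geometric_pool([P1, P2])
print(arith)
print(geom)
```

Both outputs are probability distributions over the same options, but they generally differ, which is consistent with the point that there is no necessary relation between $P$ and the $P_i$.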

3. AGGREGATION FUNCTIONS SATISFYING THE SIMPLE MARGINALIZATION PROPERTY

I now state and discuss the various assumptions that lead to arithmetic mean aggregation. The major condition is the first, with the others constraining the solution in appropriate ways for the current situation.

ASSUMPTION M1 (simple marginalization property). For a structure of choice probabilities $(R, P)$ and for $x \in X \subseteq R$,

$$P(x : X) = F_X[x, P_1(x : X), \ldots, P_n(x : X)]$$

where $F_X$ is a function that may depend on the elements of the set $X$.

I write $F_X[x, \ldots]$ rather than, say, $F_{X,x}[\ldots]$ or $F[X, x, \ldots]$ to keep the notation similar to that used in the literature on the aggregation of expert opinion (e.g. Genest, 1984) where the function would depend on $x$ but its dependence on $X$ would not be explicitly stated. Note that for a given structure $(R, P)$ and $x \in X \subseteq R$, one can trivially satisfy Assumption M1 by defining $F_X(x, \alpha_1, \ldots, \alpha_n) = P(x : X)$ for all $\alpha_i \in [0, 1]$, $i = 1, \ldots, n$. However, the representation has to hold for all structures of choice probabilities $(R, Q)$, not just a particular one $(R, P)$.


I call Assumption M1 simple marginalization because it is a special case of the marginalization property considered in the context of opinion aggregation (McConway, 1981, and Genest, 1984) - I discuss that property later. Clearly, aggregation by simple marginalization is similar to the way univariate or marginal distributions are aggregated to form multivariate distributions - but as already emphasized, here the $P_i$ are not necessarily the marginals of $P$. The main result of this section is that provided $|X| > 2$, the simple marginalization property, plus some regularity and existence conditions, imply that $F_X$ is essentially an arithmetic mean with weights that can depend on $X$. However, when $|X| = 2$, the class of solutions is much larger - for instance, the following class of functions satisfies the simple marginalization property when $|X| = 2$. Let $X = \{x, y\}$, write $p_i(x, y)$ for $P_i(x : \{x, y\})$, $i = 1, \ldots, n$, $p(x, y)$ for $P(x : \{x, y\})$, and rewrite Assumption M1 in the form:

$$p(x, y) = G[x, p_1(x, y), \ldots, p_n(x, y)],$$
$$p(y, x) = G[y, p_1(y, x), \ldots, p_n(y, x)],$$

where $G = F_{\{x, y\}}$ might depend on $\{x, y\}$. With $\alpha_i = p_i(x, y)$, $i = 1, \ldots, n$, and remembering that $p(x, y) + p(y, x) = 1$, the above yields:

$$G(x, \alpha_1, \ldots, \alpha_n) + G(y, 1 - \alpha_1, \ldots, 1 - \alpha_n) = 1. \tag{1}$$

The general solution of this equation has the form: $f_y$ is an arbitrary mapping of $[0, 1]^n$ to $[0, 1]$ and for $\alpha_i \in [0, 1]$, $i = 1, \ldots, n$,

$$G(y, \alpha_1, \ldots, \alpha_n) = f_y(\alpha_1, \ldots, \alpha_n)$$

and

$$G(x, \alpha_1, \ldots, \alpha_n) = 1 - f_y(1 - \alpha_1, \ldots, 1 - \alpha_n).$$
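As a quick numerical illustration of the general solution for the binary case, the following Python sketch builds $G$ from an arbitrary $f_y$ and checks Equation 1 at one point. The helper `make_G`, the particular `f_y`, and the test vector are hypothetical choices made for this illustration.

```python
def make_G(f_y):
    """Build G on {x, y} from an arbitrary map f_y: [0,1]^n -> [0,1]."""
    def G(option, alphas):
        if option == "y":
            return f_y(alphas)
        # option "x": G(x, a) = 1 - f_y(1 - a_1, ..., 1 - a_n)
        return 1.0 - f_y(tuple(1.0 - a for a in alphas))
    return G

def f_y(alphas):
    # Hypothetical, deliberately non-linear choice that stays within [0, 1].
    return max(alphas) ** 2

G = make_G(f_y)
alphas = (0.3, 0.8, 0.5)
complement = tuple(1.0 - a for a in alphas)

# Equation 1: G(x, a) + G(y, 1 - a) should equal 1.
lhs = G("x", alphas) + G("y", complement)
print(lhs)
```

Because `f_y` is unconstrained apart from its range, the resulting `G("x", ...)` and `G("y", ...)` need not share a functional form, which is the point made in the text.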

Of course, in general such $G(y, \alpha_1, \ldots, \alpha_n)$ and $G(x, \alpha_1, \ldots, \alpha_n)$ will not be 'of the same form'. The following class of solutions does have that property, and with further restrictions becomes the solution when $X$ has more than two elements (remember, I am suppressing reference to the current set $\{x, y\}$): for $r \in \{x, y\}$, $\alpha_i \in [0, 1]$, $i = 1, \ldots, n$,

(2)

where the following conditions hold:

1. $\theta$ and $\eta$ are continuous strictly monotonic increasing functions mapping $[0, 1]$ onto $[0, 1]$ with $\theta(\alpha) + \theta(1 - \alpha) = \eta(\alpha) + \eta(1 - \alpha) = 1$ for all $\alpha \in [0, 1]$.
2. $w(k)$, $k = 1, \ldots, n$, are weights with $w(k) \in [0, 1]$, and $\sum_{k=1}^{n} w(k) = 1$.
3. $0 \leq \varphi(r) \leq 1$ with $\varphi(x) + \varphi(y) = 1$.

It is obvious that this class of functions satisfies the simple marginalization property; we will also see from Theorem 1 below that one can weaken the restriction that $w(k) \in [0, 1]$ provided other appropriate changes are made to guarantee that $G$ gives the desired probability function. Example $\theta$ and $\eta$ can be constructed by taking $\theta$ and $\eta$ to be (possibly distinct) cumulative distributions on $[0, 1]$ with densities that are symmetric around a mean of 1/2. If $\theta$ and $\eta$ are both the identity function, then the above representation agrees with that given by Theorem 1 below for the case $|X| \geq 3$, but otherwise the representation is more general.

I now show that when $|X| \geq 3$, the only solutions satisfying Assumption M1 are essentially arithmetic means. I do, however, need additional existence and monotonicity conditions.

ASSUMPTION M2. For any n-dimensional real vectors $(r_1, \ldots, r_n)$, $(s_1, \ldots, s_n)$ with $r_i, s_i, r_i + s_i \in [0, 1]$, $i = 1, \ldots, n$, and for $x, y, z \in X \subseteq R$, there exist distributions $Q_i(\cdot : X)$ such that for $i = 1, \ldots, n$,

$$Q_i(x : X) = r_i, \quad Q_i(y : X) = s_i, \quad Q_i(z : X) = 1 - (r_i + s_i).$$


Note that this condition holds vacuously unless $|X| \geq 3$. It is a technical assumption that allows the application of functional equation results to our problem. This assumption is somewhat 'strange' in that we are dealing with finite sets $R$, yet we want the total set of available probabilities to be quite dense. Nonetheless, the assumption is plausible for the usual choice models, e.g. Luce's choice model. Although it is probably worthwhile in the future to study weaker versions of this condition, it is nonetheless Assumption M1 (and not Assumption M2) that is the major constraint. Falmagne (1981) discusses the use of conditions similar to Assumption M2 in other situations, and presents weaker versions of those conditions. Also, Aczel and Dhombres (1989, Chapter 6) discuss the related general problem of conditional functional equations.

ASSUMPTION M3 (dominance principle). For all $x \in X \subseteq R$, and structures of choice probabilities $(R, P)$, $(R, Q)$, if $P_i(x : X)$ [...] 3, but otherwise the representation is more general.

I now show that when $|X| \geq 3$, the only solutions satisfying Assumption L1 are weighted geometric means. I do, however, need additional existence and monotonicity conditions.

ASSUMPTION L2. For any n-dimensional positive real vectors $(r_1, \ldots, r_n)$, $(s_1, \ldots, s_n)$, it is possible to select a structure of nonzero choice probabilities $(R, P)$ and $x, y, z \in X \subseteq R$, such that for $i = 1, \ldots, n$,

$$L_X^{P_i}(x, y) = r_i, \quad L_X^{P_i}(y, z) = s_i.$$

Note that this condition requires $|X| \geq 3$. Similar comments can be made about this condition as were made previously about Assumption M2.

ASSUMPTION L3 (dominance principle). For structures of nonzero choice probabilities $(R, P)$, $(R, Q)$, and $x, y \in X$, if $L_X^{P_i}(x, y)$ [...] 3 there exist constants $W_X(i)$, $i = 1, \ldots, n$, such that for each structure of nonzero choice probabilities $(R, P)$ and for $x \in X$,

$$P(x : X) = \frac{\prod_{i=1}^{n} P_i(x : X)^{W_X(i)}}{\sum_{y \in X} \prod_{i=1}^{n} P_i(y : X)^{W_X(i)}}.$$

If Assumption L3 also holds then the constants $W_X(i)$ are nonnegative.


The proof is given in the Appendix. Clearly, it does not make sense for $P(x : X)$ to be a decreasing function of $P_i(x : X)$ for any $i = 1, \ldots, n$, so I now discuss various examples with nonnegative $W_X(i)$, $i = 1, \ldots, n$. In the following, I write

$$P(x : X) \propto \prod_{i=1}^{n} P_i(x : X)^{W_X(i)}$$

when the choice probabilities have the form given by Theorem 2, with the same notation for particular examples. For instance

$$P(x : X) \propto \prod_{i=1}^{n} v_i(x)^{W_X(i)}$$

would mean that

$$P(x : X) = \frac{\prod_{i=1}^{n} v_i(x)^{W_X(i)}}{\sum_{y \in X} \prod_{i=1}^{n} v_i(y)^{W_X(i)}}.$$

In fact, using this representation for the overall choice probabilities, and assuming that for $i = 1, \ldots, n$,

$$P_i(x : X) = \frac{v_i(x)}{\sum_{y \in X} v_i(y)},$$

we can write

$$P_i(x : X) \propto v_i(x), \quad i = 1, \ldots, n,$$

and

$$P(x : X) \propto v_X(x) \quad \text{with} \quad v_X(x) = \prod_{i=1}^{n} v_i(x)^{W_X(i)},$$

i.e. the choice probabilities (componentwise and overall) are related as in Theorem 2, and Luce's choice model is satisfied on each of the dimensions. If the weights $W_X(i)$ are independent of $X$, then $v_X(x)$ is also independent of $X$, and so Luce's choice model is also satisfied by the overall choice probabilities, with the combined scale values being weighted geometric means of the component scale values. However, it is clear from the proof of Theorem 2 that the weights $W_X(i)$ will be independent of $X$ if Assumption L1 (likelihood independence principle) holds with $F_X$ a fixed function, independent of $X$. Thus, Theorem 2 shows that the 'correct' way to combine probabilities and scale values satisfying Luce's choice model is via a weighted geometric mean.

There are numerous probabilistic choice models besides Luce's choice model that relate unidimensional and multidimensional choices in a similar fashion, so I only give illustrations from recent work.

1. Suppose that there are nonnegative measures $\eta_i(x, y)$ for $x, y \in R$ which are interpreted as the advantage of element $x$ with respect to element $y$ on dimension $i$. A slight generalization of Rotondo's (1986) model (Marley, 1991a) assumes that for $i = 1, \ldots, n$

$$P_i(x : X) \propto \prod_{y \in X - \{x\}} \eta_i(x, y)^{1/\theta(X)}$$

with $\theta(X) > 0$, and thus if the representation of Theorem 2 also holds, then

$$P(x : X) \propto \prod_{y \in X - \{x\}} \eta_X(x, y)^{1/\theta(X)}$$

where

$$\eta_X(x, y) = \prod_{i=1}^{n} \eta_i(x, y)^{W_X(i)}.$$

This reduces to the choice model when $W_X(i) = w(i)$, $\theta(X) = |X| - 1$, and $\eta_i(x, y) = u_i(x)$.

2. Suppose that there are nonnegative measures $\eta_i(x, X^*)$, $i = 1, \ldots, n$, $x \in R$, where $X^*$ is some 'ideal' option that may depend on the current choice set $X$, and $\eta_i(x, X^*)$ is a measure of the similarity of the element $x$ to the ideal element $X^*$ on dimension $i$. One might then assume that

$$P_i(x : X) \propto \eta_i(x, X^*)$$

and apply Theorem 2 to obtain the overall (multidimensional) choice probabilities. Of course, further restrictions must be placed on the $\eta_i$, otherwise any set of unidimensional choice probabilities can be fit by this model - although the representation then derived for the overall choice probabilities may not be 'correct'. Hoijtink (1989) presents a unidimensional, binary choice version of this model. We again obtain the choice model when the weights $W_X(i)$ are independent of $X$, and $\eta_i(x, X^*) = v(x)$, $i = 1, \ldots, n$.

Note that in Examples 1 and 2 the unidimensional and multidimensional choice probabilities are all of the 'same form'. It would be interesting to formulate in a general way what the complete class of choice models is that has this property.

As in Section 2 we can replace the motivation for the representation in terms of dimensions by other interpretations. For instance, we can reformulate the discussion that led to Markov transition models to obtain the following type of representation:

$$P(x : X) \propto \prod_{\substack{B \subseteq R \\ B \cap X \neq \emptyset,\, X}} P(x : B \cap X)^{W_R(B)},$$

and apply it recursively (to the choice sets $P(x : C)$, $\emptyset \neq C \subset X$) to obtain an overall representation for $P(x : X)$. I am not aware of such a form having been studied in the literature, and it would need careful elaboration to decide whether it yields viable choice models. [Note that I have in this case excluded sets $B \subseteq R$ with $B \cap X = \emptyset, X$ from the product.]
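The closure of Luce's choice model under weighted geometric mean aggregation, discussed earlier in this section, can be checked numerically. The sketch below is an illustration under assumed values: the scale values `v1`, `v2`, the weights `w`, and the helper names are hypothetical and chosen only to make the identity visible.

```python
def luce_probs(scale, X):
    """Luce choice probabilities on the set X from ratio-scale values."""
    total = sum(scale[y] for y in X)
    return {x: scale[x] / total for x in X}

def geometric_pool(component_probs, weights):
    """P(x:X) proportional to prod_i P_i(x:X)**W(i), normalized over X."""
    raw = {}
    for x in component_probs[0]:
        prod = 1.0
        for w, p in zip(weights, component_probs):
            prod *= p[x] ** w
        raw[x] = prod
    total = sum(raw.values())
    return {x: value / total for x, value in raw.items()}

# Hypothetical component scale values and weights that do not depend on X.
v1 = {"a": 1.0, "b": 2.0, "c": 4.0}
v2 = {"a": 3.0, "b": 1.0, "c": 1.0}
w = [0.7, 0.3]

# Combined scale values: weighted geometric means of the component values.
combined = {x: v1[x] ** w[0] * v2[x] ** w[1] for x in v1}

def closure_gap(X):
    """Largest difference between pooled probabilities and the Luce model on `combined`."""
    pooled = geometric_pool([luce_probs(v1, X), luce_probs(v2, X)], w)
    direct = luce_probs(combined, X)
    return max(abs(pooled[x] - direct[x]) for x in X)

print(closure_gap(["a", "b"]), closure_gap(["a", "b", "c"]))
```

The gap is zero up to rounding for every choice set because the normalizing constants of the component probabilities cancel when the pooled values are renormalized; with weights that vary across choice sets, the combined scale values would vary with $X$ as well.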

5. DISCUSSION

I have concentrated in this paper on constant utility models, which assume that response probabilities are some deterministic function of various scale values associated with the relevant options. A second major class of models are the random utility models, which assume a utility function on the options that is randomly determined on each presentation, but once the (random) utility values are selected then the subject's choice is determined, usually being the outcome(s) with the largest current utility. Many common probabilistic choice models are both constant and random utility models although they are more often axiomatized as constant utility models. In particular, Luce's choice model, a constant utility model, can also be represented as an (independent) random utility model (Luce and Suppes, 1965). Thus, since we have shown earlier that Luce's choice model is 'closed' under geometric mean aggregation, both the unidimensional choice probabilities and the associated multidimensional choice probabilities will satisfy an (independent) random utility model. However, I now discuss a sense in which this random utility model is not itself 'closed', and pose the question of whether such 'closed' random utility models exist that satisfy, say, geometric mean aggregation.

Generalizing the standard definition of a random utility model (e.g. Luce and Suppes, 1965) I say that a structure of choice probabilities $(R, P)$ satisfies a random utility model if there exist random variables $U(r)$ and $U_i(r)$, $r \in R$, $i = 1, \ldots, n$, such that for each $x \in X \subseteq R$,

$$P(x : X) = \Pr(U(x) \geq U(y),\ y \in X - \{x\}),$$

and

$$P_i(x : X) = \Pr(U_i(x) \geq U_i(y),\ y \in X - \{x\}), \quad i = 1, \ldots, n.$$

In particular, it is known (e.g. Marley, 1991a) that if the random variables are independent with the following distributions: for $r \in R$

$$\Pr(U(r) \leq t) = \exp\left[-\frac{v(r)}{t}\right], \quad t \geq 0, \ v \text{ a ratio scale},$$

and

$$\Pr(U_i(r) \leq t) = \exp\left[-\frac{v_i(r)}{t}\right], \quad t \geq 0, \ v_i \text{ a ratio scale},$$

then

$$P(x : X) = \Pr(U(x) \geq U(y),\ y \in X - \{x\}) = \frac{v(x)}{\sum_{y \in X} v(y)}$$

and

$$P_i(x : X) = \Pr(U_i(x) \geq U_i(y),\ y \in X - \{x\}) = \frac{v_i(x)}{\sum_{y \in X} v_i(y)},$$

i.e. the choice model holds both on the component probabilities $P_i$ and on the overall probabilities $P$. In particular, if $v(x) = \prod_{i=1}^{n} v_i(x)$, then the overall and component choice probabilities are also related by geometric mean aggregation (see the results in the previous section). However, it is also reasonable to study aggregation at the level of the random variables $U(r)$, $U_i(r)$, $r \in R$, $i = 1, \ldots, n$, i.e. to suppose that

$$U(r) = G[U_1(r), \ldots, U_n(r)] \tag{3}$$

for some function $G$. Plausible examples of $G(\alpha_1, \ldots, \alpha_n)$ are $(1/n) \sum_{i=1}^{n} \alpha_i$, $\max_{i=1,\ldots,n} \alpha_i$, and $\prod_{i=1}^{n} \alpha_i$. For instance, taking the case $U(r) = \max_{i=1,\ldots,n} U_i(r)$ with the $U(r)$, $U_i(r)$ distributed as above, we obtain $\Pr(U(r) \leq t) = \exp[-v(r)/t]$ where $v(r) = \sum_{i=1}^{n} v_i(r)$, and therefore, again as above, the overall choice probabilities $P$ satisfy the choice model with these scale values $v(r)$, $r \in R$. Thus, we now have a random utility model where both the component and the overall choice probabilities satisfy the choice model, with the overall random utilities a 'natural' function of the component random utilities. However, the resultant overall choice probabilities are not related to the component choice probabilities by geometric mean aggregation since the relation $v(r) = \sum_{i=1}^{n} v_i(r)$ is not compatible with such aggregation. Thus a natural question is whether there are any random utility models satisfying Equation 3 for which the component and overall choice probabilities are related by geometric mean aggregation.
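The contrast between max aggregation of the random utilities and geometric mean aggregation of the choice probabilities can be made concrete with a small numerical check. In this sketch the scale values in `v`, the evaluation point `t`, and the helper names are assumptions for illustration, and the components are assumed independent across dimensions so that the distribution function of the maximum is the product of the component distribution functions.

```python
import math

def cdf(v, t):
    """Pr(U <= t) = exp(-v / t) for t > 0, the distribution used in the text."""
    return math.exp(-v / t)

def luce(scale):
    """Choice probabilities proportional to the given scale values."""
    total = sum(scale.values())
    return {r: s / total for r, s in scale.items()}

# Hypothetical component scale values v_i(r) on two dimensions.
v = {"x": [1.0, 4.0], "y": [2.0, 2.0]}

# Max of independent components: product of CDFs, i.e. the same
# distribution family with scale value sum_i v_i(r).
t = 1.7
cdf_of_max = math.prod(cdf(vi, t) for vi in v["x"])
cdf_sum_scale = cdf(sum(v["x"]), t)

# Overall choice probabilities under max aggregation (sum of scale values)
# versus unit-weight geometric aggregation (product of scale values).
p_max = luce({r: sum(vals) for r, vals in v.items()})
p_geo = luce({r: math.prod(vals) for r, vals in v.items()})

print(cdf_of_max, cdf_sum_scale)
print(p_max["x"], p_geo["x"])
```

With these illustrative values the two overall probabilities for option x differ, which matches the observation that $v(r) = \sum_i v_i(r)$ is not compatible with geometric mean aggregation.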


The above results and questions only begin to address the interesting problems that arise when one considers aggregation at the level of random utilities, and the relation of such aggregate models to aggregation at the level of choice probabilities; Marley (1991a, 1991b) discusses other partial results available to date.

APPENDIX

Proof of Theorem 1. This result is closely related to those of Aczel, Kannappan, Ng, and Wagner (1983), Aczel, Ng, and Wagner (1984) and Genest (1984). In the following $r = (r_1, \ldots, r_n)$ and $s = (s_1, \ldots, s_n)$ are vectors satisfying the conditions of Assumption M2, and $0 = (0, \ldots, 0)$ is an n-component vector of zeros. Using Assumptions M1 and M2 we can select $x, y, z \in X \subseteq R$, $r, s, r + s \in [0, 1]^n$, and $Q$, $Q_i$, $i = 1, \ldots, n$, such that

$$Q(x : X) = F_X(x, r), \quad Q(y : X) = F_X(y, s), \quad Q(z : X) = F_X(z, 1 - (r + s))$$

and

$$Q(w : X) = F_X(w, 0) \quad \text{for } w \in X - \{x, y, z\}.$$

Therefore

$$1 = Q(x : X) + Q(y : X) + Q(z : X) + \sum_{w \in X - \{x, y, z\}} Q(w : X)$$
$$= F_X(x, r) + F_X(y, s) + F_X(z, 1 - (r + s)) + \sum_{w \in X - \{x, y, z\}} F_X(w, 0),$$

i.e.

$$F_X(z, 1 - (r + s)) = 1 - F_X(x, r) - F_X(y, s) - \sum_{w \in X - \{x, y, z\}} F_X(w, 0). \tag{A1}$$

In particular when $s = 0$,

$$F_X(z, 1 - r) = 1 - F_X(x, r) - \sum_{w \in X - \{x, z\}} F_X(w, 0). \tag{A2}$$

However, Equation A2 then holds for arbitrary $r$, in particular for $r + s$, so substituting Equation A2 in the left hand side of Equation A1 for that case, we obtain

$$1 - F_X(x, r + s) - \sum_{w \in X - \{x, z\}} F_X(w, 0) = 1 - F_X(x, r) - F_X(y, s) - \sum_{w \in X - \{x, y, z\}} F_X(w, 0),$$

i.e.

$$F_X(x, r + s) = F_X(x, r) + F_X(y, s) - F_X(y, 0),$$

i.e.

$$F_X(x, r + s) - F_X(x, 0) = F_X(x, r) - F_X(x, 0) + F_X(y, s) - F_X(y, 0),$$

i.e. with

$$G_X(w, t) = F_X(w, t) - F_X(w, 0),$$

we have

$$G_X(x, r + s) = G_X(x, r) + G_X(y, s). \tag{A3}$$

Also, since $0 \leq F_X(w, r) \leq 1$ for all $w \in X$, it follows from its definition that $|G_X(w, t)| \leq 1$ and $G_X(w, 0) = 0$. Using Equation A3 with $r = 0$ then gives $G_X(x, s) = G_X(y, s)$, i.e. $G_X(x, s)$ is independent of $x$. So writing $\xi_X(s)$ for $G_X(x, s)$ we have $|\xi_X(s)| \leq 1$ and

$$\xi_X(r + s) = \xi_X(r) + \xi_X(s) \quad \text{for } r, s, r + s \in [0, 1]^n.$$


This is, of course, Cauchy's equation for vectors but on a restricted domain. However, in analogy to Aczel (1966, Section 2.1.4), we can prove that there exist $W_X(j)$, $j = 1, \ldots, n$, such that

$$\xi_X(r) = \sum_{j=1}^{n} W_X(j)\, r_j$$

where

$$\left| \sum_{j=1}^{n} W_X(j) \right| \leq 1$$

since $|\xi_X(r)| \leq 1$ for all $r \in [0, 1]^n$. Now by definition

$$\xi_X(r) = F_X(x, r) - F_X(x, 0),$$

i.e.

$$F_X(x, r) = \xi_X(r) + F_X(x, 0) = \sum_{j=1}^{n} W_X(j)\, r_j + F_X(x, 0).$$

Therefore returning to the definition of $F_X(x, r)$ in terms of $Q$, $Q_i$, $i = 1, \ldots, n$, we have for $x \in X \subseteq R$,

$$Q(x : X) = F_X(x, Q_1(x : X), \ldots, Q_n(x : X)) = \sum_{j=1}^{n} W_X(j)\, Q_j(x : X) + F_X(x, 0) \tag{A4}$$

and so

$$1 = \sum_{x \in X} Q(x : X) = \sum_{j=1}^{n} W_X(j) + \sum_{x \in X} F_X(x, 0).$$

Therefore if $\sum_{j=1}^{n} W_X(j) = 1$ then $F_X(x, 0) = 0$ for all $x \in X$ since $F_X(x, 0) \geq 0$ by construction, and so Equation A4 reduces in this case to

$$Q(x : X) = \sum_{j=1}^{n} W_X(j)\, Q_j(x : X). \tag{A5}$$


Now consider the case $\sum_{j=1}^{n} W_X(j) < 1$ - we know that $\sum_{j=1}^{n} W_X(j)$ [...] 3, let positive real vectors $r = (r_1, \ldots, r_n)$ and $s = (s_1, \ldots, s_n)$, and $x, y, z \in X \subseteq R$, have the properties in Assumption L2, i.e.

$$L_X^{P_i}(x, y) = r_i, \quad L_X^{P_i}(y, z) = s_i.$$

Also, from the definition of $L^P$,

$$L_X^{P}(x, y)\, L_X^{P}(y, z) = L_X^{P}(x, z),$$

i.e. using Assumption L1 with the above, we have $F_X(r) F_X(s) = F_X(r \cdot s)$. Now $F_X$ is defined and bounded for all positive real vectors (Assumption L2) so we can use a multivariate extension of Aczel (1966, Theorem 3, p. 41) to conclude that

$$F_X(r) = \prod_{i=1}^{n} r_i^{W_X(i)}$$

for some constants $W_X(i)$, $i = 1, \ldots, n$. The first result is then immediate, and the second follows because Assumption L3 forces $F_X$ to be monotonic increasing in each variable, which requires $W_X(i) \geq 0$, $i = 1, \ldots, n$.

ACKNOWLEDGEMENTS

This work was supported in part by the Natural Science and Engineering Research Council of Canada. The basic results were presented in an Invited Address at the Twenty First Annual Meeting of the Society for Mathematical Psychology, Northwestern University, July 1988. The referee provided helpful critical comments. I thank Janos Aczel for ensuring that I was aware of all the relevant functional equation results, Ulf Böckenholt for providing references to recent work on ideal based choice models, and John Rotondo of AT&T Bell Laboratories, Murray Hill, NJ, for providing me with copies of his unpublished results on his generalization of Luce's choice model.

NOTES

1 Phrases such as 'unidimensional choice probabilities' (respectively, 'multidimensional choice probabilities') should be interpreted as meaning 'choice probabilities over unidimensional options' (respectively, 'choice probabilities over multidimensional options') - i.e. it is the options, not the probabilities, that are uni- or multi-dimensional. The phrasing used is clearly more compact and should not lead to confusion.

REFERENCES

Aczel, J.: 1966, Lectures on Functional Equations and Their Applications. New York: Academic Press.
Aczel, J.: 1984, 'On weighted synthesis of judgements', Aequationes Mathematicae, 27, 288-307.
Aczel, J., Kannappan, Pl., Ng, C. T., and Wagner, C.: 1983, 'Functional equations and inequalities in "rational group decision making"', In E. F. Beckenbach and W. Walter (Eds.), General Inequalities 3, Basel: Birkhäuser Verlag.
Aczel, J., Ng, C. T., and Wagner, C.: 1984, 'Aggregation theorems for allocation problems', SIAM J. Algebraic Discrete Methods, 5, 1-8.
Aczel, J., and Dhombres, J.: 1989, Functional Equations in Several Variables. New York: Cambridge University Press.
Alsina, C.: 1989, 'Synthesizing judgements given by probability distribution functions', Manuscript, Department Matemàtiques, Universidad Politècnica de Catalunya.


Bordley, R. F.: 1982, 'A multiplicative formula for aggregating probability assessments', Management Science, 28, 1137-1148.
Corbin, R. and Marley, A. A. J.: 1974, 'Random utility models with equality: An apparent, but not actual, generalization of random utility models', J. Mathematical Psychology, 11, 274-293.
Debreu, G.: 1960, 'Review of R. D. Luce "Individual choice behaviour: A theoretical analysis"', American Economic Review, 50, 186-188.
Dempster, A. P.: 1967, 'Upper and lower probabilities induced by a multivalued mapping', Ann. Mathematical Statistics, 38, 325-339.
Falmagne, J. C.: 1981, 'On a recurrent misuse of a classical functional equation result', J. Mathematical Psychology, 23, 190-193.
Genest, C.: 1984, 'Pooling operators with the marginalization property', Can. J. Statistics, 12, 153-163.
Genest, C., Weerahandi, S., and Zidek, J. V.: 1984, 'Aggregating opinion pools through logarithmic pooling', Theory and Decision, 17, 61-70.
Genest, C. and Zidek, J. V.: 1986, 'Combining probability distributions: A critique and an annotated bibliography', Statistical Science, 1, 114-148.
Hoijtink, H.: 1989, 'A latent trait model for dichotomous choice data', Manuscript, University of Groningen, Netherlands.
Luce, R. D.: 1959, Individual Choice Behaviour, New York: Wiley.
Luce, R. D.: 1977, 'The choice axiom after twenty years', J. Mathematical Psychology, 15, 215-233.
Luce, R. D. and Suppes, P.: 1965, 'Preference, utility, and subjective probability', In R. D. Luce, R. R. Bush, and E. Galanter (Eds.), Handbook of Mathematical Psychology, III. New York: Wiley, pp. 230-270.
Marley, A. A. J.: 1989, 'A random utility family that includes many of the "classical" models and has closed form choice probabilities and choice reaction times', Brit. J. Mathematical and Statistical Psychology, 42, 13-36.
Marley, A. A. J.: 1991a, 'Context dependent probabilistic choice models based on measures of binary advantage', Mathematical Social Sciences (to appear).
Marley, A. A. J.: 1991b, 'Developing and characterizing multidimensional Thurstone and Luce models for identification and preference', In F. G. Ashby (Ed.), Multidimensional Models of Perception and Cognition, Hillsdale, NJ: Erlbaum.
Marshall, A. W. and Olkin, I.: 1988, 'Families of multivariate distributions', J. American Statistical Association, 83, 834-841.
McConway, K. J.: 1981, 'Marginalization and linear opinion pools', J. American Statistical Association, 76, 410-414.
McFadden, D.: 1978, 'Modelling the choice of residential location', In A. Karlquist et al. (Eds.), Spatial Interaction Theory and Planning Models, Amsterdam: North Holland.
Rotondo, J.: 1986, 'A generalization of Luce's choice axiom and a new class of choice models', Psychometric Society, Abstract, 1986.
Schmidt, F. F.: 1984, 'Consensus, respect, and weighted averaging', Synthese, 62, 25-46.
Shafer, G.: 1976, A Mathematical Theory of Evidence, N.J.: Princeton University Press.
Small, K. A.: 1986, 'A discrete choice model for ordered alternatives', Econometrica, 55, 409-425.
Strauss, D.: 1981, 'Choice by features: An extension of Luce's choice model to account for similarities', Brit. J. Mathematical and Statistical Psychology, 34, 50-61.


Tversky, A.: 1972a, 'Elimination by aspects: A theory of choice', Psychological Review, 79, 281-299.
Tversky, A.: 1972b, 'Choice by elimination', J. Mathematical Psychology, 9, 341-367.
Wagner, C. G.: 1989, 'Consensus for belief functions and related uncertainty measures', Theory and Decision, 26, 295-304.

Department of Psychology, McGill University, 1205 Ave. Dr. Penfield, Montreal, Quebec, Canada H3A 1B1.