Preferential Choice Models 1

Comparing Models of Preferential Choice

by

Jerome R. Busemeyer, Indiana University, U.S.A.
Shailendra Mehta, Purdue University, U.S.A.
& Rachel Barkan, Ben-Gurion University, Israel

October 28, 2003

Research on preferential choice has grown rapidly over the past 40 years, both in terms of theory and data, and it has spread across a variety of fields, including business, consumer research, economics, management science, and psychology. However, a systematic comparison of theories against the basic facts has not been undertaken for many years. Consequently, it is difficult for researchers to become familiar with facts and theories outside their own fields. This article summarizes a number of major phenomena regarding preferential choice, including comparability effects, similarity effects, attraction effects, compromise effects, and reference point effects, all of which provide the benchmarks that any theory must strive to explain. The main purpose of the article is to use these benchmarks to compare the competing preferential choice theories. These theories include simple scalability models, random utility models, the elimination by aspects model, strategy switching models, the context dependent preference models, and connectionist network models of choice.

Empirical Findings

In a natural or realistic setting, preferential choices usually depend on a large number of dimensions or attributes such as economy, quality, practicality, familiarity, etc. Furthermore, these abstract attributes can be decomposed into more micro-level aspects: for example, the quality of a car may depend on power, handling, style, and luxury features; the economy of a car may depend on purchase price, repair costs, gas mileage, and reliability (see Keeney & Raiffa, 1976, Ch. 1). For the purpose of testing theories under experimental control, laboratory studies usually manipulate only two primary dimensions, such as economy and quality. In this case, the choice options can be depicted as points within a two

dimensional space, such as that shown in Figure 1: the horizontal axis represents the evaluation of the options on the basis of the first dimension, e.g., economy; and the vertical axis represents the evaluation of the options on the second dimension, e.g., quality. For example, option A in Figure 1 is a high quality but economically poor product, option B is a low quality but economically good product, and option C is intermediate on both dimensions. We can also imagine other possible options in this space, which are variations of those shown in Figure 1. We will denote the probability of choosing an option A from a choice set X as Pr[A|X].

Figure 1: Two dimensional Representations of Actions

[Figure 1 plots options A, B, C, D, E, F, G, H, I, S, and T as points. Horizontal axis: Dimension 1. Vertical axis: Dimension 2.]

Comparability Effects. Comparability effects are found using binary choices, and they produce violations of two basic choice principles: order independence and strong stochastic transitivity. Order independence states that, for an arbitrary collection of four options, A, B, C, and D, if Pr[A|{A,C}] ≥ Pr[B|{B,C}] then Pr[A|{A,D}] ≥ Pr[B|{B,D}].
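As an illustration of the order independence principle just stated, the following minimal Python sketch tests four binary choice probabilities for a violation. The function name and the numerical values are hypothetical and chosen only for illustration; they are not data from any of the studies cited in this article.

```python
def violates_order_independence(p_ac, p_bc, p_ad, p_bd):
    """Check four binary choice probabilities against order independence.

    Each argument p_xy stands for Pr[X | {X, Y}].  Order independence says
    that if A does at least as well as B against a common competitor C,
    then A must also do at least as well as B against any other common
    competitor D.  Because the principle holds for an arbitrary collection
    of options, the roles of A and B can be exchanged, so a strict reversal
    in either direction counts as a violation.
    """
    a_first = p_ac >= p_bc and p_ad < p_bd
    b_first = p_bc >= p_ac and p_bd < p_ad
    return a_first or b_first


# Hypothetical values: the ordering of the two options reverses when the
# common competitor changes, so the principle is violated.
print(violates_order_independence(0.50, 0.20, 0.50, 0.80))

# Hypothetical values that respect the principle.
print(violates_order_independence(0.60, 0.40, 0.70, 0.50))
```

The first call returns True because the ordering of the two options flips with the competitor; the second returns False because the ordering is preserved.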

The strong stochastic transitivity property states that, for an arbitrary collection of three options, A, B, C, if Pr[A|{A,B}] ≥ Pr[B|{B,C}] ≥ .50 then Pr[A|{A,C}] ≥ Pr[A|{A,B}].

Busemeyer & Townsend (1993) review experiments where comparability effects produce violations of order independence. First, consider a choice between options I and H: there is no noticeable difference on dimension 1, but option I has a disadvantage relative to H on dimension 2; consequently, option I is chosen less frequently. Next consider a choice between options B and H: option H has a large advantage and a large disadvantage relative to option B, which makes the pair difficult to compare, and so the options are chosen with equal frequency. Consequently, it is found that Pr[B|{B,H}] > Pr[I|{I,H}]. Now consider replacing option H with option G in the two choices. When comparing options I and G, there is no noticeable difference on dimension 2, but option I has an advantage over G on dimension 1; consequently, option I is chosen more frequently. When comparing options B and G, option B has both a large advantage and a large disadvantage relative to option G, which again makes the pair difficult to compare, and so the options are chosen with equal frequency. Consequently, the opposite pattern is found, that is, Pr[B|{B,G}] < Pr[I|{I,G}], which is a violation of order independence.

Mellers & Biagini (1994) review experiments where comparability effects produce violations of strong stochastic transitivity. First consider a choice between B and E: there is no noticeable difference on dimension 2, but option B has an advantage over E on dimension 1, so that Pr[B|{B,E}] ≈ 1. Next consider a choice between options C and E: this comparison is more difficult, but if dimension 1 is more important, then it is found that Pr[E|{E,C}] ≥ .50. Finally consider a choice between options B and C: this is also a

very difficult comparison, and consequently it is found that Pr[B|{B,C}] < Pr[B|{B,E}], violating strong stochastic transitivity.

The remaining effects, reviewed below, all involve comparisons between binary choices and triadic choices. The central issue concerns the effect of adding a third alternative on the distribution of choices between the two options carried over from the original binary choice set. The findings below produce violations of two choice principles: independence from irrelevant alternatives and regularity. Independence from irrelevant alternatives states that for an arbitrary collection of three options A, B, and C, if Pr[A|{A,B}] ≥ .50 then Pr[A|{A,B,C}] ≥ Pr[B|{A,B,C}]. Regularity states that for an arbitrary collection of three options A, B, and C, Pr[A|{A,B}] ≥ Pr[A|{A,B,C}].

Similarity effect. This refers to the effect, on choice probabilities, produced by adding a competitive option S to an earlier choice set containing only A and B, where option S is very similar to option A. Suppose that the options are designed so that the binary choices are all equal: Pr[A|{A,B}] = Pr[S|{B,S}] = Pr[A|{A,S}] = .50. When all three options are presented for choice, the two similar options, S and A, hurt each other more than they hurt the dissimilar option B, leaving the probability of choosing B unaffected. The empirical result is that the probability ordering changes to Pr[B|{A,B,S}] > Pr[A|{A,B,S}] = Pr[S|{A,B,S}] for the triadic choice set, producing a violation of independence from irrelevant alternatives. Similar results occur if option T is added to a set containing options A and B (see Tversky, 1972, for a review).

Attraction effect. This refers to the effect, on choice probabilities, of adding a decoy option D to an earlier choice set containing only options A and B, where the decoy D is similar to, but also dominated by, option A. Suppose that in a binary choice, options

A and B are chosen with equal frequency so that Pr[A|{A,B}] = .50. Adding the decoy option D to this choice set enhances the probability of the nearby dominant option A, so that Pr[A|{A,B,D}] > Pr[A|{A,B}], which produces a violation of the regularity principle (Huber, Payne, & Puto, 1982; see Heath & Chatterjee, 1995, for a review). This finding is fairly robust. It has been obtained when A is favored over B (i.e., Pr[A|{A,B,D}] > Pr[A|{A,B}] > .50) as well as when B is favored over A (i.e., Pr[A|{A,B,D}] > Pr[A|{A,B}] and Pr[A|{A,B}] < .50). A similar result also holds if we add option E to a set containing A and B, in which case the probability of B is increased (Pr[B|{A,B,E}] > Pr[B|{A,B}]).

Compromise effect. This refers to the effect, on choice probabilities, of adding an intermediate option C to an earlier choice set containing only two extreme options A and B, where the compromise C is midway between the two extremes. Suppose that all the binary choices are equal so that Pr[A|{A,B}] = Pr[A|{A,C}] = Pr[B|{B,C}] = .50. A third robust finding is that when all three options are presented, the probability of the compromise option is increased relative to the extreme options so that Pr[C|{A,B,C}] > Pr[A|{A,B,C}] and Pr[C|{A,B,C}] > Pr[B|{A,B,C}], which is another violation of independence from irrelevant alternatives (Simonson, 1989; see Tversky & Simonson, 1993, for a review). Under some conditions the effect on the extreme options is symmetric so that Pr[A|{A,B,C}] = Pr[B|{A,B,C}], and for other conditions the effect may be asymmetric, so that Pr[A|{A,B,C}] ≠ Pr[B|{A,B,C}].

Reference point effect. Tversky and Kahneman (1991) conducted a pair of studies that manipulated reference points for choice to demonstrate what they interpreted as loss aversion effects. These studies involve target options A and B illustrated in Figure 1, as

well as reference points E and F in one study, or reference points S and T in a second study.

The first study manipulated a reference point, using either option E or F. Under one condition, participants were asked to imagine that they currently owned product E, and they were then given a choice of keeping E or trading it for either product A or product B. From the reference point of E, option B has a small advantage on dimension 1 and no disadvantage on dimension 2, whereas A has both large advantages (dimension 2) and disadvantages (dimension 1). Under these conditions, E was rarely chosen, and B was strongly favored over A. Under another condition, participants were asked to imagine that they owned option F, and they were then given a choice of keeping F or trading it for either A or B. From the reference point of F, A has a small advantage and no disadvantages, whereas B now has both large advantages and disadvantages. Under this condition, F was again rarely chosen, but now A was slightly favored over B, reversing the earlier preference relation between these two. This finding is another example of a violation of the independence from irrelevant alternatives property for choice.

The second study also manipulated a reference point, but in this case, using either option S or T. In one condition, participants were asked to imagine that they trained on job T, but that job would end and no longer be available, and they had to choose between two new jobs A or B. From this reference point, job B has small advantages and disadvantages relative to T, whereas A has large advantages and disadvantages. Under these conditions, option B was strongly favored over option A. In a second condition,

participants were asked to imagine that they trained on job S, and in this case, preferences reversed, and option A was strongly favored over option B.

This concludes the empirical review. The comparability, similarity, attraction, and compromise effects are well established, and they have also been observed at the individual level of analysis. The reference point effects are less well established and need further research regarding their robustness. Together these phenomena form a benchmark set that any preferential choice model must attempt to explain.

Theoretical Models

Simple Scalability Models

This class of models assumes that each option is assigned a real valued utility, but choice is a probabilistic function of these utilities (Becker, DeGroot, & Marschak, 1963). For example, the Luce (1959) ratio of strengths choice model is a member of this class, as well as the choice models used in several more recent applications (Hey and Orme, 1994; Harless and Camerer, 1994). More formally, each option in the set {A, B, C} is assigned a real valued utility: uA, uB, and uC. These utilities are used to compute the choice probabilities for any set of options. The probability of choosing A from {A,B} is Pr[A|{A,B}] = F2(uA, uB),

(SS-1)

where F2 is an increasing function of the first argument and a decreasing function of the second. The probability of choosing A from {A,B,C} is Pr[A|{A,B,C}] = F3(uA, uB, uC),

(SS-2)

where F3 is a strictly increasing function of the first argument and a strictly decreasing function of the other two, and the same utilities are used as in the binary choices.
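To make (SS-1) and (SS-2) concrete, the sketch below uses the Luce ratio rule, which the text names as one member of this class. The strength values are hypothetical and are chosen only so that every binary choice is equal to .50, as in the similarity effect design.

```python
def luce_choice_probabilities(strengths):
    """Luce ratio rule: Pr[i | X] = u_i / (sum of u_j for j in X).

    `strengths` maps each option in the choice set X to a positive
    strength u_i.  The same strengths are reused for every choice set.
    """
    total = sum(strengths.values())
    return {option: u / total for option, u in strengths.items()}


# Hypothetical strengths (assumed for illustration only).
u = {"A": 1.0, "B": 1.0, "S": 1.0}

binary_ab = luce_choice_probabilities({k: u[k] for k in ("A", "B")})
triadic = luce_choice_probabilities(u)

print(binary_ab)  # A and B are chosen equally often in the binary set
print(triadic)    # A, B, and S remain tied in the triadic set
```

Because the same strengths are carried into the triadic set, the three options stay tied at one third each, so this instance of the class cannot produce the ordering Pr[B|{A,B,S}] > Pr[A|{A,B,S}] observed for the similarity effect.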

This class of models satisfies both the order independence property for binary choices as well as the independence from irrelevant alternatives for triadic choices. Consequently, this class of models cannot account for the comparability effects, the similarity effect (Tversky, 1972a), or the reference point effect. For example, to satisfy Pr[A|{A,B}] = F2(uA, uB) > .50, we require that uA > uB, but this implies that Pr[A|{A,B,S}] = F3(uA,uB,uS) > F3(uB,uA,uS) = Pr[B|{A,B,S}], which is contrary to the facts for the similarity effect. This class of models may be able to account for the attraction effect, depending on the specific assumptions about the forms of F2 and F3.

Random Utility Models

This class of models assumes that each option is assigned a random utility, but the choice on each trial is deterministic: choose the option with the largest random utility. For example, the Thurstone choice model is a member of this class (see Becker, DeGroot, & Marschak, 1963; Luce & Suppes, 1965; McFadden, 1999; Thurstone, 1959; De Soete, Feger, & Klauer, 1989). A standard version of the random utility model can be formulated as follows. Given the complete set of options {A, B, C}, each option is assigned a random utility: UA, UB, and UC, according to a density function, f. The same density function f is used to compute the choice probabilities for any set of options. The probability of choosing A from {A,B} is given by Pr[A|{A,B}] = Pr[UA > UB],

(RU-1)

which is computed from the density function f. The probability of choosing A from {A,B,C} is given by Pr[A|{A,B,C}] = Pr[ (UA > UB) ∩ (UA > UC) ],

(RU-2)

which is also computed using the same density f.

The standard random utility model can account for the comparability effects and the similarity effect when the random utilities are permitted to be correlated (Edgell & Geisler, 1980; De Soete et al., 1989). However, it cannot account for the attraction effect because it must satisfy the property of regularity (Block & Marschak, 1960; Luce & Suppes, 1965). To see this, note that we can write the triadic choice probability as a product: Pr[A|{A,B,D}] = Pr[(UA>UB) ∩ (UA>UD)] = Pr[UA>UB]⋅Pr[UA>UD|UA>UB], and this product cannot be larger than the binary choice probability, Pr[UA>UB]. The random utility model cannot account for reference point effects either. Consider, for example, the case where the binary choices between options A and B are reversed by the presence of a reference point option S versus T. The random utility model is insensitive to the unavailable reference point option manipulation, and therefore predicts the same binary choice probabilities under both reference point conditions.

Elimination by Aspects Model

This model was originally proposed by Tversky (1972a), and is based on characterizing each option as a collection of aspects. Consider a choice from a set of three options {A,B,C}. According to this model, alternatives are composed of common and unique aspects: a, b, and c denote the importance of the unique aspects of A, B, and C, respectively; ab, ac, and bc denote the importance of the common aspects between {A and B and not C}, {A and C and not B}, and {B and C and not A}, respectively. A choice is made by selecting an aspect according to its importance, and eliminating any option

that does not contain that aspect, and this elimination process continues until only one option remains, which is then chosen. For a binary choice between A and B, the EBA model predicts Pr[A|{A,B}] = (a + ac) / [(a + ac) + (b + bc)],

(EBA-1)

and for a triadic choice among {A,B,C}, the EBA model predicts Pr[A|{A,B,C}] = {a + ab⋅Pr[A|{A,B}] + ac⋅Pr[A|{A,C}] } / K,

(EBA-2)

where K = [a + b + c + ab + ac + bc].

The elimination by aspects model was originally designed to account for comparability effects and similarity effects on choice (Tversky, 1972). However, Tversky (1972b) proved that the elimination by aspects model satisfies the regularity principle. Therefore, the elimination by aspects model cannot account for the attraction effect.

The elimination by aspects model cannot account for reference point effects. First note that option E does not have any aspects that are not contained in option B, so Pr[E|{B,E}] = 0. Similarly, F does not have any aspects that are not contained in option A, so Pr[F|{A,F}] = 0. These predictions are consistent with the fact that options E and F are rarely chosen. The finding that Pr[B|{A,B,E}] > Pr[A|{A,B,E}] implies that {(b+be)+ab⋅Pr[B|{A,B}]} > {(a+af)+ab⋅Pr[A|{A,B}]}. But this last inequality implies that Pr[B|{A,B,F}] > Pr[A|{A,B,F}], contrary to fact.

Finally, this model cannot account for the compromise effect. In this case, all the binary choices are equal, which implies a+ac = b+bc

(e1)

a+ab = c+bc

(e2)

b+ab = c+ac

(e3)

and the following results hold for the triadic choice:

Pr[A|{A,B,C}] = {a + .5(ab + ac)}/K
= Pr[B|{A,B,C}] = {b + .5(ab + bc)}/K
< Pr[C|{A,B,C}] = {c + .5(ac + bc)}/K

which implies

(a+ac)+(a+ab) = (b+bc)+(b+ab)

(e4)

(a+ac)+(a+ab) < (c+bc)+(c+ac)

(e5)
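The constraints (e1) to (e5) can be checked numerically. The sketch below implements (EBA-1) and (EBA-2) and evaluates them for one hypothetical set of aspect weights chosen to satisfy (e1) to (e3); the function names and the numerical weights are assumptions made for illustration and do not come from the original studies.

```python
def eba_binary(x_unique, x_shared_with_third, y_unique, y_shared_with_third):
    """EBA-1: probability of choosing X over Y in a binary choice.

    Aspects common to X and Y cannot discriminate between them, so only
    each option's unique aspects and the aspects it shares with the third
    (absent) option enter the ratio.
    """
    x_total = x_unique + x_shared_with_third
    y_total = y_unique + y_shared_with_third
    return x_total / (x_total + y_total)


def eba_triadic(a, b, c, ab, ac, bc):
    """EBA-2 applied to each option in the set {A, B, C}."""
    K = a + b + c + ab + ac + bc
    p_ab = eba_binary(a, ac, b, bc)  # Pr[A | {A,B}]
    p_ac = eba_binary(a, ab, c, bc)  # Pr[A | {A,C}]
    p_bc = eba_binary(b, ab, c, ac)  # Pr[B | {B,C}]
    return {
        "A": (a + ab * p_ab + ac * p_ac) / K,
        "B": (b + ab * (1 - p_ab) + bc * p_bc) / K,
        "C": (c + ac * (1 - p_ac) + bc * (1 - p_bc)) / K,
    }


# Hypothetical weights satisfying (e1)-(e3): all binary choices equal .50.
weights = dict(a=1.0, b=1.0, c=2.0, ab=2.0, ac=1.0, bc=1.0)
probs = eba_triadic(**weights)
print(probs)  # A and B tie at .3125; C is chosen with probability .375
```

In this instance the compromise option C is chosen most often, but only because the weights were set with ab larger than ac and bc.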

From (e1) and (e4) we find a = b and ac = bc. From (e2) and (e5) we find that c > a = b. Reconsidering (e2) and (e3), this finally implies that ac = bc < ab. But this last result is inconsistent with the similarity relations among the options. Note that the two extreme options A and B have little in common. On the other hand, the compromise option C is closer on dimension 2 to A, so that ac > ab; also, the compromise option C is closer on dimension 1 to B, so that bc > ab. Thus the elimination by aspects model fails to account for the compromise effect.

Strategy Switching Models

One appealing idea from the decision making literature is that individuals have a set of strategies for making decisions, and they may switch strategies depending on choice set size or choice context (Payne, Bettman, Johnson, 199x; Gigerenzer, 19xx). It is difficult to examine all possible strategy switching models, but we can consider a simple but reasonable case. Assume that an individual may switch from a compensatory to a non-compensatory strategy on any trial. The probability of using the non-compensatory strategy is denoted αn, and it is assumed to be an increasing function of set size, n.

For the compensatory strategy, we simply assume that it produces a real valued utility assignment for each option, denoted ui for option i. If the options all produce distinguishable utilities, then the option with the greatest utility is chosen by this strategy. If the utilities are equal, then choice is random. For this strategy, it will be convenient to introduce an indicator function: δi(uA, uB, uC) = 1 if ui = max{uA, uB, uC}, otherwise zero.

For the non-compensatory strategy, we assume a lexicographic strategy. In this case, an individual first considers the most important dimension, and he or she takes the best alternative on this first dimension; if more than one alternative is tied on the most important dimension, then the second dimension is considered, and the best on the second dimension is selected. Two options may be tied with respect to a particular dimension if the difference in their values is less than some small threshold, ∆, for that dimension. On any given trial, we allow the rank order importance of the dimensions to change. The probability that dimension 1 is processed first is denoted π1, and the probability that dimension 2 is processed first is denoted π2 = (1-π1).

The compensatory strategy satisfies both strong stochastic transitivity and order independence for binary choices. Therefore, to account for the comparability effects, we must assume that the lexicographic strategy is often used to make binary choices (α2 > 0). For example, if the lexicographic strategy is used, then B will always be chosen over E, E will be chosen over C with probability π1, and B will be chosen over C with probability π1. If we set .50 < π1 < 1, then this pattern of choice probabilities violates strong stochastic transitivity.

This model can also account for the similarity effect, as long as we assume that the difference between option A and option S is smaller than the threshold, ∆, so that

these two alternatives are treated as tied. If the lexicographic strategy is used, then option B will be chosen whenever dimension 1 is processed first, and either option A or option S will be chosen whenever dimension 2 is processed first, with A and S chosen at random. Thus the probability of choosing option B equals π1, and the probability of choosing option A equals (.5)⋅π2. Setting π1 = π2 reproduces the similarity effect.

This model cannot explain the attraction effect. First, the dominated decoy D is rarely ever chosen, and so we must assume that the difference between this decoy and option A is greater than the threshold, ∆. Given this assumption, there are two ways that option A can be chosen: (a) the compensatory strategy is used and the utility for A is the maximum, or (b) the lexicographic strategy is used and dimension 2 is processed first. Therefore, to account for the attraction effect, we require that Pr[A|{A,B,D}] = (1-α3)⋅δA(uA,uB,uD) + α3⋅π2,

(LEX-1)

> Pr[A|{A,B}] = (1-α2)⋅δA(uA,uB) + α2⋅π2 = .50. There are two cases to consider. If δA(uA,uB,uD) = 1, then it is impossible for the above inequality to hold, for in this case we require that (1-α3)⋅1+α3⋅π2 > (1-α2)⋅1+α2⋅π2, which implies that (α3-α2)⋅π2 > (α3-α2), which has no solution for α3 > α2 and 0 < π2 < 1. If δA(uA,uB,uD) = 0, then the above inequality holds, but this also implies that uB > uA, in which case it is impossible to have Pr[B|{A,B,E}] > Pr[B|{A,B}], contrary to fact (see footnote 1).

This model cannot explain the compromise effect either. Given that both of the decoys, D and F, are rarely ever chosen over option A, we can be assured that option C

Footnote 1: If uA = uB, so that δA(uA,uB) = .50, then the model still fails to explain the attraction effect. If Pr[A|{A,B}] = .50, then Pr[A|{A,B}] = (1-α2)⋅(.50)+α2⋅π2 = .50 ⇒ π2 = .50 ⇒ Pr[A|{A,B,D}] = (1-α3)⋅(.50)+α3⋅π2 = .50 = Pr[A|{A,B}], contrary to fact.

can be discriminated on each dimension from option A. Based on this assumption, the binary choice probabilities are given by

Pr[A|{A,B}] = (1-α2)⋅δA(uA,uB) + α2⋅π2,
Pr[A|{A,C}] = (1-α2)⋅δA(uA,uC) + α2⋅π2,

(LEX-2)

Pr[B|{B,C}] = (1-α2)⋅δB(uB,uC) + α2⋅π1.

The probabilities of choosing each option in a triadic choice are given by

Pr[A|{A,B,C}] = (1-α3)⋅δA(uA,uB,uC) + α3⋅π2,
Pr[B|{A,B,C}] = (1-α3)⋅δB(uA,uB,uC) + α3⋅π1,

(LEX-3)

Pr[C|{A,B,C}] = (1-α3)⋅δC(uA,uB,uC).

Note that when all three options are presented, the lexicographic strategy would never choose the compromise, because it is not the best on any dimension. To account for the fact that the compromise is chosen most frequently in the triadic choice set, we must assume that (1-α3) > 0 and δC(uA,uB,uC) = 1, which also implies that δA(uA,uC) = δB(uB,uC) = 0. To account for the fact that Pr[A|{A,C}] = Pr[B|{B,C}] = .50, we must have π2 = π1 = .50, but this finally implies that Pr[A|{A,B}] = (1-α2)⋅δA(uA,uB)+α2⋅(.50) ≠ .50, contrary to fact.

Finally, this model cannot account for the reference point effects. The finding that Pr[B|{A,B,E}] > Pr[A|{A,B,E}] implies that (1-α3)⋅δB(uA,uB,uE)+α3⋅π1 > (1-α3)⋅δA(uA,uB,uE)+α3⋅π2. On the one hand, this result can be explained by assuming uB > uA, so that δB(uA,uB,uE) = 1; but if this is true, then this implies Pr[B|{A,B,F}] > Pr[A|{A,B,F}], contrary to fact. On the other hand, the finding that Pr[B|{A,B,E}] > Pr[A|{A,B,E}] can be explained by assuming that uA > uB and α3⋅π1 > (1-α3)+α3⋅π2; but if this is true, then this also implies Pr[B|{A,B,F}] > Pr[A|{A,B,F}], contrary to fact.
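The strategy switching model described above can be sketched in a few lines of Python. The option coordinates, the threshold, and the function names below are hypothetical choices made for illustration; dimension 1 and dimension 2 are stored at list positions 0 and 1.

```python
def lexicographic(options, first_dim, threshold):
    """Choice probabilities under the lexicographic strategy.

    Dimensions are processed in order, starting with `first_dim`.  Options
    within `threshold` of the best value on a dimension are treated as
    tied and passed to the next dimension; remaining ties are broken at
    random.
    """
    candidates = list(options)
    for dim in (first_dim, 1 - first_dim):
        best = max(options[o][dim] for o in candidates)
        candidates = [o for o in candidates
                      if best - options[o][dim] < threshold]
        if len(candidates) == 1:
            break
    return {o: (1.0 / len(candidates) if o in candidates else 0.0)
            for o in options}


def switching_model(options, utilities, alpha, pi1, threshold):
    """Mixture of a compensatory and a lexicographic strategy.

    alpha is the probability of using the lexicographic strategy and pi1
    is the probability that dimension 1 is processed first.
    """
    top = max(utilities.values())
    winners = [o for o in options if utilities[o] == top]
    lex1 = lexicographic(options, 0, threshold)
    lex2 = lexicographic(options, 1, threshold)
    probs = {}
    for o in options:
        compensatory = 1.0 / len(winners) if o in winners else 0.0
        lexico = pi1 * lex1[o] + (1 - pi1) * lex2[o]
        probs[o] = (1 - alpha) * compensatory + alpha * lexico
    return probs


# Similarity effect: S lies within the threshold of A on both dimensions.
similar = {"A": (1.0, 3.0), "S": (1.1, 2.9), "B": (3.0, 1.0)}
equal_u = {"A": 4.0, "S": 4.0, "B": 4.0}
p_sim = switching_model(similar, equal_u, alpha=1.0, pi1=0.5, threshold=0.25)
print(p_sim)  # B: .50, A: .25, S: .25

# Compromise set: the lexicographic strategy never selects option C.
compromise = {"A": (1.0, 3.0), "B": (3.0, 1.0), "C": (2.0, 2.0)}
print(lexicographic(compromise, 0, 0.25)["C"],
      lexicographic(compromise, 1, 0.25)["C"])
```

With the lexicographic strategy always in use and π1 = π2, the sketch gives option B a probability of .50 and splits the remainder between A and S, matching the similarity effect derivation. It also shows that the compromise option receives zero probability from the lexicographic component, as in (LEX-3).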

Componential Context Model

Tversky and Simonson (1993) proposed a context dependent preference model, called the componential context model, which relies on the concept of loss aversion (Tversky & Kahneman, 1991). According to this model, the value assigned to each option has two components: a context free value and another component that depends on the context of the choice set. The context free value for option i is defined as a sum of the values on each dimension. For options defined by two dimensions, this is vi = vi1 + vi2

(CC-1)

For binary choices, only the context free component is used (see Eq. 9 of Tversky & Simonson, 1993), and binary choice probabilities are determined by a simple scalable choice model. The context component becomes involved when three or more options are presented in the choice set. The context component is based on the concept of advantages of one option over another. For example, option A has a large advantage in terms of quality over option B, and option B has a large advantage in terms of economy over option A. The advantage of option A over B on the quality dimension is equal to the difference (vA2 - vB2). The advantage of option B over A on the economy dimension is equal to the difference (vB1 - vA1). For triadic choice sets, the componential context model asserts Pr[A|{A,B,S}] = F[V(A|A,B,S), V(B|A,B,S), V(S|A,B,S)]

(CC-2)

where F is an increasing function of the first argument, and a decreasing function of the other two. The value of option A in the set {A,B,S} is given by V(A|A,B,S) = vA + θ⋅[R(A,B) + R(A,S)],


where

$$R(A,B) = \frac{v_{A2} - v_{B2}}{(v_{A2} - v_{B2}) + \delta_1(v_{B1} - v_{A1})}, \qquad R(A,S) = \frac{v_{A1} - v_{S1}}{(v_{A1} - v_{S1}) + \delta_2(v_{S2} - v_{A2})}$$

and δi is a convex function, consistent with loss aversion (see Tversky & Simonson, 1993, p. 1185).

The componential context model was developed to explain the attraction effect and the compromise effect. It can also account for reference point effects. However, this model cannot account for comparability effects, because it assumes a simple scalable choice model for binary choices. Furthermore, this model also fails to account for the similarity effect found with triadic choices. Consider the case involving options A, B, and S. For the similarity effect, we find that the binary choices are all equal, which implies vA = vB = vS, and this in turn implies that (vB1 - vA1) = (vA2 - vB2), (vB1 - vS1) = (vS2 - vB2), and (vA1 - vS1) = (vS2 - vA2). For the triadic choice, we need to compare V(A|A,B,S) with V(B|A,B,S), which can be done for each of the three terms separately. The binary choice results imply that the first two terms are equal because vA = vB and (vA2 - vB2) = (vB1 - vA1). Finally, the third term favors option A because of loss aversion. To see this in more detail, note that

$$R(A,S) = \frac{v_{A1} - v_{S1}}{(v_{A1} - v_{S1}) + \delta_2(v_{S2} - v_{A2})} = \frac{1}{1 + \dfrac{\delta_2(v_{A1} - v_{S1})}{v_{A1} - v_{S1}}}$$

$$R(B,S) = \frac{v_{B1} - v_{S1}}{(v_{B1} - v_{S1}) + \delta_2(v_{S2} - v_{B2})} = \frac{1}{1 + \dfrac{\delta_2(v_{B1} - v_{S1})}{v_{B1} - v_{S1}}}$$

Loss aversion requires δ to be a convex function, which implies that δ(x+∆)/(x+∆) > δ(x)/x, and because (vB1 - vS1) = (vA1 - vS1) + (vB1 - vA1), convexity implies that R(A,S) >

R(B,S). In sum, this model predicts that V(A|A,B,S) > V(B|A,B,S), which implies that Pr[A|{A,B,S}] > Pr[B|{A,B,S}], which is contrary to the observed facts.

Decision Field Theory

Busemeyer and Townsend (1993) and Roe, Busemeyer, & Townsend (2001) proposed a connectionist model of preferential choice called decision field theory. This theory is a dynamic model, which predicts choice probabilities as a function of deliberation time. The time index, t, represents the amount of time that has passed during deliberation before a choice is made. The theory is based on three assumptions.

First, it is assumed that a weighted utility is computed for each option at each moment in time t. The weighted value for each option i ∈ {A,B,C} at time t equals Ui(t) = W1(t)⋅mi1 + W2(t)⋅mi2 + εi(t),

(D1)

where mij represents the value of alternative i on dimension j, and W1(t) and W2(t) = 1 - W1(t) are two stochastic variables, called attention weights, which are assumed to fluctuate over time according to a stationary stochastic process. The last term, εi(t), is an error term representing the influence of irrelevant features at each moment in time (e.g., features outside of the experimenter's control). The above equation is similar to Fischer, Jia, and Luce's (2000) random weight utility model except that it is dynamic rather than static.

The second assumption is that the weighted values from each option are contrasted to form what is called a valence. The valence for option i ∈ {A,B,C} is a contrast that compares the weighted value for option i to the average of the weighted values of the other options (j ≠ k ≠ i): vi(t) = Ui(t) - U(t)

(D2)

where U(t) = Σk≠i Uk(t)/(n-1) is the average of the weighted utilities of all the alternatives other than option i. Valence is closely related to the concept of advantages and disadvantages used in Tversky's (1969) additive difference model.

The third assumption states that the valences are integrated over time to form a preference state for each action. The preference state for option i ∈ {A, B, C} is updated according to the linear dynamic system: Pi(t+h) = s⋅Pi(t) + vi(t) - sij⋅Pj(t) - sik⋅Pk(t), j≠k≠i.

(D3)

Conceptually, the new state of preference is a weighted combination of the previous state of preference and the new input valence. Lateral inhibition is also introduced from the competing alternatives, and the strength of the lateral inhibition connection is a decreasing function of the dissimilarity between a pair of alternatives. The probability of choosing option i at time t is equal to the probability that the preference state for option i is maximum at time t. The equations for the choice probabilities are presented in Roe et al. (2001, Appendix B).

Roe et al. (2001) demonstrated that decision field theory provides an explanation for comparability effects, similarity effects, attraction effects, and compromise effects, using a common set of parameter values. However, Roe et al. (2001) did not examine the reference point effects. Below we show that decision field theory also accounts for these effects.

First consider the study involving the reference point represented by option F. To derive predictions from decision field theory for this study, we simply set the values (mij in Equation D1) proportional to the coordinates of the options in Figure 1. We assumed an equal probability of attending to each dimension at each moment in time (Pr[W1(t)=1] =

Preferential Choice Models 20 Pr[W2 (t)=1] = .50). The positive feed back parameter in Equation (D1) was set equal to s = .94, and the lateral inhibitory coefficient for the distant options A and B was set to sAB = .001. These parameters are similar to those used in Roe et al. (2001). We then examined the predictions of the model for a wide range of parameter values for two critical parameters: the lateral inhibitory coefficient for a similar option option (e.g., sAF ) in Equation D3, and the standard deviation of error term (std of ε) in Equation (D1). The probability of choosing option B from a set containing {A, B, F} is shown in Figure 2 below. As can be seen in this figure, as long as the lateral inhibitory Figure 2: Decision field theory predictions for the first reference point effect.

Preferential Choice Models 21

parameter is greater than .001, the model predicts that option A is favored when reference point F is present. The opposite pattern is predicted by the model when the reference point is changed to option E so that choice set contains {A, B, E}. In this case the probability of choosing option B is favored whenever the lateral inhibitory parameter is greater than .001. Thus the model robustly predicts the first reference point effect for a wide range of parameter values.
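The updating rule described above can be illustrated with a small Monte Carlo sketch. This is not the authors' code, and Equations (D1) to (D3) are not reproduced in this section, so the update below only follows the verbal description: the new preference state is the fed-back previous state plus a noisy valence. The option coordinates, the inhibition value assumed for sAF, the noise level, the number of time steps, and the function name `simulate_dft` are all illustrative assumptions.

```python
import random


def simulate_dft(values, feedback, attention, noise_sd,
                 n_steps=50, n_trials=1000, seed=1):
    """Estimate choice probabilities by simulating preference accumulation.

    values[i][k]   : value of option i on dimension k (hypothetical coordinates)
    feedback[i][j] : self-feedback on the diagonal, minus the lateral
                     inhibition strength off the diagonal
    attention[k]   : probability of attending to dimension k at each step
    """
    rng = random.Random(seed)
    n = len(values)
    wins = [0] * n
    for _ in range(n_trials):
        pref = [0.0] * n
        for _ in range(n_steps):
            # Attention switches stochastically between dimensions.
            dim = rng.choices(range(len(attention)), weights=attention)[0]
            attended = [values[i][dim] for i in range(n)]
            total = sum(attended)
            # Valence: own value minus the mean of the competitors, plus noise.
            valence = [attended[i] - (total - attended[i]) / (n - 1)
                       + rng.gauss(0.0, noise_sd) for i in range(n)]
            # New state = fed-back (and laterally inhibited) old state + valence.
            pref = [sum(feedback[i][j] * pref[j] for j in range(n)) + valence[i]
                    for i in range(n)]
        # The option with the maximum preference state at the end is chosen.
        wins[max(range(n), key=lambda i: pref[i])] += 1
    return [w / n_trials for w in wins]


# Hypothetical coordinates: F lies close to A and is slightly worse on dimension 1.
options = [(1.0, 3.0), (3.0, 1.0), (0.5, 3.0)]   # A, B, F
s, s_ab, s_af = 0.94, 0.001, 0.03                # s_af is an assumed value
feedback = [[s, -s_ab, -s_af],
            [-s_ab, s, -s_ab],
            [-s_af, -s_ab, s]]
probs = simulate_dft(options, feedback, attention=[0.5, 0.5], noise_sd=1.0)
print(dict(zip("ABF", probs)))
```

Sweeping the assumed inhibition value and `noise_sd` over a grid would mirror the kind of parameter exploration summarized in Figure 2, although the numbers produced by this sketch should not be expected to match the published predictions.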

To apply decision field theory to the second study, consider the reference point S. For this study, we assume that each option is described by three dimensions: the values of the first two dimensions are taken from the positions of the options shown in Figure 1, and the third dimension represents job availability. Jobs A and B both have a positive value on dimension 3 (they are available), whereas jobs S and T both have negative values on dimension 3 (they are no longer available). In particular, option S is assigned a slightly higher value on dimension 2 than A, a slightly lower value on dimension 1 than A, and a large negative value on dimension 3; option T is assigned a slightly higher value on dimension 1 than B, a slightly lower value on dimension 2 than B, and a large negative value on dimension 3. The large negative value on the third dimension prevents the unavailable option from being chosen from the triadic choice set. We assumed an equal probability of attending to each of the three dimensions, and the remaining parameters were the same as those used to generate Figure 2. The choice probabilities predicted by the theory are illustrated in Figure 3.


As can be seen in the figure, decision field theory again reproduces the reversal in preference as a function of the reference point. In sum, we find that both reference point effects can be predicted for a wide range of parameter values by decision field theory.

Decision field theory is one example of a connectionist model of decision making. Recently, two other connectionist models have been proposed to account for some of the phenomena reviewed above. Guo and Holyoak (2002) present a connectionist model that was designed to explain the similarity and attraction effects, but at this time, it does not account for the other effects. Usher & McClelland (2002) proposed an artificial neural network model that shares some assumptions contained in decision field theory, and this model has the potential to predict most of the effects presented in Table 2.

Comparison of Models

Table 2, shown below, summarizes the ability of each model to account for each effect. At this time, only decision field theory is capable of explaining all of the results.

Model                     Comparability  Similarity  Attraction  Compromise  Reference Point
                          Effects        Effects     Effects     Effects     Effects
Simple Scalability        no             no          yes         no          no
Random Utility            yes            yes         no          yes         no
Elimination by Aspects    yes            yes         no          no          no
Strategy Switching *      yes            yes         no          no          no
Componential Context      no             no          yes         yes         yes
Decision Field Theory     yes            yes         yes         yes         yes

* Switching occurs between a compensatory and a lexicographic strategy.

However, it is possible to modify and extend the other models in Table 2 so that they can overcome their current limitations. For example, one could construct a hybrid simple scalability/random utility model by assuming that the utilities of the simple scalability model are random. As another example, one could construct a more complex strategy switching model by assuming that one switches between a random utility compensatory rule and an elimination-by-aspects heuristic rule. Comparisons among these more complex hybrid versions will require quantitative tests that take into consideration both model accuracy and model complexity (Myung, 2001).


References

Becker, G., DeGroot, M. H., and Marschak, J. (1963). Probabilities of choices among very similar objects: An experiment to decide between two models. Behavioral Science, 8(4), 306-311.
Block, H. D., and Marschak, J. (1960). Random orderings and stochastic theories of response. In I. Olkin, S. Ghurye, W. Hoeffding, W. Madow, and H. Mann (Eds.), Contributions to probability and statistics (pp. 97-132). Stanford: Stanford University Press.
Bock, R. D., and Jones, L. V. (1968). The measurement and prediction of judgment and choice. San Francisco, CA: Holden-Day.
Busemeyer, J. R., and Townsend, J. T. (1993). Decision field theory: A dynamic cognition approach to decision making. Psychological Review, 100, 432-459.
De Soete, G., Feger, H., and Klauer, K. C. (Eds.) (1989). New developments in probabilistic choice modeling. Amsterdam: North Holland.
Edgell, S. E., and Geisler, W. S. (1980). A set-theoretic random utility model of choice behavior. Journal of Mathematical Psychology, 21(3), 265-278.
Fischer, G. W., Jia, J., and Luce, M. F. (2000). Attribute conflict and preference uncertainty: The RandMAU model. Management Science, 46, 669-684.
Gigerenzer, G., and Selten, R. (Eds.) (2001). Bounded rationality: The adaptive toolbox. Cambridge, MA: MIT Press.
Gonzalez-Vallejo, C. (2002). Making tradeoffs: A probabilistic and context-sensitive model of choice behavior. Psychological Review, 109(1), 137-154.
Guo, F. Y., and Holyoak, K. J. (2002). Understanding similarity in choice behavior: Connectionist model. Proceedings of the Cognitive Science Society Meeting.
Hafter (197?)
Harless, D. W., and Camerer, C. (1994). The predictive validity of generalized expected utility theories. Econometrica, 62(6), 1251-1289.
Haykin, S. (1994). Neural networks. New York: Macmillan.
Hauser, J., and Wernerfelt, B. (1990). An evaluation cost model of consideration sets. Journal of Consumer Research, 16, 393-408.
Heath, T. B., and Chatterjee, S. (1991). How entrants affect multiple brands: A dual attraction mechanism. Advances in Consumer Research, 18, 768-771.
Hey, J. D., and Orme, C. (1994). Investigating generalizations of expected utility theories using experimental data. Econometrica, 62(6), 1291-1326.
Huber, J., Payne, J. W., and Puto, C. (1982). Adding asymmetrically dominated alternatives: Violations of regularity and the similarity hypothesis. Journal of Consumer Research, 9(1), 90-98.
Keeney, R. L., and Raiffa, H. (1976). Decisions with multiple objectives: Preference and value tradeoffs. New York: John Wiley & Sons.
Lapersonne, E., Laurent, G., and Le Goff, J. J. (1995). Consideration sets of size one: An empirical investigation of automobile purchases. International Journal of Research in Marketing, 12, 55-66.
Lehmann, D. R., and Pan, Y. (1994). Context effects, new brand entry, and consideration sets. Journal of Marketing Research, 31, 364-374.
Luce, R. D., and Suppes, P. (1965). Preference, utility, and subjective probability. In R. D. Luce, R. R. Bush, and E. Galanter (Eds.), Handbook of mathematical psychology, Vol. 3 (pp. 249-410). New York: Wiley.
Luce, R. D. (1959). Individual choice behavior: A theoretical analysis. New York: Wiley.
McFadden, D. (1981). Econometric models of probabilistic choice. In C. F. Manski and D. McFadden (Eds.), Structural analysis of discrete data with economic applications (pp. 198-272). Cambridge, MA: MIT Press.
Mellers, B. A., and Biagini, K. (1994). Similarity and choice. Psychological Review, 101, 505-518.
Narayana, C. L., and Markin, R. J. (1975). Consumer behavior and product performance: An alternative conceptualization. Journal of Marketing, 39, 1-6.
Nowlis, S. M., and Simonson, I. (2000). Sales promotions and the choice context as competing influences on consumer decision making. Journal of Consumer Psychology, 9(1), 1-16.
Payne, J. W., Bettman, J. R., and Johnson, E. J. (1993). The adaptive decision maker. New York: Cambridge University Press.
Roe, R. M., Busemeyer, J. R., and Townsend, J. T. (2001). Multialternative decision field theory: A dynamic connectionist model of decision making. Psychological Review, 108(2), 370-392.
Simonson, I., and Tversky, A. (1992). Choice in context: Tradeoff contrast and extremeness aversion. Journal of Marketing Research, 29, 281-295.
Simonson, I. (1989). Choice based on reasons: The case of attraction and compromise effects. Journal of Consumer Research, 16, 158-174.
Sjoberg, L. (1977). Choice frequency and similarity. Scandinavian Journal of Psychology, 18, 103-115.
Thurstone, L. L. (1959). The measurement of values. Chicago: University of Chicago Press.
Tversky, A. (1972a). Elimination by aspects: A theory of choice. Psychological Review, 79(4), 281-299.
Tversky, A. (1972b). Choice by elimination. Journal of Mathematical Psychology, 9(4), 341-367.
Tversky, A. (1969). Intransitivity of preferences. Psychological Review, 76, 31-48.
Tversky, A., and Kahneman, D. (1991). Loss aversion in riskless choice: A reference-dependent model. Quarterly Journal of Economics, 106(4), 1039-1061.
Tversky, A., and Sattath, S. (1979). Preference trees. Psychological Review, 86(6), 542-573.
Tversky, A., and Simonson, I. (1993). Context dependent preferences. Management Science, 39, 1179-1189.
Usher, M., and McClelland, J. L. (2002). Decisions, decisions: Loss aversion, information leakage, and inhibition in multi-alternative choice situations. Under review at Psychological Review.


Research on preferential choice has rapidly grown over the past 40 years, both in terms of theory and data, and it has spread across a variety of fields, including business, consumer research, economics, management science, and psychology. However, a systematic comparison of theories against the basic facts has not been undertaken for many years. Consequently, it is difficult for researchers to become familiar with facts and theories outside their own fields. This article summarizes a number of major phenomena regarding preferential choice, including comparability effects, similarity effects, attraction effects, compromise effects, and reference point effects, all of which provide the benchmarks that any theory must strive to explain. The main purpose of the article is to use these benchmarks to compare the competing preferential choice theories. These theories include simple scalability models, random utility models, the elimination by aspects model, strategy switching models, the context dependent preference models, and connectionist network models of choice.

Empirical Findings

In a natural or realistic setting, preferential choices usually depend on a large number of dimensions or attributes such as economy, quality, practicality, familiarity, etc. Furthermore, these abstract attributes can be decomposed into more micro-level aspects: for example, the quality of a car may depend on power, handling, style, and luxury features; the economy of a car may depend on purchase price, repair costs, gas mileage, and reliability (see Keeney & Raiffa, 1976, Ch. 1). For the purpose of testing theories under experimental control, laboratory studies usually manipulate only two primary dimensions, such as economy and quality. In this case, the choice options can be depicted as points within a two-dimensional space, such as that shown in Figure 1: the horizontal axis represents the evaluation of the options on the basis of the first dimension, e.g., economy; and the vertical axis represents the evaluation of the options on the second dimension, e.g., quality. For example, option A in Figure 1 is a high quality but economically poor product, option B is a low quality but economically good product, and option C is intermediate on both dimensions. We can also imagine other possible options in this space, which are variations of those shown in Table 1. We will denote the probability of choosing an option A from a choice set X as Pr[A|X].

Figure 1: Two-dimensional representations of actions.

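[Figure 1 plot: options A, B, C, D, E, F, G, H, I, S, and T shown as points in a plane with Dimension 1 on the horizontal axis and Dimension 2 on the vertical axis.]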

Comparability Effects. Comparability effects are found using binary choices, and they produce violations of two basic choice principles: order independence and strong stochastic transitivity. Order independence states that, for an arbitrary collection of four options, A, B, C, and D, if Pr[A|{A,C}] ≥ Pr[B|{B,C}] then Pr[A|{A,D}] ≥ Pr[B|{B,D}].
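The order independence property just stated can be turned into a simple check on four binary choice probabilities. The sketch below is only an illustration: the function name `violates_order_independence` and all of the probabilities are hypothetical and are not data from any of the studies reviewed here.

```python
def violates_order_independence(p_ac, p_bc, p_ad, p_bd):
    """Return True if the A-versus-B ordering against C reverses against D.

    p_xy stands for the binary choice probability Pr[X | {X, Y}].
    Because the labels A and B are arbitrary, the property is checked
    in both directions.
    """
    forward = p_ac >= p_bc and p_ad < p_bd
    backward = p_bc >= p_ac and p_bd < p_ad
    return forward or backward


# Hypothetical probabilities, chosen only to illustrate the two outcomes.
consistent = violates_order_independence(0.70, 0.60, 0.80, 0.65)
reversed_case = violates_order_independence(0.50, 0.20, 0.50, 0.80)
print(consistent, reversed_case)  # False True
```

In the second call, the first option beats the second against one comparison option (.50 versus .20) but loses against the other (.50 versus .80), which is the kind of reversal that counts as a violation.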

The strong stochastic transitivity property states that, for an arbitrary collection of three options, A, B, and C, if Pr[A|{A,B}] ≥ Pr[B|{B,C}] ≥ .50 then Pr[A|{A,C}] ≥ Pr[A|{A,B}].

Busemeyer & Townsend (1993) review experiments where comparability effects produce violations of order independence. First, consider a choice between options I and H: there is no noticeable difference on dimension 1, but option I has a disadvantage relative to H on dimension 2; consequently, option I is chosen less frequently. Next consider a choice between options B and H: option H has a large advantage and a large disadvantage relative to option B, which makes the pair difficult to compare, and so the options are chosen with equal frequency. Consequently, it is found that Pr[B|{B,H}] > Pr[I|{I,H}]. Now consider replacing option H with option G in the two choices. When comparing options I and G, there is no noticeable difference on dimension 2, but option I has an advantage over G on dimension 1; consequently, option I is chosen more frequently. When comparing options B and G, option B has both a large advantage and a large disadvantage relative to option G, which again makes the pair difficult to compare, and so the options are chosen with equal frequency. Consequently, the opposite pattern is found, that is, Pr[B|{B,G}] < Pr[I|{I,G}], which is a violation of order independence.

Mellers & Biagini (1994) review experiments where comparability effects produce violations of strong stochastic transitivity. First consider a choice between B and E: there is no noticeable difference on dimension 2, but option B has an advantage over E on dimension 1, so that Pr[B|{B,E}] ≈ 1. Next consider a choice between options C and E: this comparison is more difficult, but if dimension 1 is more important, then it is found that Pr[E|{E,C}] ≥ .50. Finally consider a choice between options B and C: this is also a very difficult comparison, and consequently it is found that Pr[B|{B,C}] < Pr[B|{B,E}], violating strong stochastic transitivity.

The remaining effects, reviewed below, all involve comparisons between binary choices and triadic choices. The central issue concerns the effect of adding a third alternative on the distribution of choices between the two options carried over from the original binary choice set. The findings below produce violations of two choice principles: independence from irrelevant alternatives and regularity. Independence from irrelevant alternatives states that for an arbitrary collection of three options A, B, and C, if Pr[A|{A,B}] ≥ .50 then Pr[A|{A,B,C}] ≥ Pr[B|{A,B,C}]. Regularity states that for an arbitrary collection of three options A, B, and C, Pr[A|{A,B}] ≥ Pr[A|{A,B,C}].

Similarity effect. This refers to the effect, on choice probabilities, produced by adding a competitive option S to an earlier choice set containing only A and B, where option S is very similar to option A. Suppose that the options are designed so that the binary choices are all equal: Pr[A|{A,B}] = Pr[S|{B,S}] = Pr[A|{A,S}] = .50. When all three options are presented for choice, the two similar options, S and A, hurt each other more than the dissimilar option B, leaving the probability of choosing B unaffected. The empirical result is that the probability ordering changes to Pr[B|{A,B,S}] > Pr[A|{A,B,S}] = Pr[S|{A,B,S}] for the triadic choice set, producing a violation of independence from irrelevant alternatives. Similar results occur if option T is added to a set containing options A and B (see Tversky, 1972, for a review).

Attraction effect. This refers to the effect, on choice probabilities, of adding a decoy option D to an earlier choice set containing only options A and B, where the decoy D is similar to, but also dominated by, option A. Suppose that in a binary choice, options A and B are chosen with equal frequency so that Pr[A|{A,B}] = .50. Adding the decoy option D to this choice set enhances the probability of the nearby dominant option A, so that Pr[A|{A,B,D}] > Pr[A|{A,B}], which produces a violation of the regularity principle (Huber, Payne, & Puto, 1982; see Heath & Chatterjee, 1995, for a review). This finding is fairly robust. It has been obtained when A is favored over B (i.e., Pr[A|{A,B,D}] > Pr[A|{A,B}] > .50) as well as when B is favored over A (i.e., Pr[A|{A,B,D}] > Pr[A|{A,B}] with Pr[A|{A,B}] < .50). A similar result also holds if we add option E to a set containing A and B, in which case the probability of B is increased (Pr[B|{A,B,E}] > Pr[B|{A,B}]).

Compromise effect. This refers to the effect, on choice probabilities, of adding an intermediate option C to an earlier choice set containing only two extreme options A and B, where the compromise C is midway between the two extremes. Suppose that all the binary choices are equal so that Pr[A|{A,B}] = Pr[A|{A,C}] = Pr[B|{B,C}] = .50. A third robust finding is that when all three options are presented, the probability of the compromise option is increased relative to the extreme options so that Pr[C|{A,B,C}] > Pr[A|{A,B,C}] and Pr[C|{A,B,C}] > Pr[B|{A,B,C}], which is another violation of independence from irrelevant alternatives (Simonson, 1989; see Tversky & Simonson, 1993, for a review). Under some conditions the effect on the extreme options is symmetric so that Pr[A|{A,B,C}] = Pr[B|{A,B,C}], and under other conditions the effect may be asymmetric, so that Pr[A|{A,B,C}] ≠ Pr[B|{A,B,C}].

Reference point effect. Tversky and Kahneman (1991) conducted a pair of studies that manipulated reference points for choice to demonstrate what they interpreted as loss aversion effects. These studies involve target options A and B illustrated in Figure 1, as well as reference points E and F in one study, or reference points S and T in a second study.

The first study manipulated a reference point, using either option E or F. Under one condition, participants were asked to imagine that they currently owned product E, and they were then given a choice of keeping E or trading it for either product A or product B. From the reference point of E, option B has a small advantage on dimension 1 and no disadvantage on dimension 2, whereas A has both large advantages (dimension 2) and large disadvantages (dimension 1). Under these conditions, E was rarely chosen, and B was strongly favored over A. Under another condition, participants were asked to imagine that they owned option F, and they were then given a choice of keeping F or trading it for either A or B. From the reference point of F, A has a small advantage and no disadvantages, whereas B now has both large advantages and large disadvantages. Under this condition, F was again rarely chosen, but now A was slightly favored over B, reversing the earlier preference relation between these two. This finding is another example of a violation of the independence from irrelevant alternatives property for choice.

The second study also manipulated a reference point, but in this case, using either option S or T. In one condition, participants were asked to imagine that they trained on job T, but that job would end and no longer be available, and they had to choose between two new jobs A or B. From this reference point, job B has small advantages and disadvantages over T, whereas A has large advantages and disadvantages. Under these conditions, option B was strongly favored over option A. In a second condition, participants were asked to imagine that they trained on job S, and in this case, preferences reversed, and option A was strongly favored over option B.

This concludes the empirical review. The comparability, similarity, attraction, and compromise effects are well established, and they have also been observed at the individual level of analysis. The reference point effects are less well established and need further research regarding their robustness. Together these phenomena form a benchmark set that any preferential choice model must attempt to explain.

Theoretical Models

Simple Scalability Models

This class of models assumes that each option is assigned a real valued utility, but choice is a probabilistic function of these utilities (Becker, DeGroot, & Marschak, 1963). For example, the Luce (1959) ratio of strengths choice model is a member of this class, as well as the choice models used in several more recent applications (Hey and Orme, 1994; Harless and Camerer, 1994). More formally, each option in the set {A, B, C} is assigned a real valued utility: uA, uB, and uC. These utilities are used to compute the choice probabilities for any set of options. The probability of choosing A from {A,B} is

    Pr[A|{A,B}] = F2(uA, uB),    (SS-1)

where F2 is an increasing function of the first argument and a decreasing function of the second. The probability of choosing A from {A,B,C} is

    Pr[A|{A,B,C}] = F3(uA, uB, uC),    (SS-2)

where F3 is a strictly increasing function of the first argument and a strictly decreasing function of the other two, and the same utilities are used as in the binary choices.
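As a concrete instance of this class, the following sketch implements the Luce ratio rule, in which each choice probability is an option's strength divided by the summed strengths of the available options. The strength values and the function name `luce_probs` are illustrative assumptions, with options A, B, and S given equal strengths so that the binary choice is .50.

```python
def luce_probs(strengths, choice_set):
    """Luce ratio rule: Pr[x | X] = u_x / (sum of u_y over y in X)."""
    total = sum(strengths[o] for o in choice_set)
    return {o: strengths[o] / total for o in choice_set}


# Hypothetical positive strengths; S plays the role of a third, added option.
strengths = {"A": 1.0, "B": 1.0, "S": 1.0}

binary = luce_probs(strengths, ["A", "B"])
triadic = luce_probs(strengths, ["A", "B", "S"])

print(binary)   # A and B are each chosen with probability .50
print(triadic)  # each option is chosen with probability 1/3
# The ratio Pr[A]/Pr[B] is the same in both sets, so adding S cannot
# change the ordering of A and B.
print(binary["A"] / binary["B"], triadic["A"] / triadic["B"])
```

Under this rule the added option takes share from A and B in proportion to their strengths, whatever its similarity to either of them, so the ordering of A and B is fixed across choice sets.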

This class of models satisfies both the order independence property for binary choices and the independence from irrelevant alternatives property for triadic choices. Consequently, this class of models cannot account for the comparability effects, the similarity effect (Tversky, 1972a), or the reference point effect. For example, to satisfy Pr[A|{A,B}] = F2(uA, uB) > .50, we require that uA > uB, but this implies that Pr[A|{A,B,S}] = F3(uA,uB,uS) > F3(uB,uA,uS) = Pr[B|{A,B,S}], which is contrary to the facts for the similarity effect. This class of models may be able to account for the attraction effect, depending on the specific assumptions about the forms of F2 and F3.

Random Utility Models

This class of models assumes that each option is assigned a random utility, but the choice on each trial is deterministic: choose the option with the largest random utility. For example, the Thurstone choice model is a member of this class (see Becker, DeGroot, & Marschak, 1963; Luce & Suppes, 1965; McFadden, 1999; Thurstone, 1959; De Soete, Feger, & Klauer, 1989). A standard version of the random utility model can be formulated as follows. Given the complete set of options {A, B, C}, each option is assigned a random utility, UA, UB, and UC, according to a density function, f. The same density function f is used to compute the choice probabilities for any set of options. The probability of choosing A from {A,B} is given by

    Pr[A|{A,B}] = Pr[UA > UB],    (RU-1)

which is computed from the density function f. The probability of choosing A from {A,B,C} is given by

    Pr[A|{A,B,C}] = Pr[(UA > UB) ∩ (UA > UC)],    (RU-2)

which is also computed using the same density f.

The standard random utility model can account for the comparability effects and the similarity effect when the random utilities are permitted to be correlated (Edgell & Geisler, 1980; De Soete et al., 1989). However, it cannot account for the attraction effect because it must satisfy the property of regularity (Block & Marschak, 1960; Luce & Suppes, 1965). To see this, note that we can write the triadic choice probability as a product: Pr[A|{A,B,D}] = Pr[(UA>UB) ∩ (UA>UD)] = Pr[UA>UB]⋅Pr[UA>UD|UA>UB], and this product cannot be larger than the binary choice probability, Pr[UA>UB].

The random utility model cannot account for reference point effects either. Consider, for example, the case where the binary choice between options A and B is reversed by the presence of a reference point option S versus T. The random utility model is insensitive to the manipulation of the unavailable reference point option, and therefore predicts the same binary choice probabilities under both reference point conditions.

Elimination by Aspects Model

This model was originally proposed by Tversky (1972a), and is based on characterizing each option as a collection of aspects. Consider a choice from a set of three options {A,B,C}. According to this model, alternatives are composed of common and unique aspects: a, b, and c denote the importance of the unique aspects of A, B, and C, respectively; ab, ac, and bc denote the importance of the common aspects between {A and B and not C}, {A and C and not B}, and {B and C and not A}, respectively. A choice is made by selecting an aspect according to its importance and eliminating any option that does not contain that aspect; this elimination process continues until only one option remains, which is then chosen. For a binary choice between A and B, the EBA model predicts

    Pr[A|{A,B}] = (a + ac) / [(a + ac) + (b + bc)],    (EBA-1)

and for a triadic choice among {A,B,C}, the EBA model predicts

    Pr[A|{A,B,C}] = {a + ab⋅Pr[A|{A,B}] + ac⋅Pr[A|{A,C}]} / K,    (EBA-2)

where K = [a + b + c + ab + ac + bc].

The elimination by aspects model was originally designed to account for comparability effects and similarity effects on choice (Tversky, 1972a). However, Tversky (1972b) proved that the elimination by aspects model satisfies the regularity principle. Therefore, the elimination by aspects model cannot account for the attraction effect.

The elimination by aspects model cannot account for reference point effects. First note that option E does not have any aspects that are not contained in option B, so Pr[E|{B,E}] = 0. Similarly, F does not have any aspects that are not contained in option A, so Pr[F|{A,F}] = 0. These predictions are consistent with the fact that options E and F are rarely chosen. The finding that Pr[B|{A,B,E}] > Pr[A|{A,B,E}] implies that {(b+be)+ab⋅Pr[B|{A,B}]} > {(a+af)+ab⋅Pr[A|{A,B}]}. But this last inequality implies that Pr[B|{A,B,F}] > Pr[A|{A,B,F}], contrary to fact.

Finally, this model cannot account for the compromise effect. In this case, all the binary choices are equal, which implies

    a + ac = b + bc    (e1)
    a + ab = c + bc    (e2)
    b + ab = c + ac    (e3)

and the following results hold for the triadic choice:

    Pr[A|{A,B,C}] = {a + .5(ab + ac)}/K
      = Pr[B|{A,B,C}] = {b + .5(ab + bc)}/K
      < Pr[C|{A,B,C}] = {c + .5(ac + bc)}/K,

which implies

    (a + ac) + (a + ab) = (b + bc) + (b + ab)    (e4)
    (a + ac) + (a + ab) < (c + bc) + (c + ac)    (e5)

From (e1) and (e4) we find a = b and ac = bc. From (e2) and (e5) we find that c > a = b. Reconsidering (e2) and (e3), this finally implies that ac = bc < ab. But this last result is inconsistent with the similarity relations among the options. Note that the two extreme options A and B have little in common. On the other hand, the compromise option C is closer on dimension 2 to A, so that ac > ab; also the compromise option C is closer on dimension 1 to B, so that bc > ab. Thus the elimination by aspects model fails to account for the compromise effect.

Strategy Switching Models

One appealing idea from the decision making literature is that individuals have a set of strategies for making decisions, and they may switch strategies depending on choice set size or choice context (Payne, Bettman, Johnson, 199x; Gigerenzer, 19xx). It is difficult to examine all possible strategy switching models, but we can consider a simple but reasonable case. Assume that an individual may switch from a compensatory to a non-compensatory strategy on any trial. The probability of using the non-compensatory strategy is denoted αn, and it is assumed to be an increasing function of set size, n.
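Before the strategy switching model is developed further, the elimination by aspects argument for the compromise effect can be checked numerically. The sketch below is a generic recursive implementation that reduces to (EBA-1) and (EBA-2) for two and three options. The function name `eba_prob` and both sets of aspect weights are hypothetical values chosen to satisfy (e1) through (e3); they are not estimates from data.

```python
def eba_prob(option, choice_set, weights):
    """Probability of choosing `option` from `choice_set` under EBA.

    weights maps a frozenset of options (those sharing an aspect group)
    to the importance of that aspect group.
    """
    choice_set = frozenset(choice_set)
    if len(choice_set) == 1:
        return 1.0 if option in choice_set else 0.0
    total = 0.0
    numer = 0.0
    for owners, w in weights.items():
        kept = owners & choice_set
        # Aspects held by none or by all of the available options
        # cannot eliminate anything, so they are ignored.
        if not kept or kept == choice_set:
            continue
        total += w
        if option in kept:
            numer += w * eba_prob(option, kept, weights)
    if total == 0.0:
        return 1.0 / len(choice_set)  # nothing discriminates: guess
    return numer / total


def make_weights(a, b, c, ab, ac, bc):
    return {frozenset("A"): a, frozenset("B"): b, frozenset("C"): c,
            frozenset("AB"): ab, frozenset("AC"): ac, frozenset("BC"): bc}


# Case 1: weights satisfying (e1)-(e3) with ab > ac = bc.
w1 = make_weights(a=1, b=1, c=2, ab=2, ac=1, bc=1)
# Case 2: weights satisfying (e1)-(e3) with ac = bc > ab, as similarity suggests.
w2 = make_weights(a=2, b=2, c=1, ab=0, ac=1, bc=1)

for w in (w1, w2):
    binary = eba_prob("A", "AB", w)
    triadic = {o: eba_prob(o, "ABC", w) for o in "ABC"}
    print(binary, triadic)
```

With these numbers, the first case yields Pr[C|{A,B,C}] = .375 against .3125 for each extreme option, but only because A and B share more than either shares with C. In the second case, where the shared-aspect weights follow the similarity relations, the compromise option receives the smallest probability.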

Preferential Choice Models 13 For the compensatory strategy, we simply assume that it produces a real valued utility assignment for each option, denoted ui for option i. If the options all produce distinguishable utilities, then option with the greatest utility is chosen by this strategy. If the utilities are equal, then choice is random. For this strategy, it will be convenient to introduce an indicator function: δ i(uA, uB, uC) = 1 if ui = max{uA, uB, uC), otherwise zero. For the non-compensatory strategy, we assume a lexicographic strategy. In this case, an individual first considers the most important dimension, and he or she takes the best alternative on this first dimension; if more than one alternative is tied on the most important dimension, then the second dimension is considered, and the best on the second dimension is selected. Two options may be tied with respect to a particular dimension if the difference in their values is less than some small threshold, ∆, for that dimension. On any given trial, we allow the rank order importance of the dimensions to possibly change. The probability that dimension 1 is processed first is denoted π 1 , and the probability that dimension 2 is processed first is denoted π 2 = (1−π 1 ). The compensatory strategy satisfies both strong stochastic transitivity and order independence for binary choices. Therefore, to account for the comparability effects, we must assume that the lexicographic strategy is often used to make binary choices (α 2 > 0). For example, if the lexicographic strategy is used, then B will always be chosen over E, and E will be chosen over C with probability π 1 ; and B will be chosen over C with probability π 1 . If we set .50 < π 1 < 1, then this pattern of choice probabilities violates strong stochastic transitivity. This model can also account for the similarity effect, as long as we assume that the difference between option A and optio n S is smaller than the threshold, ∆, so that

these two alternatives are treated as tied. If the lexicographic strategy is used, then option B will be chosen whenever dimension 1 is processed first, and whenever dimension 2 is processed first, the choice between the tied options A and S is random. Thus the probability of choosing option B equals π_1, and the probability of choosing option A equals (.5)⋅π_2. Setting π_1 = π_2 reproduces the similarity effect.

This model cannot explain the attraction effect. First, the dominated decoy D is rarely chosen, and so we must assume that the difference between this decoy and option A is greater than the threshold, ∆. Given this assumption, there are two ways that option A can be chosen: (a) the compensatory strategy is used and the utility for A is the maximum, or (b) the lexicographic strategy is used and dimension 2 is processed first. Therefore, to account for the attraction effect, we require that

    Pr[A | {A,B,D}] = (1 − α_3)⋅δ_A(u_A, u_B, u_D) + α_3⋅π_2
                    > Pr[A | {A,B}] = (1 − α_2)⋅δ_A(u_A, u_B) + α_2⋅π_2 = .50.        (LEX-1)

There are two cases to consider. If δ_A(u_A, u_B, u_D) = 1, then it is impossible for the above inequality to hold. For in this case we require that (1 − α_3)⋅1 + α_3⋅π_2 > (1 − α_2)⋅1 + α_2⋅π_2, which implies that (α_3 − α_2)⋅π_2 > (α_3 − α_2), which has no solution for α_3 > α_2 and 0 < π_2 < 1. If δ_A(u_A, u_B, u_D) = 0, then the above inequality holds, but this also implies that u_B > u_A, in which case it is impossible to have Pr[B | {A,B,E}] > Pr[B | {A,B}], contrary to fact.¹

¹ If u_A = u_B, so that δ_A(u_A, u_B) = .50, then the model still fails to explain the attraction effect. If Pr[A | {A,B}] = .50, then Pr[A | {A,B}] = (1 − α_2)⋅(.50) + α_2⋅π_2 = .50 → π_2 = .50 → Pr[A | {A,B,D}] = (1 − α_3)⋅(.50) + α_3⋅π_2 = .50 = Pr[A | {A,B}], contrary to fact.

This model cannot explain the compromise effect either. Given that both of the decoys, D and F, are rarely chosen over option A, we can be assured that option C
can be discriminated on each dimension from option A. Based on this assumption, the binary choice probabilities are given by

    Pr[A | {A,B}] = (1 − α_2)⋅δ_A(u_A, u_B) + α_2⋅π_2,
    Pr[A | {A,C}] = (1 − α_2)⋅δ_A(u_A, u_C) + α_2⋅π_2,        (LEX-2)
    Pr[B | {B,C}] = (1 − α_2)⋅δ_B(u_B, u_C) + α_2⋅π_1.

The choice probabilities for the triadic choice set are given by

    Pr[A | {A,B,C}] = (1 − α_3)⋅δ_A(u_A, u_B, u_C) + α_3⋅π_2,
    Pr[B | {A,B,C}] = (1 − α_3)⋅δ_B(u_A, u_B, u_C) + α_3⋅π_1,        (LEX-3)
    Pr[C | {A,B,C}] = (1 − α_3)⋅δ_C(u_A, u_B, u_C).

Note that when all three options are presented, the lexicographic strategy would never choose the compromise, because it is not the best on any dimension. To account for the fact that the compromise is chosen most frequently in the triadic choice set, we must assume that (1 − α_3) > 0 and δ_C(u_A, u_B, u_C) = 1, which also implies that δ_A(u_A, u_C) = δ_B(u_B, u_C) = 0. To account for the fact that Pr[A | {A,C}] = Pr[B | {B,C}] = .50, we must have π_2 = π_1 = .50, but this finally implies that Pr[A | {A,B}] = (1 − α_2)⋅δ_A(u_A, u_B) + α_2⋅(.50) ≠ .50, contrary to fact.

Finally, this model cannot account for the reference point effects. The finding that Pr[B | {A,B,E}] > Pr[A | {A,B,E}] implies that (1 − α_3)⋅δ_B(u_A, u_B, u_E) + α_3⋅π_1 > (1 − α_3)⋅δ_A(u_A, u_B, u_E) + α_3⋅π_2. On the one hand, this result can be explained by assuming u_B > u_A, so that δ_B(u_A, u_B, u_E) = 1; but if this is true, then this implies Pr[B | {A,B,F}] > Pr[A | {A,B,F}], contrary to fact. On the other hand, the finding that Pr[B | {A,B,E}] > Pr[A | {A,B,E}] can be explained by assuming that u_A > u_B and α_3⋅π_1 > (1 − α_3) + α_3⋅π_2; but if this is true, then this also implies Pr[B | {A,B,F}] > Pr[A | {A,B,F}], contrary to fact.
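The strategy-switching probabilities can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the option coordinates, the utilities, and the function names (`compensatory`, `lexicographic`, `strategy_switching`) are hypothetical and are not taken from the studies reviewed here. Ties are judged relative to the best value on a dimension, which is one possible reading of the threshold rule.

```python
from typing import Dict, Tuple

Option = Tuple[float, float]  # (value on dimension 1, value on dimension 2)


def compensatory(utilities: Dict[str, float]) -> Dict[str, float]:
    """Indicator rule: the maximum-utility option is chosen; exact ties are split evenly."""
    best = max(utilities.values())
    winners = [name for name, u in utilities.items() if u == best]
    return {name: (1.0 / len(winners) if name in winners else 0.0) for name in utilities}


def lexicographic(options: Dict[str, Option], first_dim: int, threshold: float) -> Dict[str, float]:
    """Keep the options tied with the best on first_dim, then on the other dimension."""
    candidates = list(options)
    for dim in (first_dim, 1 - first_dim):
        best = max(options[name][dim] for name in candidates)
        # Options within `threshold` of the best value count as tied on this dimension.
        candidates = [name for name in candidates if best - options[name][dim] < threshold]
    # Options still tied after both dimensions are chosen at random.
    return {name: (1.0 / len(candidates) if name in candidates else 0.0) for name in options}


def strategy_switching(options: Dict[str, Option], utilities: Dict[str, float],
                       alpha: float, pi1: float, threshold: float) -> Dict[str, float]:
    """Mixture: compensatory with probability 1 - alpha, lexicographic with probability alpha."""
    comp = compensatory(utilities)
    lex1 = lexicographic(options, 0, threshold)  # dimension 1 processed first
    lex2 = lexicographic(options, 1, threshold)  # dimension 2 processed first
    return {
        name: (1 - alpha) * comp[name] + alpha * (pi1 * lex1[name] + (1 - pi1) * lex2[name])
        for name in options
    }


# Hypothetical coordinates: B is best on dimension 1, A on dimension 2, C is the compromise.
triad = {"A": (1.0, 3.0), "B": (3.0, 1.0), "C": (2.0, 2.0)}
triad_utils = {"A": 1.0, "B": 1.0, "C": 1.2}  # hypothetical utilities favoring C
p_triad = strategy_switching(triad, triad_utils, alpha=0.4, pi1=0.5, threshold=0.1)

# Hypothetical similarity set: S lies within the threshold of A on both dimensions.
similar = {"A": (1.0, 3.0), "S": (1.05, 2.95), "B": (3.0, 1.0)}
similar_utils = {"A": 1.0, "S": 1.0, "B": 1.0}
p_similar = strategy_switching(similar, similar_utils, alpha=1.0, pi1=0.5, threshold=0.1)
```

With these hypothetical inputs, the compromise C receives probability only through the compensatory term, as in Equation LEX-3, and the purely lexicographic case (alpha = 1) gives B a probability of π_1 and A a probability of (.5)⋅π_2, as in the similarity argument.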

Componential context model

Tversky and Simonson (1993) proposed a context-dependent preference model, called the componential context model, which relies on the concept of loss aversion (Tversky & Kahneman, 1991). According to this model, the value assigned to each option has two components: a context-free value and another component that depends on the context of the choice set. The context-free value for option i is defined as a sum of the values on each dimension. For options defined by two dimensions, this is

    v_i = v_i1 + v_i2.        (CC-1)

For binary choices, only the context-free component is used (see Eq. 9 of Tversky & Simonson, 1993), and binary choice probabilities are determined by a simple scalable choice model. The context component becomes involved when three or more options are presented in the choice set. The context component is based on the concept of the advantages of one option over another. For example, option A has a large advantage in terms of quality over option B, and option B has a large advantage in terms of economy over option A. The advantage of option A over B on the quality dimension is equal to the difference (v_A2 − v_B2). The advantage of option B over A on the economy dimension is equal to the difference (v_B1 − v_A1). For triadic choice sets, the componential context model asserts

    Pr[A | {A,B,S}] = F[V(A|A,B,S), V(B|A,B,S), V(S|A,B,S)],        (CC-2)

where F is an increasing function of the first argument and a decreasing function of the other two. The value of option A in the set {A,B,S} is given by

    V(A|A,B,S) = v_A + θ⋅[R(A,B) + R(A,S)],

where

$$R(A,B) = \frac{v_{A2} - v_{B2}}{(v_{A2} - v_{B2}) + \delta_1(v_{B1} - v_{A1})}, \qquad R(A,S) = \frac{v_{A1} - v_{S1}}{(v_{A1} - v_{S1}) + \delta_2(v_{S2} - v_{A2})},$$

and δ_i is a convex function, consistent with loss aversion (see Tversky & Simonson, 1993, p. 1185).

The componential context model was developed to explain the attraction effect and the compromise effect. It can also account for reference point effects. However, this model cannot account for comparability effects, because it assumes a simple scalable choice model for binary choices. Furthermore, this model also fails to account for the similarity effect found with triadic choices. Consider the case involving options A, B, and S. For the similarity effect, we find that the binary choices are all equal, which implies v_A = v_B = v_S, and this in turn implies that (v_B1 − v_A1) = (v_A2 − v_B2), (v_B1 − v_S1) = (v_S2 − v_B2), and (v_A1 − v_S1) = (v_S2 − v_A2). For the triadic choice, we need to compare V(A|A,B,S) with V(B|A,B,S), which can be done for each of the three terms separately. The binary choice results imply that the first two terms are equal because v_A = v_B and (v_A2 − v_B2) = (v_B1 − v_A1). Finally, the third term favors option A because of loss aversion. To see this in more detail, note that

$$R(A,S) = \frac{v_{A1} - v_{S1}}{(v_{A1} - v_{S1}) + \delta_2(v_{S2} - v_{A2})} = \frac{1}{1 + \dfrac{\delta_2(v_{A1} - v_{S1})}{v_{A1} - v_{S1}}},$$

$$R(B,S) = \frac{v_{B1} - v_{S1}}{(v_{B1} - v_{S1}) + \delta_2(v_{S2} - v_{B2})} = \frac{1}{1 + \dfrac{\delta_2(v_{B1} - v_{S1})}{v_{B1} - v_{S1}}}.$$
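The comparison of R(A,S) with R(B,S) can be checked numerically. The Python sketch below is only an illustration: the coordinates of A, B, and S, the value θ = 1, and the particular convex function `delta` are hypothetical choices that satisfy v_A = v_B = v_S, and the same function is assumed for both dimensions.

```python
from typing import Sequence, Tuple

Option = Tuple[float, float]  # (value on dimension 1, value on dimension 2)


def delta(x: float) -> float:
    """Hypothetical convex disadvantage function with delta(0) = 0 and delta(x) >= x."""
    return x + 0.5 * x ** 2


def relative_advantage(x: Option, y: Option) -> float:
    """R(x, y): advantage of x over y divided by advantage plus transformed disadvantage."""
    advantage = sum(max(xi - yi, 0.0) for xi, yi in zip(x, y))
    disadvantage = sum(delta(max(yi - xi, 0.0)) for xi, yi in zip(x, y))
    total = advantage + disadvantage
    return advantage / total if total > 0 else 0.0


def context_value(target: Option, others: Sequence[Option], theta: float) -> float:
    """V(target | set): context-free value plus theta times the summed relative advantages."""
    return sum(target) + theta * sum(relative_advantage(target, other) for other in others)


# Hypothetical options with equal context-free values (each sums to 4).
A, B, S = (1.0, 3.0), (3.0, 1.0), (0.8, 3.2)

r_as = relative_advantage(A, S)  # small advantage, small disadvantage
r_bs = relative_advantage(B, S)  # large advantage, large disadvantage
v_a = context_value(A, [B, S], theta=1.0)
v_b = context_value(B, [A, S], theta=1.0)
```

Under these assumptions R(A,B) equals R(B,A), while the convex `delta` penalizes the large disadvantage of B relative to S more than proportionally, so `r_as` exceeds `r_bs` and `v_a` exceeds `v_b`.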

Loss aversion requires δ to be a convex function, which implies that δ(x+∆)/(x+∆) > δ(x)/x for ∆ > 0, and because (v_B1 − v_S1) = (v_A1 − v_S1) + (v_B1 − v_A1), convexity implies that R(A,S) > R(B,S). In sum, this model predicts that V(A|A,B,S) > V(B|A,B,S), which implies that Pr[A | {A,B,S}] > Pr[B | {A,B,S}], contrary to the observed facts.

Decision Field Theory. Busemeyer and Townsend (1993) and Roe, Busemeyer, and Townsend (2001) proposed a connectionist model of preferential choice called decision field theory. This theory is a dynamic model, which predicts choice probabilities as a function of deliberation time. The time index, t, represents the amount of time that has passed during deliberation before a choice is made. The theory is based on three assumptions. First, it is assumed that a weighted utility is computed for each option at each moment in time t. The weighted value for each option i ∈ {A,B,C} at time t equals

    U_i(t) = W_1(t)⋅m_i1 + W_2(t)⋅m_i2 + ε_i(t),        (D1)

where m_ij represents the value of alternative i on dimension j, and W_1(t) and W_2(t) = 1 − W_1(t) are two stochastic variables, called attention weights, which are assumed to fluctuate over time according to a stationary stochastic process. The last term, ε_i(t), is an error term representing the influence of irrelevant features at each moment in time (e.g., features outside of the experimenter's control). The above equation is similar to Fischer, Jia, and Luce's (2000) random weight utility model except that it is dynamic rather than static. The second assumption is that the weighted values from each option are contrasted to form what is called a valence. The valence for option i ∈ {A,B,C} is a contrast that compares the weighted value for option i to the average of the weighted values of the other options (j ≠ k ≠ i):

    v_i(t) = U_i(t) − U(t),        (D2)

where U(t) = Σ_{k≠i} U_k(t)/(n − 1) is the average of the weighted utilities of all the alternatives other than option i. Valence is closely related to the concept of advantages and disadvantages used in Tversky's (1969) additive difference model. The third assumption states that the valences are integrated over time to form a preference state for each action. The preference state for option i ∈ {A,B,C} is updated according to the linear dynamic system

    P_i(t+h) = s⋅P_i(t) + v_i(t) − s_ij⋅P_j(t) − s_ik⋅P_k(t),   j ≠ k ≠ i.        (D3)
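Equations D1 to D3 can be traced through a single deterministic update in Python. This is a minimal sketch, not the authors' implementation: the option values in `m`, the uniform lateral inhibition of .001, and the function names are hypothetical, attention is fixed on dimension 1, and the error term is set to zero so that the arithmetic can be followed by hand.

```python
from typing import List, Sequence


def weighted_values(m: Sequence[Sequence[float]], w1: float, noise: Sequence[float]) -> List[float]:
    """Equation D1 with attention weight w1 on dimension 1 and 1 - w1 on dimension 2."""
    return [w1 * row[0] + (1.0 - w1) * row[1] + e for row, e in zip(m, noise)]


def valences(u: Sequence[float]) -> List[float]:
    """Equation D2: each weighted value minus the mean of the other options' values."""
    n, total = len(u), sum(u)
    return [ui - (total - ui) / (n - 1) for ui in u]


def update(p: Sequence[float], v: Sequence[float], s: float,
           inhibition: Sequence[Sequence[float]]) -> List[float]:
    """Equation D3: self-feedback on the old state, plus valence, minus lateral inhibition."""
    n = len(p)
    return [
        s * p[i] + v[i] - sum(inhibition[i][j] * p[j] for j in range(n) if j != i)
        for i in range(n)
    ]


m = [[1.0, 3.0], [3.0, 1.0], [2.0, 2.0]]  # hypothetical values for options A, B, C
inhibition = [[0.0, 0.001, 0.001],
              [0.001, 0.0, 0.001],
              [0.001, 0.001, 0.0]]          # hypothetical, equal for every pair

v = valences(weighted_values(m, w1=1.0, noise=[0.0, 0.0, 0.0]))
p1 = update([0.0, 0.0, 0.0], v, s=0.94, inhibition=inhibition)
p2 = update(p1, v, s=0.94, inhibition=inhibition)
```

When attention stays on dimension 1, the valences are −1.5, 1.5, and 0 for A, B, and C; they always sum to zero because each option is contrasted with the mean of the others. Starting from a zero state, the first update therefore equals the valence vector, and the second update adds the feedback and inhibition terms.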

Conceptually, the new state of preference is a weighted combination of the previous state of preference and the new input valence. Lateral inhibition is also introduced from the competing alternatives, and the strength of the lateral inhibition connection is a decreasing function of the dissimilarity between a pair of alternatives. The probability of choosing option i at time t is equal to the probability that the preference state for option i is the maximum at time t. The equations for the choice probabilities are presented in Roe et al. (2001, Appendix B).

Roe et al. (2001) demonstrated that decision field theory provides an explanation for comparability effects, similarity effects, attraction effects, and compromise effects, using a common set of parameter values. However, Roe et al. (2001) did not examine the reference point effects. Below we show that decision field theory also accounts for these effects.

First consider the study involving the reference point represented by option F. To derive predictions from decision field theory for this study, we simply set the values (m_ij in Equation D1) proportional to the coordinates of the options in Figure 1. We assumed an equal probability of attending to each dimension at each moment in time (Pr[W_1(t)=1] =
Pr[W_2(t)=1] = .50). The positive feedback parameter in Equation (D3) was set equal to s = .94, and the lateral inhibitory coefficient for the distant options A and B was set to s_AB = .001. These parameters are similar to those used in Roe et al. (2001). We then examined the predictions of the model for a wide range of values of two critical parameters: the lateral inhibitory coefficient for a similar option (e.g., s_AF) in Equation (D3), and the standard deviation of the error term (std of ε) in Equation (D1). The probability of choosing option B from a set containing {A, B, F} is shown in Figure 2 below. As can be seen in this figure, as long as the lateral inhibitory parameter is greater than .001, the model predicts that option A is favored when reference point F is present. The opposite pattern is predicted by the model when the reference point is changed to option E so that the choice set contains {A, B, E}. In this case option B is favored whenever the lateral inhibitory parameter is greater than .001. Thus the model robustly predicts the first reference point effect for a wide range of parameter values.

Figure 2: Decision field theory predictions for the first reference point effect.
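Predictions of this kind can also be approximated by simulation. The Python sketch below is a rough Monte Carlo illustration, not the analytic solution of Roe et al. (2001, Appendix B). The coordinates in `m`, the value s_AF = .05, the coefficient for the pair B and F, the noise standard deviation, and the numbers of steps and trials are all hypothetical; only s = .94, s_AB = .001, and the equal attention probabilities come from the text.

```python
import random
from typing import List, Sequence


def simulate_dft(m: Sequence[Sequence[float]], inhibition: Sequence[Sequence[float]],
                 s: float = 0.94, noise_sd: float = 1.0, steps: int = 50,
                 trials: int = 2000, seed: int = 0) -> List[float]:
    """Estimate choice probabilities as the share of trials in which each
    option has the maximum preference state after a fixed number of steps."""
    rng = random.Random(seed)
    n = len(m)
    wins = [0] * n
    for _trial in range(trials):
        p = [0.0] * n
        for _step in range(steps):
            # D1: attend to one dimension, chosen with equal probability, plus noise.
            dim = 0 if rng.random() < 0.5 else 1
            u = [m[i][dim] + rng.gauss(0.0, noise_sd) for i in range(n)]
            # D2: contrast each option with the mean of the others.
            total = sum(u)
            v = [u[i] - (total - u[i]) / (n - 1) for i in range(n)]
            # D3: self-feedback, new valence, and lateral inhibition.
            p = [
                s * p[i] + v[i] - sum(inhibition[i][j] * p[j] for j in range(n) if j != i)
                for i in range(n)
            ]
        wins[p.index(max(p))] += 1
    return [w / trials for w in wins]


# Hypothetical coordinates for A, B, and a reference option F placed near A.
m = [[1.0, 3.0], [3.0, 1.0], [0.5, 2.5]]
s_AB, s_AF, s_BF = 0.001, 0.05, 0.001  # s_AF and s_BF are assumed values
inhibition = [[0.0, s_AB, s_AF],
              [s_AB, 0.0, s_BF],
              [s_AF, s_BF, 0.0]]

probs = simulate_dft(m, inhibition)  # estimated Pr[A], Pr[B], Pr[F]
```

Sweeping `s_AF` and `noise_sd` over a grid and recording the estimated probability for B would produce a surface of the kind described for Figure 2; the estimates depend on the assumed coordinates and stopping time.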

To apply decision field theory to the second study, consider the reference point S. For this study, we assume that each option is described by three dimensions: the values of the first two dimensions are taken from the positions of the options shown in Figure 1, and the third dimension represents job availability. Jobs A and B both have a positive value on dimension 3 (they are available), whereas jobs S and T both have negative values on dimension 3 (they are no longer available). In particular, option S is assigned a slightly higher value on dimension 2 than A, a slightly lower value on dimension 1 than A, and a large negative value on dimension 3; option T is assigned a slightly higher value on dimension 1 than B, a slightly lower value on dimension 2 than B, and a large negative value on dimension 3. The large negative value on the third dimension prevents the unavailable option from being chosen from the triadic choice set. We assumed an equal probability of attending to each of the three dimensions, and the remaining parameters were the same as those used to generate Figure 2. The choice probabilities predicted by the theory are illustrated in Figure 3 below.

As can be seen in the figure, decision field theory again reproduces the reversal in preference as a function of the reference point. In sum, we find that both reference point effects can be predicted for a wide range of parameter values by decision field theory.

Decision field theory is one example of a connectionist model of decision making. Recently, two other connectionist models have been proposed to account for some of the phenomena reviewed above. Guo and Holyoak (2002) present a connectionist model that was designed to explain the similarity and attraction effects, but at this time, it does not
account for the other effects. Usher and McClelland (2002) proposed an artificial neural network model that shares some assumptions with decision field theory, and this model has the potential to predict most of the effects presented in Table 2.

Comparison of Models

Table 2, shown below, summarizes the ability of each model to account for each effect. At this time, only decision field theory is capable of explaining all of the results.

Model                    Comparability  Similarity  Attraction  Compromise  Reference
                         Effects        Effects     Effects     Effects     Point Effects
Simple Scalability       no             no          yes         no          no
Random Utility           yes            yes         no          yes         no
Elimination by Aspects   yes            yes         no          no          no
Strategy Switching*      yes            yes         no          no          no
Componential Context     no             no          yes         yes         yes
Decision Field Theory    yes            yes         yes         yes         yes

* Switching occurs between a compensatory and a lexicographic strategy.

However, it is possible to modify and extend the other models in Table 2 so that they can overcome their current limitations. For example, one could construct a hybrid simple scalability/random utility model by assuming that the utilities of the simple scalability model are random. As another example, one could construct a more complex strategy switching model by assuming that one switches between a random utility compensatory rule and an elimination-by-aspects heuristic rule. Comparisons among these more complex hybrid versions will require quantitative tests that take into consideration both model accuracy and model complexity (Myung, 2001).

References

Becker, G., DeGroot, M. H., and Marschak, J. (1963). Probabilities of choices among very similar objects: An experiment to decide between two models. Behavioral Science, 8(4), 306-311.
Block, H. D., and Marschak, J. (1960). Random orderings and stochastic theories of response. In I. Olkin, S. Ghurye, W. Hoeffding, W. Madow, and H. Mann (Eds.), Contributions to probability and statistics (pp. 97-132). Stanford: Stanford University Press.
Bock, R. D., and Jones, L. V. (1968). The measurement and prediction of judgment and choice. San Francisco, CA: Holden-Day.
Busemeyer, J. R., and Townsend, J. T. (1993). Decision field theory: A dynamic cognition approach to decision making. Psychological Review, 100, 432-459.
De Soete, G., Feger, H., and Klauer, K. C. (Eds.) (1989). New developments in probabilistic choice modeling. Amsterdam: North Holland.
Edgell, S. E., and Geisler, W. S. (1980). A set-theoretic random utility model of choice behavior. Journal of Mathematical Psychology, 21(3), 265-278.
Fischer, G. W., Jia, J., and Luce, M. F. (2000). Attribute conflict and preference uncertainty: The RandMAU model. Management Science, 46, 669-684.
Gigerenzer, G., and Selten, R. (Eds.) (2001). Bounded rationality: The adaptive toolbox. Cambridge: MIT Press.
Gonzalez-Vallejo, C. (2002). Making tradeoffs: A probabilistic and context-sensitive model of choice behavior. Psychological Review, 109(1), 137-154.

Guo, F. Y., and Holyoak, K. J. (2002). Understanding similarity in choice behavior: Connectionist model. Proceedings of the Cognitive Science Society Meeting.
Hafter (197?)
Harless, D. W., and Camerer, C. (1994). The predictive validity of generalized expected utility theories. Econometrica, 62(6), 1251-1289.
Haykin, S. (1994). Neural networks. New York: Macmillan.
Hauser, J., and Wernerfelt, B. (1990). An evaluation cost model of consideration sets. Journal of Consumer Research, 16, 393-408.
Heath, T. B., and Chatterjee, S. (1991). How entrants affect multiple brands: A dual attraction mechanism. Advances in Consumer Research, 18, 768-771.
Hey, J. D., and Orme, C. (1994). Investigating generalizations of expected utility theories using experimental data. Econometrica, 62(6), 1291-1326.
Huber, J., Payne, J. W., and Puto, C. (1982). Adding asymmetrically dominated alternatives: Violations of regularity and the similarity hypothesis. Journal of Consumer Research, 9(1), 90-98.
Keeney, R. L., and Raiffa, H. (1976). Decisions with multiple objectives: Preference and value tradeoffs. New York: John Wiley & Sons.
Lapersonne, E., Laurent, G., and Le Goff, J. J. (1995). Consideration sets of size one: An empirical investigation of automobile purchases. International Journal of Research in Marketing, 12, 55-66.
Lehmann, D. R., and Pan, Y. (1994). Context effects, new brand entry, and consideration sets. Journal of Marketing Research, 31, 364-374.
Luce, R. D., and Suppes, P. (1965). Preference, utility, and subjective probability. In R. D. Luce, R. R. Bush, and E. Galanter (Eds.), Handbook of mathematical psychology, Vol. 3 (pp. 249-410). New York: Wiley.
Luce, R. D. (1959). Individual choice behavior: A theoretical analysis. New York: Wiley.
McFadden, D. (1981). Econometric models of probabilistic choice. In C. F. Manski and D. McFadden (Eds.), Structural analysis of discrete data with economic applications (pp. 198-272). Cambridge, MA: MIT Press.
Mellers, B. A., and Biagini, K. (1994). Similarity and choice. Psychological Review, 101, 505-518.
Narayana, C. L., and Markin, R. J. (1975). Consumer behavior and product performance: An alternative conceptualization. Journal of Marketing, 39, 1-6.
Nowlis, S. M., and Simonson, I. (2000). Sales promotions and the choice context as competing influences on consumer decision making. Journal of Consumer Psychology, 9(1), 1-16.
Payne, J. W., Bettman, J. R., and Johnson, E. J. (1993). The adaptive decision maker. New York: Cambridge University Press.
Roe, R. M., Busemeyer, J. R., and Townsend, J. T. (2001). Multialternative decision field theory: A dynamic connectionist model of decision making. Psychological Review, 108(2), 370-392.
Simonson, I., and Tversky, A. (1992). Choice in context: Tradeoff contrast and extremeness aversion. Journal of Marketing Research, 29, 281-295.
Simonson, I. (1989). Choice based on reasons: The case of attraction and compromise effects. Journal of Consumer Research, 16, 158-174.

Sjoberg, L. (1977). Choice frequency and similarity. Scandinavian Journal of Psychology, 18, 103-115.
Thurstone, L. L. (1959). The measurement of values. Chicago: University of Chicago Press.
Tversky, A. (1972a). Elimination by aspects: A theory of choice. Psychological Review, 79(4), 281-299.
Tversky, A. (1972b). Choice by elimination. Journal of Mathematical Psychology, 9(4), 341-367.
Tversky, A. (1969). Intransitivity of preferences. Psychological Review, 76, 31-48.
Tversky, A., and Kahneman, D. (1991). Loss aversion in riskless choice: A reference-dependent model. Quarterly Journal of Economics, 106(4), 1039-1061.
Tversky, A., and Sattath, S. (1979). Preference trees. Psychological Review, 86(6), 542-573.
Tversky, A., and Simonson, I. (1993). Context dependent preferences. Management Science, 39, 1179-1189.
Usher, M., and McClelland, J. L. (2002). Decisions, decisions: Loss aversion, information leakage, and inhibition in multi-alternative choice situations. Under review at Psychological Review.