Imitation, Group Selection and Cooperation - Lakehead University

IMITATION, GROUP SELECTION AND COOPERATION∗

Philippe Grégoire
Arthur Robson

Department of Economics
University of Western Ontario
London, Ontario
Canada N6A 5C2

November 19, 2001



Robson acknowledges support from the Social Sciences and Humanities Research Council of Canada and from a Canada Council Killam Research Fellowship. We also thank the referees and editors of this review for their comments.

ABSTRACT

A prior signalling stage is added to the prisoner’s dilemma and the overall population involved is divided into a number of subpopulations. Evolution involves both local and global imitation—so that the process is formally one of “group selection.” A subpopulation that is not signalling and defecting against one and all can be invaded by two “secret handshake” mutants. A subpopulation that is composed entirely of the secret handshake strategy can be invaded by a single “sucker punch” mutant. Nevertheless, if there are at least three subpopulations, the population cooperates always, in the limit as the mutation rate tends to zero.

1 Introduction

In many circumstances, it is compelling that human beings imitate the choices made by others rather than undertake the costly detailed analysis that would permit fully rational choice. Indeed, biologists and anthropologists often consider imitation to be the basis of culture.[1] Within economics, evolutionary games are motivated by the realistic demand that individuals be boundedly rational, in general, and many of the naive adaptive learning rules used can be interpreted as imitation, in particular.[2] Key results of this literature are that such naive behavior may nevertheless yield rational outcomes, such as Nash equilibria. Such results illuminate why imitation might have been favored by biological evolution.

Although imitation may sometimes lead to rational outcomes, this is not inevitable. Neither are “irrational” outcomes always unrealistic. For example, the wide interest game theorists and psychologists have in the prisoner’s dilemma is partly due to the occurrence of “irrational” cooperation in experiments.[3] Imitation in an environment with a local spatial structure can lead to cooperation in the prisoner’s dilemma.[4] The present paper shows that such cooperation can also occur in a model involving imitation and a global process analogous to group selection.

The model adds a prior signalling stage to the one-shot prisoner’s dilemma. Each strategy then specifies whether or not to use the signal and how to react to the presence or absence of the signal in an opponent. The signal has a small cost. This set of strategies includes the “secret handshake,” that signals and cooperates only against fellow signallers, and the “sucker punch” that signals and defects against one and all.[5]

The population is divided into a number of subpopulations. At the start of each period, experimentation occurs, so that each individual randomly selects another strategy with a small probability. Within each period, imitation occurs after each of two rounds of play. In each round, the members of each subpopulation play against the other members. After the first round, one of the strategies with the highest payoff in each subpopulation is imitated by all of its members. The second round serves just to exhibit the payoff obtained by each subpopulation’s strategy against itself. After this second round, one of the strategies with the highest payoff present anywhere is imitated by all players in every subpopulation.

The key result here is as follows. Suppose that all mutation rates are of comparable orders of magnitude and all of these converge to zero. If there are at least three subpopulations, the entire population then cooperates in the prisoner’s dilemma. The secret handshake is, in general, needed for this result; nevertheless, the secret handshake itself may vanish in this same limit.

In the next section, the key result is derived under some simplifying assumptions. Sections 3 to 5 present a formal treatment of the general model. Section 6 discusses other related papers and Section 7 concludes.

[1] Bonner (1980) discusses animal culture in this light. Cavalli-Sforza and Feldman (1981) and Boyd and Richerson (1985) consider human societies in which both such cultural evolution and biological evolution occur. Rogers (1988) presents an example where imitation is not socially advantageous.

[2] Kandori (1997) and Mailath (1998) provide surveys of this literature. Schlag (1999) discusses theoretical rationales for imitative dynamics.

[3] See, for example, Rapoport and Chammah (1965). Note that Kreps, Milgrom, Roberts and Wilson (1982) show that cooperation may be typical, in the repeated prisoner’s dilemma, if there is suitable incomplete information about other players’ payoffs, and there is a large number of repetitions.

[4] Bergstrom and Stark (1993) give an example of this; Eshel, Samuelson and Shaked (1998) present a general model.

[5] Robson (1990, 1992) introduces the “secret handshake” and the “sucker punch.”

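2 Defect vs Secret Handshake and Sucker Punch

                    Player 2
                  C          D
Player 1    C   (r, r)     (s, t)
            D   (t, s)     (p, p)

Figure 1: Prisoner’s dilemma. Each cell lists Player 1’s payoff (the lower-left-corner entry) first, then Player 2’s.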


The prisoner’s dilemma is depicted in Figure 1, where t > r > p > s (lower-left-corner payoffs are Player 1’s). To obtain the intuition for the emergence of cooperation at the expense of defection, consider the following simplified model. Most importantly, the set of strategies is abbreviated, but mutations also arrive in a convenient order.

Take first a single population of N individuals. In each period, there is a round robin tournament, so each player plays every other player once and obtains the resulting average payoff. All players then imitate the existing strategy having the highest payoff. Suppose the population is initially composed just of cooperators, using C, and defectors, using D. If zD is the number of defectors and πi(zD) is the average payoff of strategy i ∈ {C, D}, then πD(zD) > πC(zD) for any zD ≥ 1. Hence the population evolves to all defect if there is at least one defector to start with.[6]

                    Player 2
                  D             SH
Player 1    D   (p, p)        (p, p−δ)
            SH  (p−δ, p)      (r−δ, r−δ)

Figure 2: Defect versus the secret handshake. Each cell lists Player 1’s payoff first, then Player 2’s.

Suppose a signal is available, with a small cost δ > 0. Given an initial population of D, consider the introduction of mutants who play the secret handshake, SH: They send the signal before playing; if an opponent signals also, they cooperate; if an opponent does not signal, they defect.[7] The payoffs from a single encounter are as in Figure 2. If there are zSH secret handshake mutants, their average payoff is

$$\pi_{SH}(z_{SH}) = \frac{p\,(N - z_{SH})}{N-1} + \frac{r\,(z_{SH} - 1)}{N-1} - \delta;$$

whereas that for D is πD(zSH) = p. Hence πSH(zSH) > πD(zSH) for all zSH ≥ 2, given that

$$\delta\,(N-1) < r - p, \qquad \text{(A)}$$

as is assumed throughout. A cooperative outcome is therefore achieved, with at least two secret handshake mutants.[8]

[6] Strategy C is dropped for the rest of this section. Although this may seem innocuous, C is always stochastically stable in the general model when the whole population is divided into two or more subpopulations.
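The invasion condition for the secret handshake can be checked numerically. The sketch below evaluates the average payoff formula for SH and searches for the smallest number of mutants that strictly beats the defectors' payoff p. The function names and the parameter values (N = 10, r = 3, p = 1, and the two values of δ) are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def payoff_sh(z_sh, N, r, p, delta):
    """Average round-robin payoff of an SH player when z_sh of the N players use SH."""
    return Fraction(p * (N - z_sh) + r * (z_sh - 1), N - 1) - delta


def min_invading_mutants(N, r, p, delta):
    """Smallest number of SH mutants whose payoff strictly exceeds p, the payoff of D."""
    for z_sh in range(1, N):  # at least one defector must remain for the comparison
        if payoff_sh(z_sh, N, r, p, delta) > p:
            return z_sh
    return None  # SH never beats D for these parameters


# Hypothetical parameters with delta * (N - 1) < r - p, so assumption (A) holds.
print(min_invading_mutants(N=10, r=3, p=1, delta=Fraction(1, 10)))
# Hypothetical parameters that violate (A).
print(min_invading_mutants(N=10, r=3, p=1, delta=Fraction(1, 4)))
```

Under these assumed values, two mutants suffice when (A) holds, while three are needed once δ(N − 1) exceeds r − p. Exact rational arithmetic is used so that the strict inequality is not affected by rounding.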

                    Player 2
                  SH              SP
Player 1    SH  (r−δ, r−δ)      (s−δ, t−δ)
            SP  (t−δ, s−δ)      (p−δ, p−δ)

Figure 3: The secret handshake versus the sucker punch. Each cell lists Player 1’s payoff first, then Player 2’s.

Suppose now a “sucker punch” mutant, SP, that sends the signal but defects against all opponents, is introduced into a population of secret handshakers. The single-encounter payoffs are as in Figure 3. If zSP is the number of SP mutants, then, essentially as in the case of C versus D, πSP(zSP) > πSH(zSP) for all zSP ≥ 1, so one SP mutant is enough to invade. The SP strategy pays a cost of δ, in contrast to D. Clearly, one D can then invade a population of SP. The signal can now be reused to restart the cycle.[9]

[7] The signal is “secret” only in the sense that a population of D-strategists do not condition on it.

[8] For such imitation to take place, individuals apparently need to know the complete specification of the most successful strategy. For simplicity, suppose this information is obtained by interrogating those playing this strategy. More generally, if both signallers and non-signallers are present, this information could be obtained by observing those who are most successful. Even if there are only signallers, for example, strategies that differ only in their treatment of non-signallers must have tied payoffs. It is not crucial then which of the two possible best strategies is adopted, although allowing for arbitrary choice would complicate the analysis.

D --(2)--> SH --(1)--> SP --(1)--> D

Figure 4: Resistances with a single subpopulation.

Suppose there is a small rate of mutation. At least two mutations are required to leave D for SH, but only one to leave SH for SP or to leave SP for D. The “resistance” from D to SH is then 2, as opposed to 1 from SH to SP and from SP to D, as in Figure 4. It follows that the prevailing state in the limit when mutation rates tend to zero is defect, D.

Suppose now that the population is divided into M ≥ 3 subpopulations. Mutations occur at the beginning of each period, followed by two round robin tournaments among members of the same subpopulation. After the first tournament, all members of each subpopulation imitate the best “local” strategy. After the second, the best “global” strategy is imitated by all individuals. Thus the most successful subpopulation takes over, as in group selection.[10]

Consider a population where all the subpopulations are playing D. If two SH mutations occur in any one subpopulation, so this switches to SH after the first round, then every subpopulation will imitate it after the second round, since SH has a higher payoff against itself than does D. To leave SH, however, all subpopulations must switch to SP after the first round, since SH has a higher payoff against itself than does SP. The resistance from SH to SP is then M > 2. Finally, to leave SP for D still requires only one mutation in any subpopulation. The situation is then as in Figure 5. It follows that SH prevails when the rate of mutation is small.[11]

D --(2)--> SH --(M)--> SP --(1)--> D

Figure 5: Resistances with multiple subpopulations.

The following section shows that the basic conclusion here is robust to allowing all possible pure strategies and all possible mutations.[12] That is: Suppose the mutation rate tends to zero. In this limit, the population defects, when it is not divided into subpopulations; spends some time cooperating and some time defecting, when there are two subpopulations; and cooperates, when divided into three or more subpopulations. However, the generalization in the next section is not completely straightforward. For example, the strategy that does not signal and cooperates against all opponents is always stochastically stable when there are two or more subpopulations, as is the strategy that does not signal, cooperates against non-signallers and defects against signallers. The secret handshake itself may or may not be stochastically stable, even with three or more subpopulations.

[9] In this example, a positive signal cost is needed in order to leave the SP state. However, it will be shown that SP is always vulnerable to a “mirror-image” secret handshake that does not signal, that cooperates against non-signallers but defects against signallers. Two such mutants can invade a population of SP. Hence it is possible to leave SP, even when δ = 0. More discussion of this is in the Appendix.

[10] The present process of group selection involves only indirect interaction via imitation. Such imitation requires a flow of information on payoffs and strategies between the subpopulations, as might be provided by word-of-mouth, for example. This contrasts with Hausken (2000), for example, who considers more direct competition, involving the division of aggregate production between subgroups.

[11] If the matching here were a one-shot random pairing of all players, the group selection mechanism would produce C in the limit as the rate of mutation tends to zero, even when C and D are the only strategies. This is because two C mutants have a positive probability of meeting in a subpopulation of D’s. However, although this probability is independent of the mutation rate, it is small if N is large, so that considering the limit as mutation rates tend to zero may not be appropriate. (A random matching mechanism was studied by Robson and Vega-Redondo, 1996.)

[12] The possibility of multiple best strategies must also be treated.
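The three-state cycles of Figures 4 and 5 are small enough to analyse by brute force. The sketch below enumerates every tree of links pointing toward a given root state and reports the cheapest one, which is the "stochastic potential" used formally in Section 5. It is an illustration only: it assumes that the three links drawn in the figures are the only transitions available, and the function names are hypothetical.

```python
from itertools import product


def _reaches(root, start, successor, max_steps):
    """Follow successor links from start; True if root is hit within max_steps."""
    node = start
    for _ in range(max_steps):
        if node == root:
            return True
        node = successor[node]
    return node == root


def stochastic_potentials(resistance):
    """Minimum total resistance of a tree rooted at each state (brute force)."""
    states = sorted({a for a, _ in resistance} | {b for _, b in resistance})
    potentials = {}
    for root in states:
        others = [s for s in states if s != root]
        best = None
        for targets in product(states, repeat=len(others)):
            links = list(zip(others, targets))
            if any(link not in resistance for link in links):
                continue  # uses a transition that is not available
            successor = dict(links)
            if not all(_reaches(root, s, successor, len(states)) for s in others):
                continue  # some state never reaches the root, so not a tree
            cost = sum(resistance[link] for link in links)
            if best is None or cost < best:
                best = cost
        potentials[root] = best
    return potentials


def simplified_resistances(M):
    """Links of Figure 4 (M = 1) or Figure 5 (M subpopulations)."""
    return {("D", "SH"): 2, ("SH", "SP"): M, ("SP", "D"): 1}


for M in (1, 2, 3):
    print(M, stochastic_potentials(simplified_resistances(M)))
```

In this simplified setting D has the lowest potential for M = 1, D and SH tie for M = 2, and SH is the unique minimizer for M = 3, which matches the three cases described in the text. The enumeration grows exponentially with the number of states, so it is suited only to toy examples of this size.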

3 The General Model

Consider first a description of the prisoner’s dilemma, when this is augmented by an earlier stage at which a costly signal can either be sent or not. The overall strategy set is S = {s1, s2, ..., s8}, as defined in Table 1, and the game is then as in Figure 6, where lower-left-corner payoffs are Player 1’s. The population is composed of M ≥ 1 subpopulations, each with N ≥ 3 members.[13] Mutation occurs at the start of each period, followed by two round-robin tournaments within each subpopulation. After the first, individuals imitate one of the strategies in their own subpopulation with the highest payoff. After the second, one of the strategies anywhere that has the highest payoff against itself is imitated by all players. The timing of events is as in Figure 7. At the beginning of a period, all subpopulations are in the same state s, say. After the first tournament, subpopulation i is homogeneous and in state s′(i), i = 1, ..., M, say.

[13] The case N = 2 requires special treatment, but cooperation when M ≥ 3 still arises.

Strategy    Signal?   Against non-signaller   Against signaller
s1 (C)      no        c                       c
s2          no        c                       d
s3          no        d                       c
s4 (D)      no        d                       d
s5          yes       c                       c
s6          yes       c                       d
s7 (SH)     yes       d                       c
s8 (SP)     yes       d                       d

Table 1: Strategies.

After the second tournament, imitation of the most successful subpopulation, subpopulation 1, say, brings the entire population to the same state, s′(1).

In detail: At the beginning of a period, all players use the same strategy. Each player then deviates from that strategy with a small probability ε. Given that a player deviates, she chooses strategy i with probability λi > 0, i = 1, ..., 8, with $\sum_{i=1}^{8} \lambda_i = 1$. The overall probability of deviating to si from any other strategy is then λiε, i = 1, ..., 8. (A player may experiment but then revert to her original strategy.)

Let z = (z1, z2, ..., z8) describe the number of players using each strategy, immediately after mutation. Suppose the payoff to an si-strategist against an sj-strategist in the first tournament is aij, so the average payoff of an si-strategist is then

$$\pi_i(z) = \frac{1}{N-1}\left[(z_i - 1)\,a_{ii} + \sum_{j \neq i} z_j\,a_{ij}\right].$$
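The single-encounter payoffs aij follow mechanically from Table 1 and the prisoner's dilemma payoffs, and the average payoff is then a weighted sum. The sketch below builds both. The dictionary layout, the function names and the numerical values in the example are illustrative assumptions; absent strategies are given −∞, which is the convention the text adopts for imitation.

```python
from fractions import Fraction

# Table 1: strategy -> (signals?, action vs non-signaller, action vs signaller)
TABLE_1 = {
    "s1": (False, "c", "c"),
    "s2": (False, "c", "d"),
    "s3": (False, "d", "c"),
    "s4": (False, "d", "d"),
    "s5": (True, "c", "c"),
    "s6": (True, "c", "d"),
    "s7": (True, "d", "c"),
    "s8": (True, "d", "d"),
}


def action(me, opponent):
    """Action chosen by `me`, conditioned on whether `opponent` signals."""
    _, vs_non_signaller, vs_signaller = TABLE_1[me]
    return vs_signaller if TABLE_1[opponent][0] else vs_non_signaller


def single_payoff(me, opponent, t, r, p, s, delta):
    """a_ij: payoff to `me` from one encounter, net of the signal cost if `me` signals."""
    base = {("c", "c"): r, ("c", "d"): s, ("d", "c"): t, ("d", "d"): p}
    gross = base[(action(me, opponent), action(opponent, me))]
    return gross - (delta if TABLE_1[me][0] else 0)


def average_payoffs(z, t, r, p, s, delta):
    """pi_i(z) for each strategy in z; strategies with z_i = 0 get -inf."""
    N = sum(z.values())
    result = {}
    for i, z_i in z.items():
        if z_i == 0:
            result[i] = float("-inf")
            continue
        total = (z_i - 1) * single_payoff(i, i, t, r, p, s, delta)
        total += sum(z_j * single_payoff(i, j, t, r, p, s, delta)
                     for j, z_j in z.items() if j != i)
        result[i] = Fraction(total) / (N - 1)
    return result


# Hypothetical example: two SH mutants (s7) among eight defectors (s4).
print(average_payoffs({"s4": 8, "s7": 2}, t=5, r=3, p=1, s=0, delta=Fraction(1, 10)))
```

With these assumed payoffs the two s7 mutants earn more than the s4 incumbents, in line with the resistance of 2 from s4 to s7.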

Set πi (z) = −∞ when zi = 0, so that no absent strategy is imitated. Imitation is now according to a version of the best reply dynamic from Kandori, Mailath and Rob (1993), as follows. Let Π(z) = {k : πk (z) = maxj πj (z) } and let Ψ(z) = {k ∈ Π(z) : zk ≥ zj , ∀j ∈ Π(z)}. Define b(z) = (b1 (z), ..., b8 (z)) as the distribution of strategies after the first tournament, given the distribution z before. The first-round best-reply dynamics are then:

$$b(z) = N e_i \quad \text{with probability } \frac{1}{\#(\Psi(z))}, \text{ for all } i \in \Psi(z),$$

where ei is the ith unit vector and #(Ψ(z)) is the size of Ψ(z). That is: If there is a unique best strategy, all individuals immediately choose it. If there are several best strategies, but one of these has more adherents than any other of these, all individuals choose this one. If there are several best strategies that have the most adherents in this sense, one of these strategies is chosen at random by all individuals. This ensures each subpopulation remains monomorphic after the first tournament.[14]

Table 2 presents the minimal number of mutations for a subpopulation initially in any state to switch to any other state after the first tournament. These minimal mutation numbers are resistances, as in Young (1998). Only some of these resistances are needed exactly; for the rest, lower bounds are sufficient. The resistance from s7 to s2, that is defined to be A, helps determine whether the secret handshake is likely in the long-run. The Appendix shows that A ∈ {2, ..., N}, that A → ∞ as N → ∞ and presents notes on the other entries.[15] If M = 1, Table 2 completes the description of the Markov chain.

[14] These tie-breaking assumptions rule out stable mixtures of strategies with equal payoffs, such as those of s1 and s2. These rules serve to reduce the number of absorbing states, but do not seem to affect whether cooperation or defection would be observed in the end.

                                Final state
Initial state   s1    s2    s3    s4    s5    s6    s7    s8
s1              0     ≥2    1     1     ≥1    ≥1    1     1
s2              ≥2    0     1     1     ≥1    ≥1    ≥1    ≥1
s3              ≥1    ≥1    0     ≥1    1     1     1     1
s4              ≥2    ≥2    ≥2    0     ≥2    ≥2    2     ≥2
s5              1     1     1     1     0     1     ≥1    1
s6              1     1     1     1     ≥1    0     ≥1    ≥1
s7              ≥A    A     ≥1    ≥1    ≥2    1     0     1
s8              ≥1    1     ≥1    1     ≥1    ≥1    ≥1    0

Table 2: Resistances in the first tournament; overall resistances with one subpopulation.
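The first-round imitation rule b(z), with its two levels of tie-breaking, can be sketched as follows. The code takes already-computed average payoffs and head counts, forms the set of best-paying present strategies, narrows it to those with the most adherents, and then draws one uniformly at random. The function names and the example numbers are hypothetical.

```python
import random


def best_reply_candidates(payoffs, counts):
    """Best-paying present strategies that also have the most adherents among the best."""
    present = {k: v for k, v in payoffs.items() if counts.get(k, 0) > 0}
    top = max(present.values())
    best = [k for k, v in present.items() if v == top]   # highest payoff
    most = max(counts[k] for k in best)
    return sorted(k for k in best if counts[k] == most)  # most adherents among those


def first_round_imitation(payoffs, counts, rng=random):
    """Every member of the subpopulation adopts one candidate, drawn uniformly."""
    chosen = rng.choice(best_reply_candidates(payoffs, counts))
    N = sum(counts.values())
    return {k: (N if k == chosen else 0) for k in counts}


# Hypothetical example: a lone s2 mutant ties with four s1 incumbents on payoff.
print(first_round_imitation({"s1": 3, "s2": 3}, {"s1": 4, "s2": 1}))
```

In the example the tie on payoffs is resolved in favour of s1 because it has more adherents, which is why a single s2 mutant cannot invade a population of s1 when N ≥ 3.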

If M ≥ 2, a second tournament occurs in each period. The version of the best reply dynamic that operates after this is analogous to that after the first tournament as follows. Let m = (m1, m2, ..., m8) describe the number of subpopulations in states 1, 2, ..., 8, respectively, immediately before the second tournament. This tournament simply generates the payoff of each strategy against itself, $\hat{\pi}_i(m)$, say, for i = 1, 2, ..., 8, from Figure 6. Set $\hat{\pi}_i(m) = -\infty$ if mi = 0, so that no absent strategy can be imitated. Define $\hat{\Pi}(m) = \{k : \hat{\pi}_k(m) = \max_j \hat{\pi}_j(m)\}$ and $\hat{\Psi}(m) = \{k \in \hat{\Pi}(m) : m_k \ge m_j, \ \forall j \in \hat{\Pi}(m)\}$. Let $\hat{b}(m) = (\hat{b}_1(m), ..., \hat{b}_8(m))$ be the distribution of subpopulations over states after the second tournament when the distribution was originally m. The second-stage best-reply dynamic is then:

$$\hat{b}(m) = M e_i \quad \text{with probability } \frac{1}{\#(\hat{\Psi}(m))}, \text{ for all } i \in \hat{\Psi}(m),$$

where ei is the ith unit vector and $\#(\hat{\Psi}(m))$ is the size of $\hat{\Psi}(m)$. This ensures that the entire population is, once again, monomorphic. The overall resistances between states for the entire population are as in Table 3. Again, only some of the exact resistances are needed, and lower bounds suffice for the remainder. Notes on these entries are given in the Appendix.[16][17]

[15] Transitions can be facilitated by the presence of third types. For example, a mutant of type s2 will take over a population of s1’s in the presence of a mutant of type s5, since the payoffs to s1, s2 and s5 are then r, ((N − 2)r + t)/(N − 1), and ((N − 2)r + s)/(N − 1) − δ, respectively. The resistance from s1 to s2 is then exactly 2. Tighter bounds can be found for many of the resistances here.
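The second tournament only compares the payoff each strategy earns against itself. A minimal sketch of that comparison is given below: it derives the own-payoff of each strategy from Table 1 and then selects the state that all subpopulations would imitate. The names `own_payoff` and `second_round_winners` and the numerical values are assumptions made for illustration.

```python
from fractions import Fraction

# Table 1: strategy -> (signals?, action vs non-signaller, action vs signaller)
TABLE_1 = {
    "s1": (False, "c", "c"), "s2": (False, "c", "d"),
    "s3": (False, "d", "c"), "s4": (False, "d", "d"),
    "s5": (True, "c", "c"), "s6": (True, "c", "d"),
    "s7": (True, "d", "c"), "s8": (True, "d", "d"),
}


def own_payoff(name, r, p, delta):
    """Payoff of a strategy against itself, net of the signal cost."""
    signals, vs_non_signaller, vs_signaller = TABLE_1[name]
    cooperates = (vs_signaller if signals else vs_non_signaller) == "c"
    return (r if cooperates else p) - (delta if signals else 0)


def second_round_winners(m, r, p, delta):
    """States with the highest own-payoff and, among those, the most subpopulations."""
    present = [k for k, n in m.items() if n > 0]
    top = max(own_payoff(k, r, p, delta) for k in present)
    best = [k for k in present if own_payoff(k, r, p, delta) == top]
    most = max(m[k] for k in best)
    return sorted(k for k in best if m[k] == most)


delta = Fraction(1, 10)  # hypothetical signal cost
# One SH subpopulation among two defecting ones takes over.
print(second_round_winners({"s4": 2, "s7": 1}, r=3, p=1, delta=delta))
# One SP subpopulation among two SH ones does not.
print(second_round_winners({"s7": 2, "s8": 1}, r=3, p=1, delta=delta))
```

The strategies that cooperate against themselves under this rule are s1, s2, s5 and s7, with own-payoff r or r − δ; the other four earn p or p − δ. This ordering is what makes a single converted subpopulation enough for a move toward a higher own-payoff, while a move toward a lower one requires all M subpopulations to switch.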

4 Cooperation vs Defection and the Speed of Convergence

If there are three or more subpopulations, Table 3 has immediate implications favoring cooperation rather than defection. Suppose the four states that cooperate against themselves are grouped together, as are the four that defect against themselves. That is: Define C = {s1, s2, s5, s7} as the set of cooperating states and D = {s3, s4, s6, s8} as the set of defecting states. The above general model describes a regular Markov chain, P^ε, say, with state space S = C ∪ D, where P^ε(s, s′) is the probability of a transition from s ∈ S to s′ ∈ S. Table 3 implies that, if ε is sufficiently small, then: (i) For all (c, d) ∈ C × D, P^ε(c, d) ≤ K1 ε^M, where K1 > 0, and: (ii) For all d ∈ D, there exists c ∈ C such that P^ε(d, c) ≥ K2 ε^2, where K2 > 0.[18] Suppose that µ^ε : S → [0, 1] is the unique stationary distribution of P^ε, so that µ^ε P^ε = µ^ε.

                                Final state
Initial state   s1    s2    s3    s4    s5    s6    s7    s8
s1              0     ≥M    M     M     ≥M    ≥M    M     M
s2              ≥M    0     M     M     ≥M    ≥M    ≥M    ≥M
s3              ≥1    ≥1    0     ≥1    1     ≥1    1     ≥1
s4              ≥2    ≥2    ≥2    0     ≥2    ≥2    2     ≥2
s5              1     1     ≥M    ≥M    0     ≥M    ≥1    ≥M
s6              1     1     1     1     ≥1    0     ≥1    ≥1
s7              ≥A    A     ≥M    ≥M    ≥M    M     0     M
s8              ≥1    1     ≥1    1     ≥1    ≥1    ≥1    0

Table 3: Overall resistances with multiple subpopulations.

[16] The present model involves instantaneous adjustment dynamics after each tournament. However, placing any positive upper bound on the number of imitating subpopulations in the second round would not affect the overall conclusions here. This is for the same reason that the results Kandori, Mailath and Rob (1993) obtained with the “best reply dynamic” carry over to a more general case. Further, the results here also hold under the more general dynamics sketched as follows. Instead of all individuals, or all subpopulations, imitating with probability one, suppose the number of individuals, or subpopulations, that imitate is randomly drawn after each tournament. In the second phase, suppose further that imitation by an entire subpopulation is of a strategy with the highest payoff within a subpopulation with the highest average payoff over all subpopulations. Assume that s + t < 2r and that ties are dealt with in an analogous fashion to before. Suppose the probability that all individuals (or all subpopulations) imitate is strictly positive. The only stationary states of the unperturbed process remain the eight pure strategy states and, furthermore, the present resistances remain valid.

[17] All but one of the resistances needed for the analysis involve the presence of only two strategies. Thus replacing the rule “imitate the best strategy” by “imitate any better strategy” would only matter for the transition from s7 to s2, where s5 is needed. Altogether, then, Table 2 and Table 3 remain valid as long as there is a strictly positive probability that all the s5 mutants imitate s2 in this transitional population leading from s7 to s2.

[18] Strategy s4 requires two mutations to be transformed to s7; this is the secret handshake. Other defecting states are even more vulnerable to cooperating mutants. For example, s8 can be invaded by a single s2 mutant. Although s2 simply defects against s8, it avoids the cost of the signal, δ > 0. Even if δ = 0, s2 would function as a secret handshake against s8 and so only two s2 mutants could invade an s8 population. Indeed a transition from defection to cooperation never requires more than two mutants, even if δ = 0. Thus cooperation will still be favored over defection when δ = 0, as long as M ≥ 3.

It follows that[19]

$$4K_1\varepsilon^{M} \sum_{c \in C} \mu^{\varepsilon}(c) \;\ge\; \sum_{c \in C}\sum_{d \in D} \mu^{\varepsilon}(c)\,P^{\varepsilon}(c,d) \;=\; \sum_{c \in C}\sum_{d \in D} \mu^{\varepsilon}(d)\,P^{\varepsilon}(d,c) \;\ge\; K_2\varepsilon^{2} \sum_{d \in D} \mu^{\varepsilon}(d).$$

Since M ≥ 3, it follows that C will be observed with high probability in the long-run, if ε is small. Furthermore, since the probability of transition from D to C is no smaller than a term of order ε^2, the expected time for the first transition from D to C is no larger than a term of order ε^(−2), regardless of M ≥ 3 and N ≥ 3. In this sense, convergence to cooperation is rapid.

This approach lumps cooperating and defecting states together. A more detailed analysis is needed to determine which particular cooperating states are stochastically stable, even if M ≥ 3. The next section provides this, also considering the cases M = 1 and M = 2.
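The step from the chain of inequalities to the conclusion can be spelled out. Comparing its first and last terms, dividing by K2 ε^2, and using the fact that the stationary probabilities of the cooperating states sum to at most one gives, as a sketch in the notation already introduced,

$$\sum_{d \in D} \mu^{\varepsilon}(d) \;\le\; \frac{4K_1}{K_2}\,\varepsilon^{M-2} \sum_{c \in C} \mu^{\varepsilon}(c) \;\le\; \frac{4K_1}{K_2}\,\varepsilon^{M-2}.$$

The right-hand side tends to zero as ε → 0 exactly when M − 2 > 0, that is, when M ≥ 3. For M = 2 the bound does not shrink with ε, which is consistent with defection surviving in the two-subpopulation case treated in the next section.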

5 Stochastically Stable Distributions

Consider the Markov process P^ε with transition resistances as in Table 3. For every state s ∈ S, a directed graph with a unique path from any state s′ ≠ s to s defines an s-tree, Ts, say. The resistance of this tree, R(Ts), say, is the sum of the resistances of its links. In this model, the recurrent classes of the unperturbed Markov process P^0 are all the singleton states. The “stochastic potential” of each such state s is then the minimum resistance of all the s-trees. A state s is “stochastically stable” iff lim_{ε→0} µ^ε(s) > 0, where µ^ε(·) is the unique stationary distribution of P^ε. Stochastically stable states are then accurate long run predictions, when mutation rates are small. Young (1998, for example) shows that the stochastically stable states are precisely those with minimum stochastic potential.

[19] Since µ^ε is stationary, $\sum_{c, c' \in C} \mu^{\varepsilon}(c) P^{\varepsilon}(c, c') + \sum_{d \in D,\, c \in C} \mu^{\varepsilon}(d) P^{\varepsilon}(d, c) = \sum_{c \in C} \mu^{\varepsilon}(c)$. The equality then follows since $\sum_{c' \in C} P^{\varepsilon}(c, c') = 1 - \sum_{d \in D} P^{\varepsilon}(c, d)$, for all c ∈ C.

5.1 One Subpopulation

Consider the graph in Figure 8, where each link drawn attains the minimum resistance over all links originating at the same state, from Table 2. In general, an s-tree built only from such links must be a minimal s-tree. In the present case, such a minimal s-tree can be built from the graph, for each s, by removing the link originating at s. Thus the stochastic potential of si, i ≠ 4, is 8 and the stochastic potential of s4 is 7, so that:

Theorem 5.1 Consider the system described by the Markov process P^ε, where M = 1. The only stochastically stable state is then s4.

Defection is the predicted long run outcome when the mutation rate is small. Group selection is needed here to obtain cooperation in the prisoner’s dilemma.

5.2 Two Subpopulations

The graph in Figure 9 is again composed of links that have minimal resistance over all links with the same initial state, now from Table 3. Again, a minimal resistance s-tree can be built from this graph for each s by removing the link from s. The stochastic potential of s1, s2, s4 or s7 is then 10, but the stochastic potential of every other state is 11. Thus:

Theorem 5.2 Consider the system described by the Markov process P^ε, where M = 2. The stochastically stable states are then precisely s1, s2, s4 and s7.

Although s1, s2, and s7 involve cooperating, s4 involves defecting, so two subpopulations are still not enough to eliminate non-cooperative states.

State   Links                 Resistance
s1      (s1, s4), (s1, s8)    both M
s2      (s2, s3)              M
s3      (s3, s5)              1
s4      (s4, s7)              2
s5      (s5, s1)              1
s6      (s6, s1), (s6, s2)    both 1
s7      (s7, s2), (s7, s6)    A and M, resp.
s8      (s8, s4)              1

Table 4: Selected possible minimal resistance links for M ≥ 3.

5.3 Three or More Subpopulations

Table 4 presents one or two links from each initial state that are candidates to have minimal resistance over all such links, from Table 3.

(i) Consider first A < M and Figure 10 that then depicts a graph built from minimal resistance links. A minimal resistance s1- or s2-tree can be obtained from this graph by removing the link originating at s1 or s2, respectively. Although removing the link originating from other si’s need not produce an si-tree in this graph, any si-tree must have at least the resistance obtained by summing the minimal resistances out of all states other than si. Hence, if A < M,

$$\min_{T_s} R(T_s) \;\begin{cases} = M + A + 6, & s = s_1, s_2, \\ > M + A + 6, & s \neq s_1, s_2. \end{cases}$$

(ii) Consider now the case A ≥ M and Figure 11 that depicts a graph also using minimal resistance links from Table 4. A minimal resistance s-tree can now be obtained by removing the link from each s, for each s. Hence, if A ≥ M,

$$\min_{T_s} R(T_s) \;\begin{cases} = 2M + 6, & s = s_1, s_2, s_7, \\ > 2M + 6, & s \neq s_1, s_2, s_7. \end{cases}$$

Altogether, then:

Theorem 5.3 Consider the system described by the Markov process P^ε, for M ≥ 3. (i) If A < M, there are precisely two stochastically stable states: s1 and s2.[20] (ii) If A ≥ M, there are precisely three stochastically stable states: s1, s2, and s7.[21]

That is, regardless of the value of A, three or more subpopulations ensure that the only stochastically stable states involve cooperation. It is then cooperation rather than defection that is likely in the long run when the mutation rate is small.

[20] Since A ≤ N, N < M ensures A < M.

[21] Choosing N large enough ensures A ≥ M.
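The arithmetic behind Theorems 5.1 to 5.3 can be reproduced with a few lines of Python. The sketch below uses the counting argument from the text: any s-tree costs at least the sum of the cheapest link out of every state other than s. The per-state minima are read off Tables 2 and 3 and written as functions of M and A; that transcription, and the function names, are assumptions of this sketch. The code computes lower bounds only and does not check that a tree attaining them exists, which the text establishes separately through Figures 8 to 11.

```python
def min_out_resistances(M, A):
    """Cheapest link leaving each state (Table 2 when M = 1, Table 3 otherwise)."""
    return {
        "s1": M, "s2": M, "s3": 1, "s4": 2,
        "s5": 1, "s6": 1, "s7": min(A, M), "s8": 1,
    }


def potential_lower_bounds(M, A):
    """Lower bound on the stochastic potential of each state."""
    out = min_out_resistances(M, A)
    total = sum(out.values())
    return {state: total - cost for state, cost in out.items()}


def minimizing_states(M, A):
    """States whose lower bound is smallest."""
    bounds = potential_lower_bounds(M, A)
    lowest = min(bounds.values())
    return sorted(state for state, bound in bounds.items() if bound == lowest)


# (M, A) pairs chosen for illustration; A must satisfy A >= 2.
for M, A in [(1, 2), (2, 2), (3, 2), (3, 5)]:
    print(M, A, minimizing_states(M, A), min(potential_lower_bounds(M, A).values()))
```

For M = 1 the only minimizer is s4 with bound 7; for M = 2 the minimizers are s1, s2, s4 and s7 with bound 10; for M = 3 they are s1 and s2 when A < M (bound M + A + 6) and s1, s2 and s7 when A ≥ M (bound 2M + 6). These agree with the values stated in the three theorems.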

To evaluate these results, it is useful to first ask: What would group selection achieve in the unmodified prisoner’s dilemma? It can be shown that the answer depends on the relationship between the size of each subpopulation and the number of subpopulations. That is, cooperation obtains when M > N, whereas defection prevails when N > M. The modifications of the game, the secret handshake, in particular, are then necessary for cooperation when N > M ≥ 3. Further, this case where the number in each subpopulation is larger than the number of subpopulations is at least as plausible as the alternative. The result for N > M ≥ 3 and M > A is especially interesting in that the secret handshake, although it is necessary to obtain cooperation, is not stochastically stable itself. With a small mutation rate, the secret handshake would then be rare in the long run; as indeed would any occurrence of the signal.

6 Other Related Work

The imitation of a best strategy in the second round here is analogous to group selection. Group selection is often dismissed by biologists.[22] The argument is not that a model of group selection is logically impossible but that it requires implausible values of the parameters. Note that this criticism of group selection lacks much force here since the evolution involved is only metaphorical: selection involves imitation rather than differential reproductive success. The mechanism here of imitation across subpopulations represents cultural competition, the existence of which is plausible for human beings.

Other work in economics that applies a formal process of group selection is due to Vega-Redondo (1993) and Canals and Vega-Redondo (1998). These papers show that a Pareto-efficient equilibrium can be selected, in a coordination game, when learning occurs at two levels, within groups and across groups. The present result is stronger in that a Pareto-efficient outcome is selected, even when this is not an equilibrium.[23]

Hausken (2000) essentially shows that competition among groups may induce individuals to rationally cooperate in the prisoner’s dilemma. The addition of signalling strategies here, however, allows cooperative outcomes over a much wider range of parameter values. In particular, it allows cooperation when the number of individuals in each group exceeds the number of groups.

Robson (1990) showed that a secret handshake might temporarily lead to cooperation in the one-shot prisoner’s dilemma. Wiseman and Yilankaya (1999) considered this more carefully, showing cooperation would occur a positive fraction of the time in a model that allowed the secret handshake, the sucker punch and defect. The present paper strengthens this conclusion to obtain cooperation always, developing further a sketch of a group selection mechanism from Robson (1992).

[22] For example, Williams (1966) describes it as an “unnecessary distraction.” Dawkins (1982) argues that the only replicator is the gene. A recent exception is Sober and Wilson (1998), who argue that biological group selection lies behind altruism.

[23] Indeed, this is true even relative to the augmented game as in Figure 6, since the unique equilibrium in this game remains (s4, s4), that is, unconditional defection.

7 Conclusion

In this paper, we considered a modified prisoner’s dilemma, with signalling and imitation based on the Kandori, Mailath and Rob best reply dynamic. Although a secret handshake strategy exists, the only stochastically stable state remains defect when the population is not divided into subpopulations. Suppose, however, the population is divided into subpopulations and imitation across subpopulations of the best strategy also occurs. If there are at least three subpopulations, the stochastically stable states now all involve cooperation.

8

Appendix

8.1

Notes on Table 2

Recall that assumption (A) holds here. It is then straightforward to verify that a single mutant of the final type obtains a higher payoff than the residual population of the initial type, for all the entries in Table 2 that are exactly 1. All claims that a resistance is greater than or equal to 1 are trivial. Notes on the remaining entries are as follows.

(i) Initial state s1. A single mutant of type s2, which obtains the same payoff as the s1’s, cannot invade given N ≥ 3.

(ii) Initial state s2. Similarly, a single s1 cannot invade.

(iii) Initial state s4. Since s4 pays no signalling cost and defects against all opponents, it cannot be invaded by any single mutant.

(iv) Initial state s7.
(a) If A ≤ N is the resistance from s7 to s2, then A ≥ 2 since a single s2 cannot invade.24 In addition, given any fixed upper bound on the total number of mutants, the payoff to the s7’s can be made arbitrarily close to the own-payoff, r − δ, by choosing N large enough. On the other hand, the payoff to the s2’s can be made arbitrarily close to the payoff each s2 obtains against each s7, p. Hence A → ∞ as N → ∞.
(b) The resistance to s1 is at least A since any pattern of mutants that induces a transition to s1 must induce a transition to s2 instead if all the s1’s are replaced by s2’s.
(c) A single s5, with the same payoff as the s7’s, cannot invade since N ≥ 3.
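The limiting argument in (iv)(a) can be illustrated numerically. The sketch below is not taken from the paper. It assumes that each individual's payoff is its average over a round-robin against the other N − 1 members of the subpopulation, that all k mutants are of type s2, that an s7 earns p − δ against an s2 and an s2 earns r against another s2, and it uses made-up values for r, p and δ. The function name `average_payoffs` is hypothetical. Under these assumptions, with k held fixed, the s7 payoff approaches r − δ and the s2 payoff approaches p as N grows.

```python
def average_payoffs(N, k, r=3.0, p=1.0, delta=0.1):
    """Average round-robin payoffs with N - k incumbents (s7) and k mutants (s2).

    Assumed pairwise payoffs (illustrative, not read from Figure 6):
      s7 vs s7: r - delta    s7 vs s2: p - delta
      s2 vs s2: r            s2 vs s7: p
    Returns (payoff to each s7, payoff to each s2).
    """
    if not (1 <= k < N):
        raise ValueError("need 1 <= k < N")
    # Each s7 meets N - k - 1 other s7's and k s2's.
    s7 = ((N - k - 1) * (r - delta) + k * (p - delta)) / (N - 1)
    # Each s2 meets k - 1 other s2's and N - k s7's.
    s2 = ((k - 1) * r + (N - k) * p) / (N - 1)
    return s7, s2


if __name__ == "__main__":
    # Hold the number of mutants fixed at k = 2 and let N grow.
    for N in (5, 50, 500, 5000):
        s7, s2 = average_payoffs(N, k=2)
        print(f"N={N:5d}  s7 payoff={s7:.4f}  s2 payoff={s2:.4f}")
```

Because the resistance A is a minimum over all mutant patterns, including mixed ones such as the s2 and s5 pair in footnote 24, this pure-s2 calculation only illustrates the mechanism and does not compute A itself.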

8.2 Notes on Table 3

Given that the transitions within each subpopulation have resistances as in Table 2, the remaining issue is: How many subpopulations must switch for the whole population to make the transition? The payoffs that each strategy obtains against itself, or “own-payoffs,” are relevant here. All “≥ 1” entries in Table 3 are again trivial. It is also obvious that no entry in Table 3 can be less than the corresponding entry in Table 2. When the final state has a higher own-payoff than does the initial state, it suffices for a single subpopulation to make the transition and the entry in Table 3 replicates that in Table 2. If the resistance in Table 2 is exactly 1 and the final state has a lower own-payoff than the initial state, all subpopulations must mutate and the resistance in Table 3 is exactly M. Indeed, whenever the transition is to a final state with a lower own-payoff, the resistance must be at least M. The only remaining entries can be dealt with as follows.

24 With one s2 mutant and one s5, the payoff to s2 will be maximal if t is large enough. The lower bound 2 for A is therefore tight. On the other hand, if N is large enough, A < N.

(i) Initial state s1. One way to switch to s2 is for all M subpopulations to mutate, involving at least M mutations. All the other possibilities involve α > 0 subpopulations switching to s2, β ≥ 0 switching to some mixture of other states, and M − α − β > 0 remaining s1. For s2 to invade with the same own-payoff, it is necessary that α ≥ M − α − β. The number of mutations needed is then at least 2α + β ≥ M.

(ii) Initial state s2. A similar argument to (i) applies for the transition to s1.

(iii) Initial state s7. A similar argument also applies for the transition to s5.
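The counting step in (i) can be checked by brute force. The sketch below is an illustration only: it takes the bound 2α + β from the text as given, enumerates every admissible pair (α, β), and confirms that the smallest bound equals M for a range of M. The function name `min_mixed_mutations` is hypothetical.

```python
def min_mixed_mutations(M):
    """Smallest value of 2*alpha + beta over the admissible mixed patterns.

    alpha > 0 subpopulations switch to s2, beta >= 0 switch to other states,
    and M - alpha - beta > 0 remain s1, subject to alpha >= M - alpha - beta.
    Returns None when no admissible pattern exists (for example M = 1).
    """
    costs = [
        2 * alpha + beta
        for alpha in range(1, M)           # alpha > 0, leaving room for s1
        for beta in range(0, M - alpha)    # keeps M - alpha - beta >= 1
        if alpha >= M - alpha - beta       # s2 subpopulations at least match s1 ones
    ]
    return min(costs) if costs else None


if __name__ == "__main__":
    for M in range(2, 8):
        print(M, min_mixed_mutations(M))
```

The constraint α ≥ M − α − β rearranges directly to 2α + β ≥ M, so the enumeration simply confirms that the bound M is reached, for instance at α = 1 and β = M − 2.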

8.3 Costless Signalling

When δ = 0, resistances from non-signalling states to signalling ones may be smaller than those in Tables 2 and 3, and resistances from signalling states to non-signalling ones may be larger. In Table 2, the relevant minimal resistances that depend on δ are as follows.

(i) Initial state s5: More than one s1 mutant and more than one s3 mutant is now needed to invade.

(ii) Initial state s8: Exactly two s2 mutants are now needed to invade. One mutant alone cannot invade s8, so the resistance to any state other than s2 is ≥ 2.

For a single subpopulation, the links in Figure 8 are still the least resistant when δ = 0, but the link (s8, s2) now has resistance 2. That is, with one subpopulation, the stochastic potential of s4 and s8 is 8, as opposed to 9 for all si, i ≠ 4, 8. Therefore, when signalling is costless, and there is one subpopulation, the stochastically stable states are s4 and s8.


In Table 3, the relevant new minimal resistances are as follows, where ⌈x⌉ is the smallest integer greater than or equal to x.

(i) Initial state s1: One s7 mutant in ⌈M/2⌉ subpopulations is now enough to invade, since s7 against itself now yields the same payoff as s1 against itself.

(ii) Initial state s5: Resistance to s1 is now ≥ M, since at least two mutants are needed in at least half of the subpopulations. Resistance to s2 is ⌈M/2⌉, arising from one mutant in at least half of the subpopulations. Note that the resistance to s7 is unaffected but is actually ≥ M.

(iii) Initial state s6: Resistances to s1 and s2 remain 1 and hence minimal.

(iv) Initial state s7: At least two mutations in at least half of the subpopulations are needed to switch to s2, so this resistance is now ≥ M. The resistance to s1 is now also ≥ M.

(v) Initial state s8: Two mutations are now needed to switch to s2. Resistances to states other than s2 are now or were actually already ≥ 2.

Least resistant links are now as in Table 5. Further, it is possible to build an s2-tree, an s7-tree, an s4-tree, and an s8-tree using only these minimal resistance links. Indeed, Figure 12 shows these trees for s2 (left-hand side) and s7 (right-hand side). Removing the link (s4, s7) and adding the link (s2, s4) in the s2-tree gives a minimal resistance s4-tree. Similarly, if the link (s8, s2) is removed and the link (s7, s8) is added in the s7-tree, a minimal resistance s8-tree is obtained.

State   Links                   Resistance
s1      (s1, s7)                ⌈M/2⌉
s2      (s2, s3), (s2, s4)      both M
s3      (s3, s5), (s3, s7)      both 1
s4      (s4, s7)                2
s5      (s5, s2)                ⌈M/2⌉
s6      (s6, s1), (s6, s2)      both 1
s7      (s7, s6), (s7, s8)      both M
s8      (s8, s2)                2

Table 5: Minimal resistance links when δ = 0, M ≥ 2.

Any si-tree must have at least the resistance obtained by summing the minimal resistances out of all states other than si. Therefore, when M ≥ 2,

\[
\min_{T_s} R(T_s)
\begin{cases}
= 2\lceil M/2 \rceil + M + 6, & s = s_2, s_7, \\
= 2\lceil M/2 \rceil + 2M + 4, & s = s_4, s_8, \\
\geq \lceil M/2 \rceil + 2M + 6, & s = s_1, s_3, s_5, s_6.
\end{cases}
\]

Hence, when signalling is costless and there are two subpopulations, the stochastically stable states are s2, s4, s7 and s8. If there are instead at least three subpopulations, the stochastically stable states are s2 and s7.
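The comparison of tree resistances can be reproduced from the Table 5 entries alone. The following sketch is an illustration, with hypothetical function names: for each state s it sums the minimal out-resistances of all other states, which is the lower bound described above, and then lists the states with the smallest bound. According to the text this bound is attained for s2, s7, s4 and s8, so whenever the minimisers lie among those four states they are the stochastically stable states.

```python
def min_out_resistances(M):
    """Minimal resistance out of each state, as in Table 5 (delta = 0, M >= 2)."""
    half = (M + 1) // 2  # ceil(M / 2) for a positive integer M
    return {"s1": half, "s2": M, "s3": 1, "s4": 2,
            "s5": half, "s6": 1, "s7": M, "s8": 2}


def tree_lower_bounds(M):
    """Lower bound on the resistance of any s-tree, for every state s.

    The bound is the sum of the minimal out-resistances of all states
    other than s, i.e. the grand total minus the entry for s itself.
    """
    out = min_out_resistances(M)
    total = sum(out.values())
    return {s: total - r for s, r in out.items()}


def candidate_stable_states(M):
    """States whose lower bound is smallest, in sorted order."""
    bounds = tree_lower_bounds(M)
    best = min(bounds.values())
    return sorted(s for s, b in bounds.items() if b == best)


if __name__ == "__main__":
    for M in (2, 3, 4, 5):
        print(M, tree_lower_bounds(M), candidate_stable_states(M))
```

For M = 2 the bounds for s2, s7, s4 and s8 coincide, while for M ≥ 3 the bound for s2 and s7 is strictly smallest, which matches the two cases stated above.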

References

[1] Bergstrom, T.C. and Stark, O. (1994). “How Altruism Can Prevail Under Natural Selection.” American Economic Review 83, 149-155.

[2] Bonner, J.T. (1980). The Evolution of Culture in Animals. Princeton: Princeton University Press.

[3] Boyd, R. and Richerson, P. (1985). Culture and the Evolutionary Process. Chicago: University of Chicago Press.

[4] Canals, J. and Vega-Redondo, F. (1998). “Multi-Level Evolution in Population Games.” International Journal of Game Theory 27, 21-35.

[5] Cavalli-Sforza, L. and Feldman, M. (1981). Cultural Transmission and Evolution. Cambridge: Cambridge University Press.

[6] Dawkins, R. (1982). “Replicators and Vehicles.” in Current Problems in Sociobiology. Cambridge: Cambridge University Press.

[7] Eshel, I., Samuelson, L. and Shaked, A. (1998). “Altruists, Egoists and Hooligans in a Local Interaction Model.” American Economic Review 88, 157-179.

[8] Hausken, K. (1995). “The Dynamics of Within-Group and Between-Group Interaction.” Journal of Mathematical Economics 24, 655-687.

[9] Hausken, K. (2000). “Cooperation and Between-Group Competition.” Journal of Economic Behavior and Organization 42, 417-425.

[10] Kandori, M. (1997). “Evolutionary Game Theory in Economics.” in Advances in Economics and Econometrics: Theory and Application, Seventh World Congress, Volume I, Kreps, D. and Wallis, K. (eds.). Cambridge: Cambridge University Press.

[11] Kandori, M., Mailath, G. and Rob, R. (1993). “Learning, Mutation, and Long Run Equilibria in Games.” Econometrica 61, 29-56.

[12] Kreps, D.M., Milgrom, P., Roberts, J. and Wilson, R. (1982). “Rational Cooperation in the Finitely Repeated Prisoner’s Dilemma.” Journal of Economic Theory 27, 245-252.

[13] Mailath, G.J. (1998). “Do People Play Nash Equilibrium? Lessons from Evolutionary Game Theory.” Journal of Economic Literature 36, 1347-1374.

[14] Rapoport, A. and Chammah, A.M. (1965). Prisoner’s Dilemma: A Study in Conflict and Cooperation. Ann Arbor: University of Michigan Press.

[15] Robson, A.J. (1990). “Efficiency in Evolutionary Games: Darwin, Nash and the Secret Handshake.” Journal of Theoretical Biology 144, 379-396.

[16] Robson, A.J. (1992). “Evolutionary Game Theory.” in Recent Developments in Game Theory, John Creedy, Jeff Borland and Jürgen Eichberger (eds.). Aldershot, England: Edward Elgar.

[17] Robson, A.J. and Vega-Redondo, F. (1996). “Efficient Equilibrium Selection in Evolutionary Games with Random Matching.” Journal of Economic Theory 70, 65-92.

[18] Rogers, A. (1988). “Does Biology Constrain Culture?” American Anthropologist 90, 819-831.

[19] Schlag, K. (1999). “Justifying Imitation.” University of Bonn.

[20] Sober, E. and Wilson, D.S. (1998). Unto Others: The Evolution and Psychology of Unselfish Behavior. Cambridge, Mass.: Harvard University Press.

[21] Wiseman, T. and Yilankaya, O. (1999). “Cooperation, Secret Handshakes, and Imitation in the Prisoner’s Dilemma.” Northwestern University CMS-EMS Discussion Paper 1248.

[22] Vega-Redondo, F. (1993). “Competition and Culture in an Evolutionary Process of Equilibrium Selection: A Simple Example.” Games and Economic Behavior 5, 618-631.

[23] Williams, G. (1966). Adaptation and Natural Selection. Princeton: Princeton University Press.

[24] Young, H.P. (1998). Individual Strategy and Social Structure. Princeton: Princeton University Press.

Figure 6: The augmented prisoner’s dilemma. [Payoff matrix with Player 1 as rows and Player 2 as columns over the strategies s1 to s8; entries are drawn from r, s, t, p and their counterparts net of δ.]

Figure 7: Timing of events in each period. [Sequence shown for subpopulations 1 to M: mutations; first tournament; imitation within subpopulations, giving states s′(1), s′(2), s′(3), ..., s′(M); second tournament; imitation across subpopulations.]

Figure 8: A key graph with one subpopulation. [Directed graph on the states s1 to s8 with resistance labels on the links.]

Figure 9: A key graph with two subpopulations. [Directed graph on the states s1 to s8 with resistance labels on the links.]

Figure 10: A key graph when M ≥ 3 and A < M. [Directed graph on the states s1 to s8 with resistance labels on the links.]

Figure 11: A key graph when A ≥ M ≥ 3. [Directed graph on the states s1 to s8 with resistance labels on the links.]

Figure 12: Key graphs when δ = 0, M ≥ 2. [Two directed graphs on the states s1 to s8 with resistance labels on the links: an s2-tree (left-hand side) and an s7-tree (right-hand side).]