Adaptive Walks with Noisy Fitness Measurements

May, 1995

Bennett Levitan and Stuart Kauffman
Santa Fe Institute
1399 Hyde Park Road
Santa Fe, NM 87501 USA
phone: (505) 984-8800
fax: (505) 982-0565

To appear in Molecular Diversity

Keywords

adaptive walk / error / fitness / molecular evolution / NK model / noise

SUMMARY

Adaptive walks are an optimization technique for searching a space of possible solutions, for example, a space of different molecules. The goal is to find a point in the space (a molecule) optimal or near-optimal in some property, generally referred to as the “fitness,” such as its ability to bind to a given receptor. Adaptive walking, an analog of natural selection, is a powerful technique to search landscapes. However, errors in the measurements will cause errors in the adaptive walks. Mutant molecules of higher fitness may be ignored or mutants of lower fitness may be accepted. To examine the effect of measurement error on adaptive walks, we simulate single-agent hill-climbing walks on NK landscapes of varying ruggedness where Gaussian noise is added to the fitness values to model measurement error. We consider both constant measurement noise and noise whose variance decays exponentially with fitness. We show that fitness-independent noise can cause walks to “melt” off the peaks in a landscape, wandering in larger regions as the noise increases. However, we also show that a small amount of noise actually helps the walk perform better than with no noise. For walks in which noise decreases exponentially with fitness, the most characteristic behavior is that the walk meanders throughout the landscape until it stumbles across a point of relatively high fitness, then it climbs the landscape towards the nearest peak. Finally, we characterize the balance between selection pressure and noise and show that there are several classes of walk dynamic behavior.

1. INTRODUCTION

All physical measurements are inherently noisy. In many problems for which we seek optimal or near-optimal solutions, measurement noise can manifest in important ways.
For example, when measuring the affinity of an antibody for an antigen, uneven mixing and errors in density or volume measurements will alter the measured affinity. Assays such as equilibrium dialysis and binding to receptor-coated beads are most sensitive over ranges that cannot always be matched in advance to molecules with unknown properties. When performing enzyme-linked immunosorbent assays (ELISA), the sensitivity of the results is influenced by the type of immunoglobulin being detected (IgG, IgA, etc.) [1], the type of buffers used [2] and hapten size [3].

We are interested in how such measurement noise affects adaptive walks, particularly adaptive walks used for molecular evolution. An adaptive walk is an optimization technique in which a search is carried out in a space of possible solutions, for example a sequence space of different molecules [4, 5, 6, 7, 8, 9, 10]. The goal of the search is to find a point in the space (a molecule) optimal or near-optimal in some property, generally referred to as the “fitness,” such as its ability to bind to a given receptor. The basic method of an adaptive walk is to first select one or many molecules and then make small mutations in some or all of the molecules. If the fitness of some mutants exceeds that of the first molecules, the walk moves to some or all of the more fit mutants. This process is repeated until molecules of high fitness are found. There are many variations on this process, such as varying the number of molecules to start with, the strategy for selecting these initial molecules, the number of mutants to produce, and the type of mutations made during the walk; but the basic evolution-like search is common to them all. Adaptive walking is a powerful technique because of its ability to search different regions of a space in parallel, moving a population of molecules closer and closer to regions of high fitness. Adaptive walks have begun to show their value not just in theoretical work but in pharmaceutical design, studying the immune system, and several other domains [4, 6, 7, 9, 10, 11, 12, 13, 14, 15, 16].

However, errors in the measurements will cause errors in the adaptive walks. Mutant molecules of higher fitness may be ignored or mutants of lower fitness may be accepted in the walk, and the end result may be distinctly different from that expected. In this paper, we examine the effect of noisy fitness evaluations on adaptive walks. We simulate simple, single-agent hill-climbing adaptive walks where Gaussian noise is added to the fitness values to model errors in measurements. The walks are in a simple, generic space, an example molecular sequence space. Fitness values of molecules are assigned using the “NK” fitness function, a function capable of generating landscapes of varying degrees of roughness that has proven useful in modeling systems in biology, evolution, economics and molecular evolution [4, 7, 9, 15, 16]. Aside from its utility in these fields, we chose the NK model because of earlier work in the field to which we can compare our results and for its computational simplicity.

Because real-world noise processes are of varying types, we consider two different schedules for the change in noise level with fitness. In one schedule, the noise is constant at all fitness levels.
In the other, noise variance decays exponentially with increasing fitness.

In section 2, we describe in detail the sequence space, fitness function, measurement noise and adaptive walk algorithm for our simulations. We then discuss three types of results in section 3. Subsection 3.1 shows how fitness-independent noise causes a walk to “melt” off the peaks in a landscape, wandering in a larger and larger region as the noise increases. The walk algorithm provides pressure for increased fitness, while the noise pushes walks away from fitness peaks. However, we also show that a small amount of noise actually helps the walk perform better than with no noise. In subsection 3.2, we examine walks in which noise decreases exponentially with increasing fitness. The most characteristic behavior under these circumstances is that the walk meanders throughout the landscape until it stumbles across a point of relatively high fitness, then it climbs the landscape towards the nearest peak. In subsection 3.3, we characterize the balance points between the pressures of the adaptive walk and the noise and show that they result in several classes of walk dynamic behavior.

2. METHODS

To simulate a noisy adaptive walk, we need to specify (i) a space whose points correspond to the items we choose from in the walk, (ii) a neighbor relationship between these points, (iii) a function that assigns fitness values to these points, (iv) the noise that confounds measurements of these fitnesses, and (v) an adaptive walk algorithm. We discuss these below.

2.1. Sequence space, neighbor relationship and fitness function

The space we use below is the N-dimensional Boolean hypercube, also known as an N-dimensional Hamming space. In this space, each point a is specified by a vector of length N whose components, a_1 through a_N, each have a value of either zero or one. Notationally,

\[ a = [a_1 \; a_2 \; \ldots \; a_N], \quad a_i \in \{0, 1\}. \tag{1} \]

In molecular terms, this space corresponds to the sequence space of polymers of length N with two possible monomers at each site a_i. Each point in the space thus has N neighbors which differ at one site, or N "1-mutant neighbors." For example, the point [0 1 1 0] has 1-mutant neighbors [1 1 1 0], [0 0 1 0], [0 1 0 0] and [0 1 1 1]. Though the issues we examine are applicable to problems in many domains, for clarity, in the remainder of this paper we refer to the sequence space and the walks in these molecular terms.

We use the “NK” fitness function to assign fitnesses to points in the space [4]. "Fitness" refers to a property of points in the space, for example, the affinity of an antibody for a given antigen. Many real-world optimization problems lend themselves to more complex spaces and more complex fitness functions; however, we argue below that our results are qualitatively applicable to more general spaces and fitness functions. In the NK model on the Boolean hypercube, N is the length of the polymer (dimensionality of the sequence space) and K is the degree of epistasis, the degree to which the contribution to fitness of a particular monomer is influenced by monomers at other sites. The fitness of a point a, f(a), is a number between zero and one, where one is the highest fitness. f(a) is defined as

\[ f(a) = \frac{1}{N} \sum_{i=1}^{N} w_i(a), \quad w_i(a) \in [0, 1], \tag{2} \]

where w_i(a) is the fitness contribution of the monomer at site i to the fitness of a. w_i(a) is a function of K + 1 variables, the monomer at site i and the monomers at K other sites. The choice of those sites can be defined in any way; for example, they may be the K sites closest to site i in a. Since each a_i has two possible values, w_i(a) has 2^(K+1) different values, one for each combination of monomers for the K + 1 sites. In the NK model, these w_i(a) values are chosen at random uniformly from the 0 to 1 interval. In our work below, the K epistatic sites for a_i are the K/2 sites on either side of site i (or the (K - 1)/2 and (K + 1)/2 sites on either side for K odd). We consider the components of vector a to be arranged in a circle (periodic boundary conditions).

The utility of the NK model is that, by changing K, we can tune the ruggedness of the landscape. Landscape ruggedness is an important property, since the more rugged a landscape, the more difficult it is for an adaptive walk to find points of high fitness. A K of zero means that w_i(a) depends on a_i only. Such a function has only one peak with a very high correlation between the fitnesses of adjacent points in the landscape. The polymer of the peak is given by the set of a_i that give the larger of the two (2^(K+1) = 2) values of w_i for each i. Finding the global maximum is trivial in such a landscape [4]. A K of N - 1 means that w_i(a) depends on the monomers at all sites in a. This function gives a completely random landscape with many peaks whose fitnesses on average are much lower than those for K = 0. There is zero correlation between the fitnesses of adjacent points in K = N - 1 landscapes, and finding points of high fitness is very difficult (such searches are likely NP-hard problems [17]). Increasing K from 0 to N - 1 gives increasingly rugged (multipeaked) landscapes whose peaks are of increasingly lower average fitness.
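The definitions in Eqns. (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the simulations reported here: the function names, the lookup-table layout, the seeding, and the choice to put the extra epistatic site on the right-hand side when K is odd are all assumptions made for the example.

```python
import random


def make_nk_landscape(n, k, seed=0):
    """Return a fitness function f(a) for a random NK landscape (illustrative sketch)."""
    if not 0 <= k <= n - 1:
        raise ValueError("K must satisfy 0 <= K <= N - 1")
    rng = random.Random(seed)
    # One table per site with 2^(K+1) contributions drawn uniformly from [0, 1).
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]
    left = k // 2          # epistatic sites to the left of site i
    right = k - left       # assumption: the extra site goes to the right for odd K
    offsets = range(-left, right + 1)  # includes offset 0, the site itself

    def fitness(a):
        total = 0.0
        for i in range(n):
            index = 0
            for d in offsets:
                # Periodic boundary conditions: sites are arranged in a circle.
                index = (index << 1) | a[(i + d) % n]
            total += tables[i][index]
        return total / n   # Eqn. (2): mean of the site contributions

    return fitness


def one_mutant_neighbors(a):
    """All N points that differ from a at exactly one site."""
    return [a[:i] + [1 - a[i]] + a[i + 1:] for i in range(len(a))]
```

Calling `make_nk_landscape(30, 0)` gives a single-peaked landscape and `make_nk_landscape(30, 29)` a fully random one; the table size grows as 2^(K+1), so large K is only practical for modest N in this sketch.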
An example application of the NK model is the human immune system, for which the best fitting model has approximately N = 112 and K = 40 [4, 6, 7]. In our work below, we use a variety of N and K values.

2.2. Noise in measurements

Errors in physical measurement often result from the accumulated insult of many stochastic processes. For example, in column affinity measurements, errors are due to the diffusion of the ligand, uneven mixing of the ligand, uneven distribution of ligand and other processes. Such compounded noise processes can be modeled by a Gaussian distribution. Thus, we define the measured fitness of point a, g(a), as

\[ g(a) = f(a) + \mathcal{N}(0, \sigma^2), \tag{3} \]

where f(a) is the true fitness as defined above and N(0, σ²) is Gaussian distributed noise of zero mean and variance σ².

In Eqn. (3), the noise variance is the same at all points in the landscape. However, in many physical measurements, the measurement error depends on the true value of the property being measured. An intuitive example occurs when a human weighs an object by hand. The error in estimating the weight increases with the true weight of the object. Many other physiologic sensory processes show measurement errors that increase as the magnitude of the measurement increases [18].

In measurements used for adaptive walks in molecular evolution, the measurement errors may also depend on the magnitude of the measurement. For example, in some methods of measuring affinity, molecules bind to resin-impregnated ligands in numbers proportional to their affinities. Because relatively few of them remain in the resin, molecules with low affinities are more susceptible to thermal agitation, uneven mixing, and anisotropic distribution of the ligand than molecules with high affinities. In these cases, the measurement error is high for low affinities and low for high affinities. In other methods of measuring affinity, and for other definitions of fitness, measurement error likely relates to fitness in more complex ways.

To incorporate these types of measurement errors in our model, we allow the noise variance to vary with true fitness. We consider two types of noise schedules in our work below: noise whose variance is constant with fitness and noise whose variance decreases exponentially with fitness. We set the rate at which the variance changes with a noise dampening parameter k:

\[ \sigma^2(f) = \sigma_0^2 \exp[-k (f - 0.5)]. \tag{4} \]
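The measurement model of Eqns. (3) and (4) is compact enough to write out directly. The sketch below is illustrative only; the function names and the use of Python's standard `random` module are assumptions, not a description of the original simulation code. Setting k = 0 recovers the constant-variance case.

```python
import math
import random


def noise_variance(f, sigma0_sq, k):
    """Eqn. (4): variance equals sigma0_sq at f = 0.5 and decays as f rises (k > 0)."""
    return sigma0_sq * math.exp(-k * (f - 0.5))


def measure(f, variance, rng):
    """Eqn. (3): true fitness plus zero-mean Gaussian noise of the given variance."""
    return f + rng.gauss(0.0, math.sqrt(variance))


if __name__ == "__main__":
    rng = random.Random(0)
    for f in (0.3, 0.5, 0.7):
        var = noise_variance(f, sigma0_sq=0.001, k=10.0)
        print(f, var, measure(f, var, rng))
```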

As k increases, variance increases for fitnesses below 0.5 and decreases for fitnesses above 0.5. The exponential is defined so that the variance equals σ₀² at fitness 0.5, since 0.5 is the mean fitness in the landscape. (We reserve the variable σ² for noise constant with fitness.) We do not explicitly consider noise that increases with fitness. We argue in the Discussion that noise that increases with fitness influences adaptive walks on NK landscapes much in the manner that constant noise does.

2.3. Adaptive walk algorithm

There are many possible algorithms for the adaptive walk. We use the “any-ascent” or “random” walk algorithm [4, 19], because it is straightforward and allows for easy analysis of the noise issues. It is also relevant to some techniques in molecular evolution such as phage display [11, 12, 13]. The algorithm can be summarized as follows:

(1) Randomly select a point a1 in the sequence space.
(2) Measure its fitness g(a1).
(3) Randomly mutate one site on a1 to give point a2.
(4) Measure the mutant’s fitness g(a2).
(5) If ∆g = g(a2) - g(a1) > 0, set a1 to a2, otherwise do nothing.
(6) Repeat steps (2) - (5).

Despite the simplicity of the algorithm, there are a few points worthy of note:

(i) We have chosen to consider mutations of only one site at a time (step 3), so only the N 1-mutant neighbors of a1 are candidates for a2. The algorithm could trivially be adapted to consider mutants at any distance from a1.
(ii) The sampling of neighbors is “with replacement,” meaning that the same 1-mutant neighbor can be chosen several times before other neighbors are considered.
(iii) There is no termination to this walk. In the noiseless case, walks could theoretically be terminated by checking if all 1-mutant neighbors have lower fitness than the present point a1, thereby ensuring that the final a1 is a local peak. With noise, however, there is no means to be certain that a1 is a local peak. Instead, we terminate our walks after a given number of iterations.
(iv) Step (5) ensures that the only mutants accepted are those with higher measured fitness than a1. Iterations that do not result in changing a1 are considered just as “costly” or time-wasting as iterations which do.
(v) With each iteration, g(a1) is remeasured, even if this a1 is the same as that from the previous iteration. Consequently, what seemed like a highly fit point last iteration may seem to be of very low fitness this iteration. This remeasurement reflects the types of constant remeasurement that occur in chemical systems, such as with deconvolution techniques [20, 21] and phage display, and is an important part of the algorithm.

Below, we will occasionally refer to the point a1 presently under consideration by the algorithm as the “agent” whose fitness is being optimized.

3. RESULTS

3.1. Noise variance unrelated to fitness

There are four possible scenarios in an iteration of the walk algorithm: If both ∆g = g(a2) - g(a1) and ∆f = f(a2) - f(a1) are positive, then the walk acts as if switching to a2 increases its fitness and the walk really is increasing in fitness, a true positive [Tab. 1]. If ∆g is positive and ∆f is negative, the walk acts as if switching to a2 increases its fitness but really is decreasing in fitness, a false positive. True and false negatives are defined similarly. The performance of a walk is dictated by the relative rates of these scenarios and the landscape structure.

As shown below, a noisy walk initially progresses toward fitness peaks but at higher fitnesses increasingly makes errors that cause it to perform less and less well. Walking agents end up hovering in a limited range of fitnesses below the values of the peaks whose basins of attraction they are in. (We define the “basin of attraction” of a peak as the set of all points which, under the influence of a noiseless version of the walk, could reach that peak. The same point may be in several basins.) We refer to this noise-induced hovering as “melting off the peaks,” since an agent would end up hovering in a similar region even if it started at a peak in a landscape. In this subsection, we characterize the melting but also show that small amounts of noise can improve the performance of walks. We explain why the walk acts this way when characterizing walk dynamics in subsection 3.3.

Fitness changes in adaptive walks

Figures 1(a) and 1(b) show fitness vs. generation for adaptive walks with different amounts of noise on the same N = 30, K = 0 landscape. Because K = 0, the landscape has only one, easy-to-find peak; and any delay or difficulty finding the peak is due to the noise, not the landscape structure.
A generation is defined as one iteration of the walk algorithm, whether or not a new mutant is accepted. Figure 1(a) shows a walk with a small amount of noise (σ² = 0.00005) starting at fitness of 0.383 and walking very close to the peak fitness of 0.583. These particular values reflect the random initial starting point for the walk and the random number generator seed used for the landscape; they do not have any special significance. The solid line shows true fitness, the dotted line shows the measured fitness varying about the true fitness, and the dashed line shows fitness for a noiseless walk on the same landscape starting at the same initial point. As in the noiseless walk, the true fitness initially climbs rapidly and slowly asymptotes towards the peak. After the initial rise, the agent stays near the peak but constantly fluctuates around fitness values somewhat below the peak. Figure 1(b) shows a walk on the same landscape from the same initial point with more noise (σ² = 0.001). Here the agent moves towards mean fitness 0.5 and then wanders about considerably. The larger amount of noise is evident in the larger separation between true and measured fitness values.

Both false positives and false negatives are evident in these noisy adaptive walks. False positives occur whenever there is a decrease in true fitness. Drops in fitness never occur in this walk algorithm when there is no noise. False negatives are evident from the slope of the true fitness curves. The true fitness rises much more slowly in 1(b) than in 1(a) or in the noiseless case. With more noise, the walk is more likely to ignore opportunities to move to points with higher fitness and will climb to higher fitnesses more slowly. Both errors contribute to the melting.

We get a much better sense of the cumulative effect of noise on the walks by plotting the average fitnesses over many runs.
Figures 1(c) and 1(d) show averaged measured and true fitnesses for 2000 different walks: 100 different landscapes with 20 different random starting locations in each. In these figures, the average true and measured fitnesses always start from 0.5, the average fitness in the landscape. Unlike in the plots of single walks, the average measured fitness here is always more positive than the average true fitness. This happens because, in step (5), switching to a given a2 is more likely when the noise in the measurement of a2 is positive. The walk algorithm biases in favor of positively-signed noise.
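To make the walk of section 2.3 and the four scenarios of Tab. 1 concrete, here is a minimal Python sketch of a constant-noise any-ascent walk on a K = 0 landscape. It is an illustration under stated assumptions rather than the code behind the figures: the function names, the two-column contribution table `w`, the seeds, and the convention that ∆f = 0 or ∆g = 0 counts as "not positive" are all choices made for this example.

```python
import math
import random


def k0_fitness(w, a):
    """True fitness f(a) on a K = 0 landscape: mean of per-site contributions."""
    return sum(w[i][a[i]] for i in range(len(a))) / len(a)


def noisy_any_ascent_walk(w, start, sigma_sq, steps, rng):
    """Any-ascent walk with constant-variance Gaussian measurement noise.

    Returns the final point, the true fitness after each iteration, and
    counts of true/false positives and negatives.
    """
    sigma = math.sqrt(sigma_sq)
    a1 = list(start)
    counts = {"true+": 0, "false+": 0, "true-": 0, "false-": 0}
    true_fitness = []
    for _ in range(steps):
        f1 = k0_fitness(w, a1)
        g1 = f1 + rng.gauss(0.0, sigma)   # step 2: a1 is remeasured every iteration
        site = rng.randrange(len(a1))     # step 3: neighbors sampled with replacement
        a2 = a1[:]
        a2[site] = 1 - a2[site]
        f2 = k0_fitness(w, a2)
        g2 = f2 + rng.gauss(0.0, sigma)   # step 4: measure the mutant
        accepted = (g2 - g1) > 0          # step 5 sees measured fitness only
        improved = (f2 - f1) > 0          # hidden from the walk, used for bookkeeping
        if accepted:
            counts["true+" if improved else "false+"] += 1
            a1 = a2
        else:
            counts["false-" if improved else "true-"] += 1
        true_fitness.append(k0_fitness(w, a1))
    return a1, true_fitness, counts


if __name__ == "__main__":
    setup = random.Random(0)
    n = 30
    w = [[setup.random(), setup.random()] for _ in range(n)]  # K = 0 table
    start = [setup.randrange(2) for _ in range(n)]
    for sigma_sq in (0.0, 0.00005, 0.001):
        _, trace, counts = noisy_any_ascent_walk(w, start, sigma_sq, 400, random.Random(1))
        print(sigma_sq, round(trace[-1], 3), counts)
```

With `sigma_sq = 0.0` the walk can never record a false positive or false negative, so the true fitness trace never decreases; raising `sigma_sq` lets both kinds of error appear in the counts.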

The walks in Fig. 1(c) have a small amount of noise and reach fitnesses close to the peak of 0.667 expected for K = 0 landscapes [7]. The walks in Fig. 1(d) have a large amount of noise and only reach fitnesses slightly above 0.5. The higher noise plot also shows a much larger difference between true and measured fitness.

The effects of varying noise levels on landscapes of varying ruggedness are shown in Fig. 2(a). In this graph, “terminal fitness” refers to the average true fitness after the walk’s initial transient; in these cases the average is over generations 300 to 400. With no noise, the walk progresses to a peak. In the limit of infinite noise, the walk always moves towards the average landscape fitness of 0.5, since ∆g and ∆f are independent and the walk is completely random. Intermediate amounts of noise cause the walk to progress towards fitnesses between these extremes.

A surprising observation results from focusing on the very low variance region in Fig. 2(a), shown in Fig. 2(b). For multipeaked landscapes (K > 0), a very small amount of noise actually improves performance. The walk algorithm can get trapped in local fitness peaks, many of which are at fitnesses far from that of the global peak. The noise lessens the probability that an agent will be trapped in local peaks during the course of the walk and frees the agent to move under the influence of the p(true positive) and p(true negative) curves. Note that, in an infinite time walk, the agent is never trapped in any local fitness peak. With infinite time, the noise always has a finite probability of causing the multiple, consecutive errors needed to compel an agent to leave a peak. However, in a finite time walk, the agent may remain in a low-fitness local peak when the walk terminates. Walks on K = 0 landscapes are not significantly helped by noise, as there are no local peaks to get trapped in on single-peak landscapes.
As K increases, for a given amount of noise, the terminal fitness drops because the fitnesses of peaks in NK landscapes drop. Increasing K also leads to an increase in the range of noise variances over which the terminal fitness is higher than that for noiseless walks. This trend is caused by the higher density of local peaks in higher K landscapes. With more local peaks, there is more to be gained by having noise to push an agent out of local peaks.

Locations of agents in adaptive walks

So far, we have examined how the fitnesses found in adaptive walks change with noise; but we still do not know where the agents are. To track the location of the agents, we generated the types of plots shown in Fig. 3. At the start of a walk and at every 25 steps, we “planted a flagpole” at the location of the adapting agent. At every subsequent step, we stored the Hamming distance (number of sites in which they differ) between the present point and each of the previous flagpoles. This gives a series of curves which demonstrate how far the agent has walked from previous points. The four plots in this figure are all for the same landscape and starting point; they differ only in noise level.

A flagpole plot for a walk with no noise on an N = 30, K = 0 landscape is shown in 3(a). The generation 0 flagpole curve initially rises rapidly as the agent quickly finds 1-mutant neighbors with higher fitness, then rises more slowly as more fit neighbors become more scarce. Eventually, at around generation 195, the agent finds the single peak and the curve remains flat. The generation 25 flagpole curve follows a similar course but flattens more rapidly since it started well after the agent began walking. Subsequent flagpole curves show less and less of the initial rise and, after generation 195, show no movement at all. The interpretation of these curves is that the agent has first rapidly and then slowly moved across the space as it finds its way to the peak and stays there.
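The flagpole bookkeeping described above can be sketched as follows. This is a hedged illustration: the function names and the representation of a walk as a list of visited points (`path[t]` is the agent's location at generation t) are assumptions for the example, and each curve here also records the distance 0 at the generation its flagpole is planted.

```python
def hamming(a, b):
    """Number of sites at which two equal-length points differ."""
    return sum(x != y for x, y in zip(a, b))


def flagpole_curves(path, interval=25):
    """Map each flagpole generation to its Hamming-distance curve.

    A flagpole is planted at generation 0 and every `interval` generations;
    every curve is extended at each later generation with the distance from
    the agent's present point to that flagpole.
    """
    curves = {}
    for t, point in enumerate(path):
        if t % interval == 0:
            curves[t] = []                       # plant a new flagpole here
        for planted_at, curve in curves.items():
            curve.append(hamming(path[planted_at], point))
    return curves
```

For a path that flips one new site per generation, such as `[[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]` with `interval=2`, the generation 0 curve is `[0, 1, 2, 3]` and the generation 2 curve is `[0, 1]`.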
A flagpole plot for a very large amount of noise is shown in Fig. 3(b). Here the curves from every generation rapidly rise and remain wandering about an average Hamming distance of 15. Since 15 is half the Hamming distance across an N = 30 space, the agent is on average halfway across the space from all its previous locations. These curves show that the agent meanders about the sequence space with little regard for the fitness values and has completely melted off the single peak.

Flagpole plots for intermediate cases are shown in Figs. 3(c) and 3(d). In Fig. 3(c), the early generation curves first rise rapidly and after the initial transient meander within a range of distances. The generation 0 curve stays above distance 15, showing that the walk does not ignore fitness values. The later generation curves all wander within a Hamming distance of 7. The interpretation of this plot is that the agent moves towards the peak under the influence of the adaptive walk, but is simultaneously pushed away from the peak by the constant noise. Stated differently, the noise encourages the walk to wander over the entire space, but the adaptive walk keeps it trapped within a region of Hamming width 7. Figure 3(d) shows the same types of curves but with greater melting of the agent from the peak in a region of Hamming width 9. In general, with more noise, (i) the further from the peak the agent wanders, (ii) the larger the region in which the agent wanders, and (iii) the closer the average Hamming distance from all previous flagpoles moves towards N/2. It is this wandering further away from the peak that constitutes the increased melting with increasing noise observed above.

We have shown only the K = 0 flagpole curves since they are much easier to interpret and use for demonstrative purposes. Flagpole plots for K > 0 landscapes are qualitatively very similar to those of K = 0 landscapes with two exceptions: (i) Since the K > 0 landscape peaks are of lower fitness than the K = 0 peaks, the Hamming distance curves plateau much more rapidly and at lower fitnesses. (ii) With multiple peaks, the agent will initially remain melted off a peak but within the basin of that peak, only to eventually move to another basin and remain melted off its peak. The agent can stay arbitrarily long in one basin, but will always have finite probability to move to the other basins. The greater the noise, the larger the area within the basin of a peak the agent wanders in and the greater the probability that it will move into a different basin of attraction.

3.2. Noise variance decreasing with fitness

While some real world systems may have the same level of noise at all fitness levels, others will have noise which changes with fitness.
One example is equilibrium dialysis, which can have different schedules by which measurement noise changes with fitness depending on how well the assay is matched to the expected properties of the molecules being examined. We consider here the exponential drop in noise variance with fitness from Eqn. (4).

When the noise variance drops with fitness, we expect a radically different type of behavior than we have seen so far. As a simple example, consider the case where σ² is 0.001 below fitness 0.6 and zero above 0.6. As long as the agent remains at fitnesses below 0.6, the walk should wander about much as described above, such as in Fig. 1(b). However, should the agent stumble across a point of high fitness (through what might be an essentially random move), the noise will suddenly be gone and the agent will climb to the peak of one of the basins in which that point lies. Similar behavior will occur with monotonically decreasing noise variance, though the change from a noisy walk to a noiseless walk may be more gradual. Thus, the typical walk we expect to see will start with an agent wandering about the landscape and not showing strong evidence of hill climbing and will later exhibit either a gradual or sudden transition to hill climbing.

We observed three types of walks. The first is a gradual walk towards a peak, occasionally with steps towards lower fitness, but eventually ending locked onto the peak. Figure 4(a) shows such a case. The difference between measured and true fitness declines with each step towards higher fitness, eventually becoming so small that the walk remains at the local peak with very high probability. This behavior occurs provided the signal-to-noise ratio (SNR, the average difference in true fitness between neighbors divided by the noise variance σ²) increases rapidly with fitness. The second type of walk is shown in Fig. 4(b). This walk is on the same landscape and starts at the same point as in Fig. 4(a) but has ten times the noise variance.
Instead of climbing towards a peak in discrete steps, the walk wanders about in points with fitnesses around and above 0.5, then suddenly stumbles across a point of high fitness at about step 280. Once the agent reaches a point of high fitness, the noise variance drops considerably and the agent climbs towards a peak and stays there with high probability, much as in the first type of walk. This type of behavior also requires the signal-to-noise ratio to increase rapidly with fitness. The third type of walk is when the agent moves constantly from basin of attraction to basin of attraction and never successfully hill climbs in any. This behavior occurs when the SNR does not increase or increases only slowly with fitness and is similar to that shown in Fig. 1(b).

Averaged measured and true fitness for 2,000 walks with fitness-dependent noise are shown in Figs. 4(c) and 4(d). As with constant noise, the averaged measured fitness exceeds the average true fitness. Figure 4(c) shows walks of the first type, and Fig. 4(d) shows walks of the third type. Despite the highly-varying average measured fitness, the average true fitness curve plateaus to a fitness characteristic for the σ₀² and k chosen.

The average effects of σ₀² and dampening parameter k on terminal fitness for walks on multipeaked landscapes (N = 30, K = 15) are shown in Fig. 5(a). As with constant noise, terminal fitness is increased for 0 < σ₀² < 0.0015 and reduced for all larger noise variances [Fig. 5(b)]. The effect of changing the noise dampening also varies with σ₀²: increasing k increases the terminal fitness for σ₀² > 0.0015 and decreases terminal fitness for lower σ₀². Increasing k dampens the deleterious effects of noise, so walks will in general do better with higher k. But for those walks where noise lessens the probability of getting trapped in a local peak and leads to higher terminal fitness, reducing the noise hinders the walk.

3.3. Noisy walk dynamics

In the sections above, we discussed the conflicting forces acting on an agent in a noisy adaptive walk: selection pressure moves the agent towards peaks, and noise moves it away from peaks. In this subsection, we ask whether we can estimate in advance the fitnesses at which these forces balance. Can we account for the walk’s dynamics? We found that there are three classes of behavior: (i) The agents always walk to a peak. (ii) The agents ascend or descend towards a particular fitness value (attractor) and remain meandering at fitness values in its vicinity. (iii) Early in the walk, the agents cluster about two different attractors, then eventually all migrate towards one of these attractors.

We begin by examining the probabilities of the four cases in Tab. 1. In our adaptive walk, the only iterations in which movements take place are those in which ∆g > 0.
Thus, to examine the dynamics of the adaptive walk, we need to focus on the probabilities of true and false positives. From Tab. 1,

$$p(\text{true positive}) = p(\Delta f > 0 \mid \Delta g > 0, f) \qquad (5)$$

and

$$p(\text{false positive}) = 1 - p(\text{true positive}) = p(\Delta f \le 0 \mid \Delta g > 0, f), \qquad (6)$$

where f refers to the fitness of a1 in the walk algorithm. Note that in more general types of adaptive walks, such as those with a “temperature” that gives a non-zero probability of accepting a mutant even if ∆g < 0 [22], the true negative and false negative probabilities would also be important in walk dynamics. The need to define (5) and (6) conditionally on f is apparent from applying Bayes’ rule to (5), which gives

$$p(\Delta f > 0 \mid \Delta g > 0, f) = \frac{p(\Delta g > 0 \mid \Delta f > 0, f)\, p(\Delta f > 0 \mid f)}{p(\Delta g > 0 \mid \Delta f > 0, f)\, p(\Delta f > 0 \mid f) + p(\Delta g > 0 \mid \Delta f \le 0, f)\, p(\Delta f \le 0 \mid f)}. \qquad (7)$$

There are two phenomena responsible for changes in these probabilities as a function of fitness:

(i) As fitness increases, fewer and fewer one-mutant neighbors of a point have higher fitness than that of the point [Fig. 6(a)] [4, 17]. This change decreases p(∆f > 0 | f) for increasing f and will decrease p(true positive). At the fitness of the global peak, p(∆f > 0 | f) = 0 and p(true +) = 0. Similarly, at the global minimum, p(∆f > 0 | f) = 1 and p(true +) = 1. As K increases and the landscape becomes more random, the 1-mutant neighbors of high fitness points will be less correlated with the points’ fitnesses and move more towards lower fitnesses [Fig. 6(b)], accentuating the effects of high fitness on p(true +) even more.

(ii) As fitness increases in K > 0 landscapes, the average difference in true fitness between neighbors decreases. This decreases the signal-to-noise ratio and decreases p(true positive).

The effects of these phenomena on the four cases in Tab. 1 are listed in Tab. 2. Both phenomena lead to decreasing p(true +) and increasing p(false +) with increasing fitness. In noiseless walks on NK landscapes, the rate of improvement drops with increasing fitness because of these same two phenomena: more fit neighbors are harder to find at higher fitness and the average fitness changes decrease with increasing fitness.
In noisy walks, there is


the additional influence of the changes in p(true +) and p(false +). To study the effects of these changes, we define the ratio of (5) and (6) as

$$P(f) = \frac{p(\Delta f > 0 \mid \Delta g > 0, f)}{p(\Delta f \le 0 \mid \Delta g > 0, f)} \qquad (8)$$

When P(f) > 1, moves in the adaptive walk are more likely to be correct than incorrect and to move the agent to higher fitnesses. When P(f) < 1, moves are more likely to move the agent to lower fitnesses. With this ratio we can characterize some of the walk’s dynamics.

Ideally, P(f) could be calculated or estimated for a given landscape or set of landscapes. For this paper, we measured these curves by iterating through all 2^N points in a given landscape and through each of the N 1-mutant neighbors of each point. For each point and neighbor, we repeatedly added noise to the true fitness of the point and its neighbor and considered which of the four cases listed in Tab. 1 would occur in the adaptive walk if the point were a1 and its neighbor were a2. We totaled the number of each case and calculated p(true positive) and p(false positive) as a function of fitness.

Behaviors

Analyzing the p(true positive) and p(false positive) results, we find that there are broadly three classes of behavior.

Behavior class I occurs when there is no noise. In this class, whatever its initial location in the space, an adapting agent moves unfailingly to a peak and remains at that peak. The peak reached is one of the several peaks whose basins of attraction the agent is in initially. With no noise, p(true positive) = 1 and p(false positive) = 0 for all fitnesses. The attractors of these walks are the peaks in the landscape. There are many examples of this type of behavior in the literature [4, 5, 17].

Behavior class II occurs with noisy fitness measures. In this class, whatever their initial locations in the space, adapting agents move on average towards a particular fitness value fA and meander in its vicinity. fA is the global attractor for the walk.
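As a rough illustration of the measurement procedure described above, the following Python sketch enumerates every point and 1-mutant neighbor of a small NK landscape, repeatedly adds Gaussian noise to both true fitnesses, and tallies true and false positives by fitness bin. The landscape construction (circularly adjacent epistatic neighbors, uniform random contributions), the function names, the bin count, and the number of noise samples are assumptions made for this example, not details taken from the simulations reported here.

```python
import random


def make_nk_fitness(n, k, rng):
    """Build a random NK fitness function over n-bit integers.

    Assumption: site i interacts with the k sites that follow it (circularly),
    and each contribution is drawn uniformly from [0, 1).
    """
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]

    def fitness(point):
        total = 0.0
        for i in range(n):
            index = 0
            for j in range(k + 1):
                bit = (point >> ((i + j) % n)) & 1
                index = (index << 1) | bit
            total += tables[i][index]
        return total / n

    return fitness


def true_positive_curve(n, k, sigma2, bins=10, samples=20, seed=0):
    """Estimate p(true positive) per fitness bin.

    Bins in which no move would ever be accepted are reported as None.
    """
    rng = random.Random(seed)
    fitness = make_nk_fitness(n, k, rng)
    f = [fitness(point) for point in range(2 ** n)]
    sd = sigma2 ** 0.5
    true_pos = [0] * bins
    false_pos = [0] * bins
    for point in range(2 ** n):
        b = min(int(f[point] * bins), bins - 1)  # bin by true fitness of a1
        for site in range(n):
            neighbor = point ^ (1 << site)  # flip one bit: a 1-mutant neighbor
            df = f[neighbor] - f[point]
            for _ in range(samples):
                g_point = f[point] + rng.gauss(0.0, sd)
                g_neighbor = f[neighbor] + rng.gauss(0.0, sd)
                if g_neighbor - g_point > 0:  # measured change > 0: the walk moves
                    if df > 0:
                        true_pos[b] += 1
                    else:
                        false_pos[b] += 1
    curve = []
    for b in range(bins):
        moves = true_pos[b] + false_pos[b]
        curve.append(true_pos[b] / moves if moves else None)
    return curve
```

Because p(false positive) = 1 - p(true positive), the two curves intersect where this estimate crosses 0.5, which gives a rough location for fA on the sampled landscape.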
Once near fitness fA, the agents will meander both above and below it, occasionally moving from within one peak’s basin of attraction to another. The range of wandering and degree of basin switching depend on the noise variance and the landscape structure.

The significance of fA can be seen from examining p(true positive) and p(false positive) as a function of fitness for various amounts of noise. From our discussion of Eqns. (5) and (6), it is clear that when there is any noise in fitness measurements, p(true positive) and p(false positive) must intersect at least once over the 0 - 1 fitness range. For noise constant with fitness (k = 0), there is generally only one intersection, whose location depends on the noise variance [Fig. 7]. This intersection is fA. At low noise levels, the intersection of the probability curves occurs at the highest fitnesses in NK landscapes (around fitness 0.7 - 0.8). As the noise variance increases, the intersection moves to lower fitnesses. In the limit as σ₀² goes to infinity, the intersection reaches fitness 0.5, because the probability of ∆f being positive or negative becomes independent of ∆g.

fA is an attractor because P(f) > 1 for f < fA and P(f) < 1 for f > fA. On average, the walk will move to higher fitnesses for f < fA and towards lower fitnesses for f > fA. At f = fA, the walk is equally likely to be correct or incorrect when making a step. This is the dynamical rationale behind the “melting off the peaks” phenomenon discussed in Subsection 3.1. The migration from peaks to the attractor can also be regarded as a manifestation of the “error catastrophe,” in which the increasing error rate overcomes selection’s power to keep adapting agents near a fitness peak [4]. It is important to note that fA is not the exact location of the attractor, since we are not considering the sizes of fitness jumps from either side of fA.
Nor is fA a very strong attractor, as generally the noise will constantly move agents away from the attractor. But it does characterize approximately how far from optimal fitness the noise keeps the walk on average. The larger σ², the more the agents are dispersed about fA. We noted several examples of this behavior in Figs. 1, 2, and 3.

To demonstrate that walks truly progress towards fA, we simultaneously walked several hundred agents, initially distributed uniformly in the 0 - 1 fitness range, and examined the


distribution of their fitnesses well after the initial transient. Walks were on the same N = 15, K = 4 landscape with k = 0 and differed in σ². Starting with a uniform distribution of fitnesses makes it much easier to observe how the population moves towards fitness attractors. We divided the 0 - 1 fitness range into 30 equal bins. To populate a bin, we randomly selected agent initial locations from the points in the landscape which belonged to that bin. If there were insufficient distinct points to fill a bin, we included duplicates of the few points available. Bins to which no points in the landscape corresponded were left empty. On the particular landscape for our walks, 19 of the 30 fitness bins were initially occupied, each with 20 agents, giving a total of 380 agents walking simultaneously.

Fitness histograms for the agents at generation 400 for σ² = 0, 0.0005, 0.005, and 0.1 are shown in Fig. 8. Arrows show the location of attractor fA for each noisy walk. The bin starting at fitness 0.8 is always empty, as no points in this particular landscape belong in this bin. With no noise, the distribution of the agents reflects basins of attraction for the peaks in the landscape [Fig. 8(a)]. Agents gather at local peaks in proportion to the sizes of the basins for the peaks, which also explains the asymmetry in the distribution. With a very small amount of noise, the distribution shifts to higher fitnesses [Fig. 8(b)]. The shift reflects the increase in terminal fitnesses allowed by slightly noisy fitness measurements (Subsection 3.1). The agents are clustered around fA, though at this low noise level the histogram still reflects the landscape peak structure more than the noise-induced global attractor. With a moderate amount of noise, the distribution shifts to significantly lower fitnesses [Fig. 8(c)]. The histogram is centered about fA but is wider than in Fig. 8(b), reflecting the wider range over which the agents wander with the larger noise variance.
The histogram is symmetric and no longer strongly reflects the underlying landscape structure. Finally, with a large amount of noise, the distribution is centered about a fitness slightly above 0.5, as expected from fA. With this much noise, the agents are essentially performing a random walk, and their distribution is approaching that of fitnesses for all points in the landscape.

As Fig. 8 shows, the agents in walks do cluster about the fA predicted from (5) and (6). These and other simulations we have performed indicate that, despite not taking into account the sizes of the fitness jumps, fA still serves well as an attractor for average behavior. However, fA is not a strong attractor. At low noise levels, the landscape’s peak structure has greater influence on the agents than fA, and the agents will cluster tightly about the fitnesses of the peaks. At high noise levels, the agents will be widely dispersed about fA, though they will have fitness fA on average.

Behavior class III occurs when there is more than one intersection of the p(true positive) and p(false positive) curves. This class occurs only with noisy fitness measures in which the noise decreases with increasing fitness. The p(true positive) and p(false positive) curves for exponentially-decreasing noise are of the forms shown in Fig. 9. At low fitness (far left in the plots), the curves are similar to those in Fig. 7 for high constant noise variance. As fitness increases, the noise variance decreases, and the curves are of the form of those for intermediate constant noise variance. At high fitness (far right in the plots), the curves may become similar to those for the noiseless case. Depending on the noise level at fitness 0.5 (σ₀²) and the rate at which the noise drops (k), the probability curves will intersect differently. If the noise is never large and decays rapidly with fitness (k high, σ₀² not large), the curves will be of the form shown in Fig. 9(a).
If the noise is large at low fitnesses but does not decay rapidly with fitness (k low, σ₀² large), the curves will cross once [Fig. 9(b)]. However, if the noise is large at lower fitnesses but decays rapidly with fitness (k high, σ₀² large), the curves cross three times as in Fig. 9(c).

The walk dynamics depend on which form the p(true positive) and p(false positive) curves take. The curves in Fig. 9(a) give walks whose behavior is similar to that of constant noise class II walks with low variance. The walks move to a high fitness global attractor, and the agents are not widely dispersed. They differ from constant noise class II walks in that they take longer to reach the attractor. The delay is due to the relatively high probability of false positives and false negatives that occur at the humps in the curves compared to constant noise class II curves. However, once an agent has reached fitnesses somewhat beyond the hump, it will climb towards peaks much more accurately and rapidly than constant noise class II walks (“grabbing hold of the


landscape”). The curves in Fig. 9(b) also yield class II behavior, but with a wide dispersion of agents.

The curves in Fig. 9(c) yield class III behavior. In this class, there are two attractors fA1 and fA2 with a repellor fR between them. The fitness of the first intersection is fA1. The second intersection indicates the repellor. Here, P(f) < 1 for fA1 < f < fR and P(f) > 1 for fR < f < fA2. The result is average movement away from fR to either higher or lower fitnesses. The third intersection is fA2 and is another attractor. In Fig. 9(c), fA1 = 0.5, fR = 0.69, and fA2 = 0.76.

The most intriguing of the three classes of behavior is class III, as it implies that on average a walk will localize in either one of two fitness attractors. Figure 10 shows the distribution of fitness values for a class III walk starting from a uniform distribution. The landscape is the same as that in Figs. 8 and 9. Even at the first generation, the distribution shows hints of two attractors [Fig. 10(a)]. By generation 8, there is clear evidence of a repellor around fitness 0.7 and attractors around fitnesses 0.5 and 0.74 - 0.78, matching the attractor and repellor locations from Fig. 9(c) [Fig. 10(b)]. However, by generation 50, agents are leaking out of the upper attractor into the lower attractor [Fig. 10(c)]. Finally, at generation 400, almost all evidence of the upper attractor is gone [Fig. 10(d)]. We see that the rate at which agents move from fA2 to fA1 exceeds the rate from fA1 to fA2, and that just as the attractors are weak, so is the repellor.

Depending on the noise variance and how it changes with fitness, there can be many different shapes to the p(true positive) and p(false positive) curves. However, we can make two general statements:

(i) Because of the constraints described for Eqn. (7), every plot with non-zero measurement noise must have an odd number of intersections. Thus every repellor will have attractors on both sides.
(ii) Monotonically decreasing noise will almost always give curves of one of the forms shown in Fig. 9.

There are somewhat artifactual exceptions to having only one or three intersections of the p(true positive) and p(false positive) curves. They occur when the landscape is such that a fitness bin contains only fitness peaks or only fitness minima, which can happen when there are few points in the landscape (small N) or when the bins are small. When there are only peaks in a bin, p(∆f > 0 | f) = 0 for fitnesses in that bin, since there are no more fit 1-mutant neighbors. As a result, p(true positive) = 0 [Eqn. (7)]. Similarly, p(∆f > 0 | f) = 1 and p(true positive) = 1 for a bin with only minima. These abrupt changes in p(true positive) will introduce additional intersections and lead to behavior similar to that in behavior class III.

4. DISCUSSION

In this paper, we considered the consequences of measurement error on adaptive walks. We simulated hill-climbing adaptive walks on NK landscapes of varying size and ruggedness with both constant measurement noise and noise whose variance decayed exponentially with fitness. We found that increased noise variance reduces the fitnesses reached by walks. Instead of walking towards a local peak, as in noiseless walks, the agents wandered within a region whose size generally increased and whose average fitness generally decreased with increased noise. However, small amounts of noise improved walk performance on multipeaked landscapes. When noise variance decreased with fitness, there were sudden or gradual transitions from wandering to hill climbing as the agent stumbled upon points of high fitness with lower noise variance.
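The simulation setup summarized above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used for the reported results: the variance form σ₀² exp(-k (f - 0.5)) is an assumed reading of "σ₀² is the noise level at fitness 0.5 and k is the decay rate", each measurement is assumed to use the variance at its own true fitness, and fraction_of_ones is a hypothetical single-peaked stand-in for an NK landscape.

```python
import math
import random


def noise_variance(f, sigma0_sq, k):
    """Noise variance at true fitness f.

    Assumed form: sigma0_sq at fitness 0.5, decaying exponentially at rate k.
    Setting k = 0 gives constant (fitness-independent) noise.
    """
    return sigma0_sq * math.exp(-k * (f - 0.5))


def noisy_walk(fitness, n, generations, sigma0_sq, k=0.0, seed=0):
    """Single-agent hill climb on n-bit points with noisy fitness measurements.

    Returns the agent's true fitness after each generation, where a generation
    is one iteration of the walk whether or not the mutant was accepted.
    """
    rng = random.Random(seed)
    a1 = rng.getrandbits(n)
    history = []
    for _ in range(generations):
        a2 = a1 ^ (1 << rng.randrange(n))  # random 1-mutant neighbor
        f1, f2 = fitness(a1), fitness(a2)
        g1 = f1 + rng.gauss(0.0, math.sqrt(noise_variance(f1, sigma0_sq, k)))
        g2 = f2 + rng.gauss(0.0, math.sqrt(noise_variance(f2, sigma0_sq, k)))
        if g2 - g1 > 0:  # accept whenever the measured change is positive
            a1 = a2
        history.append(fitness(a1))
    return history


def fraction_of_ones(n):
    """Toy single-peaked landscape (hypothetical stand-in, not an NK landscape)."""
    return lambda point: bin(point).count("1") / n
```

For example, noisy_walk(fraction_of_ones(10), 10, 2000, 0.0) climbs monotonically to the single peak, while a call with sigma0_sq = 0.1 and k = 12 lets the true fitness drop whenever a false positive is accepted.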
When examining the dynamics of noisy adaptive walks, we found three classes of behavior:

(i) Agents climb unfailingly to one of the peaks in whose basins they were originally located (noiseless walk).

(ii) Agents move on average towards a particular fitness value (attractor) and wander about that fitness level in a region whose size depends on the noise variance. For noise constant with fitness, the agents will be widely dispersed about the attractor. For noise decreasing with fitness or noise constant with fitness but of a small variance, the agents will be clustered about the attractor.

(iii) Agents move towards either of two different fitness attractors, then eventually all migrate to one of the attractors.

Our results should offer insights into different paradigms for performing sequence space searches for molecular evolution. There are many paradigms and several methods for measuring molecular properties of interest whose precision depends critically on numerous details. The particulars of measurement noise and its effect on a search will vary considerably depending on which searching and measuring techniques are used. Rather than focus on one particular combination, we tried to examine the issue generically, using landscapes of varying degrees of


ruggedness, noise with varying characteristics, and an elementary type of adaptive walk. This generality also allows our work to apply to other fields in which adaptive walks are used, such as economics, business and the study of species evolution. Measurement in all these fields is error-prone and able to influence adaptive walks.

Despite this generality, several particulars of the simulation should be kept in mind:

(i) The two features of the NK landscape responsible for the fitness’s initial rapid rise and gradual reaching of a plateau during noiseless adaptive walks are the decreased probability of finding more fit neighbors at higher fitness and the decreased average difference in fitness at higher fitness. These two factors have additional effects in noisy adaptive walks as discussed in Subsection 3.3. However, while most landscapes will share the first property, many may not share the second.

(ii) We did not explicitly consider cases where the noise variance increased with fitness. The reason for studying such cases would be to examine walks in which the SNR decreases with fitness. However, since the SNR already decreases with increasing fitness in NK landscapes with constant noise, we have already considered such cases. To achieve the same SNR at all fitnesses in NK models would require carefully tuning the noise variance to decline in proportion to the average fitness difference between neighbors. However, on other landscapes, it may be important to explicitly consider noise increasing with fitness.

(iii) Molecular fitness measurements can always be repeated to reduce measurement error to an arbitrary degree. However, the repetition may require an inordinate amount of time or expense and is not always feasible.

(iv) In many molecular evolution paradigms, billions of molecules are simultaneously examined. After many rounds of the search and amplification of fitter molecules, the number of copies of a particular molecule is exponentially related to its fitness relative to that of the other molecules present.
The additional copies of the most fit molecules in the population will reduce the measurement error of those particular molecules and may lead to a greater reduction in noise with fitness during successive rounds of the search.

(v) Our use of measurement error decaying exponentially with fitness is an assumption. Physical measurements may have errors that change with fitness in a different manner. However, our observations do not depend critically on the exponential functional form. We believe our exponential noise results will be qualitatively similar for most noise processes which decrease monotonically with fitness.

An intriguing issue is the similarity between noisy adaptive walks and non-zero temperature walks. In adaptive walks with non-zero temperature, the choice whether to switch to a mutant is stochastic. For example, a mutant might be accepted with probability exp[∆f / (Temp)] for ∆f < 0 and with probability 1 for ∆f > 0. Both noisy and non-zero temperature walks make errors that reduce fitness during the walk; and, with proper adjustment of the error rates, either can perform better than noiseless, zero-temperature walks. However, they differ in that noisy walks have stochastic measurements but deterministic decisions, while non-zero temperature walks have deterministic measurements but stochastic decisions. In simulated annealing, the temperature-induced error rate is reduced in a prescribed manner during the course of the walk so as to give high probability of finding the highest peaks in the landscape [22, 23, 24]. Our work suggests that there can be a similar schedule with noise, dropping the noise variance from an initially high value during the course of the walk so as to give high probabilities of finding the global or near global peak. A potential advantage of such a noise schedule is that it can be implemented experimentally by taking advantage of finite population size effects.
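The contrast between the two kinds of walk can be made concrete by writing out the probability that each accepts a mutant with a given true fitness change ∆f. The sketch below is illustrative only: it assumes constant noise variance with independent Gaussian noise on both measurements, so that ∆g is normally distributed with mean ∆f and variance 2σ², and it uses the standard Metropolis form exp(∆f / T) for ∆f < 0. The function names are hypothetical.

```python
import math


def p_accept_noisy(df, sigma2):
    """Probability that the measured change is positive in a noisy walk.

    Assumes independent Gaussian noise of variance sigma2 on both
    measurements, so the measured change is Normal(df, 2 * sigma2).
    """
    if sigma2 == 0:
        return 1.0 if df > 0 else 0.0
    return 0.5 * (1.0 + math.erf(df / (2.0 * math.sqrt(sigma2))))


def p_accept_thermal(df, temperature):
    """Metropolis-style acceptance for a maximizing walk.

    Improvements are always accepted; a worse mutant is accepted with
    probability exp(df / temperature), which is below 1 because df < 0.
    """
    if df > 0:
        return 1.0
    return math.exp(df / temperature)
```

One difference the formulas make visible: for ∆f > 0 the thermal rule always accepts, whereas the noisy rule accepts with probability strictly below 1, so only the noisy walk can overlook a genuinely fitter mutant.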
When performing selection on evolving populations of molecules, the noise can be reduced by tuning the population to larger sizes over the rounds of selection. It seems much more difficult to endogenously implement in a molecular system the exponential acceptance criterion of simulated annealing. Further work is necessary to compare the two methods in detail. It also seems valuable to consider the possibility of walks with mixed noise and temperature, using a probabilistic decision to optimize whether to accept a mutant as a function of stochastic measurements.

We would like to thank Bill Macready, Per Bro, Pim Stemmer and Doran Lancet for insightful discussions and suggestions.


____________________ This work has been supported by the National Institute of Health (# GM49619), the National Science Foundation (# BIR9218746) and the Department of Energy (# DE-FGO3-93ER61639). Dr. Levitan receives support from a combined Sloan Foundation/National Science Foundation Molecular Evolution postdoctoral fellowship (# 94-4-2ME) and the Santa Fe Institute.


References

[1] Bangsborg, J.M., Shand, G.H., Hansen, K., and Wright, J.B., APMIS, 102 (1994) 501-508.
[2] Craig, W.Y., Poulin, S.E., Nelsin, C.P., and Ritchie, R.F., Clin. Chem., 40 (1994) 882-888.
[3] Danilova, N.P., J. Immun. Methods, 173 (1994) 111-117.
[4] Kauffman, S., The Origins of Order, Oxford University Press, NY, 1993.
[5] Weinberger, E.D., Physical Review A, 44 (1991) 6399-6413.
[6] Macken, C.A., Hagan, P.S., and Perelson, A.S., SIAM J. App. Math., 51 (1991) 799-827.
[7] Kauffman, S.A., Weinberger, E.D., and Perelson, A.S., in Perelson, A.S. (Ed.) Theoretical Immunology, Part I: Santa Fe Institute Studies in the Sciences of Complexity, Addison-Wesley, MA, 1988, pp. 349-382.
[8] Macken, C.A. and Perelson, A.S., Proc. Natl. Acad. Sci., 86 (1989) 6191-6195.
[9] Schuster, P. and Stadler, P., Computers Chem., 18 (1994) 295-324.
[10] Perelson, A.S. and Macken, C.A., Santa Fe Institute, Tech. Rep. 94-11-060, 1994.
[11] Markland, W., in Exploiting Biological Diversity: Small molecule libraries for drug discovery, Cambridge Healthtech Institute, 1995.
[12] Scott, J.K. and Smith, G.P., Science, 249 (1990) 386-390.
[13] Scott, J.K., Trends in Biochem. Sci., 17 (1992) 241-245.
[14] Kenan, D.J., Tsai, D.E., and Keene, J.D., Trends in Biochem. Sci., 19 (1994) 57-64.
[15] Westhoff, F.H., Yarbrough, B.V., and Yarbrough, R.M., Amherst College Department of Economics, Tech. Rep., 1994.
[16] Kauffman, S. and Macready, W., “Technological evolution and adaptive organizations,” Complexity (in press).
[17] Weinberger, E., Biol. Cybernet., 63 (1990) 325-336.
[18] Hochberg, J.E., Perception, second ed., Prentice-Hall, NJ, 1978.
[19] Jones, T.C., “Evolutionary Algorithms, Fitness Landscapes and Search,” Doctoral thesis, Univ. of New Mexico, 1995.
[20] Geysen, H.M., Rodda, S.J., Mason, T.J., Tribbick, G., and Schoofs, P.G., J. Immun. Methods, 102 (1987) 259-274.
[21] Houghton, R.A., Pinilla, C., Blondelle, S.E., Appel, J.R., Dooley, C.T., and Cuervo, J.H., Nature, 354 (1991) 84-86.
[22] Kirkpatrick, S., Gelatt, C.D., Jr., and Vecchi, M.P., Science, 220 (1983) 671-680.
[23] Aarts, E. and Korst, J., Simulated Annealing and Boltzmann Machines, Wiley, NY, 1989.
[24] Azencott, R., Simulated Annealing, Wiley, NY, 1992.


FIGURES

Fig. 1. Measured and true fitness for noisy adaptive walks in N = 30, K = 0 landscapes with constant (k = 0) noise variance. The dotted lines show measured fitness, the solid lines show true fitness. The dashed lines in (a) and (c) show fitness for noiseless walks on the same landscapes. “Generation” refers to an iteration of the walk algorithm, whether or not a new mutant was accepted. (a) One agent adapting on one landscape with σ² = 0.00005. The peak fitness is 0.583. (b) Same walk and landscape as in (a) but with σ² = 0.001; (c) Average over 100 different landscapes and of 20 randomly distributed initial points on each landscape with σ² = 0.0001; (d) Same walk and landscapes as in (c) but with σ² = 0.005.

Fig. 2. Average terminal fitness as a function of noise variance for K = 0, 7, 15 and 29 in N = 30 landscapes. The average is of fitness values from generations 300 through 400 for walks on the same landscapes as in Figs. 1(c) and 1(d). (a) Full range of noise variance; (b) Detail of very low noise variance region of (a).

Fig. 3. Hamming distances from present point in walk to all previously planted “flagpoles” in the same N = 30, K = 0 space. See text for details. Noise is independent of fitness. (a) No noise; (b) σ² = 0.1; (c) σ² = 0.000001; (d) σ² = 0.00001.

Fig. 4. Measured and true fitnesses for adaptive walks in N = 30, K = 15 landscapes with noise variance exponentially decreasing with fitness. The dashed line is measured fitness, the thick solid line is true fitness, and the solid line centered on fitness = 0 is the difference between these two. (a) One agent adapting on one landscape with k = 15 and σ₀² = 0.0005; (b) Walk on same landscape as in (a) but with k = 15 and σ₀² = 0.005; (c) Averages over the same landscapes and initial points as in Fig. 1(c) for k = 35, σ₀² = 0.001; (d) Same landscapes as in (c) but with k = 35 and σ₀² = 0.01.

Fig. 5. Average terminal fitness as a function of noise dampening factor k for N = 30, K = 15 landscapes with fitness-dependent noise. The average is over generations 300 through 400 on the same landscapes as in Figs. 4(c) and 4(d). (a) Full range of noise variance; (b) Detail of very low noise variance region of (a).

Fig. 6. Distribution of 1-mutant neighbors of a point as a function of fitness. Distributions are shown for points whose fitnesses are in a bin of width 0.0333; arrows show the average fitness in each bin. Each curve corresponds to the arrow to which it is closest. (a) Distributions averaged over 25 N = 15, K = 0 landscapes. At low fitness, most neighbors have higher fitnesses. At high fitness, most neighbors have lower fitnesses. (b) Distributions averaged over 25 N = 15, K = 3 landscapes.

Fig. 7. The effect of fitness-independent noise on the p(true positive) and p(false positive) curves. Curves are shown for σ² = 0.0005 (crosses), 0.005 (squares) and 0.2 (circles). For all curves shown, N = 15, K = 4 and k = 0. The thick lines show p(true positive), the thin lines show p(false positive). As σ² increases, the intersection of the curves moves from high fitness towards fitness 0.5. This intersection is fitness attractor fA.

Fig. 8. Histograms of agent fitnesses after 400 generations in N = 15, K = 4, k = 0 walks with fitness-independent noise. The 0 - 1 fitness range is broken into 30 bins, each of width 0.0333. All walks began with a uniform distribution of 20 agents per bin except for those fitness bins having no corresponding points in the landscape (see text). Arrows show fitnesses fA, the attractors for the noisy walks predicted by the p(true positive) and p(false positive) curves. (a) σ² = 0; (b) σ² = 0.0005; (c) σ² = 0.005; (d) σ² = 0.1.

Fig. 9. Different forms for p(true positive) and p(false positive) curves when noise decreases exponentially with fitness. All plots are for the same N = 15, K = 4 landscape. Arrows indicate fitness attractors and repellors. (a) σ₀² = 0.001, k = 12. Because of binary roundoff and finite size effects (small N), the curves show no intersection at high fitnesses. However, for reasons given in Subsection 3.3, the curves must intersect. Dotted lines indicate the approximate intersection. (b) σ₀² = 0.2, k = 10; (c) σ₀² = 0.1, k = 12.

Fig. 10. Histograms of agent fitnesses at various times in a walk with class III behavior. Agents are initially distributed uniformly with 20 agents in each of 19 bins. Arrows indicate the fitness attractors and repellor. N = 15, K = 4, σ₀² = 0.1, and k = 12. (a) generation 1; (b) generation 8; (c) generation 50; (d) generation 400.

TABLES

Tab. 1. Classes of decisions made in a noisy adaptive walk

             ∆f+ (b)     ∆f-
∆g+ (a)      true +      false +
∆g-          false -     true -

a - ∆g is the observed change in fitness from the present point to the mutant under consideration.
b - ∆f is the true change in fitness.

Tab. 2. Changes in probabilities of true/false positives/negatives due to landscape properties

                                                                   p(true +)   p(false +)   p(true -)   p(false -)
Effect of decrease in p(∆f > 0 | f)                                ↓           ↑            ↑           ↓
Effect of decrease in average fitness difference between neighbors ↓           ↑            ↓           ↑