Parallel Ant Colony Optimization


Ahmed Sameh*, Asmaa Ayman**, Noran Hasan**
*Dept. of Computer Sc. & Info. Sys., Prince Sultan University, P.O. Box 66833, Riyadh 11586, Saudi Arabia
**Department of Computer Science, The American University in Cairo
Email: [email protected]

Abstract: Ant colony optimization (ACO) is a population-based search technique that makes use of meta-heuristics. ACO algorithms are well suited to optimization problems, as they examine numerous solution options at each step of the algorithm. Parallelization techniques accelerate the search even further. We experiment with parallel interacting ant colonies that exchange information in order to make use of the best solutions obtained so far. Pruning takes place whenever a slave hits the optimum and reports it to the master.

Keywords: Ant colony optimization; Information exchange; TSP; Traveling salesman problem; Parallel


1. Introduction

Ant Colony Optimization (ACO) is a population-based search metaphor inspired by the foraging behavior of real ants. Population-based means that a population of agents is used to find the desired goal. Ants communicate through stigmergy, an indirect form of communication that takes place through changes to the surrounding environment; ants use pheromone deposition as their stigmergic way of communicating. Although population-based approaches are well suited to parallel processing, there is no golden rule for how to efficiently parallelize ACO algorithms. One of the main problems facing the parallelization of ACO algorithms is information exchange, which represents the stigmergic communication among the ants. Information exchange stimulates the two defining characteristics of ACO algorithms, exploration and exploitation: exploration arises from the randomness that marks ACO algorithms, which allows new routes to be explored and more tours of potentially minimal length to be discovered, while exploitation makes use of what other ants have discovered and follows their pattern.

2. General Parallelization Strategies

Stützle [5] described the simplest case of parallelization: parallel independent ACO searches that do not interact. This is the largest scale of parallelization, where entire searches are performed concurrently. Michel and Middendorf [7] discuss exchanging information, where separate ant colonies exchange trail information (pheromone matrices). More generally, parallelism can be exploited at several different scales:

1. Parallel independent ant colonies, as described in [5], have no communication overhead at all.

2. Parallel interacting ant colonies exchange the best pheromone matrices among neighbors. The communication overhead is high.

3. Parallel ants: each ant/slave is assigned a separate processor with which to build its solution, while the master processor is responsible for receiving the input, placing the ants randomly, updating the pheromone, and producing the output. The communication overhead is moderate.

4. Parallel evaluation of solution elements: each slave processor is assigned an equal number of solution elements to evaluate, since the elements are independent of one another. This approach is computationally expensive and is better suited to highly constrained problems.

5. Parallel combination of ants and evaluation of solution elements: each ant is assigned a group of processors (the same number for every ant); within each group, a group master is responsible for constructing the ant's tours and delegating solution evaluation to the slaves.

3. Parallel Ant Colonies

Stützle [5] discusses some ways to efficiently parallelize ACO algorithms. The simplest way is to independently execute the sequential algorithm on k parallel processors. Such parallel runs have no communication overhead and are useful for randomized algorithms. In the case of parallel independent runs, the best solution of the k runs is taken as the final solution, so the main quantity of interest is the solution quality distribution of the best-of-k-runs. Another interesting idea is to run the algorithm with different search strategies.
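As an illustration, the following minimal C sketch shows how a master could realize parallel independent runs under PVM and keep the best of k runs. The slave executable name "aco_slave" and the message tag are our own illustrative assumptions, not part of any existing package.

/* Sketch: parallel independent runs with best-of-k selection under PVM.
   "aco_slave" and TAG_RESULT are illustrative assumptions. */
#include <stdio.h>
#include <pvm3.h>

#define K_RUNS     8
#define TAG_RESULT 1

int main(void)
{
    int tids[K_RUNS];
    double best = -1.0, len;
    int i;

    /* spawn k independent sequential ACO searches; they never communicate */
    pvm_spawn("aco_slave", NULL, PvmTaskDefault, "", K_RUNS, tids);

    /* collect one tour length per run and keep the minimum (best-of-k-runs) */
    for (i = 0; i < K_RUNS; i++) {
        pvm_recv(-1, TAG_RESULT);        /* blocking receive from any run */
        pvm_upkdouble(&len, 1, 1);
        if (best < 0.0 || len < best)
            best = len;
    }
    printf("best of %d independent runs: %.2f\n", K_RUNS, best);
    pvm_exit();
    return 0;
}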


More improvement over parallel independent runs can be gained through solution exchange among the search processes, for instance by exchanging the ants with the best solutions among the single ant colonies [2]. As we will see later, exchanging whole pheromone matrices can influence the search directions of other ants.


To speed up single runs, a master-slave approach is introduced. One master processor updates the main data structures of the ACO algorithm, constructs initial solutions for the local search algorithms, and sends these solutions to the other processors, which improve them by local search. The master collects the locally optimal solutions and, once a sufficient number of them have arrived, updates the trail matrix before constructing more solutions.
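A minimal sketch of the master's collection loop in this scheme might look as follows. The helper functions, the message tag, and the batch size (our reading of "a sufficient number") are all illustrative assumptions, not taken from Stützle's package.

/* Sketch of the master side of the master-slave local-search scheme. */
#include <pvm3.h>

#define TAG_LOCAL_OPT 2
#define BATCH_SIZE    4   /* "sufficient number" of locally optimal solutions */

void store_solution(const int *tour, int n, double len);   /* assumed helpers */
void update_trail_matrix(void);
void construct_and_dispatch_solutions(void);

void master_collect(int n_cities, int *tour_buf)
{
    double len;
    int arrived = 0;

    for (;;) {
        pvm_recv(-1, TAG_LOCAL_OPT);       /* a locally optimal solution */
        pvm_upkdouble(&len, 1, 1);
        pvm_upkint(tour_buf, n_cities, 1);
        store_solution(tour_buf, n_cities, len);

        /* once enough local optima have arrived, update the trail matrix
           and construct new initial solutions for the slaves */
        if (++arrived == BATCH_SIZE) {
            update_trail_matrix();
            construct_and_dispatch_solutions();
            arrived = 0;
        }
    }
}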


4. Information Exchange

When developing parallel genetic algorithms and parallel ant colony optimization (ACO) algorithms, it is common to adopt a strategy of information exchange, which plays a major role in these algorithms. The exchanged information can be solutions, pheromone matrices, or algorithm parameters. A fixed condition should be selected to trigger the information exchange; this can be a time interval, a number of iterations, or a comparison of the locally optimal solution against the globally optimal solution obtained so far. In the last case, information exchange is carried out when the local optimum is better than the global one.

The fundamental principle of a parallel ant colony algorithm is to divide N ants into P ant colonies. Normally every colony has the same size, so that the number of ants per sub-colony is the total number of ants divided by the number of processes, n = N / Procs. In the algorithm design, each colony is assigned to a processor, and each colony then searches for the best solution independently. To avoid convergence to local optima on some processors, the processors should carry out information exchange with each other under the chosen fixed condition (e.g., a time interval). Parallel ACO lets the ant colony on each processor increase its speed of convergence, which becomes more important as the scale of the problem grows.
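A small C sketch of this division and of one fixed exchange condition (here a fixed iteration count; all names are our own illustrative assumptions):

/* Sketch: N ants divided among P colonies, with a fixed exchange condition.
   The helper functions are assumed, not taken from an existing package. */
#define N_ANTS            64
#define N_COLONIES        8        /* one colony per processor */
#define EXCHANGE_INTERVAL 50       /* fixed condition: every 50 iterations */

void construct_tours(int n_ants);             /* assumed helpers */
void update_local_pheromone(void);
void exchange_information_with_partner(void);

static const int ants_per_colony = N_ANTS / N_COLONIES;   /* n = N / Procs */

void colony_iteration(int iter)
{
    construct_tours(ants_per_colony);  /* each colony searches independently */
    update_local_pheromone();

    /* the trigger could equally be a time interval or a comparison of the
       local optimum against the global optimum obtained so far */
    if (iter % EXCHANGE_INTERVAL == 0)
        exchange_information_with_partner();
}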

5. Application of PACO to TSP

Parallel ACO runs are useful in randomized algorithms, and ACO and TSP are considered a perfect application because the tour construction process is highly random.

TSP is the problem of finding the shortest or cheapest round-trip route that visits each city exactly once. It is a well-known NP-hard problem. Given n cities, TSP can be represented by a graph with n nodes, in which the edges carry weights representing the distance or cost of going from one city to the other.

ACO has been applied to TSP as follows. Each ant moves through the graph of cities until it completes a tour and offers this tour as its suggested solution. Each ant may drop pheromone on the edges that were part of its solution; the amount of pheromone dropped, if any, is determined by the quality of the ant's solution relative to those obtained by the other ants. The best ant may be the only one dropping pheromone, or each ant may drop pheromone proportionally to the quality of its solution. The ants choose which city to go to next based on a heuristic function whose variables include the amount of pheromone on the edges; an ant is therefore more likely to traverse an edge the more it was visited in the past. Some randomness must be maintained to allow for the exploration of possibly better but never visited solutions and to avoid getting stuck in a local minimum. This cycle is repeated a number of times, and the best solution found is presented as the optimal solution.
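The pheromone-biased choice can be sketched as roulette-wheel selection over tau^alpha * eta^beta with eta = 1/distance. This is a generic illustration of the standard rule (the flattened-matrix layout, the parameters alpha and beta, and the 1024-city bound are our assumptions), not the exact code of the package we used.

/* Sketch of pheromone-biased next-city choice for the TSP.
   tau and dist are flattened n x n matrices; visited[j] != 0 marks
   cities already on the tour. Assumes n <= 1024. */
#include <stdlib.h>
#include <math.h>

int choose_next_city(int cur, int n, const double *tau, const double *dist,
                     const int *visited, double alpha, double beta)
{
    double weight[1024], total = 0.0, r;
    int j, last = -1;

    for (j = 0; j < n; j++) {
        weight[j] = 0.0;
        if (!visited[j] && j != cur) {
            /* more pheromone and shorter edges make a city more likely */
            weight[j] = pow(tau[cur * n + j], alpha)
                      * pow(1.0 / dist[cur * n + j], beta);
            total += weight[j];
            last = j;
        }
    }

    /* roulette-wheel selection keeps some randomness, so unexplored
       tours can still be discovered (exploration vs. exploitation) */
    r = ((double)rand() / RAND_MAX) * total;
    for (j = 0; j < n; j++) {
        if (weight[j] > 0.0) {
            r -= weight[j];
            if (r <= 0.0)
                return j;
        }
    }
    return last;   /* fallback for floating-point rounding */
}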

Parallel Ant Colony Optimization (PACO) is a parallel implementation of ACO where ants do their work simultaneously on different processing units. This intuitively provides improved performance and speedup. However, the ants need to communicate to exchange information about the solutions they find.

6. Related Work

L. Chen et al. [2] proposed a strategy for improving the performance of the PACO algorithm and avoiding early convergence. Their algorithm divides the ants into equally sized sub-colonies, each assigned to a different processor. Each sub-colony searches for a locally optimal solution, and information is exchanged between processors from time to time. They focused on two factors that affect performance: the method of information exchange between the ants and the time interval between consecutive information exchange points. Both factors change adaptively to improve performance. A processor does not exchange information with a random or neighboring processor; instead, it chooses the partner that is most different from it and has high fitness. This allows a processor to evolve toward an optimal solution by learning from its partner's trail while maintaining the diversity of the solutions, thus avoiding early convergence. The time interval also changes adaptively, depending on the diversity of the solutions: if there is no diversity, the interval is reduced to improve the search and increase the diversity; if the solutions are diverse, the interval is increased to reduce the communication overhead and improve performance. They tested their PACO algorithm on TSP benchmarks using MPI.

H. Liu et al. [3] presented a PACO algorithm based on construction graph decomposition (PACOCGD). The graph is decomposed into parts, and each part is assigned to a different processing unit. To explore the graph and find a complete solution, an ant moves from one processing unit to the other, and messages are sent to update the pheromone levels. They showed that this approach reduces the computational complexity and the memory needed in each processing unit and therefore improves computational efficiency. They also used MPI for their experiments.

As in [3], Y. Lin et al. [4] argued that having one solution vector may become inefficient as the solution space grows. They therefore proposed dividing the problem into sub-components corresponding to two solution vectors. The problem is divided into two parts, with two colonies, where each colony focuses on optimizing one solution vector while the other is fixed. When migration occurs, the best vector in one sub-colony migrates to replace the fixed vector in the other sub-colony. They compared two migration mechanisms: the first performs migration every fixed number of iterations, while the other adapts the migration rate to the evolution state, so that migrations are more frequent at stagnation times. They showed that the first brings faster convergence but has a high communication cost; the latter trades off convergence speed against communication cost.

7. Experimental Setup

The purpose of our experiment is to implement parallel ant colony optimization. To do so, we needed a sequential version of ACO to work on. We used the software package implemented by Thomas Stützle as our starting point; Stützle applied his implementation to the symmetric traveling salesman problem. The package is written in ANSI C under Linux. Since we needed to modify this code and did not have physical access to a Linux machine, we installed the Fedora 9 operating system on a local virtual machine hosted by a Windows XP operating system. We chose Parallel Virtual Machine (PVM) as our software tool for parallelizing the ACO problem. Communication between tasks in PVM is achieved by explicit sending and receiving of information.
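For example, sending and receiving a tour length between two tasks looks like this in PVM (the message tag value is an arbitrary choice of ours):

/* Minimal example of explicit message passing in PVM, the style of
   communication used throughout our implementation. */
#include <pvm3.h>

#define TAG_TOUR_LENGTH 3

void send_tour_length(int partner_tid, double tour_length)
{
    pvm_initsend(PvmDataDefault);        /* new send buffer, XDR encoding */
    pvm_pkdouble(&tour_length, 1, 1);    /* pack one double, stride 1 */
    pvm_send(partner_tid, TAG_TOUR_LENGTH);
}

double receive_tour_length(int partner_tid)
{
    double tour_length;
    pvm_recv(partner_tid, TAG_TOUR_LENGTH);  /* blocking receive */
    pvm_upkdouble(&tour_length, 1, 1);       /* unpack in packing order */
    return tour_length;
}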

8. PACO Algorithm

In our design of the basic parallelized ACO, we combine the master-slave and information-exchange approaches. The master, however, is only responsible for spawning the slaves and pruning them whenever one of them returns the optimum. All the processing, local search, and pheromone matrix updates are done by the slaves. Slaves also exchange solutions independently of the master, so as to decrease the overhead of communication with the master. The master collects the findings of several sequential ACO programs. Therefore, we adjusted Stützle's sequential ACO implementation so that it would act as a slave to the master program. Each slave running on a processor represents a sub-colony. The master spawns a number of slaves, and each slave tries to find the best solution and send it back to the master program. During its operation, each slave periodically exchanges information with its neighbor and uses this information to update its pheromone matrix. The best solution is determined by the tour length: each sub-colony chooses the tour with the minimum length as its best and reports it to the master program at the end of its search. The master program then selects the best solution it received from its slaves and presents it as the proposed solution.

In our approach, each processor selects its partner such that slaves with even IDs exchange their information with their successors, and those with odd IDs exchange theirs with their predecessors; a sketch of this pairing follows below. Therefore, each processor exchanges information with the same processor at the end of each time interval. Slaves exchange three parameters: their pheromone matrices, the tour length, and the best tour obtained so far. For each processor, the best ant represents its sub-colony, and its best tour and tour length are sent to the partner processor. Whenever the master receives the optimal solution, it multicasts to all slaves that this solution has been found, and the slaves consequently prune themselves. This reduces the time wasted in slaves that would otherwise not know that another slave has found the optimal solution.
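A sketch of the pairing rule (assuming slave IDs 0..N-1 and an even number of slaves, so that the pairing is mutual):

/* Even IDs exchange with their successor, odd IDs with their predecessor.
   With an even number of slaves every pair is mutual: 0<->1, 2<->3, ... */
int exchange_partner(int my_id, int n_slaves)
{
    if (my_id % 2 == 0)
        return (my_id + 1) % n_slaves;             /* even: successor */
    return (my_id - 1 + n_slaves) % n_slaves;      /* odd: predecessor */
}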


Tables 1 and 2 show the pseudo code for the master and slave programs, respectively.

Algorithm 1: The Master
begin
  Configure PVM;
  Initialize number of processes N;
  Start timer;
  Spawn N processes;
  Multicast to all slave processes N and the task IDs of all slaves;
  for each slave do
    Send a number between 0 and N that identifies this task inside the program;
  end for
  while not all slaves have sent back a solution do
    Wait for a solution;
    if a slave returns a solution that is better than any solution received before then
      Multicast this tour length to all the slaves;
    end if
  end while
  Stop timer;
  Print elapsed time and best solution received;
end
Table 1 Pseudo code for the master program

Algorithm 2: The Slave (ACO)
begin
  Receive N and the task IDs of all slaves from the master;
  Receive its ID inside the program from the master;
  Initialize program (pheromone matrices, etc.);
  for each try do
    while the termination condition is not met (max allowed time or number of tours reached, or optimal solution found) do
      Check if the master has sent a new optimal solution; if this solution is optimal, prune this slave;
      Construct solutions;
      Update pheromone matrix;
      if it is now time for information exchange then
        Identify neighbor for information exchange;
        Send to neighbor the best tour found, its length, and the pheromone matrix;
        Receive from neighbor its best tour found, its length, and the pheromone matrix;
        Update pheromone matrix using the information received from the neighbor;
      end if
    end while
  end for
  Send to master the best solution found;
end
Table 2 Pseudo code for the slave program
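The pruning check in Algorithm 2 can be realized with a nonblocking receive, sketched below. The tag and the exact pruning test are our assumptions.

/* Sketch of the slave's check for the master's multicast (Algorithm 2).
   pvm_nrecv() returns a positive buffer id when a message has arrived
   and 0 when none is pending, so the slave never blocks while searching. */
#include <pvm3.h>

#define TAG_BEST 4

/* optimum_length: known optimal tour length for the instance.
   Returns 1 if another slave already found the optimum. */
int should_prune(double optimum_length)
{
    double announced;

    if (pvm_nrecv(-1, TAG_BEST) > 0) {
        pvm_upkdouble(&announced, 1, 1);
        if (announced <= optimum_length)
            return 1;       /* optimum found elsewhere: prune this slave */
    }
    return 0;
}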


After information exchange takes place, each slave updates its pheromone matrix according to the following equation:

tau_g(t+1) = rho * tau_g(t)
             + (fit_g / (fit_g + fit_h)) * delta_tau_g(t)
             + (fit_h / (fit_g + fit_h)) * delta_tau_h(t)

The processor itself is denoted by g and its partner by h. tau_g(t+1) is the new pheromone element after the update and tau_g(t) is the pheromone element before the update. fit is the fitness of a processor's best tour, measured as a function of the inverse of the tour length. If the edge/arc is in a processor's best tour, delta_tau is that processor's fitness; otherwise, delta_tau is set to tau_min, which is:

tau_min = 1.0 / (n_cities * nn_tour)

where nn_tour is the tour length of a randomly generated nearest-neighbor tour. tau_min is also used at the beginning of each try to initialize the pheromone trails, because the pheromones have to be initialized differently in each try (the randomness effect). Randomness is key to the success of ACO algorithms: it enables the exploration of new tours that might have better lengths, while exploitation, on the other hand, makes use of the meta-heuristics provided by other ants.
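For a single pheromone entry, the update can be sketched in C as follows. The function and parameter names are ours; in_g and in_h flag whether the edge lies on the respective best tour.

/* Sketch of the exchange update for a single pheromone entry.
   delta_tau is the owner's fitness if the edge is on its best tour,
   and tau_min otherwise, exactly as described above. */
double updated_tau(double tau_old, double rho,
                   double fit_g, double fit_h,  /* fitness of g and partner h */
                   int in_g, int in_h,          /* edge on best tour of g / h? */
                   double tau_min)
{
    double delta_g = in_g ? fit_g : tau_min;
    double delta_h = in_h ? fit_h : tau_min;
    double w_g = fit_g / (fit_g + fit_h);  /* weight of the colony's own trail */
    double w_h = fit_h / (fit_g + fit_h);  /* weight of the partner's trail */

    return rho * tau_old + w_g * delta_g + w_h * delta_h;
}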

9. Experimental Results

The TSP instance we used is lin318.tsp, a 318-city problem (Lin-Kernighan). The ant colony optimization variant used is Ant Colony System (ACS). The sequential ACO may run several tries, whereas in PACO we allowed each processor to run only a single try. To be able to evaluate the difference in speed, we measured the time it took to reach a specific optimal solution; therefore, we put no limit on the maximum number of tours or the time a processor may run. The time taken to find the optimal solution using PACO is much shorter than that using sequential ACO. To demonstrate this effect, we compared sequential ACO running x tries on a single processor against PACO running one try on each of x processors. Apart from losing the benefit of running in parallel, sequential ACO also does not benefit from the pruning that we implemented in PACO. Figure 1 shows the difference in the time taken by sequential ACO and PACO to find the optimal solution.

[Figure 1: the time (in seconds) taken to find the optimal solution by sequential ACO and PACO, versus the number of processors.]

Table 3 and Figure 2 show the effect of increasing the number of processors on the average time taken to find the optimal solution. As the number of processors increases, the average time taken to find the optimal solution decreases. This is because, as the number of processors increases, so do the chances that one of them gets fortunate and finds the optimal solution sooner. Table 3 also shows that, for the same number of processors, the time taken changes dramatically from one run to the next. This is because the behavior of ACO is random, and it may therefore get lucky in one run and unlucky in the next. That is why we ran each number of processors several times and plotted the averages in Figure 2, which allows us to compare their performance. Information exchange was disabled when taking these measurements.

Number of processors            2      4      6      8
Time (seconds) for each run    12      2     12      1
                               10      1      5      1
                               18      1      0      1
                               27     44     13      1
                               16     10      1      5
                                8     17     12      6
                               13     21     13      7
Average                     14.86  13.71   8.00   3.14
Table 3 Time to find optimal solution for each number of processors

[Figure 2: the average time (in seconds) taken to find the optimal solution versus the number of processors.]

The time interval between information exchanges was not implemented using real time. Instead, it is represented by a number of iterations, which is simply the number of tours each processor runs. Processors therefore exchange information after completing a certain number of tours since the last exchange.

Time interval (iterations)     10     25     50    100    200
Time (seconds) for each run    54     33     15     24     10
                               25     10      8     50     70
                              101     33     15      7     44
                               26      2     35     26      1
Average                      51.5   19.5  18.25  26.75  31.25
Table 4 Time to find optimal solution for each time interval

[Figure 3: the time (in seconds) taken to find the optimal solution versus the time interval (number of iterations).]


Table 4 and Figure 3 demonstrate that as the time interval increases from 50 to 200, the time taken to find the optimal solution increases, since the benefit gained from exchanging with neighbors diminishes. On the other hand, as the interval decreases from 25 to 10, the time taken to find the optimal solution also increases. Two factors may contribute to this observation. The first is that as the time interval decreases, the overhead of communication between the slaves becomes so high that it outweighs the benefit gained from the exchange. The other is that when the interval is too short, a processor has not yet run enough tours to deposit a meaningful pheromone distribution that its partner can benefit from, so there is no significant difference in the pheromone distribution from one interval to the next.

10. Conclusion and Future Work

The results of our experiment show that PACO can produce the optimal solution significantly faster than sequential ACO. Also, as the number of processors (sub-colonies) increases, the time for finding the solution decreases. The time interval between information exchange points has an effect on the time for finding the optimum: an interval must be chosen that is neither so large that the sub-colonies do not benefit enough from each other, nor so small that each sub-colony does not get a chance to formulate a decent pheromone matrix to exchange and the communication overhead outweighs the benefit gained from the exchange.

A possible source of overhead in our exchange mechanism is that, since the time intervals are not real times but intervals of completed tours, one processor may finish its tours earlier and have to stop and wait for its designated partner. A solution we may investigate is that when a processor is ready for an exchange, it announces this to the master, which pairs it up with the next ready processor instead of having the processor pick its partner by itself. This may cause a bottleneck at the master, which could be overcome by using a hierarchy of masters for this purpose.


We would like to refine the current methodology of information exchange so that a circular exchange of locally best solutions is used: a virtual neighborhood is established between the colonies such that they form a directed ring, and in every information exchange step every colony sends its locally best solution to its successor colony in the ring. We would also like to experiment with having the time interval for information exchange change adaptively depending on the diversity of the presented solutions. The neighboring processor to exchange with can also be chosen adaptively, depending on the neighbor's fitness and the dissimilarity between the two processors' best solutions. These ideas have been discussed in [2]. Another direction is the "Research of Multi-path Routing Protocol Based on Parallel Ant Colony Algorithm Optimization in Mobile Ad hoc Networks", which applies the parallel ant colony algorithm to establish multi-path routing between source and destination nodes, leading to improvements in packet delivery performance. Parallel ACO can greatly improve the probability of route discovery and increase the speed of convergence in ad-hoc networks.

References

[1] M. Craus and L. Rudeanu, "Parallel Framework for Ant-like Algorithms", Proceedings of ISPDC/HeteroPar'04.
[2] L. Chen, H. Sun, and S. Wang, "Parallel Implementation of Ant Colony Optimization on MPP", Proceedings of the Seventh International Conference on Machine Learning and Cybernetics, Kunming, 12-15 July 2008.
[3] H. Liu, P. Li, and Y. Wen, "Parallel Ant Colony Optimization Algorithm", Proceedings of the 6th World Congress on Intelligent Control and Automation, June 21-23, 2006, Dalian, China.
[4] Y. Lin, H. Cai, J. Xiao, and J. Zhang, "Pseudo Parallel Ant Colony Optimization for Continuous Functions", Third International Conference on Natural Computation (ICNC 2007).
[5] T. Stützle, "Parallelization Strategies for Ant Colony Optimization".
[6] M. Randall and A. Lewis, "A Parallel Implementation of Ant Colony Optimization", Journal of Parallel and Distributed Computing 62, 1421-1432 (2002).
[7] R. Michel and M. Middendorf, "An Island Based Ant System with Lookahead for the Shortest Common Subsequence Problem", Proceedings of the Fifth International Conference on Parallel Problem Solving from Nature, Vol. 1498, pp. 692-708, Springer-Verlag, Berlin (1998).