MALLBA: Towards a Combinatorial Optimization Library for Geographically Distributed Systems∗

E. Alba (3), F. Almeida (2), M. Blesa (1), C. Cotta (3), M. Díaz (3), I. Dorta (2), J. Gabarró (1), J. González (2), C. León (2), L. Moreno (2), J. Petit (1), J. Roda (2), A. Rojas (2), F. Xhafa (1)

(1) Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Campus Nord C6, E-08034 Barcelona.
(2) Departamento de Estadística, Investigación Operativa y Computación, Universidad de La Laguna, Edificio Física/Matemáticas, Calle Astrofísico Fco. Sánchez s/n, E-38271 La Laguna, Tenerife.
(3) Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, ETSI Informática, Campus de Teatinos, E-29071 Málaga.

∗ Project TIC1999-0754-C03 granted by CICYT. [email protected], http://www.lsi.upc.es/~mallba

Abstract: Problems arising in different areas such as numerical methods, simulation or optimization can be solved efficiently by parallel super-computing. However, it is not always possible to buy and maintain parallel super-computers. A geographically distributed network of PC clusters is an interesting low-cost alternative. The possibility of connecting different clusters of PCs through the Internet opens a new approach to distributed and massive computing. The MALLBA project tackles the resolution of combinatorial optimization problems using algorithmic skeletons implemented in C++ under this approach. MALLBA offers three families of generic resolution methods: exact, heuristic and hybrid. Moreover, for each resolution method it offers three implementations: sequential, LAN and WAN. This paper surveys the current state of the MALLBA project.

Keywords: Combinatorial optimization, geographically distributed environments, algorithmic skeletons, exact/heuristic/hybrid methods, middleware.

I. Introduction

Combinatorial optimization problems arise in various fields such as control theory, operational research, biology, and computer science. Traveling Salesman, Frequency Assignment, Graph Partitioning, and Sequence Alignment are examples of such problems [1]. Exact methods such as divide and conquer, branch and bound, and dynamic programming have traditionally been used to tackle these problems. However, the computational requirements of these algorithms may be prohibitive, since they usually need to enumerate a significant part of the solution space in order to pick the best solution. As an alternative, heuristic methods provide high-quality solutions in a reasonable running time [2], [3]. Although heuristic methods give no guarantee on the absolute quality of their solutions, their use becomes mandatory as the size of the problem increases. Common examples of heuristic methods are local search techniques, spectral methods and evolutionary algorithms. The quality of the solution obtained depends on the time the algorithm runs and, most importantly, on the amount of problem-dependent knowledge inserted into the algorithm [4]. Broadly speaking, this knowledge insertion is called hybridization, and it can be done in different ways (e.g., see [5]).

It is well known that parallel computing is a real alternative for reducing the running time of all the above algorithms and for increasing the size of tractable problem instances [6]. It is also an important element when combining algorithms from the hybridization point of view [7]. However, parallel computers are still far from being used by many organizations because of hardware, maintenance and programming costs. Moreover, the typical scientist or engineer who programs has some knowledge of UNIX, C and C++, but is usually not proficient in using or programming parallel computers.

Personal computers (PCs) offer a consolidated low-cost hardware alternative. Many parallel scientific applications offering generic heuristics have been developed with good performance for local area networks (LANs) [8], [9], [10]. The current development of the Internet makes it possible to connect super-computers and geographically distributed PC clusters, so that this large pool of resources can be used as a single parallel computer. This potential computational power introduces new options for reducing the running time of applications. However, experience in running applications on Wide Area Networks (WANs) is still limited. The complexity of this new environment is a challenge for the developers of parallel applications. The efficiency of Internet applications will depend strongly on dynamic network parameters (latency, bandwidth, network overhead). The MALLBA research network connects PC clusters in Barcelona, La Laguna and Málaga through RedIRIS.
Here we describe an integrated library for combinatorial optimization problems that runs on a geographically distributed environment connecting PC clusters through the Internet. This library will ease the development of sequential and parallel algorithms for users in scientific and engineering fields who need more computational power. Algorithmic skeletons are a way to reduce the development effort for sequential and parallel applications. The user just describes the elements defining the problem and then instantiates the procedures that parameterize the selected technique. In the parallel case, the skeletons help to reduce the gap between the user and the parallel architecture: the skeleton hides the implementation details and makes it possible to execute parallel algorithms by writing only sequential code. Sequential and parallel skeletons have been developed for exact, heuristic, and hybrid methods. At present, and to the best of our knowledge, no library integrates the whole set of techniques in sequential and parallel distributed environments.

In Section II we present the MALLBA library and its architecture. Section III describes the MALLBA network and the tools related to it. Some experiments show how communications in a WAN may become a bottleneck for the performance of the parallel algorithms. The ease of use of the MALLBA library is illustrated with the Tabu Search algorithm [11] and a hybridization of the Spectral Sequencing and Simulated Annealing techniques. The performance of these techniques is evaluated on the 0-1 Multidimensional Knapsack Problem and on Minimum Linear Arrangement in Section IV. The MALLBA project has also become a successful place holder for basic research on distributed and parallel computation. Section V briefly lists some of the research topics around MALLBA.

II. The MALLBA skeletons

The MALLBA library offers a set of resolution techniques to solve optimization problems. Each resolution technique is encapsulated into a "skeleton". At present, the following skeletons for exact techniques are available: Divide and Conquer, Branch and Bound (BnB) and Dynamic Programming. The following skeletons for heuristic techniques are also available: Hill Climbing, Metropolis, Simulated Annealing (SA), Tabu Search (TS) and Genetic Algorithms (GA). Moreover, some hybrid techniques have been implemented by combining skeletons, e.g., GA+TS, GA+SA, BnB+SA. The above skeletons have been used for solving some well-known problems.
Rather than offering code for concrete problems, MALLBA provides generic code that users particularize to their own problems. A skeleton is a unique, generic, abstract implementation that can be used to solve many different problems. The basic idea behind the skeleton is to allow the user to instantiate any combinatorial optimization problem of interest by defining only the most important problem-dependent features. Elements related to the inner algorithmic functionality of the method are hidden from the user. Skeletons are based on the separation of two concepts: the concrete problem to be solved, and the general resolution method to be used. The glue between them is a fixed generic interface, which defines their interaction. As a consequence, a unique skeleton can be used to instantiate different problems; on the other hand, a skeleton can also have different implementations. The problems to be solved have to be described by the user. The MALLBA library will provide three different implementations for the considered target environments: sequential, LAN and WAN. The structure of a MALLBA skeleton is shown in Fig. 1.

Fig. 1. Structure of a MALLBA skeleton: a unique generic interface is given for each resolution method, with different generic implementations for each environment (sequential, LAN, WAN); the user has to instantiate the problem-dependent parts of the problem. The figure shows three parts:
• Instantiation of a concrete problem: instantiation of the problem-dependent parts; independent of the architecture; examples available; written by the user.
• Generic interface of the resolution method: definition of the problem-independent parts; object oriented; documentation available; public part provided by MALLBA.
• Generic implementation of the resolution method: implementation (SEQ, LAN or WAN) of the resolution algorithm; dependent on the architecture; private part provided by MALLBA.

In MALLBA, skeletons are represented by a set of C++ classes. When using a skeleton, the user does not need any knowledge about parallelism, because all the related constructs are included inside the library and remain hidden. Some of the advantages are re-usability, genericity, portability and ease of use. To provide them, MALLBA uses a particular approach to generic programming using objects. Moreover, in most cases efficient implementations are obtained.

User interface

The classes forming the skeleton are structured according to their dependence on the problem. The classes implementing inner functionalities of the method (e.g. the main procedure) are completely provided by the skeleton, whereas the implementation of some other classes must be supplied by the user. Therefore, the classes forming the skeleton are classified into two groups: provided and required.

Provided classes. These "private" classes implement the resolution method itself. There are mainly two provided classes in each skeleton: the Solver and the Setup. The former implements a resolution method and maintains the state of the exploration. The latter contains the setup parameters needed to tune the execution (e.g. number of independent runs to perform, number of iterations per independent run, etc.). The user can consult the state of the search and query other information related to the exploration process.

Required classes. These "public" classes abstract the problem-dependent entities involved in the resolution method. The requirements on these entities depend on the problem. We have been able to abstract the needs of each entity, but the way they are met when solving a concrete problem depends strongly on the problem itself. Therefore, the final behavior of these classes must be implemented by the user. This separation allows us to define C++ classes with a fixed interface but without implementation, so the Solver can use the required classes in a generic way.

In order to instantiate a concrete problem, the user must give a particular implementation of the required classes. More precisely, this means (1) introducing data types for representing the required classes and (2) implementing the abstract methods of each class using the chosen data types. In order to enforce this approach, each MALLBA skeleton has its classes separated into three parts: (1) the definition (interface) of the classes; (2) the implementation of the provided classes; and (3) the implementation of the required classes. We have avoided the use of inheritance and virtual methods in order to provide better efficiency. For more details, see [12].

A short example

In order to illustrate the previous concepts, let us consider the skeleton for Tabu Search. Roughly speaking, Tabu Search [11] is a local search algorithm in which some historical information about the evolution of the search (the itinerary through the last visited solutions) is kept in order to improve the efficiency of the exploration process. This information is used to guide the movement from one solution to the next while avoiding cycling. The provided classes in this skeleton are Solver and Setup; the required classes are Problem, Solution, Movement and TabuList. To instantiate the Tabu Search skeleton for the 0-1 Multidimensional Knapsack Problem, the user would need to represent the benefits, the capacities and the constraint matrix in the Problem class. The user would also record in the Solution class whether or not each item is included in the knapsack.
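To make the split between required classes and a generic solver more concrete, the following is a minimal single-file sketch; it is not the MALLBA code. The class names Problem, Solution, Movement and TabuList come from the text, while every member name (benefit, capacity, weight, size, feasible, fitness, apply, is_tabu, add), the one-item flip neighbourhood and the function tabu_search are assumptions made for illustration. A function template stands in for the Solver so that the required classes are used generically without inheritance or virtual methods; how the library itself achieves this is described in [12].

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical required class: data of a 0-1 Multidimensional Knapsack instance.
struct Problem {
    std::vector<int> benefit;                // benefit[j] of item j
    std::vector<int> capacity;               // capacity[i] of constraint i
    std::vector<std::vector<int>> weight;    // weight[i][j]: constraint matrix
    std::size_t size() const { return benefit.size(); }
};

// Hypothetical required class: a move flips one item in or out of the knapsack.
struct Movement { std::size_t item; };

// Hypothetical required class: which items are currently in the knapsack.
struct Solution {
    std::vector<bool> taken;
    explicit Solution(const Problem& p) : taken(p.size(), false) {
        assert(p.weight.size() == p.capacity.size());
    }
    bool feasible(const Problem& p) const {
        for (std::size_t i = 0; i < p.capacity.size(); ++i) {
            int used = 0;
            for (std::size_t j = 0; j < taken.size(); ++j)
                if (taken[j]) used += p.weight[i][j];
            if (used > p.capacity[i]) return false;
        }
        return true;
    }
    int fitness(const Problem& p) const {
        int total = 0;
        for (std::size_t j = 0; j < taken.size(); ++j)
            if (taken[j]) total += p.benefit[j];
        return total;
    }
    void apply(const Movement& m) { taken[m.item] = !taken[m.item]; }
};

// Hypothetical required class: remembers the last `tenure` flipped items.
class TabuList {
    std::deque<std::size_t> recent_;
    std::size_t tenure_;
public:
    explicit TabuList(std::size_t tenure) : tenure_(tenure) {}
    bool is_tabu(const Movement& m) const {
        return std::find(recent_.begin(), recent_.end(), m.item) != recent_.end();
    }
    void add(const Movement& m) {
        recent_.push_back(m.item);
        if (recent_.size() > tenure_) recent_.pop_front();
    }
};

// Stand-in for a sequential Solver: it only relies on the fixed interface above.
template <class P, class S, class M, class T>
S tabu_search(const P& problem, std::size_t iterations, std::size_t tenure) {
    S current(problem);
    S best = current;
    T tabu(tenure);
    for (std::size_t it = 0; it < iterations; ++it) {
        bool found = false;
        M best_move{};
        int best_fit = 0;
        for (std::size_t j = 0; j < problem.size(); ++j) {
            M move{j};
            if (tabu.is_tabu(move)) continue;         // skip recently used moves
            S candidate = current;
            candidate.apply(move);
            if (!candidate.feasible(problem)) continue;
            const int fit = candidate.fitness(problem);
            if (!found || fit > best_fit) { found = true; best_move = move; best_fit = fit; }
        }
        if (!found) break;                            // no admissible neighbour left
        current.apply(best_move);
        tabu.add(best_move);
        if (best_fit > best.fitness(problem)) best = current;
    }
    return best;
}
```

For example, with benefits {6, 5, 4}, a single constraint with weights {3, 3, 3} and capacity 6, calling tabu_search<Problem, Solution, Movement, TabuList>(p, 10, 2) returns a feasible solution of fitness 11. Swapping in a different Problem and Solution pair with the same member functions would reuse the search loop unchanged, which is the point of the fixed interface.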
The Solution class also requires methods to compute the fitness, generate and apply Movements, intensify and diversify the search using the TabuList, etc.

Hybridization interface

The hybridization mechanism we consider in this work is the combination of several algorithms. Such hybrid algorithms try to join the positive properties of the techniques being combined. Skeletons can be used in two ways: through instantiations written by users and as internal components of the library. To deal with this, a standard and uniform interaction is needed between the skeleton and its environment. The concept of state has been added to the skeleton as part of its internal implementation. The state of a skeleton describes the current position of the search, and it can be consulted or modified. It is composed of some relevant attributes involving current and historical information, such as the number of iterations performed, the current solution, the best solution found so far, etc. The existence of the state makes it possible to trace the evolution of the search and to decide on new behavioral patterns for the remaining part of the search. A user may want to consult the state in order to analyze the behavior of the method or even to plot it. A skeleton may want to cooperate with other skeletons in order to solve a common problem. This kind of cooperation is known as hybridization. When hybridizing skeletons, the state of the searches may need to be modified in order to change their future behavior.

Fig. 2. The architecture of the MALLBA library.

III. The MALLBA network

The infrastructure used in the MALLBA project is made of communication networks and clusters of computers located in Málaga, La Laguna and Barcelona. The project intends to treat this infrastructure as a single parallel computational resource. One of the objectives of the project is precisely to study the behavior of the Internet in this context. In the following, we present a short description of this network, some measurements to analyze its performance, and some software we have developed to ease the parallel implementation of our skeletons.

Network description

The three sites are connected through RedIRIS, the national network for the interconnection of the computer resources of universities and research centers. This wide area network is formed by a set of nodes distributed throughout the country (see Fig. 3). These nodes are interconnected by ATM circuits on 34/155 Mbps ATM accesses. Each university manages its own network to connect the local area networks of its faculties and departments with RedIRIS. The technology used and the management policies are not necessarily the same at the three sites. The computers in each cluster also vary from one site to another. For instance, the cluster in La Laguna is mainly made of Pentium III PCs, the cluster in Barcelona is made of PCs with AMD-K6 processors, and the cluster in Málaga provides a multiprocessor Digital AlphaServer and a set of Sun UltraSparc workstations. This heterogeneity introduces several levels of difficulty in the configuration of the MALLBA network.

Fig. 3. The MALLBA library uses computers connected through the Internet. The local area networks of the MALLBA project communicate through RedIRIS, the Spanish R+D network. [Network diagram: RedIRIS nodes in Canarias and Catalunya; Red Canaria, Red UPC, Anella Científica, CESCA and LSI; buildings C5 and C6; cluster machines ba1 to ba9 and ll1 to ll9; switches and routers; links of 10, 100 and 155 Mbps.]

Fig. 4. Local and remote communications between Barcelona and La Laguna. [Plot: MPI ping-pong time (secs., logarithmic scale) versus message size (bytes); series: BCN local, ULL local, MPI-ULL-BCN.]

TABLE I. Connection times (ms).

Link                Date         Min/Avg/Max     Packet loss
Málaga-Barcelona    09 March     27/45/625       31%
                    12 March     29/88/839       51%
                    14 March     77/121/732      40%
                    15 March     28/113/803      27%
                    16 March     79/124/767      28%
                    Avg ± Dev    98.2 ± 29.46    35.4 ± 9.05
Málaga-La Laguna    09 March     61/203/429      53%
                    12 March     56/288/726      56%
                    14 March     51/182/568      26%
                    15 March     68/315/652      34%
                    16 March     54/245/595      44%
                    Avg ± Dev    246.6 ± 49.94   42.6 ± 11.31
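The Avg ± Dev rows of Table I are consistent with the mean and the population standard deviation (dividing by the number of days, 5) of the five daily average times and of the five packet-loss percentages. This is an inference from the numbers, not something the paper states. A minimal sketch of that computation follows; the names Summary and summarize are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean and population standard deviation of a sample.
struct Summary {
    double mean;
    double dev;
};

Summary summarize(const std::vector<double>& xs) {
    assert(!xs.empty());
    double sum = 0.0;
    for (double x : xs) sum += x;
    const double mean = sum / static_cast<double>(xs.size());

    double squares = 0.0;
    for (double x : xs) squares += (x - mean) * (x - mean);
    // Assumption: divide by n (population deviation), which matches Table I.
    const double dev = std::sqrt(squares / static_cast<double>(xs.size()));
    return Summary{mean, dev};
}
```

Applied to the Málaga-Barcelona daily averages (45, 88, 121, 113, 124), it gives a mean of 98.2 and a deviation of about 29.46, matching the table.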

Network performance

For a better understanding of the communications on the WAN, we have been monitoring the average performance of the communications on the MALLBA network during the last year. We have carried out several experiments that provide useful information for executing algorithms on a WAN. They were performed during a typical working week (the differences with holidays are considerable). Table I shows the trip time of packets (ping) between MALLBA clusters. Trip times are already high within the mainland (Málaga-Barcelona), and they increase considerably when the communications go from the mainland to the island. The substantial delay and the high rate of packet loss observed imply that the algorithms need to be coded using a reliable communication service.

Figure 4 compares local and remote communications between the Barcelona and La Laguna clusters using a simple ping-pong communication pattern. Local times for the Barcelona cluster are smaller than local times for La Laguna because of the faster computers and LAN used. The ratio of La Laguna local time to Barcelona local time is about 2.27 for small messages and 1.37 for large ones. For remote communications, the ratio of remote to local time is 200 for small messages (4 bytes) and 100 for messages up to 4 Mb.

Middleware

We have developed a middleware communication tool (NetStream) so that the skeletons can be executed on either a LAN or a WAN without changes. Since the algorithmic skeletons are implemented at a high level of abstraction, the communication tool had to offer the same level of abstraction in order to permit an easy interaction between different algorithmic skeletons. The middleware was developed in three steps: (1) evaluation of existing tools such as PVM, MPI, Java RMI, CORBA and Globus [13]; (2) identification of the main services of our system, which had to be efficient, as standard as possible, and easy to maintain; and (3) implementation in C++. We decided to use MPI as the basis for this middleware. This choice was made for efficiency reasons, because MPI seems to be a widely accepted communication standard, and because it has also been included in recent systems. NetStream allows the skeletons to exchange information efficiently. To this end, we have minimized the number of parameters of the NetStream methods. The main available services can be classified as follows:

• Send-receive of primitive data types.
• Synchronization services.
• Basic management of parallel processes.
• Miscellaneous.
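As an illustration of what a stream-style service for sending and receiving primitive data types with very few parameters might look like, here is a minimal sketch. It is not NetStream and it does not use MPI: the class LoopbackStream and its members are hypothetical, and the "network" is just an in-memory byte buffer within one process. A real middleware would hand the bytes to a message-passing layer instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical in-memory stand-in for a communication stream.
class LoopbackStream {
    std::vector<unsigned char> buffer_;  // bytes "in flight"
    std::size_t read_pos_ = 0;           // next byte to deliver
public:
    // "Send": append the raw bytes of a primitive (trivially copyable) value.
    template <class T>
    LoopbackStream& operator<<(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only primitive-like types can be sent");
        const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
        buffer_.insert(buffer_.end(), bytes, bytes + sizeof(T));
        return *this;
    }

    // "Receive": copy the next sizeof(T) bytes into the destination.
    template <class T>
    LoopbackStream& operator>>(T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only primitive-like types can be received");
        assert(read_pos_ + sizeof(T) <= buffer_.size());
        std::memcpy(&value, buffer_.data() + read_pos_, sizeof(T));
        read_pos_ += sizeof(T);
        return *this;
    }

    // Number of bytes sent but not yet received.
    std::size_t pending() const { return buffer_.size() - read_pos_; }
};
```

With this interface a sender writes `s << 42 << 3.5;` and a receiver reads `s >> i >> d;` in the same order and with the same types. The operator syntax keeps the parameter count per call at one value, which is the design goal the text mentions; the receiver must know the types and order in advance, since no type tags are transmitted in this sketch.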

We have also developed some more advanced services, such as the management of processes in groups for use by the parallel implementations, and we are currently completing other services.

Monitoring tool

The MALLBA project includes a software tool that monitors the status of the network on which MALLBA applications run. Periodically, the monitoring tool activates itself and gathers information about the state of the machines and their connections. This gives the user of the MALLBA network the information needed for an optimal execution. With respect to the machines, the monitor gets data from the operating system, in many cases using the pseudo-files in the /proc directory. With respect to the connections, it is not possible to obtain meaningful values from the communication devices themselves (network cards, hubs, switches, etc.), which forces us to probe the network using message passing. For this purpose, packets of different sizes are sent between machines using MPI primitives. The monitoring tool has three modes of operation:

• Batch: the monitoring tool is launched on the machines and links to obtain data about the system, saving it to external memory on a designated machine. Once launched, it is not possible to interact with the tool, except to stop it.
• Analysis: a graphical tool presents to the user the data obtained in a previous execution in batch mode.
• Interactive: the monitoring tool is launched on the selected machines, but rather than saving the data, it sends them to a specific machine, where the graphical tool displays them (see Fig. 5).

Currently, a fourth mode of operation is being added to the tool. This mode will integrate the monitor in the middleware and provide an interface layer to other programs that wish to query the status of the network.

Fig. 5. The MALLBA library offers tools for monitoring the execution of the programs and the state of the network.

IV. Computational results

Our skeletons have been instantiated for several interesting problems. These include Quadratic Assignment, Minimum Weighted k-Cardinality Tree, Maximum Satisfiability, Maximum Cut, Minimum Linear Arrangement, 0-1 Multidimensional Knapsack and Bisection. In the following, we present some experimental results.

Tabu Search for Knapsack

Table II shows some results obtained with the Tabu Search skeleton instantiated for the 0-1 Multidimensional Knapsack Problem (MKNP) in an Independent Runs model and in a Master-Slave model. These results are obtained for six instances from the OR-Library [14]. The execution is done with different maximum execution times and different numbers of processors.

TABLE II. Results obtained for big instances of the 0-1 MKNP under the independent runs model (top) and the master-slave model (bottom) with 8 processors. n and m denote the number of items and the number of restrictions. The dev. values agree with (best cost known − best cost found) / best cost known.

Independent runs model

instance       n    m   best cost   best cost   average      dev.     iters.       total
                        known       found       cost found            (average)    time (s)
Instances executed with max time = 600s
OR5x100-00     100  5   24381       24211       24162.2      0.0070   1192054.6    619.8
OR5x100-29     100  5   59965       59667       59533.6      0.0050   1633099.8    620.8
OR10x100-00    100  10  23064       22478       22419.4      0.0254   753246.6     630.2
OR10x100-29    100  10  60633       60629       60627.2      0.0001   1001095.8    625.8
OR30x100-00    100  30  21946       21649       21623.0      0.0135   275664.4     624.9
OR30x100-29    100  30  60603       60503       60483.8      0.0017   364593.2     624.3
Instances executed with max time = 900s
OR5x250-00     250  5   59312       56759       56490.2      0.0430   804849.0     939.1
OR10x250-00    250  10  59187       56664       56405.8      0.0426   490034.4     926.0
OR10x250-29    250  10  149704      148414      148257.0     0.0086   623402.4     932.2
OR30x250-00    250  30  56693       54855       54762.4      0.0324   151330.6     936.5
OR30x250-29    250  30  149572      148617      148482.2     0.0064   222286.4     931.4
Instances executed with max time = 1200s
OR5x500-00     500  5   120130      114469      114070.0     0.0471   526232.0     1348.8
OR5x500-29     500  5   299904      297707      297453.8     0.0073   644637.2     1229.1
OR10x500-00    500  10  117726      112983      112254.0     0.0403   326128.0     1243.2
OR10x500-29    500  10  307014      304122      303849.2     0.0094   416856.2     1232.3
OR30x500-00    500  30  115868      111786      111464.2     0.0352   87063.6      1248.9
OR30x500-29    500  30  300460      298746      298687.6     0.0057   129835.6     1241.3

Master-slave model

instance       n    m   best cost   best cost   average      dev.     iters.       total
                        known       found       cost found            (average)    time (s)
Instances executed with max time = 600s
OR5x100-00     100  5   24381       24282       24052.2      0.0041   1838.0       618.8
OR5x100-29     100  5   59965       59896       59792.6      0.0012   1723.0       620.4
OR10x100-00    100  10  23064       22717       22670.0      0.0150   1079.8       622.5
OR10x100-29    100  10  60633       60629       60507.4      0.0001   1040.4       619.5
OR30x100-00    100  30  21946       21616       21493.8      0.0150   430.4        626.4
OR30x100-29    100  30  60603       60123       60072.2      0.0079   387.2        629.0
Instances executed with max time = 900s
OR5x250-00     250  5   59312       58660       58274.2      0.0110   771.0        921.5
OR10x250-00    250  10  59187       57674       57227.4      0.0256   492.4        929.1
OR10x250-29    250  10  149704      148151      147805.2     0.0104   368.2        937.7
OR30x250-00    250  30  56693       55323       54937.8      0.0242   234.5        937.5
OR30x250-29    250  30  149572      148582      148259.4     0.0066   199.0        960.2
Instances executed with max time = 1200s
OR5x500-00     500  5   120130      115552      112096.8     0.0381   281.4        1233.3
OR5x500-29     500  5   299904      295432      293429.4     0.0149   191.0        1272.5
OR10x500-00    500  10  117726      114161      113312.2     0.0303   253.6        1224.6
OR10x500-29    500  10  307014      303469      303322.6     0.0115   187.2        1272.8
OR30x500-00    500  30  115868      110133      109422.2     0.0495   141.0        1240.0
OR30x500-29    500  30  300460      297681      297614.4     0.0092   97.6         1289.4

We observe that the quality of the solutions obtained is very close to the best known one. Notice that the best known costs are obtained by ad hoc methods, while our solutions are found using a generic skeleton.

Hybridization for Minimum Linear Arrangement

In order to find good solutions for the Minimum Linear Arrangement problem, we have designed two versions, exact and chaotic, of a hybrid heuristic that combines the spectral methods and simulated annealing skeletons (see [15] for details). For the exact version, good speedups are observed, but quality degrades as the number of processors increases. By contrast, the chaotic version obtains an excellent speedup and maintains the solution quality (see Fig. 6). This heuristic runs in parallel on a cluster of PCs and obtains the best results known so far.

Fig. 6. Exact and chaotic parallel SS+SA on the airfoil1 graph: cost as a function of time depending on the number of processors. [Two plots for airfoil1: cost (la) versus time (seconds); series: sequential and 2 to 9 processors.]

V. Other research topics around MALLBA

The MALLBA project brings together a team of researchers and gives them access to a geographically distributed environment. It has therefore become a successful place holder for basic research on distributed and parallel computation. Here we briefly list some of the research topics around MALLBA.

Performance prediction of oblivious BSP programs. The BSP model can be extended with a zero-cost synchronization mechanism, which can be used when the number of messages to be received is known. This mechanism is usually known as oblivious synchronization. In [16], we presented an extension of the BSP complexity model that deals with oblivious barriers and showed its accuracy.

Refining BSP into explicit two-sided communications. The aim of [17] was to show how a verified BSP program may be transformed into a guaranteed correct conventional message-passing implementation which may, in certain circumstances, avoid the expense of providing a synchronization barrier.

The tuning problem on pipelines. Considerable efforts have been made both in pure theoretical analysis and in practical automatic profiling. In [18], we introduce a general performance prediction methodology based on the integration of analytical models and profiling tools. The accuracy of the proposal has been tested on the CRAY T3E for pipeline algorithms solving combinatorial optimization problems.

Support to ALGGEN. The algorithmics and genetics group (ALGGEN, part of LSI at UPC) is dedicated to research in computational biology and bioinformatics [19]. One of its main research fields is the DNA assembly problem: given a set of many short sequences (15000 sequences of length 1000 base pairs) extracted from a long DNA chain, the goal is to recover the original sequence. This problem is highly time consuming, and the MALLBA network allows us to solve it for complex real biological cases.

Software for the learning of Algorithmic Techniques. Software tools play an important role in the teaching and learning process. In [20], two tools are compared for the teaching of Algorithmic Techniques: Lingo and MALLBA.

VI. Conclusions and future work

We have presented a brief overview of the MALLBA project. At the end of this project we should be able to offer a user-friendly, geographically distributed (Internet based) generic library to solve optimization problems. So far, we have built skeletons and run them in sequential and LAN environments. We have also started to study the behavior of our underlying network in order to guide the design of the skeletons for WAN environments. Our results show that we will face difficult problems such as asynchronism, a high rate of communication faults, and the cooperation of algorithms that are at very different degrees of progress.

References

[1] M.R. Garey and D.S. Johnson, Computers and Intractability. A Guide to the Theory of NP-Completeness, W.H. Freeman and Co., 1979.
[2] C. Cotta and J.M. Troya, "A hybrid genetic algorithm for the 0-1 multiple knapsack problem," in Artificial Neural Nets and Genetic Algorithms 3, G.D. Smith, N.C. Steele, and R.F. Albrecht, Eds., Wien New York, 1998, pp. 251–255, Springer-Verlag.
[3] E. Alba and S. Khuri, "Applying evolutionary algorithms to combinatorial optimization problems," in Proceedings of the International Conference on Computational Science (ICCS'01), 2001, vol. 2074 (Part II) of Lecture Notes in Computer Science, pp. 689–700, Springer-Verlag.
[4] D.H. Wolpert and W.G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1(1), pp. 67–82, 1997.
[5] L. Davis, Handbook of Genetic Algorithms, Van Nostrand Reinhold, New York, 1991.
[6] E. Alba and J.M. Troya, "Improving flexibility and efficiency by adding parallelism to genetic algorithms," Statistics and Computing, 2001 (to appear).
[7] C. Cotta, J.F. Aldana, A.J. Nebro, and J.M. Troya, "Hybridizing genetic algorithms with branch and bound techniques for the resolution of the TSP," in Artificial Neural Nets and Genetic Algorithms 2, D.W. Pearson, N.C. Steele, and R.F. Albrecht, Eds., Wien New York, 1995, pp. 277–280, Springer-Verlag.
[8] D. Levine, "PGAPack, parallel genetic algorithm library," http://www.mcs.anl.gov/pgapack.html, 1996.
[9] S. Tschöke and T. Polzer, "Portable parallel branch-and-bound library," http://www.uni-paderborn.de/cs/agmonien/SOFTWARE/PPBB/introduction.html, 1997.
[10] K. Klohs, "Parallel simulated annealing library," http://www.uni-paderborn.de/fachbereich/AG/monien/SOFTWARE/PARSA/, 1998.
[11] F. Glover and M. Laguna, Tabu Search, Kluwer Academic Publishers, 1997.
[12] M.J. Blesa, Ll. Hernàndez, and F. Xhafa, "Parallel Skeletons for Tabu Search Method," Tech. Rep. LSI-00-81-R, Dept. de LSI, UPC, 2000. To appear in the IEEE Proceedings of the 8th International Conference on Parallel and Distributed Systems.
[13] E. Alba, C. Cotta, M. Díaz, E. Soler, and J.M. Troya, "Mallba: Middleware for a geographically distributed optimization system," Tech. Rep., Dept LCC, UM, 2000.
[14] J.E. Beasley, "OR-Library: Distributing Test Problems by Electronic Mail," Journal of Operational Research Society, vol. 41, no. 11, pp. 1069–1072, 1990.
[15] J. Petit, Layout Problems, PhD thesis, Universitat Politècnica de Catalunya, 2001, http://www.lsi.upc.es/~jpetit/Publications.
[16] J.A. Gonzalez, C. Leon, F. Piccoli, M. Printista, J.L. Roda, C. Rodriguez, and F. Sande, "Performance prediction of oblivious BSP programs," in EuroPar, 2001, Springer-Verlag.
[17] A. Stewart, M. Clint, J. Gabarró, and M.J. Serna, "Towards Formally Refining BSP barriers into Explicit two-sided Communications," in EuroPar, 2001, Springer-Verlag.
[18] L.M. Moreno, F. Almeida, D. Gonzalez, and C. Rodriguez, "The Tuning Problem of Pipelines," in EuroPar, 2001, Springer-Verlag.
[19] X. Messeguer, "Algorithmics and Genetics Group," http://www.lsi.upc.es/~alggen, 2001.
[20] I. Dorta, P. Dorta, A. Rojas, and C. León, "Utilización de software en la docencia de técnicas algorítmicas," in JENUI 2001, 2001.