Globecom 2014 Workshop - Wireless Networking and Control for Unmanned Autonomous Vehicles

A Combinatorial Auction Framework for Decentralised Task Allocation

Pau Segui-Gasco*, Hyo-Sang Shin*, Antonios Tsourdos*, and V.J. Seguí†

* Cyberphysical Systems Centre, Institute of Aerospace Science, School of Aerospace, Transport and Manufacturing, Cranfield University, MK43 0AL, United Kingdom, Email: [email protected]

† Dept. of Mechanical Engineering and Material Science, Escuela Politécnica Superior de Alcoy (EPSA), Universidad Politécnica de Valencia (UPV), Plaza Ferrandiz y Carbonell 1, 03801, Alcoi, Spain

Abstract—This paper proposes a framework for decentralised task allocation in which agents find a global solution by sharing local performance metrics. Leveraging the PAUSE procedure for Combinatorial Auctions, we outline a decentralised version of it in order to solve a task allocation problem. The algorithm has polynomial communication overheads and can accommodate a variety of objective functions and task dependencies. To investigate its performance, numerical experiments were carried out and their results were compared with those of an optimal solver and a representative alternative algorithm, namely CBBA, for three representative objective functions: MinSum, MinMax and MinAve. The results show that, overall, the proposed algorithm outperforms the current state-of-the-art alternative. Moreover, in these experiments it always found optimal solutions for MinSum, while it remained close to the optimal for MinMax and MinAve.

Keywords: Decentralised Task Allocation, Groups of Multiple UAVs, Combinatorial Auction, Distributed Robotic Systems, PAUSE.

I. INTRODUCTION

Teams of multiple UAVs are gaining popularity as an alternative to single-asset solutions thanks to their versatility, resilience and distributed nature. In order to maximise the operational capability of this new concept, researchers are steering their operation towards a network-based system in which a plurality of vehicles with increased autonomy perform tasks as part of a team [1]. In such a system, remote pilots would be replaced by network operators who inject tasks into the network and supervise their execution, rather than individually piloting each of the vehicles, which hampers scalability. The fundamental strength of this approach hinges on: the flexibility provided by the availability of a variety of payloads in different spatial locations at the same time; the resilience offered by the multiplicity of low-cost vehicles that are able to take over from each other in case of a fault; and the ability to fly outside the communication range of the ground operator by using team vehicles as relays to communicate data back to the base.

The successful and effective assignment of the available resources will be the key enabler of such a vision, in which operational advantages could be maximised. The vehicles should be able to find an answer quickly, reliably, and effectively to the question: “given the resources available in the network and the tasks that ought to be carried out, what is the best allocation of these tasks among us?”. This is known as the task allocation problem, and it features the following characteristics: heterogeneous vehicles with different kinds of information available; some vehicles can communicate with some other vehicles but not all; and different agents are at different locations, carry heterogeneous payloads and have different flight characteristics. These characteristics give the system its power but, at the same time, they make the task allocation problem very difficult to solve.

978-1-4799-7470-2/14/$31.00 ©2014 IEEE

In general, the task allocation problem is NP-Hard in all but its simplest incarnations [2], [3]. There exist tools in the Operations Research and Combinatorial Optimisation literature that enable the centralised solution of instances of practical interest. However, a centralised solution requires communicating all the agent and environment data to a central entity. This may not be the most appropriate approach for some scenarios, because the central entity removes resilience by introducing a single point of failure and because the bandwidth needed to communicate all the information from every agent to the central entity may not be available in some communication environments.

Recently, researchers have been exploring ways to solve the task allocation problem in a decentralised manner to overcome the limitations of the centralised solution [4], relying on the local assessment of the objective functions to avoid sharing all the information globally. However, the field is still in its infancy, and guaranteed optimal solutions in polynomial time with polynomial communication costs exist only for the simplest instances [5]. A limited number of works offer approximation guarantees and polynomial communication costs [6]. Nevertheless, these decentralised approaches impose restrictive conditions on the objective functions that can be used. This research develops a framework that has comparable communication overheads but does not restrict the kind of cost functions that can be used.

II. TASK ALLOCATION PROBLEM

In order to describe the task allocation problem tackled in this study, let us first define the set of agents participating in the allocation as $A$, and the set of tasks to be allocated as $T$. The objective is to find the allocation of non-overlapping bundles in $\mathcal{P}(T)$ to agents in $A$ that spans all the tasks in $T$ and optimises the objective function. Here $\mathcal{P}(T) = \{b \mid b \subseteq T\}$ denotes the power set of $T$, which is the set of all possible subsets or bundles of $T$, including $\emptyset$ and $T$ itself. The objective function $J : (\mathbb{R}^+ \times \{0, 1\})^{|\mathcal{P}(T)| \cdot |A|} \to \mathbb{R}^+$ maps the performance metrics of each bundle of each agent and the allocation variables to a positive real number quantifying the allocation cost or score.

The performance metric (or valuation) is individual to each agent, as different agents might have different payloads and/or capabilities as well as different information about the tasks. Thus, each agent $a \in A$ has its own function, $c_a : \mathcal{P}(T) \to \mathbb{R}^+$, to evaluate the score of a bundle of tasks $b \subseteq T$. This evaluation is entirely local and is based only on the information available to each individual agent. A binary decision variable $x_{ab}$ takes the value 1 if bundle $b \in \mathcal{P}(T)$ is allocated to agent $a \in A$, and 0 otherwise. Summing up, the problem can be formulated as follows:


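$$
\begin{aligned}
\text{optimize} \quad & J\big((c_a(b), x_{ab}), \dots\big), && \forall a \in A,\ \forall b \subseteq T \\
\text{subject to:} \quad & y \cap z = \emptyset, && \forall y, z \in S,\ y \neq z \\
& \bigcup_{b \in S} b = T \\
& x_{ab} \in \{0, 1\}, && \forall a \in A,\ \forall b \subseteq T \\
& S = \{h \mid a \in A,\ h \subseteq T,\ x_{ah} = 1\}
\end{aligned}
$$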

where $S$ represents the set of allocated bundles. The first constraint means that no two allocated bundles can have overlapping tasks, and the second constraint means that the union of all allocated bundles must be equal to the task set $T$. In other words, all tasks should be allocated, each exclusively to one agent. To illustrate this formulation, an example of a problem that can be cast as an instance of this general framework is the Multiple Traveling Salesman Problem (mTSP), with the tasks being cities or waypoints, the agents' performance metrics being the distance travelled to visit a given bundle of cities, the objective function being the sum of the distances travelled in the allocated bundles, and the optimisation being a minimisation.

A. Solution Approaches

1) Problem Space: This general framework can accommodate most of the relevant task allocation problems. The different definitions of the agents' performance metrics and of the objective function give rise to a variety of combinatorial problems. A widely accepted taxonomy that maps the nature of each problem to a well-known combinatorial problem was introduced in [2] and was extended to support task dependencies in [3]. These two works provide a map of the task allocation problem space. With reference to the taxonomy in [2], the framework we propose aims to tackle problems with: both Single-Task robots (ST) and Multi-Task robots (MT); Single-Robot tasks (SR); and both Instantaneous Assignment (IA) and Time-Extended Assignment (TA). With respect to the task dependencies taxonomy in [3], the framework proposed in this work can accommodate: No Dependencies (ND), In-Schedule Dependencies (ID) and some instances of Cross-Schedule Dependencies (XD).
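As a concrete illustration of the general formulation of Section II, the mTSP-style example mentioned above can be solved by exhaustive enumeration for a toy instance. The following is a minimal sketch under stated assumptions, not part of the proposed algorithm: the names `route_cost` and `solve_allocation`, the coordinates, and the use of open routes (agents do not return to their start) are all hypothetical choices made for illustration. Assigning every task to exactly one agent satisfies both constraints by construction.

```python
from itertools import permutations, product
from math import dist, inf

def route_cost(start, waypoints):
    """c_a(b): length of the shortest open route from `start` through all waypoints."""
    if not waypoints:
        return 0.0
    best = inf
    for order in permutations(waypoints):
        length, pos = 0.0, start
        for point in order:
            length += dist(pos, point)
            pos = point
        best = min(best, length)
    return best

def solve_allocation(agents, tasks, objective=sum):
    """Enumerate every assignment of tasks to agents and keep the cheapest one."""
    names = list(agents)
    best_cost, best_alloc = inf, None
    # Each tuple in the product names the owner of every task, so bundles
    # never overlap and their union is always the full task set.
    for owners in product(names, repeat=len(tasks)):
        bundles = {a: [t for t, o in zip(tasks, owners) if o == a] for a in names}
        cost = objective(route_cost(agents[a], bundles[a]) for a in names)
        if best_cost > cost:
            best_cost, best_alloc = cost, bundles
    return best_cost, best_alloc

# Hypothetical toy instance: two agents and three waypoints on a line.
agents = {"a1": (0.0, 0.0), "a2": (10.0, 0.0)}
tasks = [(1.0, 0.0), (2.0, 0.0), (9.0, 0.0)]
cost, alloc = solve_allocation(agents, tasks)
print(cost, alloc)
```

Passing `objective=max` instead of the default `sum` turns the same enumeration into a MinMax problem. The enumeration grows as $|A|^{|T|}$, which is why it is only useful as a reference for very small instances.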
2) Decentralised Solution Approaches: Over the past decades, researchers have developed a variety of fast and efficient decentralised task allocation algorithms for the simplest instance of the problem, that is, where each agent is allocated at most one task [7]. The first distributed task allocation strategy was proposed in [8], where an auction algorithm was built on the idea of a shared memory model. However, the shared memory model requires a topology of the networked system that is not always achievable in real scenarios. To address this issue, the algorithm proposed in [5] handles a networked system in which agents interact with their neighbours, rather than having access to a shared database. Another approach, presented in [9] and based on task swapping, also yields a globally optimal solution. In [10], a recent application of a qualitatively similar algorithm to UAV task allocation in a dynamic environment is described alongside an account of its performance in real flight. Similar to [5], in [6] the authors present the Consensus-Based Auction Algorithm (CBAA). This algorithm uses the concept of maximum consensus to distribute a series of single-item auctions over the network, achieving guarantees of both convergence (if the network is connected) and of an optimal assignment.

To solve a more complex instance of the task allocation problem, in which agents are allocated bundles of tasks instead of individual tasks, in [6] Choi et al. present the Consensus-Based Bundle Algorithm (CBBA). It is based on a decentralisation of a greedy Combinatorial Auction algorithm. CBBA gives guarantees on convergence and on approximation to the optimal, and it is distributed across the network with no central entity needed. To achieve this, it imposes a submodularity, or Diminishing Marginal Gains (DMG), condition on the bids of the agents. Each agent submits a bid for each given task that is computed as the marginal reward obtained by adding that task to its current bundle.

Several authors have extended CBBA to overcome some of its limitations. In [11] the authors adapt CBBA for a realistic scenario with obstacles and measurement noise. In [12] the authors introduce tasks with time windows by using a decaying reward function, support for changing communication networks, and fuel-cost-aware rewards. Recently, the same group [13], [14], [15] tackled the limitation of DMG task scoring by using warping functions so that the bids appear (sic) as if they were submodular in the consensus space while they are handled as non-DMG in the agent's own domain. This allows for improved synergies within the bundles, but at the cost of surrendering all performance guarantees.

B. Proposed Framework

In this research, we aim to find task allocation approaches that remove the restrictions of approaches such as CBBA while keeping the fundamental advantages of their decentralised nature. CBBA leverages a simple auction model: agents add tasks to their bundles and send single-task bids with the marginal value that each of the tasks creates. The auctioneer simply awards the task to the agent whose marginal gain is the highest. Such a simple auctioneer mechanism can be naturally translated into a set of consensus rules that perform the role of the auctioneer and distribute the algorithm across the network.
The key lesson is: find a framework with a very efficient auctioneer role and then substitute that role with a consensus rule to achieve natural decentralisation through consensus. Unfortunately, combinatorial auctions are characterised precisely by a very hard auctioneer role; in fact, the general Winner Determination Problem is NP-Hard [16]. To get around this, in this research we use a framework called the Progressive Adaptive User Selection Environment (PAUSE), conceived for government auctions of telecom licenses [17]. It was intended as a way to simplify the auctioneer role in combinatorial auctions, so that all the companies involved could easily see and understand the decisions being taken, giving confidence to the bidders by guaranteeing the transparency and fairness of the final decision.

Naturally, the computational burden cannot be avoided: the computational load is not removed but rather transferred from the auctioneer to the agents. This is accomplished by requiring them to submit a composite bid that encompasses not only their own bundles but also those of other agents, in such a way that all the tasks are assigned and each task is assigned to only one agent. Hence, the role of the auctioneer is simply to keep a record of the composite bids submitted and to award the tasks to the agents contained in the composite bid whose payoff is the highest. With such an efficient role for the auctioneer, the decentralisation of the algorithm with consensus rules is natural. In fact, there have already been some attempts to use the PAUSE framework for distributed P2P auctioning in e-commerce and business procurement applications [18], [19], with a view to enabling decentralised internet auctions by removing the room for malicious bias from a centralised, and possibly untrustworthy, internet auctioneer.

Now, we give an intuitive description of the PAUSE mechanism following that of Land et al. [20]. A PAUSE auction for $m$ items is


a multi-stage auction of $m$ stages. In each stage, each agent must send to the auctioneer a composite bid that covers all the items. In the first stage, each agent submits a composite bid composed only of single-item bids of its own. In the second stage, each agent submits a composite bid built from a combination of the single-item bids previously submitted by all agents at stage 1 and two-item bids of its own. In the third stage, each agent submits a composite bid that contains one-item and two-item bids and its own three-item bids. The process continues and, in the general $n$th stage, each agent submits a composite bid that is a combination of the $(1, 2, \dots, n-1)$-item bids submitted in the previous rounds by all the agents and the agent's own $n$-item bids. In each round, each agent can either submit a bid, if it improves the current best composite bid by a design threshold, or just wait and listen to the other agents if it cannot improve it. Note that with this mechanism, if an agent improves the bid in the final round, it will be because it is submitting a composite bid that contains only a single bid that spans all the items.

With this mechanism, the job of the auctioneer is simply to record all the bids with their valuations and to keep a database of the submitted bids available to the agents. The winners are then simply the agents contained in the composite bid with the highest score. This mechanism was devised to improve the sense of fairness and transparency in government licensing. However, the fact that the auctioneer role is so efficient makes it naturally decentralisable.

Leveraging the key ideas in CBBA and in the PAUSE framework, in this paper we propose a decentralised framework which seeks the optimal allocation for a wider set of task allocation scenarios than current approaches such as CBBA.
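To make the lightness of the auctioneer role described above tangible, the following minimal sketch implements only the record-keeping side of a PAUSE-style auction. It is an illustration under stated assumptions rather than the mechanism of [17] or [20]: the class name `Auctioneer`, its methods, the use of the sum of bid values as the composite score, and the example bidders are all hypothetical. The hard combinatorial work of building a good composite bid is left entirely to the bidders.

```python
class Auctioneer:
    """Record keeper for a PAUSE-style auction (illustrative; highest score wins)."""

    def __init__(self, items, threshold=1.0):
        self.items = frozenset(items)
        self.threshold = threshold          # minimum improvement a new bid must bring
        self.database = []                  # individual bids published so far
        self.best = None                    # best composite bid accepted so far
        self.best_score = float("-inf")

    def submit(self, composite_bid):
        """Accept a composite bid given as (bundle, agent, value) triples."""
        bids = [(frozenset(bundle), agent, value)
                for bundle, agent, value in composite_bid]
        covered = [item for bundle, _, _ in bids for item in bundle]
        # Every item must be covered exactly once across the bundles.
        if sorted(covered) != sorted(self.items):
            return False
        score = sum(value for _, _, value in bids)  # assumed scoring rule
        if self.best_score + self.threshold > score:
            return False                    # does not improve by the threshold
        self.database.extend(bids)          # publish the bids to all agents
        self.best, self.best_score = bids, score
        return True

    def winners(self):
        """Agents and bundles contained in the best composite bid."""
        return [(agent, bundle) for bundle, agent, _ in (self.best or [])]

# Hypothetical two-item auction with bidders "ann" and "bob".
auctioneer = Auctioneer(items={"x", "y"}, threshold=1.0)
first = auctioneer.submit([({"x"}, "ann", 4.0), ({"y"}, "bob", 3.0)])
second = auctioneer.submit([({"x", "y"}, "bob", 7.5)])   # improves by only 0.5
third = auctioneer.submit([({"x", "y"}, "bob", 9.0)])
print(first, second, third, auctioneer.winners())
```

Validating and comparing a composite bid takes time linear in its size, which is the property that makes the role easy to replace with a consensus rule.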

III. DECENTRALISED TASK ALLOCATION ALGORITHM

We now describe the proposed algorithm formally, after introducing some notation. Recall that we defined a function $c_a : \mathcal{P}(T) \to \mathbb{R}^+$; thus $c_a(b) \in \mathbb{R}^+$ is the performance metric (or valuation) of agent $a \in A$ executing the bundle of tasks $b \in \mathcal{P}(T)$. Examples of this metric are simple parameters such as travelled distance or flying time, more elaborate ones such as information collected, or a combination of agent-dependent rewards and costs, among other things.

A bid $B_b^a$ is a tuple $(b, a, c_a(b)) \in \mathcal{P}(T) \times A \times \mathbb{R}^+$ that contains, respectively, the set of tasks involved $b \subseteq T$, the agent responsible $a \in A$, and its score $c_a(b) \in \mathbb{R}^+$. The evaluation of these bids is fully local: it is produced by each individual agent and shared as part of a composite bid. For example, consider the simple case where the performance index is the flight time and the tasks are waypoints to visit. Then, a bid $(b, a, c_a(b)) \in \mathcal{P}(T) \times A \times \mathbb{R}^+$ would be computed locally by agent $a \in A$. Agent $a$ would calculate that the time (in, say, seconds) needed to visit each task contained in $b$ from its position is $t_b$ (that is, it would evaluate $t_b = c_a(b)$ in our notation). Subsequently, in order to inform other agents about its capability to perform the tasks in $b$ (i.e. visit the waypoints in $b$), it would aggregate this information with the task set $b$ and its own identifier $a$, producing the bid $(b, a, t_b)$. Hence, all that the other agents in the set $A \setminus \{a\}$ “know” about the performance of agent $a$ visiting the tasks in $b$ is that it takes a time (in, say, seconds) given by the real number $t_b$; they cannot evaluate the function themselves.

Now, a composite bid $CB_k^a$ is the solution to the bidding problem at stage $k$ by agent $a \in A$: it is the set of bids that spans all the tasks and does not contain overlapping bids, while optimising the objective function $J : (\mathbb{R}^+ \times \{0, 1\})^{|\mathcal{P}(T)| \cdot |A|} \to \mathbb{R}^+$, which depends on the pairs of agents' valuations and decision variables. A composite bid is what agents exchange among themselves, and it is the result of each agent solving the allocation problem at each stage with the information available to it at that point in time.

The information that an agent $a \in A$ keeps about the other agents is stored in a set $S_k^a$ that contains all the bids in the composite bids previously exchanged by other agents in the stages $1, 2, \dots, k-1$. More formally, $S_k^a = S_{k-1}^a \cup \big(\bigcup_{i \in A,\, i \neq a} CB_{k-1}^i\big)$ with $S_0^a = \emptyset$. This set $S_k^a$ is used in conjunction with the agent's own bids to find the subsequent solution to the allocation problem, producing the composite bid $CB_k^a$ that is exchanged with fellow agents. A set $D_k^a$ is kept by agent $a \in A$, containing the bids exchanged previously by other agents and any of its own whose size is smaller than or equal to the corresponding bid stage, i.e. $D_k^a = S_k^a \cup \{(b, a, c_a(b)) \mid \forall b \subseteq T,\ |b| \le k\}$. Intuitively, $D_k^a$ is the pool from which agent $a \in A$ draws bids to find the optimal allocation resulting in the composite bid of stage $k$.

With all this, the bidding problem for agent $i \in A$ at stage $k$, given a set of bids $D_k^i$, is to find a non-overlapping set of bids that contains each task in $T$ exactly once and optimises the objective function of the allocation. This problem can be formulated as:

$$
\begin{aligned}
\text{optimize} \quad & J\big((c_a(b), x_{ab}), \dots\big), && \forall (b, a, c_a(b)) \in D_k^i \\
\text{subject to:} \quad & y \cap z = \emptyset, && \forall y, z \in S,\ y \neq z \\
& \bigcup_{b \in S} b = T \\
& x_{ab} \in \{0, 1\}, && \forall (b, a, c_a(b)) \in D_k^i \\
& S = \{h \mid (h, a, c_a(h)) \in D_k^i,\ x_{ah} = 1\}
\end{aligned}
$$
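The bidding problem is an exact-cover selection over a pool of bids, and for very small pools it can be solved by exhaustive search. The sketch below is one possible illustration, assuming a minimisation objective applied to the costs of the selected bids; the function names `own_bids` and `compute_bid`, the cost table, and the received bids are hypothetical and stand in for whatever solver an implementation would actually use. As in the formulation, a selection may contain several bundles of the same agent.

```python
from itertools import combinations
from math import inf

def own_bids(agent, cost_fn, tasks, k):
    """The agent's own bids (b, a, c_a(b)) for every non-empty bundle of size at most k."""
    return {
        (frozenset(bundle), agent, cost_fn(frozenset(bundle)))
        for size in range(1, k + 1)
        for bundle in combinations(sorted(tasks), size)
    }

def compute_bid(pool, tasks, objective=sum):
    """Select non-overlapping bids from `pool` that cover `tasks` at minimal objective."""
    ordered = sorted(pool, key=lambda bid: (sorted(bid[0]), bid[1]))
    best = (inf, None)

    def search(remaining, chosen):
        nonlocal best
        if not remaining:
            cost = objective(c for _, _, c in chosen)
            if best[0] > cost:
                best = (cost, frozenset(chosen))
            return
        pivot = min(remaining)              # some selected bid must cover this task
        for bid in ordered:
            bundle = bid[0]
            if pivot in bundle and bundle.issubset(remaining):
                search(remaining - bundle, chosen + [bid])

    search(frozenset(tasks), [])
    return best

# Hypothetical stage k = 2 for agent "a1", after receiving the stage-1 bids of "a2".
tasks = {"t1", "t2"}
costs_a1 = {frozenset({"t1"}): 4.0, frozenset({"t2"}): 1.0,
            frozenset({"t1", "t2"}): 4.5}
received = {(frozenset({"t1"}), "a2", 2.0), (frozenset({"t2"}), "a2", 5.0)}
pool = received | own_bids("a1", costs_a1.__getitem__, tasks, k=2)
cost, composite_bid = compute_bid(pool, tasks)
print(cost, sorted(composite_bid, key=lambda bid: bid[1]))
```

In this toy case the cheapest cover mixes a bid received from `a2` with one of the agent's own bids, which is exactly the kind of composite bid the stage structure is meant to produce. Swapping `objective=max` in gives a MinMax variant of the same search.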

Having defined the bidding problem, we can formally define a composite bid as the set $CB_k^i$ that contains all the bids selected in the solution of the bidding problem that agent $i$ solves in stage $k$, i.e. $CB_k^i = \{(b, a, c_a(b)) \mid (b, a, c_a(b)) \in D_k^i,\ x_{ab} = 1\}$. We will assume that agent $i \in A$ solves the bidding problem in round $k$ optimally by calling the routine COMPUTEBID($S_k^i$, $c_i()$, $k$). In practice this routine must be adapted to each individual problem, with its performance metrics and objective function. The algorithm used to solve the bidding problem is not relevant as long as it provides the optimal solution.

With these definitions we can now introduce the proposed task allocation algorithm. Initially, the agents have a copy of the set $T$ containing all the tasks that must be allocated and of the set $A$ of all the participating agents in the network. The algorithm proceeds in stages $k \in \{1, 2, \dots, |T|\}$. In the first round, agent $i$ has not received any information from the other agents, hence $S_1^i$ is empty and, since $k = 1$, $D_1^i$ contains only bids of agent $i$ itself. Thus there can only be $|T|$ bundles and, consequently, only one possible solution to the bidding problem: $CB_1^i$ contains a single-task bid for each of the tasks in $T$. At stage $k = 2$, agent $i$'s bid set $S_2^i$ contains all the single-task bids exchanged at the end of round $k = 1$ for each of the tasks in $T$ by each of the agents in $A$, and the agent finds the best allocation among them and two-task bids of its own. The process continues until $k = |T|$, when finally the best allocation is the composite bid with the best objective. The algorithm for an agent $i \in A$ is outlined in figure 1.

procedure TASKALLOCATION($A$, $T$, $i$, $c_i()$)
    for $a \in A$ if $a \neq i$ do
        $CB_0^a \leftarrow \emptyset$
    end for
    $S_0^i \leftarrow \emptyset$
    for $k \in \{1, 2, \dots, |T|\}$ do
        for $a \in A$ if $a \neq i$ and $k > 1$ do
            $CB_{k-1}^a \leftarrow$ RECEIVEDBID($a$, $k-1$)
        end for
        $S_k^i \leftarrow S_{k-1}^i \cup \bigcup_{a \in A,\, a \neq i} CB_{k-1}^a$
        $CB_k^i \leftarrow$ COMPUTEBID($S_k^i$, $c_i()$, $k$)
        SENDBID($CB_k^i$)
    end for
    for $a \in A$ if $a \neq i$ do
        $CB_{|T|}^a \leftarrow$ RECEIVEDBID($a$, $|T|$)
    end for
    $CB^* \leftarrow \mathrm{OPT}_{a \in A}\, CB_{|T|}^a$
    return $CB^*$
end procedure

Fig. 1. Task Allocation Algorithm for agent $i \in A$

IV. DISCUSSION

We have developed this algorithm bearing two main objectives in mind: to be decentralised with a communication overhead comparable with the state of the art; and to remove the restrictions on cost metrics that current state-of-the-art algorithms have.

Now, let us make some remarks on the locality of the valuation functions and its implications. Bundle valuations by each agent do not change over the execution of the algorithm. It is assumed that once an agent has shared a bid $(b, a, c_a(b))$ in a composite bid, the valuation $c_a(b)$ that agent $a$ makes of bundle $b$ does not change in subsequent rounds. This is a reasonable assumption, as it only requires differentiated time scales for the allocation procedure and for the mission execution. This is usually the case in practice, because task execution implies that the agent must travel to different spatial locations, which takes much longer than the allocation procedure.

The valuation of the bids by each agent $a \in A$, namely the function $c_a : \mathcal{P}(T) \to \mathbb{R}^+$, is fully local. That is, given a bundle of tasks $b \in \mathcal{P}(T)$, only agent $a$ can compute the valuation (or performance index) $c_a(b)$ of the bundle $b$. All the other agents in $A \setminus \{a\}$ only know the valuation given by agent $a$ in the bid, not the valuation function, and cannot infer the valuation of other bundles that have not been previously shared by $a$. This locality is fundamental to enable the cooperation of heterogeneous agents with different capabilities and different levels of situational awareness. This is because each agent can account locally, in its function $c_a : \mathcal{P}(T) \to \mathbb{R}^+$, for whether it has enough information to perform a task, whether it has the payload needed, whether it can fly to the task fast enough, etc. This enables the network controller to trade off between delaying the start of the allocation procedure, to spend more time exchanging situational awareness in order to have more accurate valuations, and tolerating some level of situational awareness discrepancy in exchange for an earlier start of the allocation procedure.

Now, we shall briefly discuss the communication requirements of our algorithm and its computational complexity. Then, we shall present experimental results of its performance, comparing it against CBBA with three representative objective functions that are common in multi-robot routing problems: MinSum, MinMax, and MinAve, followed by a discussion of the numerical results.

A. Communication Requirements

In this algorithm it is assumed that all the agents receive the composite bids of all the other agents, be it through direct communication or through a mesh protocol. Using this algorithm, each agent exchanges $|T|$ composite bids. With the agents in a connected network with diameter $D_m$, the number of messages exchanged is $\#msg = O(|A| \cdot |T| \cdot D_m)$. The message size is proportional to the number of tasks $|T|$. Thus, it has the same scalability as CBBA with respect to communication overheads.

B. Computational Complexity

Solving the task allocation problem has been shown to be NP-Hard [2], [4]. In this case, the computational complexity of the algorithm is determined by the bidding problem, which is NP-Hard since it is an instance of the task allocation problem of smaller size; solving this problem is what takes most of the computational effort. In the formulation outlined in the previous section the problem has, because of its generality, an exponential number of variables and constraints, and it is not easily solvable in practice. In reality this is not necessarily the case because, depending on the specific score functions, it can be reformulated as an adapted version of a well-known problem. Such problems have solution methods that are proven in practice; examples are the traveling salesman, set packing and scheduling problems, to name but a few. These methods can be used by including an extra constraint on the bundle size to match the stage and by incorporating the solutions of the other agents. In doing so, although the problem remains NP-Hard, there are algorithmic tools available, such as state-of-the-art MIP solvers [21] or problem-specific algorithms, that enable the solution of instances of practical relevance.

C. Numerical Results

In order to explore the performance of our algorithm we have chosen the widely studied field of multi-robot routing, that is, how to find optimal paths for a group of robots. Given a set of agents and locations to visit (tasks), there are three main objective functions that can be considered the “eigenvectors” of the performance metrics used in routing problems:

• MinSum: minimises the total sum of the costs (distances) over all agents;
• MinMax: minimises the maximum cost (distance) incurred by the agents; and
• MinAve: minimises the average cost (distance) incurred by the agents from start to the visit of each location.
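The three objectives listed above can be evaluated for a given set of routes as in the following minimal sketch. It is an illustration under stated assumptions: the function names and the coordinates are hypothetical, Euclidean distance is used as the cost, and MinAve is read as the mean of the cumulative distances travelled when each location is reached, which is one interpretation of the definition given in the list.

```python
from math import dist

def arrival_distances(start, route):
    """Cumulative distance travelled when each waypoint of `route` is reached."""
    travelled, pos, arrivals = 0.0, start, []
    for point in route:
        travelled += dist(pos, point)
        arrivals.append(travelled)
        pos = point
    return arrivals

def evaluate(plans):
    """Score a list of (start, ordered route) pairs under the three objectives."""
    arrivals = [arrival_distances(start, route) for start, route in plans]
    lengths = [a[-1] if a else 0.0 for a in arrivals]   # total distance per agent
    visits = [d for a in arrivals for d in a]           # one entry per location
    return {
        "MinSum": sum(lengths),
        "MinMax": max(lengths),
        "MinAve": sum(visits) / len(visits) if visits else 0.0,
    }

# Hypothetical plan: agent 1 visits two locations, agent 2 visits one.
plans = [
    ((0.0, 0.0), [(3.0, 4.0), (3.0, 0.0)]),
    ((10.0, 0.0), [(10.0, 2.0)]),
]
scores = evaluate(plans)
print(scores)
```

Here agent 1 travels 5 and then 4 units and agent 2 travels 2 units, so the three values differ even though the routes are the same. This is why an allocation that is optimal for one objective need not be optimal for another.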

As baselines to compare the performance of our algorithm we take a globally optimal solution and warped CBBA. The optimal solution was computed using the commercial MIP solver Gurobi [21]. CBBA [6] was chosen as a baseline because it has similar communication scalability; however, because CBBA does not converge when the objective function does not satisfy the DMG condition, we used the warped extension provided by Johnson et al. [13]. Gurobi was also used to solve the bidding subproblems of each agent. We conducted a Monte Carlo simulation with 400 runs for each task number and metric to investigate the trends in the solution performance for each metric as the size of the problem increases. To this end, a random scenario was created each time by placing 4 agents in uniformly random locations of a 1000 m by 1000 m area; the tasks were also placed uniformly at random. The performance metrics were the distances travelled by the agents. Once a scenario was solved with the three algorithms, the objective values of CBBA were normalised with respect to the optimal by:

$$ sc_{CBBA} = \frac{CBBA - OPT}{OPT} \cdot 100\ (\%) \qquad (1) $$

and the objective values of PAUSE by:

$$ sc_{PAUSE} = \frac{PAUSE - OPT}{OPT} \cdot 100\ (\%) \qquad (2) $$

where OPT, PAUSE, and CBBA represent the objective scores attained by the optimal solver, PAUSE and CBBA respectively. A summary of the relative scores for each of the objectives is shown in figures 2, 3 and 4, and the means of the relative scores for each case are shown in table I.

TABLE I
MEAN RELATIVE SCORES (%) OF PAUSE AND CBBA WRT THE OPTIMAL.

                        Task Number
Algorithm  Metric    5      6      7      8      9
PAUSE      MinSum    0      0      0      0      0
PAUSE      MinMax    0      0.08   0.08   0.17   0.34
PAUSE      MinAve    0      0.04   0.04   0.09   0.2
CBBA       MinSum    1.94   2.95   3.45   3.68   4.35
CBBA       MinMax    9.84   9.65   12.9   12.31  13.36
CBBA       MinAve    4.15   3.45   5.01   5.77   5.83

While the number of tasks is small, ranging only from 5 to 9, some conclusions can be drawn regarding the underlying trends. For the MinSum metric, the PAUSE algorithm gave a relative score of 0% in each and every one of the cases, i.e., it found the optimal solution every time, whereas for the same metric CBBA's performance gradually degraded as the task number increased. For the MinMax and MinAve metrics, PAUSE found an optimal solution most of the time, but not always, and its performance degraded gradually as the task number increased. In all three objectives PAUSE provides significantly better results than CBBA: its mean relative score never exceeds 0.34%, whereas that of CBBA ranges from 1.94% to 13.36%. This is probably due to the fact that PAUSE is able to consider a richer description of the synergies within the bundles, whereas CBBA only considers the marginal values.

V. CONCLUSIONS AND FUTURE WORK

We have presented a decentralised algorithm for the Multi-Robot Task Allocation Problem with communication costs comparable to those of the state of the art, such as [6]. To demonstrate its performance we have conducted numerical experiments with representative performance metrics from the routing domain: MinSum, MinMax and MinAve. The results show that the proposed algorithm matched the optimal in every run for the MinSum objective and remained very close to it for MinMax and MinAve. Our algorithm yields an overall superior performance over a comparable state-of-the-art decentralised task allocation algorithm, CBBA [6]. The fact that PAUSE always found an optimal solution for the MinSum objective is very interesting and could indicate that the algorithm could be proved optimal for some objective functions of practical interest.
The theoretical study of how the performance of the algorithm depends on the structural features of the objective function is an open research area. This area should be explored with the aim of establishing optimality and approximation guarantees for relevant objective functions. More work should also be devoted to the development of a consensus algorithm that supports dynamic communication networks, to bring the applicability of the framework closer to real-world scenarios.

Fig. 2. Comparison of the distribution of the costs of CBBA using warping functions with the proposed algorithm for the MinSum metric.

Fig. 3. Comparison of the distribution of the costs of CBBA using warping functions with the proposed algorithm for the MinMax metric.

Fig. 4. Comparison of the distribution of the costs of CBBA using warping functions with the proposed algorithm for the MinAve metric.

ACKNOWLEDGMENT

This work was funded by DSTL National PhD Programme.

REFERENCES

[1] H. Oh, S. Kim, A. Tsourdos, and B. A. White, “Coordinated road-network search route planning by a team of UAVs,” International Journal of Systems Science, vol. 45, no. 5, pp. 825–840, Dec. 2014. [Online]. Available: http://www.scopus.com/inward/record.url?eid=2-s2.0-84892910329&partnerID=tZOtx3y1

[2] B. P. Gerkey and M. J. Mataric, “A Formal Analysis and Taxonomy of Task Allocation in Multi-Robot Systems,” The International Journal of Robotics Research, vol. 23, no. 9, pp. 939–954, Sep. 2004. [Online]. Available: http://ijr.sagepub.com/cgi/doi/10.1177/0278364904045564

[3] G. A. Korsah, A. Stentz, and M. B. Dias, “A comprehensive taxonomy for multi-robot task allocation,” The International Journal of Robotics Research, vol. 32, no. 12, pp. 1495–1512, Oct. 2013. [Online]. Available: http://ijr.sagepub.com/cgi/doi/10.1177/0278364913496484

[4] M. Dias, R. Zlot, N. Kalra, and A. Stentz, “Market-Based Multirobot Coordination: A Survey and Analysis,” Proceedings of the IEEE, vol. 94, no. 7, pp. 1257–1270, Jul. 2006. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1677943

[5] M. M. Zavlanos, L. Spesivtsev, and G. J. Pappas, “A distributed auction algorithm for the assignment problem,” 2008 47th IEEE Conference on Decision and Control, pp. 1212–1217, 2008. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4739098

[6] H.-L. Choi, L. Brunet, and J. P. How, “Consensus-Based Decentralized Auctions for Robust Task Allocation,” IEEE Transactions on Robotics, vol. 25, no. 4, pp. 912–926, 2009.

[7] X. Jia and M. Q.-H. Meng, “A survey and analysis of task allocation algorithms in multi-robot systems,” in 2013 IEEE International Conference on Robotics and Biomimetics (ROBIO). IEEE, Dec. 2013, pp. 2280–2285. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6739809

[8] D. P. Bertsekas and D. A. Castañón, “Parallel synchronous and asynchronous implementations of the auction algorithm,” Parallel Computing, vol. 17, no. 6-7, pp. 707–732, Sep. 1991. [Online]. Available: http://dl.acm.org/citation.cfm?id=1746086.1746168

[9] L. Liu and D. A. Shell, “An anytime assignment algorithm: From local task swapping to global optimality,” Autonomous Robots, vol. 35, no. 4, pp. 271–286, Jul. 2013. [Online]. Available: http://link.springer.com/10.1007/s10514-013-9351-2

[10] S. Moon, E. Oh, and D. H. Shim, “An Integral Framework of Task Assignment and Path Planning for Multiple Unmanned Aerial Vehicles in Dynamic Environments,” Journal of Intelligent & Robotic Systems, vol. 70, no. 1-4, pp. 303–313, Sep. 2012. [Online]. Available: http://www.springerlink.com/index/10.1007/s10846-012-9740-3

[11] L. F. Bertuccelli, H.-L. Choi, P. Cho, and J. P. How, “Real-time Multi-UAV Task Assignment in Dynamic and Uncertain Environments,” in AIAA Guidance, Navigation, and Control Conference and Exhibit, 2009.

[12] S. Ponda, J. Redding, H. Choi, J. P. How, M. Vavrina, and J. Vian, “Decentralized planning for complex missions with dynamic communication constraints,” in American Control Conference, 2010. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5531232

[13] L. Johnson, H.-L. Choi, S. Ponda, and J. P. How, “Allowing Non-Submodular Score Functions in Distributed Task Allocation,” in Proceedings of the IEEE Conference on Decision and Control, no. 1, 2012, pp. 4702–4708.

[14] L. Johnson, H.-L. Choi, and J. P. How, “Hybrid information and plan consensus in distributed task allocation,” in AIAA Guidance, Navigation, and Control (GNC) Conference, 2013. [Online]. Available: http://www.scopus.com/inward/record.url?eid=2-s2.0-84883679525&partnerID=tZOtx3y1

[15] L. Johnson, H.-L. Choi, and J. P. How, “Convergence analysis of the Hybrid Information and Plan Consensus Algorithm,” in 2014 American Control Conference. IEEE, Jun. 2014, pp. 3171–3176. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6859325

[16] S. de Vries and R. Vohra, “Combinatorial auctions: A survey,” INFORMS Journal on Computing, 2003. [Online]. Available: http://joc.journal.informs.org/content/15/3/284.short

[17] F. Kelly and R. Steinberg, “A combinatorial auction with multiple winners for universal service,” Management Science, 2000. [Online]. Available: http://mansci.journal.informs.org/content/46/4/586.short

[18] B. Mendoza and J. M. Vidal, “Bidding algorithms for a distributed combinatorial auction,” Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems - AAMAS ’07, p. 1, 2007. [Online]. Available: http://portal.acm.org/citation.cfm?doid=1329125.1329251

[19] B. Mendoza García and J. M. Vidal, “On bidding algorithms for a distributed combinatorial auction,” Multiagent and Grid Systems, vol. 7, pp. 73–94, 2011.

[20] A. Land, S. Powell, and R. Steinberg, “Chapter 6 PAUSE : A Computationally Tractable Combinatorial Auction,” in Combinatorial Auctions, P. Cramton, Y. Shoham, and R. Steinberg, Eds., 2006, ch. 6, pp. 139–157. [21] I. Gurobi Optimization, “Gurobi Optimizer Reference Manual,” 2014. [Online]. Available: http://www.gurobi.com


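The heterogeneity of the vehicles, their partial connectivity, and the differences in their locations, payloads and flight characteristics give the system its power but, at the same time, make the task allocation problem very difficult to solve.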

978-1-4799-7470-2/14/$31.00 ©2014 IEEE

In general, the task allocation problem is NP-hard in all but its simplest incarnations [2], [3]. Tools from the Operations Research and Combinatorial Optimisation literature enable the centralised solution of instances of practical interest. However, a centralised solution requires all agent and environment data to be communicated to a central entity. This may not be appropriate for some scenarios, because the central entity reduces resilience by introducing a single point of failure, and because the bandwidth needed to send all the information from every agent to the central entity may not be available in some communication environments.

Recently, researchers have been exploring ways to solve the task allocation problem in a decentralised manner to overcome the limitations of the centralised solution [4], relying on the local assessment of the objective functions to avoid sharing all the information globally. However, the field is still in its infancy, and guaranteed optimal solutions in polynomial time with polynomial communication costs exist only for the simplest instances [5]. A limited number of works offer approximation guarantees with polynomial communication costs [6]. Nevertheless, these decentralised approaches impose restrictive conditions on the objective functions that can be used. This research develops a framework that has comparable communication overheads but does not restrict the kind of cost functions that can be used.

II. TASK ALLOCATION PROBLEM

In order to describe the task allocation problem tackled in this study, let us first define the set of agents participating in the allocation as $\mathcal{A}$ and the set of tasks to be allocated as $\mathcal{T}$. The objective is to find an allocation of non-overlapping bundles in $\mathcal{P}(\mathcal{T})$ to agents in $\mathcal{A}$ that spans all the tasks in $\mathcal{T}$ and optimises the objective function.
$\mathcal{P}(\mathcal{T}) = \{b \mid b \subseteq \mathcal{T}\}$ denotes the power set of $\mathcal{T}$, that is, the set of all possible subsets (bundles) of $\mathcal{T}$, including $\emptyset$ and $\mathcal{T}$ itself. The objective function $J : (\mathbb{R}^+ \times \{0,1\})^{|\mathcal{P}(\mathcal{T})| \cdot |\mathcal{A}|} \to \mathbb{R}^+$ maps the performance metrics of each bundle of each agent, together with the allocation variables, to a positive real number quantifying the allocation cost or score. The performance metric (or valuation) is individual to each agent, as different agents might have different payloads and/or capabilities as well as different information about the tasks. Thus, each agent $a \in \mathcal{A}$ has its own function $c_a : \mathcal{P}(\mathcal{T}) \to \mathbb{R}^+$ to evaluate the score of a bundle of tasks $b \subseteq \mathcal{T}$. This evaluation is entirely local and is based only on the information available to each individual agent. A binary decision variable $x_{ab}$ takes the value 1 if bundle $b \in \mathcal{P}(\mathcal{T})$ is allocated to agent $a \in \mathcal{A}$, and 0 otherwise. Summing up, the problem can be formulated as follows:


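\[
\begin{aligned}
\text{optimize} \quad & J\big((c_a(b), x_{ab}), \ldots\big), && \forall a \in \mathcal{A},\ \forall b \subseteq \mathcal{T} \\
\text{subject to:} \quad & y \cap z = \emptyset, && \forall y, z \in S,\ y \neq z \\
& \bigcup_{b \in S} b = \mathcal{T} \\
& x_{ab} \in \{0, 1\}, && \forall a \in \mathcal{A},\ \forall b \subseteq \mathcal{T} \\
& S = \{h \mid a \in \mathcal{A},\ h \subseteq \mathcal{T},\ x_{ah} = 1\}
\end{aligned}
\]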

where $S$ represents the set of allocated bundles. The first constraint means that no two allocated bundles can have overlapping tasks, and the second constraint means that the union of all allocated bundles must equal the task set $\mathcal{T}$. In other words, all tasks must be allocated, each exclusively to one agent. To illustrate this formulation, one problem that can be cast as an instance of this general framework is the Multiple Traveling Salesman Problem (mTSP), with the tasks being cities or waypoints, the agents' performance metrics being the distance travelled to visit a given bundle of cities, the objective function being the sum of the distances travelled over the allocated bundles, and the optimisation being a minimisation.

A. Solution Approaches

1) Problem Space: This general framework can accommodate most of the relevant task allocation problems. The different definitions of the agents' performance metrics and of the objective function give rise to a variety of combinatorial problems. A widely accepted taxonomy that maps the nature of each problem to a well-known combinatorial problem was introduced in [2] and was extended to support task dependencies in [3]. These two works provide a map of the task allocation problem space. With reference to the taxonomy in [2], the framework we propose aims to tackle problems with: both Single-Task robots (ST) and Multi-Task robots (MT); Single-Robot tasks (SR); and both Instantaneous Assignment (IA) and Time-Extended Assignment (TA). With respect to the task dependencies taxonomy in [3], the framework proposed in this work can accommodate: No Dependencies, In-Schedule Dependencies (ID) and some instances of Cross-Schedule Dependencies (XD).
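To make the general formulation of Section II and its mTSP illustration concrete, the following minimal Python sketch checks the two allocation constraints (no overlapping bundles, every task covered) and evaluates a MinSum routing objective. The function names, the use of planar coordinates as tasks, and the brute-force open-path valuation are illustrative assumptions, not part of the original formulation.

```python
from itertools import permutations
from math import dist


def is_valid_allocation(allocation, tasks):
    """allocation maps agent -> frozenset of tasks (its allocated bundle)."""
    bundles = list(allocation.values())
    covered = set().union(*bundles) if bundles else set()
    # Bundles are pairwise disjoint iff their sizes add up to the size of their union.
    no_overlap = sum(len(b) for b in bundles) == len(covered)
    return no_overlap and covered == set(tasks)


def route_length(start, waypoints):
    """Hypothetical c_a(b): shortest open path from start through all waypoints."""
    if not waypoints:
        return 0.0
    best = float("inf")
    for order in permutations(waypoints):  # brute force, fine for tiny bundles only
        length, pos = 0.0, start
        for w in order:
            length += dist(pos, w)
            pos = w
        best = min(best, length)
    return best


def min_sum(allocation, starts):
    """Objective J for the mTSP example: total distance over all allocated bundles."""
    return sum(route_length(starts[a], tuple(b)) for a, b in allocation.items())
```

Any other valuation (flight time, information collected) could replace `route_length` without changing the constraint check, which is the point of keeping the valuations local to each agent.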
2) Decentralised Solution Approaches: Over the past decades researchers have developed a variety of fast and efficient decentralised task allocation algorithms for the simplest instance of the problem, that is, where each agent is allocated at most one task [7]. The first distributed task allocation strategy was proposed in [8], where an auction algorithm was built on the idea of a shared memory model. However, the shared memory model requires a topology of the networked system that is not always achievable in real scenarios. To address this issue, [5] proposes an algorithm for a networked system in which each agent interacts with its neighbours rather than having access to a shared database. Another approach, presented in [9] and based on task swapping, also yields a globally optimal solution. In [10], a recent application of a qualitatively similar algorithm to UAV task allocation in a dynamic environment is described, alongside an account of its performance in real flight. Similarly to [5], the authors of [6] present the Consensus-Based Auction Algorithm (CBAA). This algorithm uses the concept of maximum consensus to distribute a series of single-item auctions over the network, achieving guarantees of both convergence (if the network is connected) and of an optimal assignment.

To solve a more complex instance of the task allocation problem, in which agents are allocated bundles of tasks instead of individual tasks, Choi et al. [6] present the Consensus-Based Bundle Algorithm (CBBA). It is based on a decentralisation of a greedy combinatorial auction algorithm. CBBA gives guarantees on convergence and on approximation to the optimal, and it is distributed across the network with no central entity needed. To achieve this, it imposes a submodularity, or Diminishing Marginal Gains (DMG), condition on the bids of the agents. Each agent submits a bid for each given task, computed as the marginal reward that the agent obtains when the task is added to its current bundle.

Several authors have extended CBBA to overcome some of its limitations. In [11] the authors adapt CBBA to a realistic scenario with obstacles and measurement noise. In [12] the authors introduce tasks with time windows by using a decaying reward function, and add handling of changing communication networks and awareness of fuel cost in the reward. Recently, the same group [13], [14], [15] has tackled the limitation of DMG task scoring by using warping functions, so that the bids appear submodular in the consensus space while they are handled as non-DMG in the agent's own domain. This allows for improved synergies within the bundles, but at the cost of surrendering all performance guarantees.

B. Proposed Framework

In this research, we aim to find task allocation approaches that remove the restrictions of approaches such as CBBA while keeping the fundamental advantages of their decentralised nature. CBBA leverages a simple auction model: agents add tasks to their bundles and send single-task bids with the marginal value that each of the tasks creates. The auctioneer simply awards the task to the agent whose marginal gain is the highest. Such a simple auctioneer mechanism can be naturally translated into a set of consensus rules that perform the role of the auctioneer and distribute the algorithm across the network.
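The single-task auction model and the DMG condition described above can be illustrated with a small Python sketch. It is not CBBA itself: `marginal_gain`, `satisfies_dmg` and `award` are hypothetical helper names, the DMG check is an exhaustive test that is only practical for a handful of tasks, and breaking ties by agent identifier is an assumption made here for determinism.

```python
from itertools import combinations


def marginal_gain(score, bundle, task):
    """Marginal value that `task` creates when added to `bundle`."""
    return score(bundle | {task}) - score(bundle)


def satisfies_dmg(score, tasks):
    """Exhaustively test that marginal gains never grow as the bundle grows."""
    tasks = list(tasks)
    subsets = [frozenset(c) for r in range(len(tasks) + 1)
               for c in combinations(tasks, r)]
    for small in subsets:
        for big in subsets:
            if not small <= big:
                continue
            for t in tasks:
                if t in big:
                    continue
                # Small tolerance guards against floating-point noise.
                if marginal_gain(score, big, t) > marginal_gain(score, small, t) + 1e-12:
                    return False
    return True


def award(bids):
    """Auctioneer rule: bids maps agent -> marginal gain offered for one task.

    The highest bid wins; ties go to the smallest agent identifier (assumption).
    """
    return max(sorted(bids), key=lambda agent: bids[agent])
```

A score such as the square root of the bundle size passes the check, whereas a score with synergies between tasks (for example the squared bundle size) fails it, which is the situation the warping functions of [13] are meant to handle.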
The key lesson is: find a framework with a very efficient auctioneer role and then substitute that role by a consensus rule to achieve natural decentralisation through consensus. Unfortunately, combinatorial auctions are characterised precisely by a very hard auctioneer role; in fact, the general Winner Determination Problem is NP-hard [16]. To get around this, in this research we use a framework called the Progressive Adaptive User Selection Environment (PAUSE), conceived for government auctions of telecom licenses [17]. It was intended as a way to simplify the auctioneer role in combinatorial auctions, so that all companies involved could easily see and understand the decisions being taken, giving confidence to the bidders by guaranteeing the transparency and fairness of the final decision. Naturally, the computational burden cannot be avoided: the computational load is not removed but rather transferred from the auctioneer to the agents. This is accomplished by requiring them to submit a composite bid that encompasses not only their own bundles but also those of other agents, in such a way that all the tasks are assigned and each task is assigned to only one agent. Hence, the role of the auctioneer is simply to keep a record of the composite bids submitted and to award the tasks to the agents contained in the composite bid whose payoff is the highest. With such an efficient role for the auctioneer, the decentralisation of the algorithm with consensus rules is natural. In fact, there have already been some attempts to use the PAUSE framework for distributed P2P auctioning in e-commerce and business procurement applications [18], [19], with a view to enabling decentralised internet auctions by removing the room for malicious bias of a centralised, and possibly untrustworthy, internet auctioneer.

Now, we give an intuitive description of the PAUSE mechanism following that of Land et al. [20]. A PAUSE auction for $m$ items is a multi-stage auction of $m$ stages. In each stage, each agent must send to the auctioneer a composite bid that covers all the items. In the first stage, each agent submits a composite bid composed only of single-item bids of its own. In the second stage, each agent submits a composite bid based only on a combination of the single-item bids previously submitted by all agents at stage 1 and two-item bids of its own. In the third stage, each agent submits a composite bid that combines previously submitted one-item and two-item bids with three-item bids of its own. The process continues and, in the general $n$th stage, each agent submits a composite bid that is a combination of the $(1, 2, \ldots, n-1)$-item bids submitted in the previous rounds by all the agents and the agent's own $n$-item bids. In each round, each agent can either submit a bid, if it improves the current best composite bid by a design threshold, or just wait and listen to the other agents if it cannot improve it. Note that with this mechanism, if an agent improves the bid in the final round, it will be because it is submitting a composite bid that contains only a single bid spanning all the items. With this mechanism, the job of the auctioneer is simply to record all the bids with their valuations and to keep a database of the submitted bids available to the agents. The winners are then simply the agents contained in the composite bid with the highest score. This mechanism was devised to improve the sense of fairness and transparency in government licensing. However, the fact that the auctioneer role is so efficient makes it naturally decentralisable. Leveraging the key ideas in CBBA and in the PAUSE framework, in this paper we propose a decentralised framework which finds the optimal allocation for a wider set of task allocation scenarios than current approaches such as CBBA.
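The stage rule described above is what keeps the auctioneer's job light: it only has to check that a composite bid is admissible, never to search for one. A minimal Python sketch of such a check follows. The representation of a bid as a `(bundle, agent, value)` tuple, the function name, and the choice to admit a bidder's own bids of any size up to the stage number are assumptions made for illustration; the improvement threshold mentioned in the text is left out.

```python
def is_valid_composite_bid(composite, items, bidder, stage, registered):
    """Check a composite bid against the PAUSE stage rule (illustrative only).

    composite  -- list of (bundle, agent, value) tuples, bundle a frozenset
    items      -- all items on auction
    bidder     -- agent submitting the composite bid
    stage      -- current stage number (1-based)
    registered -- set of bids already recorded by the auctioneer
    """
    covered = [item for bundle, _, _ in composite for item in bundle]
    if sorted(covered) != sorted(items):
        return False  # an item is missing or is sold more than once
    for bid in composite:
        bundle, agent, _ = bid
        own_and_small_enough = agent == bidder and len(bundle) <= stage
        if not (own_and_small_enough or bid in registered):
            return False  # neither a new own bid nor a previously recorded bid
    return True
```

For example, a composite bid pairing the bidder's own two-item bundle with another agent's recorded single-item bid is admissible at stage 2 but not at stage 1.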

III. DECENTRALISED TASK ALLOCATION ALGORITHM

Now, we describe the proposed algorithm formally. First, we introduce some notation used in the exposition. Recall that we defined a function $c_a : \mathcal{P}(\mathcal{T}) \to \mathbb{R}^+$; thus $c_a(b) \in \mathbb{R}^+$ implements the performance metric (or valuation) of agent $a \in \mathcal{A}$ executing the bundle of tasks $b \in \mathcal{P}(\mathcal{T})$. Examples of this metric could be simple parameters like travelled distance or flying time, more elaborate ones such as information collected, or a combination of agent-dependent rewards and costs, among other things. A bid $B_b^a$ is a tuple $(b, a, c_a(b)) \in \mathcal{P}(\mathcal{T}) \times \mathcal{A} \times \mathbb{R}^+$ that contains, respectively, the set of tasks involved $b \subseteq \mathcal{T}$, the responsible agent $a \in \mathcal{A}$, and its score $c_a(b) \in \mathbb{R}^+$. The evaluation of these bids is fully local: it is produced by each individual agent and shared as part of a composite bid.

For example, consider the simple case where the performance index is the flight time and the tasks are waypoints to visit. Then a bid $(b, a, c_a(b))$ would be computed locally by agent $a \in \mathcal{A}$. Agent $a$ would calculate that the time (in, say, seconds) needed to visit each task contained in $b$ from its position is $t_b$ (that is, it would evaluate $t_b = c_a(b)$ in our notation). Subsequently, in order to inform other agents about its capability to perform the tasks in $b$ (i.e. visit the waypoints in $b$), it would aggregate this information with the task set $b$ and its own identifier $a$, producing the bid $(b, a, t_b)$. Hence, all the other agents in the set $\mathcal{A} \setminus \{a\}$ will "know" only that agent $a$ takes a time $t_b$ to visit the tasks in $b$, rather than being able to evaluate the function themselves.

Now, a composite bid $CB_k^a$ is the solution to the bidding problem at stage $k$ by agent $a \in \mathcal{A}$: it is the set of bids that spans all the tasks and does not contain overlapping bids while optimising the objective function $J : (\mathbb{R}^+ \times \{0,1\})^{|\mathcal{P}(\mathcal{T})| \cdot |\mathcal{A}|} \to \mathbb{R}^+$, which depends on the pairs of agents' valuations and decision variables. A composite bid is what agents exchange among themselves, and it is the result of each agent solving the allocation problem at each stage with the information available to it at that point in time. The information that an agent $a \in \mathcal{A}$ keeps about the other agents is stored in a set $S_k^a$ that contains all the bids in the composite bids previously exchanged by other agents in the stages $1, 2, \ldots, k-1$. More formally,

\[
S_k^a = S_{k-1}^a \cup \Big( \bigcup_{i \in \mathcal{A},\ i \neq a} CB_{k-1}^i \Big), \qquad S_0^a = \emptyset .
\]

This set $S_k^a$ is used in conjunction with the agent's own bids to solve the allocation problem and produce the composite bid $CB_k^a$ that is exchanged with fellow agents. A set $D_k^a$ is kept by agent $a \in \mathcal{A}$, containing the bids exchanged previously by other agents and any of its own bids whose size is smaller than or equal to the corresponding bid stage, i.e. $D_k^a = S_k^a \cup \{(b, a, c_a(b)) \mid b \subseteq \mathcal{T},\ |b| \leq k\}$. Intuitively, $D_k^a$ is the pool from which agent $a \in \mathcal{A}$ draws bids to find the optimal allocation that results in the composite bid of stage $k$. With all this, the bidding problem for agent $i \in \mathcal{A}$ at stage $k$, given a set of bids $D_k^i$, is to find a non-overlapping set of bids that contains each task in $\mathcal{T}$ exactly once and optimises the objective function of the allocation. This problem can be formulated as:

\[
\begin{aligned}
\text{optimize} \quad & J\big((c_a(b), x_{ab}), \ldots\big), && \forall (b, a, c_a(b)) \in D_k^i \\
\text{subject to:} \quad & y \cap z = \emptyset, && \forall y, z \in S,\ y \neq z \\
& \bigcup_{b \in S} b = \mathcal{T} \\
& x_{ab} \in \{0, 1\}, && \forall (b, a, c_a(b)) \in D_k^i \\
& S = \{h \mid (h, a, c_a(h)) \in D_k^i,\ x_{ah} = 1\}
\end{aligned}
\]
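The bidding problem above can be sketched in Python as an exhaustive search over exact covers of the task set. This is only an illustrative stand-in for a real solver: `bid_pool` and `compute_bid` are hypothetical names, bids are assumed to be `(bundle, agent, value)` tuples with a `frozenset` bundle, the objective is assumed to be minimised, tasks are assumed to be mutually comparable (e.g. integers), and the search is exponential in the number of tasks.

```python
from itertools import combinations


def bid_pool(received, agent, valuation, tasks, k):
    """Build the pool for stage k: received bids plus own bids with |b| <= k."""
    own = {(frozenset(c), agent, valuation(frozenset(c)))
           for r in range(1, k + 1)
           for c in combinations(sorted(tasks), r)}
    return set(received) | own


def compute_bid(pool, tasks, objective):
    """Return (best value, composite bid) over all exact covers drawn from pool."""
    best = (float("inf"), None)

    def extend(chosen, remaining):
        nonlocal best
        if not remaining:
            value = objective(chosen)
            if value < best[0]:
                best = (value, list(chosen))
            return
        pivot = min(remaining)  # always cover the smallest uncovered task next
        for bid in pool:
            bundle = bid[0]
            # A usable bid covers the pivot and only touches uncovered tasks,
            # which enforces both the non-overlap and the coverage constraints.
            if pivot in bundle and bundle <= remaining:
                extend(chosen + [bid], remaining - bundle)

    extend([], frozenset(tasks))
    return best
```

With a MinSum objective such as `lambda bids: sum(v for _, _, v in bids)`, calling `compute_bid(bid_pool(received, "a", valuation, tasks, k), tasks, objective)` yields one agent's composite bid for stage `k`.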

Having defined the bidding problem, we can formally define a composite bid as a set $CB_k^i$ that contains all the bids selected in the solution of the bidding problem that agent $i$ solves in stage $k$, i.e. $CB_k^i = \{(b, a, c_a(b)) \mid (b, a, c_a(b)) \in D_k^i,\ x_{ab} = 1\}$. We will assume that agent $i \in \mathcal{A}$ solves the bidding problem optimally in round $k$ by calling the routine COMPUTEBID($S_k^i$, $c_i(\cdot)$, $k$). In practice this routine must be adapted to each individual problem, with its performance metrics and objective function. The algorithm used to solve the bidding problem is not relevant as long as it provides the optimal solution.

With these definitions we can now introduce the proposed task allocation algorithm. Initially, the agents have a copy of the set $\mathcal{T}$ containing all the tasks that must be allocated and of the set $\mathcal{A}$ of all the participating agents in the network. The algorithm proceeds in $k \in \{1, 2, \ldots, |\mathcal{T}|\}$ stages. In the first round, each agent $i$ has not received any information from the other agents, hence $S_1^i$ is empty and, since $k = 1$, $D_1^i$ contains only bids of agent $i$ itself. There can thus be only $|\mathcal{T}|$ bundles and consequently only one possible solution to the bidding problem: $CB_1^i$ contains a single-task bid for each of the tasks in $\mathcal{T}$. At stage $k = 2$, the bid set $S_2^i$ of agent $i$ contains all the single-task bids exchanged at the end of round $k = 1$ by the other agents, and agent $i$ finds the best allocation among them, its own single-task bids and 2-task bids of its own. The process continues until $k = |\mathcal{T}|$, when finally the best allocation is the composite bid with the best objective. The algorithm for an agent $i \in \mathcal{A}$ is outlined in Fig. 1.

procedure TASKALLOCATION($\mathcal{A}$, $\mathcal{T}$, $i$, $c_i(\cdot)$)
    for $a \in \mathcal{A}$, $a \neq i$ do
        $CB_0^a \leftarrow \emptyset$
    end for
    $S_0^i \leftarrow \emptyset$
    for $k \in \{1, 2, \ldots, |\mathcal{T}|\}$ do
        for $a \in \mathcal{A}$, $a \neq i$ and $k > 1$ do
            $CB_{k-1}^a \leftarrow$ RECEIVEDBID($a$, $k-1$)
        end for
        $S_k^i \leftarrow S_{k-1}^i \cup \bigcup_{a \in \mathcal{A},\ a \neq i} CB_{k-1}^a$
        $CB_k^i \leftarrow$ COMPUTEBID($S_k^i$, $c_i(\cdot)$, $k$)
        SENDBID($CB_k^i$)
    end for
    for $a \in \mathcal{A}$, $a \neq i$ do
        $CB_{|\mathcal{T}|}^a \leftarrow$ RECEIVEDBID($a$, $|\mathcal{T}|$)
    end for
    $CB^* \leftarrow \mathrm{OPT}_{a \in \mathcal{A}}\ CB_{|\mathcal{T}|}^a$
    return $CB^*$
end procedure

Fig. 1. Task Allocation Algorithm for agent $i \in \mathcal{A}$.

IV. DISCUSSION

We have developed this algorithm with two main objectives in mind: to be decentralised with a communication overhead comparable to that of the state of the art, and to remove the restrictions on cost metrics that current state-of-the-art algorithms have.

Now, let us make some remarks on the locality of the valuation functions and its implications. Bundle valuations by each agent do not change over the execution of the algorithm. It is assumed that once an agent has shared a bid $(b, a, c_a(b))$ in a composite bid, the valuation $c_a(b)$ that agent $a$ makes of bundle $b$ does not change in subsequent rounds. This is a reasonable assumption, as it only requires the allocation procedure and the mission execution to have well-separated time scales. This is usually the case in practice, because task execution requires the agent to travel to different spatial locations, which takes much longer than the allocation procedure.

The valuation of the bids by each agent $a \in \mathcal{A}$, namely the function $c_a : \mathcal{P}(\mathcal{T}) \to \mathbb{R}^+$, is fully local. That is, given a bundle of tasks $b \in \mathcal{P}(\mathcal{T})$, only agent $a$ can compute the valuation (or performance index) $c_a(b)$ of the bundle $b$. All the other agents in $\mathcal{A} \setminus \{a\}$ only know the valuation given by agent $a$ in the bid, not the valuation function, and cannot infer the valuation of other bundles that have not been previously shared by $a$. This locality is fundamental to enable cooperation between heterogeneous agents with different capabilities and different levels of situational awareness, because each agent can account locally, in its function $c_a$, for whether it has enough information to perform a task, whether it has the payload needed, whether it can fly to the task fast enough, and so on. This enables the network controller to trade off between delaying the start of the allocation procedure, in order to spend more time exchanging situational awareness and obtain more accurate valuations, and tolerating some level of situational awareness discrepancy in exchange for an earlier start of the allocation procedure.

Now, we shall briefly discuss the communication requirements of our algorithm and its computational complexity. Then, we shall present experimental results of its performance, comparing it against CBBA with three representative objective functions that are common in multi-robot routing problems (MinSum, MinMax and MinAve), followed by a discussion of the numerical results.

A. Communication Requirements

In this algorithm it is assumed that all the agents receive the composite bids of all the other agents, be it through direct communication or through a mesh protocol. Using this algorithm, each agent exchanges $|\mathcal{T}|$ composite bids. With the agents in a connected network with diameter $D_m$, the number of messages exchanged is $\#\mathrm{msg} = O(|\mathcal{A}| \cdot |\mathcal{T}| \cdot D_m)$. The message size is proportional to the number of tasks $|\mathcal{T}|$. Thus, the algorithm has the same scalability as CBBA with respect to communication overheads.

B. Computational Complexity

Solving the task allocation problem has been shown to be NP-hard [2], [4]. In this case, the computational complexity of the algorithm is determined by the bidding problem, which is NP-hard since it is an instance of the task allocation problem of smaller size; solving this problem is therefore what takes most of the computational effort. In the formulation outlined in the previous section the problem has, because of its generality, an exponential number of variables and constraints, and it is not easily solvable in practice. In reality this is not necessarily the case because, depending on the specific score functions, the problem can be reformulated as an adapted version of a well-known problem with solution methods that are proven in practice. Examples are the travelling salesman, set packing or scheduling problems, to name but a few. These methods can be used by including an extra constraint on the bundle size to match the stage and by incorporating the solutions of the other agents. In doing so, although the problem remains NP-hard, there are algorithmic tools available, such as state-of-the-art MIP solvers [21] or problem-specific algorithms, that enable the solution of instances of practical relevance.

C. Numerical Results

In order to explore the performance of our algorithm we have chosen the widely studied field of multi-robot routing, that is, how to find optimal paths for a group of robots. Given a set of agents and locations to visit (tasks), there are three main objective functions that can be considered the "eigenvectors" of the performance metrics used in routing problems:

• MinSum: minimises the total sum of the costs (distances) over all agents;
• MinMax: minimises the maximum cost (distance) incurred by the agents; and
• MinAve: minimises the average cost (distance) incurred by the agents from start to the visit of each location.
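The three objectives listed above can be written down in a few lines of Python once each agent's route is fixed. The sketch below is an illustration under stated assumptions: routes are given as a start position followed by an ordered list of waypoints in the plane, distances are Euclidean, and MinAve is read as the mean, over all tasks, of the distance travelled by the visiting agent up to the moment it reaches that task. All function names are hypothetical.

```python
from math import dist


def arrival_distances(start, route):
    """Cumulative distance travelled when each waypoint of the route is reached."""
    arrivals, pos, travelled = [], start, 0.0
    for waypoint in route:
        travelled += dist(pos, waypoint)
        pos = waypoint
        arrivals.append(travelled)
    return arrivals


def _all_arrivals(routes):
    # routes maps agent -> (start position, ordered list of waypoints)
    return [arrival_distances(start, route) for start, route in routes.values()]


def min_sum(routes):
    """Total distance travelled by all agents."""
    return sum(a[-1] if a else 0.0 for a in _all_arrivals(routes))


def min_max(routes):
    """Longest distance travelled by any single agent."""
    return max(a[-1] if a else 0.0 for a in _all_arrivals(routes))


def min_ave(routes):
    """Average distance travelled before each task is visited (assumed reading)."""
    flat = [d for a in _all_arrivals(routes) for d in a]
    return sum(flat) / len(flat)
```

Unlike MinSum, the MinMax and MinAve values of an allocation are not simple sums of independent bundle scores, which is one reason such objectives are harder to handle with purely marginal single-task bids.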

As baselines against which to compare the performance of our algorithm we take a globally optimal solution and warped CBBA. The optimal solution was computed using the commercial MIP solver Gurobi [21]. CBBA [6] was chosen as a baseline because it has similar communication scalability. However, because CBBA does not converge when the objective function does not satisfy the DMG condition, we used the warped extension provided by Johnson et al. [13]. Gurobi was also used to solve the bidding subproblems of each agent. We conducted a Monte Carlo simulation with 400 runs for each task number and metric to investigate the trends in solution performance for each metric as the size of the problem increases. To this end, a random scenario was created each time by placing 4 agents at uniformly random locations in a 1000 m by 1000 m area; the tasks were also placed uniformly at random. The performance metrics were the distances travelled by the agents.

TABLE I
MEAN RELATIVE SCORES (%) OF PAUSE AND CBBA WRT THE OPTIMAL.

                     Task Number
Algorithm  Metric    5      6      7      8      9
PAUSE      MinSum    0      0      0      0      0
PAUSE      MinMax    0      0.08   0.08   0.17   0.34
PAUSE      MinAve    0      0.04   0.04   0.09   0.2
CBBA       MinSum    1.94   2.95   3.45   3.68   4.35
CBBA       MinMax    9.84   9.65   12.9   12.31  13.36
CBBA       MinAve    4.15   3.45   5.01   5.77   5.83

Once a scenario was solved with the three algorithms, the objective values of CBBA were normalised with respect to the optimal by:

\[
sc_{\mathrm{CBBA}} = \frac{\mathrm{CBBA} - \mathrm{OPT}}{\mathrm{OPT}} \cdot 100\ (\%) \tag{1}
\]

and the objective values of PAUSE by:

\[
sc_{\mathrm{PAUSE}} = \frac{\mathrm{PAUSE} - \mathrm{OPT}}{\mathrm{OPT}} \cdot 100\ (\%) \tag{2}
\]

where OPT, PAUSE, and CBBA represent the objective scores attained for the optimal, PAUSE and CBBA respectively. A summary of relative scores for each of the objectives are shown in figures 2, 3 and 4 and the means of the relative scores for each case is shown in table I. While the number of tasks is small only from 5 to 9, some conclusions can be extracted regarding the underlying trends. In the MinSum metric, the PAUSE algorithm gave in each and everyone of the cases a relative score of 0%, i.e., it found the optimal solution every time, whereas for the same metric CBBA’s performance was gradually degrading as the task number increased. In the cases of the MinMax and MinAve metric, PAUSE most of the time found an optimal solution, but not always and that its performance degraded gradually as the task number increased. In all three objectives we can see that PAUSE provides significantly better results. This is probably due to the fact that PAUSE is able to consider a richer description synergies within the bundles, whereas CBBA only considers the marginal values. V. C ONCLUSIONS AND F UTURE W ORK We have presented a decentralised algorithm for the Multi Robot Task Allocation Problem with communication costs comparable to those of the state of the art such as [6]. To demonstrate its performance we have conducted numerical experiments with representative performance metrics from the routing domain: MinSum, MinMax and MinAve. The results show that the proposed algorithm is equal to the optimal for the MinSum objective and very close for the MinMax and MinAve. Our algorithm yields an overall superior performance over a comparable state of the art decentralised task allocation algorithm such as CBBA [6]. The fact that PAUSE always found an optimal solution for the MinSum objective is very interesting and could indicate that the algorithm could be proved optimal for some objective functions of practical interest. 
A theoretical study of how the algorithm performs under different structural features of the objective functions remains an open research area. This area should be explored with the aim of establishing optimality and approximation guarantees for relevant objective functions. More work should also be devoted to developing a consensus algorithm that supports dynamic communication networks, bringing the approach closer to real-world scenarios.
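As a rough illustration of how the relative scores in (1) and (2) and their means could be computed, the following Python sketch may help. The function name `relative_score` and all numerical values are hypothetical and chosen only for illustration; they are not results from the experiments. The sketch also assumes that the optimal objective value is strictly positive, as is the case for cost-type objectives such as route lengths.

```python
from statistics import mean


def relative_score(objective: float, optimal: float) -> float:
    """Relative score in percent: (objective - optimal) / optimal * 100."""
    # Assumption: costs are strictly positive, so the ratio is well defined.
    if optimal <= 0:
        raise ValueError("optimal objective value must be positive")
    return (objective - optimal) / optimal * 100.0


# Hypothetical objective values for one problem instance (illustrative only).
opt, pause, cbba = 40.0, 40.0, 50.0

sc_pause = relative_score(pause, opt)  # 0% means the optimum was found
sc_cbba = relative_score(cbba, opt)    # positive values are above the optimum

# Mean relative score over several hypothetical instances, given as
# (algorithm objective, optimal objective) pairs.
instances = [(40.0, 40.0), (50.0, 40.0)]
mean_score = mean(relative_score(obj, best) for obj, best in instances)

print(sc_pause, sc_cbba, mean_score)
```

A score of 0% indicates that the algorithm attained the optimal objective value, and for a minimisation problem the score cannot be negative.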

Fig. 2. Comparison of the distribution of the costs of CBBA using warping functions with the proposed algorithm for the MinSum metric.

Fig. 3. Comparison of the distribution of the costs of CBBA using warping functions with the proposed algorithm for the MinMax metric.

Fig. 4. Comparison of the distribution of the costs of CBBA using warping functions with the proposed algorithm for the MinAve metric.

ACKNOWLEDGMENT

This work was funded by the DSTL National PhD Programme.

REFERENCES

[1] H. Oh, S. Kim, A. Tsourdos, and B. A. White, “Coordinated road-network search route planning by a team of UAVs,” International Journal of Systems Science, vol. 45, no. 5, pp. 825–840, Dec. 2014. [Online]. Available: http://www.scopus.com/inward/record.url?eid=2s2.0-84892910329&partnerID=tZOtx3y1
[2] B. P. Gerkey and M. J. Mataric, “A Formal Analysis and Taxonomy of Task Allocation in Multi-Robot Systems,” The International Journal of Robotics Research, vol. 23, no. 9, pp. 939–954, Sep. 2004. [Online]. Available: http://ijr.sagepub.com/cgi/doi/10.1177/0278364904045564
[3] G. A. Korsah, A. Stentz, and M. B. Dias, “A comprehensive taxonomy for multi-robot task allocation,” The International Journal of Robotics Research, vol. 32, no. 12, pp. 1495–1512, Oct. 2013. [Online]. Available: http://ijr.sagepub.com/cgi/doi/10.1177/0278364913496484
[4] M. Dias, R. Zlot, N. Kalra, and A. Stentz, “Market-Based Multirobot Coordination: A Survey and Analysis,” Proceedings of the IEEE, vol. 94, no. 7, pp. 1257–1270, Jul. 2006. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1677943
[5] M. M. Zavlanos, L. Spesivtsev, and G. J. Pappas, “A distributed auction algorithm for the assignment problem,” 2008 47th IEEE Conference on Decision and Control, pp. 1212–1217, 2008. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4739098
[6] H.-L. Choi, L. Brunet, and J. P. How, “Consensus-Based Decentralized Auctions for Robust Task Allocation,” IEEE Transactions on Robotics, vol. 25, no. 4, pp. 912–926, 2009.
[7] X. Jia and M. Q.-H. Meng, “A survey and analysis of task allocation algorithms in multi-robot systems,” in 2013 IEEE International Conference on Robotics and Biomimetics (ROBIO). IEEE, Dec. 2013, pp. 2280–2285. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6739809
[8] D. P. Bertsekas and D. A. Castañón, “Parallel synchronous and asynchronous implementations of the auction algorithm,” Parallel Computing, vol. 17, no. 6-7, pp. 707–732, Sep. 1991. [Online]. Available: http://dl.acm.org/citation.cfm?id=1746086.1746168
[9] L. Liu and D. A. Shell, “An anytime assignment algorithm: From local task swapping to global optimality,” Autonomous Robots, vol. 35, no. 4, pp. 271–286, Jul. 2013. [Online]. Available: http://link.springer.com/10.1007/s10514-013-9351-2
[10] S. Moon, E. Oh, and D. H. Shim, “An Integral Framework of Task Assignment and Path Planning for Multiple Unmanned Aerial Vehicles in Dynamic Environments,” Journal of Intelligent & Robotic Systems, vol. 70, no. 1-4, pp. 303–313, Sep. 2012. [Online]. Available: http://www.springerlink.com/index/10.1007/s10846-012-9740-3
[11] L. F. Bertuccelli, H.-L. Choi, P. Cho, and J. P. How, “Real-time Multi-UAV Task Assignment in Dynamic and Uncertain Environments,” in AIAA Guidance, Navigation, and Control Conference and Exhibit, 2009.
[12] S. Ponda, J. Redding, H. Choi, J. P. How, M. Vavrina, and J. Vian, “Decentralized planning for complex missions with dynamic communication constraints,” in American Control Conference, 2010. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5531232
[13] L. Johnson, H.-L. Choi, S. Ponda, and J. P. How, “Allowing Non-Submodular Score Functions in Distributed Task Allocation,” in Proceedings of the IEEE Conference on Decision and Control, no. 1, 2012, pp. 4702–4708.
[14] L. Johnson, H.-L. Choi, and J. P. How, “Hybrid information and plan consensus in distributed task allocation,” in AIAA Guidance, Navigation, and Control (GNC) Conference, 2013. [Online]. Available: http://www.scopus.com/inward/record.url?eid=2s2.0-84883679525&partnerID=tZOtx3y1
[15] L. Johnson, H.-L. Choi, and J. P. How, “Convergence analysis of the Hybrid Information and Plan Consensus Algorithm,” in 2014 American Control Conference. IEEE, Jun. 2014, pp. 3171–3176. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6859325
[16] S. de Vries and R. Vohra, “Combinatorial auctions: A survey,” INFORMS Journal on Computing, 2003. [Online]. Available: http://joc.journal.informs.org/content/15/3/284.short
[17] F. Kelly and R. Steinberg, “A combinatorial auction with multiple winners for universal service,” Management Science, 2000. [Online]. Available: http://mansci.journal.informs.org/content/46/4/586.short
[18] B. Mendoza and J. M. Vidal, “Bidding algorithms for a distributed combinatorial auction,” Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems - AAMAS ’07, p. 1, 2007. [Online]. Available: http://portal.acm.org/citation.cfm?doid=1329125.1329251
[19] B. Mendoza García and J. M. Vidal, “On bidding algorithms for a distributed combinatorial auction,” Multiagent and Grid Systems, vol. 7, pp. 73–94, 2011.
[20] A. Land, S. Powell, and R. Steinberg, “PAUSE: A Computationally Tractable Combinatorial Auction,” in Combinatorial Auctions, P. Cramton, Y. Shoham, and R. Steinberg, Eds., 2006, ch. 6, pp. 139–157.
[21] Gurobi Optimization, Inc., “Gurobi Optimizer Reference Manual,” 2014. [Online]. Available: http://www.gurobi.com
