THE INTERPLAY BETWEEN NETWORKS AND ROBOTICS: NETWORKED ROBOTS AND ROBOTIC ROUTERS

by Marcos Augusto Menezes Vieira

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (COMPUTER SCIENCE)

December 2010

Copyright 2010

Marcos Augusto Menezes Vieira

Acknowledgements

I would like to thank Professor Ramesh Govindan and Professor Gaurav S. Sukhatme for being excellent advisors, giving me the opportunity to get involved in a wide range of projects, and guiding me throughout those projects. I also thank Professors Bhaskar Krishnamachari, Nenad Medvidovic, and Milind Tambe for serving on my committee, and for their valuable feedback and suggestions.

I would like to acknowledge my collaborators. Tenet was joint work with Omprakash Gnawali, Jeongyeup Paek, Ki-Young Jang, Ben Greenstein, John Hicks, August Joki, Prof. Eddie Kohler, and Prof. Deborah Estrin. Thanks to Lamia Chouaieb and Niklas Goddemeier for their contribution to the wall-following component in the Pursuit-Evasion Game project. The micro-motion work was joint work with Matthew E. Taylor, Prateek Tandon, and Manish Jain. Although not part of this dissertation, the Collective Transport of Robots project was joint work with Megha Gupta, Jnaneshwar Das, Hordur Heidarsson, and Harshvardhan Vathsangam.

I am also grateful to Professor David Kempe for letting me be the USC Programming Coach for many years. I would like to thank my labmates, from whose experience in the PhD process I also learned. Finally, thanks to my family and friends for their support.

Table of Contents

Acknowledgements
List Of Tables
List Of Figures
Abstract

Chapter 1: Introduction
  1.1 Design Space
  1.2 Dissertation Overview and Contributions
  1.3 Dissertation Statement
  1.4 Dissertation Outline

Chapter 2: Literature Review
  2.1 Pursuit-Evasion
  2.2 Robotic Routers: Micro-Motion
  2.3 Robotic Routers: Macro-Motion

Chapter 3: Scalable and Practical Pursuit-Evasion with Networked Robots
  3.1 Introduction
  3.2 Assumptions, Terminology and Definitions
  3.3 Optimal Strategy
    3.3.1 Equipotent players
    3.3.2 Superior Evader
    3.3.3 Optimal number of players
  3.4 Partition Strategy
    3.4.1 Complexity
    3.4.2 Properties
  3.5 Design
    3.5.1 Platform
    3.5.2 Network-Assisted Localization
    3.5.3 Wall-following
    3.5.4 Navigation
  3.6 Simulation
    3.6.1 Optimal Strategy
    3.6.2 Partition Strategy
  3.7 Experiments
  3.8 Conclusions

Chapter 4: Mitigating Multi-path Fading in a Mobile Mesh Network
  4.1 Introduction
  4.2 Background
  4.3 Can Micro-Motion Improve Throughput?
    4.3.1 The Robot Platform
    4.3.2 Configuration
    4.3.3 Throughput Improvement
  4.4 Using Distributed Reasoning for Micro-Motion Based Throughput Improvement
  4.5 Results
    4.5.1 Local Metrics
    4.5.2 Optimizing with More Positions
    4.5.3 Temporal Variation
    4.5.4 An experiment with more robots
  4.6 Conclusion

Chapter 5: Towards Autonomous Wireless Backbone Deployment in Highly-Obstructed Environments
  5.1 Introduction
  5.2 Problem Formulation
  5.3 Solution Overview
    5.3.1 Radio Model
    5.3.2 Attenuation in the Presence of Walls
  5.4 The PFWD Algorithm
    5.4.1 Potential Fields in PFWD
    5.4.2 The Deployment Decision in PFWD
  5.5 Evaluation
    5.5.1 Validating the PFWD Tool
    5.5.2 Validating the PFWD Algorithm
  5.6 Conclusion

Chapter 6: Conclusion

References

List Of Tables

2.1 Related Work in Pursuit-Evasion
3.1 Instances of Games and their properties
3.2 Games and their properties
5.1 Necessary SNR for required data rate
5.2 SNR Error Prediction

List Of Figures

1.1 Design Space
3.1 Topology with three corridors where the worst-case capture time varies with the number of pursuers.
3.2 Capture time as a function of the number of pursuers
3.3 An instance of a game which will terminate with huge capture time if pursuers play a greedy strategy.
3.4 Team of Creates which depicts our robot platform - the iRobot Create with an Ebox
3.5 Layout of the two-tier network testbed which is composed of Stargates (upper-tier) and wireless sensor motes (lower-tier)
3.6 Network-assisted localization estimates: unfiltered estimates (bottom) are noisy, filtered estimates (top) eliminate noise.
3.7 Localization Accuracy
3.8 A game where pursuer greedily follow the evader will not terminate in a cylinder Grid Topology.
3.9 Given our definition of capture, a 2-2 game will not terminate in a ring topology.
3.10 The communication latency for detecting the robots.
3.11 An illustration of 4 pursuer, 2 evader game sequence. The nodes and solid undirected edges connecting them represent the topological map. The solid directed lines (lines with arrows) show the pursuer path as determined by the localization system. The dashed lines illustrate the evader path. The edge labels represent the time sequence of the robot. The pursuer and evader’s initial positions are indicated by the corresponding icons.
3.12 The Mean Capture Time for three game configurations played with centralized and distributed location updates. The third bar is the average capture time from simulation.
4.1 This picture shows part of the experiment set up, which has a team of iRobot Creates with an Ebox.
4.2 Initial Configuration of Team of Creates
4.3 UDP Flows
4.4 TCP Flows
4.5 Sorted Total UDP Flows
4.6 Sorted Total TCP Flows
4.7 This figure depicts a three agent DCOP.
4.8 Improvement per metric
4.9 Experiment Results
4.10 Static Variation
4.11 Experiment Results
5.1 A sketch of the problem and a solution over a floorplan
5.2 Triangle inequality does not hold for path loss
5.3 The space partition of wireless signals might result in disjointed cells
5.4 Simulation results
5.5 Robot platform
5.6 Environment with 3 clients and 3 robots
5.7 Environment with 4 clients and 2 robots

Abstract

In this work, we explore the interplay between robotics and networks. Robots can benefit from an embedded network, and the network can in turn benefit from the mobility of the robots. We investigate the design space according to whether the application is oriented towards robotics or networking, how much information about the environment is provided, and what the sensing capabilities are.

At one end, the robots can benefit from the resources of an environment-embedded network, in which robot sensing and communication is enhanced by the network. We consider the design and implementation of practical pursuit-evasion games with networked robots, where a communication network provides sensing-at-a-distance as well as a communication backbone that enables tighter coordination between pursuers. Using the theory of zero-sum games, we develop an algorithm that computes the minimal completion time strategy for multi-pursuit multi-evasion when all players make optimal decisions based on complete knowledge. We then describe the design of a real-world mobile robot-based pursuit-evasion game. We validate our algorithms by experiments in a moderate-scale testbed in a challenging office environment.

We then show that the network can also benefit from robotics by taking advantage of micro and macro motion. Robots can mitigate multi-path fading. We design a system that allows robots to cooperate and improve the real-world network throughput via a distributed cooperation framework. A mobile wireless network can also be quickly and autonomously deployed in urban search and rescue efforts, forming a communication substrate. We study the problem of determining the minimum number of mobile robots and how to position them so all clients are connected. Our approach to the problem is based on virtual potential fields where we treat each client as a virtual charged particle. We validate our algorithm with physical robots in an indoor environment and demonstrate that we are able to get feasible solutions.

Chapter 1

Introduction

With the advances in processor, memory, sensor and radio technology, commercial off-the-shelf (COTS) components enable the design of robots with wireless capabilities and also embedded wireless networks with computational, sensing and actuation capabilities.

An embedded network can provide sensing, communication and computational resources to the robots. The robots make use of the resources of an environment-embedded network, in which robot sensing and communication is enhanced by the network. The embedded network allows simple robots to perform complex tasks such as pursuit-evasion, wherein robots must pursue and catch evaders.

On the other hand, by enabling wireless components with actuation capabilities, we can have robots acting as routers. A team of networked robots can provide a communication substrate, establishing a wireless mesh network. Robots might mitigate many of the communication problems present in urban settings, for example by relaying signals into shadows and making small adjustments to reduce multi-path effects. The mobile mesh network can autonomously optimize its configuration, increasing performance. Robots as routers have applications in various settings, such as forming a connection backbone in an infrastructure-less setting [1] or venturing where humans cannot go to search for survivors in disasters.

In this dissertation, we explore the space where robots can benefit from an embedded network and also when the network benefits from the mobility of the robots.

At one end, the robots can benefit from the resources of an environment-embedded network, in which robot sensing and communication is enhanced by the network. We consider the design and implementation of practical pursuit-evasion games with networked robots, where a communication network provides sensing-at-a-distance as well as a communication backbone that enables tighter coordination between pursuers.

We then show that the network can also benefit from robotics by taking advantage of micro and macro motion. One of the main sources of radio signal fading in urban settings is multi-path propagation. Robots can mitigate multi-path fading by making small movements. We design a system that allows robots to cooperate and improve the real-world network throughput via a distributed cooperation framework. A mobile wireless network can also be quickly and autonomously deployed in urban search and rescue efforts, allowing searchers to communicate even when no other infrastructure exists. We study the problem of determining the minimum number of mobile robots and how to position them so all clients are connected. Our approach to the problem is based on virtual potential fields where we treat each client as a virtual charged particle. We validate our algorithm in an indoor environment.

1.1 Design Space

Here, we describe the design space of the interaction between robotics and networks. The design space of the combination of robots and routers can be classified along three dimensions, as depicted in Figure 1.1.

Figure 1.1: Design Space (axes: Goal, from Robotics to Networks; Sensing Capabilities, from Minimal to Complete; Location, from Coarse-grained to Fine-grained. Systems shown: Pursuit-Evasion, Mitigating Multi-path Fading with Robotic Routers, Wireless Backbone Deployment)

• Goal: the application is oriented towards Networking or Robotics. Robots can make use of network capabilities for communication, computation and sensing. Networks can make use of actuation capabilities from robots to improve wireless performance by avoiding multi-path fading. Networks can also take advantage of mobility to increase their coverage, connecting a set of clients together by forming a backbone.

• Sensing Capabilities: indicates the set of sensors the robots need. Robots might have minimal capabilities. Robots with an odometer (and nothing else) will not be able to maintain accurate metric estimates of location and might thus be operating virtually “blind”. Fortunately, a wireless sensor network can increase their sensing-at-a-distance capabilities. On the other hand, robots might have a complete set of sensors. For instance, with a GPS, or with a camera/scanner coupled with an accurate prior map, a robot can pinpoint its position precisely.

• Location: indicates how much “useful” information a robot has about the region within which its movement will occur. Robots can have coarse-grained or fine-grained information. In a coarse-grained system, the environment is modelled as a topological map, whose nodes correspond to coarse-grained regions and whose links connect neighboring regions. Systems with fine-grained location have a more accurate prior map and might also include the position of obstacles and building structures.

Our work investigates the design of the systems represented by triangles in Figure 1.1. The Pursuit-Evasion system is designed to use minimal sensing capabilities and coarse-grained location. Mitigating multi-path fading is also designed to use minimal sensing capabilities and coarse-grained location, but it is network-oriented. Finally, to guarantee the communication of a set of clients by deploying a wireless backbone, we explore the space where robots have a map of the environment that includes building structures and obstacles and have sufficient sensors to navigate it.
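As a minimal sketch of the coarse-grained representation described above, a topological map can be stored as an adjacency list and queried for hop distances between regions. The region names and the `hop_distances` helper are hypothetical and serve only as an illustration; they are not taken from the system described in this dissertation.

```python
from collections import deque

# Hypothetical topological map: each key is a coarse-grained region and
# each value lists the neighboring regions reachable in one move.
TOPO_MAP = {
    "lobby": ["corridor"],
    "corridor": ["lobby", "office_a", "office_b"],
    "office_a": ["corridor"],
    "office_b": ["corridor"],
}


def hop_distances(topo_map, start):
    """Breadth-first search: hop count from start to every reachable region."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        region = queue.popleft()
        for neighbor in topo_map[region]:
            if neighbor not in dist:
                dist[neighbor] = dist[region] + 1
                queue.append(neighbor)
    return dist


if __name__ == "__main__":
    print(hop_distances(TOPO_MAP, "lobby"))
```

A robot that only knows which region it occupies can still plan over such a map, since hop counts require no metric position estimate.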

1.2 Dissertation Overview and Contributions

The underlying theme in our research is studying, understanding and exploring the interplay between Networking and Robotics.

Initially, we show that the robots can benefit from the resources of an environment-embedded network, in which robot sensing and communication is enhanced by the network. To this end, we provide the following main contributions. First, we describe an algorithm that minimizes the capture time of a multi-pursuer multi-evader game, even against optimal adversarial evader behavior. Second, we extend this algorithm to the case when the evaders are infinitely faster than the pursuers. Although these algorithms are optimal, they are exponential in the number of players. Thus, our third contribution is a scalable algorithm which guarantees performance close to optimal. We then describe the design of a real-world mobile robot-based pursuit-evasion game. Finally, we validate our algorithms by experiments on a moderate-scale real-world testbed.

At the other end, we show that the network can also benefit from Robotics by taking advantage of micro and macro motion. Our main contributions when networks take advantage of micro-motion are as follows. First, we show experimentally that, at least on our testbed, up to a factor of three improvement in TCP throughput can be obtained by robotic micro-motion. This is encouraging, since our experiments were fairly adversarial, suggesting that similar gains could be had in other environments. Second, we present the design of a practical system for coordinated robotic micro-motion. This system contains a novel use of a distributed constraint optimization framework: in this framework, robots make local measurements of a wireless performance metric, then decide, in coordinated fashion, which robot should move, and in what direction. This computation is executed iteratively, until the network converges to an improved throughput state. An important component of this framework is the choice of wireless performance metric: we empirically explore four different metrics, and show that a carefully chosen local metric can achieve near-optimal performance. Finally, we evaluate our system with physical robots in an indoor environment and demonstrate that we are able to get an average global throughput improvement of 30% while maximizing only local metrics and with no a priori knowledge of the environment.

Finally, we explore the space where Networking can benefit from macro motions of mobile robots. We study the problem of determining the minimum number of mobile robots and how to position them so that all clients are connected and each link meets a given throughput requirement. Our approach to the problem is based on virtual potential fields. A team of mother ship robots deploys robots while following the combination of virtual forces. The potential fields determine where the mother ship robots should move, guaranteeing that they will meet at a rendezvous point. While moving, the mother ship robots execute the connectivity algorithm, which guarantees all clients will be connected. The main contributions of this work are as follows. First, we describe an algorithm based on virtual potential fields to solve the problem even in a non-metric space or in the presence of obstacles. Second, we analyze the algorithm's performance and show, using our simulator, that it is optimal for simple cases. Finally, we evaluate our system with physical robots in an indoor environment and demonstrate that we are able to get feasible solutions.
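The virtual potential field idea described above can be illustrated with a small sketch. It assumes that each client attracts the robot with an inverse-square force, by analogy with charged particles; the force law, the `charge` and `step_size` parameters, and the function names are assumptions made for illustration and are not the PFWD fields defined in Chapter 5.

```python
import math


def net_force(robot, clients, charge=1.0):
    """Sum the attractive forces that the clients exert on the robot.

    Assumption: every client pulls with magnitude charge / distance**2.
    """
    fx = fy = 0.0
    for cx, cy in clients:
        dx, dy = cx - robot[0], cy - robot[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # a client at the robot's own position exerts no pull
        magnitude = charge / dist ** 2
        fx += magnitude * dx / dist
        fy += magnitude * dy / dist
    return fx, fy


def step(robot, clients, step_size=0.1):
    """Move the robot a fixed distance along the direction of the net force."""
    fx, fy = net_force(robot, clients)
    norm = math.hypot(fx, fy)
    if norm == 0.0:
        return robot  # forces cancel: the robot is at an equilibrium point
    return (robot[0] + step_size * fx / norm,
            robot[1] + step_size * fy / norm)


if __name__ == "__main__":
    print(net_force((0.0, 0.0), [(2.0, 0.0)]))
    print(step((0.0, 0.0), [(2.0, 0.0)]))
```

With two clients placed symmetrically around the robot the forces cancel and the robot stays put, which is the kind of equilibrium a rendezvous point corresponds to in this simplified setting.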

1.3 Dissertation Statement

Simple robots acting as a part of Mobile Networked Systems can perform complex multi-robot tasks efficiently, benefiting from sensing, communication and computational network resources. Moreover, Network Systems can also benefit from autonomous mobility to improve network performance.

1.4 Dissertation Outline

This dissertation is organized as follows. We provide the literature review in Chapter 2. In Chapter 3, we describe our work on using a wireless sensor network to enable simple robots to perform complex tasks such as the Pursuit-Evasion Game. Next, in Chapter 4, we describe the design and implementation of a distributed system composed of robotic routers that mitigates multi-path fading in a mobile mesh network by moving in small steps (we call this micro-motion). In Chapter 5, we study the case when robotic routers move in bigger steps (we call this macro-motion) so that the minimum number of robots can guarantee connectivity in indoor environments. In Chapter 6, we summarize our work and discuss future work.

Chapter 2

Literature Review

We discuss research that falls into Pursuit-Evasion and Robotic Router categories.

2.1 Pursuit-Evasion

Our broad interest is in the theme of using pervasive sensing and communication infrastructure with actuation. In the robotics and sensor networking communities this area has burgeoned recently. Examples include investigations on sensor-network guided robot localization [19], sensor-network guided robot navigation [46, 59, 6, 8] and exploration [53], robot-assisted sensor network localization [19], robot-assisted sensor network deployment [32, 18, 53] and sensor-network guided robot pursuit-evasion [58]. As with the body of work discussed above, we are interested specifically in the coordination and control of actuation using pervasive sensing and communication infrastructure. We distinguish our work (both in the choice of problem and the solution space we explore) from these approaches by

1. choosing a problem, the pursuit-evasion game (PEG), which is only beginning to be addressed in the embedded network and robot context, yet has a rich history in the classical literature (using robots alone),

2. choosing an experimental testbed based on unmodified commodity hardware (the Creates from iRobot and the TelosB motes).

We note that the sensor networking community has championed (for a decade) several generations of the “mote” devices as minimal yet useful sensing and communication nodes. These have served as a baseline on which to implement and test ideas on sensor network routing, data dissemination, localization, and many others.

Criterion                  [74]    [5]     [71]     [29]     [36]     [76]     [58]          Our work
Pursuer to Evader Ratio    ≥1      ≥1      1        ≥1       ≥1       ≥1       1             ≫1
Pursuer Visibility         full    full    full     local    local    local    full          full
Evader Visibility          full    full    full     full     full     local    none          full
Information                full    full    full     full     full     no       pursuer full  full
Environment                Graph   Graph   Polygon  Polygon  Polygon  Polygon  Polygon       Graph
Capture                    touch   touch   touch    see      touch    touch    touch         touch
Faster Entity              any     same    same     same     evader   any      same          any
Capture Time               yes     no      no       no       no       no       no            yes
Robot Implementation       None    None    None     None     None     Partial  Partial       Full

Table 2.1: Related Work in Pursuit-Evasion

To situate our work in the existing literature, we classify the type of PEGs using seven criteria: the ratio of the number of pursuers to the number of evaders; whether pursuers and evaders have full and/or global visibility, or whether they can only see within a threshold distance or until occluded by an obstacle (usually modelled by the edge of a polygon in 2D); what additional information robots have with respect to the opponents’ strategy or planning algorithm; whether the environment is modelled as a graph (discrete) or a polygon (continuous half-space with lines in 2D as boundaries); how the evader is captured, whether by being surrounded, seen or sensed by the pursuer, or approached within a certain distance, or physically contacted; the relative speed between the pursuer and evader; and whether the time to capture is important.

Table 2.1 shows a classification of the related work in the literature along these dimensions. Our work is distinct from several pieces of prior work. Its novelty is clear in several dimensions: while other work has explored theoretical bounds on eventual capture [5, 71], or pursuit-evasion under constrained geometries [29, 54, 10, 15], or has examined sophisticated control strategies [76, 58], our work attempts to minimize the time of capture of a multi-pursuer multi-evader game under the pragmatic realization of physical multi-robot games.

Complementary to our work, [74] discusses a dynamic programming algorithm to maintain connectivity in a team of robotic routers in the case where there is one user moving in an adversarial trajectory. The problem is modeled as a pursuit-evasion game, with the goal of finding the shortest escape trajectory. There is no discussion on how to initially detect the evader. The algorithm has complexity $O(v^{3(p+1)})$. They presented simulation results and were seeking an improved running time.

In [9], an algorithm to determine if K pursuers are sufficient to capture an evader is presented. The authors also showed that every graph is topologically equivalent to a graph with pursuer number at most two. In the survey presented by Alspach [7], a number of references on the necessary number of pursuers for a given graph class can be found. Aigner and Fromme [5] proved that in a planar graph G, three pursuers are sufficient to win the game. Quilliot [65] extended this result, giving an upper bound on the number of pursuers depending on the genus of the graph G. In [56], the necessary number of pursuers is studied under three graph product operations. In [27], pursuit on a graph is shown to be EXPTIME-complete under certain conditions.

For superior evaders, Seymour [70] showed that finding the necessary number of pursuers to capture a single evader with infinite velocity in a graph G when pursuers and evader move simultaneously is equivalent to finding the treewidth of G. Other works [78][77] focus on a Euclidean open plane and provide sub-optimal real-time approaches.

We are the first to present a scalable algorithm to minimize the time to capture of a multi-pursuer multi-evader game, and to validate the complete system in a real-world implementation.
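To make the notion of capture time on a graph concrete, the following sketch computes the worst-case capture time for the simplest setting: one pursuer and one evader that alternate moves (pursuer first), each either staying or stepping to a neighboring node, with capture when both occupy the same node. The turn order, the option to stay, and the function name `capture_times` are assumptions chosen for illustration; this is a textbook backward-induction computation, not the multi-pursuer multi-evader algorithm of Chapter 3.

```python
import math
from itertools import product


def capture_times(adj):
    """Return T[(p, e)]: pursuer moves needed to capture under optimal play.

    math.inf marks states from which the evader can avoid capture forever.
    """
    nodes = list(adj)
    # Each player may stay in place or step to an adjacent node.
    moves = {v: set(adj[v]) | {v} for v in nodes}
    T = {(p, e): (0 if p == e else math.inf) for p, e in product(nodes, nodes)}
    changed = True
    while changed:  # relax the values until a fixed point is reached
        changed = False
        for p, e in product(nodes, nodes):
            if p == e:
                continue
            best = math.inf
            for p2 in moves[p]:
                if p2 == e:
                    worst = 0  # the pursuer steps onto the evader
                else:
                    # The evader answers with its most delaying move.
                    worst = max(T[(p2, e2)] for e2 in moves[e])
                best = min(best, 1 + worst)
            if best < T[(p, e)]:
                T[(p, e)] = best
                changed = True
    return T


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(capture_times(path)[(0, 3)])
    print(capture_times(ring)[(0, 2)])
```

On the four-node path the evader is eventually cornered, whereas on the four-node ring a single pursuer never captures an evader that starts on the opposite node. The state space has one entry per pursuer-evader position pair, which hints at why exact computation becomes exponential as players are added.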

2.2 Robotic Routers: Micro-Motion Using small movements to combat the multi-path fading effects in complex environments has promise and this paper is not the first to examine such effects. In [48] [72], the authors showed that, for a pair of nodes, micro-motion can increase receive signal power and improve packet reception more than any coding scheme could achieve. We, on the other, focus on an approach to improve global network throughput using explicit coordinated micro-motion. Other work includes [49], where the authors propose a methodology for exploiting multi-path fading by controlling the robot according to radio signal strength. They solve the problem of how many samples are needed for given communications performance and how they should be spaced. They provide lower bounds on the number of samples for a single robot. Using 802.15.4 radio, they also show there is room for improvement (as much as 20 dB in RSSI). Other approaches have leveraged more general forms of mobility (beyond micro-motion) for network throughput improvement or to build and configure mobile mesh networks. Early theoretical work [28] shows that mobility increases capacity with random source-destination pairs with loose delay constraints. Other work [35] considers the problem of controlling a team of robots to ensure end-to-end communication. To mitigate environmental interference, they propose two 11

different metrics, point-to-point signal strength and data throughput, to monitor the network connectivity of the system. Even ad-hoc communication protocols pose difficult challenges during multi-robot experimentation, as shown by the authors of [82]. However, their focus is not on micro-motions, they need a map of the environment and optimizing network throughput is not one of their goals. Complementary to our work, [74] discusses a game-theoretic dynamic programming algorithm to guarantee that a single mobile user is connected to a base station by moving a chain of robotic routers. Multiple-input multiple-output (MIMO) [61] techniques with multiple antennas[30] take advantage of spatial diversity and spatial multiplexing and can improve the performance by avoiding deep fades through diversity. Our approach is complimentary, since it uses explicit micro-motions to improve performance. A Delay-tolerant networking (DTN) [22][38] is a computer network that may lack continuous network connectivity but is still operable. DTNs can take advantage of mobility to deliver messages. Unlike DTNs, where nodes may only have intermittent connectivity, our work applies in the context of mesh networks where a communication backbone of this exists in the network. The distributed constraint optimization framework has been studied extensively in the multiagent literature. In [37], the D-C EE framework is presented to study the problem of how to coordinate mobile nodes to maximize the cumulative RSSI. The paper’s focus is on algorithms to study the trade-off between exploration and exploitation. We, on the other hand, focus on different local metrics (SNR, PRR) and how it affects the overall network. We quantify how much gain the network can benefit from small movements and how we can design a system to improve the real-world network throughput. 12

In addition to the DCOP work discussed in earlier sections, previous work in distributed constraint reasoning in sensor networks [45, 83] uses a precursor method to the DCOP formulation, which does not handle unknown reward matrices. Marder et al. [51] formulate dynamic sensor coverage as a “potential game,” which is similar to a DCOP. However, as in other DCOP work, the reward matrix is known, there is no time limit, and only the final reward is considered. Cheng et al. [14] suggest an approach for coordinating a set of robots based on swarm intelligence; however, the objective of the work is to disperse the robots evenly within a specified shape, not to optimize the signal strengths across the network. Correll et al. [20] look at optimizing a wireless network of mobile robots using distributed swarm optimization, but are concerned with changing the topology (i.e., neighbors) of the network rather than optimizing signal strength. Gerkey et al. [24] address a similar problem, but use an auction mechanism, and the goals of the agents are significantly different (agents modify the topology of the network and on-line reward is not emphasized). Farinelli et al. [23] perform decentralized coordination on physical hardware using factor graphs; however, rewards are known and cumulative reward is not considered.

2.3 Robotic Routers: Macro-Motion

In [16], a similar problem is presented. Given a set of clients, robots move following a bio-inspired algorithm to increase communication coverage. Their work is designed for an unknown environment, so the proposed algorithm cannot guarantee that the network will be connected. We, on the other hand, take advantage of a known global map to guarantee connectivity. We also focus not on coverage but on forming a backbone, which requires fewer robots. Their work considers node failures, which we leave as future work. Finally, we evaluate on a more meaningful network metric (end-to-end TCP throughput) instead of RSSI.

Our work is inspired by three prior pieces of work on area coverage, two of which make use of potential fields. Howard et al. [34] present a distributed and scalable approach to deploy robots such that the area covered by the network is maximized. Each node is repelled by both obstacles and by other nodes, thereby forcing the network to spread itself throughout the environment. Poduri et al. [64] extended the previous work to maximize the area coverage with the additional constraint that each node has at least K neighbors. Another important work on deployment to increase coverage is [33], which uses a simpler radio model, considering only whether or not radios have line-of-sight. In contrast to area coverage, PFWD focuses on guaranteeing connectivity between clients.

A body of work has examined the problem of guaranteeing connectivity between a single mobile user and a base station, using a team of robots. Tekdas and Isler [74] present a game-theoretic approach: the problem is modeled as a pursuit-evasion game, with the goal of finding the shortest escape trajectory. Stump et al. [73] show that the quality of connectedness can be represented by the Fiedler value (second-smallest eigenvalue of a weighted Laplacian) from algebraic graph theory; by increasing the Fiedler value, the connectedness is indirectly improved. Our problem setting is qualitatively different: that of deploying a team of robots to form a backbone for a collection of static clients.

A related problem that has received some attention is that of maintaining connectivity among a team of mobile robots. Esposito and Dunbar [21] study the problem of maintaining connectivity for a team of robots in the presence of obstacles. They achieve this goal by adding potential fields corresponding to line-of-sight and distance range, guiding robot navigation. Their approach is different from PFWD in two ways: they are not constrained by fixed client positions; moreover, by constraining wireless propagation to line-of-sight, they cannot leverage wireless propagation through walls, as we do.

Hsieh et al. [35] study a complementary problem to ours, that of driving a team of robots to specific locations along a parameterized curve while maintaining point-to-point communication links. Two different metrics, point-to-point signal strength and data throughput, were used to monitor the network connectivity of the system. Their approach receives as input specific goal positions for each robot in the team; in PFWD, the goal positions are the output of the system.

Several other tangentially relevant pieces of work exist. Zeiger et al. [82] explore the performance of ad-hoc routing protocols by moving a single robot in a field of static wireless nodes. Hsieh et al. [81] address the problem of constructing radio signal strength maps with multiple robots. More precisely, they determine what sequence of moves a team of robots must perform to sample all edges in a given graph. The goal is to minimize the total number of robot moves such that all edges in a given graph are sampled. Fall and others [22][38] discuss a delay-tolerant network architecture which uses mobile elements that may be intermittently connected, a goal different from that of PFWD. As discussed previously, it is possible to use geometric techniques based on Voronoi diagrams for deployment of robotic networks [11], but these techniques are not designed for non-metric spaces.

A related problem in optimization is the Steiner Tree Problem. Given a weighted graph G = (V, E) and a subset R ⊆ V, the goal is to find the minimum-weight tree connecting all the vertices in R. This problem differs from the Minimum Spanning Tree problem in that it allows intermediate connection points to be selected to reduce the cost of the tree. The Steiner Tree Problem is useful for wired networks but is not designed for wireless networks, where the radio signal propagates in all directions.

Finally, our problem bears superficial resemblance to the facility location problem in operations research. The objective is to open facilities so that each client can be connected to a facility while minimizing the total connectivity cost (there is a cost for connecting each client to a facility). In a non-metric space, Hochbaum [31] developed an O(log n) approximation. Unfortunately, this formulation does not apply to our multi-hop network, since we also need to consider the connectivity between facilities (robots in our case).
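To make the contrast between a spanning tree and a Steiner tree concrete, the following is a minimal illustrative sketch and not part of the system described in this dissertation. It solves a tiny Steiner instance by brute force: it tries every subset of non-terminal vertices as intermediate connection points and keeps the cheapest spanning tree. The function names `mst_weight` and `steiner_weight`, the toy graph, and its weights are all assumptions introduced for illustration; the brute-force search is exponential and only suitable for very small graphs.

```python
from itertools import combinations


def mst_weight(nodes, edges):
    """Kruskal's algorithm on the subgraph induced by `nodes`.

    Returns the total weight, or None if that subgraph is disconnected.
    `edges` maps an undirected pair (a, b) to its weight.
    """
    nodes = set(nodes)
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    total, used = 0, 0
    candidates = sorted((w, a, b) for (a, b), w in edges.items()
                        if a in nodes and b in nodes)
    for w, a, b in candidates:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            total += w
            used += 1
    return total if used == len(nodes) - 1 else None


def steiner_weight(vertices, edges, terminals):
    """Exact Steiner tree weight by trying every set of extra vertices."""
    others = [v for v in vertices if v not in terminals]
    best = None
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            w = mst_weight(set(terminals) | set(extra), edges)
            if w is not None and (best is None or w < best):
                best = w
    return best


# Hypothetical toy instance: three terminals and one optional hub "c".
vertices = ["a", "b", "d", "c"]
terminals = {"a", "b", "d"}
edges = {("a", "b"): 2, ("b", "d"): 2, ("a", "d"): 2,
         ("a", "c"): 1, ("b", "c"): 1, ("d", "c"): 1}

print(mst_weight(terminals, edges))                  # terminals only: 4
print(steiner_weight(vertices, edges, terminals))    # using hub c: 3
```

In this toy instance, routing through the intermediate vertex reduces the tree weight from 4 to 3, which is the kind of saving the Steiner formulation allows.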


Chapter 3

Scalable and Practical Pursuit-Evasion with Networked Robots

3.1 Introduction

We are motivated by practical problems in security and monitoring for large, structured spaces (e.g., to ensure the integrity of a large building or complex). The problem we focus on is pursuit-evasion, wherein robots must pursue and catch evaders. In Pursuit-Evasion Games (PEGs), multiple robots (the pursuers) collectively determine the location of one or more evaders, and try to corral them. The game terminates when every evader has been corralled by one or more robots. Since the definition of the discrete pursuit-evasion game by Parsons [60], PEGs have received significant attention [5][56][65][29][71]. The main difference between existing approaches and our approach is that we play the multi-pursuer multi-evader game with full knowledge of the game. Pursuers make use of an embedded sensor network to have complete knowledge of the game.

The main contributions of our work are as follows. First, we describe an optimal capture time algorithm (Algorithm 1) to minimize the capture time of a multi-pursuer multi-evader game, even with optimal adversarial evader behavior. Second, we extend this algorithm (Algorithm 2) to the case when the evaders are infinitely faster than the pursuers. Although these algorithms are optimal, they are exponential in the number of players. Thus, our third contribution is a scalable algorithm (Algorithm 3) which guarantees performance close to optimal. Finally, we validate our algorithms by experiments on a moderate-scale real-world testbed.

Several versions of PEG exist. In certain frameworks, it is acceptable to merely “sight” an evader for it to be “located” [29]; in others, a precise coordinate must be reported [58]. Other formulations insist on a certain speed of convergence with fewer constraints on accuracy [5, 71]. Finally, formulations vary depending on whether the multi-robot control algorithm is required to have provably correct behavior, whether the number of evaders is known a priori, and whether they are malicious or benign. Each variation of the problem brings with it a different set of challenges, and several of these variations have been solved to varying degrees. In Chapter 2, we provide a detailed review of the literature.

We focus on PEGs in bounded, spatially complex environments similar to today’s office environments. In such environments, we can assume imperfect geometric regularity (e.g., the presence of corridors and 90-degree turns, but possibly a regular placement of doorways or elevator exits). However, it is increasingly true that such environments are well provisioned with wireless communication capability, and that many such environments will likely have dense embedded sensing (for surveillance or environmental control). This environmental embedded sensing can provide, for the players, full knowledge of the game. We consider the class of PEGs played on a discrete graph [5][56][65]. Specifically, we use a topological map of the environment, whose nodes correspond to coarse-grained regions and whose links connect neighboring regions [44, 52, 74]. Discrete graph-based games are acceptable for many uses of pursuit-evasion (e.g., surveillance, finding survivors). Of course, the actual games execute in a continuous space, but the discrete graph is used to define our game model and to localize the participants.

The pursuers make use of the resources of an embedded network. This network provides sensing, communication and computational resources to the robots:

Sensing: In the traditional setting, the only sensing resources available to robots are the ones they carry. In our setting, robots, via communication, effectively sense at a distance. Networked sensing can provide proprioceptive as well as exteroceptive sensing (e.g., network nodes can track robot pose as well as environmental features) and potentially provides sensing redundancy.

Communication: In addition to effectively increasing the sensing range of robots, the presence of the embedded network effectively extends the inter-robot communication range and improves the quality of inter-robot communication.

Computation: The sensor network could potentially be viewed as a computational resource by the robots; this is particularly true since sensor networks are no longer confined to large numbers of very simple networked processors. This means that the robot planning and coordination algorithms could run on one (or more) of the more capable nodes in the network, while the robots’ onboard computation is restricted to low-level control loops and sensor filtering.

In this work, we consider a version of the game in which p pursuers collectively attempt to capture r evaders (Section 3.2). We are interested in the convergence time of the game (i.e., the minimum number of steps for the pursuers to capture the evaders). We start by presenting the first provably minimal convergence time algorithm for multiple pursuers and multiple evaders (Section 3.3). In our formulation, all players have complete knowledge of the state of other players, and the evaders also pursue an optimal evasion strategy. Based on the optimal policy that pursuers should use in order to capture evaders with the minimum number of steps, we design an assignment algorithm that optimally decomposes the game (Section 3.4). Unfortunately, the computation required for this optimal algorithm does not scale beyond a small number of participants (fewer than 10, with today’s technology). To provide a practical and scalable system, we consider an approximate strategy: we decompose the multi-player game into multiple multi-pursuer single-evader games. We prove that our approximation terminates, has bounded capture time, is robust, and is scalable in the number of robots, making it suited for practical applications. Clearly, convergence time depends on the topology as well as on the number of pursuers and evaders: we explore these dependencies for a few canonical topologies (Section 3.6). Finally, we present results from running an implementation of our algorithm on a physical robot testbed (Section 3.7). The experiments confirm the feasibility of our algorithm, even on a worst-case initial configuration.

3.2 Assumptions, Terminology and Definitions

In this section, we start by stating the sensing and communication assumptions for our PEGs, then discuss the class of games we are interested in. We then lay down some terminology, and formally define the objective of our PEGs. This sets the stage for the optimal capture time algorithm and a scalable near-optimal algorithm for PEG, which are discussed in the next sections.

We focus on PEGs in bounded, spatially complex environments similar to today’s office environments. In such environments, we can assume imperfect geometric regularity (e.g., the presence of corridors and 90-degree turns, but possibly a regular placement of doorways or elevator exits). Because such environments are obstructed, they present limited line-of-sight visibility. However, it is increasingly true that such environments are well provisioned with wireless communication capability, and that many such environments will likely have dense embedded sensing (for surveillance or environmental control). In this work, we assume such network-assisted environments; these environments provide sensing-at-a-distance to circumvent line-of-sight limitations. Moreover, they provide a network communication capability that enables much tighter coordination than would have been possible otherwise. More specifically, the network a) contains sensors that are able to approximately localize all participants, and b) provides a communication backbone that enables participants to exchange game state.

Based on these assumptions, we consider the class of PEGs in which all participants have complete (but possibly imprecise) knowledge of the positions of all participants. Furthermore, our PEGs are played on a discrete space: we model the environment as a topological map, whose nodes correspond to coarse-grained regions and whose links connect neighboring regions [44, 52]. Our discrete-space assumption is acceptable for many uses of pursuit-evasion (e.g., surveillance, finding survivors), and we argue that more precise capture can be implemented with the addition of a simple proximity sensor (e.g., a low-resolution camera) if necessary. We are interested in the class of games where enough pursuers (we make this more precise in the next section) exist to guarantee termination. Initially, we also assume that pursuers and evaders move at the same speed (more precisely, we assume that they move exactly one hop in the topology at each time step). We later relax this assumption and consider even the case where evaders can move faster than pursuers.

Within this framework, we are interested in the optimal strategy that pursuers and evaders should play, where our measure of optimality is the capture time (defined below). Before we discuss this, we lay down some terminology.

Let $G = (V, L)$ be a finite connected undirected graph with vertex set $V$ and set of links (edges) $L$. There are two sets of players, called pursuers $P$ and evaders $E$. Initially, $P$ and $E$ occupy some vertices of $G$. In describing the algorithm, we assume that time is discrete and increments in steps of 1; in our implementation, of course, we make no such assumption. At each time step, all pursuers and evaders are given the positions of all participants. Both teams play a game on $G$ according to the following rule. At each step, each pursuer chooses a neighboring vertex of $G$ to move to; the evaders do the same. They then move to the corresponding vertex in $G$, as defined in [57], and repeat the previous step. The team of pursuers $P$ wins if it “captures” all evaders. If an evader can avoid capture indefinitely, then the evader team wins the game. In the literature [9], the number of pursuers necessary to capture an evader in a graph $G$ is denoted by $c(G)$.

Let $p = |P|$, $r = |E|$, and $v = |V|$. Let $P_i$ be the current position of the $i$th pursuer and $E_i$ be the current position of the $i$th evader. The tuple $a = \langle P_1, \ldots, P_p, E_1, \ldots, E_r \rangle$ represents the current position of all participants. We define a boolean variable $T$ (turn) to denote whether or not it is the pursuers’ turn to move (recall that, in our algorithm, pursuers and evaders alternate at each time step; this assumption enables us to analyze our algorithm, although in real-world experiments it is difficult to ensure this synchrony). We say that the tuple $\langle a, T \rangle$ encodes the state $s$ of a game. In general, a game can have $2 \cdot v^{p+r}$ states, because each pursuer and evader can be at one of $v$ positions, and for each configuration of pursuers and evaders, there are 2 turns (evader and pursuer moves). Each game can be represented by a sequence of transitions through this state space. Each pursuer and evader executes a deterministic algorithm (called its policy) for determining, given the current state, what the next move should be. We call ρ the pursuer policy and ε the evader policy. Since we consider deterministic policies, if a state is repeated in a particular game, the game will not terminate and the evaders win.

We can now define our game as follows:

Input: Coarse estimated positions of p robots and r evaders in a bounded environment E.
Output: Motion commands for p robots.
Goal: Minimize the capture time of the evaders.
Restriction: No motion model for the evaders available to pursuers.

The game terminates when a capture state is reached. In a capture state, every vertex in which an evader resides is also occupied by at least one pursuer. There exists a different definition of termination: if, during the evolution of the game, a pursuer reaches an evader’s position, the evader exits the game. It is easy to see that our definition results in a game that is strictly harder than this variant. When evaders exit the game, the remaining pursuers can always, and more quickly, capture the remaining set of evaders. Indeed, there exist cases where our game might not terminate (a 2-pursuer 2-evader game, in some topologies) but this variant will.
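The size of the state space defined in this section, 2·v^(p+r), can be checked by enumerating the states directly. The sketch below is illustrative only; the function names `count_states` and `enumerate_states` and the tuple layout (pursuer positions, evader positions, turn flag) are assumptions made for this example rather than the representation used in our implementation.

```python
from itertools import product


def count_states(v, p, r):
    """Closed-form number of game states: 2 * v^(p + r)."""
    return 2 * v ** (p + r)


def enumerate_states(vertices, p, r):
    """Yield every state as (pursuer positions, evader positions, turn).

    The turn flag is True when it is the pursuers' turn to move.
    """
    for positions in product(vertices, repeat=p + r):
        for pursuers_turn in (True, False):
            yield positions[:p], positions[p:], pursuers_turn


# Hypothetical example: 4 vertices, 2 pursuers, 1 evader.
states = list(enumerate_states(range(4), p=2, r=1))
print(len(states), count_states(4, 2, 1))  # both are 128

# The count grows exponentially in the number of players.
for players in (2, 4, 6, 8):
    print(players, count_states(20, players, 0))
```

The second loop shows why exhaustive enumeration stops being practical beyond a handful of players, even on a map with only 20 regions.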

3.3 Optimal Strategy

In defining the game, we have avoided mention of the particular strategy that the pursuers and evaders use. In this section, we discuss the optimal strategy.

3.3.1 Equipotent players

In this subsection, we assume that pursuers and evaders move at the same speed (more precisely, we assume that they move exactly one hop in the topology at each time step). In our game, we have assumed that both pursuers and evaders have complete information about the positions of all players. Consider now the optimal strategy for the pursuers and evaders: for the former, to capture the evaders in the shortest time, and for the latter, to avoid capture for the longest possible time. To formalize this intuition, we turn to zero-sum games. Pursuit-evasion is a zero-sum game since the pursuers’ gain or loss is exactly balanced by the losses or gains of the evaders. The evaders’ goal is to avoid capture for as long as possible, whereas the pursuers have to capture the evaders as fast as possible. Zero-sum games have been extensively studied in the game theory literature, and our solution models a PEG as a zero-sum game that uses the minimax algorithm [68]. This algorithm minimizes the maximum possible loss for each player in the game.

To describe this algorithm, consider first that the evolution of any PEG can be represented by a game graph, a directed graph with possible cycles. The start state (as defined by the starting configuration of the pursuers and evaders) has a directed edge from itself to all possible next states that the pursuers can reach from the start state. (In our game, we assume that pursuers and evaders move alternately.) In turn, from each of these states, there is a directed edge to all possible next states resulting from the evaders’ moves from that state. The graph can thus be recursively defined. In general, a game is a traversal on this graph. If this traversal ends in a capture state, the pursuers win the game. However, it is also possible for the traversal to repeat states: such a traversal will result in a non-terminating game and the evaders win.

Suppose now that, in the game graph, we assign to each state s a cost C(s).

• When an evader moves, C(s) denotes the maximum distance from state s to a capture state, and
• When a pursuer moves, C(s) represents the minimum distance from state s to a capture state.

In our game, we consider the following policies:

• The pursuers’ policy ρ is: choose the neighboring state in the game graph which has the smallest C(s). Intuitively, this moves the game at each step as close as possible to a capture state.
• The evaders’ policy ε is: choose the neighboring state in the game graph which has the largest C(s). Intuitively, this moves the game at each step as far away as possible from a capture state. Thus, the evader is truly adversarial.

In what follows, we first show how to compute the game graph efficiently. Then, we prove that these policies are optimal from the pursuers’ perspective: if the game terminates, they reach a capture state in the shortest possible number of moves.

We construct the game graph by generating all states and all possible transitions between states. For each state, we can calculate the cost to reach a capture state using a bottom-up approach. Using this, pursuers and evaders follow the strategy discussed above. Initially, all states have cost ∞, except the initial set of capture states, which have cost 0 (Algorithm 1).
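Once the costs C(s) are known, the two policies above reduce to an argmin and an argmax over the successors of the current state. The following sketch is illustrative only: the state names, the hand-built `successors` table, and the `cost` values are hypothetical and chosen so that the costs are mutually consistent (a pursuer-turn state costs one more than its cheapest successor, an evader-turn state costs as much as its most expensive successor).

```python
def pursuer_policy(state, successors, cost):
    """rho: move to the successor state with the smallest cost C(s)."""
    return min(successors[state], key=lambda s: cost[s])


def evader_policy(state, successors, cost):
    """epsilon: move to the successor state with the largest cost C(s)."""
    return max(successors[state], key=lambda s: cost[s])


# Hypothetical game graph fragment. "start" is a pursuer-turn state;
# "A" and "B" are evader-turn states reached by the two pursuer moves.
successors = {
    "start": ["A", "B"],
    "A": ["capture", "near"],
    "B": ["near", "far"],
}
cost = {
    "capture": 0,  # capture state
    "near": 1,     # pursuer-turn state one move from capture
    "far": 2,      # pursuer-turn state two moves from capture
    "A": 1,        # max(cost[capture], cost[near])
    "B": 2,        # max(cost[near], cost[far])
    "start": 2,    # min(cost[A], cost[B]) + 1
}

print(pursuer_policy("start", successors, cost))  # A
print(evader_policy("A", successors, cost))       # near
print(evader_policy("B", successors, cost))       # far
```

Ties are broken here by list order, which is an arbitrary choice; any deterministic tie-breaking rule keeps the policies deterministic.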

Let us define $F_0$ as the set of capture states, in other words, the states where at least one pursuer occupies each vertex in which an evader resides. These states have cost 0, since no move is necessary for the termination of the game. Now define $F_1$ as the set of states that can reach one of the capture states in one pursuer move. $F_1$ consists of both states where it is the pursuers’ turn to move and states where it is the evaders’ turn to move. A state where it is the evaders’ turn to move belongs to $F_1$ if, irrespective of the next move made by the evaders, the pursuers can still reach a capture state in a single step. Similarly, we can define inductively the set $F_{i+1}$ as the states from which the pursuers need only $i + 1$ steps to terminate the game, irrespective of the evaders’ movements.

For any state $s = (a, 1)$, when it is the pursuers’ turn to move, the cost is $C(s) = \min_{s'} C(s') + 1$, where $s'$ ranges over the states that the pursuers can transition to from $s$. For any state $s = (a, 0)$, when it is the evaders’ turn to move, the cost is $C(s) = \max_{s'} C(s')$, where $s'$ ranges over the states that the evaders can transition to from $s$. For a pursuer, the goal is to reach a capture state, i.e., to minimize $C$. For an evader, if $C(s)$ is finite, where $s$ is the current state, then irrespective of what the evader does, it will be captured within $C(s)$ pursuer moves. The evader should then choose a transition from $a$ to $a'$ that maximizes the time of capture. If any state has cost ∞, the number of pursuers is not sufficient to capture the evaders, and it is necessary to add more pursuers to the game.

Algorithm 1 computes the state transition diagram and the associated costs. Pursuers and evaders can use this to implement the optimal strategies ρ and ε: during a pursuer turn, ρ implies selecting the neighboring state with the smallest C(s), and during an evader turn, ε implies choosing the neighboring state that has the largest C(s). The algorithm loops while we add more states to $F_i$. Since each state can be added only once, eventually there are no more states to add and the algorithm terminates.

Algorithm 1 Algorithm for computing the game graph.
 1: {Initialization}
 2: Generate all states
 3: Generate all possible transitions
 4:
 5: for all states s do
 6:   if s is a capture state then
 7:     add s to F_0
 8:     C(s) ← 0 {cost function}
 9:   else
10:     C(s) ← ∞
11:   end if
12: end for
13: i ← 0
14: repeat
15:   i ← i + 1
16:   change ← false
17:   U ← set of all unmarked states that have a transition to a marked state
18:   for all s in U do
19:     if s is a pursuer move then
20:       if s has at least one transition to a marked state then
21:         add s to F_i {mark s}
22:         C(s) ← min (over all marked neighbors) + 1 {count this move}
23:         add transition to ρ
24:         change ← true
25:       end if
26:     else {evader move}
27:       if all transitions from s reach a marked state then
28:         add s to F_i {mark s}
29:         C(s) ← max (over all marked neighbors)
30:         add transition to ε
31:         change ← true
32:       end if
33:     end if
34:   end for
35: until not change
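As a hedged illustration of Algorithm 1, the following Python sketch computes the cost table for small graphs. It is not the implementation used in our experiments. The function name `solve_peg`, the adjacency-dictionary input, and the `allow_stay` flag (whether a player may remain on its current vertex) are assumptions introduced for this example. Each pass works from a snapshot of the states marked in earlier passes, which is one reasonable reading of the marking step.

```python
import math
from itertools import product


def solve_peg(adj, p, r, allow_stay=True):
    """Return {state: cost}, where state = (pursuers, evaders, pursuers_turn).

    adj maps each vertex to a list of neighbors. A cost of math.inf means
    the pursuers cannot force capture from that state.
    """
    verts = sorted(adj)

    def options(v):
        # Assumption: a player may stay in place when allow_stay is True.
        return ([v] if allow_stay else []) + list(adj[v])

    def successors(state):
        pursuers, evaders, pursuers_turn = state
        if pursuers_turn:
            moves = product(*(options(v) for v in pursuers))
            return [(q, evaders, False) for q in moves]
        moves = product(*(options(v) for v in evaders))
        return [(pursuers, f, True) for f in moves]

    # Initialization: capture states (every evader vertex holds a pursuer)
    # get cost 0, all other states get cost infinity.
    cost = {}
    for pursuers in product(verts, repeat=p):
        for evaders in product(verts, repeat=r):
            captured = all(e in pursuers for e in evaders)
            for turn in (True, False):
                cost[(pursuers, evaders, turn)] = 0 if captured else math.inf

    changed = True
    while changed:
        changed = False
        marked = {s for s, c in cost.items() if c < math.inf}  # snapshot
        for s in cost:
            if s in marked:
                continue
            succ = successors(s)
            known = [cost[t] for t in succ if t in marked]
            if s[2] and known:
                # Pursuer move: one marked successor is enough.
                cost[s] = min(known) + 1
                changed = True
            elif not s[2] and known and len(known) == len(succ):
                # Evader move: every successor must already be marked.
                cost[s] = max(known)
                changed = True
    return cost


# Hypothetical usage: one pursuer and one evader at opposite ends of a
# three-vertex path, pursuers to move first.
path = {0: [1], 1: [0, 2], 2: [1]}
print(solve_peg(path, 1, 1)[((0,), (2,), True)])  # 2
```

On a four-vertex cycle, the same function reports an infinite cost for a single pursuer and a finite cost for two pursuers, which matches the role of the cost ∞ as the signal that more pursuers are needed.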


To prove the optimality of our formulation, it suffices to prove the following theorem:

Theorem 3.3.1. For any given topological graph G, if l(s) is the minimal number of steps required to reach a capture state for any state s, then C(s) = l(s).

Proof. We prove this by induction on i, the optimal number of steps until capture.

Basis: If i = 0, the game is over ($l(s) = 0$) without requiring a state transition. This means that there must be at least one pursuer in each node occupied by an evader (by our definition of capture). This is exactly our definition of a capture state (lines 5-12 of Algorithm 1), for which $C(s) = 0$.

Inductive Step: Assume that, for all states $s$ for which $l(s) = i$, $C(s) = i$. We now show that for any state $s'$ for which $l(s') = i + 1$, $C(s') = i + 1$. Proceeding by contradiction, assume there exists a state $s$ for which $l(s) = i + 1$ but $C(s) \neq i + 1$. We consider two cases: (1) $C(s) < i + 1$ or (2) $C(s) > i + 1$.

Case 1: This case results in a contradiction because $l(s)$ is optimal. $C(s)$, which is the number of steps our game would take, cannot be shorter than this.

Case 2: In state $s$, there are two possibilities: either it is the pursuers’ turn to move, or the evaders’.

Case 2.1: Suppose it is the pursuers’ turn. Since $l(s) = i + 1$, there is a transition from state $s$ to some state $s'$, where $l(s') = i$. We also have $C(s') = i$ by the induction hypothesis. From line 22 in Algorithm 1, $C(s)$ is the minimum of all $C(s'') + 1$, where $s''$ is a neighboring state of $s$. Recall that $s'$ is a neighboring state of $s$. Thus, $C(s) \le i + 1$, but we have already discussed that $C(s)$ cannot be less than $i + 1$. So, $C(s) = i + 1$, a contradiction.

Case 2.2: Suppose it is the evaders’ turn. For every possible move from $s$ to some $s'$, $s'$ is already a capture state (by construction in line 28 of Algorithm 1). $\forall s'\ C(s')$