Noname manuscript No. (will be inserted by the editor)

A Cooperative Architecture for Target Localization using Multiple AUVs

Assia Belbachir · Félix Ingrand · Simon Lacroix

CNRS ; LAAS ; 7 avenue du Colonel Roche, F-31077 Toulouse, France
Université de Toulouse ; UPS, INSA, INP, ISAE ; LAAS ; F-31077 Toulouse, France
E-mail: [email protected]

Received: date / Accepted: date

Abstract A recent concern in marine robotics is to consider the deployment of fleets of Autonomous Underwater Vehicles (AUVs) and Autonomous Surface Vehicles (ASVs). Multiple vehicles with heterogeneous capabilities have several advantages over a single vehicle system, in particular the potential to accomplish tasks faster and better than a single vehicle. In this context, this paper addresses the problem of underwater target localization. A systematic and exhaustive coverage strategy is not efficient in terms of exploration time: it can be improved by making the AUVs share their information to cooperate, and optimize their motions according to the state of their knowledge on the target localization. We present techniques to build environment representations on the basis of which adaptive exploration strategies can be defined, and define an architecture that allows information sharing and cooperation between the AUVs. Simulations are carried out to evaluate the proposed architecture and the adaptive exploration strategies.

Keywords Multiple AUV cooperation · Adaptive exploration · Control architecture

1 Introduction and approach

Most operational robotic systems such as spacecraft, rovers and underwater vehicles execute predefined commands that are sequenced and monitored by remote operators. In such cases, the communications between the control station and the vehicles are highly constrained, and the presence of operators in the control loop affects the efficiency of the mission. Improving this efficiency calls for the development of autonomous decisional abilities to plan the robot activities, control the execution of goals and monitor the state of the system.

The problem of exploring an unknown area is a common one in mobile robotics. Strategies that ensure the coverage of the entire area are not optimal in terms of resource costs, namely energy consumption and execution time. Adaptive exploration strategies that control the vehicle trajectories in order to maximize the information gain are more efficient [1]. In general, several vehicles bring robustness and allow for faster and more efficient missions. When it comes to information gathering missions (e.g. exploration, surveillance, target detection and localization), synergies occur when the vehicles effectively communicate to merge the information gathered on the environment and to coordinate their observation plans. The robotics community has produced a substantial amount of work in this context, and there are numerous contributions in the literature on communicating robot fleets (e.g. [2]).

This paper proposes an adaptive exploration strategy to address the problem of underwater target detection and localization, using a fleet of Autonomous Underwater Vehicles (AUVs) and an Autonomous Surface Vehicle (ASV). Using an ASV as a communication hub reduces the AUVs' energy consumption and the overall mission duration, as the AUVs are not required to surface to exchange data. Furthermore, the ASV can correct the AUV position drift that inevitably occurs, and underwater communications are better established when both vehicles are within a vertical acoustic cone with respect to each other [3]. In our approach, each AUV gathers information to locate targets, while the ASV acts as a communication hub between all AUVs and refines their localization estimate. Every AUV has autonomous control abilities that enable it to achieve a given pre-planned sequence of tasks and motions, and rendezvous with the ASV are required to exchange data between the AUVs and for the operator to monitor the mission execution.

Related work. Zhang et al. [4] propose an adaptive sampling strategy for a team of ASVs. Their approach relies on a partition into "equal gain" areas, which are then explored by individual vehicles. Popa et al. [5] developed an adaptive sampling algorithm that uses the measured information to direct the vehicle towards the most relevant information about the target. The authors use a routing algorithm to minimize the motion cost of the vehicle and maximize the information gain (reduce the entropy). Meliou et al. [6] introduce a non-myopic algorithm that uses the collected information: they take a greedy algorithm and turn it into a non-myopic one by adding values along each vehicle path. Low et al. [7] drive the AUVs according to the sensed measurements, directing the vehicles towards the detected targets: the approach yields finer localization of targets compared to systematic sampling. Low et al. [8] describe an adaptive multi-robot exploration strategy for performing both wide-area coverage and hotspot sampling using non-myopic path planning, based on a dynamic programming formulation. They apply Gaussian and log-Gaussian processes, and analyze whether the resulting strategies are adaptive and maximize wide-area coverage and hotspot sampling. The robot chooses the next cell to explore by maximizing the information gain; only one robot can choose to sample a new location at each stage while the other robots remain still. CoDA (Cooperative Distributed Autonomous oceanographic sampling networks control [9]) defines a cooperative protocol for intelligent control. Several AUVs and other instrument platforms gather data over the long term in an area, and two levels are designed: the MLO (meta-level organization) and the TLO (task-level organization). The MLO is a loose organization of multiple AUVs that self-organize to analyze the mission and the available resources, while the TLO is focused on an efficient organization to carry out the mission. The MLO will thus most of the time look like a consensus-based group of vehicles, while the TLO might be a hierarchy, a committee, a team, a market, or any other organizational type that fits the situation. This protocol design leaves the freedom to choose the desired cooperation protocol and is scalable, but each vehicle has to broadcast its organization when it enters the MLO. Following a similar idea, Johnson et al. [10] defined a mapping strategy for AUVs: a low resolution map is related to an AUV position, and a high resolution map contains more precise locations for the MLO. In this work, the authors use one AUV as a leader to refine the location of the other AUVs in the MLO.


A knowledge-based approach is also used for task allocation in multi-robot teams, such as in ALLIANCE [11], in which robots model the ability of team members to perform the tasks of the system by perceiving operating team members and collecting relevant task quality measures, such as time to task completion. Robots then use these models to choose the tasks to achieve, with the objective of benefiting the group of robots. In general, these approaches need permanent, long range and large bandwidth communications, properties that are not satisfied with underwater vehicles. The MOOS architecture [12] is a behavior-based architecture that aims at extending the mono-robot work of MAUVS [13] for sampling, ocean floor survey, etc. to multiple AUVs. Data is explicitly shared between vehicles using permanent acoustic communication. The objective of the MOOS architecture is to optimize the path of each AUV, knowing the sampling locations. In our case, we are trying to optimize the motions of a fleet of AUVs while they explore an unknown area.

Approach. Most of these existing contributions propose purely adaptive strategies [4][5][6][8][9], or approaches that need permanent, long range and large bandwidth communications [11]. However, from an operational point of view, an AUV cannot be freewheeling under the surface, depending entirely on the data it gathers to define its motions. To ensure safe and monitored execution of sampling missions, information-driven strategies have to be integrated within a pre-planned scheme that contains communication rendezvous, i.e. spatio-temporal constraints. To take these spatio-temporal constraints into account and to control the execution of the mission, we rely on the T-ReX architecture (Teleo-Reactive EXecutive [14]), which embeds a planner and a mission execution controller, and extend it to take into account both cooperative and adaptive exploration strategies (the CoT-ReX architecture). The contributions of this article are:

– the introduction of a data-driven approach which aims at maximizing the number of detected targets and optimizing their localization precision,
– the integration of this approach with a task planner and an execution controller that take into account the temporal constraints of each vehicle.

Our work differs from Johnson et al. [10] in that we use an ASV, properly localized, to refine the AUVs' localization. Sections 2 to 4 introduce the models and algorithms used to localize the targets, and how their results are used to adapt the motions of the AUVs. The cooperative architecture, which embeds these algorithms and allows adaptive behaviors while satisfying the initially planned time constraints, is presented in section 5, and section 6 discusses the results of the simulations in different case studies.


2 Target and target detection models

We consider the case in which the target to localize is a vent on the seabed that emits hot water. The emission of the vent expands as it rises from the sea floor. We defined two target models, depending on whether the maximal temperature is known or not.

Fig. 2: Evolution of the temperature as a function of the horizontal distance ρ for the two depths z1 and z2 defined in figure 1. Dashed lines represent the dispersion of the temperature.

\[
T_{mean}(\rho, z) =
\begin{cases}
T_{max}(z) - \rho \, \dfrac{T_{max}(z) - T_0}{\rho_{max}(z)} & \text{if } \rho \le \rho_{max}(z)\\[6pt]
T_0 & \text{if } \rho > \rho_{max}(z)
\end{cases}
\qquad (2)
\]

where

Fig. 1: Illustration of the temperature evolution within a thermal plume in stationary waters: the temperature, here represented in red (the redder the hotter), decreases with the elevation and with the distance to the vertical of the emitting vent.

2.1 Target detection model with known maximal temperature

The temperature T within the plume is a decreasing function of the horizontal distance ρ to the plume center and of the elevation z above the seabed (figure 1). This function is the model of the plume, which is an approximation of the actual diffusion phenomenon: the model is probabilistic and expresses the probability density function (pdf) of the temperature T as a function of the distance ρ and the elevation z:

\[
P(T = t \mid \rho, z) \qquad (1)
\]

Figure 2 shows the behavior of the model at two different elevations: the dispersion of the temperature is also an increasing function of the distance and the elevation. Note that multiple sources do not interfere if they intersect, the water temperature being a function of the closest source (this assumption would not hold in the case of the emission of chemicals, whose concentration increases within the intersection of plumes: such cases call for a different modeling of the pdf).

\[
\rho_{max}(z) = \alpha z,
\qquad
T_{max}(z) =
\begin{cases}
T_{MAX} - z \, \dfrac{T_{MAX} - T_0}{z_{MAX}} & \text{if } z \le z_{MAX}\\[6pt]
T_0 & \text{if } z > z_{MAX}
\end{cases}
\qquad (3)
\]

These equations model the plume as a cone, the parameters zMAX and α respectively defining its height and aperture. This is certainly a simplification of the actual diffusion phenomenon. In particular, it assumes that there is no current: a stationary current independent of the depth would simply generate an oblique cone, and the consideration of dynamic currents that are a function of depth would require a more complex parametrization. Finally, for a given position (ρ, z) in the plume, the probabilistic variations of the temperature are modeled by a Gaussian: T = N(Tmean, σ(ρ, z)), where σ(ρ, z) is an increasing function of ρ and z. The temperature sensor is modeled by a probability density function P(Tsensor | T) that models its errors (e.g. a Gaussian). The overall source perception model is a convolution of the source and sensor models, which results in a pdf akin to (1), the associated variations being "blurred" by the sensor model – we however ignore the sensor errors, since most of the uncertainty comes from the model of the observed phenomenon. Note that when the target maximal temperature is known, this model is equivalent to a range-only sensor: one temperature measure provides an estimate of the distance to the target.
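To make the range-only equivalence concrete, the following is a minimal Python sketch of the cone model of equations (2) and (3) and of the inversion of a temperature measure into a distance estimate. The numerical values of T0, TMAX, zMAX and α are arbitrary placeholders, not values used in the paper.

```python
# Illustrative plume parameters (arbitrary values, not those of the paper)
T0, T_MAX, Z_MAX, ALPHA = 10.0, 60.0, 50.0, 0.5

def t_max(z):
    """Maximal (centerline) temperature at elevation z, equation (3)."""
    if z > Z_MAX:
        return T0
    return T_MAX - z * (T_MAX - T0) / Z_MAX

def rho_max(z):
    """Horizontal extent of the plume at elevation z (cone aperture), equation (3)."""
    return ALPHA * z

def t_mean(rho, z):
    """Mean temperature at horizontal distance rho and elevation z, equation (2)."""
    rm = rho_max(z)
    if z > Z_MAX or rho >= rm:
        return T0
    return t_max(z) - rho * (t_max(z) - T0) / rm

def distance_from_temperature(t, z):
    """Invert equation (2): with T_MAX known, a single temperature measure
    gives an estimate of the horizontal distance to the vent (range-only sensor)."""
    tm = t_max(z)
    if z > Z_MAX or t <= T0 or tm <= T0:
        return None  # outside the plume: the measure carries no range information
    return rho_max(z) * min(max(tm - t, 0.0), tm - T0) / (tm - T0)

if __name__ == "__main__":
    z = 20.0
    for rho in (0.0, 3.0, 8.0):
        t = t_mean(rho, z)
        print(f"rho={rho:4.1f}  T={t:5.2f}  recovered rho={distance_from_temperature(t, z)}")
```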


2.2 Target detection model with unknown maximal temperature

When TMAX is not known, two temperature measures provide an estimate of the direction of the target, as if the vehicle were equipped with a bearing-only sensor. The direction of the target is computed according to algorithm 1.

Algorithm 1 Algorithm to compute the target direction.
Require: T^k: the current measured temperature; T^(k−1): the former measured temperature; Direction: the direction of the target; Pos_k: the current position of the vehicle; Pos_(k−1): the previous position of the vehicle.
1: δT ← T^k − T^(k−1)
2: if (δT > 0) then
3:   Direction ← the direction of the vector from Pos_(k−1) to Pos_k
4: else
5:   if (δT < 0) then
6:     Direction ← the direction of the vector from Pos_k to Pos_(k−1)
7:   else
8:     Direction ← the direction perpendicular to the vector from Pos_(k−1) to Pos_k
9:   end if
10: end if
11: return Direction
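A minimal Python sketch of Algorithm 1, assuming 2-D vehicle positions; the function and variable names are illustrative, not part of the authors' implementation.

```python
import numpy as np

def target_direction(t_k, t_km1, pos_k, pos_km1):
    """Sketch of Algorithm 1: estimate the target direction from two successive
    temperature measurements (bearing-only case, T_MAX unknown).

    pos_k, pos_km1: 2-D positions of the vehicle at steps k and k-1.
    Returns a unit vector pointing towards the estimated target direction.
    """
    delta_t = t_k - t_km1
    motion = np.asarray(pos_k, dtype=float) - np.asarray(pos_km1, dtype=float)
    norm = np.linalg.norm(motion)
    if norm == 0:
        raise ValueError("the two positions must differ")
    motion /= norm
    if delta_t > 0:        # temperature increased: the target lies ahead
        return motion
    if delta_t < 0:        # temperature decreased: the target lies behind
        return -motion
    # unchanged temperature: the target lies on the perpendicular to the motion
    return np.array([-motion[1], motion[0]])

# Example: moving east while the temperature rises points the estimate east.
print(target_direction(17.2, 15.9, pos_k=[10.0, 0.0], pos_km1=[0.0, 0.0]))
```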

The variable Direction contains the direction of the target. The farther a cell lies from this direction, the lower its probability of containing the target (Figure 3).

Fig. 3: (a) Illustration of the target model at a given depth; the colors indicate the temperature. (b) and (c): application of the bearing-only and range-only target detection models; the colors here represent the probability of target presence on a regular Cartesian grid (see section 3).

This model is shown in figure 4: the direction is estimated with an error that increases with the distance from the target direction. The model is defined by the probability density function of the temperature T as a function of the direction dir and the elevation z:

\[
P(T = t \mid dir, z) \qquad (4)
\]

This model is specified by a Gaussian: T = N(Tmean, σ(dir, z)).
Fig. 4: Evolution of the temperature as a function of degrees “deg” for two depths z1 and z2 (defined in the figure 1).

3 Map building


This section presents how the data gathered as the AUVs move are integrated in a map that represents the probabilities of target presence.

3.1 Map structure and update

The environment is represented by a grid map, that is a collection of N × M cells ordered along a regular Cartesian pattern. To each grid cell {x_{i,j}}, i ∈ [0, N[, j ∈ [0, M[ is associated a probability value P_k(x_{i,j}) that represents the probability that a target is located within the cell at time k. A boolean value is also associated to each cell, which is set to one when a vehicle has visited the cell – this avoids fusing twice the data acquired from the same position. The cell probabilities are updated incrementally according to a classical Bayesian paradigm under a Markov assumption:


\[
P_k(x_{i,j}) = \frac{P(T^k \mid x_{i,j} = vent) \; P_{k-1}(x_{i,j})}{P(T^k)} \qquad (5)
\]

where P(T^k | x_{i,j} = vent) is the sensor model and P_{k−1}(x_{i,j}) is the probability of target existence at time k − 1. Note that the probability P_k(x_{i,j}) implicitly represents the precision of the source location: a probability equal to 1 means that the source is perfectly localized.
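The following sketch illustrates the grid structure of section 3.1 and the update of equation (5). The Gaussian likelihood used for P(T^k | x_{i,j} = vent), the background temperature and the approximation of P(T^k) by marginalizing over the two hypotheses are assumptions made for the example, not the paper's exact sensor model.

```python
import numpy as np

class TargetGrid:
    """Grid of target-presence probabilities (section 3.1)."""

    def __init__(self, n, m, prior=0.01):
        self.p = np.full((n, m), prior)        # P_k(x_ij): probability of a vent in the cell
        self.visited = np.zeros((n, m), bool)  # avoids fusing the same observation twice

    def update(self, i, j, t_measured, t_expected, t_background=10.0, sigma=2.0):
        """Bayesian update of one cell, equation (5).

        t_expected: temperature predicted by the plume model if the vent were in (i, j).
        The likelihood is assumed Gaussian; P(T^k) is approximated by marginalising
        over the 'vent' and 'no vent' hypotheses (an assumption of this sketch).
        """
        if self.visited[i, j]:
            return
        lik_vent = np.exp(-0.5 * ((t_measured - t_expected) / sigma) ** 2)
        lik_empty = np.exp(-0.5 * ((t_measured - t_background) / sigma) ** 2)
        prior = self.p[i, j]
        evidence = lik_vent * prior + lik_empty * (1.0 - prior)
        self.p[i, j] = lik_vent * prior / evidence
        self.visited[i, j] = True

grid = TargetGrid(10, 10)
grid.update(3, 4, t_measured=14.0, t_expected=15.0)
print(grid.p[3, 4])
```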

3.2 Multi-robot map building

Communication model. Communication is established between the ASV and the AUVs to properly monitor the mission execution, but especially to exchange data (target maps) between the AUVs.


Underwater communications are very constrained and complex to model [3]: we choose a conservative model, which states that an AUV and the ASV can communicate if and only if they are located within the same grid cell in the horizontal plane. [15] gives the different ranges and distances between vehicles that can be used to allow communication; in our case, we choose conservative communication figures, with a bandwidth of 100 bps and a communication distance between two vehicles varying from 10 to 100 meters.

Fusing maps among underwater vehicles. At each communication point with the ASV, the AUV only uploads the information gathered since the last communication, to avoid data redundancy and to save bandwidth. The ASV gathers the latest map obtained by each AUV, and at every communication it sends back the latest explored cells of the other AUVs. The information from the various AUVs is then fused; this fusion is performed on board the AUVs, the ASV acting as a centralized information hub. This process is illustrated in figure 5.

Communication bandwidth and amount of data exchanged. The effectiveness of a cooperative approach depends on the way the sensed data is treated and forwarded: sending only the strictly necessary data to the other vehicles leaves the vehicles more time for exploration. The size of the sensed data |d0| for each cell is 40 bits and the bandwidth L is 100 bps; nd represents the number of explored cells. The communication time comt is computed as follows:

comt = nd · |d0| / L

where the data size is calculated as |d0| = |d1| = · · · = |di| = |<x0; zx0>| = |x0| + |zx0| = 2 · 16 bits + 8 bits = 40 bits. Replacing the data size by 40 bits and the bandwidth by 100 bps, the communication time is equal to comt = 0.4 · nd seconds. If the communication time is bounded by 2 minutes, the vehicle can explore less than 150 cells (each cell represents 100 m2). In general, because of the localization error of the vehicle, the vehicle can explore at most a distance of 1.2 km (12 cells) at a speed of 1.5 m/s. This limit indicates that the vehicle will communicate for less than 2 minutes, which is a reasonable communication time.

Fig. 5: The evolution of the cell probability to contain a source by fusing data acquired by two AUVs. Top: AUV1 sends its new map to the ASV, receives the latest map (here empty) of AUV2 and updates its map. Bottom: when the AUV2 communicates with the ASV, it sends its new map to the ASV, receives the latest map of AUV1, and updates its own map using equation (5). The new map contains more precise information on the presence of the targets.


Algorithm complexity. To avoid data redundancy, the vehicle sends the cells observed between the last communication point and the current one. The ASV keeps the latest map of each vehicle, and a simple algorithm is used to update the ASV map. The complexity of the algorithms for the AUVs and the ASV at each communication point comi is the following:

– The ASV gets the map of one underwater vehicle: the maximal number of cells that can be exchanged is nd = 150 cells. After getting all the new cell information, the ASV updates the global map with at most N · M · Nrobots operations. Once this map is updated, the ASV sends to each AUV the maps of all the other AUVs (Nrobots · nd cells). The total number of operations for the ASV is N · M · Nrobots + Nrobots · nd.
– Each AUV sends its newly measured data (nd cells) and gets from the ASV Nrobots · nd new cells. The AUV then updates its map using at most N · M · Nrobots operations.
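The exchange described above can be sketched as follows; the class and method names are hypothetical, and the fusion of the forwarded cells (equation (5)) is left to the receiving AUV.

```python
class ASVHub:
    """Sketch of the ASV acting as a centralised information hub (section 3.2).

    Each AUV uploads only the cells it has explored since its last rendezvous
    (a dict {(i, j): probability}); the ASV keeps the latest map of every AUV
    and returns the recent cells of the other vehicles so that fusion can be
    done on board each AUV. The fusion rule itself is kept abstract here.
    """

    def __init__(self, auv_ids):
        self.latest = {auv: {} for auv in auv_ids}   # last known map of each AUV
        self.pending = {auv: {} for auv in auv_ids}  # cells not yet forwarded to that AUV

    def rendezvous(self, auv_id, new_cells):
        # Store the delta uploaded by this AUV ...
        self.latest[auv_id].update(new_cells)
        # ... queue it for every other vehicle ...
        for other in self.pending:
            if other != auv_id:
                self.pending[other].update(new_cells)
        # ... and hand back what the others produced since this AUV last communicated.
        outgoing, self.pending[auv_id] = self.pending[auv_id], {}
        return outgoing

hub = ASVHub(["auv1", "auv2"])
hub.rendezvous("auv1", {(2, 3): 0.7, (2, 4): 0.4})   # AUV1 uploads, nothing to receive yet
print(hub.rendezvous("auv2", {(5, 5): 0.6}))         # AUV2 receives AUV1's cells
```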

4 Adaptive motion strategies

The built maps are the basis on which the AUV motions can be adapted in order to augment the knowledge on the environment. We defined two strategies in which the AUVs select their next motions, depending on a source presence hypothesis (a local maximum of P(x_{i,j}) in the mapped vicinity of the current AUV position).


One greedy strategy aims at confirming the presence and localization of the closest source hypothesis until its probability exceeds a threshold Ploc, whereas the other strategy aims only at confirming the presence of a target, assessed when its probability exceeds a threshold Pconf (Pconf < Ploc).

4.1 Greedy Information Driven (G.I.D)

The motion that maximizes the source detection is straightforwardly the one that drives the AUV towards the direction of this maximum. This strategy has predefined waypoints and communication points, with the possibility to redirect the vehicle according to the temperature measures. The vehicle heads toward the closest source hypothesis until its probability exceeds Ploc, which means the source is faithfully localized. Algorithm 2 shows how the next cell to visit is selected every time new data is acquired. If the maximal temperature of the target is known (range-only sensor case), the vehicle follows the maximal value of the target: this is why, in algorithm 2 line 12, the next cell is chosen as follows:

\[
\max_{a_i \in \{left,\, right,\, behind,\, front\}} P(x_{i,j} \mid x_{i,j} = NotExplored) \qquad (6)
\]

In equation (6), the maximization function is evaluated with a one-action look-ahead. An alternative is to evaluate, over a finite horizon, a prediction of the whole map:

\[
\max_{a_i \in \{left,\, right,\, behind,\, front\}} f\!\left(z_{\tau(x_i, a_i)}\right) \qquad (7)
\]

where f(z_{τ(x_i, a_i)}) is a function that predicts the future values in the map if the chosen action is a_i. For example, suppose that the target is on the left side of the vehicle: a_i = left implies that the next cells keep the same values as before, whereas if a_i = front, the probabilities of the next cells differ. Based on this reasoning the vehicle chooses the next cells to explore. Unlike equation (6), equation (7) evaluates the maximization over a finite horizon on the whole map.

4.2 Global Information Gain driven (G.I.G)

The vehicle heads toward the path that collects the most information about the source hypothesis until its probability exceeds Pconf: with respect to the previous strategy, this strategy reduces the number of vehicle actions, at the cost of less precisely localized sources. The difference between GID and GIG is the number of actions: GID localizes the target with more actions than GIG, but with a better accuracy. This difference is due to the threshold difference (Pconf < Ploc). The target direction is based on the probabilistic target model, which makes our approach able to deal with the presence of noise.

Algorithm 2 Algorithm of the next cell to explore.
Require: D: the diameter of the target; Target_{x,y}: the coordinates of the target; P(x_{i,j}): the probability that x_{i,j} contains the target; a: an action; τ: a transition function that, from an action and a cell, generates the next cell.
1: Update the grid using equation (5).
2: if (∃ i, j : i ∈ ]0, N], j ∈ ]0, M], P(x_{i,j}) ≥ Pconf) then
3:
4:   for (∀ i, j : i ∈ ]0, N], j ∈ ]0, M]) do
5:     if (0 < |x_{i,j} − Target_{x,y}| < D) then
6:       P(x_{i,j}) ← 0
7:       P(Target_{x,y}) ← 1
8:     end if
9:   end for
10: end if
11:
12: a ← chosen using equation (6) or (7)
13: x'_{i,j} ← τ(a, x_{i,j})
14: return x'_{i,j}
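A possible Python reading of the selection performed at line 12 with equation (6), i.e. the one-action look-ahead of the GID strategy; the grid indexing convention and the handling of ties are assumptions.

```python
import numpy as np

ACTIONS = {"front": (0, 1), "behind": (0, -1), "left": (-1, 0), "right": (1, 0)}

def next_cell(prob, explored, i, j):
    """One-action look-ahead of equation (6): among the four neighbouring cells,
    pick the unexplored one with the highest target probability (GID strategy).

    prob: 2-D array of P(x_ij); explored: boolean array of already visited cells.
    Returns the chosen action and the resulting cell, or None if no move is possible.
    """
    best = None
    for action, (di, dj) in ACTIONS.items():
        ni, nj = i + di, j + dj
        if not (0 <= ni < prob.shape[0] and 0 <= nj < prob.shape[1]):
            continue                      # outside the map
        if explored[ni, nj]:
            continue                      # equation (6) only considers unexplored cells
        if best is None or prob[ni, nj] > prob[best[1]]:
            best = (action, (ni, nj))
    return best

p = np.random.default_rng(0).random((5, 5))
visited = np.zeros((5, 5), bool)
print(next_cell(p, visited, 2, 2))
```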

If the maximal temperature of the target is not known, the vehicle moves in another direction to collect more information about the exact target location. In algorithm 2 line 12, the vehicle then uses equation (7) to choose the next cell to explore.

4.3 ASV: Exploration strategy

Due to the restricted energy supply of each underwater vehicle and the energy cost of communicating, the number of communication rendezvous has to be reduced. This is why a vehicle can explore areas without communication until some maximal distance is reached. This maximal distance is defined by the motion model, and is set so that the position uncertainty remains below the cell size. The idea is to find a minimal number of rendezvous so as to extend the mission exploration. It is a complex problem, and we propose a simple way to generate communication points for two orthogonal swathing patterns. For that, we define for each vehicle a matrix that has the same dimensions as the exploration map. This matrix A_k(i, j) represents the time at which vehicle "k" will be at position (i, j). A communication point between two (or more) vehicles exists when they are at the same place at the same time (i.e. if A_l(i, j) = A_m(i, j) then vehicle "l" will meet vehicle "m" at place (i, j)). After defining all communication points between vehicles (i.e. Com_{x,y}), the second objective is to reduce the number of rendezvous. Taking into account that each vehicle cannot exceed a maximal exploration distance without communicating, some communication points cannot be removed.


The main idea of the resolution is that a communication point is removed when it is not needed by the other vehicle. For instance, consider the following two matrices of two AUVs:

A1(x, y) = [ 0 5 6 ; 5 4 3 ; 2 3 8 ],   A2(x, y) = [ 0 1 2 ; 1 4 7 ; 6 7 8 ]

then

Com(x, y) = [ 0 0 0 ; 0 1 0 ; 0 0 1 ]

We suppose that the maximal exploration distance of vehicle "1" is 8 moves and that of vehicle "2" is 5 moves. The intersection matrix is defined by Com_{x,y} = 1 if A1(x, y) = A2(x, y), otherwise Com_{x,y} = 0. The proposed algorithm explores Com_{x,y} using the exploration strategy of AUV1 and removes the additional communication points, but does not remove the necessary communication points such as Com_{2,2}, which is necessary for AUV2. The matrix of communication points then becomes:

Com(x, y) = [ 0 0 0 ; 0 1 0 ; 0 0 0 ]
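The example above can be reproduced with the following sketch; the chronological greedy pruning used here is one possible reading of the removal rule described in the text, and the exclusion of the common start time is an assumption.

```python
import numpy as np

def rendezvous_points(a1, a2, exclude_start=True):
    """Cells where both vehicles are present at the same time (Com matrix)."""
    com = (a1 == a2)
    if exclude_start:            # assumption: the common start time 0 is not a rendezvous
        com &= (a1 > 0)
    return com

def prune(a1, a2, com, max_moves_1, max_moves_2):
    """Greedy pruning sketch: walk the candidate rendezvous in chronological order and
    keep one only if dropping it would make some vehicle exceed its maximal number of
    moves without communicating (a simplified reading of the strategy in the text)."""
    times = sorted((int(a1[i, j]), (int(i), int(j))) for i, j in zip(*np.where(com)))
    kept, last1, last2 = [], 0, 0
    mission_end = int(max(a1.max(), a2.max()))
    for idx, (t, cell) in enumerate(times):
        nxt = times[idx + 1][0] if idx + 1 < len(times) else mission_end
        needed = (nxt - last1 > max_moves_1) or (nxt - last2 > max_moves_2)
        if needed:
            kept.append(cell)
            last1 = last2 = t
    return kept

A1 = np.array([[0, 5, 6], [5, 4, 3], [2, 3, 8]])
A2 = np.array([[0, 1, 2], [1, 4, 7], [6, 7, 8]])
com = rendezvous_points(A1, A2)
print(com.astype(int))            # candidate rendezvous
print(prune(A1, A2, com, 8, 5))   # only the central rendezvous survives
```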

Fig. 7: An example that illustrates the use of timelines, activities and constraints in T-ReX.

5 Integration within the CoT-ReX architecture

With a fleet of AUVs assisted by an ASV for the communications, three objectives have to be satisfied:

1. Predefined way-points: these pre-defined points constitute the initial mission plan – swathing exploration patterns.
2. Opportunistic goals: these points are generated according to the built target maps, following equation (6) or (7).
3. Predefined communication points with the ASV: these points are essential for the vehicle positioning and data exchanges.

These goals can be prioritized in various manners, depending on whether one wants to fulfill the initial plan or adapt the AUV motions to the gathered data. To enable the achievement of exploration strategies, all the involved processes must be organized and controlled. In the next section we explain how this is done.

5.1 T-ReX

To endow each vehicle with some autonomy, we have used an architecture for planning and execution control called T-ReX [14], originally developed at MBARI (Monterey Bay Aquarium Research Institute). This architecture provides each vehicle with the capability to plan and to execute its plan on board. T-ReX incorporates planners that can cope with different planning horizons and deliberation times. A T-ReX agent is divided into several layers called "reactors" (R = {r1 . . . rn}). Each reactor can be deliberative or reactive (depending on its horizon and deliberation time). The basic data structures used in T-ReX are timelines (L). A timeline represents the evolution of a variable over time as a set of ordered activities called tokens (τ(L)) that are mutually exclusive. For instance,

holds(Path, 10, 20, Going(WaterPlace1, WaterPlace2))

means that the robot will move from WaterPlace1 to WaterPlace2 between 10 ut and 20 ut (ut = unit of time). Location represents a timeline and Going(WaterPlace1, WaterPlace2) from 10 to 20 is a token. Temporal constraints can be associated to tokens: these constraints can be compatibilities (temporal constraints taken among the thirteen temporal relations of Allen [16]) or guards (conditional compatibilities similar to the conditional statements of traditional programming languages). Figure 7 illustrates the timelines used for this example. All these components (timelines, activities and constraints) are implemented in NDDL (New Domain Description Language) [17]. In T-ReX, each reactor is composed of (see Figure 8):

– A database: a data structure which holds the current plan. This database provides the planner with all the target goals and the constraints between timelines.
– A planner: it is used to populate the timelines with tokens according to what is defined in the database. Each reactor has a planner with a different "planning horizon" λr and "look-ahead" π. The lower we go in the architecture, the more reactive the reactor is, and vice versa. In the current implementation, T-ReX uses EUROPA [18] as the planner.

Fig. 6: Illustration of the map updates resulting from the GID strategy in an environment with a single target. The left image represents the temperatures at a given depth, (1) shows the grid map probabilities evolution in the case where the maximal temperature is unknown, (2) shows the probabilities evolution when the maximal temperature is known.

Fig. 8: Description of the T-ReX architecture.

Fig. 9: The T-ReX architecture used at MBARI [19].

– A dispatcher: it allows goals (Gr) to be dispatched to other reactors. This management is done at each unit of time, called a tick.
– A synchronizer: it coordinates observations (Or) coming from other reactors. These observations are received through external timelines (Er). External timelines can be observed but cannot be modified by the reactor.

T-ReX has been used extensively on one AUV at MBARI to plan its mission and to control it [19]. In that particular setup, the architecture is composed of two reactors (see Figure 9):

– The Mission Manager reactor provides high-level reasoning to satisfy all mission goals. It is considered as the deliberative reactor, and its look-ahead is the whole mission.
– The second reactor is the Executive reactor. It receives goals from the Mission Manager reactor and plans to

send behaviors to the functional level of the vehicle. The planning horizon of this reactor is smaller than that of the Mission Manager reactor, which is why it is considered as a reactive reactor.
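As an illustration of the timeline/token data structure that T-ReX manipulates, here is a minimal Python sketch (the actual models are written in NDDL); the class names and the mutual-exclusion check are illustrative, not the T-ReX API.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Token:
    """One activity holding on a timeline over [start, end) time units,
    e.g. Going(WaterPlace1, WaterPlace2) from 10 to 20."""
    predicate: str
    start: int
    end: int

@dataclass
class Timeline:
    """A state variable whose value over time is a sequence of mutually exclusive tokens."""
    name: str
    tokens: List[Token] = field(default_factory=list)

    def holds(self, token: Token) -> bool:
        """Insert a token if it does not overlap an existing one (mutual exclusion)."""
        if any(t.start < token.end and token.start < t.end for t in self.tokens):
            return False
        self.tokens.append(token)
        self.tokens.sort(key=lambda t: t.start)
        return True

def meets(a: Token, b: Token) -> bool:
    """Allen's 'meets' relation: a ends exactly when b starts."""
    return a.end == b.start

path = Timeline("Path")
going = Token("Going(WaterPlace1, WaterPlace2)", 10, 20)
at_wp2 = Token("At(WaterPlace2)", 20, 35)
print(path.holds(going), path.holds(at_wp2), meets(going, at_wp2))
```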

5.2 CoT-ReX

For the cooperative architecture to be both reactive and deliberative, the robot has to achieve its mission and adapt its plan. In the scenario we consider, the objective is to detect and localize targets. We assume that the vehicle has a predefined ordered set of goals to achieve (way points, communication rendezvous) that defines an a priori exploration strategy. We added the MapReactor as a new reactor in the T-ReX architecture: it is the component that takes into account the perception of the vehicle and generates new goals for the Mission Manager.


To make each vehicle cooperative, we added a CoopReactor that gets sensory data from the other vehicles and passes them to the MapReactor, and vice versa. Figure 10 shows the proposed architecture, where each reactor has a specific role. This new architecture is called CoT-ReX; its first two reactors are similar to the ones proposed by MBARI (see figure 8). The architecture consists of the following reactors:

Fig. 10: Illustration of the proposed architecture in simulation mode (the "simulation" block emulates the same interfaces as a real AUV).

– The Mission Manager represents the deliberative reactor. It manages the high level goals of the whole mission; its look-ahead is the whole mission.
– The Executer (functional level) is the reactive reactor. The planning horizon of this reactor is smaller than that of the Mission Manager reactor.
– The MapReactor (generation of opportunistic goals) has the role of generating opportunistic goals. These goals are taken into account by the Mission Manager.
– The CoopReactor (collaborative mechanism) is the cooperative reactor, which communicates with the other vehicles through the ASV.
– The Simulator: this reactor simulates the vehicle motion by defining maximum and minimum limits on the vehicle speed, pitch, heading, depth, etc.

5.2.1 The timelines in the Mission Manager

The different timelines of the Mission Manager reactor (see figure 13) are explained below:

– MissionGoals: this internal timeline contains three different tokens: Inactive, Pickup and PathSurvey. The Inactive token does not do anything. The Pickup token lets the vehicle take samples of a specific area. The PathSurvey token defines the survey that the vehicle can perform given two points: in figure 11 for example, the PathSurvey defines the two points A and B and the transect width of the exploration. The PathSurvey then generates the other waypoints according to the specified transect width. Each waypoint is represented by a Going token from the beginning of a transect to its ending point, and the precedence constraints between the generated waypoints are also taken into account. For this example the PathSurvey contains the different Going(x, y) tokens that achieve the required path (see figure 11 – other strategies of exploring a given area can be defined).

Fig. 11: Illustration of the use of a PathSurvey. The points A and B are the parameters of the PathSurvey; the other points are generated by the PathSurvey according to a predefined transect width.

– PathController: it is an external timeline that implements the constraints of the navigation path. PathController is used for general vehicle positioning and to deal with the path constraints. Note that it is not designed to spatially constrain the vehicle motion in general, but only to make the vehicle traverse a path when required. This timeline is active when the token Going is active, otherwise the associated token is Inactive.

5.2.2 The timelines in the Executer

The Executer reactor has a smaller look-ahead (5 ut) and planning horizon (1 ut) than the Mission Manager reactor. The timelines used and their associated tokens are:

– Actions: this internal timeline is executed in one unit of time. Two tokens are associated with it: Idle, which does nothing, and DoBehavior, which is used to give properties to other tokens. For example, if a token is preceded by a DoBehavior token, it can be executed directly at the end of the execution of the DoBehavior; otherwise, the token is executed at the latest time of its interval.
– PathController: this internal timeline allows the vehicle to move and to control its path.

– Mutex: It is an internal timeline used to execute one token at a time. Two tokens are defined: InUse means that there are tokens that are executed, Free means no token is executed. – Communication: This timeline is an external timeline that allows or disallows the vehicle to communicate. Communication timelines contain a token called Active and another Inactive. When the token is Active, then the vehicle has to communicate, otherwise the communication is not allowed. – VehicleState: It is an external timeline that gives the state of the vehicle at any time. The state of the vehicle is represented by the depth, northing, easting, etc. – SetPoint, Ascend, Descend and Waypoint: All these timelines are external and they give the location of the ascend/decscent/setpoint. In the simulation case, the path is computed with predefined functions and values.

Executer !"#

%$Actions PathController

GetObservation ()

!.*/0,+

%$&'($)*+,'-% 1',.2%

!.34

%$Mutex Communication

GetObservation ()

!.*/0,+

%$5-$$% 6/0,+

%$VehicleState

7'#"4%

SetPoint

6/0,+

%$Ascend

6/0,+

%$Descend

6/0,+

%$waypoint

6/0,+

%$Request() MapReactor

Simulator

Grid Dispatcher


DataFusion
 DataBase


ComputeGoal


Fig. 12: Illustration of different function in the MapReactor

5.2.3 The architecture of the MapReactor: The MapReactor (see figure 12) component is composed of a synchronizer, a database and a dispatcher. The database contains the map of the whole exploration zone and the different associated timelines. According to the computed map and the existing timelines, the vehicle changes its strategy of exploration by generating goals. Timelines: The implemented timelines are (figure 13): – Communication: This timeline is an external timeline used to know the next nearest communication point. This timeline can be Active or Inactive depending on the defined execution time. When the MapReactor generates an opportunistic goal, it estimates the required time to reach both the communication point and the opportunistic goal. To validate this new goal, the MapReactor needs to know the predefined time of the communication token. This is why the communication token is necessary for the MapReactor. – VehicleState: This timeline is external to the MapReactor and is used to provide the vehicle state for example heading, northing, etc.

!"#$%&

MissionSimulator

!"#$%&

NavigationControl 0$#*./&

-"+*./&

SetPoint

0.1()*+,&

'()*+,&

Ascend

0.1()*+,&

'()*+,&

Descend

0.1()*+,&

'()*+,&

waypoint

0.1()*+,&

'()*+,&

GPS

0.1()*+,&

'()*+,&

Gulper

0.1()*+,&

'()*+,&

VehicleActivity

Synchronizer


!"#$%&

VehicleState

TimeLines

Fig. 13: Illustration of different timelines used at each reactor in the T-ReX architecture

– SetPoint, Ascend and Descend are the behaviors of the vehicle; they are external timelines.

Algorithms: to allow the vehicle to change its exploration strategy we have to represent the perception of the vehicle and the timelines used, fuse the measured data with the preceding measurements, and finally find a function to compute the next goal. The algorithm for exploring a discrete environment (grid) is outlined below:

1. Update the grid using the measured data.
2. Make decisions: this allows the definition of adaptive strategies, in which the AUVs select their next motions in order to confirm the presence of a target.
3. Adaptive cooperative exploration strategy.

5.2.4 The architecture of the CoopReactor

To allow cooperation between vehicles, they have to be at predefined rendezvous points to communicate with the ASV. These rendezvous points allow the vehicles to exchange data. To manage these exchanges we added a new reactor called "CoopReactor". This reactor is composed of a synchronizer, a dispatcher and a database.


The database of this reactor contains the predefined rendezvous points for the whole mission. The internal timeline of this reactor is the Communication timeline: when the token Active is used the communication is active, otherwise the token Inactive is used (see figure 13). In our experiments, a client/server protocol is implemented through which data can be exchanged between vehicles. The exchanged data can be the explored grid or the failed tasks; all these exchanged data help the vehicle to localize the target.
