Model-based adaptive spatial sampling for occurrence map construction

N. Peyrard and R. Sabbadin

CompSust'09 - Cornell University - June 2009


Mapping spatial processes in environmental management

Mapping pest occurrence
• Building a pest occurrence map in order to eradicate
• Observations are costly
• Errors in mapping are also costly

[Figure: yearly occurrence maps P2001–P2004 and corresponding maps Y2002–Y2004]

Mapping spatial processes in environmental management

Different problems depending on the nature of observations
• Data visualization
  - Complete observations (everywhere)
  - Perfect observations (no errors / missing data)
  ⇒ How to visualize data?
• Map reconstruction
  - Complete observations
  - Noisy observations
  ⇒ How to reconstruct the "true" map?
• Sampling and map construction
  - Incomplete observations (not everywhere)
  - Noisy observations
  ⇒ Where to observe? / How to reconstruct?

Mapping spatial processes in environmental management

How to design an efficient spatial sampling method to estimate an occurrence (0/1) map when:
✓ the process to map has spatial structure
✓ observations are imperfect/incomplete
✓ sampling is costly
✓ the process does not evolve during the sampling period

Overview of the proposed approach

An optimization approach for designing spatial sampling policies. The Hidden Markov Random Field model is used for:
• Representing current uncertain knowledge about the map to reconstruct
• Updating knowledge after observations
• Defining a unique criterion for
  - map reconstruction from observed data
  - selection of spatial sampling actions

Optimal sampling problem

[Diagram: hidden variable X, sampling action a, observation Y]
• Hidden variable X
• Sampling action a
• Observation model p(Y = o | x, a)

Question: How to reconstruct the hidden variable X using sampling actions?
1. Hidden variable model
2. Updated model after sampling result
3. Hidden variable reconstruction
4. Sampling action optimization

Spatial sampling optimization

The hidden variable x is a map
⇒ The sampling optimization problem has to be revisited

Question: How to reconstruct the hidden map x using sampling actions?
1. Hidden map model
2. Updated model after sampling result
3. Hidden map reconstruction
4. Sampling action optimization

Pairwise Markov random field (1)

• Multiple interacting variables
• Independence given the neighborhood
⇒ Pairwise Markov random field

Question: How to reconstruct the hidden map x using sampling actions?
1. Hidden map model
2. Updated model after sampling result
3. Hidden map reconstruction
4. Sampling action optimization

Pairwise Markov random field (2)

• Multiple interacting variables
• Independence given the neighborhood
⇒ Pairwise Markov random field
• Interaction graph G = (V, E)
• ψi: "weights" on the states of vertex i
• ψij: "strength" of the correlation between neighbor vertices
• Z: normalizing constant / partition function

P(x) = (1/Z) ∏_{i∈V} ψi(xi) ∏_{(i,j)∈E} ψij(xi, xj)
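As an illustration (not part of the talk), the following minimal Python sketch evaluates this factorization by brute force on a tiny binary grid. The `unary` and `pairwise` tables are hypothetical potentials standing in for ψi and ψij, and Z is computed by exhaustive enumeration, which is only feasible for toy graphs.

```python
import itertools
import numpy as np

# Tiny 2x2 grid with 4-neighbour edges; binary states.
nodes = [(0, 0), (0, 1), (1, 0), (1, 1)]
edges = [((0, 0), (0, 1)), ((1, 0), (1, 1)),
         ((0, 0), (1, 0)), ((0, 1), (1, 1))]
unary = {s: np.array([1.0, 0.5]) for s in nodes}   # psi_i(x_i), hypothetical
pairwise = np.array([[1.5, 1.0],                   # psi_ij(x_i, x_j):
                     [1.0, 1.5]])                  # favours agreeing neighbours

def unnormalised(x):
    """Product of all potentials for a full configuration x (dict node -> state)."""
    p = 1.0
    for i in nodes:
        p *= unary[i][x[i]]
    for i, j in edges:
        p *= pairwise[x[i], x[j]]
    return p

# Partition function Z by exhaustive enumeration (2^|V| terms).
configs = [dict(zip(nodes, vals))
           for vals in itertools.product([0, 1], repeat=len(nodes))]
Z = sum(unnormalised(x) for x in configs)

x0 = dict(zip(nodes, [1, 1, 0, 0]))
print("P(x0) =", unnormalised(x0) / Z)
```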

Hidden Markov random field (1)

[Diagram: layer of hidden variables, layer of observations]
• a ∈ {0, 1}^|V|: subset of V selected for sampling
• Independent observations:

P(o | x, a) = ∏_{i∈V} Pi(oi | xi, ai)

Question: How to reconstruct the hidden map x using sampling actions?
1. Hidden map model
2. Updated model after sampling result
3. Hidden map reconstruction
4. Sampling action optimization

Hidden Markov random field (2)

• a ∈ {0, 1}^|V|: subset of V selected for sampling
• Independent observations:

P(o | x, a) = ∏_{i∈V} Pi(oi | xi, ai)

Updated Markov random field (Bayes' theorem):

P(x | o, a) = (1/Z) ∏_{i∈V} ψ′i(xi, oi, ai) ∏_{(i,j)∈E} ψij(xi, xj)

where ψ′i(xi, oi, ai) = ψi(xi) Pi(oi | xi, ai)
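A sketch of this update step on the same toy representation (an assumption, not the authors' code): observing oi at a sampled cell simply multiplies the unary potential by the likelihood Pi(oi | xi, ai), leaving the pairwise terms untouched. The detection table below is a hypothetical stand-in with no false positives.

```python
import numpy as np

# Hypothetical likelihoods P_i(o_i | x_i, a_i = 1), indexed [o_i][x_i].
theta = 0.8   # assumed detection probability on an occupied, sampled cell
lik_sampled = np.array([[1.0, 1.0 - theta],   # o_i = 0: P(0|x=0), P(0|x=1)
                        [0.0, theta]])        # o_i = 1: P(1|x=0), P(1|x=1)

def posterior_unary(unary_i, a_i, o_i):
    """psi'_i(x_i, o_i, a_i) = psi_i(x_i) * P_i(o_i | x_i, a_i)."""
    if a_i == 0:                  # unsampled cell: potential unchanged
        return unary_i
    return unary_i * lik_sampled[o_i]

print(posterior_unary(np.array([1.0, 0.5]), a_i=1, o_i=1))  # -> [0.  0.4]
```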

Hidden map reconstruction (1)

[Diagram: hidden variables, observations, reconstructed binary map]

Local (MPM): x*i = argmax_{xi} Pi(xi | o, a), ∀i ∈ V

Question: How to reconstruct the hidden map x using sampling actions?
1. Hidden map model
2. Updated model after sampling result
3. Hidden map reconstruction
4. Sampling action optimization

Hidden map reconstruction (2)

Local (MPM): x*i = argmax_{xi} Pi(xi | o, a), ∀i ∈ V

Value of the reconstructed map: expected number of well-classified sites in x*

V^MPM(o, a) = f( Σ_{i∈V} max_{xi} Pi(xi | o, a) )
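Given (approximate) posterior marginals, MPM reconstruction and its value are immediate per-cell maxima. A minimal sketch, taking f as the identity and assuming the marginals Pi(Xi = 1 | o, a) are supplied as an array:

```python
import numpy as np

def mpm_reconstruction(marg1):
    """marg1[i] = P_i(X_i = 1 | o, a). Returns (x_star, V_MPM)."""
    marg1 = np.asarray(marg1)
    x_star = (marg1 > 0.5).astype(int)            # per-cell mode = MPM map
    v_mpm = np.maximum(marg1, 1.0 - marg1).sum()  # expected # well-classified sites
    return x_star, v_mpm

x_star, v = mpm_reconstruction([0.9, 0.2, 0.55, 0.5])
print(x_star, v)   # [1 0 1 0] 2.75  (= 0.9 + 0.8 + 0.55 + 0.5)
```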

Sampling action optimization (1)

• a ∈ {0, 1}^|V| selected for sampling
• Independent observations o ∈ {0, 1}^|V|
⇒ How to optimize the choice of a?

Question: How to reconstruct the hidden map x using sampling actions?
1. Hidden map model
2. Updated model after sampling result
3. Hidden map reconstruction
4. Sampling action optimization

Sampling action optimization (2)

• a ⊆ V selected for sampling
• Independent observations o result
⇒ How to optimize the choice of a?

U(a) = −c(a) + Σ_o P(o | a) V(o, a)

a* = argmax_a U(a)

• The computation of a* is hard (NP-hard)!
• Only feasible for small problems; otherwise an approximation is needed!
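For a problem small enough to enumerate, U(a) and a* can be computed exactly. A brute-force sketch under stated assumptions: an explicit prior table standing in for the MRF, a hypothetical linear cost c(a), detection probability θ with no false positives, and the convention that unsampled cells report oi = 0.

```python
import itertools
import numpy as np

n = 3                       # toy number of cells
theta = 0.8                 # hypothetical detection probability
cost_per_cell = 0.3         # hypothetical per-cell sampling cost

# Explicit prior over all 2^n maps (stand-in for the MRF; here i.i.d. p = 0.4).
maps = list(itertools.product([0, 1], repeat=n))
prior = {x: np.prod([0.4 if xi else 0.6 for xi in x]) for x in maps}

def p_obs(o, x, a):
    """P(o | x, a): no false positives; unsampled cells report o_i = 0."""
    p = 1.0
    for oi, xi, ai in zip(o, x, a):
        if ai == 0:
            p *= (oi == 0)
        else:
            p *= theta if (oi == 1 and xi == 1) else \
                 (1 - theta) if (oi == 0 and xi == 1) else (oi == 0)
    return p

def utility(a):
    """U(a) = -c(a) + sum_o P(o|a) V_MPM(o, a)."""
    u = -cost_per_cell * sum(a)
    for o in maps:                                          # all observation vectors
        po = sum(prior[x] * p_obs(o, x, a) for x in maps)   # P(o | a)
        if po == 0.0:
            continue
        # posterior marginals P_i(X_i = 1 | o, a), then MPM value
        marg1 = np.array([sum(prior[x] * p_obs(o, x, a) for x in maps if x[i] == 1)
                          for i in range(n)]) / po
        u += po * np.maximum(marg1, 1 - marg1).sum()
    return u

best = max(itertools.product([0, 1], repeat=n), key=utility)
print("a* =", best, "U(a*) =", round(utility(best), 4))
```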

Approximate spatial sampling (1)

Approximate the computation of

a* = argmax_a −c(a) + Σ_o P(o | a) V^MPM(o, a)

• Explore cells where initial knowledge is the most uncertain: marginal Pi(xi | o, a) closest to 1/2

ã = argmax_a −c(a) + f( Σ_{i: ai=1} min{ Pi(Xi = 1), Pi(Xi = 0) } )

• Computing the marginals is itself NP-hard
⇒ approximation using the belief propagation (sum-product) algorithm
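Once marginals are available, the heuristic itself is cheap: score each cell by min{Pi(Xi = 1), Pi(Xi = 0)}, and with a linear cost and f taken as the identity (both assumptions here), a cell is worth sampling exactly when its uncertainty score exceeds its unit cost. A sketch:

```python
import numpy as np

def static_heuristic_sample(marg1, cell_cost, budget=None):
    """Select cells whose marginals are closest to 1/2.

    marg1[i] = current estimate of P_i(X_i = 1); cell_cost = cost of one cell.
    Returns indices to sample, most uncertain first.
    """
    marg1 = np.asarray(marg1)
    uncertainty = np.minimum(marg1, 1.0 - marg1)   # in [0, 1/2]
    order = np.argsort(-uncertainty)               # most uncertain first
    chosen = [i for i in order if uncertainty[i] > cell_cost]
    if budget is not None:
        chosen = chosen[:budget]
    return chosen

print(static_heuristic_sample([0.95, 0.45, 0.6, 0.05], cell_cost=0.1))
# -> [1, 2]   (the cells whose marginals are nearest 1/2)
```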

Approximate spatial sampling (2)

The approximation results from simplifying assumptions:
• Sampling actions are reliable
• No passive observations
• The joint probability is approximated by one with independent factors

Adaptive spatial sampling (1)

• Idea:
  - Sampling locations are not chosen once and for all before the sampling campaign
  - Intermediate observations are taken into account to design the next sampling step
  - Possibility to visit a cell more than once

Adaptive spatial sampling (2)

• A sampling strategy δ is a tree
• A trajectory in δ: τ = (a^1, o^1, ..., a^K, o^K)

Value of a leaf:

U(τ) = − Σ_{k=1}^{K} c(a^k) + V^MPM(o^0, o^1, ..., o^K, a^0, a^1, ..., a^K)

Value of a strategy:

V(δ) = Σ_τ U(τ) P(τ | δ)
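The value V(δ) of a fixed strategy can be evaluated exactly by expectimax recursion over the tree, branching on observations at each step; the cost is exponential in K and |V|, which is why the talk resorts to heuristics. A self-contained toy sketch (the policy, prior, and noise-free observation model below are illustrative assumptions, not the authors' setup):

```python
import itertools

def strategy_value(policy, prior, p_obs, n, K, cell_cost):
    """V(delta) = sum_tau P(tau|delta) U(tau), by recursion over the tree.

    policy(history) -> next action a (tuple of 0/1) given past (a, o) pairs.
    prior: dict map -> probability; p_obs(o, x, a) -> observation likelihood.
    """
    obs_space = list(itertools.product([0, 1], repeat=n))

    def recurse(belief, history, k):
        if k == K:      # leaf: V_MPM = sum of per-cell max posterior marginals
            return sum(max(sum(p for x, p in belief.items() if x[i] == v)
                           for v in (0, 1)) for i in range(n))
        a = policy(history)
        value = -cell_cost * sum(a)
        for o in obs_space:                     # branch on observation outcomes
            po = sum(p * p_obs(o, x, a) for x, p in belief.items())
            if po == 0.0:
                continue
            post = {x: p * p_obs(o, x, a) / po for x, p in belief.items()}
            value += po * recurse(post, history + [(a, o)], k + 1)
        return value

    return recurse(dict(prior), [], 0)

# Toy usage: 2 cells, uniform prior, perfect detection, one cell per step.
prior = {x: 0.25 for x in itertools.product([0, 1], repeat=2)}
p_obs = lambda o, x, a: all(oi == (xi if ai else 0)
                            for oi, xi, ai in zip(o, x, a))
policy = lambda h: (1, 0) if len(h) == 0 else (0, 1)
print(strategy_value(policy, prior, p_obs, n=2, K=2, cell_cost=0.1))  # 1.8
```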

Heuristic adaptive spatial sampling

• Exact computation is PSPACE-hard!
⇒ Heuristic algorithm
  - online computation
  - approximate method for static sampling at each step
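An online loop of this kind might look as follows; `marginals_fn` stands in for an approximate marginals routine on the updated HMRF (e.g., belief propagation) and `observe_fn` for field observations, both supplied by the caller. The independent-cell stub used in the demo is an assumption for runnability, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def adaptive_heuristic_sampling(marginals_fn, observe_fn, n_cells, n_steps,
                                batch_size, theta=0.8):
    """Online loop: re-plan each step from updated (approximate) marginals."""
    log_odds = np.zeros(n_cells)                 # accumulated unary evidence
    for _ in range(n_steps):
        marg1 = marginals_fn(log_odds)
        uncertainty = np.minimum(marg1, 1 - marg1)
        batch = np.argsort(-uncertainty)[:batch_size]   # most uncertain cells
        obs = observe_fn(batch)
        for i, oi in zip(batch, obs):            # Bayes update of unary terms
            if oi == 1:                          # detection: cell occupied
                log_odds[i] = np.inf
            else:                                # miss: P(o=0 | x=1) = 1 - theta
                log_odds[i] += np.log(1 - theta)
        # a cell may be revisited later if its marginal is still near 1/2
    return marginals_fn(log_odds)

# Toy run: independent-cell "marginals" stub and a fixed hidden map.
hidden = rng.random(25) < 0.3
marginals_fn = lambda lo: 1 / (1 + np.exp(-(lo + np.log(0.3 / 0.7))))
observe_fn = lambda cells: [(hidden[i] and rng.random() < 0.8) for i in cells]
print(np.round(adaptive_heuristic_sampling(marginals_fn, observe_fn, 25, 5, 5), 2))
```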

Concluding remarks

• A framework for spatial sampling optimization:
  - based on Hidden Markov random fields
  - different map quality criteria
  - extended to "adaptive" sampling
• Problems too complex for exact resolution
⇒ Heuristic solution based on approximate computation of the marginals
• Empirical validation on simulated problems:
  - Comparison of SHS, AHS and classical sampling methods (random sampling, ACS)
  - Markov random field parameters learned from real data
  - AHS > SHS > classical methods

Ongoing work

• Exact algorithms for small problems (Usman Farrokh): combining variable elimination and tree search
• "Random sets + kriging" approach (Mathieu Bonneau): development of a dedicated approximate method and comparison to the HMRF approach
• PhD thesis on adaptive spatial sampling for weed mapping at the scale of an agricultural area (Sabrina Gaba, INRA-Dijon)
• Future? ⇒ Spatial partially observed Markov decision processes

Questions?

Thanks for listening


Contents

1. Optimal sampling of a hidden random variable
2. Defining optimal spatial sampling problems
3. Approximate computation of an optimal strategy
4. Evaluation of the proposed method on simulated data

Optimal sampling problem: hidden variable model

[Diagram: X → Y, with sampling action a]

Prior model P(x)

Question: How to reconstruct the hidden variable X using sampling actions?
1. Hidden variable model
2. Updated model after sampling result
3. Hidden variable reconstruction
4. Sampling action optimization

Optimal sampling problem: updated model

Posterior: P(x | o, a) = P(o | x, a) P(x) / P(o | a)

Question: How to reconstruct the hidden variable X using sampling actions?
1. Hidden variable model
2. Updated model after sampling result
3. Hidden variable reconstruction
4. Sampling action optimization

Optimal sampling problem: hidden variable reconstruction

x*(o, a) = argmax_x P(x | o, a)

V(o, a) = f( P(x* | o, a) )

Question: How to reconstruct the hidden variable X using sampling actions?
1. Hidden variable model
2. Updated model after sampling result
3. Hidden variable reconstruction
4. Sampling action optimization

Optimal sampling problem: hidden variable reconstruction

x*(o, a) = argmax_x P(x | o, a)

V(o, a) = f( P(x* | o, a) )

Question: How to reconstruct the hidden variable X using sampling actions?
• x*(o, a) is the best reconstruction given the sampling result (o, a)
• V(o, a) is the value of the reconstructed variable after the sampling result (o, a)

Optimal sampling problem: sampling action optimization

U(a) = −c(a) + Σ_o P(o | a) V(o, a)

a* = argmax_a U(a)

Question: How to reconstruct the hidden variable X using sampling actions?
1. Hidden variable model
2. Updated model after sampling result
3. Hidden variable reconstruction
4. Sampling action optimization

Optimal sampling problem: sampling action optimization

U(a) = −c(a) + Σ_o P(o | a) V(o, a)

a* = argmax_a U(a)

Question: How to reconstruct the hidden variable X using sampling actions?
The value of an action is a tradeoff between
• the cost c(a) of the action, and
• the expected quality of the reconstructed variable (over all possible sampling results)


HMRF model for fire ants problem (1)

[Figure: three 100 × 100 grids — eradicated cells, year 2001 (eradication e); searched cells, year 2002 (search actions a); observations, year 2002 (observations o)]

• eradication (in the previous year): ei ∈ {0, 1}, i = 1, ..., n
• search actions: passive search or active search, ai ∈ {0, 1}, i = 1, ..., n
• observations: no nest detected / at least one nest detected, oi ∈ {0, 1}, i = 1, ..., n

HMRF model for fire ants problem (2)

• Distribution on maps = Potts model

Pe(x | α, β) = (1/Z) exp( Σ_{i∈V} α_{ei} eq(xi, 1) + β Σ_{(i,j)∈E} eq(xi, xj) )

• Distribution of the observation given the map, Pai(oi | xi, θ):

  oi \ xi |  0  |    1
  --------+-----+----------
     0    |  1  | 1 − θ_{ai}
     1    |  0  |   θ_{ai}

with θ0 < θ1
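A hedged sketch of how such a map and its observations could be simulated: a single-site Gibbs sampler for the Potts prior above, plus the no-false-positive detection model with θ = (θ0, θ1). The parameter values echo configuration 8 of the experiments (α = (1, −1), β = 0.4, θ = (0, 0.8)); the grid size and sweep count are arbitrary choices, not the authors'.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_potts(alpha, beta, e, shape=(20, 20), sweeps=50):
    """Draw an occurrence map x ~ P_e(x | alpha, beta) by Gibbs sampling.

    alpha = (alpha_0, alpha_1): field depending on eradication status e_i;
    beta: spatial coupling; e: 0/1 eradication map of the same shape.
    """
    H, W = shape
    x = rng.integers(0, 2, size=shape)
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                nb = [x[i2, j2]
                      for i2, j2 in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                      if 0 <= i2 < H and 0 <= j2 < W]
                u1 = alpha[e[i, j]] + beta * sum(v == 1 for v in nb)  # x_ij = 1
                u0 = beta * sum(v == 0 for v in nb)                   # x_ij = 0
                x[i, j] = rng.random() < 1.0 / (1.0 + np.exp(u0 - u1))
    return x

def simulate_obs(x, a, theta=(0.0, 0.8)):
    """o_i = 1 with prob theta[a_i] if x_i = 1; never if x_i = 0."""
    return (x == 1) & (rng.random(x.shape) < np.asarray(theta)[a])

e = np.zeros((20, 20), dtype=int)                  # no prior eradication
x = gibbs_potts(alpha=(1.0, -1.0), beta=0.4, e=e)
a = (rng.random(x.shape) < 0.1).astype(int)        # ~10% actively searched
o = simulate_obs(x, a)
print("occupied fraction:", x.mean(), " detections:", o.sum())
```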

HMRF model for fire ants problem (3)

An initial arbitrary sampling (a^0, o^0) is used for:
• Parameter estimation: λ = (α, β, θ)
  - approximate version of EM for HMRF (simulated field EM)
  - identifiability problem between α and θ
  - OK if θ is known: use of expert values
• Marginals computation: Pi(xi | o^0_i, a^0_i)

[Figure: 50 × 50 map of the initial marginals]

Heuristic sampling methods evaluation (1)

• Evaluation on simulated data
• Comparison of the behavior of
  - random sampling (RS)
  - adaptive cluster sampling (ACS)
  - static heuristic sampling (SHS)
  - adaptive heuristic sampling (AHS)

Heuristic sampling methods evaluation (2)

• Procedure: repeat 10 times
  - simulate a hidden map x from P(x | α, β) (50 × 50 cells)
  - apply regular sampling (about 10% of the area): a^0
  - simulate o^0 from Pai(oi | xi, θ) (regular sampling plus passive search)
  - estimate initial knowledge
  - apply RS, ACS, SHS, AHS, 10 times
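A compressed version of this evaluation loop, with an independent-cell posterior standing in for the full HMRF machinery and two stand-in samplers (uniform random vs. most-uncertain-first). This is an illustrative harness under those assumptions, not the experimental code.

```python
import numpy as np

rng = np.random.default_rng(1)

def misclassification_rate(marg1, x):
    """Rate of cells where the MPM map disagrees with the hidden map x."""
    return np.mean((marg1 > 0.5).astype(int) != x)

def run_trial(sampler, x, theta=0.8, steps=10, batch=25):
    """One run: apply `sampler` repeatedly, track the error rate.

    sampler(marg1, budget) -> cell indices; stands in for RS / ACS / SHS / AHS.
    """
    n = x.size
    marg1 = np.full(n, x.mean())        # "estimated initial knowledge" stub
    errors = []
    for _ in range(steps):
        for i in sampler(marg1, batch):
            if x.flat[i] and rng.random() < theta:
                marg1[i] = 1.0          # detection: no false positives
            else:
                # miss: posterior P(x=1 | o=0) by Bayes, miss prob 1 - theta
                p = marg1[i]
                marg1[i] = p * (1 - theta) / (p * (1 - theta) + (1 - p))
        errors.append(misclassification_rate(marg1, x.ravel()))
    return errors

random_sampler = lambda m, b: rng.choice(m.size, size=b, replace=False)
adaptive_sampler = lambda m, b: np.argsort(-np.minimum(m, 1 - m))[:b]
x = (rng.random((50, 50)) < 0.3).astype(int)       # stand-in hidden map
print(run_trial(random_sampler, x)[-1], run_trial(adaptive_sampler, x)[-1])
```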

Rate of misclassified cells

[Figure: proportion of misclassified cells vs. number of sampled cells (0–2500), for SHS (Static), AHS (Adaptive), ACS (Cluster) and RS (Random). Configurations: 2 with α = (0, −2), β = 0.8; 6 with α = (0, 0), β = 0.5; 8 with α = (1, −1), β = 0.4; θ = (0, 0.8) throughout]

Per color error rate

[Figure: proportions of misclassified empty cells (top row) and misclassified occupied cells (bottom row) vs. number of sampled cells, for SHS (Static), AHS (Adaptive), ACS (Cluster) and RS (Random), in configurations 2 (α = (0, −2), β = 0.8), 6 (α = (0, 0), β = 0.5) and 8 (α = (1, −1), β = 0.4)]

General behavior

• ACS is not well adapted (as expected): poor results
• Adaptive HS ≥ Static HS ≥ Random S
• The discrepancy between Adaptive HS and Static HS increases with
  - sampling resources
  - hidden map structure

Where do we sample?

[Figure: simulated 50 × 50 hidden map; α = (1, −1), β = 0.4, θ = (0, 0.8)]

Where do we sample? Static sampling: A and O

[Figure: sequence of 50 × 50 panels showing sampled cells and observations at each static sampling step]

Where do we sample? Static sampling: marginals

[Figure: sequence of 50 × 50 panels showing the evolution of the marginals under static sampling]

Where do we sample? Adaptive sampling: A and O (cumulated)

[Figure: sequence of 50 × 50 panels showing cumulated sampled cells and observations at each adaptive sampling step]

Where do we sample? Adaptive sampling: marginals

[Figure: sequence of 50 × 50 panels showing the evolution of the marginals under adaptive sampling]

Where do we sample?

• No sampling in large empty areas
• Sampling preferably near detected occupied sites within low-density areas
• If sampling resources increase
  - SHS completes its exploration until the whole area is covered
  - AHS can visit a site several times before extending exploration to another area