Multigrid methods for pde optimization

10 downloads 899 Views 2MB Size Report
Page 1 ... Multigrid methods and optimization with partial differential equations are two ... A criterion that models the purpose of the optimization and its cost.
Multigrid methods for pde optimization Alfio Borz`ı Universit`a degli Studi del Sannio Dipartimento e Facolt`a di Ingegneria, Benevento, Italia

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

1 / 64

Introduction Multigrid methods and optimization with partial differential equations are two modern fields of research in applied mathematics and engineering, both starting in the seventies with the works of A. Brandt and W. Hackbusch, and J.L. Lions. PDE-based optimization problems arise in application to medicine biology fluid dynamics mechanics chemical processes ...inverse problems, control problems, design problems are large-sized problems many have nonlinear nondifferentiable structure pose new theoretical & scientific computing challenges Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

2 / 64

Physiology: Control of cardiac arrhythmia 1 || y1 || || y2 ||

0.9 20 0.8

0.7

40

0.6 60 0.5

0.4

80

0.3 100 0.2

0.1 120

20

40

60

80

100

0

120

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

∂y1 ∂t

= −ky1 (y1 − a)(y1 − 1) − y1 y2 + σ ∆y1 + u

∂y2 ∂t

=

[0 +

µ1 y2 ][−y2 − ky1 (y1 − b − 1)] µ2 + y1

y1 - transmembrane potential and y2 - conduttance Austrian Science Fund SFB MOBIS ”Fast Multigrid Methods for Inverse Problems“ Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

3 / 64

Quantum computing: Control of Bose Einstein condensates

0.7

0.7

0.6

0.6

1

0.9

0.8 0.5

0.5

0.4

0.4

0.3

0.3

0.2

0.2

0.1

0.1

0.7

0.6

0.5

0.4

0.3

0.2

0.1

0 −10

−8

−6

−4

−2

0

2

4

6

8

10

0 −10

−8

−6

−4

−2

0

2

4

6

8

10

0

0

1

2

3

4

5

6

7

The BEC is described by the Gross-Pitaevskii equation   ˙ t) = − 1 ∇2 + V (x, u(t)) + g |ψ(x, t)|2 ψ(x, t) i ψ(x, 2 Austrian Science Fund FWF Project ”Quantum optimal control of semiconductor nanostructures“ Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

4 / 64

CFD Shape design optimization: From pipes to engines

Navier-Stokes equations, RSM turbulence models, multi-phase, spray and combustion

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

5 / 64

Outline of the talk

A.B. and V. Schulz, Multigrid methods for PDE optimization, SIAM Review. Optimization with PDE constraints SQP multigrid Schur-complement-smoothing multigrid Collective-smoothing multigrid Multigrid for optimization Convergence theory

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

6 / 64

Optimization with PDE constraints

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

7 / 64

The formulation of PDE-based optimization problems A model of the governing system A description of the optimization mechanism A criterion that models the purpose of the optimization and its cost We have an infinite-dimensional constrained minimization problem

  minimize

J(y , u)

 under the constraint

c(y , u) = 0

L(y , u, p) = J(y , u)+(c(y , u), p)

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

8 / 64

Characterization of the optimizing solution

min J(y , u)

:= h(y ) + ν g (u)

s.t.

=

u∈Uad

c(y , u)

J :Y ×U →R

0

The existence of cy−1 enables a distinction between y , the state variable, and u ∈ Uad ⊂ U, the optimization variable in the admissible set. So we have the mapping u 7→ J(y (u), u) in the form IFT

ˆ u 7→ y (u) 7→ J(y (u), u) =: J(u)

reduced objective

ˆ The solution of minu∈Uad J(u) is characterized by the following optimality system c(y , u) = 0 cy (y , u)∗ p = −h0 (y ) (ν g 0 (u) + cu∗ p , v − u) ≥ 0 for all v ∈ Uad ˆ We have ∇J(u) = ν g 0 (u) + cu∗ p(u), the reduced gradient. Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

9 / 64

Elliptic optimization problems Consider the simplest objective J(y , u) =

1 ν ky − zk2L2 + kuk2L2 2 2

where z ∈ L2 is a target function. Linear elliptic optimal control problem c(y , u) = −∆y − u − f Bilinear elliptic parameter identification problem c(y , u) = −∆y − u y − f we assume that c(y , u) = 0 provides the mapping u 7→ y (u). A. B. and M. Vallejos, Multigrid optimization methods for linear and bilinear elliptic optimal control problems, Computing, 82 (2008), pp. 31-52.

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

10 / 64

Elliptic optimality systems Linear

−∆y − u −∆p + y νu − p y = 0 and p

= f = z = 0 = 0

in in in on

Ω Ω Ω ∂Ω

in in in on

Ω Ω Ω ∂Ω

ˆ ∇J(u) = νu − p ˆ ∇2 J(u) = νI + ∆−2 Bilinear

−∆y − uy −∆p + y − up νu − yp y = 0 and p

= f = z = 0 = 0

ˆ ∇J(u) = νu − yp ˆ ∇2 J(u) = νI + y (∆ + u)−2 y + p(∆ + u)−1 y + y (∆ + u)−1 p

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

11 / 64

Quantum control problems

Control of Bose Einstein condensates in magnetic microtraps min J(ψ, u) := u∈U

i ∂ψ ∂t (x, t)

=

1 2



2  1 − hψd |ψ(T )i +

γ 2

RT 0

2 (u(t)) ˙ dt

 − 12 ∇2 + V (x, u(t)) + g |ψ(x, t)|2 ψ(x, t)

A. B. and U. Hohenester, Multigrid optimization schemes for solving Bose–Einstein condensates control problems, SIAM J. Sci. Comp., 30 (2008), pp. 441-462.

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

12 / 64

Optimality system for BEC control

The optimal solution is characterized by the optimality system   ∂ψ 1 2 2 i = − ∇ + Vu + g |ψ| ψ ∂t 2   1 2 ∂p 2 = − ∇ + Vu + 2g |ψ| p + g ψ 2 p ∗ i ∂t 2 ∂Vu γ¨ u = − + cu∗ (y` , u` )p` and build the reduced gradient ∇J(u) (2) build some approximation B` ≈ H(y` , u` , p` ) of the reduced Hessian, e.g., by Quasi-Newton update formulae (3) solve ∆u = arg

min 1 u > B` u u∈LU(u` ) 2

ˆ > u, where LU(u` ) denotes the + ∇J(u) `

linearization of U in u` (4) solve the linear problem cy (y` , u` )∆y = −(cu (y` , u` )∆u + c(y` , u` )) (5) update (y`+1 , u`+1 ) = (y` , u` ) + τ · (∆y , ∆u), where τ is some line-search updating factor in the early iterations. A multigrid scheme is applied in steps (1) and (4). Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

17 / 64

Consistency condition for multigrid SQP schemes SQP convergence theory requires that the reduced gradient can be interpreted as a derivative, i.e. we need the consistency condition ˆ > ∇J(u) ` =

∂ J(y` − Acu (y` , u` )u, u` + u) ∂u

where A ≈ cy (y` , u` )−1 is the approximation to cy (y` , u` )−1 defined by the multigrid algorithm for the forward problem. This fact results in the following requirements for the construction of the multigrid components  ∗  ∗ ∗ k−1 = F I kk−1 , A S = (F S) , A I kk−1 = F I k−1 AI k k where

A

and

F

label the operators for the adjoint and forward problems.

V. Schulz, Solving discretized optimization problems by partially reduced SQP methods, Comput. Vis. Sci. 1 (1998), pp. 83–96.

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

18 / 64

Schur-complement multigrid approach

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

19 / 64

The KKT-matrix Many multigrid optimization approaches use a smoothing concept based on a Schur-complement splitting of the KKT-matrix. Let w = (∆y , ∆u, ∆p) be the solution to the equation   −∇y L(y , u, p) Aw =  −∇u L(y , u, p)  =: f . (3) −c(y , u) where L(y , u, p) is the Lagrangian of the matrix  Lyy A =  Luy cy

optimization problem and the operator  Lyu cy∗ (4) Luu cu∗  cu 0

is the Karush-Kuhn-Tucker matrix, i.e. matrix of second order derivatives of the Lagrangian of the optimization problem. All variants of SQP methods consider approximations of the matrix A above.

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

20 / 64

Schur-complement-based smoothing

Use the iterative scheme wl = wl−1 + R(f − A wl−1 ) For a linear elliptic optimal control problem  I 0 νI A= 0 −∆ −I

 −∆ −I  0

where we can use a multigrid Poisson-solver for inverting −∆. This is the starting point of the early multigrid optimization methods. W. Hackbusch, Fast solution of elliptic control problems, J. Optim. Theory and Appl., 31(1980), pp. 565-581.

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

21 / 64

Schur decomposition and Schur complement Consider

 A=

A B

B> D



with symmetric blocks A and D, and A invertible, is an explicit reformulation of a block-Gauss-decomposition, i.e.     A 0 I −A−1 B > A = B S 0 I where S = D − BA−1 B > is the Schur-complement. Iterative Schur-complement solvers are based on the scheme l

w =w

l−1

 +

I 0

˜ −1 B > −A I



˜ A B

0 S˜

−1

(f − A wl−1 ).

(5)

˜ and S˜ are approximations to A and S. where A

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

22 / 64

Choice of the blocks, different strategies First tentative choice (range space factorization)   Lyy Lyu A= Luy Luu The A-block may not be invertible. This arrangement is not well suited for PDE constrained optimization problems, unlike to variational problems like Stokes or Navier-Stokes [Braess-Sarazin ’97, Wittum ’88] Interchanging the 2nd and 3rd row and column in the matrix A and identifying     Lyy cy∗ A= , B = Luy cu∗ , D = Luu cy 0 leads to a nullspace decomposition.

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

23 / 64

Choice of the blocks, different strategies With nullspace decomposition the Schur-complement reads S = Luu − Luy cy−1 cu − cu∗ cy∗ −1 Lyu + cu∗ cy∗ −1 Lyy cy−1 cu that is the reduced Hessian that characterizes the optimization problem. Coercivity of the reduced Hessian guarantees well-posedness of the overall optimization problem. In the case of a linear control problem one obtains S = ν I − 0 ∆−1 I − I ∆−1 0 + I ∆−1 I ∆−1 I = ν I + (∆−1 )2 With the choice ˜= A



Lyy c˜y

c˜y∗ 0

 and

S˜ = ν I

one obtains a transforming smoothing iteration. Th. Dreyer, B. Maar, V. Schulz, Multigrid optimization in applications, J. Comput. Appl. Math., 120 (2000), pp. 67–84. Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

24 / 64

Collective smoothing multigrid approach

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

25 / 64

CSMG scheme

A collective smoothing multigrid (CSMG) approach means solving the optimality system for the state, the adjoint, and the control variables simultaneously in the multigrid process by using collective smoothers for the optimizations variables. A CSMG based scheme aims at realizing the tight coupling in the optimality system along the hierarchy of grids in space and time. The smoothing process: The basic idea is to solve the optimality system at grid or block-grid point level, within the admissible set and consistently with the differential structure of the problem. The CSMG multigrid schemes result robust with respect to opimization parameters and provide mesh-independent convergence.

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

26 / 64

CSMG smoothing for a linear elliptic control problem −∆y − u = f −∆p + y = z (νu − p, v − u) ≥ 0

for all v ∈ Uad

where Uad := {u ∈ L2 (Ω) | uL (x) ≤ u(x) ≤ uH (x) a.e. in Ω} Let Then

A B

= −yi−1,j − yi+1,j − yi,j−1 − yi,j+1 − h2 fi,j , = −pi−1,j − pi+1,j − pi,j−1 − pi,j+1 − h2 zi,j . A + 4yi,j − h2 ui,j B + 4pi,j + h2 yi,j νui,j − pi,j ⇒ ⇒

= = =

0, 0, 0.

yi,j (ui,j ) = pi,j (ui,j ) =

1 2 4 (h ui,j − A) 1 2 2 16 (−h ui,j + h A

− 4B)

ˆ Setting ∇J(u) = νu − p(u) = 0, we obtain the auxiliary variable ei,j = u

1 (h2 A − 4B). 16ν + h4

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

27 / 64

Collective Smoothing: Projection The new value for ui,j resulting from the smoothing step is given by  ei,j ≥ uH i,j  uH i,j if u ei,j if uLi,j < u ei,j < uH i,j . u ui,j =  ei,j ≤ uLi,j uLi,j if u This collective Gauss-Seidel step satisfies the inequality constraint. Example: u˜ > uH ; then u = uH . Therefore (v − u) ≤ 0 for any v ∈ Uad , and νu − p

=

νu − (4 B − h2 A − h4 uij )/16

=

[(16ν + h4 )u − (4 B − h2 A)]/16


0.Then λ kv k3 + (∇2 Jˆk (u)v , v )k k

1 Jˆk (u + α(u, v )v ) ≤ Jˆk (u) + α(u, v )(fk , v ) + α(u, v )(∇Jˆk (u) − fk , v )k . 2

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

60 / 64

The coarse-grid correction provides a descending direction The following lemma states that the coarse-to-fine minimization step with step-length α is a minimizing step. Lemma 3: Let u ∈ Vk and define u˜ = Ik−1 u. Let v˜ ∈ Vk−1 and define k v = Ikk−1 (˜ v − u˜). Assume Jˆk−1 (˜ v ) − (fk−1 , v˜ )k−1 ≤ Jˆk−1 (˜ u ) − (fk−1 , u˜)k−1 , where   (fk−1 , v˜ )k−1 = Ik−1 fk + ∇Jˆk−1 (˜ v ) − Ik−1 ∇Jˆk (v ), v˜ k k

, and k−1

  k−1 ˆ ˆk−1 (˜ (fk−1 , u˜)k−1 = Ik−1 f + ∇ J u ) − I ∇ J (u), u ˜ k k k k

. k−1

Then 1 Jˆk (u + α(u, v )v ) − Jˆk (u) − α(u, v )(fk , v )k ≤ α(u, v )(∇Jˆk (u) − fk , v )k . 2

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

61 / 64

Convergence of MGOPT

Theorem 4: Let Jˆk (uk ) − (fk , uk )k be twice Fr´echet differentiable and let ∇2 Jˆk be locally Lipschitz continuous and satisfies the Hessian conditions in a neighborhood Vk of uk∗ = argminuk (Jˆk (uk ) − (fk , uk )k ). Then the MGOPT method provides a minimizing iteration.

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

62 / 64

References on CSMG and MGOPT convergence theory

A.B., K. Kunisch, and Do Y. Kwak, Accuracy and Convergence Properties of the Finite Difference Multigrid Solution of an Optimal Control Optimality System, SIAM J. Control Opt., 41(5) (2003), 1477-1497. A.B. and M. Vallejos, Multigrid optimization methods for linear and bilinear elliptic optimal control problems, Computing, 82 (2008), pp. 31-52. W. Hackbusch and A. Reusken, Analysis of a damped nonlinear multilevel method, Numer. Math., 55 (1989), pp. 225–246. S.G. Nash, A multigrid approach to discretized optimization problems, Optimization Methods and Software, 14 (2000), pp. 99–116. J. Schoeberl and W. Zulehner, Symmetric Indefinite Preconditioners for Saddle Point Problems with Applications to PDE-Constrained Optimization Problems, SIAM J. Matrix Analysis and Applications, 29(2007), pp. 752-773.

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

63 / 64

Best regards and many thanks for your interest!!

See you in Benevento

Alfio Borz`ı (Universit` a degli Studi del Sannio, Multigrid Italy) methods for pde optimization

64 / 64