Chapter 5
NONCONVEX SEMI-LINEAR PROBLEMS AND CANONICAL DUALITY SOLUTIONS
David Yang Gao
Department of Mathematics, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061, USA

Abstract

This paper presents a brief review and some new developments on the canonical duality theory with applications to a class of variational problems in nonconvex mechanics and global optimization. These nonconvex problems are directly related to a large class of semi-linear partial differential equations in mathematical physics including phase transitions, post-buckling of a large deformed beam model, chaotic dynamics, nonlinear field theory, and superconductivity. Numerical discretizations of these equations lead to a class of very difficult global minimization problems in finite dimensional space. It is shown that by the use of the canonical dual transformation, these nonconvex constrained primal problems can be converted into certain very simple canonical dual problems. The criticality condition leads to dual algebraic equations which can be solved completely. Therefore, a complete set of solutions to these very difficult primal problems can be obtained. The extremality of these solutions is controlled by the so-called triality theory. Several examples are illustrated, including nonconvex constrained quadratic programming. Results show that these very difficult primal problems can be converted into certain simple canonical (either convex or concave) dual problems, which can be solved completely. Also, some very interesting new phenomena, i.e. trio-chaos and meta-chaos, are discovered in post-buckling of nonconvex systems. The author believes that these important phenomena exist in many nonconvex dynamical systems and deserve detailed study.

Keywords: duality, triality, global optimization, nonconvex variations, canonical dual transformation, nonconvex mechanics, critical point theory, semilinear equations, NP-hard problems, quadratic programming.


ADVANCES IN MECHANICS AND MATHEMATICS II, 2003

1. Nonconvex Problems and New Phenomena

Nonconvex phenomena arise naturally in real-life systems. Many problems in modern mechanics, the sciences, and economics require nonconvexity to be taken into account for accurate mathematical modeling. In engineering mechanics, a multidisciplinary research area, the so-called nonconvex mechanics, has been developed recently. This new research field involves a powerful combination of theoretical analysis in mathematical modelling of natural systems, finite deformation theory, material science, nonlinear partial differential equations, global optimization, variational methods, dynamical systems, numerical algorithms and scientific computations (see Gao, Ogden and Stavroulakis, 2002). The primary goal of this paper is to present a brief review of, and recent developments in, the canonical dual transformation method, together with the associated theory, for solving the following nonconvex variational problem (in short, the primal problem $(\mathcal{P})$):

$$(\mathcal{P}): \quad \min P(u) = \frac{1}{2}\langle u, Au\rangle + W(u) - \langle u, f\rangle \quad \forall u \in \mathcal{U}_k, \tag{1.1}$$

where the feasible space $\mathcal{U}_k$ is a convex subset of a normed space $\mathcal{U}$ with non-empty interior; $A : \mathcal{U} \to \mathcal{U}^*$ is a linear, self-adjoint operator ($A = A^*$), which maps each $u \in \mathcal{U}$ into the dual space $\mathcal{U}^*$; the bilinear form $\langle u, u^*\rangle : \mathcal{U} \times \mathcal{U}^* \to \mathbb{R}$ puts $\mathcal{U}$ and $\mathcal{U}^*$ in duality; $W : \mathcal{U} \to \mathbb{R}$ is a given (not necessarily convex) function; $f \in \mathcal{U}^*$ is a given input; $P : \mathcal{U}_k \to \mathbb{R}$ represents the total cost (action) of the system. We are interested mainly in finding all critical points of the nonconvex function $P(u)$, although the global minimizers will be discussed in constrained optimization problems. Thus, in the case that the nonconvex function $W : \mathcal{U}_k \to \mathbb{R}$ is Gâteaux differentiable, the stationary (or criticality) condition $DP(u) = 0$ leads to the governing equation

$$Au + DW(u) = f, \tag{1.2}$$

where $DW(u)$ represents the Gâteaux derivative of $W$ at $u$, which is a mapping from $\mathcal{U}$ into its dual space $\mathcal{U}^*$. The abstract form (1.2) of the primal problem $(\mathcal{P})$ covers many situations.

1.1 Semi-linear equations and double-well potential

In nonconvex mechanics and variational problems, where $\mathcal{U}$ is an infinite dimensional function space, the state variable $u \in \mathcal{U}$ is a field function, and $A : \mathcal{U} \to \mathcal{U}^*$ is usually a partial differential operator over a given space domain $\Omega \subset \mathbb{R}^n$. In this case, the governing equation (1.2) is a so-called semi-linear equation, which plays an important role in materials science and physics, including ferroelectricity, liquid crystals, ferromagnetism, ferroelasticity, and superconductivity. For example, in Landau's theory of superconductivity,

$$W(u) = \int_\Omega \frac{1}{2}\alpha\left(\frac{1}{2}|u|^2 - \lambda\right)^2 d\Omega \tag{1.3}$$

is the so-called double-well potential, in which $\alpha, \lambda > 0$ are material constants, and $|u|$ denotes the Euclidean norm of $u$. If $A = \Delta$ is simply a Laplacian operator, then the governing equation (1.2) leads to the well-known Landau-Ginzburg equation

$$\Delta u + \alpha u\left(\frac{1}{2}|u|^2 - \lambda\right) = f.$$

If $A = \Delta + \mathrm{curl}\,\mathrm{curl}$, then (1.2) is the Cahn-Hilliard equation in liquid crystal

$$\Delta u + \mathrm{curl}\,\mathrm{curl}\,u + \alpha u\left(\frac{1}{2}|u|^2 - \lambda\right) = f.$$

The double-well potential was first studied by van der Waals in fluid mechanics in 1895. In phase transitions of shape memory alloys, each local minimizer of $W$ corresponds to a certain phase state of the material, whereas each local maximizer characterizes the critical conditions that lead to the phase transitions. In unilateral post-bifurcation analysis of beam contact problems, the solution of the post-buckling state is usually a local minimizer (see Gao, 2000a). Due to the nonconvexity of the double-well function $W(u)$, the semi-linear equation (1.2) has proven difficult to solve. Traditional direct analysis and related numerical methods for solving this nonconvex variational problem have proven unsuccessful to date.

In dynamical systems, if $A = \partial_{,tt} - \Delta$ is a wave operator over a given space-time domain $\Omega \subset \mathbb{R}^n \times \mathbb{R}$, and the nonconvex functional is simply given as $W(u) = \int_\Omega \cos u \, d\Omega$, then (1.2) is the well-known sine-Gordon equation

$$u_{,tt} - \Delta u = \sin(u) + f.$$

This equation appears in many branches of physics. It provides one of the simplest models of the unified field theory. It can also be found in the theory of dislocations in metals, in the theory of Josephson junctions, as well as in interpreting certain biological processes like DNA dynamics.

In the one-dimensional case, where $\Omega \subset \mathbb{R}$ is a time domain and $A = d^2/dt^2$, the third order Taylor expansion of $\sin(u)$ leads to the well-known Duffing equation

$$u_{,tt} + \alpha u\left(\frac{1}{2}u^2 - \lambda\right) = f. \tag{1.4}$$
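As a minimal sketch of how a double-well term produces several equilibria, consider the static, scalar analogue of (1.4) with a constant force, $\alpha u(\frac{1}{2}u^2 - \lambda) = f$. Its solutions are the critical points of the scalar energy $\frac{\alpha}{2}(\frac{1}{2}u^2 - \lambda)^2 - fu$. The function names below, and the threshold $\alpha(2\lambda/3)^{3/2}$ returned by `critical_force` (obtained here from the discriminant of the cubic), are assumptions of this sketch and are not taken from the chapter.

```python
import math


def critical_force(alpha, lam):
    """Threshold on |f| below which the cubic has three real roots (from its discriminant)."""
    return alpha * (2.0 * lam / 3.0) ** 1.5


def critical_points(alpha, lam, f):
    """Real solutions of alpha*u*(u**2/2 - lam) = f for alpha, lam > 0, sorted ascending."""
    r = math.sqrt(2.0 * lam / 3.0)
    c = f / critical_force(alpha, lam)
    if abs(c) < 1.0:
        # Three real roots: trigonometric solution of the depressed cubic.
        theta = math.acos(c) / 3.0
        roots = [2.0 * r * math.cos(theta - 2.0 * math.pi * k / 3.0) for k in range(3)]
    else:
        # One simple real root: hyperbolic form (the double root at |c| == 1 is not listed).
        sign = 1.0 if c > 0 else -1.0
        roots = [2.0 * sign * r * math.cosh(math.acosh(abs(c)) / 3.0)]
    return sorted(roots)


def classify(alpha, lam, u):
    """Second-derivative test for the energy alpha/2*(u**2/2 - lam)**2 - f*u."""
    curvature = alpha * (1.5 * u * u - lam)
    if curvature > 0:
        return "minimizer"
    if curvature < 0:
        return "maximizer"
    return "degenerate"


if __name__ == "__main__":
    alpha, lam = 1.0, 1.0  # hypothetical material constants
    for f in (0.0, 0.3, 1.0):
        pts = critical_points(alpha, lam, f)
        print(f, [(round(u, 4), classify(alpha, lam, u)) for u in pts])
```

For small $|f|$ the sketch returns two local minimizers separated by a local maximizer; once $|f|$ exceeds the threshold only one critical point remains.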


Even for the very simple one-dimensional Duffing equation (1.4), an analytic solution is still very difficult to obtain. It is known that this equation is extremely sensitive to the parameters $\lambda, \alpha > 0$, the input $f(t)$ and the initial conditions given in $\mathcal{U}_k$. Mathematically speaking, the so-called chaotic phenomena in nonlinear dynamics are mainly due to the nonconvexity of the total energy $P(u)$. Very small perturbations of the system's initial conditions and parameters may lead the system to different local minimizers with significantly different performance characteristics. For the double-well potential (1.3), the total energy associated with the semi-linear equation (1.2) is

$$P(u) = \int_\Omega \left[\frac{\alpha}{2}\left(\frac{1}{2}|u|^2 - \lambda\right)^2 - u(f - Au)\right] d\Omega. \tag{1.5}$$

To see the influence of the driving force $f(t)$ and the reaction force $Au$ on the critical points of the nonconvex energy $P(u)$, we let $J(u) = \frac{\alpha}{2}\left(\frac{1}{2}|u|^2 - \lambda\right)^2 - u f_u$ be the so-called energy density, where $f_u = f - Au$. The graph of the energy density $J(u)$ is shown in Fig. 5.1, where $f_c$ is a certain critical force measure. Increasing $f_u$ results in changes in the relative depths of the two minimizers of $J(u)$ and in the height of the local maximizer. The value of $u$ at which the minimizer(s) occur shifts slightly with $f_u$. For $f_u > f_c$ the potential energy surface has a single potential well, which is a global minimizer, whereas for $f_u < f_c$ it has a double potential well, with a local and a global minimizer and a local maximizer in between.

Figure 5.1. Effect of driving force $f_u$ on potential diagrams $J(u)$: (a) $f_u > f_c$; (b) $f_u = f_c$; (c) $f_u < f_c$.

Since the force field $f_u$ depends on both time $t$ and the state $u$, any numerical error at each iteration may lead the state $u$ to very different critical points of the nonconvex action $P(u)$. This is one of the main reasons why traditional perturbation analysis and the direct approaches cannot successfully be applied to nonconvex systems. For the Duffing system, Fig. 5.3 shows clearly that for the same given data, different numerical algorithms produce very different vibration modes and "trajectories" in phase space $u$-$p$ ($p = u_{,t}$).
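The dependence on the numerical scheme can be probed with a small experiment. The sketch below integrates the Duffing equation (1.4) with a harmonic force, written as a first-order system in $(u, p)$ with $p = u_{,t}$, using two explicit schemes with the same step size: the classical fourth-order Runge-Kutta method and Heun's second-order method. These two schemes are stand-ins chosen for brevity, not the MATLAB solvers used for the figures, and all parameter values are hypothetical.

```python
import math


def duffing_rhs(t, u, p, alpha, lam, amp, omega):
    """Right-hand side of u_tt + alpha*u*(u**2/2 - lam) = amp*cos(omega*t), with p = u_t."""
    return p, amp * math.cos(omega * t) - alpha * u * (0.5 * u * u - lam)


def step_rk4(t, u, p, h, params):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1u, k1p = duffing_rhs(t, u, p, *params)
    k2u, k2p = duffing_rhs(t + h / 2, u + h / 2 * k1u, p + h / 2 * k1p, *params)
    k3u, k3p = duffing_rhs(t + h / 2, u + h / 2 * k2u, p + h / 2 * k2p, *params)
    k4u, k4p = duffing_rhs(t + h, u + h * k3u, p + h * k3p, *params)
    u_new = u + h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
    p_new = p + h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
    return u_new, p_new


def step_heun(t, u, p, h, params):
    """One step of Heun's second-order predictor-corrector method."""
    k1u, k1p = duffing_rhs(t, u, p, *params)
    k2u, k2p = duffing_rhs(t + h, u + h * k1u, p + h * k1p, *params)
    return u + h / 2 * (k1u + k2u), p + h / 2 * (k1p + k2p)


def integrate(step, u0, p0, t_end, h, params):
    """Return the list of (t, u, p) samples produced by the given one-step scheme."""
    n = int(round(t_end / h))
    traj = [(0.0, u0, p0)]
    u, p = u0, p0
    for i in range(n):
        u, p = step(i * h, u, p, h, params)
        traj.append(((i + 1) * h, u, p))
    return traj


def energy(u, p, alpha, lam):
    """Energy p**2/2 + alpha/2*(u**2/2 - lam)**2, conserved when the force is zero."""
    return 0.5 * p * p + 0.5 * alpha * (0.5 * u * u - lam) ** 2


if __name__ == "__main__":
    params = (1.0, 1.0, 0.5, 1.0)  # alpha, lam, amp, omega (hypothetical values)
    a = integrate(step_rk4, 0.1, 0.0, 40.0, 0.05, params)
    b = integrate(step_heun, 0.1, 0.0, 40.0, 0.05, params)
    gap = max(abs(x[1] - y[1]) for x, y in zip(a, b))
    print("largest difference in u(t) between the two schemes:", gap)
```

Printing the largest difference for several step sizes and force amplitudes gives a rough feel for how quickly two schemes drift apart once the trajectory passes near the local maximizer of the energy.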

Figure 5.2a. Numerical results computed by "ode23": (a) u(t); (b) trajectory in phase space u−p.
Figure 5.2b. Numerical results computed by "ode15s": (a) u(t); (b) trajectory in phase space u−p.
Figure 5.3. Numerical results by two different software in MATLAB

1.2 Parameter effects: meta-chaos and trio-chaos

It is well-known that the semi-linear equation (1.2) is also very sensitive to the parameter $\lambda$. Many research papers have shown that for a given load $f(t)$ and initial conditions, certain positive parameters $\lambda$ may lead the Duffing equation to a chaotic vibration. However, the parameter $\lambda$ has a particular physical meaning in each real system concerned, and cannot be chosen arbitrarily. For example, $\lambda$ could be residual strain in solid mechanics, dislocation in material science, or input control in distributed parametric control systems (see Gao, 1989). In the large deformation elastic beam model proposed by the author (see Gao, 1996, 2000d)

$$u_{,tt} + K u_{,xxxx} + \alpha\left(\frac{1}{2}u_{,x}^2 - \lambda\right)u_{,xx} - f = 0 \quad \forall (x, t) \in (0, \ell) \times (0, t_c), \tag{1.6}$$

the parameter $\lambda$ represents the applied axial load, and $K > 0$ is a material constant. Clearly, if $\lambda < 0$, i.e. the beam is subjected to a tensile load, the stored energy density $W(\epsilon) = \frac{1}{2}\alpha\left(\frac{1}{2}\epsilon^2 - \lambda\right)^2$ is a strictly convex function of the bending slope $\epsilon = u_{,x}$. In this case, the beam is in a stable deformation state. However, for a compressive axial load $\lambda > 0$, the function $W(\epsilon)$ is nonconvex with two local potential wells. In static problems, it was shown by the author (see also Gao, 2000a) that when the axial load $\lambda$ is bigger than the Euler buckling load

$$\lambda_c = \inf \frac{\int K u_{,xx}^2 \, dx}{\int u_{,x}^2 \, dx},$$

the beam is in a post-buckled (bifurcation) state. In this case, the total potential

$$P(u) = \int_\Omega \left[\frac{1}{2}K u_{,xx}^2 + \frac{1}{2}\alpha\left(\frac{1}{2}u_{,x}^2 - \lambda\right)^2 - f u\right] d\Omega$$

may have three critical points $u_i(x)$, $i = 1, 2, 3$, at each material point $x \in \Omega \subset \mathbb{R}$: two local minimizers, corresponding to two possible stable buckled states, and one local maximizer, corresponding to an unstable buckled state. The global minimizer of $P$ depends on the lateral load $f$ (see also Fig. 5.1). If the beam is subjected to a periodic dynamical load $f(x, t)$, the two local minimizers of $P$ at each point $(x, t)$ in space-time become extremely unstable. It was also shown by the author in Gao (2000d) that if the displacement $u(x, t)$ can be written in the separated form $u(x, t) = q(t)w(x)$, this nonlinear beam model can eventually be reduced to the Duffing equation. In the case that the beam is subjected to a periodic load $f(t) = C\cos(\omega t)$, then for a given parameter $\lambda > \lambda_c$ and $w(0) = w_0$, i.e. the initial post-buckling state, Figure 5.5a shows a chaotic post-buckling diagram of the deflection vs the force amplitude $C$. This chaotic diagram is well-known to both mathematicians and engineers.

Actually, Figure 5.5b shows that this diagram is only a projection of the trajectories of the Duffing system for each fixed force amplitude $C > 0$. The number of research publications on this chaotic diagram for different semi-linear equations keeps growing. However, most of these efforts are based on traditional perturbation analysis and direct iteration methods. Since the physical parameter $\lambda$ was given discretely, some important physical phenomena in real-life bifurcation problems could not be observed.
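The Euler buckling load $\lambda_c$ defined above is the infimum of a Rayleigh quotient. As a rough illustration, assume a simply supported beam of length $\ell$ (the boundary conditions are not stated in this section) and the trial shapes $u_n(x) = \sin(n\pi x/\ell)$. For these shapes the quotient equals $K(n\pi/\ell)^2$, so the smallest value among them is $K\pi^2/\ell^2$ at $n = 1$. The function name and the midpoint quadrature are choices made for this sketch only.

```python
import math


def rayleigh_quotient(K, length, n, cells=2000):
    """Midpoint-rule value of  int K*u_xx**2 dx / int u_x**2 dx  for u = sin(n*pi*x/length)."""
    k = n * math.pi / length
    h = length / cells
    num = 0.0
    den = 0.0
    for i in range(cells):
        x = (i + 0.5) * h
        u_x = k * math.cos(k * x)
        u_xx = -k * k * math.sin(k * x)
        num += K * u_xx * u_xx * h
        den += u_x * u_x * h
    return num / den


if __name__ == "__main__":
    K, length = 2.0, 1.0  # hypothetical stiffness and beam length
    values = [rayleigh_quotient(K, length, n) for n in (1, 2, 3)]
    print("quotients for n = 1, 2, 3:", values)
    print("smallest trial value:", min(values))
    print("K*pi**2/length**2:   ", K * math.pi ** 2 / length ** 2)
```

The comparison covers only this family of trial shapes; it does not by itself prove that the infimum over all admissible displacements is attained at $n = 1$.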

Figure 5.4a. Chaotic bifurcation diagram: displacement u vs force amplitude
Figure 5.4b. Trajectories of chaotic vibration in force-phase space (u, u,t , C)
Figure 5.5. Chaotic bifurcation for pre-buckled Duffing equation

Based on the nonlinear beam model and the physical meaning of the parameter $\lambda$, it was discovered recently (see Gao, 2002) that for a given linearly increasing compressive load $\lambda = kt + \lambda_0 > 0$, the Duffing system may experience three chaotic bifurcation periods before the beam system is finally crushed (see Fig. 5.6(a)). A closer look at the amplitude $u$ vs the axial load $\lambda$ is given in Fig. 5.8, which reveals a very interesting new phenomenon in nonconvex dynamical systems, i.e. there exists a so-called meta-chaos transition period between the pre-buckling and chaotic bifurcation (see Fig. 5.8b). The author believes that this interesting phenomenon also exists in many other nonconvex systems, and deserves detailed study.

Figure 5.6. Trio-Chaos: Life of the semi-linear nonconvex system with time dependent parameter λ = kt + λ0. Panels: (a) Chaos in beam buckling: u(t) and λ(t); (b) Zone-I: meta-chaos and chaos-1; (c) Zone-II: chaos-2; (d) Zone-III: chaos-3.

Figure 5.7a. Pre- to post-bifurcation: Amplitude u vs λ.
Figure 5.7b. Chaos vase: A closed vision.
Figure 5.8. Meta-chaos: A new phenomenon in chaotic systems

1.3 Global optimization and NP-hard problems

The numerical discretization of the nonconvex variational problem $(\mathcal{P})$ in mathematical physics usually leads to a nonconvex optimization problem in the finite dimensional space $\mathcal{U} = \mathbb{R}^n$, where the bilinear form $\langle u, u^*\rangle = u^T u^*$ is simply the dot product of two vectors, and the operator $A : \mathbb{R}^n \to \mathcal{U}^* = \mathbb{R}^n$ is a symmetric matrix. In discrete dynamical systems, the operator $A = A^T \in \mathbb{R}^{n\times n}$ is usually indefinite. For constrained mathematical programming problems, the function $W(u)$ could be the so-called indicator of the constraint set $\mathcal{U}_k$, defined by

$$W(u) = \begin{cases} 0 & \text{if } u \in \mathcal{U}_k, \\ \infty & \text{otherwise.} \end{cases}$$

This nonsmooth function was first studied by J.J. Moreau in frictional mechanics (Moreau, 1968), and is called the superpotential in nonsmooth mechanics. Clearly, $W(u)$ is a convex function if the feasible set $\mathcal{U}_k$ is a convex subset of $\mathcal{U}$. For example, if we let $\mathcal{U}_k = \{u \in \mathbb{R}^n \mid Bu \le b\}$, where $B$ is an $m \times n$ matrix and $b \in \mathbb{R}^m$ is a vector, then $\mathcal{U}_k$ is a convex subset of $\mathbb{R}^n$. In this case, the primal variational problem can be reduced to the well-known (nonconvex) quadratic minimization problem

$$(\mathcal{P}_b): \quad \min P(u) = \frac{1}{2}u^T A u - u^T f \quad \text{s.t. } Bu \le b. \tag{1.7}$$

Introducing a Lagrange multiplier $\epsilon^* \in \mathbb{R}^m$ to relax the inequality constraint $Bu \le b$, the classical Lagrange function for $(\mathcal{P}_b)$ is given by

$$L(u, \epsilon^*) = \frac{1}{2}u^T A u - f^T u + \epsilon^{*T}(Bu - b). \tag{1.8}$$

Thus the first order Karush-Kuhn-Tucker (KKT) optimality conditions for $(\mathcal{P}_b)$ can be written as follows:

$$Au + B^T \epsilon^* = f, \tag{1.9}$$

$$Bu - b \le 0, \quad \epsilon^* \ge 0, \tag{1.10}$$

$$\epsilon^{*T}(Bu - b) = 0. \tag{1.11}$$
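The conditions (1.9)-(1.11) can be checked mechanically for a given pair $(u, \epsilon^*)$. The sketch below does so with plain Python lists on a hypothetical one-dimensional instance, $\min -\frac{1}{2}u^2$ subject to $-1 \le u \le 1$, for which $A = [-1]$ is negative definite. The function names and the tolerance are assumptions of this sketch.

```python
def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def objective(A, f, u):
    """P(u) = 1/2 u^T A u - u^T f, as in (1.7)."""
    Au = matvec(A, u)
    return 0.5 * sum(ui * a for ui, a in zip(u, Au)) - sum(ui * fi for ui, fi in zip(u, f))


def is_kkt_point(A, B, b, f, u, eps, tol=1e-10):
    """Check (1.9)-(1.11): A u + B^T eps = f,  B u - b <= 0,  eps >= 0,  eps^T (B u - b) = 0."""
    stationarity = [a + c - fi
                    for a, c, fi in zip(matvec(A, u), matvec(transpose(B), eps), f)]
    slack = [r - bi for r, bi in zip(matvec(B, u), b)]
    return (all(abs(s) <= tol for s in stationarity)
            and all(s <= tol for s in slack)
            and all(e >= -tol for e in eps)
            and abs(sum(e * s for e, s in zip(eps, slack))) <= tol)


if __name__ == "__main__":
    A, f = [[-1.0]], [0.0]
    B, b = [[1.0], [-1.0]], [1.0, 1.0]  # encodes u <= 1 and -u <= 1
    candidates = (([1.0], [1.0, 0.0]), ([-1.0], [0.0, 1.0]), ([0.0], [0.0, 0.0]))
    for u, eps in candidates:
        print(u, eps, is_kkt_point(A, B, b, f, u, eps), objective(A, f, u))
```

In this instance all three candidates pass the check, although $u = 0$ is a local maximizer of $P$ and only $u = \pm 1$ attain the minimum value $-\frac{1}{2}$.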

Equation (1.11) is also referred to as the complementarity condition, which is usually written in the form $\epsilon^{*T} \perp (Bu - b)$, i.e. the Lagrange multiplier $\epsilon^* \in \mathbb{R}^m$ should be complementary (perpendicular) to the constraint vector $(Bu - b) \in \mathbb{R}^m$. Any point $\bar{u}$ which satisfies (1.9)-(1.11) is called a KKT stationary point of $(\mathcal{P}_b)$. KKT conditions have wonderful physical meanings in engineering mechanics (see Gao, 1988, 1996, and Chapter 7 in Gao, 2000a). By use of the so-called subdifferential of the superpotential $W(u)$, the KKT conditions (1.9)-(1.11) can be written in a unified elegant format (see Section 6).

In mathematical programming, it is known that the KKT conditions are only necessary for the quadratic optimization problem $(\mathcal{P}_b)$, i.e. if $\bar{u}$ is an optimal solution of $(\mathcal{P}_b)$, then $\bar{u}$ must be a KKT point. If the matrix $A$ is positive semi-definite, or positive definite, then $(\mathcal{P}_b)$ is a convex programming problem. In this case, the KKT conditions are also sufficient for $\bar{u}$ to solve $(\mathcal{P}_b)$, which can be solved easily by any of the polynomial algorithms. However, when $A$ is not positive semi-definite, the cost function $P(u)$ is nonconvex, and it might possess many local minimizers. In this case, $(\mathcal{P}_b)$ becomes a nonconvex problem, and the application of traditional local optimization procedures to this problem cannot guarantee the identification of the global minima.

The nonconvex quadratic programming problem is of great importance from both the mathematical and the application viewpoints. Sahni (1974) first showed that for a negative definite matrix $A$, the problem $(\mathcal{P}_b)$ is NP-hard. This result was also proved by Vavasis (1990, 1991) and by Pardalos (1991). During the last decade, several authors have shown that the general quadratic programming problem $(\mathcal{P}_b)$ is an NP-hard problem in global optimization (cf. Murty and Kabadi, 1987; Horst et al, 2000). It was shown by Pardalos and Vavasis (1990) that even when the matrix $A$ is of rank one with exactly one negative eigenvalue, the problem is NP-hard. In order to solve this difficult problem, many efforts have been made during the last decade.
A comprehensive survey has been given by Floudas and Visweswaran (1995). However, by using the canonical dual transformation method developed recently by the author, a complete set of solutions can be obtained for the problem with certain constraints (see Gao, 2004a,b).

The aim of this article is to present applications of the generalized canonical dual transformation method to the general nonconvex problem $(\mathcal{P})$ in finite dimensional space. We will show that by using this method, the nonconvex primal problem $(\mathcal{P})$ can be transformed into a perfect dual problem $(\mathcal{P}^d)$, and the coupled nonlinear system (1.2) in $\mathbb{R}^n$ can be converted into a dual algebraic equation in $\mathbb{R}^1$. Therefore, a complete set of critical points for the nonconvex function $P(u)$ on the feasible set $\mathcal{U}_k$ can be obtained. The global minimizer of the primal problem is controlled by the triality theorem. Some concrete examples are presented in Section 5 for unconstrained problems, while in Section 6 a set of complete solutions is obtained for quadratic programming with inequality constraints.

2. Canonical Duality Theory: A Brief Review

The concept of duality is one of the most successful ideas in modern science. Inner beauty in natural phenomena is bound up with duality, which is, in particular, significant in mathematics and mechanics. Classical duality theory in convex analysis and optimization can be found in monographs by Rockafellar (1974), Ekeland and Temam (1976), Strang (1986), Sewell (1987), Walk (1989), Ekeland (1990), Goh and Yang (2002) and many more. For nonconvex systems, a so-called canonical duality theory was presented by the author in Gao (2000a).

By the definition introduced in Gao (2000a, 2000c), a Gâteaux differentiable function $\bar{F} : \mathcal{U}_a \to \mathbb{R}$ is said to be a canonical function on $\mathcal{U}_a$ if its Gâteaux derivative $D\bar{F} : \mathcal{U}_a \to \mathcal{U}_a^* \subset \mathcal{U}^*$ is a one-to-one mapping from $\mathcal{U}_a$ onto its range $\mathcal{U}_a^*$. Thus, if $\bar{F}(u)$ is a canonical function, the duality relation $u^* = D\bar{F}(u)$ is invertible on $\mathcal{U}_a \times \mathcal{U}_a^*$, and its Legendre conjugate $\bar{F}^* : \mathcal{U}_a^* \to \mathbb{R}$ can be defined uniquely by the classical Legendre transformation

$$\bar{F}^*(u^*) = \{\langle u, u^*\rangle - \bar{F}(u) \mid D\bar{F}(u) = u^*, \ u \in \mathcal{U}_a\}. \tag{2.1}$$

The duality pair $(u, u^*)$ is called the canonical duality pair on $\mathcal{U}_a \times \mathcal{U}_a^*$ if and only if the duality relations

$$u^* = D\bar{F}(u) \ \Leftrightarrow \ u = D\bar{F}^*(u^*) \ \Leftrightarrow \ \bar{F}(u) + \bar{F}^*(u^*) = \langle u, u^*\rangle \tag{2.2}$$

hold on $\mathcal{U}_a \times \mathcal{U}_a^*$. For example, if the function $W(u)$ in $(\mathcal{P})$ is a canonical function on $\mathcal{U}_a$, then $\bar{F}(u) = \langle u, f\rangle - W(u)$ is also a canonical function for any given $f \in \mathcal{U}_a^*$. In engineering mechanics and physics, if the canonical function $\bar{F}(u)$ represents the stored energy, then its canonical conjugate $\bar{F}^*(u^*)$ is called the complementary energy. The one-to-one canonical duality relation $u^* = D\bar{F}(u)$ represents the constitutive law of the system. A detailed study of canonical duality theory and its applications to general nonconvex systems is given in the monograph Gao (2000a). This paper will discuss the applications to the general nonconvex variational problem $(\mathcal{P})$ presented at the beginning of this paper. Thus, if the function $W(u)$ in the primal problem $(\mathcal{P})$ is a canonical function, the dual problem $(\mathcal{P}^d)$ can be formulated in different ways.
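As a minimal illustration of the duality relations (2.2), take the quadratic function $\bar{F}(u) = \frac{1}{2}au^2$ with $a \ne 0$ on $\mathcal{U}_a = \mathbb{R}$. Its derivative $u^* = au$ is one-to-one, so it is a canonical function in the above sense whether it is convex ($a > 0$) or concave ($a < 0$), and (2.1) gives $\bar{F}^*(u^*) = u^{*2}/(2a)$. The helper names below are hypothetical.

```python
def F(u, a):
    """Quadratic canonical function F(u) = a*u**2/2 with a != 0."""
    return 0.5 * a * u * u


def dF(u, a):
    """Duality relation u_star = DF(u) = a*u."""
    return a * u


def F_conj(u_star, a):
    """Legendre conjugate from (2.1): u*u_star - F(u) at the unique u with DF(u) = u_star."""
    u = u_star / a  # inverse of the duality relation
    return u * u_star - F(u, a)


def dF_conj(u_star, a):
    """Derivative of the conjugate, which inverts the duality relation."""
    return u_star / a


if __name__ == "__main__":
    for a in (2.0, -3.0):
        u = 1.5
        u_star = dF(u, a)
        # The three relations in (2.2): u_star = DF(u), u = DF*(u_star), F + F* = u*u_star.
        print(a, u_star, dF_conj(u_star, a), F(u, a) + F_conj(u_star, a), u * u_star)
```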

2.1 Clarke-Ekeland-Lasry duality

First, we let $F(u) = \langle u, f\rangle - W(u)$, and assume that $W(u)$ is convex. Therefore, $F(u)$ is concave and a canonical function. In this case, the primal function $P(u) = \frac{1}{2}\langle u, Au\rangle + W(u) - \langle u, f\rangle$ can be written in the so-called action form (see Ekeland, 1990, 2003)

$$P(u) = \frac{1}{2}\langle u, Au\rangle - F(u).$$

We shall use the notation $F^{\flat}$ to denote the Fenchel inf-conjugate of $F$, defined by

$$F^{\flat}(u^*) = \inf_{u \in \mathcal{U}_a}\{\langle u, u^*\rangle - F(u)\}.$$

Clearly, $F^{\flat} : \mathcal{U}^* \to \mathbb{R} \cup \{-\infty\}$ is always concave and upper semi-continuous. If $F(u)$ is also concave and upper semi-continuous, then the following Fenchel inf-duality relations hold on $\mathcal{U}_a \times \mathcal{U}_a^*$:

$$u^* \in \partial^+ F(u) \ \Leftrightarrow \ u \in \partial^+ F^{\flat}(u^*) \ \Leftrightarrow \ F(u) + F^{\flat}(u^*) = \langle u, u^*\rangle, \tag{2.3}$$

where $\partial^+ F(u) = -\partial^-(-F(u))$ is called the super-differential of $F$, corresponding to the sub-differential $\partial^-$ in convex analysis. The duality pair $(u, u^*) \in \mathcal{U}_a \times \mathcal{U}_a^*$ is called a Fenchel inf-duality pair if the Fenchel inf-duality relations (2.3) hold on $\mathcal{U}_a \times \mathcal{U}_a^*$. Thus, in the case that $W(u)$ is convex, the first dual action form can be presented as

$$P^c(u) = \frac{1}{2}\langle u, Au\rangle - F^{\flat}(Au).$$

This dual action form was originally given by Clarke (1985) in the case of convex Hamiltonian systems. The generalized formulation is due to Ekeland and Lasry (see Ekeland, 1990, 2003). The dual action principle states that if $F$ is concave, then $\bar{u}$ is a critical point of $P$ if and only if all the $\bar{u}_c \in \bar{u} + \mathrm{Ker}\,A$ are critical points of $P^c$, and the complementarity condition

$$P(\bar{u}) + P^c(\bar{u}_c) = 0 \quad \forall \bar{u}_c \in \bar{u} + \mathrm{Ker}\,A \tag{2.4}$$

holds, where $\mathrm{Ker}\,A = \{u \in \mathcal{U} \mid Au = 0\}$ represents the kernel of $A$. However, if $F(u)$ is nonconcave, the Fenchel-Young inequality $F^{\flat}(u^*) \le \langle u, u^*\rangle - F(u)$ leads to

$$\theta = P(\bar{u}) + P^c(\bar{u}_c) \ge 0.$$

The nonzero $\theta > 0$ is called the duality (or complementarity) gap. This duality gap shows that the Clarke dual action principle does not hold for nonconvex problems. Consistent with the complementarity condition (2.4), the dual action form $P^c$ is also called the complementary action, and the duality gap is referred to as the complementarity gap in Gao (2000a).

Actually, the so-called complementary formulation has been a classical concept in engineering mechanics and physics for about one century, where saying that a problem is a complementary problem means that it is equivalent to the primal problem without any duality gap (see Gao and Strang, 1989, and Gao, 2000a). It seems that engineers and physicists like only the perfect duality formulations. As indicated in the very recent paper by Ivar Ekeland (2003), if $F(u)$ is nonconvex, the (perfect) dual action form (without complementarity gap) is an open problem in nonconvex systems. In global optimization, canonical duality is also referred to as perfect duality, or duality with zero duality gap (see Gao, 2003). Perfect duality theory and reformulation are playing more and more important roles in nonlinear mathematical programming. Based on the augmented Lagrangian theory and penalty function methods, a so-called nonlinear Lagrange theory has been developed recently for solving nonconvex constrained optimization problems, where the zero duality gap property is equivalent to the lower semi-continuity of a perturbation function (see Rubinov and Yang, 2003).
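The duality gap discussed in this subsection can be observed numerically in one dimension. The sketch below approximates the Fenchel inf-conjugate on a uniform grid (so the infimum is taken over a bounded interval, an assumption of the sketch) and evaluates the Fenchel-Young gap $\langle u, u^*\rangle - F(u) - F^{\flat}(u^*)$ at pairs with $u^* = DF(u)$. Two hypothetical test functions are used: the concave $F(u) = -\frac{1}{2}u^2$ and the nonconcave $F(u) = -W(u)$ built from a scalar double-well $W$ with $f = 0$.

```python
def inf_conjugate(F, u_star, lo=-4.0, hi=4.0, n=8001):
    """Grid approximation of F_flat(u_star) = inf_u { u*u_star - F(u) } over lo <= u <= hi."""
    h = (hi - lo) / (n - 1)
    return min((lo + i * h) * u_star - F(lo + i * h) for i in range(n))


def gap(F, u, u_star):
    """Fenchel-Young gap u*u_star - F(u) - F_flat(u_star); nonnegative up to grid error."""
    return u * u_star - F(u) - inf_conjugate(F, u_star)


def concave_F(u):
    """Concave test function with DF(u) = -u."""
    return -0.5 * u * u


def nonconcave_F(u, alpha=1.0, lam=1.0):
    """F(u) = -W(u) for the double-well W(u) = alpha/2*(u**2/2 - lam)**2, taking f = 0."""
    return -0.5 * alpha * (0.5 * u * u - lam) ** 2


if __name__ == "__main__":
    # Concave case at u = 1 with u_star = DF(1) = -1: the gap vanishes.
    print("concave:   ", gap(concave_F, 1.0, -1.0))
    # Nonconcave case at u = 0 with u_star = DF(0) = 0: the gap is close to alpha*lam**2/2.
    print("nonconcave:", gap(nonconcave_F, 0.0, 0.0))
```

With $u^* = A\bar{u}$ at a critical point $\bar{u}$, this scalar gap plays the role of $\theta$ above; the point $u = 0$ is used because it satisfies $A\bar{u} = DF(\bar{u})$ for any linear $A$.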

2.2 Lagrangian duality

The second dual formulation is based on the factorization of the self-adjoint (symmetric) operator $A = \Lambda^* K \Lambda$, where $\Lambda : \mathcal{U} \to \mathcal{E}$ is a so-called geometrical operator, which maps each configuration $u \in \mathcal{U}$ into a so-called intermediate space $\mathcal{E}$; the symmetric constitutive operator $K$ links $\mathcal{E}$ with its dual space $\mathcal{E}^*$. Let $\langle \epsilon; \epsilon^*\rangle$ denote the bilinear form on $\mathcal{E} \times \mathcal{E}^*$; the balance operator $\Lambda^* : \mathcal{E}^* \to \mathcal{U}^*$ can then be defined by $\langle \Lambda u; \epsilon^*\rangle = \langle u, \Lambda^* \epsilon^*\rangle$, which maps each dual intermediate variable $\epsilon^* \in \mathcal{E}^*$ back to the dual configuration space $\mathcal{U}^*$. By the definition of the canonical function, if the operator $K : \mathcal{E}_a \to \mathcal{E}_a^*$ is invertible, then the quadratic function $\bar{U}(\epsilon) = \frac{1}{2}\langle \epsilon; K\epsilon\rangle$ is a canonical function on $\mathcal{E}_a$. Moreover, if the feasible space $\mathcal{U}_k$ can be written in the canonical form (see Gao, 2000a)

$$\mathcal{U}_k = \{u \in \mathcal{U}_a \mid \Lambda u \in \mathcal{E}_a\},$$

then, based on the trio-factorization $A = \Lambda^* K \Lambda$, the primal problem $(\mathcal{P})$ can be re-written in the canonical form

$$\min\{P(u) = \bar{U}(\Lambda u) - \bar{F}(u) \mid u \in \mathcal{U}_k\}. \tag{2.5}$$

The criticality condition $DP(\bar{u}) = 0$, i.e. the semi-linear equation (1.2), can in this case be reformulated as

$$\Lambda^* D_{\Lambda}\bar{U}(\Lambda \bar{u}) - D\bar{F}(\bar{u}) = 0,$$

where $D_{\Lambda}\bar{U}(\Lambda u)$ denotes the Gâteaux derivative of $\bar{U}$ with respect to $\epsilon = \Lambda u$. In terms of the canonical duality pairs $(u, u^*)$ and $(\epsilon, \epsilon^*)$, this semi-linear equation can be split into the so-called trio-canonical forms

$$\begin{array}{ll} \text{(a) geometrical equations:} & \epsilon = \Lambda u, \\ \text{(b) duality relations:} & \epsilon^* = D\bar{U}(\epsilon), \quad u^* = D\bar{F}(u), \\ \text{(c) balance equation:} & u^* = \Lambda^* \epsilon^*. \end{array} \tag{2.6}$$
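A finite-dimensional sketch of the three steps in (2.6) for a linear problem: here $\Lambda$ is taken to be a first-difference matrix on a chain of four nodes and $K$ a diagonal matrix of positive stiffnesses, so that $A = \Lambda^T K \Lambda$ is symmetric. The particular $\Lambda$, $K$ and the helper names are assumptions made for illustration.

```python
def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(M, N):
    """Matrix-matrix product for matrices stored as lists of rows."""
    Nt = transpose(N)
    return [[sum(a * b for a, b in zip(row, col)) for col in Nt] for row in M]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


# Geometrical operator: differences between neighbouring nodes (3 x 4).
LAMBDA = [[-1.0, 1.0, 0.0, 0.0],
          [0.0, -1.0, 1.0, 0.0],
          [0.0, 0.0, -1.0, 1.0]]

# Constitutive operator: diagonal matrix of positive stiffnesses (3 x 3).
K = [[2.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 3.0]]


def apply_A(u):
    """Evaluate A u in three steps: eps = Lambda u, eps_star = K eps, u_star = Lambda^T eps_star."""
    eps = matvec(LAMBDA, u)                      # (a) geometrical equation
    eps_star = matvec(K, eps)                    # (b) duality (constitutive) relation
    return matvec(transpose(LAMBDA), eps_star)   # (c) balance equation


# Assembled operator A = Lambda^T K Lambda (4 x 4, symmetric).
A = matmul(transpose(LAMBDA), matmul(K, LAMBDA))

if __name__ == "__main__":
    u = [0.0, 1.0, 3.0, 6.0]
    print("three-step evaluation:", apply_A(u))
    print("assembled A times u:  ", matvec(A, u))
```

Keeping the three factors separate, rather than assembling $A$, is what later allows the constitutive step (b) to be replaced by a nonlinear duality relation without touching $\Lambda$.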

The problem (2.5) is said to be geometrically linear (resp. nonlinear) if the geometrical operator $\Lambda$ is linear (resp. nonlinear); the problem is said to be physically (or constitutively) linear (resp. nonlinear) if both duality relations are linear (resp. nonlinear); the problem is said to be fully nonlinear if it is both geometrically and physically nonlinear (see Gao, 2000a, 2000c).

The development of $\Lambda^*\Lambda$-operator theory was apparently initiated by von Neumann in 1932, and was subsequently extended and put into a more general setting in the studies of complementary variational principles by Noble (1966), Rall (1969), Arthurs (1970, 1980), Tonti (1972), Oden and Reddy (1974, 1983) and Sewell (1987). In the excellent textbook by Strang (1986), the trio-factorization $A = \Lambda^* K \Lambda$ for linear operators can be seen throughout, from continuous theories to discrete systems. For nonlinear operators $A$, the trio-factorization and canonical forms in nonconvex and nonconservative systems were presented in Gao (2000a).

The trio-canonical forms (2.6) serve as a framework for the classical Lagrangian duality theory in geometrically linear systems. Through the classical Lagrangian $L : \mathcal{U}_a \times \mathcal{E}_a^* \to \mathbb{R}$,

$$L(u, \epsilon^*) = \langle \Lambda u; \epsilon^*\rangle - \bar{U}^*(\epsilon^*) - \bar{F}(u), \tag{2.7}$$

the canonical dual function $P^* : \mathcal{E}_k^* \subset \mathcal{E}_a^* \to \mathbb{R}$ can be defined by

$$P^*(\epsilon^*) = \{L(u, \epsilon^*) \mid D_u L(u, \epsilon^*) = 0, \ u \in \mathcal{U}_a\} = \bar{F}^*(\Lambda^* \epsilon^*) - \bar{U}^*(\epsilon^*) \tag{2.8}$$

on the dual feasible space $\mathcal{E}_k^* = \{\epsilon^* \in \mathcal{E}_a^* \mid \Lambda^* \epsilon^* \in \mathcal{U}_a^*\}$ (see Gao, 2000a).
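For a convex instance, the dual function (2.8) can be written down explicitly. The sketch below takes scalars $u, \epsilon \in \mathbb{R}$ with $\Lambda u = gu$, $\bar{U}(\epsilon) = \frac{1}{2}k\epsilon^2$ and $\bar{F}(u) = fu - \frac{1}{2}cu^2$ with $k, c > 0$, so that $P(u) = \frac{1}{2}k(gu)^2 + \frac{1}{2}cu^2 - fu$. The Legendre conjugates are $\bar{U}^*(\epsilon^*) = \epsilon^{*2}/(2k)$ and $\bar{F}^*(u^*) = -(u^* - f)^2/(2c)$, and (2.8) then gives $P^*(\epsilon^*) = -(g\epsilon^* - f)^2/(2c) - \epsilon^{*2}/(2k)$. The symbols $g$, $k$, $c$ and the function names are assumptions of this sketch.

```python
def primal(u, g, k, c, f):
    """P(u) = U(g*u) - F(u) with U(eps) = k*eps**2/2 and F(u) = f*u - c*u**2/2."""
    return 0.5 * k * (g * u) ** 2 + 0.5 * c * u * u - f * u


def dual(eps_star, g, k, c, f):
    """P*(eps_star) = F*(g*eps_star) - U*(eps_star), following (2.8)."""
    return -(g * eps_star - f) ** 2 / (2.0 * c) - eps_star ** 2 / (2.0 * k)


def critical_pair(g, k, c, f):
    """Critical point of P and the dual variable eps_star = DU(g*u) = k*g*u."""
    u_bar = f / (k * g * g + c)
    eps_bar = k * g * u_bar
    return u_bar, eps_bar


if __name__ == "__main__":
    g, k, c, f = 2.0, 3.0, 1.0, 5.0  # hypothetical data
    u_bar, eps_bar = critical_pair(g, k, c, f)
    print("P(u_bar)    =", primal(u_bar, g, k, c, f))
    print("P*(eps_bar) =", dual(eps_bar, g, k, c, f))
```

In this convex instance the two printed values coincide, and every other value of the dual function lies below every value of the primal function.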


Mono-Duality Theory. In geometrically linear static systems, the canonical function $\bar{U}(\epsilon)$ is usually convex and $\bar{F}(u)$ is concave. In this case, (the total potential) $P(u)$ is convex, and $L(u, \epsilon^*)$ is a saddle function on $\mathcal{U}_a \times \mathcal{E}_a^*$. The saddle Lagrange duality theory leads to the classical min-max duality theory in convex systems

$$\inf_{u \in \mathcal{U}_k} P(u) = \inf_{u \in \mathcal{U}_a} \sup_{\epsilon^* \in \mathcal{E}_a^*} L(u, \epsilon^*) = \sup_{\epsilon^* \in \mathcal{E}_a^*} \inf_{u \in \mathcal{U}_a} L(u, \epsilon^*) = \sup_{\epsilon^* \in \mathcal{E}_k^*} P^*(\epsilon^*).$$

Based on this classical saddle-Lagrangian duality, the so-called primal-dual interior-point method has been considered a revolutionary technique in convex programming during the last fifteen years (cf. Wright, 1998).

Bi-Duality Theory. In geometrically linear dynamical systems and game theory, both $\bar{U}(\epsilon)$ and $\bar{F}(u)$ are usually convex. In this case, the canonical function $P(u) = \bar{U}(\Lambda u) - \bar{F}(u)$ is the so-called total action in dynamic systems, which is a d.c. function (i.e. difference of convex functions). The Lagrangian (2.7) associated with the d.c. function $P(u)$ is a so-called super- (or $\partial^+$-) Lagrangian (see Gao, 2000a), i.e. $L(u, \epsilon^*)$ is concave in each of its variables $u$ and $\epsilon^*$, and if $(\bar{u}, \bar{\epsilon}^*)$ is a critical point of $L(u, \epsilon^*)$, then the inequality

$$L(u, \bar{\epsilon}^*) \le L(\bar{u}, \bar{\epsilon}^*) \ge L(\bar{u}, \epsilon^*) \quad \forall (u, \epsilon^*) \in \mathcal{U}_a \times \mathcal{E}_a^*$$

holds. Clearly, by (2.7), the Hamiltonian

$$H(u, \epsilon^*) = \langle \Lambda u; \epsilon^*\rangle - L(u, \epsilon^*) = \bar{F}(u) + \bar{U}^*(\epsilon^*)$$

associated with a super-Lagrangian $L(u, \epsilon^*)$ is always convex in each of its variables. This might be the reason why most people prefer the convex Hamiltonian $H(u, \epsilon^*)$ to the super-Lagrangian $L(u, \epsilon^*)$ in dynamic systems. The super-Lagrangian leads to a so-called bi-duality theory (Gao, 1999a, 2000a, 2001a), i.e. if $(\bar{u}, \bar{\epsilon}^*)$ is a critical point of a super-Lagrangian $L$, then either

$$\inf_{u \in \mathcal{U}_k} P(u) = L(\bar{u}, \bar{\epsilon}^*) = \inf_{\epsilon^* \in \mathcal{E}_k^*} P^*(\epsilon^*),$$

or

$$\sup_{u \in \mathcal{U}_k} P(u) = L(\bar{u}, \bar{\epsilon}^*) = \sup_{\epsilon^* \in \mathcal{E}_k^*} P^*(\epsilon^*).$$

This bi-duality theory plays an important role in periodic convex Hamiltonian systems, as well as in the so-called d.c. programming (cf. Gao, 2000a).

The Lagrange duality theory also plays an important role in nonsmooth convex systems. As illustrated in Gao (2000a, 2000c), if the primal problem is nonsmooth, its Legendre dual problem is smooth. In nonlinear programming, if we can choose a geometrical operator $\Lambda : \mathcal{U} = \mathbb{R}^n \to \mathcal{E} = \mathbb{R}^m$ with $n > m$, then the original primal problem in $\mathbb{R}^n$ can be converted into a dual problem in $\mathbb{R}^m$. This dimension reduction technique is very important in large-scale nonlinear programming.

In the problem $(\mathcal{P})$ considered in the present paper, since the function $W(u)$ is nonconvex, it turns out that $F(u) = \langle u, f\rangle - W(u)$ is no longer a canonical function, and the duality relation $u^* = DF(u)$ is not one-to-one. Thus, the Legendre transformation (2.1) of the nonconvex function $F$ cannot be uniquely defined (see Sewell, 1987). In this case, the Fenchel-Young inequality for the nonconvex function $F$ also produces a nonzero duality gap between the primal function $P(u)$ and its classical Lagrangian dual function $P^*(\epsilon^*)$. This duality gap shows that the well-developed classical Lagrange duality can be used mainly for convex problems or d.c. programming. During the last three decades, many modified versions of the Fenchel-Rockafellar duality have been proposed. One of them, the so-called relaxation method in nonconvex mechanics (cf. Dacorogna, 1989; Atai and Steigmann, 1998), can be used to solve the relaxed convex problems. However, due to the duality gap, these relaxed solutions are not equivalent to the real solutions.

Tremendous efforts have been focused recently on finding the so-called perfect duality theory (i.e. without a duality gap) in global optimization. Some important concepts have been developed in global optimization and variational inequalities (cf. e.g., Ekeland, 1977; Toland, 1978; Auchmuty, 1983, 2001; Penot and Volle, 1990; Singer, 1998; Thach et al, 1993-96; Tuy, 1991, 1995; Rubinov et al, 2001; Gasimov, 2002; Goh and Yang, 2002; Rubinov and Gasimov, 2003, and many more). Generally speaking, the main difficulty is due to the fact that the Legendre conjugate of a general nonconvex function is usually multi-valued. Although a striking example in nonlinear elasticity has been proposed recently by Ekeland (2003), as he pointed out, the general methods and theory for solving nonconvex problems remain open.

2.3 Canonical duality theory

Canonical duality theory and the trio-canonical forms in nonconvex (geometrically nonlinear) systems were originally studied by Gao and Strang (1989) in large deformation variational/boundary value problems governed by nonsmooth duality relations (constitutive laws), where $A = 0$ and the primal problem $(\mathcal{P})$ takes the following stationary variational form:

$$(\mathcal{P}_{sta}): \quad P(u) = \check{W}(\Lambda(u)) - \langle u, f \rangle \to \mathrm{sta} \quad \forall u \in \mathcal{U}_k, \tag{2.9}$$

in which the notation $P(u) \to \mathrm{sta}\ \forall u \in \mathcal{U}_k$ stands for finding all stationary points of $P$ over the feasible space $\mathcal{U}_k$; the internal energy $\check{W}(\epsilon)$

ADVANCES IN MECHANICS AND MATHEMATICS II, 2003

is a convex functional of the canonical (geometric) strain tensor $\epsilon$, while, for a given input $f$, the external energy $F(u) = \langle u, f \rangle$ is a linear functional, and the geometrical measure $\Lambda(u)$ is a quadratic tensor function of the state variable $u$. In the case that $\Lambda(u)$ is Gâteaux differentiable, we have the following decomposition (Gao and Strang, 1989)

$$\Lambda(u) = \Lambda_t(u) u + \Lambda_c(u), \tag{2.10}$$

where $\Lambda_t(u) = D\Lambda(u)$ denotes the Gâteaux derivative of $\Lambda(u)$ with respect to $u$, while $\Lambda_c = \Lambda(u) - \Lambda_t(u) u$ is the so-called complementary operator of $\Lambda_t$. It is by this decomposition (2.10) that Gao and Strang discovered that the duality gap existing in classical Lagrange duality theory can be naturally recovered by the so-called complementary gap function defined by

$$G(u, \epsilon^*) = \langle -\Lambda_c(u); \epsilon^* \rangle. \tag{2.11}$$

Therefore, they proved that the original nonconvex problem (2.9) is equivalent to the following constrained complementary variational problem

$$(\mathcal{P}^c_{sta}): \quad \begin{cases} P^c(u, \epsilon^*) = \check{W}^{\sharp}(\epsilon^*) + G(u, \epsilon^*) \to \mathrm{sta} \quad \forall \epsilon^* \in \mathcal{E}_a^*, \\ \text{s.t. } \Lambda_t^*(u) \epsilon^* = f, \end{cases} \tag{2.12}$$

where the balance operator $\Lambda_t^*(u)$ is the adjoint operator of $\Lambda_t$ defined by the duality pairing $\langle \Lambda_t(u) u; \epsilon^* \rangle = \langle u, \Lambda_t^*(u) \epsilon^* \rangle$, and $\check{W}^{\sharp}(\epsilon^*)$ is the Fenchel super-conjugate:

$$\check{W}^{\sharp}(\epsilon^*) = \sup\{ \langle \epsilon; \epsilon^* \rangle - \check{W}(\epsilon) \mid \epsilon \in \mathcal{E}_a \}.$$

Gao and Strang further proved that if $(\bar{u}, \bar{\epsilon}^*)$ is a critical point of the extended Lagrangian

$$\Xi(u, \epsilon^*) = \langle \Lambda(u); \epsilon^* \rangle - \check{W}^{\sharp}(\epsilon^*) - \langle u, f \rangle, \tag{2.13}$$

then the complementarity condition $P(\bar{u}) + P^c(\bar{u}, \bar{\epsilon}^*) = 0$ holds. Moreover, if $G(\bar{u}, \bar{\epsilon}^*) \ge 0$, then $(\bar{u}, \bar{\epsilon}^*)$ is a saddle point of $\Xi(u, \epsilon^*)$ and $\bar{u}$ is a global minimizer of $P(u)$. Their original work on duality theory in finite field theory leads to a unified framework in fully nonlinear canonical systems (see Fig. 5.9).

For example, let us consider the very simple nonconvex optimization in $\mathbb{R}^n$:

$$\min P(u) = \frac{1}{2} \alpha \left( \frac{1}{2} |u|^2 - \lambda \right)^2 - u^T f \quad \forall u \in \mathbb{R}^n. \tag{2.14}$$

The criticality condition $DP(u) = 0$ leads to a coupled nonlinear algebraic equation system in $\mathbb{R}^n$

$$\alpha \left( \frac{1}{2} |u|^2 - \lambda \right) u = f, \tag{2.15}$$

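[Figure 5.9. Framework in fully nonlinear systems: the pairing $\langle u, u^* \rangle$ on $\mathcal{U} \times \mathcal{U}^*$ and the pairing $\langle \epsilon; \epsilon^* \rangle$ on $\mathcal{E} \times \mathcal{E}^*$ are linked by $\Lambda = \Lambda_t + \Lambda_c$ and $\Lambda_t^* = (\Lambda - \Lambda_c)^*$.]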

which is usually difficult to solve analytically for all roots; it is also difficult to determine which solution is a global minimizer of $P$. However, by the canonical dual transformation, this problem can be solved very easily. Since $W(u) = \frac{1}{2} \alpha (\frac{1}{2} |u|^2 - \lambda)^2$ is a double-well energy, if we choose $\epsilon = \Lambda(u) = \frac{1}{2} |u|^2 - \lambda \in \mathbb{R}$, then $\check{W}(\epsilon) = \frac{1}{2} \alpha \epsilon^2$ is a canonical (quadratic) function. Its Legendre conjugate is simply given by $\check{W}^*(\epsilon^*) = \frac{1}{2} \alpha^{-1} \epsilon^{*2}$. For the quadratic operator $\Lambda(u) : \mathbb{R}^n \to \mathbb{R}$, the Gâteaux differential is simply $\Lambda_t(u) = u^T$, thus the complementary operator is given by $\Lambda_c(u) = (\frac{1}{2} |u|^2 - \lambda) - u^T u = -\frac{1}{2} |u|^2 - \lambda$. The extended Lagrangian for this nonconvex optimization problem is

$$\Xi(u, \epsilon^*) = \left( \frac{1}{2} |u|^2 - \lambda \right) \epsilon^* - \frac{1}{2} \alpha^{-1} \epsilon^{*2} - u^T f. \tag{2.16}$$

For a fixed $\epsilon^* \in \mathbb{R}$, the partial criticality condition $D_u \Xi(u, \epsilon^*) = 0$ leads to the canonical balance equation

$$\Lambda_t^*(u) \epsilon^* - f = u \epsilon^* - f = 0. \tag{2.17}$$

With this condition, the complementary energy $P^c$ in the problem (2.12) takes the following form

$$P^c(u, \epsilon^*) = \frac{1}{2} \alpha^{-1} \epsilon^{*2} + \left( \frac{1}{2} |u|^2 + \lambda \right) \epsilon^*,$$

where $G(u, \epsilon^*) = (\frac{1}{2} |u|^2 + \lambda) \epsilon^* = -\Lambda_c(u) \epsilon^*$ is the complementary gap function. For each $\epsilon^* \neq 0$, the canonical balance equation (2.17) gives $u = f / \epsilon^*$. Substituting this into the extended Lagrangian $\Xi$, the canonical dual function of $P$ can be obtained by the canonical dual transformation

$$P^d(\epsilon^*) = \{ \Xi(u, \epsilon^*) \mid D_u \Xi(u, \epsilon^*) = 0 \} = -\frac{f^T f}{2 \epsilon^*} - \frac{1}{2} \alpha^{-1} \epsilon^{*2} - \lambda \epsilon^* \quad \forall \epsilon^* \neq 0. \tag{2.18}$$

The critical points of this canonical function solve the following dual equation

$$(\alpha^{-1} \epsilon^* + \lambda) \epsilon^{*2} = \frac{1}{2} f^T f. \tag{2.19}$$

For any given parameters $\alpha$, $\lambda$ and the vector $f \in \mathbb{R}^n$, this cubic algebraic equation has at most three roots satisfying $\epsilon^*_1 > 0 > \epsilon^*_2 \ge \epsilon^*_3$, and each of these roots leads to a critical point of the nonconvex function $P(u)$, i.e. $u_i = f / \epsilon^*_i$, $i = 1, 2, 3$. It was shown by the author (Gao, 1997) that $u_1$ is a global minimizer of $P$, while $u_2$ is a local minimizer and $u_3$ is a local maximizer. For the global minimizer $u_1$, we have the saddle duality relation

$$P(u_1) = \min_{u \in \mathbb{R}^n} P(u) = \max_{\epsilon^* > 0} P^d(\epsilon^*) = P^d(\epsilon^*_1),$$

while for the local extrema, the bi-duality relations

$$P(u_2) = \min P(u) = \min P^d(\epsilon^*) = P^d(\epsilon^*_2)$$

and

$$P(u_3) = \max P(u) = \max P^d(\epsilon^*) = P^d(\epsilon^*_3)$$

hold on the neighborhoods of $(u_2, \epsilon^*_2)$ and $(u_3, \epsilon^*_3)$. Actually, this simple but elegant tri-duality theory was originally discovered by the author in the study of post-buckling analysis of an extended beam model (see gao97).

Mathematically speaking, nonconvex problems in functional space are much more difficult than global optimization problems in $\mathbb{R}^n$. However, on the dual side, these mechanics problems possess wonderful physical meaning. For example, in finite deformation theory, the quadratic operator $\epsilon = \Lambda(u) = \frac{1}{2} (\nabla u)^T (\nabla u)$ is the so-called Cauchy-Green strain tensor. For the well-known St. Venant-Kirchhoff materials, the canonical energy $\check{W}(\epsilon)$ is a quadratic function (see page 289, gao00a). In this case, the extended Lagrangian $\Xi(u, \epsilon^*)$ is the well-known Hellinger-Reissner energy. This complementary energy variational principle plays an essential role in large deformation mechanics. However, the extremality condition of this important energy was an open problem for more than 50 years. Recently, in the study of the post-bifurcation in nonconvex mechanics, it was discovered that for a quadratic operator $\Lambda$, if the complementary gap function $G(\bar{u}, \bar{\epsilon}^*)$ is negative, then the critical point $(\bar{u}, \bar{\epsilon}^*)$ is a super (or $\partial^+$-) critical point of the extended Lagrangian $\Xi(u, \epsilon^*)$. In this case, $(\bar{u}, \bar{\epsilon}^*)$ could be either a local minimizer or a local maximizer of $P^c(u, \epsilon^*)$. Therefore, an interesting triality theory was proposed in finite deformation theory and nonsmooth/nonconvex variational analysis (see gao97; gao98; gao99b). This triality solved completely the open problem on the extremality

condition of the Hellinger-Reissner variational principle in nonconvex mechanics.

A self-contained comprehensive presentation of the mathematical theory of duality and triality in general nonconvex, nonsmooth systems was given recently in the monograph gao00a. During the writing of this book, a potentially useful method, i.e. the so-called canonical dual transformation method, was developed. The key idea of this canonical dual transformation method is to choose a certain (geometrically reasonable) operator $\epsilon = \Lambda(u) : \mathcal{U}_a \to \mathcal{E}_a$ such that a given nonconvex function $P(u)$ can be written in the canonical form $P(u) = \Phi(u, \Lambda(u))$, where $\Phi(u, \epsilon) : \mathcal{U}_a \times \mathcal{E}_a \to \mathbb{R}$ is a canonical function in each of its variables (see gao00c). Very often, $\Phi(u, \epsilon) = \bar{W}(\epsilon) - \bar{F}(u)$. Since both $\bar{W} : \mathcal{E}_a \to \mathbb{R}$ and $\bar{F} : \mathcal{U}_a \to \mathbb{R}$ are canonical functions, their Legendre conjugates can be uniquely defined via the classical Legendre transformation. Thus the extended Lagrangian

$$\Xi(u, \epsilon^*) = \langle \Lambda(u); \epsilon^* \rangle - \bar{W}^*(\epsilon^*) - \bar{F}(u) \tag{2.20}$$

is well defined on $\mathcal{U}_a \times \mathcal{E}_a^*$. Then by using the so-called $\Lambda$-canonical dual transformation (see gao00a)

$$\bar{F}^{\Lambda}(\epsilon^*) = \{ \langle \Lambda(u); \epsilon^* \rangle - \bar{F}(u) \mid \Lambda_t^*(u) \epsilon^* - D\bar{F}(u) = 0,\ u \in \mathcal{U}_a \}, \tag{2.21}$$

the canonical dual function of the nonconvex $P(u)$ can be well defined by

$$P^d(\epsilon^*) = \{ \Xi(u, \epsilon^*) \mid D_u \Xi(u, \epsilon^*) = 0,\ u \in \mathcal{U}_a \} = \bar{F}^{\Lambda}(\epsilon^*) - \bar{W}^*(\epsilon^*). \tag{2.22}$$

In the case that $\bar{F}$ is linear and $\Lambda$ is quadratic, the $\Lambda$-conjugate $\bar{F}^{\Lambda}(\epsilon^*)$ is equivalent to the complementary gap function, i.e.

$$\bar{F}^{\Lambda}(\epsilon^*) = \{ G(u, \epsilon^*) \mid \Lambda_t^*(u) \epsilon^* = f,\ u \in \mathcal{U}_a \}.$$

In mathematical physics, the canonical duality relation $\epsilon^* = D\bar{W}(\epsilon)$ is usually called the constitutive law. By the duality of natural phenomena we know that physical variables (always) exist in pairs. The one-to-one duality relation between each canonical dual pair ensures the existence of the geometrical measure $\epsilon = \Lambda(u)$ and the canonical functional for most well-posed systems. Extensive applications of this canonical dual transformation method have been given in nonconvex continuous systems, and some analytical solutions of nonconvex/nonsmooth boundary value problems have been obtained (see gao98; gao99b; gao00c). This method has also been generalized to nonsmooth global optimization problems with an arbitrary nonlinear operator $\Lambda$ (see gao00c).
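The simple example (2.14)-(2.19) of this section can be checked numerically. The sketch below is only an illustration, not the author's procedure: the data `alpha`, `lam` and `f` are hypothetical values chosen here, and the root finder (a grid scan followed by bisection) is a generic stand-in for any polynomial solver. It solves the cubic dual equation (2.19), recovers $u_i = f / \epsilon^*_i$, and compares $P(u_i)$ with $P^d(\epsilon^*_i)$.

```python
# Hypothetical data (not from the source): alpha, lam and f are illustrative.
alpha, lam = 1.0, 1.0
f = [0.2, -0.15, 0.1]
ff = sum(fi * fi for fi in f)                      # f^T f

def P(u):
    """Primal function (2.14)."""
    eps = 0.5 * sum(ui * ui for ui in u) - lam
    return 0.5 * alpha * eps ** 2 - sum(ui * fi for ui, fi in zip(u, f))

def grad_P(u):
    """Residual of the criticality condition (2.15)."""
    eps = 0.5 * sum(ui * ui for ui in u) - lam
    return [alpha * eps * ui - fi for ui, fi in zip(u, f)]

def P_dual(e):
    """Canonical dual function (2.18), defined for e != 0."""
    return -ff / (2.0 * e) - 0.5 * e * e / alpha - lam * e

def g(e):
    """Residual of the cubic dual equation (2.19)."""
    return (e / alpha + lam) * e * e - 0.5 * ff

def bisect(func, lo, hi, iters=200):
    """Plain bisection on a bracket [lo, hi] that contains a sign change."""
    flo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def real_roots(func, lo, hi, steps=4000):
    """Scan [lo, hi] for sign changes and refine each one by bisection."""
    h = (hi - lo) / steps
    roots = []
    for k in range(steps):
        a, b = lo + k * h, lo + (k + 1) * h
        if func(a) * func(b) < 0:
            roots.append(bisect(func, a, b))
    return roots

# Sort so that eps_1 > 0 > eps_2 >= eps_3, as in the text.
dual_roots = sorted(real_roots(g, -3.0, 3.0), reverse=True)
primal_points = [[fi / e for fi in f] for e in dual_roots]   # u_i = f / eps_i

for e, u in zip(dual_roots, primal_points):
    print(f"eps* = {e: .6f}   P(u) = {P(u): .6f}   P^d(eps*) = {P_dual(e): .6f}")
```

For this particular choice of data the cubic has three real roots, so all three critical points appear. Since the dual equation has a single scalar unknown regardless of $n$, the same scan works unchanged for a longer vector `f`.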

3. Canonical Dual Theory and Solutions

Recall the nonconvex problem proposed in the beginning of the paper:

$$(\mathcal{P}): \quad \min P(u) = \frac{1}{2} \langle u, Au \rangle + W(u) - \langle u, f \rangle \quad \forall u \in \mathcal{U}_k.$$

The canonical duality theory for solving this primal problem in infinite dimensional systems has been established recently, and applications have been made to post-buckling analysis of nonconvex beam models (gao00d), nonsmooth/nonconvex/nonconservative dynamics (gao01c), as well as the Landau-Ginzburg equation in superconductivity (see gao02; gao03b; gaolin02). Numerical discretization of these nonconvex variational problems leads to nonconvex global optimization problems, where $A$ is usually a large-scale symmetric matrix. In this paper, we therefore limit our attention to finite dimensional systems.

3.1 Canonical dual transformation and perfect dual problem

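In order to formulate an explicit dual problem, we assume that the operator $A : \mathcal{U}_a \subset \mathcal{U} \to \mathcal{U}_a^* \subset \mathcal{U}^*$ is invertible. Thus, for each given input $f \in \mathcal{U}_a^*$, the function $\bar{F} : \mathcal{U}_a \to \mathbb{R}$, defined by

$$\bar{F}(u) = \langle u, f \rangle - \frac{1}{2} \langle u, Au \rangle,$$

is a canonical function on $\mathcal{U}_a$ since its Gâteaux derivative $u^* = D\bar{F}(u) = f - Au$ is a one-to-one mapping from $\mathcal{U}_a$ onto the range $\mathcal{U}_a^*$. It turns out that $(u, u^*)$ is a (Legendre) canonical duality pair on $\mathcal{U}_a \times \mathcal{U}_a^*$. We further assume that for the given nonconvex function $W(u) : \mathcal{U}_a \to \mathbb{R}$, there exists a geometrical operator $\Lambda(u) : \mathcal{U} \to \mathcal{E}_a$, which maps each $u \in \mathcal{U}_a$ into another metric space $\mathcal{E}$, such that the nonconvex function $W(u)$ can be written in the canonical form $W(u) = \bar{W}(\Lambda(u))$, where $\bar{W}(\epsilon)$ is a canonical function defined on a subset $\mathcal{E}_a \subset \mathcal{E}$. By the definition of the canonical function, $\bar{W} : \mathcal{E}_a \to \mathbb{R}$ is Gâteaux differentiable, and the duality relation $\epsilon^* = D\bar{W}(\epsilon) : \mathcal{E}_a \to \mathcal{E}_a^* \subset \mathcal{E}^*$ is invertible. Let $\langle * ; * \rangle : \mathcal{E} \times \mathcal{E}^* \to \mathbb{R}$ denote the bilinear form on $\mathcal{E} \times \mathcal{E}^*$. Then the Legendre conjugate function $\bar{W}^* : \mathcal{E}_a^* \to \mathbb{R}$ of the canonical function $\bar{W}$ can be obtained uniquely by the classical Legendre transformation

$$\bar{W}^*(\epsilon^*) = \{ \langle \epsilon; \epsilon^* \rangle - \bar{W}(\epsilon) \mid D\bar{W}(\epsilon) = \epsilon^*,\ \epsilon \in \mathcal{E}_a \}, \tag{3.1}$$

and the Legendre canonical duality relations

$$\epsilon^* = D\bar{W}(\epsilon) \ \Leftrightarrow \ \epsilon = D\bar{W}^*(\epsilon^*) \ \Leftrightarrow \ \bar{W}(\epsilon) + \bar{W}^*(\epsilon^*) = \langle \epsilon; \epsilon^* \rangle \tag{3.2}$$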

hold on $\mathcal{E}_a \times \mathcal{E}_a^*$. So the pair $(\epsilon, \epsilon^*)$ is also a Legendre canonical dual pair on $\mathcal{E}_a \times \mathcal{E}_a^*$. For the sake of simplicity, we first limit our attention to the scalar-valued quadratic operator $\Lambda : \mathcal{U} \to \mathcal{E}$

$$\Lambda(u) = \frac{1}{2} |u|^2 - \lambda, \tag{3.3}$$

where $| * |$ is the Euclidean norm, and $\lambda \in \mathbb{R}$ is a constant. The vector-valued nonlinear operator $\Lambda$ will be given in Section 6. Also, by using the so-called sequential canonical dual transformation method developed in gao00a; gao00c, the results of this paper can be generalized to any so-called canonical polynomial operator $\Lambda(u)$ (see gao98; gao00a). Finally, we assume that the feasible set $\mathcal{U}_k$ can be written as

$$\mathcal{U}_k = \{ u \in \mathcal{U}_a \mid \Lambda(u) \in \mathcal{E}_a \}.$$

Since we are interested in finding all critical points of the nonconvex function $P(u)$ over the feasible space $\mathcal{U}_k$, in terms of the canonical function $\bar{W}$ and the geometrical measure $\epsilon = \Lambda(u)$, the primal minimization problem $(\mathcal{P})$ should be rewritten in the canonical stationary variational form ($(\mathcal{P}_{sta})$ in short):

$$(\mathcal{P}_{sta}): \quad P(u) = \bar{W}(\Lambda(u)) - \bar{F}(u) = W(u) + \frac{1}{2} \langle u, Au \rangle - \langle u, f \rangle \to \mathrm{sta} \quad \forall u \in \mathcal{U}_k. \tag{3.4}$$

The criticality condition $DP(\bar{u}) = 0$ leads to the following canonical equation

$$\left( A + D_\Lambda \bar{W}(\Lambda(\bar{u})) I \right) \bar{u} = f, \tag{3.5}$$

where $D_\Lambda \bar{W}$ stands for the Gâteaux derivative of $\bar{W}(\Lambda(u))$ with respect to $\Lambda(u)$, and $I$ is an identity matrix. Clearly, the canonical equation (3.5) is equivalent to the original semi-linear equation (1.2). However, by the canonical dual transformation, a complete set of solutions of this nonlinear system can be obtained via the canonical (i.e. perfect) duality formulation.

Theorem 1 (Perfect Duality Formulation) Suppose that, for a given $f \in \mathcal{U}_a^*$, the dual feasible space

$$\mathcal{E}_k^* = \{ \epsilon^* \in \mathcal{E}_a^* \mid (A + \epsilon^* I) \text{ is invertible and } (A + \epsilon^* I)^{-1} f \in \mathcal{U}_a \} \tag{3.6}$$

is not empty. Then the problem

$$(\mathcal{P}^d_{sta}): \quad P^d(\epsilon^*) = -\frac{1}{2} \langle (A + \epsilon^* I)^{-1} f, f \rangle - \lambda \epsilon^* - \bar{W}^*(\epsilon^*) \to \mathrm{sta} \quad \forall \epsilon^* \in \mathcal{E}_k^* \tag{3.7}$$

is canonically (perfectly) dual to the primal problem $(\mathcal{P}_{sta})$ in the sense that if $\bar{u} \in \mathcal{U}_k$ is a solution of the primal stationary problem $(\mathcal{P}_{sta})$ given in equation (3.4), then $\bar{\epsilon}^* = D_\Lambda \bar{W}(\Lambda(\bar{u}))$ is a solution of the dual problem $(\mathcal{P}^d_{sta})$ and

$$P(\bar{u}) = P^d(\bar{\epsilon}^*). \tag{3.8}$$

Proof. Following the standard procedure of the canonical dual transformation described in Section 2, the extended Lagrangian $\Xi : \mathcal{U}_a \times \mathcal{E}_a^* \to \mathbb{R}$ can be defined as

$$\Xi(u, \epsilon^*) = \langle \Lambda(u); \epsilon^* \rangle - \bar{W}^*(\epsilon^*) + \frac{1}{2} \langle u, Au \rangle - \langle u, f \rangle. \tag{3.9}$$

The criticality condition $D\Xi(\bar{u}, \bar{\epsilon}^*) = 0$ leads to the canonical Lagrange equations:

$$\Lambda(\bar{u}) = D\bar{W}^*(\bar{\epsilon}^*), \tag{3.10}$$

$$\Lambda_t(\bar{u}) \bar{\epsilon}^* = D\bar{F}(\bar{u}) = f - A\bar{u}, \tag{3.11}$$

where $\Lambda_t(\bar{u}) = D\Lambda(\bar{u}) = \bar{u}$ is the Gâteaux derivative of $\Lambda$ at $\bar{u}$. By the Legendre canonical duality relations (3.2), the inverse duality equation (3.10) is equivalent to $\bar{\epsilon}^* = D\bar{W}(\Lambda(\bar{u}))$. Substituting this into (3.11), we obtain the canonical Euler equation (3.5). This shows that the critical points of $\Xi(u, \epsilon^*)$ solve the primal problem, and $P(\bar{u}) = \Xi(\bar{u}, \bar{\epsilon}^*)$.

By definition, for each fixed $\epsilon^* \in \mathcal{E}_a^*$, the canonical dual function $P^d$ is defined by

$$P^d(\epsilon^*) = \{ \Xi(u, \epsilon^*) \mid D_u \Xi(u, \epsilon^*) = 0,\ u \in \mathcal{U}_a \} = \bar{F}^{\Lambda}(\epsilon^*) - \bar{W}^*(\epsilon^*),$$

where the $\Lambda$-canonical dual transformation $\bar{F}^{\Lambda} : \mathcal{E}_a^* \to \mathbb{R}$ of the canonical function $\bar{F}(u) = \langle u, f \rangle - \frac{1}{2} \langle u, Au \rangle$ is defined by

$$\bar{F}^{\Lambda}(\epsilon^*) = \{ \langle \Lambda(u); \epsilon^* \rangle - \bar{F}(u) \mid D\bar{F}(u) = \Lambda_t^*(u) \epsilon^*,\ u \in \mathcal{U}_a \}. \tag{3.12}$$

For a given $f \in \mathcal{U}_a^*$, if the dual feasible space $\mathcal{E}_k^*$ is not empty, then for each $\epsilon^* \in \mathcal{E}_k^*$, the linear equation $D\bar{F}(u) = \Lambda_t^*(u) \epsilon^*$ has a unique solution $\bar{u} = (A + \epsilon^* I)^{-1} f$. Substituting this into the $\Lambda$-canonical dual transformation (3.12), we have

$$\bar{F}^{\Lambda}(\epsilon^*) = -\frac{1}{2} \langle (A + \epsilon^* I)^{-1} f, f \rangle - \lambda \epsilon^*.$$

Thus, on the canonical dual feasible space $\mathcal{E}_k^*$, the canonical dual function $P^d$ is formulated uniquely in the form of (3.7). Moreover, if $(\bar{u}, \bar{\epsilon}^*)$ is a critical point of $\Xi(u, \epsilon^*)$, and $\bar{\epsilon}^* \in \mathcal{E}_k^*$, the canonical Lagrangian equation (3.11) has a unique solution $\bar{u} = (A + \bar{\epsilon}^* I)^{-1} f$.

Substituting this into (3.10), we obtain the dual algebraic equation

$$\frac{1}{2} f^T (A + \bar{\epsilon}^* I)^{-2} f - D\bar{W}^*(\bar{\epsilon}^*) = \lambda. \tag{3.13}$$

This is exactly the criticality condition $DP^d(\bar{\epsilon}^*) = 0$. Thus, the critical point $(\bar{u}, \bar{\epsilon}^*)$ of the extended Lagrangian $\Xi(u, \epsilon^*)$ solves both the primal and dual problems. The Legendre duality relations lead to the equality (3.8). $\Box$

3.2 Complete set of solutions

Theorem 1 shows that there is no duality gap between the primal problem $(\mathcal{P}_{sta})$ and its canonical dual problem $(\mathcal{P}^d_{sta})$. Since the criticality condition (3.13) of $P^d$ is an algebraic equation with only one unknown $\epsilon^* \in \mathbb{R}$, the canonical dual problem $(\mathcal{P}^d_{sta})$ has a finite number of critical points in $\mathcal{E}_k^*$. All these dual solutions $\bar{\epsilon}^*_i$ ($i = 1, 2, \ldots$) form a subset of $\mathcal{E}_k^*$, denoted by

$$\mathcal{E}_s^* = \{ \bar{\epsilon}^* \in \mathcal{E}_k^* \mid D\bar{W}^*(\bar{\epsilon}^*) + \lambda = \frac{1}{2} f^T (A + \bar{\epsilon}^* I)^{-2} f \}. \tag{3.14}$$

The following result shows that the dual solution set $\mathcal{E}_s^*$ leads to a complete set of solutions of the primal problem $(\mathcal{P}_{sta})$.

Theorem 2 (Complete Solution Set) Suppose that the assumption in Theorem 1 holds. For every solution $\bar{\epsilon}^* \in \mathcal{E}_s^*$, the vector $\bar{u}$ defined by

$$\bar{u}(\bar{\epsilon}^*) = (A + \bar{\epsilon}^* I)^{-1} f \tag{3.15}$$

solves the primal problem $(\mathcal{P}_{sta})$. Conversely, every solution $\bar{u}$ of the primal problem $(\mathcal{P}_{sta})$ can be written in the form (3.15) for some dual solution $\bar{\epsilon}^* \in \mathcal{E}_s^*$.

Proof. We first prove that the vector defined by (3.15) solves (3.5). Substituting $(A + \bar{\epsilon}^* I)^{-1} f = \bar{u}$ into the dual algebraic equation (3.13), we obtain the inverse canonical dual relation

$$\Lambda(\bar{u}) = \frac{1}{2} |\bar{u}|^2 - \lambda = D\bar{W}^*(\bar{\epsilon}^*).$$

Since $\bar{W}(\epsilon)$ is a canonical function, by the Legendre duality relation (3.2) we know that $\bar{\epsilon}^* = D\bar{W}(\Lambda(\bar{u}))$. Substituting $\bar{u}(\bar{\epsilon}^*) = (A + D\bar{W}(\Lambda(\bar{u})) I)^{-1} f$ into the left hand side of the canonical equation (3.5) leads to $f$. Thus for every solution $\bar{\epsilon}^*$ of the dual algebraic equation (3.13), $\bar{u} = (A + \bar{\epsilon}^* I)^{-1} f$ solves the canonical equation (3.5), and is a critical point of $P$.

Conversely, if $\bar{u}$ is a solution of the coupled nonlinear system (3.5), then it can be written in the form $\bar{u} = (A + \bar{\epsilon}^*(\bar{u}) I)^{-1} f$ with $\bar{\epsilon}^*(\bar{u}) = D\bar{W}(\Lambda(\bar{u}))$. By Theorem 1 we know that the pair $(\bar{u}, \bar{\epsilon}^*(\bar{u}))$ is a critical point of the extended Lagrangian $\Xi(u, \epsilon^*)$, and $\bar{\epsilon}^*$ is a critical point of $P^d(\epsilon^*)$ on $\mathcal{E}_k^*$. It turns out that $\bar{\epsilon}^* = D\bar{W}(\Lambda(\bar{u}))$ has to be a solution of the canonical dual algebraic equation (3.13). This shows that every solution of the coupled nonlinear system (3.5) can be written in the form $\bar{u} = (A + \bar{\epsilon}^* I)^{-1} f$ for some solution $\bar{\epsilon}^*$ of the dual algebraic equation (3.13). $\Box$

This theorem shows that, by the canonical dual transformation, a complete set of solutions to the nonconvex primal problem is obtained as

$$\mathcal{U}_s = \{ \bar{u} \in \mathcal{U}_k \mid \bar{u} = (A + \bar{\epsilon}^*(\bar{u}) I)^{-1} f \quad \forall \bar{\epsilon}^* \in \mathcal{E}_s^* \}. \tag{3.16}$$

3.3 Global minimizer and local extrema

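For the given nonconvex problem $(\mathcal{P})$, each solution $\bar{u} \in \mathcal{U}_s$ could be only a local extremum point (either local minimizer or local maximizer) of the nonconvex function $P(u)$. In order to determine the global minimizers and local extrema of $P$, we introduce the following subsets

$$\mathcal{E}_+^* = \{ \epsilon^* \in \mathcal{E}_k^* \mid (A + \epsilon^* I) \text{ is positive definite} \}, \tag{3.17}$$

$$\mathcal{E}_-^* = \{ \epsilon^* \in \mathcal{E}_k^* \mid (A + \epsilon^* I) \text{ is negative definite} \}. \tag{3.18}$$

By the triality theory proposed in gao97; gao98; gao00a, the global minimizers and maximizers of the primal problem $(\mathcal{P}_{sta})$ and the dual problem $(\mathcal{P}^d_{sta})$ can be clarified by the following theorem.

Theorem 3 (Global Minimizer and Maximizer) Suppose that the canonical function $\bar{W}(\epsilon)$ is convex on $\mathcal{E}_a$, and for each dual solution $\bar{\epsilon}^* \in \mathcal{E}_s^*$, we let $\bar{u}(\bar{\epsilon}^*) = (A + \bar{\epsilon}^* I)^{-1} f$.

If $\bar{\epsilon}^* \in \mathcal{E}_+^*$, then $\bar{\epsilon}^*$ is a global maximizer of $P^d$ on $\mathcal{E}_+^*$, while $\bar{u}(\bar{\epsilon}^*)$ is a global minimizer of $P$ on $\mathcal{U}_k$, and

$$P(\bar{u}) = \min_{u \in \mathcal{U}_k} P(u) = \max_{\epsilon^* \in \mathcal{E}_+^*} P^d(\epsilon^*) = P^d(\bar{\epsilon}^*). \tag{3.19}$$

Moreover, the dual solution set $\mathcal{E}_s^*$ has at most one element $\bar{\epsilon}^* \in \mathcal{E}_+^*$.

If $\bar{\epsilon}^* \in \mathcal{E}_-^*$, then $\bar{\epsilon}^*$ and the associated $\bar{u}$ are local critical points of $P^d$ and $P$, respectively. In this case, $\bar{u}$ is a local maximizer of $P(u)$ on its neighborhood$^1$ $\mathcal{U}_r \subset \mathcal{U}_k$ if and only if $\bar{\epsilon}^*$ is a local maximizer of $P^d$ on its neighborhood $\mathcal{E}_r^* \subset \mathcal{E}_k^*$, and

$$P(\bar{u}) = \max_{u \in \mathcal{U}_r} P(u) = \max_{\epsilon^* \in \mathcal{E}_r^*} P^d(\epsilon^*) = P^d(\bar{\epsilon}^*). \tag{3.20}$$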
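The first part of Theorem 3 can be probed numerically. The following sketch rests on assumptions made here for illustration: a diagonal matrix $A$ (a symmetric $A$ can be brought to this form by an orthogonal change of variables), the quadratic canonical function $\bar{W}(\epsilon) = \frac{1}{2} \alpha \epsilon^2$, and hypothetical data `a`, `f`, `alpha`, `lam`. On $\mathcal{E}_+^*$ the residual of (3.13) is strictly increasing for this $\bar{W}$, so bisection finds the single root there; random sampling is then only a sanity check of (3.19), not a proof.

```python
import random

# Hypothetical data (not from the source): diagonal A and quadratic
# canonical function W(eps) = alpha * eps^2 / 2.
a = [1.0, -0.5, -2.0]          # diagonal entries (eigenvalues) of A
f = [0.3, -0.2, 0.1]
alpha, lam = 2.0, 1.0

def P(u):
    """Primal function (3.4) with Lambda(u) = |u|^2 / 2 - lam."""
    eps = 0.5 * sum(x * x for x in u) - lam
    quad = 0.5 * sum(ai * x * x for ai, x in zip(a, u))
    return quad + 0.5 * alpha * eps ** 2 - sum(x * fi for x, fi in zip(u, f))

def P_dual(e):
    """Dual function (3.7) with W*(e) = e^2 / (2 alpha)."""
    return (-0.5 * sum(fi * fi / (ai + e) for ai, fi in zip(a, f))
            - lam * e - 0.5 * e * e / alpha)

def h(e):
    """Residual of the dual algebraic equation (3.13) for this W."""
    return (e / alpha + lam
            - 0.5 * sum(fi * fi / (ai + e) ** 2 for ai, fi in zip(a, f)))

# On E_+^* = {e > -min(a)} the matrix A + e I is positive definite and h is
# strictly increasing, so there is at most one root; bracket it and bisect.
lo, hi = -min(a) + 1e-9, -min(a) + 10.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if h(mid) < 0:
        lo = mid
    else:
        hi = mid
eps_bar = 0.5 * (lo + hi)
u_bar = [fi / (ai + eps_bar) for ai, fi in zip(a, f)]    # (A + eps I)^{-1} f

# Sanity check of (3.19): no sampled point should fall below P(u_bar).
rng = random.Random(0)
samples = [[rng.uniform(-4.0, 4.0) for _ in a] for _ in range(20000)]
sampled_min = min(P(u) for u in samples)

print("eps_bar     =", eps_bar)
print("P(u_bar)    =", P(u_bar))
print("P^d(eps_bar)=", P_dual(eps_bar))
print("sampled min =", sampled_min)
```

The bracket starts just to the right of $-\min_i a_i$, which is where $A + \epsilon^* I$ becomes positive definite; the upper end `+ 10.0` is an arbitrary choice that happens to be large enough for these data.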

Proof. The proof of the statement (3.19) follows the original idea presented in the joint paper with Strang (gsa). By the convexity of the canonical function $\bar{W}(\epsilon)$, we know that the inequality

$$\bar{W}(\epsilon) - \bar{W}(\bar{\epsilon}) \ge \langle \epsilon - \bar{\epsilon}; D\bar{W}(\bar{\epsilon}) \rangle \tag{3.21}$$

holds for all $\epsilon, \bar{\epsilon} \in \mathcal{E}$. For any given $u \in \mathcal{U}$, we let $\epsilon = \Lambda(u)$, and particularly, for each solution $\bar{\epsilon}^*$ of the dual algebraic equation (3.13), we let $\bar{u}(\bar{\epsilon}^*) = (A + \bar{\epsilon}^* I)^{-1} f$, and $\bar{\epsilon} = \Lambda(\bar{u})$. Since $\Lambda$ is a quadratic operator, the Taylor expansion of $\epsilon = \Lambda(u)$ at $\bar{u}$ has only three terms

$$\Lambda(u) = \Lambda(\bar{u}) + (\Lambda_t(\bar{u}))^T (u - \bar{u}) - \Lambda_c(u - \bar{u}) = \left( \frac{1}{2} |\bar{u}|^2 - \lambda \right) + \bar{u}^T (u - \bar{u}) + \frac{1}{2} |u - \bar{u}|^2,$$

where $\Lambda_t(\bar{u}) = \bar{u}^T$ is the Gâteaux derivative of $\Lambda(u)$ at $\bar{u}$, while $\Lambda_c(u) = -\frac{1}{2} |u|^2$ is the complementary operator of $\Lambda_t$ (see gsa). Thus, substituting $\epsilon = \Lambda(u)$ and $\bar{\epsilon} = \Lambda(\bar{u})$ into the inequality (3.21) leads to

$$P(u) - P(\bar{u}) \ge \langle u - \bar{u}, (A + D\bar{W}(\Lambda(\bar{u})) I) \bar{u} - f \rangle + \frac{1}{2} \langle u - \bar{u}, (A + D\bar{W}(\Lambda(\bar{u})) I)(u - \bar{u}) \rangle \quad \forall u \in \mathcal{U}_k.$$

By Theorem 2 we know that for each solution $\bar{\epsilon}^*$ of the dual algebraic equation (3.13), $\bar{u}(\bar{\epsilon}^*)$ is a critical point of $P$, and $\bar{\epsilon}^*(\bar{u}) = D\bar{W}(\Lambda(\bar{u}))$. Thus, if $A + \bar{\epsilon}^* I$ is positive definite, we have

$$P(u) - P(\bar{u}) \ge \frac{1}{2} \langle u - \bar{u}, (A + \bar{\epsilon}^*(\bar{u}) I)(u - \bar{u}) \rangle \ge 0 \quad \forall u \in \mathcal{U}_k.$$

This shows that for each solution $\bar{\epsilon}^*$ of the dual algebraic equation (3.13), if $A + \bar{\epsilon}^* I$ is positive definite, $\bar{u}(\bar{\epsilon}^*)$ is a global minimizer of $P(u)$ over $\mathcal{U}_k$. Moreover, if $\bar{W}$ is strictly convex, then the inequality (3.21) holds strictly. Thus if $\bar{\epsilon}^* \in \mathcal{E}_s^* \cap \mathcal{E}_+^*$, and $\bar{u} = (A + \bar{\epsilon}^* I)^{-1} f$, then for all $u \in \mathcal{U}_k$ such that $u \neq \bar{u}$, we have

$$P(u) - P(\bar{u}) > \frac{1}{2} \langle u - \bar{u}, (A + \bar{\epsilon}^* I)(u - \bar{u}) \rangle > 0.$$

This shows that $\bar{u}$ is a unique global minimizer of $P$ over $\mathcal{U}_k$. By the fact that $(\Lambda(\bar{u}), \bar{\epsilon}^*)$ is a canonical duality pair on $\mathcal{E}_a \times \mathcal{E}_a^*$, we know that the dual solution set $\mathcal{E}_s^*$ has a unique element $\bar{\epsilon}^* \in \mathcal{E}_+^*$.

If $\bar{\epsilon}^* \in \mathcal{E}_-^*$, then $(\bar{u}, \bar{\epsilon}^*)$ is a so-called super-critical point of the extended Lagrangian $\Xi(u, \epsilon^*)$, i.e. $\Xi(u, \epsilon^*)$ is locally concave in each of its variables $u$ and $\epsilon^*$ on the neighborhood $\mathcal{U}_r \times \mathcal{E}_r^*$. In this case, we have

$$P(\bar{u}) = \max_{u \in \mathcal{U}_r} \max_{\epsilon^* \in \mathcal{E}_r^*} \Xi(u, \epsilon^*) = \max_{\epsilon^* \in \mathcal{E}_r^*} \max_{u \in \mathcal{U}_r} \Xi(u, \epsilon^*) = P^d(\bar{\epsilon}^*)$$

by the fact that the maxima of the super-Lagrangian $\Xi(u, \epsilon^*)$ can be taken in either order on the open set $\mathcal{U}_r \times \mathcal{E}_r^*$ (see gao00a). This proves the remaining part of the theorem and (3.20). $\Box$

Remark Theorem 3 can also be simply proved by the triality theory developed in gao00a, i.e. for the convex canonical function $\bar{W}(\epsilon)$ (but $W(u) = \bar{W}(\Lambda(u))$ may not be convex in $u$), its Legendre conjugate $\bar{W}^*(\epsilon^*)$ is also convex, then if $\bar{\epsilon}^* \in \mathcal{E}_+^*$, the extended Lagrangian (3.9)

$$\Xi(u, \epsilon^*) = \langle \Lambda(u); \epsilon^* \rangle - \bar{W}^*(\epsilon^*) + \frac{1}{2} \langle u, Au \rangle - \langle u, f \rangle$$

is a saddle function at the critical point $(\bar{u}, \bar{\epsilon}^*)$. In this case, the classical saddle min-max theory leads to (3.19). If $\bar{\epsilon}^* \in \mathcal{E}_-^*$, then $\Xi(u, \epsilon^*)$ is a so-called super-Lagrangian in the neighborhood of $(\bar{u}, \bar{\epsilon}^*)$. In this case, the bi-duality theory developed in gao00a proves the double max duality relation (3.20), as well as the double min duality relation

$$P(\bar{u}) = \min_{u \in \mathcal{U}_r} P(u) = \min_{\epsilon^* \in \mathcal{E}_r^*} P^d(\epsilon^*) = P^d(\bar{\epsilon}^*), \tag{3.22}$$

under certain additional constraints.

In the case that the canonical function $\bar{W}(\epsilon)$ is concave, a parallel theorem can be obtained simply by applying the triality theory (see gao98), i.e. if $\bar{\epsilon}^* \in \mathcal{E}_-^*$, then $(\bar{u}, \bar{\epsilon}^*)$ is the so-called left-saddle point of $\Xi(u, \epsilon^*)$; in this case, we have

$$P(\bar{u}) = \max_{u \in \mathcal{U}_k} P(u) = \min_{\epsilon^* \in \mathcal{E}_-^*} P^d(\epsilon^*) = P^d(\bar{\epsilon}^*). \tag{3.23}$$

Dually, if $\bar{\epsilon}^* \in \mathcal{E}_+^*$, then $(\bar{u}, \bar{\epsilon}^*)$ is the so-called sub-critical point of $\Xi(u, \epsilon^*)$; in this case, the bi-duality theory leads to the double min (3.22) and the double max (3.20) duality relations under certain conditions.

4. Applications to Unconstrained Global Optimization

The canonical dual transformation method and associated triality theory can be used to solve many nonconvex problems in engineering and science. Some applications in nonconvex mechanics have been given in the recent papers (see gao03b; gao-li-v). This section presents some examples in finite dimensional space.

4.1 Quadratic $\bar{W}(\epsilon)$

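First, let us consider the following unconstrained nonconvex stationary problem

$$P(u) = \frac{1}{2} u^T A u + \frac{1}{2} \alpha \left( \frac{1}{2} |u|^2 - \lambda \right)^2 - f^T u \to \mathrm{sta} \quad \forall u \in \mathbb{R}^n, \tag{4.1}$$

where $A \in \mathbb{R}^{n \times n}$ is a given symmetric matrix, $f \in \mathbb{R}^n$ is a given vector, and $\alpha, \lambda > 0$ are positive constants. In this problem, $W(u)$ is a fourth-order canonical polynomial (cf. gao00a) in $\mathcal{U} = \mathbb{R}^n$

$$W(u) = \frac{1}{2} \alpha \left( \frac{1}{2} |u|^2 - \lambda \right)^2. \tag{4.2}$$

In two dimensional space $\mathbb{R}^2$, this double-well energy $W(u)$ is also called the "Mexican hat" function in cosmology and theoretical physics (see Gao, 2000b). In $\mathbb{R}^n$, the nonconvex function $P(u)$ may have many local critical points, which depend on the matrix $A$.

To solve this nonconvex problem by the canonical dual transformation method, we let $\mathcal{U} = \mathbb{R}^n = \mathcal{U}^*$. The geometrical measure $\epsilon = \Lambda(u) = \frac{1}{2} |u|^2 - \lambda$ is a quadratic operator from $\mathcal{U} = \mathbb{R}^n$ into $\mathcal{E} = \mathbb{R}$. By the fact that $\frac{1}{2} |u|^2 = \epsilon + \lambda \ge 0\ \forall u \in \mathcal{U}$, the range of the quadratic mapping $\Lambda$ is

$$\mathcal{E}_a = \{ \epsilon \in \mathbb{R} \mid \epsilon + \lambda \ge 0 \}.$$

Then on $\mathcal{E}_a$, the canonical function $\bar{W} : \mathcal{E}_a \to \mathbb{R}$ is simply a quadratic function $\bar{W}(\epsilon) = \frac{1}{2} \alpha \epsilon^2$. For a given $f \in \mathbb{R}^n$, the function

$$F(u) = \langle u, f \rangle - \frac{1}{2} \langle u, Au \rangle = u^T f - \frac{1}{2} u^T A u$$

is a quadratic function on $\mathcal{U}_a = \mathbb{R}^n$. By the fact that $u^* = DF(u) = f - Au$, the range for the canonical mapping $DF : \mathcal{U}_a \to \mathcal{U}^*$ is $\mathcal{U}_a^* = \mathbb{R}^n$. The feasible set for the primal problem is $\mathcal{U}_k = \{ u \in \mathcal{U}_a \mid \Lambda(u) \in \mathcal{E}_a \} = \mathbb{R}^n$. Thus, the primal problem in canonical form is to find all critical points of $P(u)$ such that

$$(\mathcal{P}_{sta}): \quad P(u) = \frac{1}{2} \langle u, Au \rangle + \frac{1}{2} \alpha (\Lambda(u))^2 - \langle u, f \rangle \to \mathrm{sta} \quad \forall u \in \mathbb{R}^n. \tag{4.3}$$

The Euler equation associated with this nonconvex variational problem is a coupled nonlinear algebraic system in $\mathbb{R}^n$

$$A \bar{u} + \alpha \left( \frac{1}{2} |\bar{u}|^2 - \lambda \right) \bar{u} = f.$$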

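Since the canonical function $\bar{W}(\epsilon) = \frac{1}{2} \alpha \epsilon^2$ is quadratic, the canonical dual relation $\epsilon^* = \alpha \epsilon$ is invertible on $\mathcal{E}_a$. The range of the canonical dual mapping $D\bar{W} : \mathcal{E}_a \to \mathcal{E}^*$ is

$$\mathcal{E}_a^* = \{ \epsilon^* \in \mathbb{R} \mid \epsilon^* \ge -\lambda \alpha \}.$$

For each $\epsilon^* \in \mathcal{E}_a^*$, the Legendre conjugate of $\bar{W}$

$$\bar{W}^*(\epsilon^*) = \{ \epsilon \epsilon^* - \bar{W}(\epsilon) \mid D\bar{W}(\epsilon) = \epsilon^* \} = \frac{1}{2} \alpha^{-1} \epsilon^{*2}$$

is also a quadratic function. Thus, on the dual feasible space

$$\mathcal{E}_k^* = \{ \epsilon^* \in \mathbb{R} \mid \det(A + \epsilon^* I) \neq 0,\ \epsilon^* \ge -\alpha \lambda \}$$

the canonical dual problem is formulated as

$$(\mathcal{P}^d_{sta}): \quad P^d(\epsilon^*) = -\frac{1}{2} f^T (A + \epsilon^* I)^{-1} f - \frac{1}{2} \alpha^{-1} \epsilon^{*2} - \lambda \epsilon^* \to \mathrm{sta} \quad \forall \epsilon^* \in \mathcal{E}_k^*. \tag{4.4}$$

The canonical dual algebraic equation associated with this dual problem is

$$\alpha^{-1} \epsilon^* + \lambda = \frac{1}{2} f^T (A + \epsilon^* I)^{-2} f. \tag{4.5}$$

For the given $f \in \mathbb{R}^n$ and the parameters $\alpha, \lambda > 0$, if the symmetric matrix $A = A^T \in \mathbb{R}^{n \times n}$ has $p \le n$ distinct eigenvalues $a_1 < a_2 < \cdots < a_p$, this algebraic equation has at most $2p + 1$ real roots $\bar{\epsilon}^*_1 > \bar{\epsilon}^*_2 \ge \bar{\epsilon}^*_3 \ge \cdots \ge \bar{\epsilon}^*_{2p+1}$, which can be obtained by using MATHEMATICA. These dual solutions lead to at most $2p + 1$ critical points of $P(u)$:

$$\bar{u}_i = (A + \bar{\epsilon}^*_i I)^{-1} f, \quad i = 1, 2, \ldots, 2p + 1. \tag{4.6}$$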

By Theorem 3, if $\alpha > 0$, then $\bar{u}_1 = (A + \bar{\epsilon}^*_1 I)^{-1} f$ is the global minimizer of $P(u)$, and $\bar{u}_{2p+1} = (A + \bar{\epsilon}^*_{2p+1} I)^{-1} f$ is a local maximizer of $P(u)$.

Example 1 In the case of $n = 1$, the nonconvex function $P(x) = \frac{1}{2} a x^2 + \frac{1}{2} \alpha (\frac{1}{2} x^2 - \lambda)^2 - cx$ has at most two potential wells and one local maximizer. Its canonical dual function

$$P^d(\epsilon^*) = -\frac{1}{2} c^2 (a + \epsilon^*)^{-1} - \frac{1}{2} \epsilon^{*2} / \alpha - \lambda \epsilon^* \tag{4.7}$$

is discontinuous at $\epsilon^* = -a$ (see Fig. 5.10). If we choose $a = -0.5$, $\lambda = 1.3$, $\alpha = 1.0$ and $c = 0.2$, the dual algebraic equation has three real roots: $\bar{\epsilon}^*_3 = -1.29378 < \bar{\epsilon}^*_2 = 0.391255 < \bar{\epsilon}^*_1 = 0.60253$, which give the three critical points $\bar{u}_i = c / (a + \bar{\epsilon}^*_i)$ of $P(x)$: $\bar{u}_1 = 1.95066$, $\bar{u}_2 = -1.83916$ and $\bar{u}_3 = -0.111496$. Since $\bar{\epsilon}^*_1 + a > 0$ and $\bar{\epsilon}^*_i + a < 0$ for $i = 2, 3$, by Theorem 3 we know that $\bar{u}_1 = 1.95066$ is a global minimizer, while $\bar{u}_2 = -1.83916$ is a local minimizer and $\bar{u}_3 = -0.111496$ is a local maximizer.

[Figure 5.10. Double-well energy $P(x)$ and its dual $P^d(\epsilon^*)$.]
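The numbers of Example 1 can be reproduced with a few lines of code. This is a minimal sketch rather than the computation used in the text (which mentions MATHEMATICA): the dual equation (4.5) for $n = 1$ is multiplied by $(a + \epsilon^*)^2$ to remove its pole, and the resulting cubic is solved by a grid scan plus bisection. The helper names `bisect`, `real_roots` and `d2P` are introduced here for illustration only.

```python
a, lam, alpha, c = -0.5, 1.3, 1.0, 0.2       # parameters of Example 1

def P(x):
    """Primal double-well function of Example 1."""
    return 0.5 * a * x * x + 0.5 * alpha * (0.5 * x * x - lam) ** 2 - c * x

def d2P(x):
    """Second derivative of P, used only to label each critical point."""
    return a + alpha * (0.5 * x * x - lam) + alpha * x * x

def P_dual(e):
    """Canonical dual function (4.7)."""
    return -0.5 * c * c / (a + e) - 0.5 * e * e / alpha - lam * e

def g(e):
    """Dual equation (4.5) for n = 1, multiplied by (a + e)^2."""
    return (e / alpha + lam) * (a + e) ** 2 - 0.5 * c * c

def bisect(func, lo, hi, iters=200):
    """Plain bisection on a bracket [lo, hi] that contains a sign change."""
    flo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def real_roots(func, lo, hi, steps=6000):
    """Scan [lo, hi] for sign changes and refine each one by bisection."""
    h = (hi - lo) / steps
    found = []
    for k in range(steps):
        left, right = lo + k * h, lo + (k + 1) * h
        if func(left) * func(right) < 0:
            found.append(bisect(func, left, right))
    return found

roots = sorted(real_roots(g, -3.0, 3.0), reverse=True)   # eps_1 > eps_2 > eps_3
points = [c / (a + e) for e in roots]                    # u_i = c / (a + eps_i)

for e, x in zip(roots, points):
    kind = "minimum" if d2P(x) > 0 else "maximum"
    print(f"eps* = {e: .6f}  u = {x: .6f}  P = {P(x): .6f}  "
          f"P^d = {P_dual(e): .6f}  local {kind}")
```

The sign of the second derivative is used here only as an independent check on the labels that Theorem 3 assigns through the sign of $a + \bar{\epsilon}^*_i$.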

Example 2 In two dimensional space $\mathbb{R}^2$, the nonconvex function $P(u)$ has at most $2n + 1 = 5$ critical points. We simply choose $A = \{a_{ij}\}$ with $a_{11} = 0.5$, $a_{22} = -0.6$, $a_{12} = a_{21} = 0$, and $f = \{f_1, f_2\}$ with $f_1 = 0.2$, $f_2 = -0.15$. For the given parameters $\lambda = 1.3$ and $\alpha = 1.0$, the graph of $P(u)$ is a nonconvex surface (see Fig. 5.11a) with four potential wells and one local maximizer. The graph of the canonical dual function $P^d(\epsilon^*)$ is a two-dimensional curve (see Fig. 5.11b). The canonical dual algebraic equation (4.5) has a total of five real roots:

$$\bar{\epsilon}^*_5 = -1.26234 < \bar{\epsilon}^*_4 = -0.680712 < \bar{\epsilon}^*_3 = -0.353665 < \bar{\epsilon}^*_2 = 0.520982 < \bar{\epsilon}^*_1 = 0.675737,$$

and we have

$$P^d(\bar{\epsilon}^*_5) > P^d(\bar{\epsilon}^*_4) = 0.772699 > P^d(\bar{\epsilon}^*_3) = 0.272349 > P^d(\bar{\epsilon}^*_2) = -0.690204 > P^d(\bar{\epsilon}^*_1).$$

Since $(A + \bar{\epsilon}^*_1 I)$ is positive definite, by Theorem 3, we know that $\bar{u}_1 = (A + \bar{\epsilon}^*_1 I)^{-1} f = \{0.170106, -1.98054\}$ is a global minimizer of $P(u)$, and $P(\bar{u}_1) = P^d(\bar{\epsilon}^*_1) = -1.27232$. By Theorem 3 (also from the graph of $P^d$), we know that $\bar{u}_5 = (A + \bar{\epsilon}^*_5 I)^{-1} f = \{-0.2623, 0.0805\}$ is the biggest local maximizer of $P$ and $P(\bar{u}_5) = P^d(\bar{\epsilon}^*_5) = 0.876567$ since $\bar{\epsilon}^*_5$ is a local maximizer of $P^d$ and $(A + \bar{\epsilon}^*_5 I)$ is negative definite.

[Figure 5.11. Graphs of the primal function $P(x_1, x_2)$ and its canonical dual for Example 2: (a) graph of $P(u)$; (b) graph of $P^d(\epsilon^*)$.]
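For a diagonal matrix $A$, as in Example 2, the whole solution set (4.6) can be generated from the scalar dual equation. The sketch below is one possible implementation under that assumption; it clears the poles of (4.5) by multiplying with $\prod_i (a_i + \epsilon^*)^2$ and then scans for real roots. The function names and the scan interval $[-3, 3]$ are choices made here, not part of the source.

```python
a = [0.5, -0.6]              # diagonal entries of A in Example 2
f = [0.2, -0.15]
lam, alpha = 1.3, 1.0

def P(u):
    """Primal function (4.1) for diagonal A."""
    eps = 0.5 * sum(x * x for x in u) - lam
    quad = 0.5 * sum(ai * x * x for ai, x in zip(a, u))
    return quad + 0.5 * alpha * eps ** 2 - sum(x * fi for x, fi in zip(u, f))

def P_dual(e):
    """Canonical dual function (4.4) for diagonal A."""
    return (-0.5 * sum(fi * fi / (ai + e) for ai, fi in zip(a, f))
            - 0.5 * e * e / alpha - lam * e)

def g(e):
    """Equation (4.5) multiplied by prod_i (a_i + e)^2, so it has no poles."""
    prod_all = 1.0
    for ai in a:
        prod_all *= (ai + e) ** 2
    rhs = 0.0
    for i, fi in enumerate(f):
        term = fi * fi
        for j, aj in enumerate(a):
            if j != i:
                term *= (aj + e) ** 2
        rhs += term
    return (e / alpha + lam) * prod_all - 0.5 * rhs

def bisect(func, lo, hi, iters=200):
    """Plain bisection on a bracket [lo, hi] that contains a sign change."""
    flo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def real_roots(func, lo, hi, steps=6000):
    """Scan [lo, hi] for sign changes and refine each one by bisection."""
    h = (hi - lo) / steps
    found = []
    for k in range(steps):
        left, right = lo + k * h, lo + (k + 1) * h
        if func(left) * func(right) < 0:
            found.append(bisect(func, left, right))
    return found

def definiteness(e):
    """Definiteness of the diagonal matrix A + e I."""
    shifted = [ai + e for ai in a]
    if all(s > 0 for s in shifted):
        return "positive definite"
    if all(s < 0 for s in shifted):
        return "negative definite"
    return "indefinite"

roots = sorted(real_roots(g, -3.0, 3.0), reverse=True)       # eps_1 > ... > eps_5
points = [[fi / (ai + e) for ai, fi in zip(a, f)] for e in roots]

for e, u in zip(roots, points):
    print(f"eps* = {e: .6f}  u = ({u[0]: .6f}, {u[1]: .6f})  "
          f"P = {P(u): .6f}  A + eps* I is {definiteness(e)}")
```

A full symmetric $A$ would need an eigendecomposition first, with `a` replaced by the eigenvalues and `f` by its coordinates in the eigenvector basis; that step is left out to keep the sketch free of external libraries.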

Example 3 In a high dimensional space with $n > 2$, it is very difficult to find all critical points and global minimizers of $P(u)$. However, the graph of the canonical dual function $P^d$ is only a plane curve. If $a_1 < a_2 < \cdots < a_p$ are distinct eigenvalues of $A$, then within each interval $-a_{r+1} < \epsilon^* < -a_r$, the canonical dual function $P^d(\epsilon^*)$ has at most two critical points $\bar{\epsilon}^*_{r1} \le \bar{\epsilon}^*_{r2}$, where $\bar{\epsilon}^*_{r1}$ is a local maximizer and $\bar{\epsilon}^*_{r2}$ is a local minimizer of $P^d$. For $n = 4$, we let $a_{11} = -2.5$, $a_{22} = -1.8$, $a_{33} = -0.5$, $a_{44} = 1.4$, $a_{ij} = 0$ for all $i \neq j$, $f = (-0.2, 0.5, 0.3, 0.2)^T$ and $\lambda = 2.8$. The graph of $P(u)$ then lies in $\mathbb{R}^5$ and cannot be visualized. However, the graph of $P^d$ is shown in Fig. 5.12.

4.2 Concave $\bar{W}(\epsilon)$

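Now let us consider the following constrained nonconvex problem

$$(\mathcal{P}): \quad P(u) = \frac{1}{2} u^T A u + \alpha \sqrt{\lambda - \frac{1}{2} |u|^2} - f^T u \to \mathrm{sta} \quad \forall u \in \mathcal{U}_k, \tag{4.8}$$

where $\alpha, \lambda > 0$ are given parameters, and the feasible set

$$\mathcal{U}_k = \{ u \in \mathbb{R}^n \mid 0 \le \frac{1}{2} |u|^2 \le \lambda \}$$

is an $n$-dimensional ball with radius $\rho = \sqrt{2\lambda}$. Thus, by choosing the geometrical measure $\epsilon = \Lambda(u) = \frac{1}{2} |u|^2 - \lambda$, the canonical function

$$\bar{W}(\epsilon) = \alpha \sqrt{-\epsilon} \quad \forall \epsilon \le 0$$

is a concave canonical function defined on its domain

$$\mathcal{E}_a = \{ \epsilon \in \mathbb{R} \mid \epsilon \le 0 \}.$$

[Figure 5.12. Graph of $P^d(\epsilon^*)$ for the four-dimensional problem of Example 3.]

The canonical dual variable is $\epsilon^* = D\bar{W}(\epsilon) = -\frac{1}{2} \alpha (-\epsilon)^{-1/2}$. The range of the canonical dual mapping $D\bar{W} : \mathcal{E}_a \to \mathcal{E}_a^* \subset \mathbb{R}$ is also a set of negative real numbers

$$\mathcal{E}_a^* = \{ \epsilon^* \in \mathbb{R} \mid \epsilon^* \le 0 \}.$$

The Legendre conjugate for this concave function is

$$\bar{W}^*(\epsilon^*) = \{ \langle \epsilon; \epsilon^* \rangle - \bar{W}(\epsilon) \mid D\bar{W}(\epsilon) = \epsilon^*,\ \epsilon \in \mathcal{E}_a \} = \frac{\alpha^2}{4 \epsilon^*}.$$

Thus, on the dual feasible space

$$\mathcal{E}_k^* = \{ \epsilon^* \in \mathcal{E}_a^* \mid \det(A + \epsilon^* I) \neq 0 \} = \{ \epsilon^* \in \mathbb{R} \mid \det(A + \epsilon^* I) \neq 0,\ \epsilon^* \le 0 \},$$

the canonical dual problem is

$$P^d(\epsilon^*) = -\frac{1}{2} f^T (A + \epsilon^* I)^{-1} f - \frac{\alpha^2}{4 \epsilon^*} - \lambda \epsilon^* \to \mathrm{sta} \quad \forall \epsilon^* \in \mathcal{E}_k^*.$$

Setting the derivative of $P^d$ to zero, which is (3.13) with $D\bar{W}^*(\epsilon^*) = -\alpha^2 / (4 \epsilon^{*2})$, the dual algebraic equation takes the following form

$$\lambda - \frac{\alpha^2}{4 \epsilon^{*2}} = \frac{1}{2} f^T (A + \epsilon^* I)^{-2} f. \tag{4.9}$$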

The number of solutions of this nonlinear equation depends on the number of eigenvalues of A.

Example 4 For $n = 1$, if we choose $A = \{a_{11}\} = 1.3 > 0$, $f = 0.62$, $\alpha = 1.0$, $\lambda = 2.0$, the equation (4.9) has a total of four roots

$$\bar{\epsilon}^*_1 = -1.61768 < \bar{\epsilon}^*_2 = -0.966937 < \bar{\epsilon}^*_3 = -0.375268 < \bar{\epsilon}^*_4 = 0.359885.$$

Since the positive root $\bar{\epsilon}^*_4 \notin \mathcal{E}_k^*$, the nonconvex problem $(\mathcal{P})$ has a total of three solutions in $\mathcal{U}_k$, one for each $\bar{\epsilon}^*_i \in \mathcal{E}_k^*$, $i = 1, 2, 3$:

$$\{\bar{u}_i\} = \{(a + \bar{\epsilon}^*_i)^{-1} f\} = \{-1.95165,\ 1.86151,\ 0.670465\}.$$

Since $\bar{W}(\epsilon)$ is concave and $(a + \bar{\epsilon}^*_1) < 0$, by the concave counterpart of Theorem 3, i.e. the relation (3.23), we know that $\bar{u}_1$ is a global maximizer of $P(x)$ on $\mathcal{U}_k$ (see Fig. 5.13).

[Figure 5.13. Graphs of the primal function $P(u)$ and its canonical dual for concave $\bar{W}$.]
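Example 4 can be checked in the same way as the quadratic case. In the sketch below, equation (4.9) is multiplied by $4 \epsilon^{*2} (a + \epsilon^*)^2$, giving the quartic $(4 \lambda \epsilon^{*2} - \alpha^2)(a + \epsilon^*)^2 - 2 f^2 \epsilon^{*2} = 0$; this rearrangement, the scan interval and the helper names are assumptions made here for illustration. Only the negative roots are kept, as required by $\mathcal{E}_k^*$, and the claim that $\bar{u}_1$ is a global maximizer is compared against a plain grid search over $\mathcal{U}_k = [-2, 2]$.

```python
import math

a, f, alpha, lam = 1.3, 0.62, 1.0, 2.0        # parameters of Example 4

def P(x):
    """Primal function (4.8) for n = 1, defined for x^2 / 2 <= lam."""
    return (0.5 * a * x * x
            + alpha * math.sqrt(max(0.0, lam - 0.5 * x * x)) - f * x)

def P_dual(e):
    """Canonical dual function of Section 4.2, defined for e < 0, e != -a."""
    return -0.5 * f * f / (a + e) - alpha ** 2 / (4.0 * e) - lam * e

def g(e):
    """Equation (4.9) multiplied by 4 e^2 (a + e)^2: a quartic without poles."""
    return (4.0 * lam * e * e - alpha ** 2) * (a + e) ** 2 - 2.0 * f * f * e * e

def bisect(func, lo, hi, iters=200):
    """Plain bisection on a bracket [lo, hi] that contains a sign change."""
    flo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def real_roots(func, lo, hi, steps=6000):
    """Scan [lo, hi] for sign changes and refine each one by bisection."""
    h = (hi - lo) / steps
    found = []
    for k in range(steps):
        left, right = lo + k * h, lo + (k + 1) * h
        if func(left) * func(right) < 0:
            found.append(bisect(func, left, right))
    return found

all_roots = sorted(real_roots(g, -3.0, 3.0))          # eps_1 < eps_2 < eps_3 < eps_4
feasible = [e for e in all_roots if e < 0]            # dual feasible roots only
points = [f / (a + e) for e in feasible]              # u_i = f / (a + eps_i)

# Brute-force reference: maximum of P on a grid over the ball |x| <= sqrt(2 lam).
radius = math.sqrt(2.0 * lam)
grid_max = max(P(-radius + 2.0 * radius * k / 4000) for k in range(4001))

for e, x in zip(feasible, points):
    print(f"eps* = {e: .6f}  u = {x: .6f}  P = {P(x): .6f}  P^d = {P_dual(e): .6f}")
print("grid maximum of P on U_k:", grid_max)
```

The positive root is discarded because the duality relation $\epsilon^* = D\bar{W}(\epsilon)$ only produces negative values for this concave $\bar{W}$, so a positive root of the cleared polynomial does not correspond to a canonical dual pair.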

Example 5. In two-dimensional space, we let $u = (x_1, x_2) = (r\cos t, r\sin t)$; the parametric surface of the nonconvex function $P(u)$ is then shown in Fig. 5.14 (a). If we choose $a_{11} = 1.3$, $a_{12} = a_{21} = 0$, $a_{22} = -0.4$ and $f = (0.62, -0.2)$, $\alpha = 1$, $\lambda = 2$, the canonical dual function $P^d$ has four critical points $\{\bar\epsilon_i^*\} = \{-1.61809, -0.965843, -0.378999, 0.536445\}$ (see Fig. 5.14 (b)), which lead to four solutions of the primal problem.
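For a diagonal matrix $A$ the critical points of $P^d$ can again be located as polynomial roots. The sketch below is an illustration under stated assumptions rather than the author's procedure: it assumes NumPy, a diagonal $A = \mathrm{diag}(a_i)$, and the dual algebraic equation written as $\sum_i f_i^2/(2(a_i+\epsilon^*)^2) + \alpha^2/(4(\epsilon^*)^2) = \lambda$. The function name `dual_critical_points` is hypothetical.

```python
import numpy as np

def dual_critical_points(a, f, alpha, lam):
    """Real roots e of sum_i f_i^2/(2 (a_i+e)^2) + alpha^2/(4 e^2) = lam,
    for a diagonal matrix A = diag(a)."""
    a = [float(x) for x in a]
    f = [float(x) for x in f]
    squares = [np.polymul([1.0, ai], [1.0, ai]) for ai in a]  # (e + a_i)^2
    e2 = np.array([1.0, 0.0, 0.0])                             # e^2
    full = np.array([1.0])
    for s in squares:
        full = np.polymul(full, s)                             # prod_j (e + a_j)^2
    # Multiply the equation by 4 e^2 prod_j (e + a_j)^2:
    # alpha^2 * prod - 4*lam*e^2*prod + 2 e^2 sum_i f_i^2 prod_{j != i} = 0
    poly = np.polysub(alpha**2 * full, 4.0 * lam * np.polymul(e2, full))
    for i, fi in enumerate(f):
        others = np.array([1.0])
        for j, s in enumerate(squares):
            if j != i:
                others = np.polymul(others, s)
        poly = np.polyadd(poly, 2.0 * fi**2 * np.polymul(e2, others))
    roots = np.roots(poly)
    return np.sort(roots[np.abs(roots.imag) < 1e-8].real)

# Data of Example 5
pts = dual_critical_points([1.3, -0.4], [0.62, -0.2], 1.0, 2.0)
print(pts)
```

With the data of Example 5 the polynomial has degree six; only its real roots are reported.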

Figure 5.14. Graphs of the primal function $P(x_1, x_2)$ and its canonical dual. (a) Parametric surface of $P(u)$. (b) Graph of $P^d(\epsilon^*)$.

5. Application to Constrained Quadratic Programming

We now turn our attention to constrained global optimization problems of the form

$$(\mathcal{P}_\lambda):\quad \min \left\{ \frac{1}{2}\langle u, Au\rangle - \langle u, f\rangle \ \Big|\ u \in \mathcal{U}_\lambda \right\}, \tag{5.1}$$

where the feasible space $\mathcal{U}_\lambda$ is defined as

$$\mathcal{U}_\lambda = \{u \in \mathbb{R}^n \mid Bu \le b \in \mathbb{R}^m,\ \tfrac{1}{2}|u|^2 \le \lambda\}, \tag{5.2}$$

in which $B \in \mathbb{R}^{m \times n}$ is a given matrix, $b \in \mathbb{R}^m$ is a given vector, and $\lambda > 0$ is a constant. Physically speaking, if the global minimizer of a real problem exists, its norm $|u|$ must be finite. Thus, the quadratic inequality $\frac{1}{2}|u|^2 \le \lambda$ is indeed a constraint for any real global optimization problem. In structural limit analysis, for example, the matrix $B$ is an equilibrium operator, while the normality inequality represents the so-called plastic yield condition (see Gao, 1988, 2000a). In mathematical programming, the problem $(\mathcal{P}_\lambda)$ can also be considered as a normalized problem, or the parametrization of the standard quadratic programming problem $(\mathcal{P}_b)$ proposed in (1.7) (see Gao, 1998, 2004). The problem $(\mathcal{P}_\lambda)$ is nonconvex if the matrix $A \in \mathbb{R}^{n \times n}$ is indefinite. It is known that this nonconvex quadratic programming problem is very difficult to solve by the traditional direct approaches. However, by the canonical dual transformation method, a complete set of solutions can be obtained.

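5.1 Canonical dual formulation

To set the constrained global optimization problem $(\mathcal{P}_\lambda)$ (5.1) in our framework, we let the geometrical operator $\Lambda : \mathbb{R}^n \to \mathbb{R}^m \times \mathbb{R}$ be a vector-valued mapping:

$$\xi = \Lambda(u) = \left(Bu - b,\ \tfrac{1}{2}|u|^2 - \lambda\right) = (\epsilon, \rho),$$

where $\epsilon = Bu - b$ and $\rho = \frac{1}{2}|u|^2 - \lambda$. Let $\mathcal{E} = \mathbb{R}^m \times \mathbb{R} = \mathcal{E}^*$, and let

$$\mathbb{R}^m_- = \{\epsilon \in \mathbb{R}^m \mid \epsilon \le 0 \in \mathbb{R}^m\}, \qquad \mathbb{R}_- = \{\rho \in \mathbb{R} \mid \rho \le 0\}$$

be the negative cones in $\mathbb{R}^m$ and $\mathbb{R}$, respectively. Then, the canonical function $\bar W : \mathcal{E} \to \mathbb{R} \cup \{+\infty\}$ can be defined as the indicator of the convex sets $\mathbb{R}^m_-$ and $\mathbb{R}_-$:

$$\bar W(\xi) = \begin{cases} 0 & \text{if } \epsilon \in \mathbb{R}^m_-,\ \rho \in \mathbb{R}_-, \\ +\infty & \text{otherwise,} \end{cases}$$

which is convex, lower semi-continuous on $\mathcal{E}$. Its effective domain is

$$\mathcal{E}_a = \mathrm{dom}\, \bar W = \{\xi = (\epsilon, \rho) \in \mathbb{R}^m \times \mathbb{R} \mid \epsilon \in \mathbb{R}^m_-,\ \rho \in \mathbb{R}_-\},$$

and the feasible space of the primal problem is $\mathcal{U}_\lambda = \{u \in \mathbb{R}^n \mid \Lambda(u) \in \mathcal{E}_a\}$. On the space $\mathcal{U} = \mathbb{R}^n$, the constrained primal problem (5.1) can be written in the unconstrained canonical form: to find a global minimizer $\bar u$ such that

$$(\mathcal{P}_\lambda):\quad P(\bar u) = \min \left\{ \frac{1}{2}\langle u, Au\rangle + \bar W(\Lambda(u)) - \langle u, f\rangle \ \Big|\ u \in \mathcal{U} \right\}. \tag{5.3}$$

By the fact that the canonical function $\bar W(\xi)$ is convex, lower semi-continuous on $\mathcal{E}$, the canonical dual variable $\xi^* \in \mathcal{E}^*$ is defined by the sub-differential inclusion:

$$\xi^* \in \partial^- \bar W(\xi) = \begin{cases} (\epsilon^*, \rho^*) & \text{if } \epsilon^* \in \mathbb{R}^m_+,\ \rho^* \in \mathbb{R}_+, \\ \emptyset & \text{otherwise,} \end{cases}$$

where $\mathbb{R}^m_+ = \{\epsilon^* \in \mathbb{R}^m \mid \epsilon^* \ge 0\}$ and $\mathbb{R}_+ = \{\rho^* \in \mathbb{R} \mid \rho^* \ge 0\}$ are the dual cones of $\mathbb{R}^m_-$ and $\mathbb{R}_-$, respectively. The canonical conjugate $\bar W^\sharp$ of $\bar W$ can be obtained by the Fenchel transformation:

$$\bar W^\sharp(\xi^*) = \sup_{\xi \in \mathcal{E}} \{\langle \xi; \xi^*\rangle - \bar W(\xi)\} = \begin{cases} 0 & \text{if } \epsilon^* \in \mathbb{R}^m_+,\ \rho^* \in \mathbb{R}_+, \\ +\infty & \text{otherwise.} \end{cases}$$

Its effective domain is

$$\mathcal{E}_a^* = \mathrm{dom}\, \bar W^\sharp = \{\xi^* = (\epsilon^*, \rho^*) \in \mathbb{R}^m \times \mathbb{R} \mid \epsilon^* \in \mathbb{R}^m_+,\ \rho^* \in \mathbb{R}_+\}.$$

Since the Fenchel sup-duality relations

$$\xi^* \in \partial^- \bar W(\xi) \iff \xi \in \partial^- \bar W^\sharp(\xi^*) \iff \bar W(\xi) + \bar W^\sharp(\xi^*) = \langle \xi; \xi^*\rangle \tag{5.4}$$

hold on $\mathcal{E} \times \mathcal{E}^*$, we know that $(\xi, \xi^*)$ is a canonical dual pair on $\mathcal{E} \times \mathcal{E}^*$. On $\mathcal{E}_a \times \mathcal{E}_a^*$, the Fenchel sup-duality relations (5.4) are equivalent to the following KKT conditions:

$$\mathcal{E}_a \ni \xi \perp \xi^* \in \mathcal{E}_a^*.$$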

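The dual feasible space in this problem has the form

$$\mathcal{E}_\lambda^* = \{\xi^* = (\epsilon^*, \rho^*) \in \mathcal{E}_a^* \mid \det(A + \rho^* I) \ne 0\}.$$

For a fixed $\xi^* \in \mathcal{E}_\lambda^*$, $\bar F^\Lambda(\xi^*)$ can be well defined by the $\Lambda$-canonical dual transformation

$$\begin{aligned} \bar F^\Lambda(\xi^*) &= \{\langle \Lambda(u); \xi^*\rangle - \bar F(u) \mid D\bar F(u) = \Lambda_t^*(u)\,\xi^*,\ u \in \mathcal{U}_a\} \\ &= -\frac{1}{2}(f - B^T\epsilon^*)^T (A + \rho^* I)^{-1} (f - B^T\epsilon^*) - \lambda\rho^* - b^T\epsilon^*. \end{aligned}$$

Since $\bar W^\sharp(\xi^*) = 0$ $\forall \xi^* \in \mathcal{E}_\lambda^*$, the canonical dual function $P^d : \mathcal{E}_\lambda^* \to \mathbb{R}$ for this constrained problem can be obtained in the form

$$P^d(\epsilon^*, \rho^*) = -\frac{1}{2}(f - B^T\epsilon^*)^T (A + \rho^* I)^{-1} (f - B^T\epsilon^*) - \lambda\rho^* - b^T\epsilon^*. \tag{5.5}$$

Thus, the canonical dual problem ($(\mathcal{P}_\lambda^d)$ in short) associated with the parametric problem $(\mathcal{P}_\lambda)$ can be formulated as follows:

$$(\mathcal{P}_\lambda^d):\quad \max P^d(\epsilon^*, \rho^*) \quad \text{s.t. } \epsilon^* \ge 0,\ \rho^* \ge 0,\ \det(A + \rho^* I) \ne 0. \tag{5.6}$$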

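Theorem 4 (Gao, 2003). Problem $(\mathcal{P}_\lambda^d)$ is canonically (perfectly) dual to the primal parametric optimization problem $(\mathcal{P}_\lambda)$ in the sense that if $\bar\xi^* = (\bar\epsilon^*, \bar\rho^*) \in \mathcal{E}_\lambda^*$ is a KKT point of $(\mathcal{P}_\lambda^d)$, then the vector defined by

$$\bar u = (A + \bar\rho^* I)^{-1}(f - B^T\bar\epsilon^*) \tag{5.7}$$

is a KKT point of $(\mathcal{P}_\lambda)$, and

$$P(\bar u) = P^d(\bar\xi^*). \tag{5.8}$$

Proof. Suppose that $\bar\xi^* = (\bar\epsilon^*, \bar\rho^*) \in \mathcal{E}_\lambda^*$ is a KKT point of $(\mathcal{P}_\lambda^d)$. Then we have

$$0 \le \bar\rho^* \perp \frac{1}{2}(f - B^T\bar\epsilon^*)^T (A + \bar\rho^* I)^{-2} (f - B^T\bar\epsilon^*) - \lambda \le 0, \tag{5.9}$$

$$0 \le \bar\epsilon^* \perp B(A + \bar\rho^* I)^{-1}(f - B^T\bar\epsilon^*) - b \le 0. \tag{5.10}$$

In terms of $\bar u = (A + \bar\rho^* I)^{-1}(f - B^T\bar\epsilon^*)$, we have

$$0 \le \bar\rho^* \perp \frac{1}{2}|\bar u|^2 - \lambda \le 0, \tag{5.11}$$

$$0 \le \bar\epsilon^* \perp B\bar u - b \le 0. \tag{5.12}$$

Together with the stationarity condition $A\bar u + \bar\rho^*\bar u + B^T\bar\epsilon^* = f$, which is just (5.7), this shows that $\bar u$ is a KKT point of the problem $(\mathcal{P}_\lambda)$. By the complementarity conditions (5.11) and (5.12), we have $\bar\rho^*\lambda = \frac{1}{2}\bar\rho^*\bar u^T\bar u$ and $b^T\bar\epsilon^* = (B\bar u)^T\bar\epsilon^*$. Thus, in terms of $\bar u = (A + \bar\rho^* I)^{-1}(f - B^T\bar\epsilon^*)$, we have

$$\begin{aligned} P^d(\bar\xi^*) &= -\frac{1}{2}\bar u^T (A + \bar\rho^* I)\bar u - \frac{1}{2}\bar\rho^*\bar u^T\bar u - (B\bar u)^T\bar\epsilon^* \\ &= \frac{1}{2}\bar u^T A\bar u - \bar u^T f = P(\bar u), \end{aligned}$$

which shows that there is no duality gap between the problems $(\mathcal{P}_\lambda)$ and $(\mathcal{P}_\lambda^d)$. This proves the theorem. $\Box$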

5.2 KKT points and global minimizers

Theorem 4 shows that the primal problem $(\mathcal{P}_\lambda)$ is equivalent to the canonical dual problem $(\mathcal{P}_\lambda^d)$. The following theorem shows that the number of KKT points of $(\mathcal{P}_\lambda)$ depends on the number of negative eigenvalues of the matrix $A$.

Theorem 5 (KKT points of the problem $(\mathcal{P}_\lambda)$). Suppose that the symmetric matrix $A$ has $p \le n$ distinct eigenvalues and $i_d$ of them are negative, i.e.

$$a_1 < a_2 < \cdots < a_{i_d} < 0 \le a_{i_d+1} < \cdots < a_p.$$

Then for a given sufficiently large parameter $\lambda > 0$, the parametric problem $(\mathcal{P}_\lambda)$ has at most $2i_d + 1$ KKT points $\{\bar u_i\}$, $i = 1, \ldots, 2i_d + 1$, on the sphere $\frac{1}{2}|u|^2 = \lambda$.

Proof. Since $A = A^T$, there exists an orthogonal matrix $R^T = R^{-1}$ such that $A = R^T D R$, where $D = (a_i\delta_{ij})$ is a diagonal matrix. For any given vector $\epsilon^* \in \mathbb{R}^m$, let $g = R(f - B^T\epsilon^*) = (g_i)$ and

$$\psi(\rho^*) = \frac{1}{2}(f - B^T\epsilon^*)^T (A + \rho^* I)^{-2} (f - B^T\epsilon^*) = \frac{1}{2}\sum_{i=1}^{p} g_i^2 (a_i + \rho^*)^{-2}. \tag{5.13}$$

Clearly, this real-valued function $\psi(\rho^*)$ is strictly convex within each interval $-a_{i+1} < \rho^* < -a_i$, as well as on the intervals $-\infty < \rho^* < -a_p$ and $-a_1 < \rho^* < \infty$ (see Fig. 5.15). Thus, for a given sufficiently large parameter $\lambda > 0$, the equation $\psi(\rho^*) = \lambda$ has at least $2p$ solutions $\{\bar\rho_i^*\}$ satisfying $-a_{j+1} < \bar\rho_{2j+1}^* < \bar\rho_{2j}^* < -a_j$ for $j = 1, \ldots, p-1$, and $\bar\rho_1^* > -a_1$, $\bar\rho_{2p}^* < -a_p$. Since $A$ has only $i_d$ negative eigenvalues, the equality $\psi(\rho^*) = \lambda$ has at most $2i_d + 1$ positive roots $\bar\rho_i^* > 0$, $i = 1, \ldots, 2i_d + 1$ (if $a_{i_d+1} > 0$, the equality $\psi(\rho^*) = \lambda$ may have at most $2i_d$ positive roots). The complementarity condition $\bar\rho_i^*(\frac{1}{2}|\bar u_i|^2 - \lambda) = 0$ tells us that the KKT points $\bar u_i$ ($i = 1, \ldots, 2i_d + 1$) of the problem $(\mathcal{P}_\lambda)$ should be on the sphere $\frac{1}{2}|u|^2 = \lambda$. $\Box$
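The function $\psi(\rho^*) = \frac{1}{2}\sum_i g_i^2 (a_i + \rho^*)^{-2}$ used in the proof is cheap to evaluate once $A$ has been diagonalized. The sketch below is an illustration only, assuming NumPy; the function name `psi` and its argument names are hypothetical. It sums over all $n$ eigenvalues counted with multiplicity, which gives the same value as grouping equal eigenvalues into $p$ distinct ones.

```python
import numpy as np

def psi(rho, A, f, B=None, eps=None):
    """psi(rho) = 0.5 * sum_i g_i^2 / (a_i + rho)^2 with g = R (f - B^T eps)."""
    A = np.asarray(A, dtype=float)
    rhs = np.asarray(f, dtype=float)
    if B is not None:
        rhs = rhs - np.asarray(B, dtype=float).T @ np.asarray(eps, dtype=float)
    a, V = np.linalg.eigh(A)   # A = V diag(a) V^T, so R = V^T in the proof
    g = V.T @ rhs
    return 0.5 * np.sum(g**2 / (a + rho) ** 2)

# Compare with the direct formula 0.5 * f^T (A + rho I)^{-2} f on a small
# symmetric indefinite matrix (illustrative data, not from the text).
A = np.array([[2.0, 1.0], [1.0, -1.0]])
f = np.array([1.0, 2.0])
rho = 0.7
M = np.linalg.inv(A + rho * np.eye(2))
direct = 0.5 * f @ M @ M @ f
print(psi(rho, A, f), direct)
```

Because each term has a pole at $\rho^* = -a_i$, a root search for $\psi(\rho^*) = \lambda$ is best carried out separately on each interval between consecutive poles.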

Figure 5.15. Graph of $\psi(\rho^*)$.

Theorem 5 tells us that for a given matrix $A$ and parameter $\lambda > 0$, the equality $\psi(\rho^*) = \lambda$ has at most one solution $\bar\rho_1^* > -a_1$ such that the matrix $(A + \bar\rho_1^* I)$ is positive definite, and a possible solution $\bar\rho_k^* < -a_{i_d}$ such that $(A + \bar\rho_k^* I)$ is negative definite. Let

$$R(A^+) := \{\rho^* > 0 \mid (A + \rho^* I) \text{ is positive definite}\}, \tag{5.14}$$

$$R(A^-) := \{\rho^* \ge 0 \mid (A + \rho^* I) \text{ is negative definite}\}. \tag{5.15}$$

Thus, by Theorem 3, we have the following theorem.

Theorem 6. Suppose that $A$ has at least one negative eigenvalue, and for a given parameter $\lambda > 0$, the vector $(\bar u_i, \bar\epsilon_i^*, \bar\rho_i^*)$ is a KKT point of the problem $(\mathcal{P}_\lambda)$. If $\bar\rho_i^* \in R(A^+)$, then $\bar u_i$ is a global minimizer of $P(u)$ on $\mathcal{U}_{\lambda e} := \{u \in \mathcal{U}_\lambda \mid \frac{1}{2}|u|^2 = \lambda\}$ if and only if $(\bar\epsilon_i^*, \bar\rho_i^*)$ is a global maximizer of $P^d$ on $\mathbb{R}^m_+ \times R(A^+)$, i.e.

$$P(\bar u_i) = \min_{u \in \mathcal{U}_{\lambda e}} P(u) = \max_{(\epsilon^*, \rho^*) \in \mathbb{R}^m_+ \times R(A^+)} P^d(\epsilon^*, \rho^*) = P^d(\bar\epsilon_i^*, \bar\rho_i^*). \tag{5.16}$$

However, if $\bar\rho_i^* \in R(A^-)$, then $\bar u_i$ is a global maximizer of $P$ on $\mathcal{U}_\lambda$ if and only if $(\bar\epsilon_i^*, \bar\rho_i^*)$ is a global maximizer of $P^d$ on $\mathbb{R}^m_+ \times R(A^-)$, i.e.

$$P(\bar u_i) = \max_{u \in \mathcal{U}_\lambda} P(u) = \max_{(\epsilon^*, \rho^*) \in \mathbb{R}^m_+ \times R(A^-)} P^d(\epsilon^*, \rho^*) = P^d(\bar\epsilon_i^*, \bar\rho_i^*). \tag{5.17}$$

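Proof. By Theorems 4 and 5 we know that the vector $\bar\xi_i^* = (\bar\epsilon_i^*, \bar\rho_i^*) \in \mathcal{E}_\lambda^*$ is a KKT point of the problem $(\mathcal{P}_\lambda^d)$ if and only if $\bar u_i = (A + \bar\rho_i^* I)^{-1}(f - B^T\bar\epsilon_i^*)$ is a KKT point of the problem $(\mathcal{P}_\lambda)$. In particular, if $\bar\rho_i^* \in R(A^+)$, the matrix $(A + \bar\rho_i^* I)$ is positive definite and the canonical dual function $P^d(\epsilon^*, \rho^*)$ is concave in each of its components $\epsilon^*$ and $\rho^*$, respectively. In this case the extended Lagrangian

$$\Xi(u, \epsilon^*, \rho^*) = \frac{1}{2}u^T (A + \rho^* I)u - \bar W^\sharp(\epsilon^*, \rho^*) + (Bu - b)^T\epsilon^* - \lambda\rho^* - u^T f \tag{5.18}$$

is convex in $u \in \mathbb{R}^n$ and concave in both $\epsilon^* \in \mathbb{R}^m$ and $\rho^* > -a_1 > 0$. Thus, we have

$$\begin{aligned} P^d(\bar\epsilon_i^*, \bar\rho_i^*) &= \max_{(\epsilon^*, \rho^*) \in \mathbb{R}^m_+ \times R(A^+)} P^d(\epsilon^*, \rho^*) \\ &= \max_{\rho^* \in R(A^+)} \max_{\epsilon^* \ge 0} \min_{u \in \mathbb{R}^n} \Xi(u, \epsilon^*, \rho^*) \\ &= \max_{\rho^* > -a_1} \min_{u \in \mathbb{R}^n} \max_{\epsilon^* \ge 0} \Xi(u, \epsilon^*, \rho^*) \\ &= \max_{\rho^* > -a_1} \min_{u \in \mathbb{R}^n} \left\{ \frac{1}{2}u^T (A + \rho^* I)u - \lambda\rho^* - u^T f \right\} \quad \text{s.t. } Bu \le b \\ &= \min_{u \in \mathcal{U}_k} \max_{\rho^* > -a_1} \left\{ \frac{1}{2}u^T A u + \rho^*\left(\frac{1}{2}u^T u - \lambda\right) - u^T f \right\} \\ &= \min_{u \in \mathcal{U}_k} P(u) \quad \text{s.t. } \frac{1}{2}u^T u = \lambda, \end{aligned}$$

where $\mathcal{U}_k = \{u \in \mathbb{R}^n \mid Bu \le b\}$. By the fact that $\bar\rho_i^* \in R(A^+)$ if and only if $\bar\rho_i^* > -a_1 > 0$, the KKT condition (5.11) leads to $\frac{1}{2}|\bar u_i|^2 = \lambda$, and the vector $\bar u_i$ minimizes $P$ on $\mathcal{U}_{\lambda e}$.

On the other hand, if $\bar\rho_i^* \in R(A^-)$, then the extended Lagrangian $\Xi(u, \epsilon^*, \rho^*)$ is concave in $u \in \mathbb{R}^n$ and concave in both $\epsilon^* \in \mathbb{R}^m$ and $\rho^* \in R(A^-)$. Thus, by the so-called super-Lagrange duality theory (see Gao, 2000a), we have

$$\begin{aligned} P^d(\bar\epsilon_i^*, \bar\rho_i^*) &= \max_{\rho^* \in R(A^-)} \max_{\epsilon^* \ge 0} P^d(\epsilon^*, \rho^*) = \max_{\rho^* \in R(A^-)} \max_{\epsilon^* \ge 0} \max_{u \in \mathbb{R}^n} \Xi(u, \epsilon^*, \rho^*) \\ &= \max_{\rho^* \in R(A^-)} \max_{u \in \mathbb{R}^n} \max_{\epsilon^* \ge 0} \Xi(u, \epsilon^*, \rho^*) \\ &= \max_{\rho^* \in R(A^-)} \max_{u \in \mathcal{U}_k} \left\{ \frac{1}{2}u^T (A + \rho^* I)u - u^T f - \lambda\rho^* \right\} \\ &= \max_{u \in \mathcal{U}_k} \max_{\rho^* \in R(A^-)} \left\{ \frac{1}{2}u^T A u - u^T f + \rho^*\left(\frac{1}{2}u^T u - \lambda\right) \right\} \\ &= \max_{u \in \mathcal{U}_\lambda} P(u) \quad \text{if either } \frac{1}{2}u^T u = \lambda \text{ or } \rho^* = 0. \end{aligned}$$

By the fact that $P^d(\bar\epsilon_i^*, \bar\rho_i^*) = P(\bar u_i)$ for all KKT points of $(\mathcal{P}_\lambda)$, we know that $\bar u_i$ maximizes $P$ on $\mathcal{U}_\lambda$. $\Box$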

Remark. Theorem 6 shows that if the dual problem $(\mathcal{P}_\lambda^d)$ has a KKT point $\bar\rho_i^* > -a_1 > 0$, then the associated vector $\bar u_i$ is a minimizer of $P$ on the subset $\mathcal{U}_{\lambda e} \subset \mathcal{U}_\lambda$. However, if $\bar\rho_i^* \in R(A^-)$, i.e. $0 \le \bar\rho_i^* < -a_{i_d}$, then $\Xi(u, \epsilon^*, \rho^*)$ is a super-Lagrangian. By the triality theory developed in Gao (2000a), in addition to the double-max duality relation (5.17), the double-min duality relation

$$P(\bar u) = \min_{u \in \mathcal{U}_\lambda} P(u) = \min_{(\epsilon^*, \rho^*) \in \mathbb{R}^m_+ \times R(A^-)} P^d(\epsilon^*, \rho^*) = P^d(\bar\epsilon^*, \bar\rho^*) \tag{5.19}$$

also holds under certain conditions, and this might lead to a global minimizer of $P$ on $\mathcal{U}_\lambda$. Thus, the triality theory can be used to develop certain powerful primal-dual algorithms for solving the constrained quadratic programming problem $(\mathcal{P}_\lambda)$.

5.3 Examples

Example 6 (One-D Concave Minimization). First of all, let us consider the one-dimensional concave minimization problem

$$\min P(x) = \frac{1}{2}ax^2 - fx, \quad \text{s.t. } |x| \le r. \tag{5.20}$$

Clearly, if $a < 0$, the global minimizer of $P(x)$ has to be one of the boundary points $\bar u = \pm r$. In this case, $\lambda = \frac{1}{2}r^2$. The canonical dual problem is

$$\max P^d(\rho^*) = -\frac{1}{2}f^2/(a + \rho^*) - \lambda\rho^*, \quad \text{s.t. } (a + \rho^*) > 0. \tag{5.21}$$

Since $n = 1$, the dual algebraic equation $\frac{1}{2}f^2/(a + \rho^*)^2 - \lambda = 0$ has only two roots: $\bar\rho_1^* > -a$ is the unique maximizer of $P^d$, and $\bar\rho_2^* < -a$ is a local minimizer. If we choose $f = 0.4$, $a = -0.6$ and $r = 1.5$, the global maximizer is $\bar\rho_1^* = 0.866667$, which gives the global minimizer $\bar u_1 = f/(a + \bar\rho_1^*) = 1.5$. It is easy to check that $P(\bar u_1) = -1.275 = P^d(\bar\rho_1^*)$. The local minimizer is $\bar\rho_2^* = 0.3333$, which gives the local minimizer $\bar u_2 = -1.5$. For $\bar\rho_2^* < -a$, the extended Lagrangian (5.18) is a so-called super-Lagrangian (cf. Gao, 2000a). In this case, the double-min duality theory leads to

$$P(\bar u_2) = -0.075 = P^d(\bar\rho_2^*).$$

It is interesting to note that for $\bar\rho_3^* = 0$, the point $\bar u_3 = f/(a + \bar\rho_3^*) = -0.666667$ is a global maximizer of $P(x)$ and we have

$$P(\bar u_3) = P^d(\bar\rho_3^*) = 0.13333.$$
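The numbers in Example 6 follow from a closed-form solution of the dual algebraic equation, since $\frac{1}{2}f^2/(a+\rho^*)^2 = \lambda$ gives $a + \rho^* = \pm|f|/\sqrt{2\lambda}$. The short sketch below, which is not part of the original text and uses hypothetical helper names `P` and `P_dual`, recomputes the three dual points, the associated primal points, and the primal and dual values.

```python
import math

def P(x, a, f):
    """Primal objective 0.5*a*x^2 - f*x."""
    return 0.5 * a * x * x - f * x

def P_dual(rho, a, f, lam):
    """Canonical dual objective -0.5*f^2/(a + rho) - lam*rho."""
    return -0.5 * f * f / (a + rho) - lam * rho

a, f, r = -0.6, 0.4, 1.5           # data of Example 6
lam = 0.5 * r * r

# 0.5 f^2 / (a + rho)^2 = lam  =>  a + rho = +/- |f| / sqrt(2 lam)
step = abs(f) / math.sqrt(2.0 * lam)
rho1, rho2, rho3 = -a + step, -a - step, 0.0
u1, u2, u3 = (f / (a + rho) for rho in (rho1, rho2, rho3))

for rho, u in ((rho1, u1), (rho2, u2), (rho3, u3)):
    print(f"rho* = {rho:.6f}, u = {u:.6f}, "
          f"P(u) = {P(u, a, f):.6f}, P^d(rho*) = {P_dual(rho, a, f, lam):.6f}")
```

In each of the three cases the primal and dual values coincide, which is the zero duality gap stated in (5.8).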

Figure 5.16. Graphs of $P(x)$ and $P^d(\rho^*)$ for the one-dimensional problem.

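The graphs of $P(x)$ and $P^d(\rho^*)$ are shown in Fig. 5.16.

Example 7 (Two-D Concave Minimization within Convex Set). We now consider the following quadratic programming problem within a convex set:

$$\begin{aligned} \min\ & P(x_1, x_2) = \frac{1}{2}(a_1 x_1^2 + a_2 x_2^2) - f_1 x_1 - f_2 x_2 \\ \text{s.t. } & \frac{1}{2}x_1 + x_2 \le 1, \quad x_2 \ge 0, \quad \frac{1}{2}(x_1^2 + x_2^2) \le 2. \end{aligned} \tag{5.22}$$

In this case, the radius of the feasible set

$$\mathcal{U}_\lambda = \{(x_1, x_2) \in \mathbb{R}^2 \mid Bu \le b,\ \tfrac{1}{2}(x_1^2 + x_2^2) \le 2\}$$

is $r_0 = 2$, and

$$B = \begin{pmatrix} \frac{1}{2} & 1 \\ 0 & -1 \end{pmatrix}$$

is a $2 \times 2$ matrix, $b = (1, 0)^T$ is a 2-vector. If both $a_1, a_2 \le 0$, $P$ is concave and its global minima must be located on the boundary of $\mathcal{U}_\lambda$ (see Fig. 5.17). The canonical dual problem in this case is to find $(\epsilon^*, \rho^*) \in \mathbb{R}^2 \times \mathbb{R}$ such that

$$\begin{aligned} \max\ & P^d(\epsilon_1^*, \epsilon_2^*, \rho^*) = -\frac{1}{2}\left[ \frac{(f_1 - \frac{1}{2}\epsilon_1^*)^2}{a_1 + \rho^*} + \frac{(f_2 - \epsilon_1^* + \epsilon_2^*)^2}{a_2 + \rho^*} \right] - \lambda\rho^* - \epsilon_1^* \\ \text{s.t. } & \epsilon_1^* \ge 0, \quad \epsilon_2^* \ge 0, \quad \rho^* \ge -\min\{a_1, a_2\}. \end{aligned} \tag{5.23}$$

If we let $f = (0.3, 0.3)$, $a_1 = -0.5$, $a_2 = -0.3$, $\lambda = \frac{1}{2}r_0^2 = 2$, then this dual problem has a unique solution:

$$\bar\rho^* = 0.548375, \quad \bar\epsilon_1^* = 0.406502, \quad \bar\epsilon_2^* = 0.106507.$$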

This leads to the global minimizer $\bar u_1 = 2.0$, $\bar u_2 = 0$. It is easy to verify that $P(\bar u_1, \bar u_2) = -1.6 = P^d(\bar\epsilon_1^*, \bar\epsilon_2^*, \bar\rho^*)$.
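The dual solution quoted in Example 7 can be checked against formula (5.7) and the zero-gap identity (5.8). The sketch below is an illustration assuming NumPy; it does not solve the dual problem, it only plugs the quoted values $(\bar\epsilon^*, \bar\rho^*)$ into $\bar u = (A + \bar\rho^* I)^{-1}(f - B^T\bar\epsilon^*)$ and compares $P(\bar u)$ with $P^d$. The function names `primal` and `dual` are hypothetical.

```python
import numpy as np

# Data of Example 7
A = np.diag([-0.5, -0.3])
f = np.array([0.3, 0.3])
B = np.array([[0.5, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0])
lam = 2.0

def primal(u):
    """P(u) = 0.5 u^T A u - f^T u."""
    return 0.5 * u @ A @ u - f @ u

def dual(eps, rho):
    """P^d(eps, rho) as in (5.5)."""
    r = f - B.T @ eps
    return -0.5 * r @ np.linalg.solve(A + rho * np.eye(2), r) - lam * rho - b @ eps

# Dual solution quoted in the text (rounded to six digits)
eps_bar = np.array([0.406502, 0.106507])
rho_bar = 0.548375

# Primal point recovered from the dual solution, formula (5.7)
u_bar = np.linalg.solve(A + rho_bar * np.eye(2), f - B.T @ eps_bar)

print("u_bar =", u_bar)
print("P(u_bar) =", primal(u_bar), " P^d =", dual(eps_bar, rho_bar))
print("B u - b =", B @ u_bar - b, " 0.5|u|^2 - lam =", 0.5 * u_bar @ u_bar - lam)
```

Because the quoted dual values are rounded, the recovered point satisfies the constraints only up to a small tolerance.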

Figure 5.17. Graphs of $P(x_1, x_2)$ and $P^d(\bar\epsilon_1^*, \bar\epsilon_2^*, \rho^*)$.

6. Quadratic Programming Over a Sphere

As a particular application of the quadratic parametrical programming, let us consider the following quadratic programming problem with only a quadratic constraint over a sphere:

$$(\mathcal{P}_q):\quad \min\ \frac{1}{2}u^T A u - f^T u \quad \text{s.t. } \frac{1}{2}u^T u \le \lambda. \tag{6.1}$$

This problem often comes up as a subproblem in general optimization algorithms (cf. Powell, 2002). In model trust region methods, the objective function in nonlinear programming is approximated locally by a quadratic function, and the approximation is restricted to a small region around the current iterate. If the 2-norm is used to define this region, these methods end up with the quadratic programming problem over a sphere $(\mathcal{P}_q)$. As indicated by Floudas and Visweswaran (1995), due to the presence of the nonlinear sphere constraint, the solution of $(\mathcal{P}_q)$ is likely to be irrational, which implies that it is not possible to compute the solution exactly. Therefore, many polynomial time algorithms have been suggested to compute an approximate solution to this problem (see Sorensen, 1982; Karmarkar, 1990; and Ye, 1992). However, by the canonical dual transformation, this problem can be solved completely. Since there is no linear inequality constraint $Bu \le b$, the canonical dual problem $(\mathcal{P}_\lambda^d)$ in this case is simply a concave maximization in $\mathbb{R}$:

$$(\mathcal{P}_q^d):\quad \max P^d(\rho^*) = -\frac{1}{2}f^T (A + \rho^* I)^{-1} f - \lambda\rho^*, \quad \text{s.t. } \rho^* \ge 0,\ (A + \rho^* I) \text{ is positive definite.} \tag{6.2}$$

This is a concave maximization with only one degree of freedom. The criticality condition of $P^d(\rho^*)$ leads to the following dual algebraic equation

$$\frac{1}{2}f^T (A + \rho^* I)^{-2} f - \lambda = 0, \tag{6.3}$$

which can be solved completely by MATHEMATICA. Thus we have the following result.

Theorem 7 (Complete Solutions to $(\mathcal{P}_q)$). Suppose that the matrix $A = A^T$ has $p \le n$ distinct eigenvalues and $i_d$ of them are negative, i.e.

$$a_1 < a_2 < \cdots < a_{i_d} < 0 \le a_{i_d+1} < \cdots < a_p.$$

Then for a given vector $f \in \mathbb{R}^n$ and $\lambda > 0$, the canonical dual function $P^d(\rho^*)$ has at most $2i_d + 1$ KKT points $\bar\rho_i^*$, $i = 1, \ldots, 2i_d + 1$, satisfying the following distribution law

$$\bar\rho_1^* > -a_1 > \bar\rho_2^* \ge \bar\rho_3^* > -a_2 > \cdots > -a_i > \bar\rho_{2i}^* \ge \bar\rho_{2i+1}^* > -a_{i+1} > \cdots > -a_{i_d} > \bar\rho_{2i_d}^* \ge \bar\rho_{2i_d+1}^*. \tag{6.4}$$

For each $\bar\rho_i^*$, $i = 1, 2, \ldots, 2i_d + 1$, the vector defined by

$$\bar u_i = (A + \bar\rho_i^* I)^{-1} f \tag{6.5}$$

is a KKT point of $P(u)$ and

$$P(\bar u_i) = P^d(\bar\rho_i^*), \quad i = 1, 2, \ldots, 2i_d + 1. \tag{6.6}$$

Moreover, $\bar u_1$ is a global minimizer of the problem $(\mathcal{P}_q)$. If $a_1 < 0$, then $\bar u_1$ is located on the boundary of the sphere, i.e. $|\bar u_1|^2 = 2\lambda$.

Proof. This is a special case of Theorem 5 and Theorem 6. $\Box$



Example 8 (Quadratic Programming over a 4-D Sphere). We simply let $A$ be a diagonal matrix with four nonzero eigenvalues: $\{a_1 = -0.5,\ a_2 = -0.25,\ a_3 = 0.1,\ a_4 = 0.4\}$. If we choose $f = (0.3, 0.4, -0.2, 0.1)$ and $\lambda = 2$, the canonical dual algebraic equation (6.3) has four real roots (see Fig. 5.18):

$$\rho_4^* = -0.455359 < \rho_3^* = -0.339287 < \rho_2^* = -0.219664 < \rho_1^* = 0.672415.$$

Since $\rho_1^* > 0$ and $(A + \rho_1^* I)$ is positive definite, $\rho_1^*$ is a global maximizer of $P^d$, which leads to the global minimizer

$$\bar u_1 = (A + \rho_1^* I)^{-1} f = (1.73999,\ 0.946937,\ -0.258928,\ 0.0932475)$$

on the boundary of the 4-D sphere $|u| \le 2$, i.e. $|\bar u_1| = 2$. This is the reason why the primal problem is very difficult. However, the dual problem is a concave maximization problem and the global maximizer is in the interior of the dual feasible set $\mathcal{E}_\lambda^*$.
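Since the dual problem (6.2) has a single unknown, the root $\rho_1^*$ can be found by a one-dimensional search. The sketch below is one possible illustration, not the author's MATHEMATICA computation: it assumes NumPy, a diagonal $A = \mathrm{diag}(a_i)$ with $a_1 < 0$ and a nonzero component of $f$ for that eigenvalue, and uses plain bisection on $\psi(\rho^*) = \frac{1}{2}\sum_i f_i^2/(a_i+\rho^*)^2$, which decreases from $+\infty$ to $0$ on $\rho^* > -a_1$. The function name `solve_sphere_qp` is hypothetical.

```python
import numpy as np

def solve_sphere_qp(a, f, lam, tol=1e-12):
    """Diagonal case A = diag(a): find rho > max(0, -min(a)) such that
    0.5 * sum f_i^2 / (a_i + rho)^2 = lam, then return (rho, u) with
    u = f / (a + rho). Assumes the sphere constraint is active."""
    a = np.asarray(a, dtype=float)
    f = np.asarray(f, dtype=float)

    def psi(rho):
        return 0.5 * np.sum(f**2 / (a + rho) ** 2)

    lo = max(0.0, -a.min())      # psi -> +inf as rho decreases to -min(a)
    hi = lo + 1.0
    while psi(hi) > lam:         # psi -> 0 as rho grows, so this terminates
        hi = lo + 2.0 * (hi - lo)
    while hi - lo > tol:         # bisection on the decreasing function psi
        mid = 0.5 * (lo + hi)
        if psi(mid) > lam:
            lo = mid
        else:
            hi = mid
    rho = 0.5 * (lo + hi)
    return rho, f / (a + rho)

# Data of Example 8
rho1, u1 = solve_sphere_qp([-0.5, -0.25, 0.1, 0.4], [0.3, 0.4, -0.2, 0.1], 2.0)
print("rho_1* =", rho1)
print("u_1 =", u1, " |u_1| =", np.linalg.norm(u1))
```

For a general symmetric $A$ the same search can be run after an eigen-decomposition, with $f$ replaced by its coordinates in the eigenvector basis.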

Figure 5.18. Graphs of $P^d(\rho^*)$ in the four-dimensional problem.

7. Concluding Remarks

We have presented detailed applications of the recently developed canonical dual transformation method to the general nonconvex optimization problem $(\mathcal{P})$ proposed in (1.1). This problem is directly related to many important applications in mathematical physics. For the quadratic geometrical measure $\Lambda$, a canonical dual problem is formulated, i.e. the so-called perfect dual formulation with zero duality gap and without any perturbation. Based on the perfect duality theory, a complete set of solutions is obtained. Several examples are illustrated. The results show that the local minimizers and maximizers appear periodically in the order of the dual solutions. This phenomenon has been verified experimentally in superconductivity governed by the Landau-Ginzburg equation (see Gao, Li and Viehland, 2002).

The results presented in the last section (Section 6) are particularly interesting. Quadratic programming with only the norm constraint $|u| \le \Delta = \sqrt{2\lambda}$ was studied recently by M.J.D. Powell (2002). Since the normality condition $|u| \le \sqrt{2\lambda}$ is a general constraint for any real problem in applications, the quadratic operator $\Lambda(u) = \frac{1}{2}|u|^2 - \lambda$ can be used to solve many nonconvex problems in quadratic and d.c. programming. The ideas, results and methods presented in this paper can be used and generalized to solve some difficult problems in global optimization, nonconvex mechanics and scientific computations. The canonical dual transformation method for fully nonlinear systems (where $\Lambda$ is a general polynomial operator) was discussed in Gao (1998, 2000a, 2000c). Compared with the traditional direct methods in global optimization problems, the main advantages of the canonical dual transformation method are the following:

1. it provides powerful and efficient primal-dual alternative approaches;
2. it converts nonsmooth/nonconvex constrained problems into smooth concave dual problems;
3. it reduces the dimensions in nonlinear programming.

Duality plays a key role in modern mathematics and science. The inner beauty of duality theory owes much to the fact that many different natural phenomena can be put in a unified mathematical framework (cf. Fig. 1 in Gao, 2000a). Generally speaking, most physical variables appear in dual pairs. This one-to-one canonical duality relation serves as the foundation for the canonical dual transformation method. For any given nonlinear problem, as long as the geometrical operator $\Lambda$ is chosen properly and the trio-canonical forms can be characterized correctly, the canonical dual transformation method can be used to establish nice theoretical results, and to develop efficient alternative algorithms for robust computations. The extended Lagrange duality and triality theories might have a certain impact in some research fields.

Acknowledgement. The author is sincerely grateful to Professor C.J. Goh at University of Western Australia and Professor Alex Rubinov at Ballarat University for their detailed comments and suggestions on the author's recent results.

Notes

1. The sub-space $\mathcal{U}_r \subset \mathcal{U}_k$ is said to be the neighborhood of the critical point $\bar u$ if $\bar u$ is the only critical point of $P$ on $\mathcal{U}_r$. The definition for the neighborhood $\mathcal{E}_r^* \subset \mathcal{E}_k^*$ is similar. In the case that $A$ is a matrix, the definitions of $\mathcal{U}_r$ and $\mathcal{E}_r^*$ are given in the Remark following this theorem.

References

Arthurs, A.M. (1980). Complementary Variational Principles, Clarendon Press, Oxford.
Atai, A.A. and Steigmann, D. (1998). Coupled deformations of elastic curves and surfaces, Int. J. Solids and Structures, 35, 1915-1952.
Auchmuty, G. (1983). Duality for non-convex variational principles, J. Diff. Equations, 50, pp. 80-145.
Auchmuty, G. (2001). Variational principles for self-adjoint elliptic eigenproblems, in Nonconvex/Nonsmooth Mechanics: Modelling, Methods and Algorithms, edited by D.Y. Gao, R.W. Ogden and G. Stavroulakis, Kluwer Academic Publishers, 2000, 478pp.
Benson, H. (1995). Concave minimization: theory, applications and algorithms, in Handbook of Global Optimization, eds. R. Horst and P. Pardalos, Kluwer Academic Publishers, 43-148.
Clarke, F.H. (1985). The dual action, optimal control, and generalized gradients, Mathematical Control Theory, Banach Center Publ., 14, PWN, Warsaw, pp. 109-119.
Crouzeix, J.P. (1981). Duality framework in quasiconvex programming, in Generalized Convexity in Optimization and Economics, eds. S. Schaible and W.T. Ziemba, Academic Press, 207-226.
Dacorogna, B. (1989). Direct Methods in the Calculus of Variations, Springer-Verlag.
Ekeland, I. (1977). Legendre duality in nonconvex optimization and calculus of variations, SIAM J. Control and Optimization, 15, 905-934.
Ekeland, I. (1990). Convexity Methods in Hamiltonian Mechanics, Springer-Verlag, 247pp.
Ekeland, I. (2003). Nonconvex duality, to appear in Proceedings of IUTAM Symposium on Duality, Complementarity and Symmetry in Nonlinear Mechanics, D.Y. Gao (ed.), Kluwer Academic Publishers, Dordrecht/Boston/London.
Ekeland, I. and Temam, R. (1976). Convex Analysis and Variational Problems, North-Holland.

Floudas, C.A. and Visweswaran, V. (1995). Quadratic optimization, in Handbook of Global Optimization, R. Horst and P.M. Pardalos (eds), Kluwer Academic Publishers, Dordrecht/Boston/London, pp. 217-270.
Gao, D.Y. (1988). Panpenalty finite element programming for limit analysis, Computers & Structures, 28, pp. 749-755.
Gao, D.Y. (1996). Nonlinear elastic beam theory with applications in contact problem and variational approaches, Mech. Research Communication, 23, 1, pp. 11-17.
Gao, D.Y. (1997). Dual extremum principles in finite deformation theory with applications to post-buckling analysis of extended nonlinear beam theory, Applied Mechanics Reviews, 50, 11, November 1997, S64-S71.
Gao, D.Y. (1998a). Duality, triality and complementary extremum principles in nonconvex parametric variational problems with applications, IMA J. Appl. Math., 61, 199-235.
Gao, D.Y. (1998b). Bi-complementarity and duality: A framework in nonlinear equilibria with applications to the contact problems of elastoplastic beam theory, J. Appl. Math. Anal., 221, 672-697.
Gao, D.Y. (1999a). Duality-Mathematics, Wiley Encyclopedia of Electrical and Electronics Engineering, 6, 68-77.
Gao, D.Y. (1999b). General analytic solutions and complementary variational principles for large deformation nonsmooth mechanics, Meccanica, 34, 169-198.
Gao, D.Y. (2000a). Duality Principles in Nonconvex Systems: Theory, Methods and Applications, Kluwer Academic Publishers, Dordrecht/Boston/London, xviii+454pp.
Gao, D.Y. (2000b). Analytic solution and triality theory for nonconvex and nonsmooth variational problems with applications, Nonlinear Analysis, 42, 7, 1161-1193.
Gao, D.Y. (2000c). Canonical dual transformation method and generalized triality theory in nonsmooth global optimization, J. Global Optimization, 17 (1/4), pp. 127-160.
Gao, D.Y. (2000d). Finite deformation beam models and triality theory in dynamical post-buckling analysis, Int. J. Non-Linear Mechanics, 35, 103-131.
Gao, D.Y. (2001a). Bi-duality in nonconvex optimization, in Encyclopedia of Optimization, C.A. Floudas and P.M. Pardalos (eds), Kluwer Academic Publishers, Dordrecht/Boston/London, Vol. 1, pp. 477-482.
Gao, D.Y. (2001b). Tri-duality in global optimization, in Encyclopedia of Optimization, C.A. Floudas and P.M. Pardalos (eds), Kluwer Academic Publishers, Dordrecht/Boston/London, Vol. 1, pp. 485-491.

Gao, D.Y. (2001c). Complementarity, polarity and triality in nonsmooth, nonconvex and nonconservative Hamilton systems, Philosophical Transactions of the Royal Society: Mathematical, Physical and Engineering Sciences, 359, 2347-2367.
Gao, D.Y. (2002). Duality and triality in non-smooth, nonconvex and nonconservative systems: A survey, new phenomena and new results, in Nonsmooth/Nonconvex Mechanics with Applications in Engineering, edited by C. Baniotopoulos, Thessaloniki, Greece, pp. 1-14.
Gao, D.Y. (2003a). Perfect duality theory and complete solutions to a class of global optimization problems, to be published in Optimisation, special issue edited by A. Rubinov.
Gao, D.Y. (2003b). Canonical dual principle, algorithm, and complete solutions to Landau-Ginzburg equation with applications, Journal of Mathematics and Mechanics of Solids, special issue dedicated to Professor Ray Ogden for the occasion of his 60th birthday, edited by D. Steigmann.
Gao, D.Y. (2004). Perfect duality theory and complete solutions for constrained global optimization problems, to be published in J. Global Optimisation, special issue on Duality, edited by D.Y. Gao and K.L. Teo.
Gao, D.Y., Li, Jie-Fang and Viehland, D. (2002). Complete solutions and triality theory to Landau-Ginzburg equation in imperfect ferroelectrics, Proceedings of the 4th International Conference on Nonlinear Mechanics, W.Z. Chien (ed), Shanghai University Press, submitted to Physical Review.
Gao, D.Y. and Lin, P. (2002). Calculating global minimizers of a nonconvex energy potential, in Recent Advances in Computational Science Engineering, edited by H.P. Lee and K. Kumar, Imperial College Press, pp. 696-700.
Gao, D.Y., Ogden, R.W. and Stavroulakis, G. (2001). Nonsmooth and Nonconvex Mechanics: Modelling, Analysis and Numerical Methods, Kluwer Academic Publishers, Boston/Dordrecht/London, xliv+471pp.
Gao, D.Y. and Strang, G. (1989a). Geometric nonlinearity: Potential energy, complementary energy, and the gap function, Quart. Appl. Math., 47(3), 487-504.
Gao, D.Y. and Strang, G. (1989b). Dual extremum principles in finite deformation elastoplastic analysis, Acta Appl. Math., 17, pp. 257-267.
Gasimov, R.N. (2002). Augmented Lagrangian duality and nondifferentiable optimization methods in nonconvex programming, J. Global Optimization, 24, 187-203.
Goh, C.J. and Yang, X.Q. (2002). Duality in Optimization and Variational Inequalities, Taylor and Francis, 329pp.

Horst, R., Pardalos, P.M. and Thoai, N.V. (2000). Introduction to Global Optimization, Kluwer Academic Publishers.
Murty, K.G. and Kabadi, S.N. (1987). Some NP-complete problems in quadratic and nonlinear programming, Math. Progr., 39, 117-129.
Noble, B. and Sewell, M.J. (1972). On dual extremum principles in applied mathematics, J. Inst. Math. Appl., 9, 123-193.
Oden, J.T. and Reddy, J.N. (1983). Variational Methods in Theoretical Mechanics, Springer-Verlag.
Powell, M.J.D. (2002). UOBYQA: unconstrained optimization by quadratic approximation, Mathematical Programming, Series B, 92 (3), pp. 555-582.
Rockafellar, R.T. and Wets, R.J.B. (1997). Variational Analysis, Springer, Berlin, New York.
Rubinov, A.M. and Gasimov, R.N. (2003). Scalarization and nonlinear scalar duality for vector optimization with preferences that are not necessarily a pre-order relation, J. Global Optimization (special issue on Duality edited by D.Y. Gao and K.L. Teo), to appear.
Rubinov, A.M. and Yang, X.Q. (2003). Lagrange-Type Functions in Constrained Non-Convex Optimization, Kluwer Academic Publishers, Boston/Dordrecht/London, 285pp.
Rubinov, A.M., Yang, X.Q. and Glover, B.M. (2001). Extended Lagrange and penalty functions in optimization, J. Optim. Theory Appl., 111 (2), 381-405.
Sahni, S. (1974). Computationally related problems, SIAM J. Comp., 3, 262-279.
Sewell, M.J. (1987). Maximum and Minimum Principles, Cambridge Univ. Press, 468pp.
Singer, I. (1998). Duality for optimization and best approximation over finite intersections, Numer. Funct. Anal. Optim., 19, no. 7-8, 903-915.
Strang, G. (1986). Introduction to Applied Mathematics, Wellesley-Cambridge Press, 758pp.
Tabarrok, B. and Rimrott, F.P.J. (1994). Variational Methods and Complementary Formulations in Dynamics, Kluwer Academic Publishers, Dordrecht.
Thach, P.T. (1993). Global optimality criterion and a duality with a zero gap in nonconvex optimization, SIAM J. Math. Anal., 24, no. 6, 1537-1556.
Thach, P.T. (1995). Diewert-Crouzeix conjugation for general quasiconvex duality and applications, J. Optim. Theory Appl., 86, no. 3, 719-743.

Thach, P.T., Konno, H. and Yokota, D. (1996). Dual approach to minimization on the set of Pareto-optimal solutions, J. Optim. Theory Appl., 88, no. 3, 689-707.
Toland, J.F. (1978). Duality in nonconvex optimization, J. Mathematical Analysis and Applications, 66, 399-415.
Tonti, E. (1972). A mathematical model for physical theories, Accad. Naz. dei Lincei, Serie III, LII, I, 175-181; II, 350-356.
Tonti, E. (1972). On the mathematical structure of a large class of physical theories, Accad. Naz. dei Lincei, Serie VIII, LII, 49-56.
Tuy, H. (1995). D.C. optimization: theory, methods and algorithms, in Handbook of Global Optimization, eds. R. Horst and P. Pardalos, Kluwer Academic Publishers, 149-216.
Vavasis, S. (1990). Quadratic programming is in NP, Info. Proc. Lett., 36, 73-77.
Vavasis, S. (1991). Nonlinear Optimization: Complexity Issues, Oxford University Press, New York.
Walk, M. (1989). Theory of Duality in Mathematical Programming, Springer-Verlag, Wien/New York.
Wright, M.H. (1998). The interior-point revolution in constrained optimization, in High-Performance Algorithms and Software in Nonlinear Optimization (R. DeLeone, A. Murli, P.M. Pardalos and G. Toraldo, eds.), 359-381, Kluwer Academic Publishers, Dordrecht.
Ye, Y. (1992). A new complexity result on minimization of a quadratic function with a sphere constraint, in Recent Advances in Global Optimization (C. Floudas and P. Pardalos, eds.), Princeton University Press, NJ.