Optimization Vol. 52, Nos. 4–5, August–October 2003, pp. 467–493

PERFECT DUALITY THEORY AND COMPLETE SOLUTIONS TO A CLASS OF GLOBAL OPTIMIZATION PROBLEMS*

DAVID YANG GAO†

Department of Mathematics, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061, USA

(Received 13 December 2002; In final form 23 July 2003)

This article presents a complete set of solutions for a class of global optimization problems. These problems are directly related to the numerical discretization of a large class of semilinear nonconvex partial differential equations in nonconvex mechanics, including phase transitions, chaotic dynamics, nonlinear field theory, and superconductivity. The method used is the so-called canonical dual transformation developed recently. It is shown that, by this method, these difficult nonconvex constrained primal problems in $\mathbb{R}^n$ can be converted into a one-dimensional canonical dual problem, i.e. the perfect dual formulation with zero duality gap and without any perturbation. The dual criticality condition leads to an algebraic equation which can be solved completely. Therefore, a complete set of solutions to the primal problems is obtained. The extremality of these solutions is controlled by the triality theory discovered recently [D.Y. Gao (2000). Duality Principles in Nonconvex Systems: Theory, Methods and Applications, Vol. xviii, p. 454. Kluwer Academic Publishers, Dordrecht/Boston/London.]. Several examples are given, including nonconvex constrained quadratic programming. The results show that these problems can be solved completely to obtain all KKT points and global minimizers.

Keywords: Duality; Triality theory; Global optimization; Nonconvex variations; Canonical dual transformation; Nonconvex mechanics; Critical point theory; Semilinear equations; NP-hard problems; Quadratic programming

Mathematics Subject Classifications 2000: 49N15; 49M37; 90C26; 90C20

*This article is dedicated to Professor Gilbert Strang on the occasion of his 70th birthday. The main results of this article have been presented at the International Conference on Nonsmooth/Nonconvex Mechanics, Aristotle University of Thessaloniki (A.U.Th.), July 5–6, 2002 (keynote lecture), and the Second International Conference on Optimization and Control with Applications, August 18–22, 2002, Yellow Mountains, Anhui, China (plenary lecture).
†E-mail: [email protected]
ISSN 0233-1934 print: ISSN 1029-4945 online © 2003 Taylor & Francis Ltd. DOI: 10.1080/02331930310001611501

1 PROBLEM AND MOTIVATION

The primary goal of this article is to solve the following global optimization problem (in short, the primal problem $(\mathcal{P})$):

$$(\mathcal{P}): \quad \min \left\{ P(x) = \frac{1}{2}\langle x, Ax\rangle + W(x) - \langle x, f\rangle \;\middle|\; x \in \mathcal{X}_k \right\}, \tag{1}$$


where the feasible space $\mathcal{X}_k$ is a convex subset of a normed space $\mathcal{X}$ such that the algebraic interior of $\mathcal{X}$ is not empty; $A : \mathcal{X} \to \mathcal{X}^*$ is a linear operator with $A = A^*$, which maps each $x \in \mathcal{X}$ into the dual space $\mathcal{X}^*$; the bilinear form $\langle x, x^*\rangle : \mathcal{X} \times \mathcal{X}^* \to \mathbb{R}$ puts $\mathcal{X}$ and $\mathcal{X}^*$ in duality; $W : \mathcal{X} \to \mathbb{R}$ is a given (not necessarily convex) function; $f \in \mathcal{X}^*$ is a given input; $P : \mathcal{X}_k \to \mathbb{R}$ represents the total cost of the system. Although global minimization with inequality constraints is discussed in Section 6, this article is mainly concerned with finding all critical points of the nonconvex function $P(x)$. Thus, in the case that the nonconvex function $W : \mathcal{X}_k \to \mathbb{R}$ is Gâteaux differentiable, the stationary (or criticality) condition $DP(x) = 0$ leads to the governing equation

$$Ax + DW(x) = f, \tag{2}$$

where $DW(x)$ represents the Gâteaux derivative of $W$ at $x$, which is a mapping from $\mathcal{X}$ into its dual space $\mathcal{X}^*$. The abstract form (2) of the primal problem $(\mathcal{P})$ covers many situations. In nonconvex mechanics (cf. [23,27]), where $\mathcal{X}$ is an infinite dimensional function space, the state variable $x$ is a field function, usually denoted by $u(x)$, and $A : \mathcal{X} \to \mathcal{X}^*$ is usually a partial differential operator. In this case, the governing Eq. (2) is a so-called semilinear equation. For example, in the Landau–Ginzburg theory of superconductivity, $A = \Delta$ is the Laplacian over a given space domain $\Omega \subset \mathbb{R}^n$ and

$$W(u) = \int_\Omega \frac{1}{2}\alpha\left(\frac{1}{2}u^2 - \lambda\right)^2 d\Omega$$

is the Landau double well potential, in which $\alpha, \lambda > 0$ are material constants. Then the governing Eq. (2) leads to the well-known Landau–Ginzburg equation

$$\Delta u + \alpha u\left(\frac{1}{2}u^2 - \lambda\right) = f.$$

This semilinear differential equation plays an important role in materials science and physics, including ferroelectricity, ferromagnetism, ferroelasticity, and superconductivity. Because the Landau energy $W(u)$ is a nonconvex functional, the Landau–Ginzburg equation has proven difficult to solve. Traditional direct analysis and related numerical methods for solving this nonconvex variational problem have proven unsuccessful to date.

In dynamical systems, if $A = -\partial_{,tt} + \Delta$ is a wave operator over a given space-time domain $\Omega \subset \mathbb{R}^n \times \mathbb{R}$, and the nonconvex functional is simply given as $W(u) = -\int_\Omega \cos u \, d\Omega$, then (2) is the well-known sine-Gordon equation

$$u_{,tt} - \Delta u = \sin(u) - f.$$

This equation appears in many branches of physics. It provides one of the simplest models of the unified field theory. It can also be found in the theory of dislocations in metals, in the theory of Josephson junctions, and in interpreting certain biological processes such as DNA dynamics. In the time domain $\Omega \subset [0, \infty)$, if we keep only the first two terms of the Taylor expansion of $\sin(u)$, then the sine-Gordon equation reduces to the well-known Duffing equation, $u_{,tt} = u(1 - u^2/6) - f$. Even for this very simple one-dimensional ordinary differential equation, an analytic solution is still very difficult to obtain. It is known that this equation is extremely sensitive to the initial conditions and the input (driving force) $f(t)$ (see [17,18]).

Generally speaking, due to the nonconvexity of the function $W(u)$, very small perturbations of the system's initial conditions and parameters may lead the system to different operating points with significantly different performance characteristics, i.e. the so-called chaotic phenomena. The numerical results vary with the methods used. This is one of the main reasons why traditional perturbation analysis and direct approaches cannot be applied successfully to nonconvex systems. For convex $W(u)$, duality theory has been well studied (see [5,9,17]). However, for nonconvex $W(u)$, the problem of duality theory and solutions for nonlinear wave equations is one of the two main open problems proposed by Ekeland [10].

The numerical discretization of these semilinear problems in mathematical physics usually leads to the nonconvex optimization problem $(\mathcal{P})$ in the finite dimensional space $\mathcal{X} = \mathbb{R}^n$. In discrete dynamical systems, the operator $A = A^T \in \mathbb{R}^{n \times n}$ is usually indefinite. For constrained mathematical programming problems, the function $W(x)$ could also be an indicator $I_{\mathcal{X}_k}(x)$ of a convex constraint set $\mathcal{X}_k$ defined by

$$I_{\mathcal{X}_k}(x) := \begin{cases} 0 & \text{if } x \in \mathcal{X}_k, \\ +\infty & \text{otherwise.} \end{cases}$$

In this case, even the (nonconvex) quadratic minimization

$$P(x) = \frac{1}{2}\langle x, Ax\rangle - \langle x, f\rangle \to \min \quad \forall x \in \mathcal{X}_k \subset \mathbb{R}^n$$

is a so-called NP-hard problem (see [12,32]) in global optimization. However, by using the canonical dual transformation method developed recently by the author, a complete set of solutions can be obtained for the problem with certain constraints (see Section 6).

2 CANONICAL DUAL TRANSFORMATION METHOD: A BRIEF REVIEW

The concept of duality is one of the most successful ideas in modern mathematics and science [17,40,46]. Classical duality theory in convex analysis and optimization can be found in monographs by Ekeland [9], Ekeland and Temam [11], Goh and Yang [31], Sewell [38], Strang [40], Walk [47], Rockafellar [49], and many more. In the primal problem $(\mathcal{P})$, if the function $W(x)$ is a so-called canonical function, i.e. $x^* = DW(x)$ is invertible, then the dual problem can be formulated in different ways. First, we let $F(x) = \langle x, f\rangle - W(x)$, which is concave if $W(x)$ is convex. Thus, the primal function $P(x)$ can be written in the so-called action form (see [9,10]):

$$P(x) = \frac{1}{2}\langle x, Ax\rangle - F(x).$$


We shall use the notation $F^\flat$ to denote the Fenchel inf-conjugate of $F$, defined by

$$F^\flat(x^*) = \inf_{x \in \mathcal{X}_a} \{\langle x, x^*\rangle - F(x)\}.$$

Clearly, $F^\flat : \mathcal{X}^* \to \mathbb{R} \cup \{-\infty\}$ is always concave and upper semicontinuous. If $F(x)$ is also concave and upper semicontinuous, then the following Fenchel inf-duality relations hold on $\mathcal{X}_a \times \mathcal{X}_a^*$:

$$x^* \in \partial^+ F(x) \iff x \in \partial^+ F^\flat(x^*) \iff F(x) + F^\flat(x^*) = \langle x, x^*\rangle, \tag{3}$$

where $\partial^+ F = -\partial^-(-F(x))$ is called the super-differential of $F$, corresponding to the subdifferential $\partial^-$ in convex analysis. The duality pair $(x, x^*) \in \mathcal{X}_a \times \mathcal{X}_a^*$ is called a Fenchel inf-duality pair if the Fenchel inf-duality relations (3) hold on $\mathcal{X}_a \times \mathcal{X}_a^*$. Thus, in the case that $W(x)$ is convex, the first dual action form can be presented as

$$P^c(x) = \frac{1}{2}\langle x, Ax\rangle - F^\flat(Ax).$$

This dual action form was originally given by Clarke [5] in the case of convex Hamiltonian systems. The generalized formulation is due to Ekeland and Lasry (see [9,10]). The dual action principle states that if $F$ is concave, then $\bar{x}$ is a critical point of $P$ if and only if all the $x^c \in \bar{x} + \operatorname{Ker} A$ are critical points of $P^c$, and the complementarity condition

$$P(\bar{x}) + P^c(x^c) = 0 \quad \forall x^c \in \bar{x} + \operatorname{Ker} A \tag{4}$$

holds. However, if $F(x)$ is nonconcave, the Fenchel–Young inequality $F^\flat(x^*) \le \langle x, x^*\rangle - F(x)$ leads to $\theta = P(\bar{x}) + P^c(x^c) \ge 0$. The nonzero $\theta > 0$ is called the duality (or complementarity) gap. This duality gap shows that the Clarke dual action principle does not hold for nonconvex problems. In keeping with the complementarity condition (4), the dual action form $P^c$ is also called the complementary action, and the duality gap is referred to as the complementarity gap in [17]. Actually, the so-called complementary formulation has been a classical concept in engineering mechanics and physics for about one century: a problem is said to be a complementary problem if it is equivalent to the primal problem without any duality gap (cf. [41]). It seems that engineers and physicists only like perfect duality formulations. As Ekeland indicated in a very recent article [10], if $F(x)$ is nonconvex, the (perfect) dual action form (without complementarity gap) is an open problem in nonconvex systems. Based on the augmented Lagrangian theory and penalty function methods, a so-called nonlinear Lagrange theory has been developed recently for solving nonconvex constrained optimization problems, where the zero duality gap property is equivalent to the lower semicontinuity of a perturbation function (see [36]).


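The second dual formulation is based on the factorization of the self-adjoint (symmetrical) operator $A = \Lambda^* K \Lambda$, where $\Lambda : \mathcal{X} \to \mathcal{Y}$ is a so-called geometrical operator, which maps each configuration $x \in \mathcal{X}$ into a so-called intermediate space $\mathcal{Y}$; the symmetrical constitutive operator $K$ links $\mathcal{Y}$ with its dual space $\mathcal{Y}^*$. Let $\langle y; y^*\rangle$ denote the bilinear form on $\mathcal{Y} \times \mathcal{Y}^*$; the balance operator $\Lambda^* : \mathcal{Y}^* \to \mathcal{X}^*$ can then be defined by $\langle \Lambda x; y^*\rangle = \langle x, \Lambda^* y^*\rangle$, which maps each dual intermediate variable $y^* \in \mathcal{Y}^*$ back to the dual configuration space $\mathcal{X}^*$. By the definition introduced in [17,19], a Gâteaux differentiable function $\bar{F} : \mathcal{X}_a \to \mathbb{R}$ is said to be a canonical function on $\mathcal{X}_a$ if its Gâteaux derivative $D\bar{F} : \mathcal{X}_a \to \mathcal{X}_a^* \subset \mathcal{X}^*$ is a one-to-one mapping from $\mathcal{X}_a$ onto its range $\mathcal{X}_a^*$. Thus, if $\bar{F}(x)$ is a canonical function, the duality relation $x^* = D\bar{F}(x)$ is invertible on $\mathcal{X}_a \times \mathcal{X}_a^*$, and its Legendre conjugate $\bar{F}^* : \mathcal{X}_a^* \to \mathbb{R}$ can be defined uniquely by the classical Legendre transformation

$$\bar{F}^*(x^*) = \left\{ \langle x, x^*\rangle - \bar{F}(x) \;\middle|\; D\bar{F}(x) = x^*, \; x \in \mathcal{X}_a \right\}. \tag{5}$$

The duality pair $(x, x^*)$ is called the Legendre canonical duality pair on $\mathcal{X}_a \times \mathcal{X}_a^*$ if and only if the Legendre duality relations

$$x^* = D\bar{F}(x) \iff x = D\bar{F}^*(x^*) \iff \bar{F}(x) + \bar{F}^*(x^*) = \langle x, x^*\rangle \tag{6}$$

hold on $\mathcal{X}_a \times \mathcal{X}_a^*$. For example, if the function $W(x)$ in $(\mathcal{P})$ is a canonical function on $\mathcal{X}_a$, then $\bar{F}(x) = \langle x, f\rangle - W(x)$ is also a canonical function for any given $f \in \mathcal{X}_a^*$. If the operator $K : \mathcal{Y}_a \to \mathcal{Y}_a^*$ is invertible, then the quadratic function $\bar{U}(y) = (1/2)\langle y; Ky\rangle$ is a canonical function on $\mathcal{Y}_a$. Moreover, if the feasible space $\mathcal{X}_k$ can be written in the canonical form (see [17])

$$\mathcal{X}_k = \{x \in \mathcal{X}_a \mid \Lambda x \in \mathcal{Y}_a\},$$

then, based on the trio-factorization $A = \Lambda^* K \Lambda$, the primal problem $(\mathcal{P})$ can be rewritten in the canonical form

$$\min \left\{ P(x) = \bar{U}(\Lambda x) - \bar{F}(x) \;\middle|\; x \in \mathcal{X}_k \right\}. \tag{7}$$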
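The Legendre duality relations (6) can be checked numerically for the quadratic canonical function $\bar{U}(y) = (1/2)\langle y; Ky\rangle$ mentioned above. The following is a minimal sketch, assuming a finite dimensional setting in which $K$ is a symmetric invertible matrix; the matrix `K`, the random test vector, and the function names `U` and `U_star` are hypothetical choices made for illustration and do not come from the article.

```python
import numpy as np

# Hypothetical data: a symmetric, invertible (here indefinite) constitutive matrix.
K = np.array([[2.0, 1.0],
              [1.0, -3.0]])

def U(y):
    """Quadratic canonical function U_bar(y) = 1/2 <y; K y>."""
    return 0.5 * y @ K @ y

def U_star(y_star):
    """Its Legendre conjugate U_bar^*(y*) = 1/2 <K^{-1} y*; y*>."""
    return 0.5 * y_star @ np.linalg.solve(K, y_star)

rng = np.random.default_rng(0)
y = rng.standard_normal(2)

y_star = K @ y                        # duality relation      y* = D U_bar(y)
y_back = np.linalg.solve(K, y_star)   # inverse relation      y  = D U_bar^*(y*)
gap = U(y) + U_star(y_star) - y @ y_star  # should vanish by the third relation

print("y recovered:", np.allclose(y_back, y))
print("U(y) + U*(y*) - <y; y*> =", gap)
```

Note that convexity of $\bar{U}$ is not needed here: invertibility of $K$ alone makes the duality relation one-to-one, which is what the definition of a canonical function requires.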

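The criticality condition $DP(\bar{x}) = \Lambda^* D_\Lambda \bar{U}(\Lambda \bar{x}) - D\bar{F}(\bar{x}) = 0$, where $D_\Lambda \bar{U}(\Lambda x)$ denotes the Gâteaux derivative of $\bar{U}$ with respect to $y = \Lambda x$, can be split into the so-called trio-canonical forms

$$\begin{aligned} &\text{(a) geometrical equations:} && y = \Lambda x, \\ &\text{(b) duality relations:} && y^* = D\bar{U}(y), \quad x^* = D\bar{F}(x), \\ &\text{(c) balance equation:} && x^* = \Lambda^* y^*. \end{aligned} \tag{8}$$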

The problem (7) is said to be geometrically linear (resp. nonlinear) if the geometrical operator $\Lambda$ is linear (resp. nonlinear); the problem is said to be physically (or constitutively) linear (resp. nonlinear) if both duality relations are linear (resp. nonlinear); the problem is said to be fully nonlinear if it is both geometrically and physically nonlinear (see [17,19]). Extensive illustrations of the trio-factorization $A = \Lambda^* K \Lambda$ for linear operators in applied mathematics were given in the excellent textbook by Strang [40]. The trio-factorization for nonlinear operators $A$ in nonconvex and nonconservative systems was presented in [17].

The trio-canonical forms (8) serve as a framework for the classical Lagrangian duality theory in geometrically linear systems. Through the classical Lagrangian $L : \mathcal{X}_a \times \mathcal{Y}_a^* \to \mathbb{R}$,

$$L(x, y^*) = \langle \Lambda x; y^*\rangle - \bar{U}^*(y^*) - \bar{F}(x), \tag{9}$$

the canonical dual function $P^* : \mathcal{Y}_k^* \subset \mathcal{Y}_a^* \to \mathbb{R}$ can be defined by

$$P^*(y^*) = \left\{ L(x, y^*) \;\middle|\; D_x L(x, y^*) = 0, \; x \in \mathcal{X}_a \right\} = \bar{F}^*(\Lambda^* y^*) - \bar{U}^*(y^*) \tag{10}$$

on the dual feasible space $\mathcal{Y}_k^* = \{y^* \in \mathcal{Y}_a^* \mid \Lambda^* y^* \in \mathcal{X}_a^*\}$ (see [17]). In geometrically linear static systems, the canonical function $\bar{U}(y)$ is usually convex and $\bar{F}(x)$ is concave. In this case, (the total potential) $P(x)$ is convex, and $L(x, y^*)$ is a saddle function on $\mathcal{X}_a \times \mathcal{Y}_a^*$. The saddle Lagrange duality theory leads to the classical min–max duality theory in convex systems:

$$\inf_{x \in \mathcal{X}_k} P(x) = \inf_{x \in \mathcal{X}_a} \sup_{y^* \in \mathcal{Y}_a^*} L(x, y^*) = \sup_{y^* \in \mathcal{Y}_a^*} \inf_{x \in \mathcal{X}_a} L(x, y^*) = \sup_{y^* \in \mathcal{Y}_k^*} P^*(y^*).$$

The saddle Lagrange duality theory plays an important role in nonsmooth convex systems. As illustrated in [17,19], if the primal problem is nonsmooth, its Legendre dual problem is smooth. In nonlinear programming, if we can choose a geometrical operator $\Lambda : \mathcal{X} = \mathbb{R}^n \to \mathcal{Y} = \mathbb{R}^m$ with $n > m$, then the original primal problem in $\mathbb{R}^n$ can be converted into a dual problem in $\mathbb{R}^m$. This dimension reduction technique is very important in large-scale nonlinear programming.

In geometrically linear dynamical systems and game theory, both $\bar{U}(y)$ and $\bar{F}(x)$ are usually convex. In this case, the canonical function $P(x)$ is the so-called total action in dynamic systems, which is a d.c. function (i.e. difference of convex functions). The Lagrangian (9) associated with the d.c. function $P(x)$ is a so-called super- (or $\partial^+$-) Lagrangian, which leads to a so-called bi-duality theory in generalized (both conservative and dissipative) convex Hamiltonian systems$^1$ (see [15]), i.e. if $(\bar{x}, \bar{y}^*)$ is a critical point of $L$, then either

$$\inf_{x \in \mathcal{X}_k} P(x) = L(\bar{x}, \bar{y}^*) = \inf_{y^* \in \mathcal{Y}_k^*} P^*(y^*)$$

or

$$\sup_{x \in \mathcal{X}_k} P(x) = L(\bar{x}, \bar{y}^*) = \sup_{y^* \in \mathcal{Y}_k^*} P^*(y^*).$$

This bi-duality theory plays an important role in periodic convex Hamiltonian systems.

$^1$ Since the Hamiltonian $H(x, y^*) = \langle \Lambda x; y^*\rangle - L(x, y^*) = \bar{F}(x) + \bar{U}^*(y^*)$ associated with the super-Lagrangian $L(x, y^*)$ is convex, this might be the reason why most people prefer the convex Hamiltonian $H(x, y^*)$ to the super-Lagrangian $L(x, y^*)$ in dynamic systems.

In the problem $(\mathcal{P})$ considered in the present article, since the function $W(x)$ is nonconvex, it turns out that $F(x) = \langle x, f\rangle - W(x)$ is no longer a canonical function, and the duality relation $x^* = DF(x)$ is not one-to-one. Thus, the Legendre transformation (5) of the nonconvex function $F$ cannot be uniquely defined (see [38]). In this case, the Fenchel–Young inequality for the nonconvex function $F$ also produces a nonzero duality gap between the primal function $P(x)$ and its classical Lagrangian dual function $P^*(y^*)$. In this sense, the well-developed classical Lagrange duality can be used mainly for convex problems or d.c. programming. During the last three decades, many modified versions of the Fenchel–Rockafellar duality have been proposed. One of them, the so-called relaxation method in nonconvex mechanics (cf. [1,7]), can be used to solve the relaxed convex problems. However, due to the duality gap, these relaxed solutions are not equivalent to the real solutions. Tremendous efforts have been focused recently on finding the so-called perfect duality theory (i.e. without a duality gap) in global optimization. Some important concepts have been developed in global optimization and variational inequalities (cf. e.g. [2,3,5,8,30,31,35,37,39,42–46], and many more). Generally speaking, the main difficulty is due to the fact that the Legendre conjugate of a general nonconvex function is usually multi-valued. Although a striking example in nonlinear elasticity has been proposed recently by Ekeland [10], as he pointed out, the general methods and theory for solving nonconvex problems remain open.

Perfect duality theory, i.e. the so-called complementary variational principle in engineering mechanics and modern physics, and the trio-canonical forms in nonconvex (geometrically nonlinear) systems were originally studied by Gao and Strang [28] in large deformation variational/boundary value problems governed by nonsmooth duality relations (constitutive laws), where $A = 0$ and the primal problem $(\mathcal{P})$ takes the following stationary variational form:

$$(\mathcal{P}_{sta}): \quad P(u) = \bar{W}(\Lambda(u)) - \langle u, f\rangle \to \mathrm{sta} \quad \forall u \in \mathcal{U}_k, \tag{11}$$

in which the notation $P(u) \to \mathrm{sta} \;\; \forall u \in \mathcal{U}_k$ stands for finding all stationary points of $P$ over the feasible space $\mathcal{U}_k$; the internal energy $\bar{W}(y)$ is a convex functional of the canonical (geometric) strain tensor $y$, while, for a given input $f$, the external energy $F(u) = \langle u, f\rangle$ is a linear functional, and the geometrical measure $\Lambda(u)$ is a quadratic tensor function of the state variable $u$. By introducing the so-called complementary operator $\Lambda_c$,

$$\Lambda_c(u) = \Lambda(u) - \Lambda_t(u)u, \tag{12}$$

where $\Lambda_t(u) = D\Lambda(u)$ denotes the Gâteaux derivative of $\Lambda(u)$ with respect to $u$, Gao and Strang discovered that the duality gap existing in classical Lagrange duality theory can be naturally recovered by the so-called complementary gap function (see [28])

$$G(u, y^*) = \langle -\Lambda_c(u); y^*\rangle. \tag{13}$$

Therefore, they proved that the original nonconvex problem (11) is equivalent to the following constrained complementary variational problem:

$$(\mathcal{P}^c_{sta}): \quad \begin{cases} P^c(u, y^*) = \bar{W}^\sharp(y^*) + G(u, y^*) \to \mathrm{sta} \quad \forall y^* \in \mathcal{Y}_a^*, \\ \text{s.t. } \Lambda_t^*(u) y^* = f, \end{cases} \tag{14}$$


FIGURE 1 Framework in fully nonlinear systems.

where the balance operator $\Lambda_t^*(u)$ is the adjoint operator of $\Lambda_t$ defined by the duality pairing $\langle \Lambda_t(u)u; y^*\rangle = \langle u, \Lambda_t^*(u) y^*\rangle$, and $\bar{W}^\sharp(y^*)$ is the Fenchel super-conjugate:

$$\bar{W}^\sharp(y^*) = \sup \left\{ \langle y; y^*\rangle - \bar{W}(y) \;\middle|\; y \in \mathcal{Y}_a \right\}.$$

Gao and Strang further proved that if $(\bar{u}, \bar{y}^*)$ is a critical point of the extended Lagrangian

$$\Xi(u, y^*) = \langle \Lambda(u); y^*\rangle - \bar{W}^\sharp(y^*) - \langle u, f\rangle, \tag{15}$$

then the complementarity condition $P(\bar{u}) + P^c(\bar{u}, \bar{y}^*) = 0$ holds. Moreover, if $G(\bar{u}, \bar{y}^*) \ge 0$, then $(\bar{u}, \bar{y}^*)$ is a saddle point of $\Xi(u, y^*)$ and $\bar{u}$ is a global minimizer of $P(u)$. Their original work on duality theory in finite field theory leads to a unified framework in fully nonlinear canonical systems (see Fig. 1). In finite deformation theory, if $\bar{W}$ is a quadratic function of the so-called Cauchy–Green strain tensor $y = \Lambda(u) = (1/2)(\nabla u)^T(\nabla u)$, the extended Lagrangian $\Xi(u, y^*)$ is the well-known Hellinger–Reissner energy. This complementary energy variational principle plays an essential role in large deformation mechanics.

Recently, in the study of post-bifurcation in nonconvex mechanics, it was discovered that for a quadratic operator $\Lambda$, if the complementary gap function $G(\bar{u}, \bar{y}^*)$ is negative, then $(\bar{u}, \bar{y}^*)$ is a super- (or $\partial^+$-) critical point of the extended Lagrangian $\Xi(u, y^*)$. In this case, $(\bar{u}, \bar{y}^*)$ could be either a local minimizer or a local maximizer of $P^c(u, y^*)$. Therefore, an interesting triality theory was proposed in finite deformation theory and nonsmooth/nonconvex variational analysis (see [13,14,16]). This triality solved completely the open problem on the extremality condition of the Hellinger–Reissner variational principle in nonconvex mechanics. A self-contained comprehensive presentation of the mathematical theory of duality and triality in general nonconvex, nonsmooth systems was given recently in the monograph [17]. During the writing of this book, a potentially useful method, the so-called canonical dual transformation method, was developed.

The key idea of this canonical dual transformation method is to choose a certain (geometrically reasonable) operator $y = \Lambda(x) : \mathcal{X}_a \to \mathcal{Y}_a$ such that a given nonconvex function $P(x)$ can be written in the canonical form $P(x) = \Phi(x, \Lambda(x))$, where $\Phi(x, y) : \mathcal{X}_a \times \mathcal{Y}_a \to \mathbb{R}$ is a canonical function in each of its variables (see [19]). Very often, $\Phi(x, y) = \bar{W}(y) - \bar{F}(x)$. Since both $\bar{W} : \mathcal{Y}_a \to \mathbb{R}$ and $\bar{F} : \mathcal{X}_a \to \mathbb{R}$ are canonical functions, their Legendre conjugates can be uniquely defined via the classical Legendre transformation. Thus the extended Lagrangian

$$\Xi(x, y^*) = \langle \Lambda(x); y^*\rangle - \bar{W}^*(y^*) - \bar{F}(x) \tag{16}$$


is well defined on $\mathcal{X}_a \times \mathcal{Y}_a^*$. Then by using the so-called $\Lambda$-canonical dual transformation (see [17])

$$\bar{F}^\Lambda(y^*) = \left\{ \langle \Lambda(x); y^*\rangle - \bar{F}(x) \;\middle|\; \Lambda_t^*(x) y^* - D\bar{F}(x) = 0, \; x \in \mathcal{X}_a \right\}, \tag{17}$$

the canonical dual function of the nonconvex $P(x)$ can be well defined by

$$P^d(y^*) = \left\{ \Xi(x, y^*) \;\middle|\; D_x \Xi(x, y^*) = 0, \; x \in \mathcal{X}_a \right\} = \bar{F}^\Lambda(y^*) - \bar{W}^*(y^*). \tag{18}$$

In the case that $\bar{F}$ is linear and $\Lambda$ is quadratic, the $\Lambda$-conjugate $\bar{F}^\Lambda(y^*)$ is equivalent to the complementary gap function, i.e.

$$\bar{F}^\Lambda(y^*) = \left\{ -G(x, y^*) \;\middle|\; \Lambda_t^*(x) y^* = f, \; x \in \mathcal{X}_a \right\}.$$

In mathematical physics, the canonical duality relation $y^* = D\bar{W}(y)$ is usually called the constitutive law. By the duality of natural phenomena we know that physical variables (always) exist in pairs. The one-to-one duality relation between each canonical dual pair ensures the existence of the geometrical measure $y = \Lambda(x)$ and the canonical functional for most well-posed systems. Extensive applications of this canonical dual transformation method have been given in nonconvex continuous systems, and some analytical solutions of nonconvex/nonsmooth boundary value problems have been obtained (see [14,16,19]). The method was generalized to nonsmooth global optimization problems with an arbitrary nonlinear operator $\Lambda$ (see [19]).

The aim of this article is to present applications of the generalized canonical dual transformation method to solve the global optimization problem $(\mathcal{P})$ in finite dimensional space. We will show that by using this method, the nonconvex primal problem $(\mathcal{P})$ can be transformed into a perfect dual problem $(\mathcal{P}^d)$, and the coupled nonlinear system (2) in $\mathbb{R}^n$ can be converted into a dual algebraic equation in $\mathbb{R}^1$. Therefore, a complete set of critical points for the nonconvex function $P(x)$ on the feasible set $\mathcal{X}_k$ can be obtained. The global minimizer of the primal problem is controlled by the triality theorem. Some concrete examples are presented in Section 5 for unconstrained problems, while in Section 6, a set of complete solutions is obtained for quadratic programming with inequality constraints.

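3 PERFECT DUALITY FORMULATION AND COMPLETE SOLUTIONS

In order to use the canonical dual transformation method for solving the primal problem $(\mathcal{P})$, some additional assumptions are needed. First, we assume that the operator $A : \mathcal{X}_a \subset \mathcal{X} \to \mathcal{X}_a^* \subset \mathcal{X}^*$ is invertible. Then, for each given $f \in \mathcal{X}_a^*$, the function $\bar{F} : \mathcal{X}_a \to \mathbb{R}$, defined by

$$\bar{F}(x) = \langle x, f\rangle - \frac{1}{2}\langle x, Ax\rangle,$$

is a canonical function on $\mathcal{X}_a$ since its Gâteaux derivative $x^* = D\bar{F}(x) = f - Ax : \mathcal{X}_a \to \mathcal{X}_a^* \subset \mathcal{X}^*$ is a one-to-one mapping from $\mathcal{X}_a$ onto the range $\mathcal{X}_a^*$. Thus $(x, x^*)$ is a Legendre canonical duality pair on $\mathcal{X}_a \times \mathcal{X}_a^*$. We further assume that for the given nonconvex function $W(x) : \mathcal{X}_a \to \mathbb{R}$, there exists a geometrical operator $\Lambda(x) : \mathcal{X} \to \mathcal{Y}_a$, which maps each $x \in \mathcal{X}_a$ into another metric space $\mathcal{Y}$, such that the nonconvex function $W(x)$ can be written in the canonical form $W(x) = \bar{W}(\Lambda(x))$, where $\bar{W}(y)$ is a canonical function defined on a subset $\mathcal{Y}_a \subset \mathcal{Y}$. By the definition of the canonical function, $\bar{W} : \mathcal{Y}_a \to \mathbb{R}$ is Gâteaux differentiable, and the duality relation $y^* = D\bar{W} : \mathcal{Y}_a \to \mathcal{Y}_a^* \subset \mathcal{Y}^*$ is invertible. Let $\langle \cdot\,; \cdot\rangle : \mathcal{Y} \times \mathcal{Y}^* \to \mathbb{R}$ denote the bilinear form on $\mathcal{Y} \times \mathcal{Y}^*$. Then the Legendre conjugate function $\bar{W}^* : \mathcal{Y}_a^* \to \mathbb{R}$ of the canonical function $\bar{W}$ can be obtained uniquely by the classical Legendre transformation

$$\bar{W}^*(y^*) = \left\{ \langle y; y^*\rangle - \bar{W}(y) \;\middle|\; D\bar{W}(y) = y^*, \; y \in \mathcal{Y}_a \right\}, \tag{19}$$

and the Legendre canonical duality relations

$$y^* = D\bar{W}(y) \iff y = D\bar{W}^*(y^*) \iff \bar{W}(y) + \bar{W}^*(y^*) = \langle y; y^*\rangle \tag{20}$$

hold on $\mathcal{Y}_a \times \mathcal{Y}_a^*$. So the pair $(y, y^*)$ is also a Legendre canonical dual pair on $\mathcal{Y}_a \times \mathcal{Y}_a^*$. In this article, we limit our attention to finite dimensional problems with the quadratic operator $\Lambda : \mathcal{X} \to \mathcal{Y} \subset \mathbb{R}$,

$$\Lambda(x) = \frac{1}{2}|x|^2 - \lambda, \tag{21}$$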

where $|\cdot| : \mathcal{X} \to \mathbb{R}$ is the Euclidean norm, and $\lambda \in \mathbb{R}$ is a constant. An example for a vector-valued nonlinear operator $\Lambda$ is given in Section 6. Also, by using the so-called sequential canonical dual transformation method developed in [17,19], the results of this article can be generalized to any so-called canonical polynomial operator $\Lambda(x)$ (see [14,17]). Finally, we assume that for a given $f \in \mathcal{X}_a^*$, the feasible set $\mathcal{X}_k$ can be written as

$$\mathcal{X}_k = \{x \in \mathcal{X}_a \mid \Lambda(x) \in \mathcal{Y}_a\}.$$

Thus, in terms of the canonical function $\bar{W}$ and the geometrical measure $y = \Lambda(x)$, we can rewrite the primal minimization problem $(\mathcal{P})$ in the canonical stationary variational form ($(\mathcal{P}_{sta})$ in short):

$$(\mathcal{P}_{sta}): \quad P(x) = \bar{W}(\Lambda(x)) - \bar{F}(x) = W(x) + \frac{1}{2}\langle x, Ax\rangle - \langle x, f\rangle \to \mathrm{sta} \quad \forall x \in \mathcal{X}_k. \tag{22}$$

The criticality condition $DP(\bar{x}) = 0$ leads to the following canonical equation

$$\left( A + D_\Lambda \bar{W}(\Lambda(\bar{x})) I \right) \bar{x} = f, \tag{23}$$

where $D_\Lambda \bar{W}$ stands for the Gâteaux derivative of $\bar{W}(\Lambda(x))$ with respect to $\Lambda(x)$, and $I$ is an identity matrix. Clearly, the canonical equation (23) is equivalent to the original Euler equation (2). However, by the canonical dual transformation, a complete set of solutions of this nonlinear system can be obtained via the canonical (i.e. perfect) duality formulation.

THEOREM 1 (Perfect Duality Formulation) Suppose that, for a given $f \in \mathcal{X}_a^*$, the dual feasible space

$$\mathcal{Y}_k^* = \left\{ y^* \in \mathcal{Y}_a^* \;\middle|\; (A + y^* I) \text{ is invertible and } (A + y^* I)^{-1} f \in \mathcal{X}_a \right\} \tag{24}$$

is not empty. Then the problem

$$(\mathcal{P}^d_{sta}): \quad P^d(y^*) = -\frac{1}{2}\langle (A + y^* I)^{-1} f, f\rangle - \lambda y^* - \bar{W}^*(y^*) \to \mathrm{sta} \quad \forall y^* \in \mathcal{Y}_k^* \tag{25}$$

is canonically (perfectly) dual to the primal problem $(\mathcal{P}_{sta})$ in the sense that if $\bar{x} \in \mathcal{X}_k$ is a solution of the primal stationary problem $(\mathcal{P}_{sta})$ given in Eq. (22), then $\bar{y}^* = D_\Lambda \bar{W}(\Lambda(\bar{x}))$ is a solution of the dual problem $(\mathcal{P}^d_{sta})$ and

$$P(\bar{x}) = P^d(\bar{y}^*). \tag{26}$$

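Proof Following the standard procedure of the canonical dual transformation described in Section 2, the extended Lagrangian $\Xi : \mathcal{X}_a \times \mathcal{Y}_a^* \to \mathbb{R}$ can be defined as

$$\Xi(x, y^*) = \langle \Lambda(x); y^*\rangle - \bar{W}^*(y^*) + \frac{1}{2}\langle x, Ax\rangle - \langle x, f\rangle. \tag{27}$$

The criticality condition $D\Xi(\bar{x}, \bar{y}^*) = 0$ leads to the canonical Lagrange equations:

$$\Lambda(\bar{x}) = D\bar{W}^*(\bar{y}^*), \tag{28}$$

$$\Lambda_t^*(\bar{x}) \bar{y}^* = D\bar{F}(\bar{x}) = (f - A\bar{x}), \tag{29}$$

where $\Lambda_t(\bar{x}) = D\Lambda(\bar{x}) = \bar{x}$ is the Gâteaux derivative of $\Lambda$ at $\bar{x}$. By the Legendre canonical duality relations (20), the inverse duality equation (28) is equivalent to $\bar{y}^* = D\bar{W}(\Lambda(\bar{x}))$. Substituting this into (29), we obtain the canonical Euler equation (23). This shows that the critical points of $\Xi(x, y^*)$ solve the primal problem, and $P(\bar{x}) = \Xi(\bar{x}, \bar{y}^*)$. By definition, for each fixed $y^* \in \mathcal{Y}_a^*$, the canonical dual function $P^d$ is given by

$$P^d(y^*) = \left\{ \Xi(x, y^*) \;\middle|\; D_x \Xi(x, y^*) = 0, \; x \in \mathcal{X}_a \right\} = \bar{F}^\Lambda(y^*) - \bar{W}^*(y^*),$$

where the $\Lambda$-canonical dual transformation $\bar{F}^\Lambda : \mathcal{Y}_a^* \to \mathbb{R}$ of the canonical function $\bar{F}(x) = \langle x, f\rangle - (1/2)\langle x, Ax\rangle$ is defined by

$$\bar{F}^\Lambda(y^*) = \left\{ \langle \Lambda(x); y^*\rangle - \bar{F}(x) \;\middle|\; D\bar{F}(x) = \Lambda_t^*(x) y^*, \; x \in \mathcal{X}_a \right\}. \tag{30}$$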

For a given $f \in \mathcal{X}_a^*$, if the dual feasible space $\mathcal{Y}_k^*$ is not empty, then for each $y^* \in \mathcal{Y}_k^*$, the linear equation $D\bar{F}(x) = \Lambda_t^*(x) y^*$, i.e. $f - Ax = y^* x$, has a unique solution $x = (A + y^* I)^{-1} f$. Substituting this into the $\Lambda$-canonical dual transformation (30), we have

$$\bar{F}^\Lambda(y^*) = -\frac{1}{2}\langle (A + y^* I)^{-1} f, f\rangle - \lambda y^*.$$

Thus, on the canonical dual feasible space $\mathcal{Y}_k^*$, the canonical dual function $P^d$ is formulated uniquely in the form (25). Moreover, if $(\bar{x}, \bar{y}^*)$ is a critical point of $\Xi(x, y^*)$, and $\bar{y}^* \in \mathcal{Y}_k^*$, the canonical Lagrangian equation (29) has a unique solution $\bar{x} = (A + \bar{y}^* I)^{-1} f$. Substituting this into (28), we obtain the dual algebraic equation

$$\frac{1}{2} f^T (A + \bar{y}^* I)^{-2} f - D\bar{W}^*(\bar{y}^*) = \lambda. \tag{31}$$

This is exactly the criticality condition $DP^d(\bar{y}^*) = 0$. Thus, the critical point $(\bar{x}, \bar{y}^*)$ of the extended Lagrangian $\Xi(x, y^*)$ solves both the primal and dual problems. The Legendre duality relations lead to the equality (26). ∎

This theorem shows that there is no duality gap between the primal problem $(\mathcal{P}_{sta})$ and its canonical dual problem $(\mathcal{P}^d_{sta})$. Since the criticality condition (31) of $P^d$ is an algebraic equation with only one unknown $y^* \in \mathbb{R}$, the canonical dual problem $(\mathcal{P}^d_{sta})$ has a finite number of critical points in $\mathcal{Y}_k^*$. All these dual solutions $\bar{y}_i^*$ $(i = 1, 2, \ldots)$ form a subset of $\mathcal{Y}_k^*$, denoted by

$$\mathcal{Y}_s^* = \left\{ \bar{y}^* \in \mathcal{Y}_k^* \;\middle|\; D\bar{W}^*(\bar{y}^*) + \lambda = \frac{1}{2} f^T (A + \bar{y}^* I)^{-2} f \right\}. \tag{32}$$

The following result shows that the dual solution set $\mathcal{Y}_s^*$ leads to a complete set of solutions of the primal problem $(\mathcal{P}_{sta})$.

THEOREM 2 (Complete Solution Set) Suppose that the assumption in Theorem 1 holds. For every solution $\bar{y}^* \in \mathcal{Y}_s^*$, the vector $\bar{x}$ defined by

$$\bar{x}(\bar{y}^*) = (A + \bar{y}^* I)^{-1} f \tag{33}$$

solves the primal problem $(\mathcal{P}_{sta})$. Conversely, every solution $\bar{x}$ of the primal problem $(\mathcal{P}_{sta})$ can be written in the form (33) for some dual solution $\bar{y}^* \in \mathcal{Y}_s^*$.

Proof We first prove that the vector defined by (33) solves (23). Substituting $(A + \bar{y}^* I)^{-1} f = \bar{x}$ into the dual algebraic equation (31), we obtain the inverse canonical dual relation

$$\Lambda(\bar{x}) = \frac{1}{2}|\bar{x}|^2 - \lambda = D\bar{W}^*(\bar{y}^*).$$

Since $\bar{W}(y)$ is a canonical function, by the Legendre duality relation (20) we know that $\bar{y}^* = D_y \bar{W}(\Lambda(\bar{x}))$. Substituting $\bar{x}(\bar{y}^*) = (A + D_y \bar{W}(\Lambda(\bar{x})) I)^{-1} f$ into the left-hand side of the canonical equation (23) leads to $f$. Thus for every solution $\bar{y}^*$ of the dual algebraic equation (31), $\bar{x} = (A + \bar{y}^* I)^{-1} f$ solves the canonical equation (23), and is a critical point of $P$. Conversely, if $\bar{x}$ is a solution of the coupled nonlinear system (23), then it can be written in the form $\bar{x} = (A + \bar{y}^*(\bar{x}) I)^{-1} f$ with $\bar{y}^*(\bar{x}) = D\bar{W}(\Lambda(\bar{x}))$. By Theorem 1 we know that the pair $(\bar{x}, \bar{y}^*(\bar{x}))$ is a critical point of the extended Lagrangian $\Xi(x, y^*)$, and $\bar{y}^*(\bar{x})$ is a critical point of $P^d(y^*)$ on $\mathcal{Y}_k^*$. It follows that $\bar{y}^* = D\bar{W}(\Lambda(\bar{x}))$ has to be a solution of the canonical dual algebraic equation (31). This shows that every solution of the coupled nonlinear system (23) can be written in the form $\bar{x} = (A + \bar{y}^* I)^{-1} f$ for some solution $\bar{y}^*$ of the dual algebraic equation (31). ∎
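The construction in Theorems 1 and 2 can be tried out numerically. The following is a minimal sketch, assuming the double-well choice $\bar{W}(y) = (\alpha/2) y^2$ (so that $D\bar{W}^*(y^*) = y^*/\alpha$ and $\bar{W}^*(y^*) = (y^*)^2/(2\alpha)$) and a symmetric matrix $A$ with distinct eigenvalues. The data `A`, `f`, `alpha`, `lam` and the helper `dual_roots` are hypothetical and chosen only for illustration. After diagonalizing $A$, the dual algebraic equation (31) becomes a polynomial equation in the single unknown $y^*$, whose real roots are mapped back to primal critical points through (33).

```python
import numpy as np

def dual_roots(A, f, alpha, lam):
    """Real solutions y of (1/2) f^T (A + y I)^{-2} f - y/alpha = lam.

    Sketch only: assumes A symmetric with distinct eigenvalues and
    W_bar(y) = (alpha/2) y^2, so that DW_bar^*(y*) = y*/alpha.
    """
    a, Q = np.linalg.eigh(A)            # A = Q diag(a) Q^T
    g = Q.T @ f                         # f expressed in the eigenbasis
    n = len(a)
    full = np.poly1d([1.0])             # prod_i (y + a_i)^2
    for ai in a:
        full = full * (np.poly1d([1.0, ai]) ** 2)
    lhs = np.poly1d([0.0])              # (1/2) sum_i g_i^2 prod_{j != i} (y + a_j)^2
    for i in range(n):
        rest = np.poly1d([1.0])
        for j in range(n):
            if j != i:
                rest = rest * (np.poly1d([1.0, a[j]]) ** 2)
        lhs = lhs + rest * (0.5 * float(g[i]) ** 2)
    # Equation (31) multiplied through by prod_i (y + a_i)^2.
    poly = lhs - np.poly1d([1.0 / alpha, lam]) * full
    r = poly.roots
    y = np.sort(r[np.abs(r.imag) < 1e-9].real)
    # Discard values at which A + y I is singular (outside the dual feasible set).
    ok = np.min(np.abs(y[:, None] + a[None, :]), axis=1) > 1e-9
    return y[ok]

# Hypothetical data: an indefinite symmetric A, so P is nonconvex.
A = np.diag([-1.0, 0.5])
f = np.array([0.3, 0.2])
alpha, lam = 1.0, 1.0

def P(x):   # primal function (22) with W(x) = (alpha/2) (|x|^2/2 - lam)^2
    return 0.5 * x @ A @ x + 0.5 * alpha * (0.5 * x @ x - lam) ** 2 - f @ x

def Pd(y):  # canonical dual function (25) with W_bar^*(y) = y^2 / (2 alpha)
    return -0.5 * f @ np.linalg.solve(A + y * np.eye(2), f) - lam * y - y**2 / (2 * alpha)

ys = dual_roots(A, f, alpha, lam)
xs = [np.linalg.solve(A + y * np.eye(2), f) for y in ys]   # formula (33)
for y, x in zip(ys, xs):
    residual = A @ x + alpha * (0.5 * x @ x - lam) * x - f  # canonical equation (23)
    print(f"y* = {y:+.4f}  |residual| = {np.linalg.norm(residual):.1e}  "
          f"P = {P(x):+.6f}  Pd = {Pd(y):+.6f}")
```

Each printed line should show a residual at rounding level and matching values of `P` and `Pd`, which is the zero duality gap stated in (26). Root finding through an expanded polynomial is used here only because it is short; for larger or ill-conditioned problems a bracketing solver applied directly to (31) may be a more robust choice.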


This theorem shows that, by the canonical dual transformation, a complete set of solutions to the nonconvex primal problem is obtained as

$$\mathcal{X}_s = \left\{ \bar{x} \in \mathcal{X}_k \;\middle|\; \bar{x} = (A + \bar{y}^* I)^{-1} f \;\; \forall \bar{y}^* \in \mathcal{Y}_s^* \right\}. \tag{34}$$

4 GLOBAL MINIMIZER AND LOCAL EXTREMES

For the given nonconvex problem $(\mathcal{P})$, each solution $\bar{x} \in \mathcal{X}_s$ may be only a local extremum point (either a local minimizer or a local maximizer) of the nonconvex function $P(x)$. In order to determine the global minimizers and local extremes of $P$, we introduce the following subsets:

$$\mathcal{Y}_+^* = \{y^* \in \mathcal{Y}_k^* \mid (A + y^* I) \text{ is positive definite}\}, \tag{35}$$

$$\mathcal{Y}_-^* = \{y^* \in \mathcal{Y}_k^* \mid (A + y^* I) \text{ is negative definite}\}. \tag{36}$$

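By the triality theory proposed in [13,14,17], the global minimizers and maximizers of the primal problem $(\mathcal{P}_{sta})$ and the dual problem $(\mathcal{P}^d_{sta})$ can be clarified by the following theorem.

THEOREM 3 (Global Minimizer and Maximizer) Suppose that the canonical function $\bar{W}(y)$ is convex on $\mathcal{Y}_a$, and for each dual solution $\bar{y}^* \in \mathcal{Y}_s^*$, we let $\bar{x}(\bar{y}^*) = (A + \bar{y}^* I)^{-1} f$. If $\bar{y}^* \in \mathcal{Y}_+^*$, then $\bar{y}^*$ is a global maximizer of $P^d$ on $\mathcal{Y}_+^*$, while $\bar{x}(\bar{y}^*)$ is a global minimizer of $P$ on $\mathcal{X}_k$, and

$$P(\bar{x}) = \min_{x \in \mathcal{X}_k} P(x) = \max_{y^* \in \mathcal{Y}_+^*} P^d(y^*) = P^d(\bar{y}^*). \tag{37}$$

Moreover, the dual solution set $\mathcal{Y}_s^*$ has at most one element $\bar{y}^* \in \mathcal{Y}_+^*$. If $\bar{y}^* \in \mathcal{Y}_-^*$, then $\bar{y}^*$ and the associated $\bar{x}$ are local critical points of $P^d$ and $P$, respectively. In this case, $\bar{x}$ is a local maximizer of $P(x)$ on its neighborhood$^2$ $\mathcal{X}_r \subset \mathcal{X}_k$ if and only if $\bar{y}^*$ is a local maximizer of $P^d$ on its neighborhood $\mathcal{Y}_r^* \subset \mathcal{Y}_k^*$, and

$$P(\bar{x}) = \max_{x \in \mathcal{X}_r} P(x) = \max_{y^* \in \mathcal{Y}_r^*} P^d(y^*) = P^d(\bar{y}^*). \tag{38}$$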

Proof The proof of the statement (37) follows the original idea presented in the joint article with Strang [28]. By the convexity of the canonical function $\bar{W}(y)$, we know that the inequality
$$\bar{W}(y) - \bar{W}(\bar{y}) \ge \langle y - \bar{y} ;\, D\bar{W}(\bar{y}) \rangle \tag{39}$$
holds for all $y, \bar{y} \in Y$. For any given $x \in X$, we let $y = \Lambda(x)$, and particularly, for each solution $\bar{y}^*$ of the dual algebraic equation (31), we let $\bar{x}(\bar{y}^*) = (A + \bar{y}^* I)^{-1} f$, and $\bar{y} = \Lambda(\bar{x})$. Since $\Lambda$ is a quadratic operator, the Taylor expansion of $y = \Lambda(x)$ at $\bar{x}$ has only three terms:
$$\Lambda(x) = \Lambda(\bar{x}) + (\Lambda_t(\bar{x}))^T (x - \bar{x}) - \Lambda_c(x - \bar{x}) = \left( \tfrac{1}{2}|\bar{x}|^2 - \lambda \right) + \bar{x}^T (x - \bar{x}) + \tfrac{1}{2}|x - \bar{x}|^2,$$
where $\Lambda_t(\bar{x}) = \bar{x}^T$ is the Gâteaux derivative of $\Lambda(x)$ at $\bar{x}$, while $\Lambda_c(x) = -(1/2)|x|^2$ is the complementary operator of $\Lambda_t$ (see [28]). Thus, substituting $y = \Lambda(x)$ and $\bar{y} = \Lambda(\bar{x})$ into the inequality (39) leads to
$$P(x) - P(\bar{x}) \ge \langle x - \bar{x},\, (A + D_y\bar{W}(\Lambda(\bar{x})) I)\bar{x} - f \rangle + \tfrac{1}{2} \langle x - \bar{x},\, (A + D_y\bar{W}(\Lambda(\bar{x})) I)(x - \bar{x}) \rangle \quad \forall x \in X_k.$$
By Theorem 2 we know that for each solution $\bar{y}^*$ of the dual algebraic equation (31), $\bar{x}(\bar{y}^*)$ is a critical point of $P$, and $\bar{y}^*(\bar{x}) = D_y\bar{W}(\Lambda(\bar{x}))$, so the first term on the right vanishes. Thus if $A + \bar{y}^* I$ is positive definite, we have
$$P(x) - P(\bar{x}) \ge \tfrac{1}{2} \langle x - \bar{x},\, (A + \bar{y}^*(\bar{x}) I)(x - \bar{x}) \rangle \ge 0 \quad \forall x \in X_k.$$
This shows that for each solution $\bar{y}^*$ of the dual algebraic equation (31), if $A + \bar{y}^* I$ is positive definite, $\bar{x}(\bar{y}^*)$ is a global minimizer of $P(x)$ over $X_k$. Moreover, if $\bar{W}$ is strictly convex, then the inequality (39) holds strictly. Thus if $\bar{y}^* \in Y^*_s \cap Y^*_+$, and $\bar{x} = (A + \bar{y}^* I)^{-1} f$, then for all $x \in X_k$ such that $x \ne \bar{x}$, we have
$$P(x) - P(\bar{x}) > \tfrac{1}{2} \langle x - \bar{x},\, (A + \bar{y}^* I)(x - \bar{x}) \rangle > 0.$$
This shows that $\bar{x}$ is a unique global minimizer of $P$ over $X_k$. By the fact that $(\Lambda(\bar{x}), \bar{y}^*)$ is a canonical duality pair on $Y_a \times Y^*_a$, we know that the dual solution set $Y^*_s$ has a unique element $\bar{y}^* \in Y^*_+$.

If $\bar{y}^* \in Y^*_-$, then $(\bar{x}, \bar{y}^*)$ is a so-called super-critical point of the extended Lagrangian $\Xi(x, y^*)$, i.e. $\Xi(x, y^*)$ is locally concave in each of its variables $x$ and $y^*$ on the neighborhood $X_r \times Y^*_r$. In this case, we have
$$P(\bar{x}) = \max_{x \in X_r} \max_{y^* \in Y^*_r} \Xi(x, y^*) = \max_{y^* \in Y^*_r} \max_{x \in X_r} \Xi(x, y^*) = P^d(\bar{y}^*)$$
by the fact that the maxima of the super-Lagrangian $\Xi(x, y^*)$ can be taken in either order on the open set $X_r \times Y^*_r$ (see [17]). This proves the remaining part of the theorem and (38). $\square$

$^2$ The subspace $X_r \subset X_k$ is said to be the neighborhood of the critical point $\bar{x}$ if $\bar{x}$ is the only critical point of $P$ on $X_r$. The definition for the neighborhood $Y^*_r \subset Y^*_k$ is similar. In the case that $A$ is a matrix, the definitions of $X_r$ and $Y^*_r$ are given in the Remark following this theorem.

Remark 1 Theorem 3 can also be simply proved by the triality theory developed in [17], i.e. for the convex canonical function $\bar{W}(y)$ (but $W(x) = \bar{W}(y(x))$ may not be convex in $x$), its Legendre conjugate $\bar{W}^*(y^*)$ is also convex, then if $\bar{y}^* \in Y^*_+$, the extended Lagrangian (27)
$$\Xi(x, y^*) = \langle \Lambda(x) ; y^* \rangle - \bar{W}^*(y^*) + \tfrac{1}{2} \langle x, Ax \rangle - \langle x, f \rangle$$


is a saddle function at the critical point $(\bar{x}, \bar{y}^*)$. In this case, the classical saddle min–max theory leads to (37). If $\bar{y}^* \in Y^*_-$, then $\Xi(x, y^*)$ is a so-called super-Lagrangian in the neighborhood of $(\bar{x}, \bar{y}^*)$. In this case, the bi-duality theory developed in [17] proves the double max duality relation (38), as well as the double min duality relation
$$P(\bar{x}) = \min_{x \in X_r} P(x) = \min_{y^* \in Y^*_r} P^d(y^*) = P^d(\bar{y}^*), \tag{40}$$
under certain additional constraints.

In the case that the canonical function $\bar{W}(y)$ is concave, a parallel theorem can be obtained simply by applying the triality theory (see [14]), i.e. if $\bar{y}^* \in Y^*_-$, then $(\bar{x}, \bar{y}^*)$ is the so-called left-saddle point of $\Xi(x, y^*)$, and in this case we have
$$P(\bar{x}) = \max_{x \in X_k} P(x) = \min_{y^* \in Y^*_-} P^d(y^*) = P^d(\bar{y}^*). \tag{41}$$

Dually, if $\bar{y}^* \in Y^*_+$, then $(\bar{x}, \bar{y}^*)$ is the so-called subcritical point of $\Xi(x, y^*)$, and in this case the bi-duality theory leads to the double min (40) and the double max (38) duality relations under certain conditions.

Remark 2 In the case that $A \in \mathbb{R}^{n \times n}$ is a symmetric matrix with $m \le n$ distinct eigenvalues $a_1 > a_2 > \cdots > a_m$, the dual solution set $Y^*_s$ defined by (32) has at most $2m + 1$ real solutions $\bar{y}^*_i$ $(i = 1, 2, \ldots, 2m + 1)$, and the largest one $\bar{y}^*_1$ satisfies $\bar{y}^*_1 + a_m > 0$, which leads to a global minimizer $\bar{x}_1 = \bar{x}(\bar{y}^*_1)$ of $P$. For each given $r = 1, 2, \ldots, m - 1 < n$, the neighborhoods $X_r$ and $Y^*_r$ in the theorem can be defined as
$$X_r = \{ x \in X_k \mid -a_r < D\bar{W}(\Lambda(x)) < -a_{r+1} \}, \tag{42}$$
$$Y^*_r = \{ y^* \in Y^*_k \mid -a_r < y^* < -a_{r+1} \}. \tag{43}$$
Then in each $Y^*_r$, the dual algebraic equation (31) has at most two roots $-a_r < \bar{y}^*_{r1} \le \bar{y}^*_{r2} < -a_{r+1}$. The smaller root $\bar{y}^*_{r1}$ is a local maximizer of $P^d$, i.e.
$$P^d(\bar{y}^*_{r1}) = \max_{y^* \in Y^*_r} P^d(y^*), \tag{44}$$
while the root $\bar{y}^*_{r2}$ is a local minimizer of $P^d$, i.e.
$$P^d(\bar{y}^*_{r2}) = \min_{y^* \in Y^*_r} P^d(y^*). \tag{45}$$
Particularly, in the domain $Y^*_0 = \{ y^* \in Y^*_k \mid -\infty < y^* < -a_1 \}$, the dual algebraic equation (31) also has at most two roots $\bar{y}^*_{01} \le \bar{y}^*_{02} < -a_1$, and both of them are in $Y^*_0 = Y^*_-$. Thus, by Theorem 3 we know that $\bar{y}^*_{01}$ and the associated $\bar{x}_{01}$ are the biggest local maximizers of $P^d$ and $P$, respectively. In the domain $Y^*_m = \{ y^* \in Y^*_k \mid y^* > -a_m \}$, on the other hand, the matrix $A + y^* I$ is positive definite. By Theorem 3, the dual algebraic equation (31) has only one root $\bar{y}^*_m > -a_m$, which is the smallest local maximizer of $P^d$ on $Y^*_k$ (global maximizer on $Y^*_m = Y^*_+$), and the associated $\bar{x}_m$ is a global minimizer of $P$ on $X_k$.

5 APPLICATIONS TO UNCONSTRAINED GLOBAL OPTIMIZATION

The canonical dual transformation method and associated triality theory can be used to solve many nonconvex problems in engineering and science. Some applications in nonconvex mechanics have been given in the recent articles (see [25,26]). This section presents some examples in finite dimensional space.

5.1 Example with Quadratic $\bar{W}(y)$

First, let us consider the following unconstrained nonconvex stationary problem
$$P(x) = \tfrac{1}{2} x^T A x + \tfrac{1}{2} \alpha \left( \tfrac{1}{2}|x|^2 - \lambda \right)^2 - f^T x \to \mathrm{sta} \quad \forall x \in \mathbb{R}^n, \tag{46}$$
where $A \in \mathbb{R}^{n \times n}$ is a given symmetric matrix, $f \in \mathbb{R}^n$ is a given vector, and $\alpha, \lambda > 0$ are positive constants. In this problem, $W(x)$ is a fourth-order canonical polynomial (cf. [17]) in $X = \mathbb{R}^n$:
$$W(x) = \tfrac{1}{2} \alpha \left( \tfrac{1}{2}|x|^2 - \lambda \right)^2. \tag{47}$$
In continuous systems, this function is the so-called double-well potential, which was first studied by van der Waals in fluid mechanics in 1895. In two-dimensional space $\mathbb{R}^2$, $W(x_1, x_2)$ is the so-called "Mexican hat" function in cosmology and theoretical physics (see [18]). In $\mathbb{R}^n$, the nonconvex function $P(x)$ may have many local critical points, which depend on the matrix $A$ (see Fig. 2). In phase transitions of shape memory alloys, each local minimizer of $P$ corresponds to a certain phase state of the material, whereas each local maximizer characterizes the critical conditions that lead to the phase transitions. In unilateral post-bifurcation analysis, the solution of the post-buckling state is usually a local minimizer (see [17]).

To solve this nonconvex problem by the canonical dual transformation method, we let $X = \mathbb{R}^n = X^*$. The geometrical measure $y = \Lambda(x) = (1/2)|x|^2 - \lambda$ is a quadratic operator from $X = \mathbb{R}^n$ into $Y = \mathbb{R}$. By the fact that $(1/2)|x|^2 = y + \lambda \ge 0$ $\forall x \in X$, the range of the quadratic mapping $\Lambda$ is
$$Y_a = \{ y \in \mathbb{R} \mid y + \lambda \ge 0 \}.$$

FIGURE 2 Illustration of the nonconvex function $P(x)$ in $\mathbb{R}$: (a) graphs of $P(x)$ with one parameter fixed at 0.5; (b) graphs of $P(x)$ with $f = 0.7$.

Then on $Y_a$, the canonical function $\bar{W} : Y_a \to \mathbb{R}$ is simply a quadratic function $\bar{W}(y) = (1/2)\alpha y^2$. For a given $f \in \mathbb{R}^n$, the function $F(x) = \langle x, f \rangle - (1/2)\langle x, Ax \rangle = x^T f - (1/2) x^T A x$ is a quadratic function on $X_a = \mathbb{R}^n$. By the fact that $x^* = DF(x) = f - Ax$, the range of the canonical mapping $DF : X_a \to X^*$ is $X^*_a = \mathbb{R}^n$. The feasible set for the primal problem is $X_k = \{ x \in X_a \mid \Lambda(x) \in Y_a \} = \mathbb{R}^n$. Thus, the primal problem is to find all critical points of $P(x)$ such that
$$(P_{sta}): \quad P(x) = \tfrac{1}{2} \langle x, Ax \rangle + \tfrac{1}{2} \alpha (\Lambda(x))^2 - \langle x, f \rangle \to \mathrm{sta} \quad \forall x \in \mathbb{R}^n. \tag{48}$$

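The Euler equation associated with this nonconvex variational problem is a coupled nonlinear algebraic system in $\mathbb{R}^n$:
$$A\bar{x} + \alpha \left( \tfrac{1}{2}|\bar{x}|^2 - \lambda \right) \bar{x} = f.$$
Since the canonical function $\bar{W}(y) = (1/2)\alpha y^2$ is quadratic, the canonical dual relation $y^* = \alpha y$ is invertible on $Y_a$. The range of the canonical dual mapping $D\bar{W} : Y_a \to Y^*$ is
$$Y^*_a = \{ y^* \in \mathbb{R} \mid y^* \ge -\alpha\lambda \}.$$
For each $y^* \in Y^*_a$, the Legendre conjugate of $\bar{W}$,
$$\bar{W}^*(y^*) = \{ y y^* - \bar{W}(y) \mid D\bar{W}(y) = y^* \} = \tfrac{1}{2} \alpha^{-1} y^{*2},$$
is also a quadratic function. Thus, on the dual feasible space
$$Y^*_k = \{ y^* \in \mathbb{R} \mid \det(A + y^* I) \ne 0,\ y^* \ge -\alpha\lambda \}$$
the canonical dual problem is formulated as
$$(P^d_{sta}): \quad P^d(y^*) = -\tfrac{1}{2} f^T (A + y^* I)^{-1} f - \tfrac{1}{2} \alpha^{-1} y^{*2} - \lambda y^* \to \mathrm{sta} \quad \forall y^* \in Y^*_k. \tag{49}$$
The canonical dual algebraic equation associated with this dual problem is
$$\alpha^{-1} y^* + \lambda = \tfrac{1}{2} f^T (A + y^* I)^{-2} f. \tag{50}$$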
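As a brief check, the dual algebraic equation (50) can be read as the stationarity condition of the dual function (49). The short derivation below is a sketch that assumes $A$ is symmetric and uses only the identity $\frac{d}{dy^*}(A + y^* I)^{-1} = -(A + y^* I)^{-2}$; the last line recovers the canonical dual relation $y^* = \alpha y$ at the corresponding primal point.

```latex
\begin{aligned}
\frac{d}{dy^*}\Big[-\tfrac12 f^T (A + y^* I)^{-1} f\Big]
  &= \tfrac12 f^T (A + y^* I)^{-2} f, \\
\frac{dP^d}{dy^*}
  &= \tfrac12 f^T (A + y^* I)^{-2} f - \alpha^{-1} y^* - \lambda = 0
  \;\Longleftrightarrow\;
  \alpha^{-1} y^* + \lambda = \tfrac12 f^T (A + y^* I)^{-2} f, \\
\bar{x} = (A + \bar{y}^* I)^{-1} f
  &\;\Longrightarrow\;
  \bar{y}^* = \alpha\big(\tfrac12 |\bar{x}|^2 - \lambda\big) = \alpha\,\Lambda(\bar{x}).
\end{aligned}
```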

For the given $f \in \mathbb{R}^n$ and the parameters $\alpha, \lambda > 0$, if the symmetric matrix $A = A^T \in \mathbb{R}^{n \times n}$ has $m \le n$ distinct eigenvalues $a_1 > a_2 > \cdots > a_m$, this algebraic equation has at most $2m + 1$ real roots $\bar{y}^*_1 > \bar{y}^*_2 \ge \bar{y}^*_3 \ge \cdots \ge \bar{y}^*_{2m+1}$, which can be obtained by using MATHEMATICA. These dual solutions lead to at most $2m + 1$ critical points of $P(x)$:
$$\bar{x}_i = (A + \bar{y}^*_i I)^{-1} f, \quad i = 1, 2, \ldots, 2m + 1. \tag{51}$$
By Theorem 3, if $\alpha > 0$, then $\bar{x}_1 = (A + \bar{y}^*_1 I)^{-1} f$ is the global minimizer of $P(x)$, and $\bar{x}_{2m+1} = (A + \bar{y}^*_{2m+1} I)^{-1} f$ is a local maximizer of $P(x)$.
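The procedure in (50) and (51) can be sketched numerically. The block below is an illustration only, not the author's MATHEMATICA computation: it assumes NumPy, assumes that $A$ has distinct eigenvalues, and clears the denominators of (50) in the eigenbasis of $A$ to obtain a polynomial of degree $2n + 1$. The helper names `dual_roots` and `P` are hypothetical. The data are those of Example 2, with the signs $a_{22} = -0.6$ and $f_2 = -0.15$ inferred from the roots and critical points reported there.

```python
import numpy as np

def dual_roots(A, f, alpha, lam, tol=1e-8):
    """Real roots of (50): y/alpha + lam = 0.5 * f^T (A + y I)^{-2} f.

    Assumes A is symmetric with distinct eigenvalues.
    """
    d, Q = np.linalg.eigh(A)           # A = Q diag(d) Q^T
    g = Q.T @ f                        # f expressed in the eigenbasis of A

    def prod_sq(skip=None):
        # prod_j (y + d_j)^2, optionally leaving out the factor j == skip
        p = np.poly1d([1.0])
        for j, dj in enumerate(d):
            if j != skip:
                p = p * np.poly1d([1.0, dj]) ** 2
        return p

    # Clear denominators:
    # (y/alpha + lam) prod_j (y + d_j)^2 - 0.5 sum_i g_i^2 prod_{j != i} (y + d_j)^2 = 0
    poly = np.poly1d([1.0 / alpha, lam]) * prod_sq()
    for i, gi in enumerate(g):
        poly = poly - prod_sq(skip=i) * (0.5 * gi**2)
    roots = poly.roots
    real = roots[np.abs(roots.imag) < tol].real
    return np.sort(real)[::-1]         # largest root first, as in (51)

def P(x, A, f, alpha, lam):
    """Primal function (46)."""
    return 0.5 * x @ A @ x + 0.5 * alpha * (0.5 * x @ x - lam) ** 2 - f @ x

# Data of Example 2 (signs of a22 and f2 are inferred, not printed in the source)
A = np.diag([0.5, -0.6])
f = np.array([0.2, -0.15])
alpha, lam = 1.0, 1.3

y_bar = dual_roots(A, f, alpha, lam)
x_bar = [np.linalg.solve(A + y * np.eye(2), f) for y in y_bar]   # eq. (51)
for y, x in zip(y_bar, x_bar):
    print(f"y* = {y: .6f}   x = {x}   P(x) = {P(x, A, f, alpha, lam): .6f}")
```

Sorting the roots in decreasing order puts the candidate in $Y^*_+$ first, so `x_bar[0]` is the point that Theorem 3 identifies as the global minimizer whenever $A + \bar{y}^*_1 I$ is positive definite.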


Example 1 In the case of $n = 1$, the nonconvex function $P(x) = (1/2) a x^2 + (1/2)\alpha((1/2)x^2 - \lambda)^2 - cx$ has at most two potential wells and one local maximizer. Its canonical dual function
$$P^d(y^*) = -\tfrac{1}{2} c^2 (a + y^*)^{-1} - \tfrac{1}{2} \alpha^{-1} y^{*2} - \lambda y^* \tag{52}$$
is discontinuous at $y^* = -a$ (see Fig. 3). If we choose $a = -0.5$, $\lambda = 1.3$, $\alpha = 1.0$, and $c = 0.2$, the dual algebraic equation has three real roots:
$$\bar{y}^*_3 = -1.29378 < \bar{y}^*_2 = 0.391255 < \bar{y}^*_1 = 0.60253,$$
which give the three critical points $\bar{x}_i = c/(a + \bar{y}^*_i)$ of $P(x)$: $\bar{x}_1 = 1.95066$, $\bar{x}_2 = -1.83916$ and $\bar{x}_3 = -0.111496$. Since $\bar{y}^*_1 + a > 0$ and $\bar{y}^*_i + a < 0$ for $i = 2, 3$, by Theorem 3 we know that $\bar{x}_1 = 1.95066$ is a global minimizer, while $\bar{x}_3 = -0.111496$ is a local maximizer and $\bar{x}_2 = -1.83916$ is a local minimizer.

Example 2 In two-dimensional space $\mathbb{R}^2$, the nonconvex function $P(x)$ has at most $2n + 1 = 5$ critical points. We simply choose $A = \{a_{ij}\}$ with $a_{11} = 0.5$, $a_{22} = -0.6$, $a_{12} = a_{21} = 0$, and $f = \{f_1, f_2\}$ with $f_1 = 0.2$, $f_2 = -0.15$. For the given parameters $\lambda = 1.3$ and $\alpha = 1.0$, the graph of $P(x)$ is a nonconvex surface (see Fig. 4a) with four potential wells and one local maximizer. The graph of the canonical dual function $P^d(y^*)$ is a plane curve (see Fig. 4b). The canonical dual algebraic equation (50) has a total of five real roots:
$$\bar{y}^*_5 = -1.26234 < \bar{y}^*_4 = -0.680712 < \bar{y}^*_3 = -0.353665 < \bar{y}^*_2 = 0.520982 < \bar{y}^*_1 = 0.675737,$$
and we have
$$P^d(\bar{y}^*_5) > P^d(\bar{y}^*_4) = 0.772699 > P^d(\bar{y}^*_3) = 0.272349 > P^d(\bar{y}^*_2) = -0.690204 > P^d(\bar{y}^*_1).$$
Since $(A + \bar{y}^*_1 I)$ is positive definite, by Theorem 3 we know that $\bar{x}_1 = (A + \bar{y}^*_1 I)^{-1} f = \{0.170106, -1.98054\}$ is a global minimizer of $P(x)$, and $P(\bar{x}_1) = P^d(\bar{y}^*_1) = -1.27232$. By Theorem 3 (also from the graph of $P^d$), we know that $\bar{x}_5 = (A + \bar{y}^*_5 I)^{-1} f = \{-0.2623, 0.0805\}$ is the biggest local maximizer of $P$ and $P(\bar{x}_5) = P^d(\bar{y}^*_5) = 0.876567$, since $\bar{y}^*_5$ is a local maximizer of $P^d$ and $(A + \bar{y}^*_5 I)$ is negative definite.

FIGURE 3 Double-well energy $P(x)$ and its dual $P^d(y^*)$.
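The one-dimensional case of Example 1 is small enough to recheck directly. The sketch below assumes NumPy and the parameter signs $a = -0.5$, $c = 0.2$, $\lambda = 1.3$, $\alpha = 1.0$ (the sign of $a$ is inferred from the reported roots). It solves the cubic obtained from (50), evaluates $P$ and $P^d$ at each primal-dual pair to illustrate the zero duality gap, and uses the curvature $P''(x) = a + y^* + \alpha x^2$ to label each critical point. The function names are illustrative only.

```python
import numpy as np

a, lam, alpha, c = -0.5, 1.3, 1.0, 0.2   # Example 1 data (sign of a inferred)

def P(x):
    """Primal double-well function for n = 1."""
    return 0.5 * a * x**2 + 0.5 * alpha * (0.5 * x**2 - lam) ** 2 - c * x

def Pd(y):
    """Canonical dual function (52)."""
    return -0.5 * c**2 / (a + y) - 0.5 * y**2 / alpha - lam * y

# (y/alpha + lam) * (a + y)^2 - c^2 / 2 = 0 is a cubic in y.
cubic = np.poly1d([1.0 / alpha, lam]) * np.poly1d([1.0, a]) ** 2 - 0.5 * c**2
y_bar = np.sort(cubic.roots.real)[::-1]  # all three roots are real here
x_bar = c / (a + y_bar)                  # critical points, eq. (51)
curv = a + y_bar + alpha * x_bar**2      # P''(x) at each critical point

for y, x, k in zip(y_bar, x_bar, curv):
    kind = "minimizer" if k > 0 else "maximizer"
    print(f"y* = {y: .6f}  x = {x: .6f}  P = {P(x): .6f}  Pd = {Pd(y): .6f}  local {kind}")
```

Only the first root satisfies $a + \bar{y}^* > 0$, so only its critical point is certified as the global minimizer by Theorem 3; the other two are classified here by the sign of the curvature.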

FIGURE 4 Graphs of the primal function $P(x_1, x_2)$ and its canonical dual for Example 2: (a) graph of $P(x)$; (b) graph of $P^d(y^*)$.

Example 3 In a high-dimensional space with $n > 2$, it is very difficult to find all critical points and global minimizers of $P(x)$. However, the graph of the canonical dual function $P^d$ is only a plane curve. If $a_1 > a_2 > \cdots > a_m$ are distinct eigenvalues of $A$, then within each interval $-a_r < y^* < -a_{r+1}$, the canonical dual function $P^d(y^*)$ has at most two critical points $\bar{y}^*_{r1} \le \bar{y}^*_{r2}$, where $\bar{y}^*_{r1}$ is a local maximizer and $\bar{y}^*_{r2}$ is a local minimizer of $P^d$. For $n = 4$, we let $a_{11} = 2.5$, $a_{22} = 1.8$, $a_{33} = 0.5$, $a_{44} = 1.4$, $a_{ij} = 0$ for all $i \ne j$, and $f = (0.2, 0.5, 0.3, 0.2)^T$, with the remaining parameter set to 2.8. The graph of $P(x)$ is then in $\mathbb{R}^5$ and cannot be viewed directly. However, the graph of $P^d$ is shown in Fig. 5.

5.2 Example with Concave $\bar{W}(y)$

Now let us consider the following constrained nonconvex problem
$$(P): \quad P(x) = \tfrac{1}{2} x^T A x + \alpha \sqrt{\lambda - \tfrac{1}{2}|x|^2} - f^T x \to \mathrm{sta} \quad \forall x \in X_k, \tag{53}$$
where $\alpha, \lambda > 0$ are given parameters, and the feasible set $X_k = \{ x \in \mathbb{R}^n \mid 0 \le (1/2)|x|^2 \le \lambda \}$ is an $n$-dimensional ball with radius $\sqrt{2\lambda}$. Thus, by choosing the geometrical measure $y = \Lambda(x) = (1/2)|x|^2 - \lambda$, the canonical function
$$\bar{W}(y) = \alpha \sqrt{-y} \quad \forall y \le 0$$
is a concave canonical function defined on its domain $Y_a = \{ y \in \mathbb{R} \mid y \le 0 \}$. The canonical dual variable is $y^* = D\bar{W}(y) = -(1/2)\alpha(-y)^{-1/2}$. The range of the canonical dual mapping $D\bar{W} : Y_a \to Y^*_a \subset \mathbb{R}$ is also a set of negative real numbers, $Y^*_a = \{ y^* \in \mathbb{R} \mid y^* \le 0 \}$. The Legendre conjugate for this concave function is
$$\bar{W}^*(y^*) = \{ \langle y ; y^* \rangle - \bar{W}(y) \mid D\bar{W}(y) = y^*,\ y \in Y_a \} = \frac{\alpha^2}{4 y^*}.$$
Thus, on the dual feasible space
$$Y^*_k = \{ y^* \in Y^*_a \mid \det(A + y^* I) \ne 0 \} = \{ y^* \in \mathbb{R} \mid \det(A + y^* I) \ne 0,\ y^* \le 0 \},$$
the canonical dual problem is
$$P^d(y^*) = -\tfrac{1}{2} f^T (A + y^* I)^{-1} f - \frac{\alpha^2}{4 y^*} - \lambda y^* \to \mathrm{sta} \quad \forall y^* \in Y^*_k.$$
The dual algebraic equation takes the following form:
$$\lambda - \frac{\alpha^2}{4 y^{*2}} = \tfrac{1}{2} f^T (A + y^* I)^{-2} f. \tag{54}$$

FIGURE 5 Graph of $P^d(y^*)$ for the four-dimensional problem.

The number of solutions of this nonlinear equation depends on the number of positive eigenvalues of $A$. For $n = 1$, if we choose $A = \{a_{11}\} = 1.3 > 0$, $f = 0.62$, $\alpha = 1.0$, $\lambda = 2.0$, Eq. (54) has a total of four roots:
$$\bar{y}^*_1 = -1.61768 < \bar{y}^*_2 = -0.966937 < \bar{y}^*_3 = -0.375268 < \bar{y}^*_4 = 0.359885.$$
The positive root $\bar{y}^*_4 \notin Y^*_k$, so for $\bar{y}^*_i \in Y^*_k$, $i = 1, 2, 3$, the nonconvex problem $(P)$ has a total of three solutions in $X_k$:
$$\{\bar{x}_i\} = \{ (a + \bar{y}^*_i)^{-1} f \} = \{ -1.95165,\ 1.86151,\ 0.670465 \}.$$
Since $\bar{W}(y)$ is concave, and $(a + \bar{y}^*_1) < 0$ is negative, by the theorem we know that $\bar{x}_1$ is a global maximizer of $P(x)$ on $X_k$ (see Fig. 6).

In two-dimensional space, we let $x = (x_1, x_2) = (r \cos t, r \sin t)$; the parametric surface of the nonconvex function $P(x)$ is then shown in Fig. 7(a). If we choose $a_{11} = 1.3$, $a_{12} = a_{21} = 0$, $a_{22} = -0.4$ and $f = (0.62, -0.2)$, $\alpha = 1$, $\lambda = 2$, the canonical dual function $P^d$ has four critical points $\bar{y}^*_i = \{ -1.61809, -0.965843, -0.378999, 0.536445 \}$ (see Fig. 7b), which lead to four solutions of the primal problem.
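The one-dimensional numbers above can be reproduced with a few lines. This is a sketch under stated assumptions rather than the original computation: it uses NumPy, takes the data $a = 1.3$, $f = 0.62$, $\alpha = 1$, $\lambda = 2$, and clears the denominators of (54) to obtain the quartic $(4\lambda y^2 - \alpha^2)(a + y)^2 - 2 f^2 y^2 = 0$. Variable names are illustrative.

```python
import numpy as np

a, f, alpha, lam = 1.3, 0.62, 1.0, 2.0   # n = 1 data of the concave example

# Clearing denominators in (54):
# (4 lam y^2 - alpha^2) (a + y)^2 - 2 f^2 y^2 = 0
quartic = (np.poly1d([4.0 * lam, 0.0, -alpha**2]) * np.poly1d([1.0, a]) ** 2
           - np.poly1d([2.0 * f**2, 0.0, 0.0]))
roots = quartic.roots
y_all = np.sort(roots[np.abs(roots.imag) < 1e-8].real)

y_feasible = y_all[y_all < 0]            # dual feasibility requires y* <= 0
x_bar = f / (a + y_feasible)             # primal critical points in X_k

# Residual of the primal stationarity condition
# a x - alpha x / (2 sqrt(lam - x^2 / 2)) - f = 0
residual = a * x_bar - alpha * x_bar / (2.0 * np.sqrt(lam - 0.5 * x_bar**2)) - f

print("dual roots     :", y_all)
print("primal points  :", x_bar)
print("max |residual| :", np.max(np.abs(residual)))
```

Discarding the positive root before mapping back to $x$ mirrors the restriction $y^* \le 0$ in the dual feasible space; the residual check confirms that each remaining point is a critical point of $P$ inside the ball.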

FIGURE 6 Graphs of the primal function $P(x)$ and its canonical dual for concave $\bar{W}$.

FIGURE 7 Graphs of the primal function $P(x_1, x_2)$ and its canonical dual: (a) parametric surface of $P(x)$; (b) graph of $P^d(y^*)$.

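6 APPLICATION TO CONSTRAINED QUADRATIC PROGRAMMING

We now turn our attention to the constrained global optimization problems of the form:
$$\min \left\{ \tfrac{1}{2} \langle x, Ax \rangle - \langle x, f \rangle \;\middle|\; x \in X_k \right\}, \tag{55}$$
where the feasible space $X_k$ is defined as
$$X_k = \{ x \in \mathbb{R}^n \mid Bx \ge b \in \mathbb{R}^m,\ |x|^2 \le 2\lambda \}, \tag{56}$$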


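in which $B \in \mathbb{R}^{m \times n}$ is a given matrix, $b \in \mathbb{R}^m$ is a given vector, and $\lambda > 0$ is a constant. It is known that for an arbitrarily given symmetric matrix $A \in \mathbb{R}^{n \times n}$, this nonconvex quadratic programming problem is very difficult to solve by the traditional direct approaches. However, by the canonical dual transformation method, a complete set of solutions can also be obtained. To set this problem in our framework, we let the geometrical operator $\Lambda : \mathbb{R}^n \to \mathbb{R}^{m+1}$ be a vector-valued mapping:
$$y = \Lambda(x) = \left( Bx - b,\ \tfrac{1}{2}|x|^2 - \lambda \right) = (\varepsilon, \rho),$$
where $\varepsilon = Bx - b$ and $\rho = (1/2)|x|^2 - \lambda$. Let $Y = \mathbb{R}^{m+1} = Y^*$, and let
$$\mathbb{R}^m_+ = \{ \varepsilon \in \mathbb{R}^m \mid \varepsilon \ge 0 \in \mathbb{R}^m \}, \qquad \mathbb{R}_- = \{ \rho \in \mathbb{R} \mid \rho \le 0 \}$$
be the positive and negative cones in $\mathbb{R}^m$ and $\mathbb{R}$, respectively. Then, the canonical function $\bar{W} : Y \to \mathbb{R} \cup \{+\infty\}$ can be defined as the indicator of the convex sets $\mathbb{R}^m_+$ and $\mathbb{R}_-$:
$$\bar{W}(y) = I_{\mathbb{R}^m_+}(\varepsilon) + I_{\mathbb{R}_-}(\rho) = \begin{cases} 0 & \text{if } \varepsilon \in \mathbb{R}^m_+,\ \rho \in \mathbb{R}_-, \\ +\infty & \text{otherwise,} \end{cases}$$
which is convex and lower semicontinuous on $Y$. Its effective domain is
$$Y_a = \mathrm{dom}\, \bar{W} = \{ y = (\varepsilon, \rho) \in \mathbb{R}^{m+1} \mid \varepsilon \in \mathbb{R}^m_+,\ \rho \in \mathbb{R}_- \},$$
and the feasible space of the primal problem is $X_k = \{ x \in \mathbb{R}^n \mid \Lambda(x) \in Y_a \}$. On the space $X = \mathbb{R}^n$, the constrained primal problem (55) can be written in the canonical form: to find a global minimizer $\bar{x}$ such that
$$(P): \quad P(\bar{x}) = \min \left\{ \bar{W}(\Lambda(x)) + \tfrac{1}{2} \langle x, Ax \rangle - \langle x, f \rangle \;\middle|\; x \in X \right\}. \tag{57}$$
By the fact that the canonical function $\bar{W}(y)$ is convex and lower semicontinuous on $Y$, the canonical dual variable $y^* \in Y^*$ is defined by the subdifferential inclusion:
$$y^* \in \partial \bar{W}(y) = \begin{cases} (\varepsilon^*, \rho^*) & \text{if } \varepsilon^* \in \mathbb{R}^m_-,\ \rho^* \in \mathbb{R}_+, \\ \emptyset & \text{otherwise,} \end{cases}$$
where $\mathbb{R}^m_- = \{ \varepsilon^* \in \mathbb{R}^m \mid \varepsilon^* \le 0 \}$ and $\mathbb{R}_+ = \{ \rho^* \in \mathbb{R} \mid \rho^* \ge 0 \}$ are the dual cones of $\mathbb{R}^m_+$ and $\mathbb{R}_-$, respectively. The canonical conjugate $\bar{W}^\sharp$ of $\bar{W}$ can be obtained by the Fenchel transformation:
$$\bar{W}^\sharp(y^*) = \sup_{y \in Y} \{ \langle y ; y^* \rangle - \bar{W}(y) \} = \begin{cases} 0 & \text{if } \varepsilon^* \in \mathbb{R}^m_-,\ \rho^* \in \mathbb{R}_+, \\ +\infty & \text{otherwise.} \end{cases}$$
Its effective domain is
$$Y^*_a = \mathrm{dom}\, \bar{W}^\sharp = \{ y^* = (\varepsilon^*, \rho^*) \in \mathbb{R}^{m+1} \mid \varepsilon^* \in \mathbb{R}^m_-,\ \rho^* \in \mathbb{R}_+ \}.$$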


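Since the Fenchel sup-duality relations
$$y^* \in \partial \bar{W}(y) \iff y \in \partial \bar{W}^\sharp(y^*) \iff \bar{W}(y) + \bar{W}^\sharp(y^*) = \langle y ; y^* \rangle \tag{58}$$
hold on $Y \times Y^*$, we know that $(y, y^*)$ is a canonical dual pair on $Y \times Y^*$. On $Y_a \times Y^*_a$, the Fenchel sup-duality relations (58) are equivalent to the following KKT conditions:
$$Y_a \ni y \perp y^* \in Y^*_a.$$
The dual feasible space in this problem has the form
$$Y^*_k = \{ y^* = (\varepsilon^*, \rho^*) \in Y^*_a \mid \det(A + \rho^* I) \ne 0 \}.$$
For a fixed $y^* \in Y^*_k$, $F^\Lambda(y^*)$ can be well defined by the $\Lambda$-canonical dual transformation
$$F^\Lambda(y^*) = \{ \langle \Lambda(x) ; y^* \rangle - F(x) \mid DF(x) = \Lambda_t^*(x) y^*,\ x \in X_a \} = -\tfrac{1}{2} (f - B^T \varepsilon^*)^T (A + \rho^* I)^{-1} (f - B^T \varepsilon^*) - \lambda \rho^* - b^T \varepsilon^*.$$
Since $\bar{W}^\sharp(y^*) = 0$ $\forall y^* \in Y^*_k$, the canonical dual function $P^d : Y^*_k \to \mathbb{R}$ for this constrained problem can be obtained in the form
$$P^d(\varepsilon^*, \rho^*) = -\tfrac{1}{2} (f - B^T \varepsilon^*)^T (A + \rho^* I)^{-1} (f - B^T \varepsilon^*) - \lambda \rho^* - b^T \varepsilon^*. \tag{59}$$
We let
$$Y^*_{k+} := Y^*_k \cap Y^*_+ = \{ y^* \in Y^*_k \mid (A + \rho^* I) \text{ is positive definite} \}.$$
Clearly, for each given $\rho^* \ge 0$ such that $y^* = (\varepsilon^*, \rho^*) \in Y^*_{k+}$, $P^d(\varepsilon^*, \rho^*)$ is strictly concave in $\varepsilon^*$. On the other hand, for each given $\varepsilon^* \in \mathbb{R}^m$ such that $y^* = (\varepsilon^*, \rho^*) \in Y^*_{k+}$, $P^d(\varepsilon^*, \rho^*)$ is strictly concave in $\rho^*$. Thus, the constrained nonconvex optimization problem $(P)$ in $\mathbb{R}^n$ given by (57) can be converted into a canonical dual problem in $\mathbb{R}^{m+1}$, i.e.
$$(P^d): \quad \max P^d(\varepsilon^*, \rho^*) \quad \forall (\varepsilon^*, \rho^*) \in Y^*_{k+}. \tag{60}$$
In the case that $B = I \in \mathbb{R}^{n \times n}$ and $b = 0 \in \mathbb{R}^n$, the canonical dual function $P^d$ has a simple form:
$$P^d(\varepsilon^*, \rho^*) = -\tfrac{1}{2} (f - \varepsilon^*)^T (A + \rho^* I)^{-1} (f - \varepsilon^*) - \lambda \rho^*. \tag{61}$$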


A very simple one-dimensional example was given in the book [17] (Chap. 2). A detailed study of this canonical dual problem and its solutions in $\mathbb{R}^n$ is given in [24]. For a two-dimensional problem, if we let $f = (0.2, 0.3)$, $A = \{ a_{11} = 0.5,\ a_{22} = -0.1,\ a_{12} = a_{21} = 0 \}$, $B = I \in \mathbb{R}^{2 \times 2}$, $b = 0 \in \mathbb{R}^2$ and $\lambda = 4$, the canonical dual problem has four critical points (see Fig. 8):
$$\rho^*_4 = -0.571609 < \rho^*_3 = -0.427817 < \rho^*_2 = -0.00717494 < \rho^*_1 = 0.206601.$$
Since $\rho^*_1 > 0$ and $(A + \rho^*_1 I)$ is positive definite, $\rho^*_1$ is a global maximizer of $P^d$ on $Y^*_+$, which leads to the global minimizer $\bar{x}_1 = (A + \rho^*_1 I)^{-1} f = (0.283045, 2.81423) > (0, 0) \in \mathbb{R}^2$. From the graph and the contour of $P(x)$ (Fig. 9) we can see that the global minimizer is on the boundary of $X_k$ (it is easy to verify that $(1/2)|\bar{x}_1|^2 = 4$). This is the reason that the primal problem is NP-hard. However, the dual problem is a concave maximization problem and the global maximizer is in the interior of the dual feasible set $Y^*_k$.
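The four dual critical points of this two-dimensional example can be rechecked as follows. The sketch assumes NumPy, the diagonal data $a_{11} = 0.5$, $a_{22} = -0.1$ (the sign of $a_{22}$ is inferred from the reported values), $f = (0.2, 0.3)$, $\lambda = 4$, and additionally assumes that the multiplier $\varepsilon^*$ of the constraint $x \ge 0$ vanishes, so that stationarity of (61) in $\rho^*$ reduces to $\frac{1}{2} f^T (A + \rho^* I)^{-2} f = \lambda$. The reported roots appear consistent with that assumption.

```python
import numpy as np

a11, a22 = 0.5, -0.1      # diagonal entries of A (sign of a22 inferred)
f1, f2 = 0.2, 0.3
lam = 4.0                 # constant in the constraint |x|^2 <= 2 * lam

# With eps* = 0, stationarity of (61) in rho reads
#   0.5 * (f1^2 / (a11 + rho)^2 + f2^2 / (a22 + rho)^2) = lam.
# Clearing denominators gives a quartic in rho.
p1 = np.poly1d([1.0, a11]) ** 2          # (rho + a11)^2
p2 = np.poly1d([1.0, a22]) ** 2          # (rho + a22)^2
quartic = p1 * p2 * (2.0 * lam) - (p2 * f1**2 + p1 * f2**2)
roots = quartic.roots
rho = np.sort(roots[np.abs(roots.imag) < 1e-8].real)[::-1]

rho1 = rho[0]                            # largest root: A + rho1 * I is positive definite
x1 = np.array([f1 / (a11 + rho1), f2 / (a22 + rho1)])

print("dual critical points:", rho)
print("x_1 =", x1, "  0.5 * |x_1|^2 =", 0.5 * x1 @ x1)
```

The last printed value equals $\lambda$ up to rounding, which is the complementarity condition $\rho^* \left( \frac{1}{2}|x|^2 - \lambda \right) = 0$ with $\rho^*_1 > 0$: the norm constraint is active at the global minimizer.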

FIGURE 8 Graph of $P^d(y^*)$ for the two-dimensional problem.

FIGURE 9 Quadratic $P(x)$ and its contour: (a) graph of $P(x)$; (b) contour of $P(x)$.

7 CONCLUDING REMARKS

We have presented detailed applications of the recently developed canonical dual transformation method to the general nonconvex optimization problem proposed in (1). This problem is directly related to many important applications in mathematical physics. For the quadratic geometrical measure $\Lambda$, a canonical dual problem is formulated, i.e. the so-called perfect dual formulation with zero duality gap and without any perturbation. Based on the perfect duality theory, a complete set of solutions is obtained. Several examples are illustrated. The results show that the local minimizers and maximizers appear periodically in the order of the dual solutions. This phenomenon has been verified experimentally in superconductivity governed by the Landau–Ginzburg equation (see [26]).

The results presented in the last section (Section 6) are particularly interesting. Quadratic programming with only the norm constraint $|x| \le \sqrt{2\lambda}$ was studied recently by Powell [34]. Since the normality condition $|x| \le \sqrt{2\lambda}$ is a general constraint for any real problem in applications, the quadratic operator $\Lambda(x) = (1/2)|x|^2 - \lambda$ can be used to solve many nonconvex problems in quadratic and d.c. programming. The idea, results, and method presented in this article can be used and generalized to solve some difficult problems in global optimization, nonconvex mechanics, and scientific computations. The canonical dual transformation method for fully nonlinear systems (where $\Lambda$ is a general polynomial operator) was discussed in [14,17,19]. Compared with the traditional direct methods in global optimization problems, the main advantages of the canonical dual transformation method are the following: (1) it provides powerful and efficient primal–dual alternative approaches; (2) it converts nonsmooth/nonconvex constrained problems into smooth concave dual problems; (3) it reduces the dimensions in nonlinear programming.

Duality plays a key role in modern mathematics and science. The inner beauty of duality theory owes much to the fact that many different natural phenomena can be put in a unified mathematical framework (cf. Fig. 1 and [17]). Generally speaking, most physical variables appear in dual pairs. This one-to-one canonical duality relation serves as the foundation for the canonical dual transformation method. For any given nonlinear problem, as long as the geometrical operator $\Lambda$ is chosen properly and the trio-canonical forms can be characterized correctly, the canonical dual transformation method can be used to establish nice theoretical results, and to develop efficient alternative algorithms for robust computations. The extended Lagrange duality and triality theories might have a certain impact in some research fields.

Acknowledgments The author is sincerely grateful to Professor R. Greechie at Louisiana Tech University for his important comments and very valuable suggestions on the draft of this article, which will also affect the style of the author's future publications. Detailed remarks and important comments from Professor Alex Rubinov at Ballarat University, and from Professor C.J. Goh at University of Western Australia, are highly appreciated; they definitely improved the quality of this article.


References
[1] A.A. Atai and D. Steigmann (1998). Coupled deformations of elastic curves and surfaces. Int. J. Solids and Structures, 35, 1915–1952.
[2] G. Auchmuty (1983). Duality for non-convex variational principles. J. Diff. Equations, 50, 80–145.
[3] G. Auchmuty (2001). Variational principles for self-adjoint elliptic eigenproblems. In: D.Y. Gao, R.W. Ogden and G. Stavroulakis (Eds.), Nonconvex/Nonsmooth Mechanics: Modelling, Methods and Algorithms, p. 478. Kluwer Academic Publishers.
[4] H. Benson (1995). Concave minimization: theory, applications and algorithms. In: R. Horst and P. Pardalos (Eds.), Handbook of Global Optimization, pp. 43–148. Kluwer Academic Publishers.
[5] F.H. Clarke (1985). The dual action, optimal control, and generalized gradients. Mathematical Control Theory, pp. 109–119. Banach Center Publ., 14, PWN, Warsaw.
[6] J.P. Crouzeix (1981). Duality framework in quasiconvex programming. In: S. Schaible and W.T. Ziemba (Eds.), Generalized Convexity in Optimization and Economics, pp. 207–226. Academic Press.
[7] B. Dacorogna (1989). Direct Methods in the Calculus of Variations. Springer-Verlag.
[8] I. Ekeland (1977). Legendre duality in nonconvex optimization and calculus of variations. SIAM J. Control and Optimization, 15, 905–934.
[9] I. Ekeland (1990). Convexity Methods in Hamiltonian Mechanics, p. 247. Springer-Verlag.
[10] I. Ekeland (2003). Nonconvex duality. In: D.Y. Gao (Ed.), Proceedings of IUTAM Symposium on Duality, Complementarity and Symmetry in Nonlinear Mechanics (to appear). Kluwer Academic Publishers, Dordrecht/Boston/London.
[11] I. Ekeland and R. Temam (1976). Convex Analysis and Variational Problems. North-Holland.
[12] C.A. Floudas and V. Visweswaran (1995). Quadratic optimization. In: R. Horst and P.M. Pardalos (Eds.), Handbook of Global Optimization, pp. 217–270. Kluwer Academic Publishers, Dordrecht/Boston/London.
[13] D.Y. Gao (1997). Dual extremum principles in finite deformation theory with applications to post-buckling analysis of extended nonlinear beam theory. Applied Mechanics Reviews, 50(11), S64–S71.
[14] D.Y. Gao (1998). Duality, triality and complementary extremum principles in nonconvex parametric variational problems with applications. IMA J. Appl. Math., 61, 199–235.
[15] D.Y. Gao (1999a). Duality mathematics. Wiley Encyclopedia of Electrical and Electronics Engineering, 6, 68–77.
[16] D.Y. Gao (1999b). General analytic solutions and complementary variational principles for large deformation nonsmooth mechanics. Meccanica, 34, 169–198.
[17] D.Y. Gao (2000a). Duality Principles in Nonconvex Systems: Theory, Methods and Applications, Vol. xviii, p. 454. Kluwer Academic Publishers, Dordrecht/Boston/London.
[18] D.Y. Gao (2000b). Analytic solution and triality theory for nonconvex and nonsmooth variational problems with applications. Nonlinear Analysis, 42(7), 1161–1193.
[19] D.Y. Gao (2000c). Canonical dual transformation method and generalized triality theory in nonsmooth global optimization. J. Global Optimization, 17(1/4), 127–160.
[20] D.Y. Gao (2001a). Bi-duality in nonconvex optimization. In: C.A. Floudas and P.M. Pardalos (Eds.), Encyclopedia of Optimization, Vol. 1, pp. 477–482. Kluwer Academic Publishers, Dordrecht/Boston/London.
[21] D.Y. Gao (2001b). Tri-duality in global optimization. In: C.A. Floudas and P.M. Pardalos (Eds.), Encyclopedia of Optimization, Vol. 1, pp. 485–491. Kluwer Academic Publishers, Dordrecht/Boston/London.
[22] D.Y. Gao (2001c). Complementarity, polarity and triality in nonsmooth, nonconvex and nonconservative Hamilton systems. Philosophical Transactions of the Royal Society: Mathematical, Physical and Engineering Sciences, 359, 2347–2367.
[23] D.Y. Gao (2002). Duality and triality in non-smooth, nonconvex and nonconservative systems: a survey, new phenomena and new results. In: C. Baniotopoulos (Ed.), Nonsmooth/Nonconvex Mechanics with Applications in Engineering, pp. 1–14. Thessaloniki, Greece.
[24] D.Y. Gao (2003a). Perfect duality theory complete solutions for constrained optimization problems. In: D.Y. Gao and K.L. Teo (Eds.), J. Global Optimisation, special issue on duality (to be published).
[25] D.Y. Gao (2003b). Canonical dual principle, algorithm, and complete solutions to Landau-Ginzburg equation with applications. In: D. Steigmann (Ed.), Journal of Mathematics and Mechanics of Solids, special issue dedicated to Professor Ray Ogden for the occasion of his 60th birthday.
[26] D.Y. Gao, Jie-Fang Li and D. Viehland (2002). Complete solutions and triality theory to Landau-Ginzburg equation in imperfect ferroelectrics. In: W.Z. Chien (Ed.), Proceedings of the 4th International Conference on Nonlinear Mechanics. Submitted to Physics Review, Shanghai University Press.
[27] D.Y. Gao, R.W. Ogden and G. Stavroulakis (2001). Nonsmooth and Nonconvex Mechanics: Modelling, Analysis and Numerical Methods, Vol. xliv, p. 471. Kluwer Academic Publishers, Boston/Dordrecht/London.


[28] D.Y. Gao and G. Strang (1989a). Geometric nonlinearity: potential energy, complementary energy, and the gap function. Quart. Appl. Math., 47(3), 487–504.
[29] D.Y. Gao and G. Strang (1989b). Dual extremum principles in finite deformation elastoplastic analysis. Acta Appl. Math., 17, 257–267.
[30] R.N. Gasimov (2002). Augmented Lagrangian duality and nondifferentiable optimization methods in nonconvex programming. J. Global Optimization, 24, 187–203.
[31] C.J. Goh and X.Q. Yang (2002). Duality in Optimization and Variational Inequalities, p. 329. Taylor and Francis.
[32] R. Horst, P.M. Pardalos and N.V. Thoai (2000). Introduction to Global Optimization. Kluwer Academic Publishers.
[33] R.T. Rockafellar and R.J.B. Wets (1997). Variational Analysis. Springer, Berlin, New York.
[34] M.J.D. Powell (2002). UOBYQA: unconstrained optimization by quadratic approximation. Mathematical Programming, Series B, 92(3), 555–582.
[35] A.M. Rubinov and R.N. Gasimov (2003). Scalarization and nonlinear scalar duality for vector optimization with preferences that are not necessarily a pre-order relation. J. Global Optimization (special issue on Duality edited by D.Y. Gao and K.L. Teo), to appear.
[36] A.M. Rubinov and X.Q. Yang (2003). Lagrange-type Functions in Constrained Non-convex Optimization, p. 285. Kluwer Academic Publishers, Boston/Dordrecht/London.
[37] A.M. Rubinov, X.Q. Yang and B.M. Glover (2001). Extended Lagrange and penalty functions in optimization. J. Optim. Theory Appl., 111(2), 381–405.
[38] M.J. Sewell (1987). Maximum and Minimum Principles, p. 468. Cambridge Univ. Press.
[39] I. Singer (1998). Duality for optimization and best approximation over finite intersections. Numer. Funct. Anal. Optim., 19(7–8), 903–915.
[40] G. Strang (1986). Introduction to Applied Mathematics, p. 758. Wellesley-Cambridge Press.
[41] B. Tabarrok and F.P.J. Rimrott (1994). Variational Methods and Complementary Formulations in Dynamics. Kluwer Academic Publishers, Dordrecht.
[42] P.T. Thach (1993). Global optimality criterion and a duality with a zero gap in nonconvex optimization. SIAM J. Math. Anal., 24(6), 1537–1556.
[43] P.T. Thach (1995). Diewert-Crouzeix conjugation for general quasiconvex duality and applications. J. Optim. Theory Appl., 86(3), 719–743.
[44] P.T. Thach, H. Konno and D. Yokota (1996). Dual approach to minimization on the set of Pareto-optimal solutions. J. Optim. Theory Appl., 88(3), 689–707.
[45] J.F. Toland (1978). Duality in nonconvex optimization. J. Mathematical Analysis and Applications, 66, 399–415.
[46] H. Tuy (1995). D.C. optimization: theory, methods and algorithms. In: R. Horst and P. Pardalos (Eds.), Handbook of Global Optimization, pp. 149–216. Kluwer Academic Publishers.
[47] M. Walk (1989). Theory of Duality in Mathematical Programming. Springer-Verlag, Wien/New York.
[48] M.H. Wright (1998). The interior-point revolution in constrained optimization. In: R. DeLeone, A. Murli, P.M. Pardalos and G. Toraldo (Eds.), High-performance Algorithms and Software in Nonlinear Optimization, pp. 359–381. Kluwer Academic Publishers, Dordrecht.
[49] R.T. Rockafellar (1974). Conjugate Duality and Optimization. SIAM, Philadelphia.