Global optimal solutions to nonconvex optimisation


Mathematics and Mechanics of Solids

Global optimal solutions to nonconvex optimisation problems with a sum of double-well and log-sum-exp functions

Yi Chen, David Y Gao and John Yearwood

arXiv:1308.4732v2 [math.OC] 23 Aug 2013

School of Science, Information Technology and Engineering, University of Ballarat, Victoria, Australia

Abstract This paper presents a canonical dual approach for solving a nonconvex global optimisation problem with a sum of double-well and log-sum-exp functions. Such problems arise extensively in mechanics, robot design, information theory and network communication systems, and they include fourth-order polynomial minimisation problems and minimax problems. Based on the canonical duality theory, this nonconvex problem is transformed into an equivalent dual problem, and the triality theory shows that, under a certain condition, the dual problem can be solved easily and, correspondingly, the global solution of the primal problem can be obtained analytically from the dual solution. The relationships between local extrema of the primal problem and the dual problem are also discussed. Furthermore, two specific problems, a fourth-order polynomial minimisation problem and a minimax problem, are discussed, and situations in which the condition in the triality theory holds are presented. Finally, several numerical examples are provided to illustrate the application of the canonical duality theory to this problem.

Keywords Global optimization, canonical duality theory, double-well function, log-sum-exp function, polynomial minimisation, minimax problems

1 Introduction

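In this paper, we are interested in the following nonconvex global optimization problem:

$$(\mathcal{P}):\quad \min\left\{ \Pi(x) := \frac{1}{2}x^TAx - f^Tx + W(x) + T(x) \;\middle|\; x\in\mathbb{R}^n \right\}, \tag{1}$$

in which $A\in\mathbb{R}^{n\times n}$ is a symmetric matrix, $f\in\mathbb{R}^n$, and the double-well function $W(x)$ and the log-sum-exp function $T(x)$ are defined as

$$W(x) := \sum_{i=1}^{r}\frac{\alpha_i}{2}\left(\frac{1}{2}x^TB_ix - b_i^Tx + c_i\right)^2, \tag{2}$$

$$T(x) := \frac{1}{\beta}\log\left[1+\sum_{i=1}^{p}\exp\left(\beta\left(\frac{1}{2}x^TQ_ix - q_i^Tx + d_i\right)\right)\right], \tag{3}$$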

where $Q_i\in\mathbb{R}^{n\times n}$ and $B_i\in\mathbb{R}^{n\times n}$ are symmetric matrices, $q_i\in\mathbb{R}^n$, $b_i\in\mathbb{R}^n$, $d_i$ and $c_i$ are any real numbers, and $\alpha_i$ and $\beta$ are positive real numbers.

Corresponding author: David Y Gao, School of Science, Information Technology and Engineering, University of Ballarat, Victoria, Australia. Email: [email protected]

Chen et al


The double-well function is well known in modelling potential energy; for example, in [19] it was used to model the post-buckling of beams. The log-sum-exp function is one of the fundamental functions in numerical analysis, and it is widely applied to minimax problems [22, 23, 24, 26], which arise in areas including plasticity theory [27], nonsmooth variational problems [18], structural optimisation [3] and robot manipulator design [1, 2, 25]. Another significant application of the log-sum-exp function is in geometric programming, where the objective and constraint functions are posynomial or monomial functions that are rewritten as convex log-sum-exp functions [5, 6, 20].

The canonical duality theory was originally developed for handling general nonconvex and/or nonsmooth systems [8]. It has been applied successfully to many problems arising in global optimization and nonconvex nonsmooth analysis, such as quadratic problems [7, 10, 12, 14], polynomial optimisation [11], transportation problems [15], location problems [13] and the max-cut problem [28]. Efficient algorithms based on the canonical duality theory have also been developed [16].

The purpose of this paper is to apply the canonical duality theory to solve globally the nonconvex optimisation problem presented above. By introducing geometrical operators and canonical functions, a canonical dual problem is constructed that is equivalent to the primal problem. The triality theory shows that, under a certain condition, the dual problem can be solved easily. This condition is the existence of a critical point in the positive semidefinite region of the feasible space of the dual problem. Correspondingly, the global solution of the primal problem can be obtained analytically from this critical point. This is called the min-max duality.

The triality theory also describes the relationships between local extrema of the primal problem and the dual problem, which are stated as the double-min and double-max dualities. Then two problems, a fourth-order polynomial minimisation problem and a minimax problem, are discussed in detail. For these two specific problems, conditions for the existence of a critical point in the interior of the positive semidefinite region are presented, which are derived from the coefficients of the primal problems. At the end of the paper, several numerical examples are provided to illustrate the application of the canonical duality theory to the problems discussed.

The rest of this paper is arranged as follows. In Section 2, the canonical duality theory is applied to the problem (P). The relationships between solutions of the primal and dual problems are discussed by means of the triality theory in Section 3. In Section 4, two specific problems, a fourth-order polynomial minimisation problem and a minimax problem, are discussed. In Section 5, several examples are provided to illustrate the canonical duality theory. Finally, some conclusions are given in Section 6.

2 Canonical dual problem and analytical solutions

In this section, we apply the canonical duality theory to the problem (P). We first introduce the following two geometrical operators:

$$\xi(x) := \left\{\frac{1}{2}x^TQ_ix - q_i^Tx + d_i\right\}_{i=1}^{p} : \mathbb{R}^n\to\mathcal{E}_1\subseteq\mathbb{R}^p, \tag{4}$$

$$\eta(x) := \left\{\frac{1}{2}x^TB_ix - b_i^Tx + c_i\right\}_{i=1}^{r} : \mathbb{R}^n\to\mathcal{E}_2\subseteq\mathbb{R}^r, \tag{5}$$

and the following two canonical functions:

$$V_1(\xi) := \frac{1}{\beta}\log\left[1+\sum_{i=1}^{p}\exp(\beta\xi_i)\right], \tag{6}$$

$$V_2(\eta) := \sum_{i=1}^{r}\frac{\alpha_i}{2}\eta_i^2, \tag{7}$$

where $\xi_i$ is the $i$th component of $\xi$, and $\eta_i$ is the $i$th component of $\eta$. It is easy to verify that $\xi(x)$ and $\eta(x)$ are twice Gâteaux differentiable, and that $V_1(\xi)$ and $V_2(\eta)$ are convex.
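The decomposition $\Pi(x) = \frac12 x^TAx - f^Tx + V_1(\xi(x)) + V_2(\eta(x))$ can be illustrated with a minimal Python sketch. The function names (`quad`, `xi`, `eta`, `primal`) are hypothetical helpers introduced only for this illustration, and the data are our reading of the one-dimensional Example 1 in Section 5 ($A=0$, $f=0.8$, $Q_1=1$, $q_1=0$, $d_1=-0.1$, $B_1=2$, $b_1=0$, $c_1=-1$, $\alpha_1=10$, $\beta=1$), which is an assumption rather than something stated in this form in the text.

```python
import math

def quad(M, v, s, x):
    """Return 0.5*x^T M x - v^T x + s for a square matrix M given as a list of rows."""
    n = len(x)
    xMx = sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * xMx - sum(v[i] * x[i] for i in range(n)) + s

def xi(x, Qs, qs, ds):
    """Geometrical operator (4): one quadratic per log-sum-exp term."""
    return [quad(Q, q, d, x) for Q, q, d in zip(Qs, qs, ds)]

def eta(x, Bs, bs, cs):
    """Geometrical operator (5): one quadratic per double-well term."""
    return [quad(B, b, c, x) for B, b, c in zip(Bs, bs, cs)]

def V1(xi_vals, beta):
    """Canonical function (6)."""
    return math.log(1.0 + sum(math.exp(beta * t) for t in xi_vals)) / beta

def V2(eta_vals, alphas):
    """Canonical function (7)."""
    return sum(0.5 * a * e * e for a, e in zip(alphas, eta_vals))

def primal(x, A, f, Qs, qs, ds, Bs, bs, cs, alphas, beta):
    """Pi(x) assembled from the quadratic part and the two canonical functions."""
    return (quad(A, f, 0.0, x)
            + V1(xi(x, Qs, qs, ds), beta)
            + V2(eta(x, Bs, bs, cs), alphas))

# Assumed data of Example 1 (n = p = r = 1).
example1 = dict(A=[[0.0]], f=[0.8],
                Qs=[[[1.0]]], qs=[[0.0]], ds=[-0.1],
                Bs=[[[2.0]]], bs=[[0.0]], cs=[-1.0],
                alphas=[10.0], beta=1.0)
x_bar = [1.004894]
value = primal(x_bar, **example1)
# The same objective written out directly, for comparison.
direct = (math.log(1.0 + math.exp(0.5 * x_bar[0] ** 2 - 0.1))
          + 5.0 * (x_bar[0] ** 2 - 1.0) ** 2 - 0.8 * x_bar[0])
print(value, direct)
```

Both printed numbers should agree, which is all the sketch is meant to show: the canonical functions only reorganise the objective, they do not change it.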


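We define the following two operators:

$$\tau := \nabla V_1(\xi) = \left\{\frac{\exp(\beta\xi_i)}{1+\sum_{k=1}^{p}\exp(\beta\xi_k)}\right\}_{i=1}^{p} : \mathcal{E}_1\to\mathcal{E}_1^*, \tag{8}$$

$$\sigma := \nabla V_2(\eta) = \{\alpha_i\eta_i\}_{i=1}^{r} : \mathcal{E}_2\to\mathcal{E}_2^*, \tag{9}$$

where $\mathcal{E}_1^*\subseteq\{\tau\in\mathbb{R}^p \mid \tau>0,\ e^T\tau<1\}$ and $\mathcal{E}_2^*\subseteq\mathbb{R}^r$. Thus, from the Legendre transformation, we have the following relationships:

$$V_1(\xi) + V_1^*(\tau) = \xi^T\tau, \tag{10}$$

$$V_2(\eta) + V_2^*(\sigma) = \eta^T\sigma, \tag{11}$$

where $V_1^*$ and $V_2^*$ are the conjugate functions of $V_1$ and $V_2$, respectively:

$$V_1^*(\tau) := \frac{1}{\beta}\left[\sum_{i=1}^{p}\tau_i\log(\tau_i) + \left(1-\sum_{i=1}^{p}\tau_i\right)\log\left(1-\sum_{i=1}^{p}\tau_i\right)\right], \tag{12}$$

$$V_2^*(\sigma) := \sum_{i=1}^{r}\frac{1}{2\alpha_i}\sigma_i^2. \tag{13}$$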
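The Legendre equalities (10) and (11) can be checked numerically. The following is a minimal sketch: the test points and parameters are arbitrary hypothetical values, and the helper names are ours. It also evaluates the Fenchel gap $V_1(\xi)+V_1^*(\tau')-\xi^T\tau'$ at a dual point $\tau'$ that is not $\nabla V_1(\xi)$, where the gap is expected to be nonnegative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def V1(xi, beta):
    return math.log(1.0 + sum(math.exp(beta * t) for t in xi)) / beta

def grad_V1(xi, beta):
    """tau = grad V1(xi), equation (8)."""
    denom = 1.0 + sum(math.exp(beta * t) for t in xi)
    return [math.exp(beta * t) / denom for t in xi]

def V1_conj(tau, beta):
    """Conjugate function (12); needs tau > 0 and sum(tau) < 1."""
    rest = 1.0 - sum(tau)
    return (sum(t * math.log(t) for t in tau) + rest * math.log(rest)) / beta

def V2(eta, alpha):
    return sum(0.5 * a * e * e for a, e in zip(alpha, eta))

def grad_V2(eta, alpha):
    """sigma = grad V2(eta), equation (9)."""
    return [a * e for a, e in zip(alpha, eta)]

def V2_conj(sigma, alpha):
    """Conjugate function (13)."""
    return sum(s * s / (2.0 * a) for a, s in zip(alpha, sigma))

# Arbitrary (hypothetical) test data.
beta = 2.5
xi = [0.3, -1.2]
eta = [0.7, -0.4, 1.5]
alpha = [1.0, 2.0, 0.5]

tau = grad_V1(xi, beta)
sigma = grad_V2(eta, alpha)
gap1 = V1(xi, beta) + V1_conj(tau, beta) - dot(xi, tau)         # equality (10)
gap2 = V2(eta, alpha) + V2_conj(sigma, alpha) - dot(eta, sigma)  # equality (11)

# At a dual point that is not the gradient, only Fenchel's inequality holds.
tau_other = [0.2, 0.5]
fenchel_gap = V1(xi, beta) + V1_conj(tau_other, beta) - dot(xi, tau_other)
print(gap1, gap2, fenchel_gap)
```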

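Let $m = p + r$, $\zeta = (\tau,\sigma)$, $\mathcal{E}_a = \mathcal{E}_1\times\mathcal{E}_2$ and $\mathcal{E}_a^* = \mathcal{E}_1^*\times\mathcal{E}_2^*$. Replacing $V_1(\xi(x))$ and $V_2(\eta(x))$ in $\Pi(x) = V_1(\xi(x)) + V_2(\eta(x)) + \frac12x^TAx - f^Tx$ by means of the Legendre equalities (10) and (11), the so-called generalized total complementary function $\Xi:\mathbb{R}^n\times\mathcal{E}_a^*\to\mathbb{R}$ can be defined as

$$\begin{aligned}\Xi(x,\zeta) &:= \tau^T\xi(x) - V_1^*(\tau) + \sigma^T\eta(x) - V_2^*(\sigma) + \frac{1}{2}x^TAx - f^Tx\\ &= \frac{1}{2}x^TG_ax - f_a^Tx + d^T\tau - V_1^*(\tau) + c^T\sigma - V_2^*(\sigma),\end{aligned} \tag{14}$$

where $G_a := A + \sum_{i=1}^{p}\tau_iQ_i + \sum_{i=1}^{r}\sigma_iB_i$ and $f_a := f + \sum_{i=1}^{p}\tau_iq_i + \sum_{i=1}^{r}\sigma_ib_i$. Thus, for any given $\zeta\in\mathcal{E}_a^*$, the canonical dual function $\Pi^d(\zeta)$ is defined as

$$\Pi^d(\zeta) := \mathrm{sta}\left\{\Xi(x,\zeta) \mid x\in\mathbb{R}^n\right\}, \tag{15}$$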

where the notation $\mathrm{sta}\{\cdot\}$ represents the task of finding stationary points of $\Xi(x,\zeta)$ with respect to $x$. Notice that for any given $\zeta$, the total complementary function $\Xi(x,\zeta)$ is a quadratic function of $x$, and its stationary points are the solutions of the equation system

$$\nabla_x\Xi(x,\zeta) = G_ax - f_a = 0. \tag{16}$$

If $f_a\in\mathrm{Col}(G_a)$, then $x$ can be solved analytically as $x = G_a^\dagger f_a$, in which $G_a^\dagger$ denotes the generalized inverse of $G_a$. Thus the canonical dual function $\Pi^d(\zeta)$ can be written explicitly as

$$\Pi^d(\zeta) = -\frac{1}{2}f_a^TG_a^\dagger f_a + c^T\sigma - \sum_{i=1}^{r}\frac{1}{2\alpha_i}\sigma_i^2 + d^T\tau - \frac{1}{\beta}\left[\sum_{i=1}^{p}\tau_i\log(\tau_i) + \left(1-\sum_{i=1}^{p}\tau_i\right)\log\left(1-\sum_{i=1}^{p}\tau_i\right)\right]. \tag{17}$$

Let $\mathcal{S}_a := \{\zeta \mid \zeta\in\mathcal{E}_a^*,\ f_a\in\mathrm{Col}(G_a)\}$. The canonical dual problem for the primal problem (P), therefore, is defined as follows:

$$(\mathcal{P}^d):\quad \mathrm{sta}\left\{\Pi^d(\zeta) \mid \zeta\in\mathcal{S}_a\right\}. \tag{18}$$
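The step from the total complementary function to the explicit dual function can be spelled out as a short worked substitution. It is only a sketch under two assumptions: $f_a\in\mathrm{Col}(G_a)$, as required in the definition of $\mathcal{S}_a$, and the generalized inverse being the Moore-Penrose inverse, so that $G_a^\dagger$ is symmetric and $G_a^\dagger G_aG_a^\dagger = G_a^\dagger$. Substituting $x = G_a^\dagger f_a$ into the second line of (14) gives (17):

```latex
\begin{aligned}
\Xi(G_a^\dagger f_a,\zeta)
 &= \tfrac12 f_a^T G_a^\dagger G_a G_a^\dagger f_a - f_a^T G_a^\dagger f_a
    + d^T\tau - V_1^*(\tau) + c^T\sigma - V_2^*(\sigma) \\
 &= -\tfrac12 f_a^T G_a^\dagger f_a
    + d^T\tau - V_1^*(\tau) + c^T\sigma - V_2^*(\sigma)
  = \Pi^d(\zeta).
\end{aligned}
```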

The following theorem states that there is no duality gap between the primal problem (P) and the canonical dual problem $(\mathcal{P}^d)$. Its proof is analogous to that in [9] and is omitted here.


Theorem 1 (Analytical Solution and Complementary-Dual Principle [9, 17]) The problem $(\mathcal{P}^d)$ is canonically dual to the problem (P) in the sense that if $\bar\zeta\in\mathcal{S}_a$ is a critical point of $\Pi^d(\zeta)$, then

$$\bar x = G_a^\dagger f_a \tag{19}$$

is a critical point of $\Pi(x)$, the pair $(\bar x,\bar\zeta)$ is a critical point of $\Xi(x,\zeta)$, and we have

$$\Pi(\bar x) = \Xi(\bar x,\bar\zeta) = \Pi^d(\bar\zeta). \tag{20}$$

3 Triality theory

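In this section we study the conditions for local and global optima of the primal and dual problems. We focus the discussion on the following two regions of the dual space:

$$\mathcal{S}_a^+ := \{\zeta\in\mathcal{S}_a \mid G_a\succeq 0\},\qquad \mathcal{S}_a^- := \{\zeta\in\mathcal{S}_a \mid G_a\prec 0\}.$$

For convenience, we first give the first and second derivatives of the functions $\Pi(x)$ and $\Pi^d(\zeta)$:

$$\nabla\Pi(x) = G_ax - f_a, \tag{21}$$

$$\nabla^2\Pi(x) = G_a + FDF^T, \tag{22}$$

$$\nabla\Pi^d(\zeta) = \begin{pmatrix}\left\{\dfrac{1}{2}f_a^TG_a^\dagger Q_iG_a^\dagger f_a - q_i^TG_a^\dagger f_a + d_i - \dfrac{1}{\beta}\log\left(\dfrac{\tau_i}{1-\sum_{k=1}^{p}\tau_k}\right)\right\}_{i=1}^{p}\\[2ex] \left\{\dfrac{1}{2}f_a^TG_a^\dagger B_iG_a^\dagger f_a - b_i^TG_a^\dagger f_a + c_i - \dfrac{\sigma_i}{\alpha_i}\right\}_{i=1}^{r}\end{pmatrix}, \tag{23}$$

$$\nabla^2\Pi^d(\zeta) = -F^TG_a^\dagger F - D^{-1}, \tag{24}$$

where, in (21) and (22), $G_a$ and $f_a$ are evaluated at $\tau = \nabla V_1(\xi(x))$ and $\sigma = \nabla V_2(\eta(x))$, in (24) $x = G_a^\dagger f_a$, and $F\in\mathbb{R}^{n\times m}$ and $D\in\mathbb{R}^{m\times m}$ are defined as

$$F := \left[Q_1x - q_1,\ \dots,\ Q_px - q_p,\ B_1x - b_1,\ \dots,\ B_rx - b_r\right],\qquad D := \begin{pmatrix}\beta\left(\mathrm{diag}(\tau) - \tau\tau^T\right) & 0\\ 0 & \mathrm{diag}(\alpha)\end{pmatrix}.$$

We also need the following two lemmas. Their proofs are analogous to those in [17] and are omitted.

Lemma 2 Suppose that $m<n$, $\bar\zeta\in\mathcal{S}_a^-$ is a critical point and a local minimizer of $\Pi^d(\zeta)$, and $\bar x = G_a^\dagger f_a$. Then there exists a matrix $L\in\mathbb{R}^{n\times m}$ with $\mathrm{rank}(L) = m$ such that

$$L^T\nabla^2\Pi(\bar x)L\succeq 0. \tag{25}$$

Lemma 3 Suppose that $m>n$, $\bar\zeta\in\mathcal{S}_a^-$ is a critical point of $\Pi^d(\zeta)$, and $\bar x = G_a^\dagger f_a$ is a local minimizer of $\Pi(x)$. Then there exists a matrix $P\in\mathbb{R}^{m\times n}$ with $\mathrm{rank}(P) = n$ such that

$$P^T\nabla^2\Pi^d(\bar\zeta)P\succeq 0. \tag{26}$$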

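Since $\mathrm{rank}(L) = m$, the columns of the matrix $L$ are linearly independent. Similarly, the columns of the matrix $P$ are linearly independent. We define the sets generated by the columns of $L$ and $P$, respectively, as

$$\mathcal{X}_L := \{x\in\mathbb{R}^n \mid x = \bar x + L\theta,\ \theta\in\mathbb{R}^m\},\qquad \mathcal{S}_P := \{\zeta\in\mathbb{R}^m \mid \zeta = \bar\zeta + P\vartheta,\ \vartheta\in\mathbb{R}^n\}.$$

Theorem 4 (Triality Theorem) Suppose that $\bar\zeta$ is a critical point of $\Pi^d(\zeta)$, and $\bar x = G_a^\dagger f_a$.

1. If $\bar\zeta\in\mathcal{S}_a^+$, then the canonical min-max duality holds in the form of

$$\Pi(\bar x) = \min_{x\in\mathbb{R}^n}\Pi(x) = \max_{\zeta\in\mathcal{S}_a^+}\Pi^d(\zeta) = \Pi^d(\bar\zeta). \tag{27}$$

2. If $\bar\zeta\in\mathcal{S}_a^-$, then there exists a neighborhood $\mathcal{X}_0\times\mathcal{S}_0\subset\mathbb{R}^n\times\mathcal{S}_a^-$ of $(\bar x,\bar\zeta)$ such that the double-max duality holds in the form of

$$\Pi(\bar x) = \max_{x\in\mathcal{X}_0}\Pi(x) = \max_{\zeta\in\mathcal{S}_0}\Pi^d(\zeta) = \Pi^d(\bar\zeta). \tag{28}$$

3. If $\bar\zeta\in\mathcal{S}_a^-$, then the double-min duality holds conditionally as follows:

(a) if $m = n$, then there exists a neighborhood $\mathcal{X}_0\times\mathcal{S}_0\subset\mathbb{R}^n\times\mathcal{S}_a^-$ of $(\bar x,\bar\zeta)$ such that

$$\Pi(\bar x) = \min_{x\in\mathcal{X}_0}\Pi(x) = \min_{\zeta\in\mathcal{S}_0}\Pi^d(\zeta) = \Pi^d(\bar\zeta); \tag{29}$$

(b) if $m<n$ and $\bar\zeta$ is a local minimizer of $\Pi^d(\zeta)$, then $\bar x$ is a saddle point of $\Pi(x)$, and there exists a neighborhood $(\mathcal{X}_0\cap\mathcal{X}_L)\times\mathcal{S}_0\subset\mathbb{R}^n\times\mathcal{S}_a^-$ of $(\bar x,\bar\zeta)$ such that

$$\Pi(\bar x) = \min_{x\in\mathcal{X}_0\cap\mathcal{X}_L}\Pi(x) = \min_{\zeta\in\mathcal{S}_0}\Pi^d(\zeta) = \Pi^d(\bar\zeta); \tag{30}$$

(c) if $m>n$ and $\bar x$ is a local minimizer of $\Pi(x)$, then $\bar\zeta$ is a saddle point of $\Pi^d(\zeta)$, and there exists a neighborhood $\mathcal{X}_0\times(\mathcal{S}_0\cap\mathcal{S}_P)\subset\mathbb{R}^n\times\mathcal{S}_a^-$ of $(\bar x,\bar\zeta)$ such that

$$\Pi(\bar x) = \min_{x\in\mathcal{X}_0}\Pi(x) = \min_{\zeta\in\mathcal{S}_0\cap\mathcal{S}_P}\Pi^d(\zeta) = \Pi^d(\bar\zeta). \tag{31}$$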

Proof:

1. Since $\bar\zeta\in\mathcal{S}_a^+$, we have $G_a\succeq 0$ and $D\succ 0$, which implies that $\Pi^d(\zeta)$ is strictly concave on $\mathcal{S}_a^+$. Thus, $\bar\zeta$ is a global maximizer of $\Pi^d(\zeta)$ on $\mathcal{S}_a^+$. Similarly, it can be proved that $\Xi(x,\bar\zeta)$ is convex on $\mathbb{R}^n$, which, together with the fact that $\bar x$ is a critical point of $\Xi(x,\bar\zeta)$, implies that $\Xi(x,\bar\zeta)\ge\Xi(\bar x,\bar\zeta)$ for any $x\in\mathbb{R}^n$. Furthermore, from Fenchel's inequality, it is true that $\Xi(x,\zeta)\le\Pi(x)$ for all $(x,\zeta)\in\mathbb{R}^n\times\mathcal{S}_a$. Therefore, for any $x\in\mathbb{R}^n$, we have

$$\Pi(x)\ge\Xi(x,\bar\zeta)\ge\Xi(\bar x,\bar\zeta) = \Pi(\bar x), \tag{32}$$

where the last equality has been proved in Theorem 1, which also shows that $\Pi(\bar x) = \Pi^d(\bar\zeta)$. Thus, $\bar x$ is a global minimizer of $\Pi(x)$, and equation (27) is true.

2. Suppose $\bar\zeta$ is a local maximizer of $\Pi^d(\zeta)$ on $\mathcal{S}_a^-$. Then we have $\nabla^2\Pi^d(\bar\zeta) = -F^TG_a^{-1}F - D^{-1}\preceq 0$, and there exists a neighborhood $\mathcal{S}_0\subset\mathcal{S}_a$ such that $\nabla^2\Pi^d(\zeta)\preceq 0$ for all $\zeta\in\mathcal{S}_0$. Since the matrix $G_a$ is nonsingular for any $\zeta\in\mathcal{S}_0$, the map $x = G_a^{-1}f_a:\mathcal{S}_a\to\mathbb{R}^n$ is one-to-one. Let $\mathcal{X}_0$ be the image of $\mathcal{S}_0$ under the map $x = G_a^{-1}f_a$. Obviously, $\mathcal{X}_0$ is a neighborhood of $\bar x$. Next, we want to prove that $\nabla^2\Pi(x)\preceq 0$ for any $x\in\mathcal{X}_0$, which, together with the fact that $\bar x$ is a critical point of $\Pi(x)$, implies that $\bar x$ is a maximizer of $\Pi(x)$ over $\mathcal{X}_0$. For any $x\in\mathcal{X}_0$, let $\zeta$ be the corresponding point under the map $x = G_a^{-1}f_a$. Thus, $\nabla^2\Pi^d(\zeta) = -F^TG_a^{-1}F - D^{-1}\preceq 0$. By singular value decomposition, there exist orthogonal matrices $E\in\mathbb{R}^{n\times n}$ and $K\in\mathbb{R}^{m\times m}$, and a matrix $R\in\mathbb{R}^{n\times m}$ with

$$R_{ij} = \begin{cases}\delta_i, & i = j \text{ and } i = 1,\dots,r,\\ 0, & \text{otherwise},\end{cases} \tag{33}$$

where $\delta_i>0$ for $i = 1,\dots,r$ and $r = \mathrm{rank}(F)$, such that

$$FD^{\frac{1}{2}} = ERK. \tag{34}$$

Then we have

$$-D^{-1} - D^{-\frac{1}{2}}K^TR^TE^TG_a^{-1}ERKD^{-\frac{1}{2}}\preceq 0. \tag{35}$$

Being multiplied by $KD^{\frac{1}{2}}$ from the left and $D^{\frac{1}{2}}K^T$ from the right, the equation (35) can be converted equivalently into

$$-I_m - R^TE^TG_a^{-1}ER\preceq 0, \tag{36}$$

which, by Lemma 7 in the Appendix, is further equivalent to

$$E^TG_aE + RR^T\preceq 0. \tag{37}$$

Multiplying the equation (37) by $E$ from the left and $E^T$ from the right, we obtain

$$0\succeq G_a + ERKD^{-\frac{1}{2}}DD^{-\frac{1}{2}}K^TR^TE^T = G_a + FDF^T = \nabla^2\Pi(x). \tag{38}$$

Therefore, $\bar x$ is a maximizer of $\Pi(x)$ over $\mathcal{X}_0$. Similarly, we can prove that if $\bar x$ is a maximizer of $\Pi(x)$ over $\mathcal{X}_0$, then $\bar\zeta$ is a maximizer of $\Pi^d(\zeta)$ over $\mathcal{S}_0$. The equation (28) can be proved in the same way as the equation (27).

3. We now prove the double-min duality.

(a) Suppose that $m = n$ and $\bar\zeta$ is a local minimizer of $\Pi^d(\zeta)$ over $\mathcal{S}_a^-$. Then there exists a neighborhood $\mathcal{S}_0\subset\mathcal{S}_a$ of $\bar\zeta$ such that $\nabla^2\Pi^d(\zeta)\succeq 0$ for any $\zeta\in\mathcal{S}_0$. Denote by $\mathcal{X}_0$ the image of $\mathcal{S}_0$ under the map $x = G_a^{-1}f_a$. For any $x\in\mathcal{X}_0$, let $\zeta$ be the corresponding point under the map $x = G_a^{-1}f_a$. From $\nabla^2\Pi^d(\zeta) = -F^TG_a^{-1}F - D^{-1}\succeq 0$, we have $-F^TG_a^{-1}F\succeq D^{-1}\succ 0$, which implies that the matrix $F$ is invertible. Then we obtain

$$-G_a^{-1}\succeq(F^T)^{-1}D^{-1}F^{-1}, \tag{39}$$

which is further equivalent to

$$-G_a\preceq FDF^T. \tag{40}$$

Thus, we prove that $\nabla^2\Pi(x) = G_a + FDF^T\succeq 0$. The converse can be proved similarly. The equation (29) can be proved in the same way as the equation (27).

(b) Suppose that $m<n$ and $\bar\zeta$ is a local minimizer of $\Pi^d(\zeta)$ over $\mathcal{S}_a^-$. We claim that $\bar x$ is not a local minimizer of $\Pi(x)$. If $\bar x$ were a local minimizer of $\Pi(x)$, we would have $\nabla^2\Pi(\bar x) = G_a + FDF^T\succeq 0$, which is equivalent to $FDF^T\succeq -G_a$. Since $-G_a\succ 0$, the matrix $F$ would have full rank and

$$n = \mathrm{rank}(-G_a) = \mathrm{rank}(FDF^T)\le\min\{\mathrm{rank}(F),\mathrm{rank}(D)\} = m, \tag{41}$$

which is a contradiction. Therefore, together with the previous discussion, $\bar x$ must be a saddle point of $\Pi(x)$. Let

$$\varphi(t) = \Pi(\bar x + Lt). \tag{42}$$

It can be proved that $0\in\mathbb{R}^m$ is a local minimizer of the function $\varphi(t)$, because we have

$$\nabla\varphi(0) = L^T\nabla\Pi(\bar x) = 0, \tag{43}$$

$$\nabla^2\varphi(0) = L^T\nabla^2\Pi(\bar x)L\succeq 0. \tag{44}$$

Thus the equation (30) is proved.

(c) The proof is similar to case (b) and is omitted.

The theorem is proved. □

4 Two specific problems

In this section, two specific problems, a fourth-order polynomial minimisation problem and a minimax problem, will be discussed. For these two problems, existence conditions for the critical point in Sa+ are derived, which also can be used to separate easy and hard problems.

4.1 A fourth-order polynomial minimisation problem

The fourth-order polynomial minimisation problem considered here is

$$(\mathcal{P}_p):\quad \min_{x\in\mathbb{R}^n}\ \Pi_p(x) = \frac{1}{2}x^TAx - f^Tx + \frac{\alpha}{2}\left(\frac{1}{2}x^TBx - b^Tx + c\right)^2, \tag{45}$$

where the matrix $B$ is symmetric and positive definite. Without loss of generality, we can assume $B = I$, the identity matrix, and $b = 0$. Let $G_a = A + \sigma I$ and $\mathcal{S}_a^+ = \{\sigma \mid G_a\succeq 0,\ \sigma\ge\alpha c\}$. The canonical dual problem for the problem (45) is defined as

$$(\mathcal{P}_p^d):\quad \max_{\sigma\in\mathcal{S}_a^+}\ \Pi_p^d(\sigma) = -\frac{1}{2}f^TG_a^{-1}f + c\sigma - \frac{1}{2\alpha}\sigma^2. \tag{46}$$

Since the matrix $A$ is symmetric, there exist a diagonal matrix $\Lambda$ and an orthogonal matrix $U$ such that $A = U\Lambda U^T$. The diagonal entries of $\Lambda$ are the eigenvalues of the matrix $A$ in nondecreasing order, $\lambda_1 = \dots = \lambda_k<\lambda_{k+1}\le\dots\le\lambda_n$. The columns of $U$ are the corresponding eigenvectors. If we let $\hat f = U^Tf$, the dual function (46) can be rewritten as

$$\Pi_p^d(\sigma) = -\frac{1}{2}\sum_{i=1}^{n}\frac{\hat f_i^2}{\lambda_i+\sigma} + c\sigma - \frac{1}{2\alpha}\sigma^2. \tag{47}$$

The first-order and second-order derivatives of the dual function $\Pi_p^d(\sigma)$ are

$$\delta\Pi_p^d(\sigma) = \frac{1}{2}\sum_{i=1}^{n}\frac{\hat f_i^2}{(\lambda_i+\sigma)^2} + c - \frac{1}{\alpha}\sigma,$$

$$\delta^2\Pi_p^d(\sigma) = -\sum_{i=1}^{n}\frac{\hat f_i^2}{(\lambda_i+\sigma)^3} - \frac{1}{\alpha}.$$

Since $\alpha$ is assumed to be positive, $\delta^2\Pi_p^d(\sigma)$ is negative over $\mathcal{S}_a^+$. Thus the dual function is concave in $\mathcal{S}_a^+$. If $\alpha c>-\lambda_1$, then $\mathcal{S}_a^+ = [\alpha c,+\infty)$ and the maximiser of $\Pi_p^d(\sigma)$ in $\mathcal{S}_a^+$ corresponds to the unique global solution of the primal problem, since $G_a$ is positive definite. If $\alpha c\le-\lambda_1$, then $\mathcal{S}_a^+ = [-\lambda_1,+\infty)$ and we have the following proposition about the existence of a critical point in $\mathcal{S}_a^+$. Its proof is similar to that in [4].

Proposition 5 (Existence Condition) Suppose that $\lambda_i$ are defined as above and $\alpha c\le-\lambda_1$. Then there exists a critical point of $\Pi_p^d(\sigma)$ in the interior of $\mathcal{S}_a^+$ if and only if $\sum_{i=1}^{k}\hat f_i^2\neq 0$ or $\frac{1}{2}\sum_{i=k+1}^{n}\hat f_i^2/(-\lambda_1+\lambda_i)^2 + c + \lambda_1/\alpha>0$. If $\Pi_p^d(\sigma)$ has a critical point in $\mathcal{S}_a^+$, the critical point is unique. Let $\sigma^*$ denote the critical point. Then $x^* = G_a^{-1}f$ is a global solution of the problem $(\mathcal{P}_p)$.
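Because the derivative of the dual function (46) is strictly decreasing in the interior of $\mathcal{S}_a^+$, its zero can be located by bisection. The following is a minimal sketch of that idea in eigen-coordinates, not the authors' procedure: the function names and the small test problem ($n=1$, $A=0$, $f=2$, $c=-1$, $\alpha=1$) are hypothetical, chosen so that the dual stationarity condition reduces to $\sigma^3+\sigma^2-2=0$ with root $\sigma^*=1$.

```python
def dual_slope(sigma, lam, fhat, c, alpha):
    """Derivative of the dual function (46), written in the eigenbasis of A."""
    return (0.5 * sum(fi * fi / (li + sigma) ** 2 for li, fi in zip(lam, fhat))
            + c - sigma / alpha)

def critical_point_in_sa_plus(lam, fhat, c, alpha, tol=1e-12):
    """Bisection for the zero of dual_slope in the interior of S_a^+.

    Returns None when the slope is already non-positive just inside the
    left end point, i.e. when no interior critical point exists.
    """
    left = max(-min(lam), alpha * c)            # left end point of S_a^+
    lo = left + 1e-9 * max(1.0, abs(left))      # step just inside the region
    if dual_slope(lo, lam, fhat, c, alpha) <= 0.0:
        return None
    hi = lo + 1.0
    while dual_slope(hi, lam, fhat, c, alpha) > 0.0:
        hi = lo + 2.0 * (hi - lo)               # expand until the slope changes sign
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if dual_slope(mid, lam, fhat, c, alpha) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical one-dimensional test problem.
lam, fhat, c, alpha = [0.0], [2.0], -1.0, 1.0
sigma_star = critical_point_in_sa_plus(lam, fhat, c, alpha)
x_star = fhat[0] / (lam[0] + sigma_star)        # x* = G_a^{-1} f in the eigenbasis
# Gradient of the primal Pi(x) = 0.5*alpha*(0.5*x^2 + c)^2 - f*x at x*.
primal_grad = alpha * (0.5 * x_star ** 2 + c) * x_star - fhat[0]
print(sigma_star, x_star, primal_grad)
```

The `None` branch corresponds to the case in which the leading components $\hat f_1,\dots,\hat f_k$ vanish and the slope at the left end point is not positive.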

4.2 A minimax problem

In this section, we consider a minimax problem

$$\min_{x\in\mathbb{R}^n}\ \max\left\{\frac{1}{2}x^TA_1x - f_1^Tx + d_1,\ \frac{1}{2}x^TA_2x - f_2^Tx + d_2\right\}, \tag{48}$$

where the matrices $A_1$ and $A_2$ are symmetric and $A_1$ is assumed to be positive definite. We can then rotate and translate the coordinate system such that $A_1$ becomes the identity matrix and $f_1$ vanishes. Thus, without loss of generality, we consider the following problem with a simpler formulation:

$$\min_{x\in\mathbb{R}^n}\ \max\left\{\frac{1}{2}x^Tx + d_1,\ \frac{1}{2}x^TA_2x - f_2^Tx + d_2\right\}. \tag{49}$$

In order to make sure that the problem is not trivial, we further assume that $d_2>d_1$. For the problem (49), an existence condition for a critical point in the positive semidefinite region $\mathcal{S}_a^+$ will be presented. Furthermore, if the existence condition does not hold, perturbations are introduced and the perturbed problem will always have a critical point in $\mathcal{S}_a^+$. If we let $Q = A_2 - I$, $q = f_2$ and $d = d_2 - d_1$, then the problem (49) can be approximated by the problem (P) with $p = 1$, $r = 0$, $f = 0$ and $A = I$:

$$(\mathcal{P}_m):\quad \min_{x\in\mathbb{R}^n}\ \Pi_m(x) = \frac{1}{2}x^Tx + d_1 + \frac{1}{\beta}\log\left[1+\exp\left(\beta\left(\frac{1}{2}x^TQx - q^Tx + d\right)\right)\right]. \tag{50}$$

The dual function is then a univariate function, and the matrix $G_a$ and the vector $f_a$ simplify to $G_a = I + \tau Q$ and $f_a = \tau q$. We are only interested in the behaviour of the dual function on the positive semidefinite region in the dual space, which, for this specific case, is defined as

$$\mathcal{S}_a^+ = \{\tau \mid 0<\tau<1,\ G_a\succeq 0\}. \tag{51}$$

The canonical dual problem is defined as

$$(\mathcal{P}_m^d):\quad \max_{\tau\in\mathcal{S}_a^+}\ \Pi_m^d(\tau) = -\frac{1}{2}f_a^TG_a^{-1}f_a + d\tau - \frac{1}{\beta}\left[\tau\log(\tau) + (1-\tau)\log(1-\tau)\right] + d_1. \tag{52}$$

As in the previous subsection, we denote the eigendecomposition of $Q$ as $Q = U\Lambda U^T$. The diagonal entries of $\Lambda$ are the eigenvalues of the matrix $Q$ in nondecreasing order, $\lambda_1 = \dots = \lambda_k<\lambda_{k+1}\le\dots\le\lambda_n$. The columns of $U$ are the corresponding eigenvectors. If we let $\hat q = U^Tq$ and $\hat f_a = \tau\hat q$, the dual function can be rewritten as

$$\Pi_m^d(\tau) = -\frac{\tau^2}{2}\left(\frac{\sum_{i=1}^{k}\hat q_i^2}{1+\tau\lambda_1} + \sum_{i=k+1}^{n}\frac{\hat q_i^2}{1+\tau\lambda_i}\right) + d\tau - \frac{1}{\beta}\left[\tau\log(\tau) + (1-\tau)\log(1-\tau)\right] + d_1. \tag{53}$$

It can be noticed that if $\lambda_1\ge-1$, the matrix $G_a$ is always positive definite for $\tau\in(0,1)$. We can then prove that if $\lambda_1\ge-1$, the dual function $\Pi_m^d(\tau)$ has a critical point in $\mathcal{S}_a^+$. The first-order and second-order derivatives of $\Pi_m^d(\tau)$ are

$$\delta\Pi_m^d(\tau) = \frac{-1/\tau - \lambda_1/2}{(1/\tau+\lambda_1)^2}\sum_{i=1}^{k}\hat q_i^2 + \sum_{i=k+1}^{n}\frac{-1/\tau - \lambda_i/2}{(1/\tau+\lambda_i)^2}\hat q_i^2 + \frac{1}{\beta}\log(1/\tau - 1) + d,$$

$$\delta^2\Pi_m^d(\tau) = -(QG_a^{-1}f_a - q)^TG_a^{-1}(QG_a^{-1}f_a - q) - \frac{1}{\beta}\left(\frac{1}{\tau} + \frac{1}{1-\tau}\right).$$

We notice that $\delta^2\Pi_m^d(\tau)$ is negative in $\mathcal{S}_a^+$, which indicates that $\Pi_m^d(\tau)$ is concave over $\mathcal{S}_a^+$. As $\tau$ gets close enough to 1, the value of $\delta\Pi_m^d(\tau)$ becomes negative, and as $\tau$ approaches 0, the first two terms in $\delta\Pi_m^d(\tau)$ approach zero and the value of $\delta\Pi_m^d(\tau)$ becomes positive for large enough $\beta$. Thus, for large enough $\beta$, there is a critical point of the dual function $\Pi_m^d(\tau)$ in $\mathcal{S}_a^+$.

Next we consider the situation where $\lambda_1<-1$. The following proposition gives the conditions for the existence of a critical point of the dual function $\Pi_m^d(\tau)$ in $\mathcal{S}_a^+$. Its proof is similar to that in [4] and is omitted.

Proposition 6 (Existence Condition) Suppose that $\lambda_i$ and $\hat q_i$ are defined as above and $\lambda_1<-1$. Then there exists a critical point of $\Pi_m^d(\tau)$ in the interior of $\mathcal{S}_a^+$ if and only if $\sum_{i=1}^{k}\hat q_i^2\neq 0$ or $\sum_{i=k+1}^{n}\hat q_i^2(\lambda_1 - \lambda_i/2)/(-\lambda_1+\lambda_i)^2 + \log(-\lambda_1 - 1)/\beta + d>0$. If $\Pi_m^d(\tau)$ has a critical point in the interior of $\mathcal{S}_a^+$, the critical point is unique. Let $\tau^*$ denote the critical point. Then $x^* = G_a^{-1}f_a$ is a global solution of the problem $(\mathcal{P}_m)$.

Figure 1: The graph of Π(x) in Example 1.

Figure 2: The graph of Π d (τ, σ) (left) and the contour plot of Π d (τ, σ) on the region Sa+ (right) in Example 1.

5 Examples

In this section, several examples are provided to illustrate the perfect duality of the canonical duality theory.

Example 1 Consider the one-dimensional problem:

$$\min_{x\in\mathbb{R}}\ \Pi(x) = \log\left[1+\exp\left(0.5x^2 - 0.1\right)\right] + 5\left(x^2 - 1\right)^2 - 0.8x.$$

The corresponding canonical dual function is given by

$$\Pi^d(\tau,\sigma) = -\frac{0.32}{\tau+2\sigma} - \sigma - 0.05\sigma^2 - 0.1\tau - \left[\tau\log(\tau) + (1-\tau)\log(1-\tau)\right].$$

The graph of the function $\Pi(x)$ is shown in Figure 1, and the graph and contour plot of $\Pi^d(\tau,\sigma)$ are shown in Figure 2. There are three critical points of the dual function $\Pi^d(\tau,\sigma)$:

$$\begin{pmatrix}\bar\tau_1\\\bar\sigma_1\end{pmatrix} = \begin{pmatrix}0.599866\\0.098119\end{pmatrix},\quad \begin{pmatrix}\bar\tau_2\\\bar\sigma_2\end{pmatrix} = \begin{pmatrix}0.475231\\-9.983154\end{pmatrix},\quad\text{and}\quad \begin{pmatrix}\bar\tau_3\\\bar\sigma_3\end{pmatrix} = \begin{pmatrix}0.590128\\-0.71007\end{pmatrix}.$$

They correspond to the following solutions of the primal problem:

$$\bar x_1 = 1.004894,\quad \bar x_2 = -0.041044,\quad\text{and}\quad \bar x_3 = -0.963843.$$

It is noticed that $(\bar\tau_1,\bar\sigma_1)$ is in the region $\mathcal{S}_a^+$ and $\bar x_1$ is the global solution of the primal problem, which is the min-max duality. The double-max duality can be seen from the fact that $(\bar\tau_2,\bar\sigma_2)$ and $\bar x_2$ are local maximisers of the functions $\Pi^d(\tau,\sigma)$ and $\Pi(x)$, respectively. Since $n = 1$, $m = 2$ and $\bar x_3$ is a local minimiser of the function $\Pi(x)$, the fact that $(\bar\tau_3,\bar\sigma_3)\in\mathcal{S}_a^-$ is a saddle point of the function $\Pi^d(\tau,\sigma)$ illustrates the double-min duality. Moreover, we have

$$\Pi(\bar x_1) = \Pi^d(\bar\tau_1,\bar\sigma_1) = 0.112521,\quad \Pi(\bar x_2) = \Pi^d(\bar\tau_2,\bar\sigma_2) = 5.660800,\quad \Pi(\bar x_3) = \Pi^d(\bar\tau_3,\bar\sigma_3) = 1.688196.$$

Figure 3: The contour plot of Π(x) and the graph of the dual function in Example 2.
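The numbers reported for Example 1 can be re-checked with a few lines of Python. This is only a verification sketch using the primal and dual functions of Example 1 as stated; the variable names are ours. For each reported dual critical point it recovers $x = f_a/G_a$ with $G_a = \tau + 2\sigma$ and $f_a = 0.8$, evaluates both functions, and records the sign of $G_a$.

```python
import math

def primal(x):
    """Primal function of Example 1."""
    return (math.log(1.0 + math.exp(0.5 * x * x - 0.1))
            + 5.0 * (x * x - 1.0) ** 2 - 0.8 * x)

def dual(tau, sigma):
    """Canonical dual function of Example 1."""
    entropy = tau * math.log(tau) + (1.0 - tau) * math.log(1.0 - tau)
    return (-0.32 / (tau + 2.0 * sigma) - sigma - 0.05 * sigma ** 2
            - 0.1 * tau - entropy)

# Dual critical points as reported in the text.
reported = [(0.599866, 0.098119), (0.475231, -9.983154), (0.590128, -0.71007)]

rows = []
for tau, sigma in reported:
    G = tau + 2.0 * sigma            # G_a is a scalar because n = 1
    x = 0.8 / G                      # x = G_a^{-1} f_a
    region = "Sa+" if G > 0 else "Sa-"
    rows.append((x, primal(x), dual(tau, sigma), region))

for x, p, d, region in rows:
    print(f"x = {x: .6f}  primal = {p:.6f}  dual = {d:.6f}  region = {region}")
```

The primal and dual values should agree pairwise up to the rounding of the reported critical points, and only the first point has $G_a>0$.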

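Example 2 Consider a randomly generated fourth-order polynomial problem:

$$\min_{x\in\mathbb{R}^n}\ \Pi(x) = \frac{1}{2}x^TAx - f^Tx + \frac{\alpha}{2}\left(\frac{1}{2}x^Tx + c\right)^2$$

with

$$A = \begin{pmatrix}-16 & -5\\ -5 & 14\end{pmatrix},\quad f = \begin{pmatrix}14\\ -6\end{pmatrix},\quad c = -14,\quad\text{and}\quad \alpha = 10.$$

The contour plot of $\Pi(x)$ and the graph of the dual function are shown in Figure 3. There are five critical points of the dual function:

$$\bar\sigma_1 = 19.093,\quad \bar\sigma_2 = 14.495,\quad \bar\sigma_3 = -13.184,\quad \bar\sigma_4 = -16.459,\quad\text{and}\quad \bar\sigma_5 = -139.945.$$

The eigenvalues of the corresponding matrix $G_a$ are

$$\lambda_1 = \begin{pmatrix}2.282\\33.904\end{pmatrix},\ \lambda_2 = \begin{pmatrix}-2.32\\29.31\end{pmatrix},\ \lambda_3 = \begin{pmatrix}-29.99\\1.63\end{pmatrix},\ \lambda_4 = \begin{pmatrix}-33.27\\-1.65\end{pmatrix},\ \text{and}\ \lambda_5 = \begin{pmatrix}-156.76\\-125.13\end{pmatrix},$$

and the corresponding critical points of the primal problem are

$$x_1 = \begin{pmatrix}5.6\\0.67\end{pmatrix},\ x_2 = \begin{pmatrix}-5.44\\-1.16\end{pmatrix},\ x_3 = \begin{pmatrix}0.38\\-5.02\end{pmatrix},\ x_4 = \begin{pmatrix}-1.18\\4.83\end{pmatrix},\ \text{and}\ x_5 = \begin{pmatrix}-0.09\\0.05\end{pmatrix}.$$

We notice that $\bar\sigma_1$ is in $\mathcal{S}_a^+$ and $x_1$ is the global solution of the primal problem, which illustrates the min-max duality. Both $\bar\sigma_4$ and $\bar\sigma_5$ are in $\mathcal{S}_a^-$. The double-min duality is demonstrated by the fact that $\bar\sigma_4$ is a local minimiser and $x_4$ is a saddle point, and the double-max duality is demonstrated by the fact that $\bar\sigma_5$ is a local maximiser and $x_5$ is a local maximiser. Moreover, the values of the primal function and the dual function are equal on each pair of solutions.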
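The classification of the five dual critical points of Example 2 can be re-derived from the eigenvalues of $G_a = A + \sigma I$. The sketch below is a consistency check, not part of the original computation. It assumes that the $(2,2)$ entry of $A$ is $+14$, which is the value that reproduces the reported eigenvalues of $G_a$ and the reported primal points; the helper names are hypothetical. It also evaluates the dual stationarity residual $\frac12 x^Tx + c - \sigma/\alpha$ at $x = G_a^{-1}f$, which should be close to zero up to the rounding of the reported $\bar\sigma_k$.

```python
import math

# Assumption: A[1][1] = +14 (see the lead-in above).
A = [[-16.0, -5.0], [-5.0, 14.0]]
f = [14.0, -6.0]
c, alpha = -14.0, 10.0
sigmas = [19.093, 14.495, -13.184, -16.459, -139.945]  # reported critical points

def eig_sym2(a, b, d):
    """Eigenvalues, in ascending order, of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius

def solve_sym2(a, b, d, rhs):
    """Solve [[a, b], [b, d]] x = rhs by the explicit 2x2 inverse."""
    det = a * d - b * b
    return [(d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - b * rhs[0]) / det]

results = []
for s in sigmas:
    a, b, d = A[0][0] + s, A[0][1], A[1][1] + s         # entries of G_a = A + s*I
    low, high = eig_sym2(a, b, d)
    if low > 0:
        region = "Sa+"
    elif high < 0:
        region = "Sa-"
    else:
        region = "indefinite"
    x = solve_sym2(a, b, d, f)                          # x = G_a^{-1} f
    residual = 0.5 * (x[0] ** 2 + x[1] ** 2) + c - s / alpha
    results.append((s, (low, high), region, x, residual))

for s, eigs, region, x, residual in results:
    print(s, [round(e, 3) for e in eigs], region,
          [round(v, 2) for v in x], round(residual, 4))
```

Under this assumption, only $\bar\sigma_1$ gives a positive definite $G_a$, while $\bar\sigma_4$ and $\bar\sigma_5$ give negative definite ones, in line with the discussion of the example.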

Figure 4: The graph of Π(x) (left) and the graph of Π d (τ ) (right) in Example 3. The parameter is set as β = 100.

Example 3 ([21]) We consider a nonconvex and nonsmooth optimisation problem:

$$\min_{x\in\mathbb{R}^2}\ \max\left\{x_1^2 + x_2^2 - x_2,\ -x_1^2 - x_2^2 + 3x_2\right\}.$$

It is easy to verify that the optimal solution is $(0,0)$ with value 0. Here, we use the log-sum-exp function to approximate the function $\max\{\cdot,\cdot\}$, and then get the following smooth optimisation problem:

$$\min_{x\in\mathbb{R}^2}\ \Pi(x) = \frac{1}{\beta}\log\left[1+\exp\left(\beta\left(2x_1^2 + 2x_2^2 - 4x_2\right)\right)\right] - x_1^2 - x_2^2 + 3x_2.$$

Its canonical dual function is

$$\Pi^d(\tau) = -\frac{1}{2}\frac{(4\tau-3)^2}{4\tau-2} - \frac{1}{\beta}\left[\tau\log\tau + (1-\tau)\log(1-\tau)\right].$$

The graphs of the approximation function $\Pi(x)$ and the dual function $\Pi^d(\tau)$ are shown in Figure 4. With $\beta = 100$, the two critical points of the dual function $\Pi^d(\tau)$ are

$$\bar\tau_1 = 0.749318\quad\text{and}\quad \bar\tau_2 = 0.249308.$$

The corresponding solutions of the primal problem are

$$\bar x_1 = \begin{pmatrix}0\\-0.002734\end{pmatrix}\quad\text{and}\quad \bar x_2 = \begin{pmatrix}0\\1.99724\end{pmatrix}.$$

The min-max duality holds since $\bar\tau_1\in\mathcal{S}_a^+$ and $\bar x_1$ is the global solution of the primal problem. The double-min duality also holds because $m = 1$, $n = 2$, $\bar\tau_2\in\mathcal{S}_a^-$ is a local minimiser and $\bar x_2$ is a saddle point of the primal function. Moreover, we have

$$\Pi(\bar x_1) = \Pi^d(\bar\tau_1) = 0.005627\quad\text{and}\quad \Pi(\bar x_2) = \Pi^d(\bar\tau_2) = 2.00562.$$
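For Example 3 the dual problem is one-dimensional, so the critical point in $\mathcal{S}_a^+$ can be found by bisection on the derivative of $\Pi^d(\tau)$. The following is a minimal sketch, assuming our reading of the data ($G_a = (4\tau-2)I$ and $f_a = (0,\,4\tau-3)^T$, so that $\mathcal{S}_a^+$ is the interval $(0.5, 1)$); the closed-form derivative in `dual_slope` is our own calculation and is not stated in the text.

```python
import math

BETA = 100.0

def primal(x1, x2, beta=BETA):
    """Smoothed objective of Example 3 (safe only for moderate beta*g)."""
    g = 2.0 * x1 * x1 + 2.0 * x2 * x2 - 4.0 * x2
    return math.log(1.0 + math.exp(beta * g)) / beta - x1 * x1 - x2 * x2 + 3.0 * x2

def dual(tau, beta=BETA):
    """Canonical dual function of Example 3."""
    entropy = tau * math.log(tau) + (1.0 - tau) * math.log(1.0 - tau)
    return -0.5 * (4.0 * tau - 3.0) ** 2 / (4.0 * tau - 2.0) - entropy / beta

def dual_slope(tau, beta=BETA):
    """Derivative of dual(tau); strictly decreasing on (0.5, 1)."""
    u = 4.0 * tau - 2.0
    return (-2.0 * (4.0 * tau - 3.0) * (4.0 * tau - 1.0) / (u * u)
            - math.log(tau / (1.0 - tau)) / beta)

# Bisection on the interior of S_a^+ = (0.5, 1): the slope is positive near
# 0.5 and negative near 1.
lo, hi = 0.5 + 1e-9, 1.0 - 1e-12
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if dual_slope(mid) > 0.0:
        lo = mid
    else:
        hi = mid
tau_star = 0.5 * (lo + hi)

# Primal point recovered from x = G_a^{-1} f_a.
x_star = (0.0, (4.0 * tau_star - 3.0) / (4.0 * tau_star - 2.0))
print(tau_star, x_star, primal(*x_star), dual(tau_star))
```

The recovered point lies close to the exact minimiser $(0,0)$ of the nonsmooth problem, and the small positive objective value reflects the smoothing error of the log-sum-exp approximation at $\beta = 100$.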

6 Conclusions

A very general nonconvex global optimisation problem with a sum of double-well and log-sum-exp functions is discussed. The canonical duality theory is applied to solve this challenging problem. For the general problem, the triality theory concludes that if there is a critical point in the positive semidefinite region Sa+ , the dual problem can be solved easily and, correspondingly, the global solution of the primal problem can be found analytically from this critical point. For the two specific problems, a fourth-order polynomial minimisation problem and a minimax problem, existence conditions for the critical point are presented. If these conditions hold, there must be a critical point in the positive semidefinite region, and the global solution of the primal problem can be easily obtained by solving the dual problem. The numerical examples demonstrate the efficiency of the canonical duality approach.

Besides the duality about the global solution, which is called the min-max duality, the triality theory also describes the relationships between local extrema, which are called the double-min duality and the double-max duality. However, the saddle points that lie in neither Sa+ nor Sa− have not been clarified. There are some interesting phenomena which hint that these saddle points may be sorted according to certain orders. Thus, our next work on this problem is to investigate the order of the extrema in the dual space. Another direction is to construct existence conditions of a critical point for the general problem.

References

[1] Abdi, H, and Nahavandi, S. Designing optimal fault tolerant jacobian for robotic manipulators. In IEEE/ASME International Conference on Advanced Intelligent Mechatronics 2010; pp.426–431.
[2] Abdi, H, Nahavandi, S, and Maciejewski, AA. Optimal fault-tolerant jacobian matrix generators for redundant manipulators. In IEEE International Conference on Robotics and Automation 2011; pp.4688–4693.
[3] Banichuk, NV. Minimax approach to structural optimization problems. J Optimiz Theory App 1976; 20:111–127.
[4] Chen, Y, and Gao, DY. Global solutions of quadratic problems with sphere constraint via canonical dual approach. Working paper.
[5] Chiang, M. Geometric programming for communication systems. Now Publishers Inc, 2005.
[6] Chiang, M, Tan, CW, Palomar, DP, O'Neill, D, and Julian, D. Power control by geometric programming. In IEEE Transactions on Wireless Communications 2007; 6:2640–2651.
[7] Fang, SC, Gao, DY, Sheu, RL, and Wu, SY. Canonical dual approach to solving 0-1 quadratic programming problems. J Ind Manag Optim 2008; 4:125–142.
[8] Gao, DY. Canonical dual transformation method and generalized triality theory in nonsmooth global optimization. J Global Optim 2000; 17:127–160.
[9] Gao, DY. Duality principles in nonconvex systems: theory, methods, and applications. Springer Netherlands, 2000.
[10] Gao, DY. Canonical duality theory and solutions to constrained nonconvex quadratic programming. J Global Optim 2004; 29:377–399.
[11] Gao, DY. Complete solutions and extremality criteria to polynomial optimization problems. J Global Optim 2006; 35:131–143.
[12] Gao, DY, and Ruan, N. Solutions to quadratic minimization problems with box and integer constraints. J Global Optim 2010; 47:463–484.
[13] Gao, DY, Ruan, N, and Pardalos, PM. Canonical dual solutions to sum of fourth-order polynomials minimization problems with applications to sensor network localization. Sensors: Theory, Algorithms, and Applications. Springer New York, 2012; pp.37–54.
[14] Gao, DY, Ruan, N, and Sherali, HD. Solutions and optimality criteria for nonconvex constrained global optimization problems with connections between canonical and Lagrangian duality. J Global Optim 2009; 45:473–497.
[15] Gao, DY, Ruan, N, and Sherali, HD. Canonical dual solutions for fixed cost quadratic programs. In Optimization and Optimal Control 2010; pp.139–156.
[16] Gao, DY, Watson, LT, Easterling, DR, Thacker, WI, and Billups, SC. Solving the canonical dual of box- and integer-constrained nonconvex quadratic programs via a deterministic direct search algorithm. Optim Method Softw 2011; pp.1–14.
[17] Gao, DY, and Wu, C. On the triality theory for a quartic polynomial optimisation problem. J Ind Manag Optim 2012; 8:229–242.
[18] Gao, DY. Minimax and triality theory in nonsmooth variational problems. Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods 1998; pp.161–180.
[19] Gao, DY. Finite deformation beam models and triality theory in dynamical post-buckling analysis. Int J Nonlinear Mech 2000; 35:103–131.
[20] Hsiung, KL, Kim, SJ, and Boyd, SP. Tractable approximate robust geometric programming. Optim Eng 2008; 9:95–118.
[21] Kiwiel, KC. Methods of descent for nondifferentiable optimization. Springer-Verlag Berlin, 1985.
[22] Pee, EY, and Royset, JO. On solving large-scale finite minimax problems using exponential smoothing. J Optimiz Theory App 2011; 148:390–421.
[23] Polak, E. Optimization: algorithms and consistent approximations. vol. 124. Springer Verlag, 1997.
[24] Polak, E, Royset, JO, and Womersley, RS. Algorithms with adaptive smoothing for finite minimax problems. J Optimiz Theory App 2003; 119:459–484.
[25] Roberts, RG, Yu, HG, and Maciejewski, AA. Characterizing optimally fault-tolerant manipulators based on relative manipulability indices. In IEEE/RSJ International Conference on Intelligent Robots and Systems 2007; pp.3925–3930.
[26] Royset, JO, Polak, E, and Kiureghian, AD. Adaptive approximations and exact penalization for the solution of generalized semi-infinite min-max problems. SIAM J Optimiz 2004; 14:1–34.
[27] Strang, G. A minimax problem in plasticity theory. In Functional analysis methods in numerical analysis. Springer. 1979; pp.319–333.
[28] Wang, Z, Fang, SC, Gao, DY, and Xing, W. Canonical dual approach to solving the maximum cut problem. J Global Optim 2011; pp.1–11.

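Appendix

The following lemma is a generalization of Lemma 6 in [17].

Lemma 7 Suppose that $P\in\mathbb{R}^{n\times n}$, $U\in\mathbb{R}^{m\times m}$ and $D\in\mathbb{R}^{n\times m}$ are given matrices, with $P$ and $U$ symmetric, such that

$$P = \begin{pmatrix}P_{11} & P_{12}\\ P_{21} & P_{22}\end{pmatrix}\prec 0,\quad U = \begin{pmatrix}U_{11} & 0\\ 0 & U_{22}\end{pmatrix}\succ 0,\quad\text{and}\quad D = \begin{pmatrix}D_{11} & 0\\ 0 & 0\end{pmatrix},$$

where $P_{11}$, $U_{11}$ and $D_{11}$ are $r\times r$-dimensional matrices, and $D_{11}$ is nonsingular. Then,

$$P + DUD^T\preceq 0\ \Leftrightarrow\ -D^TP^{-1}D - U^{-1}\preceq 0. \tag{54}$$

Proof: Obviously, $P + DUD^T\preceq 0$ is equivalent to

$$-P - DUD^T = \begin{pmatrix}-P_{11} - D_{11}U_{11}D_{11}^T & -P_{12}\\ -P_{21} & -P_{22}\end{pmatrix}\succeq 0. \tag{55}$$

Since $P\prec 0$, we have $P_{22}\prec 0$. By the Schur complement lemma, the equation (55) is equivalent to

$$-P_{11} - D_{11}U_{11}D_{11}^T + P_{12}P_{22}^{-1}P_{21}\succeq 0. \tag{56}$$

The inverse of the matrix $P$ is

$$P^{-1} = \begin{pmatrix}(P_{11} - P_{12}P_{22}^{-1}P_{21})^{-1} & -P_{11}^{-1}P_{12}(P_{22} - P_{21}P_{11}^{-1}P_{12})^{-1}\\ -(P_{22} - P_{21}P_{11}^{-1}P_{12})^{-1}P_{21}P_{11}^{-1} & (P_{22} - P_{21}P_{11}^{-1}P_{12})^{-1}\end{pmatrix}.$$

Then, it is easy to prove that $-P_{11} + P_{12}P_{22}^{-1}P_{21}\succ 0$. Since $D_{11}$ is nonsingular and $U_{11}\succ 0$, we have $D_{11}U_{11}D_{11}^T\succ 0$. Thus, by lemma ?? (matrix inversion reverses the order $\succeq$ between positive definite matrices), the equation (56) is equivalent to

$$(-P_{11} + P_{12}P_{22}^{-1}P_{21})^{-1}\preceq(D_{11}U_{11}D_{11}^T)^{-1}, \tag{57}$$

which is further equivalent to

$$D_{11}^T(-P_{11} + P_{12}P_{22}^{-1}P_{21})^{-1}D_{11}\preceq U_{11}^{-1}. \tag{58}$$

Since $D_{11}^T(-P_{11} + P_{12}P_{22}^{-1}P_{21})^{-1}D_{11}$ is the leading $r\times r$ block of $-D^TP^{-1}D$, whose remaining blocks are zero, and $U_{22}\succ 0$, the equation (58) is equivalent to

$$-D^TP^{-1}D - U^{-1}\preceq 0. \tag{59}$$

The lemma is proved. □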