THE JOURNAL OF CHEMICAL PHYSICS 128, 204105 (2008)

An efficient algorithm for solving nonlinear equations with a minimal number of trial vectors: Applications to atomic-orbital based coupled-cluster theory

Marcin Ziółkowski,1,a) Ville Weijo,1,2 Poul Jørgensen,1 and Jeppe Olsen1

1 The Lundbeck Foundation Center for Theoretical Chemistry, Department of Chemistry, University of Aarhus, Langelandsgade 140, DK-8000 Århus C, Denmark
2 Laboratory of Physics, Helsinki University of Technology, P.O. Box 1100 (Otakaari 1 M), 02015 Espoo, Finland

(Received 22 February 2008; accepted 23 April 2008; published online 23 May 2008)

The conjugate residual with optimal trial vectors (CROP) algorithm is developed. In this algorithm, the optimal trial vectors of the iterations are used as basis vectors in the iterative subspace. For linear equations and nonlinear equations with a small-to-medium nonlinearity, the iterative subspace may be truncated to a three-dimensional subspace with no or little loss of convergence rate, and the norm of the residual decreases in each iteration. The efficiency of the algorithm is demonstrated by solving the equations of coupled-cluster theory with single and double excitations in the atomic orbital basis. By performing calculations on H2O with various bond lengths, the algorithm is tested for varying degrees of nonlinearity. In general, the CROP algorithm with a three-dimensional subspace exhibits fast and stable convergence and outperforms the standard direct inversion in iterative subspace method. © 2008 American Institute of Physics. [DOI: 10.1063/1.2928803]

a) Author to whom correspondence should be addressed. Electronic mail: [email protected].

I. INTRODUCTION

The solution of linear and nonlinear equations is a central task of electronic structure theory. For example, in algorithms for diagonalization-free optimization of the Hartree–Fock and Kohn–Sham energies, linear1 or nonlinear2 equations are solved, whereas in coupled-cluster theory3 nonlinear equations are solved. The solution of linear equations with a symmetric and positive definite matrix may be formulated as the minimization of a quadratic function. The standard method for this minimization is the conjugate gradient (CG) method.4–7 In each iteration of the CG algorithm, a new direction is added to the previous directions. By ensuring that the new direction is conjugate to the previous directions, the CG algorithm obtains the attractive feature that a simple unidirectional minimization is mathematically equivalent to a minimization in the space spanned by all directions generated in the current and previous iterations. Further, only information from the last iteration is needed to identify the unidirectional search direction. The storage and manipulation of all generated directions are therefore avoided without loss of convergence. In quantum chemistry, Pople et al.8 used an iterative subspace method for solving the coupled perturbed Hartree–Fock equations, which Wormer et al.9 showed was a special implementation of the CG method. If the matrix defining the linear equations is nonsymmetric, the CG method cannot be applied in its standard form, and if the matrix is symmetric but not positive definite, the CG algorithm in its standard form is not guaranteed to determine the solution. The CG method has been reformulated to forms that are more suitable for symmetric matrices that are not positive definite.7,10 However, for matrices that are nonsymmetric or not positive definite, it may be more attractive to use minimal residual (MR) methods.7 For general nonsymmetric matrices, the generalized minimal residual method is a standard choice.11 For symmetric matrices, the conjugate residual (CR) method12 shares with the CG method the very attractive property that each iteration can be expressed in terms of a unidirectional search, where the search direction may be determined from information from the last iteration, and where the storage and manipulation of a long list of directions and residuals is therefore avoided. A number of variants of the CR method have been developed for nonsymmetric matrices.11,13,14 We refer to numerical mathematics texts for further detail.7

In quantum chemistry, linear and nonlinear equations are often solved by using the direct inversion in the iterative subspace (DIIS) method.15,16 The DIIS method was originally developed to improve the local convergence of self-consistent field calculations, but it has proven useful for the solution of many other problems in electronic structure theory, including geometry optimization17,18 and the solution of the coupled-cluster equations.19 A short review of the properties and use of the DIIS method has recently been published.20 In each iteration of the DIIS algorithm, a residual is minimized in the subspace of the current and previous trial vectors. As the dimension of the trial vectors may be large, it is often only possible to store the information from the last few iterations. However, in contrast to the CG and CR methods, a reduction in the rate of convergence is typically observed when vectors from earlier iterations are discarded, even when linear equations with symmetric matrices are solved.


There are thus important differences between the DIIS method and the standard MR algorithms, which will be analyzed in this paper. The CR algorithm will be derived and reparametrized to a form where trial solutions rather than directions are stored. The resulting algorithm, the conjugate residual with optimal trial vectors (CROP) algorithm, is mathematically equivalent to the CR algorithm for linear equations but is straightforward to extend to nonlinear equations. For nonlinear equations with a small nonlinearity, only a minor degradation in performance is seen when the CROP algorithm stores only the last trial solution, as the degradation is due solely to the nonlinearity of the equations. For linear equations, we show that the CROP method with only the last trial solution stored gives the same solution as the standard DIIS algorithm where information from all previous iterations is stored. The CROP method is a general method for solving linear and nonlinear equations. Its efficiency is demonstrated and compared to that of the standard DIIS method, for linear equations by solving the orthogonalized atomic-orbital second-order Møller–Plesset21 (MP2) equations and for nonlinear equations by solving the orthogonalized atomic-orbital coupled-cluster singles and doubles22 (CCSD) equations.

II. MINIMAL RESIDUAL METHODS FOR LINEAR EQUATIONS

In this section, we discuss and compare various minimal residual algorithms for determining the solution x* of the linear equation

A x^{*} - b = 0,    (1)

where A has dimension d and is symmetric but not necessarily positive definite. We will assume that A is nonsingular, although the following derivation also holds if A is singular and b is orthogonal to the null space of A, i.e., the space of eigenvectors of A with eigenvalues equal to zero. The residual for a general vector x is given as

r = b - A x    (2)

and may be used as a measure of the accuracy of the solution. Minimizing the squared residual norm

g(x) = r^T r    (3)

gives

\partial g(x) / \partial x = 2 A (A x - b) = 0.    (4)

Minimization of g(x) may therefore be used to determine the solution of Eq. (1).

A. The conjugate residual algorithm

1. Introduction

In this section, we derive the CR algorithm. We first present the CR algorithm in its standard form7 and then reparametrize it to a form where the optimal solution of each iteration rather than the optimal direction is stored. In iteration n + 1 of the CR algorithm, with the approximate solution x_n, the residual r_n = b - A x_n is first calculated, and the trial solution x_{n+1} is parametrized as a linear combination of the residual r_n and the optimal search directions of the previous iterations {p_i}, i = 0, 1, ..., n - 1,

x_{n+1} = x_n + \sum_{i=0}^{n-1} \alpha_i^{(n)} p_i + \alpha_n^{(n)} r_n.    (5)

The idea of the CR algorithm is to identify an optimal search direction p_n replacing the multiple search directions in Eq. (5),

x_{n+1} = x_n + \alpha_n^{(n)} p_n,    (6)

such that the minimization of the norm of the residual g(x_{n+1}), with x_{n+1} calculated either by Eq. (5) or Eq. (6), gives a mathematically identical result. In the CR algorithm, as in the more commonly used CG method, all but the last direction may be discarded without loss of convergence rate. We will now describe the CR algorithm in a fashion that is in accordance with the standard quantum chemical focus on subspace optimizations. We will, in particular, show how the optimal search directions may be obtained such that the multidirectional search of Eq. (5) is identical to the unidirectional search of Eq. (6).

2. Iteration 1

Let x_0 be our starting guess. The initial iteration differs from the remaining iterations by having a single search direction, r_0 = b - A x_0, which, therefore, is optimal, i.e., p_0 = r_0. The solution vector at iteration 1 therefore has the form

x_1 = x_0 + \alpha_0^{(0)} p_0.    (7)

Minimization of g(x_1) in the direction p_0 gives

\partial g(x_1) / \partial \alpha_0^{(0)} = 2 \alpha_0^{(0)} p_0^T A^2 p_0 - 2 r_0^T A p_0 = 0,    (8)

which determines the optimal steplength in the direction p_0,

\alpha_0^{(0)} = \frac{r_0^T A p_0}{p_0^T A^2 p_0}.    (9)

Note that p_0^T A^2 p_0 is nonvanishing as we have assumed that A is nonsingular. The residual at x_1,

r_1 = b - A x_1,    (10)

is shown using Eqs. (9) and (7) to be conjugate to the search direction p_0,

p_0^T A r_1 = r_0^T A r_1 = 0.    (11)

Further, from Eq. (7), we see that

A p_0 = \frac{1}{\alpha_0^{(0)}} (r_0 - r_1).    (12)

3. Iteration 2

In iteration 2, the trial vector is initially written as having components along p_0 and r_1,

x_2 = x_1 + \alpha_0^{(1)} p_0 + \alpha_1^{(1)} r_1.    (13)

Minimization of g(x_2) with respect to \alpha_0^{(1)} and \alpha_1^{(1)} gives the subspace equations

\begin{pmatrix} p_0^T A^2 p_0 & p_0^T A^2 r_1 \\ p_0^T A^2 r_1 & r_1^T A^2 r_1 \end{pmatrix} \begin{pmatrix} \alpha_0^{(1)} \\ \alpha_1^{(1)} \end{pmatrix} = \begin{pmatrix} 0 \\ r_1^T A r_1 \end{pmatrix}.    (14)

From the first row of Eq. (14), the coefficient \alpha_0^{(1)} may be written in terms of \alpha_1^{(1)},

\alpha_0^{(1)} = - \frac{p_0^T A^2 r_1}{p_0^T A^2 p_0} \alpha_1^{(1)}.    (15)

The trial vector of Eq. (13) minimizing the residual norm may therefore be expressed as a unidirectional search

x_2 = x_1 + \alpha_1^{(1)} p_1,    (16)

where the direction p_1 is

p_1 = r_1 - \frac{p_0^T A^2 r_1}{p_0^T A^2 p_0} p_0.    (17)

Having x_2 simplified to a unidirectional search, the minimization of g(x_2) may proceed as in iteration 1 [see Eq. (7)]. In analogy to Eq. (9), the optimal steplength in the direction p_1 therefore becomes

\alpha_1^{(1)} = \frac{r_1^T A p_1}{p_1^T A^2 p_1},    (18)

and x_2 has thus been determined. The residual at x_2 becomes

r_2 = b - A x_2,    (19)

or, using Eq. (16),

r_2 = r_1 - \alpha_1^{(1)} A p_1.    (20)

There are a number of relations that will be used when deriving the subspace equations for the following iterations. Multiplying Eq. (17) with p_0^T A^2 gives

p_0^T A^2 p_1 = 0.    (21)

From Eqs. (20) and (18), one obtains

r_2^T A p_1 = 0,    (22)

and Eqs. (20), (11), and (21) give

r_2^T A p_0 = r_2^T A r_0 = 0.    (23)

By using Eq. (17) to obtain an expression for r_1 and using Eqs. (22) and (23), one obtains

r_2^T A r_1 = 0.    (24)

Finally, by expressing A p_0 in the form of Eq. (12) and using Eqs. (23) and (24), one obtains

r_2^T A^2 p_0 = 0.    (25)

4. Iteration n + 1

Let us now consider iteration n + 1. The previous directions and residuals fulfill the relations

r_i^T A p_j = 0,   i > j,   i, j = 0, 1, ..., n,    (26)

r_i^T A r_j = 0,   i > j,   i, j = 0, 1, ..., n,    (27)

r_i^T A^2 p_j = 0,   i > j + 1,   i, j = 0, 1, ..., n,    (28)

p_i^T A^2 p_j = p_i^T A^2 p_j \delta_{ij},   i, j = 0, 1, ..., n - 1.    (29)

The new trial vector is initially written as a general vector in the space spanned by the previous search directions {p_i}, i = 0, 1, ..., n - 1, and the current residual r_n,

x_{n+1} = x_n + \sum_{i=0}^{n-1} \alpha_i^{(n)} p_i + \alpha_n^{(n)} r_n.    (30)

Minimizing g(x_{n+1}) with respect to the n + 1 free parameters and using Eqs. (26), (28), and (29), one obtains the subspace equations

\begin{pmatrix} \tilde{A}^2 & 0 & 0 \\ 0 & p_{n-1}^T A^2 p_{n-1} & p_{n-1}^T A^2 r_n \\ 0 & p_{n-1}^T A^2 r_n & r_n^T A^2 r_n \end{pmatrix} \alpha^{(n)} = \begin{pmatrix} 0 \\ 0 \\ r_n^T A r_n \end{pmatrix},    (31)

where the elements of the matrix \tilde{A}^2 are defined as

(\tilde{A}^2)_{ij} = p_i^T A^2 p_j \delta_{ij},   i, j = 0, 1, ..., n - 2.    (32)

Due to the form of the subspace equations, x_{n+1} may be expressed in terms of a single search direction p_n and the optimal steplength \alpha_n^{(n)} as

x_{n+1} = x_n + \alpha_n^{(n)} p_n,    (33)

where

p_n = r_n - \frac{p_{n-1}^T A^2 r_n}{p_{n-1}^T A^2 p_{n-1}} p_{n-1},    (34)

\alpha_n^{(n)} = \frac{r_n^T A p_n}{p_n^T A^2 p_n}.    (35)

The residual of x_{n+1} is then obtained as

r_{n+1} = r_n - \alpha_n^{(n)} A p_n.    (36)

Equations (26)–(29) are valid when n is increased by 1, and the iterative procedure of the CR algorithm is therefore established by induction.23 In iteration n, we need to store the following vectors: x_n, p_{n-1}, A p_{n-1}, and r_n. The CR method thus requires the storage of one vector, A p, more than the CG method. In iteration n + 1, the linear transformation A r_n may be carried out, and x_{n+1}, p_n, and A p_n may be obtained by using Eqs. (33)–(35). Finally, r_{n+1} may be obtained by using Eq. (36). Each iteration represents a minimization of the residual in a subspace containing the subspace of the previous iteration with one new direction added, and the residual therefore decreases in each iteration. It is seen from Eq. (29) that the directions p_i constitute a set of linearly independent vectors. In iteration d, one is therefore minimizing the residual in the full d-dimensional vector space, and the exact solution x* is therefore obtained in this iteration. In exact arithmetic, the CR algorithm therefore converges to the exact solution in a number of iterations given by the dimension of the matrix.

If iteration d + 1 is started, the residual r_{d+1} is vanishing, so Eqs. (35) and (34) cannot be used to obtain a direction p_{d+1} that is conjugate to p_d. When the CR method is used, usually in a preconditioned form (see Sec. II C), a good approximation to the exact solution is obtained in a number of iterations that is much smaller than the dimension of the matrix. In Sec. IV, we will discuss the rate of convergence of the CG and CR methods.
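To make the bookkeeping of the preceding paragraph concrete, a minimal NumPy sketch of the CR recurrences of Eqs. (33)–(36) is given below. This is an illustration of ours, not the authors' implementation: the function name, convergence threshold, and iteration cap are arbitrary choices, and a symmetric (and, for guaranteed convergence, positive definite) matrix A is assumed.

```python
import numpy as np

def conjugate_residual(A, b, x0, tol=1e-10, max_iter=200):
    """Sketch of the CR iteration of Eqs. (33)-(36).

    Between iterations only x_n, p_{n-1}, A p_{n-1}, and r_n are kept,
    as described in the text.
    """
    x = np.asarray(x0, dtype=float).copy()
    r = b - A @ x
    p = r.copy()                         # iteration 1: the single direction p_0 = r_0
    Ap = A @ p
    for _ in range(max_iter):
        alpha = (r @ Ap) / (Ap @ Ap)     # Eq. (35): r_n^T A p_n / p_n^T A^2 p_n
        x = x + alpha * p                # Eq. (33)
        r = r - alpha * Ap               # Eq. (36)
        if np.linalg.norm(r) < tol:
            break
        Ar = A @ r                       # the one linear transformation per iteration
        beta = (Ap @ Ar) / (Ap @ Ap)     # p_{n-1}^T A^2 r_n / p_{n-1}^T A^2 p_{n-1}
        p = r - beta * p                 # Eq. (34)
        Ap = Ar - beta * Ap              # A p_n without a second transformation
    return x
```

The single matrix–vector product A r_n per iteration and the storage of only x_n, p_{n-1}, A p_{n-1}, and r_n mirror the storage requirements discussed above.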

B. The conjugate residual with optimal trial vectors algorithm

In the CR algorithm described in the previous subsection, iteration n + 1 may be viewed as a minimization of the residual norm in the space spanned by the p_i, i = 0, 1, ..., n - 1, and r_n. As the norm of a vector is independent of the basis in which it is expressed, any other basis that spans the same space produces the same trial solution x_{n+1}. There is therefore considerable flexibility in the way the search directions may be chosen. We consider a choice—the CROP method—where a generalization of the CR method to nonlinear equations is straightforward. Consider a space of the form of Eq. (30), but restrict the space to contain only the residual r_n and the last n - k search directions with 0 \le k \le n - 1,

x_{n+1} = x_n + \sum_{i=k}^{n-1} \alpha_i^{(n)} p_i + \alpha_n^{(n)} r_n.    (37)

From the residual r_n, we further introduce a preliminary improvement of x_n as

\tilde{x}_{n+1} = x_n + r_n.    (38)

Using Eq. (33) for a general index,

p_i = \frac{x_{i+1} - x_i}{\alpha_i^{(i)}},    (39)

gives

x_{n+1} = x_n + \sum_{i=k}^{n-1} \frac{\alpha_i^{(n)}}{\alpha_i^{(i)}} (x_{i+1} - x_i) + \alpha_n^{(n)} (\tilde{x}_{n+1} - x_n),    (40)

which may be written as

x_{n+1} = x_n + \sum_{i=k}^{n-1} \frac{\alpha_i^{(n)}}{\alpha_i^{(i)}} (x_{i+1} - x_n + x_n - x_i) + \alpha_n^{(n)} (\tilde{x}_{n+1} - x_n) = x_n + \sum_{i=k}^{n-1} c_i (x_i - x_n) + c_{n+1} (\tilde{x}_{n+1} - x_n),    (41)

where

c_i = \begin{cases} \alpha_n^{(n)} & \text{if } i = n + 1 \\ \alpha_{i-1}^{(n)}/\alpha_{i-1}^{(i-1)} - \alpha_i^{(n)}/\alpha_i^{(i)} & \text{if } k < i < n \\ -\alpha_k^{(n)}/\alpha_k^{(k)} & \text{if } i = k. \end{cases}    (42)

Note that no coefficient c_n is defined. The space spanned by r_n and the last n - k search directions is thus identical to the space spanned by \tilde{x}_{n+1} - x_n and x_i - x_n, i = k, k + 1, ..., n - 1. For k = 0, we have that the minimization of the residual norm is carried out in the space

x_{n+1} = x_n + \sum_{i=0}^{n-1} c_i (x_i - x_n) + c_{n+1} (\tilde{x}_{n+1} - x_n),    (43)

giving an approximate solution x_{n+1} that is identical to the one obtained in the CR method. Furthermore, as we have shown that the optimal trial vector of iteration n + 1 is contained in the space spanned by r_n and p_{n-1}, the optimal trial vector is contained in the space spanned by \tilde{x}_{n+1} - x_n and x_{n-1} - x_n,

x_{n+1} = x_n + c_{n-1} (x_{n-1} - x_n) + c_{n+1} (\tilde{x}_{n+1} - x_n).    (44)

We denote the parametrization of the CR algorithm in which the optimal trial vectors are used the CROP algorithm, and we now discuss the equations that may be used to determine the coefficients c_i of Eq. (43). The residual corresponding to Eq. (43) may be written as

r_{n+1} = r_n + \sum_{i=0}^{n-1} c_i (r_i - r_n) + c_{n+1} (\tilde{r}_{n+1} - r_n),    (45)

where

\tilde{r}_{n+1} = b - A \tilde{x}_{n+1}.    (46)

When the c-parametrization in Eq. (43) is used, the minimization of the norm of r_{n+1} in Eq. (45) requires that a small set of linear equations is solved. Minimizing \|r_{n+1}\|^2 with respect to c_i gives the set of linear equations of dimension n + 1

D c = h,    (47)

where

D_{ij} = \begin{cases} (r_i - r_n)^T (r_j - r_n) & \text{if } i, j \le n \\ (r_i - r_n)^T (\tilde{r}_{n+1} - r_n) & \text{if } i \le n, j = n + 1 \\ (\tilde{r}_{n+1} - r_n)^T (r_j - r_n) & \text{if } i = n + 1, j \le n \\ (\tilde{r}_{n+1} - r_n)^T (\tilde{r}_{n+1} - r_n) & \text{if } i = j = n + 1, \end{cases}    (48)

h_i = \begin{cases} -(r_i - r_n)^T r_n & \text{if } i \le n \\ -(\tilde{r}_{n+1} - r_n)^T r_n & \text{if } i = n + 1. \end{cases}    (49)

The equations for determining the coefficients c may be solved in an alternative fashion, which emphasizes the similarity to the DIIS method of Pulay.15,16 By introducing in Eq. (43) the coefficient

c_n = 1 - \sum_{i=0}^{n-1} c_i - c_{n+1},    (50)

a symmetric expression is obtained,

x_{n+1} = \sum_{i=0}^{n} c_i x_i + c_{n+1} \tilde{x}_{n+1},    (51)

where the parameters c_i are constrained by the relation obtained from Eq. (50),

\sum_{i=0}^{n+1} c_i = 1,    (52)

and the expansion coefficients are determined by minimizing the residual norm of

r_{n+1} = \sum_{i=0}^{n} c_i r_i + c_{n+1} \tilde{r}_{n+1}.    (53)

The reduced expression for x_{n+1} of Eq. (44) may similarly be written as

x_{n+1} = c_{n-1} x_{n-1} + c_n x_n + c_{n+1} \tilde{x}_{n+1},    (54)

with the constraint

c_{n-1} + c_n + c_{n+1} = 1,    (55)

where the expansion coefficients are determined by minimizing the norm of the residual

r_{n+1} = c_{n-1} r_{n-1} + c_n r_n + c_{n+1} \tilde{r}_{n+1}.    (56)

The minimization of the norm of the residual of x_{n+1} in the parametrization of Eq. (51) with the constraint of Eq. (52) may be carried out as a standard DIIS determination of the expansion coefficients. Using an undetermined Lagrange multiplier \lambda to ensure the fulfillment of the constraint Eq. (52), the minimization is expressed in terms of the solution of a set of n + 1 linear equations

\begin{pmatrix} B^{CR} & -1 \\ -1^T & 0 \end{pmatrix} \begin{pmatrix} c \\ \lambda \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix},    (57)

where

B^{CR}_{ij} = \begin{cases} r_i^T r_j & \text{if } i, j \le n \\ r_i^T \tilde{r}_{n+1} & \text{if } i \le n, j = n + 1 \\ \tilde{r}_{n+1}^T r_j & \text{if } i = n + 1, j \le n \\ \tilde{r}_{n+1}^T \tilde{r}_{n+1} & \text{if } i = j = n + 1. \end{cases}    (58)

C. The preconditioned conjugate residual algorithm

To improve convergence, it is a standard practice to introduce preconditioning, i.e., to introduce a coordinate transformation that produces a new set of equations with a matrix that has a lower condition number. To accomplish this transformation, the linear equations, Eq. (1), are multiplied with the transpose of a nonsingular matrix P,

P^T A x - P^T b = 0.    (59)

Introducing the new set of coordinates Y = P^{-1} x, Eq. (59) gives

A^P Y - b^P = 0,    (60)

where

A^P = P^T A P,   b^P = P^T b.    (61)

We may thus solve Eq. (60) by using the CR algorithm and then backtransform this solution to the original coordinates. An alternative approach is to solve Eq. (60) in the original basis using the modified CR equations. This may be done using Eq. (61), writing the residual in the Y basis as

r^P = b^P - A^P Y = P^T r,    (62)

and introducing the matrix C,

C^{-1} = P P^T.    (63)

The modified CR equations then read as

x_{n+1} = x_n + \alpha_n^{(n)} C^{-1} p_n,    (64)

where

p_n = r_n - \frac{p_{n-1}^T C^{-1} A C^{-1} A C^{-1} r_n}{p_{n-1}^T C^{-1} A C^{-1} A C^{-1} p_{n-1}} p_{n-1},    (65)

\alpha_n^{(n)} = \frac{r_n^T C^{-1} A C^{-1} p_n}{p_n^T C^{-1} A C^{-1} A C^{-1} p_n},    (66)

and

r_{n+1} = r_n - \alpha_n^{(n)} A C^{-1} p_n.    (67)

Choosing P such that C is a good approximation to A ensures that the linear equations are solved on a basis where the matrix A has a lower condition number. In the preconditioned CR algorithm, we store in iteration n: x_n, C^{-1} p_{n-1}, A C^{-1} p_{n-1}, and C^{-1} r_n, and we carry out the A C^{-1} r_n transformation.

D. The preconditioned conjugate residual with optimal trial vectors method

In Sec. II C, we have introduced preconditioning of the CR algorithm. In this section, we precondition the CROP algorithm. We use the CROP algorithm in the form of Eq. (57), as this is most commonly used. Since the CROP algorithm uses Eq. (57) to determine the expansion coefficients, we have to calculate the elements of the B matrix in the Y basis. For example, the element for i, j < n + 1 becomes

B_{ij} = (r_i^P)^T r_j^P = (P^T r_i)^T P^T r_j = r_i^T P P^T r_j = r_i^T C^{-1} r_j,    (68)

where we have used Eq. (63). Equation (58) takes the form

B_{ij} = \begin{cases} r_i^T C^{-1} r_j & \text{if } i, j \le n \\ r_i^T C^{-1} \tilde{r}_{n+1} & \text{if } i \le n, j = n + 1 \\ \tilde{r}_{n+1}^T C^{-1} r_j & \text{if } i = n + 1, j \le n \\ \tilde{r}_{n+1}^T C^{-1} \tilde{r}_{n+1} & \text{if } i = j = n + 1. \end{cases}    (69)

The next trial vector in the Y basis may, according to Eq. (38), be written as

\tilde{Y}_{n+1} = Y_n + r_n^P.    (70)

Using Y = P^{-1} x and Eq. (63), we can backtransform Eq. (70) to the x basis:

\tilde{x}_{n+1} = x_n + P P^T r_n = x_n + C^{-1} r_n.    (71)

In our implementation using the reduced expressions of Eq. (54), we store the vectors {x_{n-1}, x_n, \tilde{x}_{n+1}} and the unpreconditioned residuals {r_{n-1}, r_n, \tilde{r}_{n+1}}. The preconditioner is instead used directly when B is determined in Eq. (69) and when a new trial vector is calculated using Eq. (71). Note that it is only when the matrix B includes the preconditioner, as in Eq. (69), that the results of using the three-dimensional basis {x_{n-1}, x_n, \tilde{x}_{n+1}} and the full basis are identical.
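The three-vector storage scheme just described can be summarized in a short NumPy sketch. The sketch below is ours, not the authors' implementation: a diagonal preconditioner C is assumed and supplied as a vector of diagonal elements, the B matrix of Eq. (69) is built from the unpreconditioned residuals, and the constrained system of Eq. (57) is solved explicitly in each iteration.

```python
import numpy as np

def crop3_linear(A, b, Cdiag, x0, tol=1e-10, max_iter=200):
    """Sketch of the preconditioned CROP algorithm with a three-vector subspace.

    Stores {x_{n-1}, x_n, xtilde_{n+1}} and the unpreconditioned residuals,
    as described in Sec. II D.
    """
    Cinv = 1.0 / Cdiag                      # diagonal preconditioner (an assumption)
    xs = [np.asarray(x0, dtype=float)]
    rs = [b - A @ xs[0]]
    for _ in range(max_iter):
        if np.linalg.norm(rs[-1]) < tol:
            break
        xt = xs[-1] + Cinv * rs[-1]         # preliminary vector, Eq. (71)
        rt = b - A @ xt
        vecs = xs + [xt]
        ress = rs + [rt]
        m = len(vecs)
        B = np.array([[ri @ (Cinv * rj) for rj in ress] for ri in ress])  # Eq. (69)
        M = np.zeros((m + 1, m + 1))
        M[:m, :m] = B
        M[:m, m] = M[m, :m] = -1.0
        rhs = np.zeros(m + 1)
        rhs[m] = -1.0
        c = np.linalg.solve(M, rhs)[:m]     # Eq. (57); the last entry is the multiplier
        x_new = sum(ci * vi for ci, vi in zip(c, vecs))
        r_new = sum(ci * ri for ci, ri in zip(c, ress))   # valid for linear equations
        xs = (xs + [x_new])[-2:]            # keep only the two most recent optimal vectors
        rs = (rs + [r_new])[-2:]
    return xs[-1]
```

For linear equations, this truncated variant is intended to reproduce the untruncated iteration in exact arithmetic, in line with the discussion of Sec. II B.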

III. MINIMAL RESIDUAL METHODS FOR NONLINEAR EQUATIONS

We now turn our attention to the solution of sets of nonlinear equations. For a vector-valued function V = V(x), we are thus interested in finding a vector x* such that

V(x^*) = 0.    (72)

The solution vector x* may be determined using an iterative procedure. At iteration n, we have the approximate solution x_n, and at iteration n + 1, the approximate solution may be expressed as follows:

x_{n+1} = x_n + \Delta x_n.    (73)

The vector function may be expanded around x_n as

V(x_{n+1}) = V(x_n) + V^{(1)}(x_n) \Delta x_n + O(\|\Delta x_n\|^2),    (74)

where V^{(1)}(x_n) is the Jacobian. For a linear vector function, the determination of x* is equivalent to solving linear equations. For a nonlinear expansion, the residual is defined here in terms of the linearized form of V(x),

r(x_{n+1}) = V(x_n) + V^{(1)}(x_n) \Delta x_n.    (75)

\Delta x_n may be determined by minimizing the residual norm,

\frac{\partial}{\partial \Delta x_n} r^T(x_{n+1}) r(x_{n+1}) = 2 V^{(1)}(x_n)^T (V(x_n) + V^{(1)}(x_n) \Delta x_n) = 0,    (76)

which is satisfied when the Newton equations are solved,

V(x_n) + V^{(1)}(x_n) \Delta x_n = 0.    (77)

Minimizing the residual norm leads to a quadratically convergent iterative scheme. Each iteration of the minimum residual scheme requires that a set of linear equations [Eq. (77)] is solved. Although the set of linear equations may be solved using direct methods without explicit construction of V^{(1)}, this is not an efficient approach for most quantum chemical problems. Significantly more efficient methods may be obtained using quasi-Newton methods, where the exact Jacobian is replaced by an approximate Jacobian. In the next subsection, we show that the DIIS algorithm may be viewed as a quasi-Newton method. Further, if optimal solution vectors are stored in the DIIS algorithm, a generalization of the CROP algorithm is obtained.
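For orientation, a minimal sketch of the Newton scheme of Eqs. (73) and (77) is given below. It is an illustration under the assumption that the exact Jacobian is available as a callable, which, as noted above, is rarely practical for the large problems considered here; function and argument names are our own.

```python
import numpy as np

def newton_solve(V, jac, x0, tol=1e-10, max_iter=50):
    """Sketch of the locally quadratically convergent Newton scheme:
    each iteration solves V(x_n) + V^(1)(x_n) dx_n = 0, Eq. (77)."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        v = V(x)
        if np.linalg.norm(v) < tol:
            break
        dx = np.linalg.solve(jac(x), -v)   # Newton equations, Eq. (77)
        x = x + dx                         # Eq. (73)
    return x
```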

A. The DIIS method for nonlinear equations

Let us assume that we have carried out an iterative procedure and at iteration n have stored the subspace \tilde{x}^n: {\tilde{x}_0, \tilde{x}_1, \tilde{x}_2, ..., \tilde{x}_n} and the transformed subspace \tilde{V}^n: {V(\tilde{x}_0), V(\tilde{x}_1), V(\tilde{x}_2), ..., V(\tilde{x}_n)}. The optimal solution in the subspace \tilde{x}^n may be parametrized as

x_n = \tilde{x}_n + \sum_{i=0}^{n-1} c_i (\tilde{x}_i - \tilde{x}_n),    (78)

and may be found by minimizing the norm of the residual

r(x_n) = V(\tilde{x}_n) + V^{(1)}(\tilde{x}_n) \sum_{i=0}^{n-1} c_i (\tilde{x}_i - \tilde{x}_n).    (79)

Expanding V(x) as

V(\tilde{x}_i) = V(\tilde{x}_n) + V^{(1)}(\tilde{x}_n) (\tilde{x}_i - \tilde{x}_n) + O(\|\tilde{x}_i - \tilde{x}_n\|^2)    (80)

and truncating after linear terms gives the quasi-Newton condition

V(\tilde{x}_i) - V(\tilde{x}_n) = V^{(1)}(\tilde{x}_n) (\tilde{x}_i - \tilde{x}_n).    (81)

Using the quasi-Newton condition, the residual in Eq. (79) may be written as

r(x_n) = V(\tilde{x}_n) + \sum_{i=0}^{n-1} c_i (V(\tilde{x}_i) - V(\tilde{x}_n)) = \left(1 - \sum_{i=0}^{n-1} c_i\right) V(\tilde{x}_n) + \sum_{i=0}^{n-1} c_i V(\tilde{x}_i) = \sum_{i=0}^{n} c_i V(\tilde{x}_i),    (82)

where we have introduced the parameter c_n and the constraint

\sum_{i=0}^{n} c_i = 1.    (83)

Minimizing the residual norm r^T(x_n) r(x_n) with respect to the parameters c_i, i = 0, 1, ..., n, with the constraint Eq. (83) represents a standard DIIS determination of the {c_i} coefficients, considering {V(\tilde{x}_0), V(\tilde{x}_1), ..., V(\tilde{x}_n)} as error vectors.15,16 Using an undetermined Lagrange multiplier to ensure that the constraint Eq. (83) is fulfilled, the expansion coefficients in DIIS are determined from the equations

\begin{pmatrix} B^{DIIS} & -1 \\ -1^T & 0 \end{pmatrix} \begin{pmatrix} c \\ \lambda \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix},    (84)

where

B^{DIIS}_{ij} = V^T(\tilde{x}_i) V(\tilde{x}_j),   i, j \le n.    (85)

The preliminary solution at iteration n + 1 may be expressed as

\tilde{x}_{n+1} = x_n + r(x_n),    (86)

where r(x_n) is obtained from Eq. (82) with the coefficients determined from Eq. (84). The DIIS iterative procedure is established by adding \tilde{x}_{n+1} and V(\tilde{x}_{n+1}) to the subspaces \tilde{x}^n and \tilde{V}^n, respectively. Within the minimal residual framework, the DIIS algorithm may therefore be viewed as a quasi-Newton method where the quasi-Newton condition is applied in the \tilde{x}^n subspace.

For the case where the expansion of V(x) contains only linear terms, the DIIS algorithm gives the same result as the CROP parametrization of the CR algorithm. To see this, Eq. (86) may be compared to Eq. (38). In both cases, \tilde{x}_{n+1} consists of the approximate solution x_n with the residual r(x_n) added. In Eq. (86), x_n is the optimal solution in the subspace {\tilde{x}_0, \tilde{x}_1, ..., \tilde{x}_{n-1}, \tilde{x}_n}, while x_n in Eq. (38) is the optimal solution in the subspace {x_0, x_1, ..., x_{n-1}, \tilde{x}_n} [in Eq. (51) the subspace is given for x_{n+1}]. These two subspaces span the same space, and the same solution x_n is therefore obtained as long as no truncations are carried out in the subspaces. The DIIS and the CROP algorithms therefore give identical iteration sequences.

When a truncation is carried out in the subspaces, different results are obtained depending on whether it is performed in the {\tilde{x}_0, \tilde{x}_1, ..., \tilde{x}_{n-1}, \tilde{x}_n} or in the {x_0, x_1, ..., x_{n-1}, \tilde{x}_n} subspace. For an expansion V(x) with only linear terms and a symmetric Jacobian, all except the three last vectors may be discarded in the {x_0, x_1, ..., x_{n-1}, \tilde{x}_n} subspace, still giving the same x_n value, as shown in Sec. II B for the CROP method. For the subspace {x_0, x_1, ..., x_{n-1}, \tilde{x}_n}, we also have that the vector norms in the transformed subspace \|r(x_i)\| decrease for increasing i when V(x) is a linear function. On the contrary, in the subspace {\tilde{x}_0, \tilde{x}_1, ..., \tilde{x}_{n-1}, \tilde{x}_n}, the vector norms in the transformed subspace \|V(\tilde{x}_i)\| may increase for increasing i. Trial vectors \tilde{x}_i with large values of i are therefore not necessarily better approximations to x* than those with lower i. This has serious consequences for the convergence, and no truncations can be carried out in the subspace {\tilde{x}_0, \tilde{x}_1, ..., \tilde{x}_{n-1}, \tilde{x}_n} without dramatic consequences for the convergence of the iterative scheme. Summarizing, it may be an advantage to set up an iteration algorithm where the subspace {x_0, x_1, ..., x_{n-1}, \tilde{x}_n} rather than {\tilde{x}_0, \tilde{x}_1, ..., \tilde{x}_{n-1}, \tilde{x}_n} is used. We describe in the next subsection the CROP generalization of the DIIS algorithm where this is done.
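To fix ideas, a schematic DIIS loop along the lines of Eqs. (78)–(86) is sketched below. It is a bare illustration of ours: preconditioning is omitted, the vector function V is assumed scaled such that x + V(x) is a sensible preliminary step, and the function name and the optional history truncation are not part of the original presentation.

```python
import numpy as np

def diis_nonlinear(V, x0, max_vecs=None, tol=1e-10, max_iter=100):
    """Sketch of the DIIS iteration of Sec. III A: the subspace holds the
    preliminary vectors xtilde_i and the error vectors V(xtilde_i)."""
    xts = [np.asarray(x0, dtype=float)]
    vts = [V(xts[0])]
    for _ in range(max_iter):
        if np.linalg.norm(vts[-1]) < tol:   # exact V at the newest point judges convergence
            return xts[-1]
        if max_vecs is not None:            # optional truncation of the history
            xts, vts = xts[-max_vecs:], vts[-max_vecs:]
        m = len(xts)
        B = np.array([[vi @ vj for vj in vts] for vi in vts])   # Eq. (85)
        M = np.zeros((m + 1, m + 1))
        M[:m, :m] = B
        M[:m, m] = M[m, :m] = -1.0
        rhs = np.zeros(m + 1)
        rhs[m] = -1.0
        c = np.linalg.solve(M, rhs)[:m]     # Eq. (84)
        x_opt = sum(ci * xi for ci, xi in zip(c, xts))          # Eq. (78) with Eq. (83)
        r_opt = sum(ci * vi for ci, vi in zip(c, vts))          # Eq. (82)
        xts.append(x_opt + r_opt)           # preliminary vector, Eq. (86)
        vts.append(V(xts[-1]))
    return xts[-1]
```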

B. The conjugate residual with optimal trial vector algorithm for nonlinear equations

Let us assume that we have carried out an iterative procedure where the subspace is built from the optimal vectors x_i. In iteration n, we thus have available the subspace x^n: {x_0, x_1, x_2, ..., x_n} and the transformed subspace r^n: {r(x_0), r(x_1), r(x_2), ..., r(x_n)}. Our next vector is given as

\tilde{x}_{n+1} = x_n + r(x_n).    (87)

From \tilde{x}_{n+1}, we may calculate the vector V(\tilde{x}_{n+1}). We next introduce the intermediate subspace \bar{x}^{n+1}: {x_0, x_1, x_2, ..., x_n, \tilde{x}_{n+1}} and the corresponding transformed subspace \bar{r}^{n+1}: {r(x_0), r(x_1), r(x_2), ..., r(x_n), V(\tilde{x}_{n+1})}. We may proceed to find the optimal solution in the \bar{x}^{n+1} subspace using an outline similar to the one in the previous section, where the optimal solution vector is parametrized as

x_{n+1} = \sum_{i=0}^{n} c_i x_i + c_{n+1} \tilde{x}_{n+1},    (88)

with the constraint

\sum_{i=0}^{n+1} c_i = 1,    (89)

and where the coefficients are determined by minimizing the norm of the residual

r(x_{n+1}) = \sum_{i=0}^{n} c_i r(x_i) + c_{n+1} V(\tilde{x}_{n+1}).    (90)

The expansion coefficients in Eq. (90) may be determined by solving the linear equations of Eq. (57), where B^{CR} is replaced by

B^{CROP}_{ij} = \begin{cases} r^T(x_i) r(x_j) & \text{if } i, j \le n \\ r^T(x_i) V(\tilde{x}_{n+1}) & \text{if } i \le n, j = n + 1 \\ V^T(\tilde{x}_{n+1}) r(x_j) & \text{if } i = n + 1, j \le n \\ V^T(\tilde{x}_{n+1}) V(\tilde{x}_{n+1}) & \text{if } i = j = n + 1. \end{cases}    (91)

After having calculated x_{n+1} and r(x_{n+1}), these may be added to the x^n and r^n subspaces, and the iterative procedure is established.

It may appear to be an undesirable feature of the above algorithm that the residuals r(x_i), and not the exact vector function values V(x_i), are used in Eq. (90). However, for a nonlinear vector function, terms that are quadratic in x_i - x_n are in any case neglected when the vector function V is linearized and the quasi-Newton condition is applied. It is also important to note that for the point of expansion \tilde{x}_{n+1} in the intermediate subspace, the exact vector function is used. At this point, it may therefore be established whether the iteration sequence has converged. The vectors in the transformed subspace are obtained using a norm-minimization algorithm, and the norms of the transformed vectors \|r(x_i)\| therefore decrease for increasing i for linear vector functions. Furthermore, as described previously, the algorithm reduces to the CROP parametrization of the CR algorithm for linear vector functions. For vector functions with small nonlinearities and a nearly symmetric Jacobian, only a limited loss of convergence may therefore be expected by reducing the subspace to the three-dimensional subspace containing x_{n-1}, x_n, and \tilde{x}_{n+1}. Note, however, that the subspace and transformed-subspace vectors do not satisfy conjugacy relations even for a vector function with a small nonlinearity, and that the subspace equations representing the solution of Eq. (88) [or Eq. (88) in a truncated form] therefore must be solved explicitly.

The CROP algorithm has the same form for sets of linear and nonlinear equations. The only difference is in the last element of the intermediate subspaces, where the linear or nonlinear transformation is explicitly carried out. For the rest of the elements in the subspace, a linear mixing of the transformed vectors is carried out. To improve convergence, all calculations should be carried out by applying a preconditioner. In the CROP algorithm, the application of the preconditioner [Eq. (63)] modifies the subspace equations in terms of the elements of the B matrix, which according to Eq. (69) become

B_{ij} = r_i^T C^{-1} r_j.    (92)

New trial vectors are obtained, according to Eq. (71), as

\tilde{x}_{n+1} = x_n + C^{-1} r(x_n).    (93)
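The nonlinear CROP iteration with a truncated subspace can be summarized in the following sketch, written by us for illustration: preconditioning is left out, V is assumed scaled such that x + V(x) is a reasonable preliminary step [Eq. (87)], and nvecs = 3 corresponds to the three-dimensional subspace {x_{n-1}, x_n, \tilde{x}_{n+1}} discussed above.

```python
import numpy as np

def crop_nonlinear(V, x0, nvecs=3, tol=1e-10, max_iter=100):
    """Sketch of the CROP iteration for nonlinear equations (Sec. III B).

    The subspace holds the optimal vectors x_i and their residuals r(x_i);
    each iteration adds the preliminary vector of Eq. (87), solves Eq. (57)
    with the B matrix of Eq. (91), and contracts back to x_{n+1} and
    r(x_{n+1}) according to Eqs. (88) and (90).
    """
    xs = [np.asarray(x0, dtype=float)]
    rs = [V(xs[0])]                          # for the starting point, r(x_0) = V(x_0)
    for _ in range(max_iter):
        xt = xs[-1] + rs[-1]                 # Eq. (87)
        vt = V(xt)                           # exact vector function at the expansion point
        if np.linalg.norm(vt) < tol:
            return xt
        vecs = xs[-(nvecs - 1):] + [xt]
        errs = rs[-(nvecs - 1):] + [vt]
        m = len(vecs)
        B = np.array([[ei @ ej for ej in errs] for ei in errs])   # Eq. (91)
        M = np.zeros((m + 1, m + 1))
        M[:m, :m] = B
        M[:m, m] = M[m, :m] = -1.0
        rhs = np.zeros(m + 1)
        rhs[m] = -1.0
        c = np.linalg.solve(M, rhs)[:m]      # Eq. (57)
        xs.append(sum(ci * vi for ci, vi in zip(c, vecs)))         # Eq. (88)
        rs.append(sum(ci * ei for ci, ei in zip(c, errs)))         # Eq. (90)
    return xs[-1]
```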

IV. CONVERGENCE RATE OF THE CONJUGATE RESIDUAL ALGORITHM

A. The CG and CR methods as error-minimization methods

A measure of the error of the approximate solution x_n to Eq. (1) is the error vector

e_n = x_n - x^*,    (94)

where x* is the exact solution. In the nth iteration of the CG method, the function \frac{1}{2} x_n^T A x_n - x_n^T b is minimized.4,5 This minimization may be expressed in terms of the error vector as

\frac{1}{2} x_n^T A x_n - x_n^T b = \frac{1}{2} e_n^T A e_n - \frac{1}{2} x^{*T} b.    (95)

Introducing the norm for a vector x and a symmetric and positive definite matrix M,

\|x\|_M = \sqrt{x^T M x},    (96)

it is seen that the CG method in iteration n minimizes the norm \|e_n\|_A. The residual may be expressed in terms of the error vector as

- r_n = A e_n.    (97)

The CR method minimizes in iteration n the residual norm \|r_n\|, which using Eqs. (96) and (97) may be rewritten as

\|r_n\| = \sqrt{(A e_n)^T (A e_n)} = \|e_n\|_{A^2},    (98)

showing that the CR method in iteration n minimizes the norm \|e_n\|_{A^2}. The CG and the CR methods thus minimize various vector space norms \|e_n\|_{A^q},

q = 1 for CG,   q = 2 for CR.    (99)

We now examine the convergence rate of the CR method and compare it to that of the CG method. The CR and CROP methods are mathematically identical for linear equations, and the CROP method is therefore not considered in this subsection. Although the CR algorithm converges for general symmetric matrices, in the following it will be assumed that A is symmetric and positive definite.

In both the CG and CR methods, the approximate solution x_n may be written as

x_n = x_0 + \sum_{i=0}^{n-1} \gamma_i A^i r_0,    (100)

where the expansion coefficients are determined by minimizing the appropriate vector space norm. The vector x_n - x_0 is a vector in the Krylov space K_n(A, r_0) = span{r_0, A r_0, A^2 r_0, ..., A^{n-1} r_0}. Using Eq. (100), the error vector in iteration n may be written as a polynomial P_n of order n in the matrix A times the error vector of the initial iteration,

e_n = P_n(A) e_0 = \left(1 - \sum_{i=1}^{n} \gamma_i A^i\right) e_0.    (101)

From Eq. (101), it is seen that

P_n(0) = 1.    (102)

The polynomial P_n is determined in the various iterations of the CG or CR algorithm. In iteration n, the norm \|e_n\|_{A^q} is minimized, and we may therefore consider the polynomial to be the solution of the variational problem

\|e_n\|_{A^q} = \min_{P_n} \|P_n(A) e_0\|_{A^q},    (103)

with the constraint Eq. (102). To proceed, we introduce the spectral resolution of the matrix A in terms of eigenvalues \alpha_k and orthonormal eigenvectors v_k,

A = \sum_{k=1}^{d} \alpha_k v_k v_k^T,    (104)

v_k^T v_l = \delta_{kl}.    (105)

Writing the initial error vector in the eigenvector basis,

e_0 = \sum_{k=1}^{d} \xi_k v_k,    (106)

the vector A^i e_0 may be expressed as

A^i e_0 = \sum_{k=1}^{d} \alpha_k^i \xi_k v_k,    (107)

where we have used Eqs. (104) and (105). Using Eq. (101), the error vector of iteration n may, in the spectral representation, be expressed as

e_n = \sum_{k=1}^{d} P_n(\alpha_k) \xi_k v_k,    (108)

where P_n(\alpha_k) is the scalar polynomial corresponding to the matrix polynomial P_n(A) and therefore satisfies

P_n(0) = 1.    (109)

From Eq. (103), it is seen that P_n may be defined as the polynomial that is the solution to the minimization problem

\|e_n\|_{A^q}^2 = \min_{P_n} \left( \sum_{k=1}^{d} (P_n(\alpha_k))^2 \alpha_k^q \xi_k^2 \right)    (110)

with the constraint Eq. (109).

B. Convergence of the CR and CG methods

From Eq. (110), it is seen that the convergence of an iteration sequence is very dependent on the number of different eigenvalues of A. As \alpha_k and \xi_k^2 are positive quantities, convergence is obtained when the polynomial P_n(\alpha_k) has all eigenvalues as roots. As a polynomial of order m has at most m roots, convergence is obtained when the order of the polynomial P_n (the number of iterations) is equal to the number of different eigenvalues of A. The CG and CR methods therefore converge fast when there are only a few distinct eigenvalues. The number of different eigenvalues of A is usually much larger than the number of iterations, and it therefore becomes important to estimate the rate of convergence of the iteration sequence. To this end, we first note that if \mathcal{A} is the set of all eigenvalues of A, then for arbitrary polynomials P_n we have

\sum_{k=1}^{d} (P_n(\alpha_k))^2 \alpha_k^q \xi_k^2 \le \max_{\alpha \in \mathcal{A}} (P_n(\alpha))^2 \sum_{k=1}^{d} \alpha_k^q \xi_k^2 = \max_{\alpha \in \mathcal{A}} (P_n(\alpha))^2 \|e_0\|_{A^q}^2,    (111)

where \alpha is the eigenvalue for which the polynomial P_n has its largest value and Eqs. (104) and (106) are used to identify \|e_0\|_{A^q}^2. Inserting Eq. (111) in Eq. (110) gives an upper bound for the norm of the error vector

\|e_n\|_{A^q}^2 \le \min_{P_n} \max_{\alpha \in \mathcal{A}} (P_n(\alpha))^2 \|e_0\|_{A^q}^2,    (112)

where P_n(0) = 1. The polynomial that minimizes the error norm of Eq. (112) is not known. To obtain an estimate of it, we replace the above minimax problem for a discrete set of parameters \mathcal{A} with a minimax problem for a variable \alpha defined on the interval between the smallest and largest eigenvalues of A, \alpha_{min} and \alpha_{max},

\frac{\|e_n\|_{A^q}}{\|e_0\|_{A^q}} \le \min_{P_n} \max_{\alpha \in [\alpha_{min}, \alpha_{max}]} |P_n(\alpha)|.    (113)

This replacement is feasible if the dimension of the matrix is large and the eigenvalues are evenly distributed in the interval between \alpha_{min} and \alpha_{max}. In the following, we will assume that these conditions are fulfilled.

The polynomials that have the smallest maximum numerical value in a given interval are the so-called Chebyshev polynomials of the first kind, or just the Chebyshev polynomials, and they are well studied in the mathematical literature.24 We give a brief introduction to these. The Chebyshev polynomials T_n(x) are defined in the interval [-1, 1] in terms of trigonometric functions,

T_n(x) = \cos(n \arccos(x)).    (114)

In the interval [0, \pi], where \arccos x is defined, it may easily be shown that

\cos(n\theta) = \frac{1}{2} [(\cos\theta + \sqrt{\cos^2\theta - 1})^n + (\cos\theta + \sqrt{\cos^2\theta - 1})^{-n}],    (115)

using \exp(i\theta) = \cos\theta + \sqrt{\cos^2\theta - 1} and \cos n\theta = \frac{1}{2}[\exp(in\theta) + \exp(-in\theta)]. Inserting Eq. (115) in Eq. (114) and setting \theta = \arccos x gives the following closed form for the Chebyshev polynomials:

T_n(x) = \frac{1}{2} [(x + \sqrt{x^2 - 1})^n + (x + \sqrt{x^2 - 1})^{-n}].    (116)

The most important feature of the Chebyshev polynomials in the present context is their minimax property, which may be stated as follows.6 Consider polynomials P_n(x) of order n defined in an interval -1 \le x \le 1 and having a fixed predefined value C at a given point x_C, i.e., P_n(x_C) = C. Among these polynomials, the polynomial which has the smallest largest numerical value is the scaled Chebyshev polynomial, c T_n, where c is a constant that is determined by the predefined value of the polynomial.

To apply the minimax property of the Chebyshev polynomials to the minimax problem in Eq. (113), we first have to carry out a variable substitution where the variable \alpha, defined in the interval [\alpha_{min}, \alpha_{max}], is substituted with another variable defined in the interval [-1, 1]. This variable substitution may be carried out by introducing

\omega = \frac{\alpha_{max} + \alpha_{min} - 2\alpha}{\alpha_{max} - \alpha_{min}},    (117)

where \omega goes from 1 to -1 when \alpha goes from \alpha_{min} to \alpha_{max}. The polynomial P_n(\alpha) may therefore be expressed as

P_n(\alpha) = c T_n\left(\frac{\alpha_{max} + \alpha_{min} - 2\alpha}{\alpha_{max} - \alpha_{min}}\right),    (118)

where the coefficient c is determined from the requirement that P_n(0) = 1. The c value therefore becomes

c = \frac{1}{T_n\left(\dfrac{\alpha_{max} + \alpha_{min}}{\alpha_{max} - \alpha_{min}}\right)},    (119)

and the polynomial in Eq. (118) becomes

P_n(\alpha) = \frac{T_n\left(\dfrac{\alpha_{max} + \alpha_{min} - 2\alpha}{\alpha_{max} - \alpha_{min}}\right)}{T_n\left(\dfrac{\alpha_{max} + \alpha_{min}}{\alpha_{max} - \alpha_{min}}\right)}.    (120)

The polynomial in Eq. (120) is thus the polynomial of order n which has the smallest maximum value in the interval \alpha_{min} \le \alpha \le \alpha_{max}. Inserting this polynomial in Eq. (113) gives

\frac{\|e_n\|_{A^q}}{\|e_0\|_{A^q}} \le \max_{\alpha \in [\alpha_{min}, \alpha_{max}]} \left| \frac{T_n\left(\dfrac{\alpha_{max} + \alpha_{min} - 2\alpha}{\alpha_{max} - \alpha_{min}}\right)}{T_n\left(\dfrac{\alpha_{max} + \alpha_{min}}{\alpha_{max} - \alpha_{min}}\right)} \right|.    (121)

From Eq. (114), it is seen that the numerical values of the Chebyshev polynomial are less than or equal to 1. T_n[(\alpha_{max} + \alpha_{min} - 2\alpha)/(\alpha_{max} - \alpha_{min})] is therefore less than or equal to 1 for \alpha \in [\alpha_{min}, \alpha_{max}], and Eq. (121) may therefore be simplified to

\frac{\|e_n\|_{A^q}}{\|e_0\|_{A^q}} \le \left( \left| T_n\left(\frac{\alpha_{max} + \alpha_{min}}{\alpha_{max} - \alpha_{min}}\right) \right| \right)^{-1}.    (122)

Introducing the condition number of the matrix,

\kappa = \frac{\alpha_{max}}{\alpha_{min}},    (123)

Eq. (122) becomes

\frac{\|e_n\|_{A^q}}{\|e_0\|_{A^q}} \le \left( \left| T_n\left(\frac{\kappa + 1}{\kappa - 1}\right) \right| \right)^{-1}.    (124)

Using Eq. (116) for the Chebyshev polynomial gives, after a bit of algebra,

\frac{\|e_n\|_{A^q}}{\|e_0\|_{A^q}} \le \frac{2}{\left(\dfrac{\sqrt{\kappa} + 1}{\sqrt{\kappa} - 1}\right)^n + \left(\dfrac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^n}.    (125)

As the term (\sqrt{\kappa} - 1)/(\sqrt{\kappa} + 1) goes to zero for large values of n, it is standard to neglect this term in the convergence rate, giving the standard expression for the convergence of the CG and CR methods,

\frac{\|e_n\|_{A^q}}{\|e_0\|_{A^q}} \le 2 \left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^n.    (126)

We thus conclude that the CG and CR methods have the same rate of convergence for a matrix of large dimension and with evenly distributed eigenvalues.

In the above, we have analyzed errors in terms of the error vectors e_i, which may be computed only after convergence. The errors may alternatively be expressed in terms of the residuals r_i [see Eq. (97)], which are computed in each iteration. In the CR algorithm, the norms \|r_i\| are by definition minimized in each iteration. Using the CG algorithm that minimizes the norm \|e_i\|_A, it is seen that each iteration in this method minimizes the norm \|r_i\|_{A^{-1}}, which is not trivial to compute.

In actual calculations, significantly faster convergence is often observed than predicted by Eq. (126). In fact, the CG method, in general, converges superlinearly.25 This arises because the CG algorithm during the iterative process accurately solves the linear equations for the extreme eigenvalues, and the effective condition number of the matrix therefore becomes significantly reduced during the iterative procedure.
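As a hedged illustration of the bound in Eq. (126), the small script below estimates how many iterations the bound requires for a given error reduction at an assumed condition number; the numbers are purely illustrative and say nothing about the calculations reported in the paper.

```python
import math

def iterations_for(kappa, reduction):
    """Smallest n with 2*((sqrt(kappa)-1)/(sqrt(kappa)+1))**n <= reduction, cf. Eq. (126)."""
    rho = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    return math.ceil(math.log(reduction / 2.0) / math.log(rho))

# Example: kappa = 1e4 gives roughly 500 iterations for a 10^-4 reduction of the error norm,
# while kappa = 1e2 needs roughly 50; actual CG/CR runs are often faster (superlinear convergence).
print(iterations_for(1e4, 1e-4), iterations_for(1e2, 1e-4))
```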

V. TEST CALCULATIONS

In this section, we present numerical studies of the convergence of the presented algorithms. We consider the second-order Møller–Plesset (MP2) amplitude equations in the projected Löwdin orthogonalized atomic orbital (PLAO)21 basis as an example of linear equations, and the coupled-cluster singles doubles (CCSD) amplitude equations, also in the PLAO basis, as an example of nonlinear equations.22 For both the MP2 and CCSD amplitude equations, we used the diagonal preconditioner described by Weijo et al.21 The MP2 amplitude equations were solved for C6H2 using the cc-pVTZ basis,26 at the equilibrium geometry obtained at the MP2/cc-pVTZ level. The CCSD amplitude equations were solved for H2O using the cc-pVDZ basis,26 at the equilibrium geometry obtained at the MP2/cc-pVTZ level. At the equilibrium geometry, the CCSD amplitude equations have a weak nonlinearity. To investigate the performance for increasing nonlinearity, we also consider calculations for H2O where, for a fixed H-O-H bond angle, the O-H bond distance is stretched to 1.5, 2.0, and 2.5 times its equilibrium value Re, maintaining the C2v geometry.

FIG. 1. Convergence of the MP2 equations in the PLAO basis using the CROP algorithm with a three vector history and DIIS with a three and five vector history.

The convergence of the MP2 amplitude equations for C6H2 is displayed in Fig. 1 using the CROP algorithm with a three vector subspace, CROP(3). For comparison, the DIIS results are also reported using a subspace of three vectors, DIIS(3), and of five vectors, DIIS(5). The iteration sequence for the CROP(3) calculation is identical to the one obtained without truncation of the subspace. Without truncation of the subspace, DIIS also reproduces the CROP(3) results. However, contrary to the CROP algorithm, the DIIS method gives a significant degradation in performance for a truncated subspace, as seen in Fig. 1. With a three vector subspace, DIIS(3), convergence to a residual norm of 10^-5 thus takes 70 iterations, while the CROP(3) calculation uses 24 iterations to reach the same threshold. A five vector subspace, DIIS(5), uses 42 iterations. For sets of linear equations, the CROP algorithm thus preserves the history of all previous iterations with a three vector subspace, whereas a significant degradation is obtained in DIIS when the subspace is truncated.

FIG. 2. Convergence of the CCSD equations in the PLAO basis for the H2O molecule using the CROP and DIIS algorithms with various sizes of the subspace. Four geometries with increasing nonlinearity of the CCSD equations are considered. These geometries correspond to the bond distances of 1.0Re, 1.5Re, 2.0Re, and 2.5Re, where Re is the equilibrium distance at the MP2/cc-pVTZ level of theory.

The convergence of the CCSD amplitude equations at the equilibrium geometry and with the O–H bond length stretched to 1.5Re, 2.0Re, and 2.5Re is given in Fig. 2. The nonlinearity of the amplitude equations increases when the bond length is stretched, as reflected by the maximum values of the two-electron excitation amplitudes of 1.8×10^-2, 2.9×10^-2, 5.4×10^-1, and 2.2×10^-1 for the 1.0Re, 1.5Re, 2.0Re, and 2.5Re calculations, respectively. In Fig. 2, we give results for the CROP and DIIS algorithms with no truncation of the subspace and for truncated subspaces. For the bond distances 1.0Re, 1.5Re, and 2.0Re, a three vector subspace, CROP(3), leads to only a minor degradation in performance. For the DIIS method, the degradation is large with a three vector subspace, DIIS(3), and still significant with a five vector subspace, DIIS(5). For the O–H bond distance 2.5Re, the nonlinearity is large. Keeping all vectors in the CROP algorithm leads to a monotonic but slow convergence, whereas for a 20 vector subspace, CROP(20), convergence could not be obtained. For the DIIS approach with a full subspace, a rather erratic behavior is observed in the initial 50 iterations, after which the DIIS algorithm converges fast. With a 20 vector subspace, DIIS(20), the erratic behavior extends up to about iteration 70. To summarize, it appears that for a weak to moderate nonlinearity it is sufficient to keep a three vector subspace in the CROP algorithm. For a stronger nonlinearity, a larger subspace is required. In all cases, the CROP algorithm outperformed the DIIS approach.

VI. CONCLUSIONS

Minimal residual algorithms for the solution of linear and nonlinear equations have been examined. In particular, it has been shown that the CR algorithm has the same attractive feature as the CG algorithm, namely, that for linear equations only information from the last iteration is required to ensure that the solution is conjugate to the directions of all previous iterations. This feature is used to set up the conjugate residual with optimal trial vectors (CROP) algorithm, where the optimal trial vectors of each iteration rather than the directions are stored and where information from the last iteration is sufficient to reproduce the CR result. The CR and CG algorithms are shown to have the same convergence characteristics.

The CROP algorithm for linear equations gives the same result as the DIIS algorithm where no truncation is made in the subspace. However, for a truncated subspace, the DIIS algorithm shows a significant degradation of performance, as demonstrated by the test calculations for the MP2 amplitude equations. The CROP algorithm thus constitutes an optimal way of preserving the information from the previous iterations. The use of a subspace containing trial vectors allows a straightforward extension of the CROP algorithm to nonlinear equations. Test calculations for the nonlinear CCSD amplitude equations show that for a weak to moderate nonlinearity, only a minor degradation in performance is obtained with a three vector subspace. For a strong nonlinearity, a larger subspace may be required. Summarizing, the CROP algorithm constitutes an efficient algorithm for solving nonlinear equations, outperforming the DIIS algorithm through a more efficient way of preserving the information from the previous iterations.

ACKNOWLEDGMENTS

This work has been supported by the Lundbeck Foundation and the Danish Natural Science Research Council (Grant No. 21-04-0268).

1. P. Sałek, S. Høst, L. Thøgersen, P. Jørgensen, P. Manninen, J. Olsen, B. Jansík, S. Reine, F. Pawłowski, E. Tellgren, T. Helgaker, and S. Coriani, J. Chem. Phys. 126, 114110 (2007).
2. A. Daniels and G. Scuseria, Phys. Chem. Chem. Phys. 2, 2173 (2000).
3. T. Helgaker, P. Jørgensen, and J. Olsen, Molecular Electronic-Structure Theory (Wiley, Chichester, 2002).
4. M. R. Hestenes and E. Stiefel, J. Res. Natl. Bur. Stand. 49, 409 (1952).
5. W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes in C++: The Art of Scientific Computing, 2nd ed. (Cambridge University Press, Cambridge, 2002).
6. J. R. Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain (Carnegie Mellon University, Pittsburgh, PA, 1994).
7. Y. Saad, Iterative Methods for Sparse Linear Systems, 2nd ed. (SIAM, Philadelphia, 2003).
8. J. Pople, R. Krishnan, H. Schlegel, and J. S. Binkley, Int. J. Quantum Chem., Quantum Chem. Symp. 13, 225 (1979).
9. P. E. S. Wormer, F. Visser, and J. Paldus, J. Comput. Phys. 48, 23 (1982).
10. C. C. Paige and M. A. Saunders, SIAM J. Numer. Anal. 12, 617 (1975).
11. Y. Saad and M. Schultz, SIAM J. Sci. Stat. Comput. 7, 856 (1986).
12. E. Stiefel, Comment. Math. Helv. 29, 157 (1955).
13. P. Vinsome, Proceedings of the Fourth Symposium on Reservoir Simulation, Society of Petroleum Engineers of AIME, 1976 (unpublished), pp. 149–159.
14. S. C. Eisenstat, H. C. Elman, and M. H. Schultz, SIAM J. Numer. Anal. 20, 345 (1983).
15. P. Pulay, Chem. Phys. Lett. 73, 393 (1980).
16. P. Pulay, J. Comput. Chem. 3, 556 (1982).
17. P. Csaszar and P. Pulay, J. Mol. Struct. 114, 31 (1984).
18. O. Farkas and H. B. Schlegel, Phys. Chem. Chem. Phys. 11, 11 (2002).
19. G. E. Scuseria, T. J. Lee, and H. F. Schaefer, Chem. Phys. Lett. 130, 236 (1986).
20. J. Olsen, in Molecular Quantum Mechanics: Analytic Gradients and Beyond, edited by A. G. Csaszar, G. Fogarasi, H. F. Schaefer, and P. G. Szalay (ELTE Institute of Chemistry, Budapest, 2007), pp. 85–89.
21. V. Weijo, P. Manninen, P. Jørgensen, O. Christiansen, and J. Olsen, J. Chem. Phys. 127, 074106 (2007).
22. O. Christiansen, P. Manninen, P. Jørgensen, and J. Olsen, J. Chem. Phys. 124, 084103 (2006).
23. See EPAPS Document No. E-JCPSA6-128-054821 for induction proofs of the conjugate residual algorithm. For more information on EPAPS, see http://www.aip/pubservs/epaps.html.
24. G. Arfken, Mathematical Methods for Physicists, 3rd ed. (Academic, Orlando, Florida, 1985), pp. 731–748.
25. A. van der Sluis and H. A. van der Vorst, Numer. Math. 48, 543 (1986).
26. T. H. Dunning, Jr., J. Chem. Phys. 90, 1007 (1989).