
TH4C-2

Proceedings of Asia-Pacific Microwave Conference 2010

WCIP acceleration

Nathalie Raveu #1, Luc Giraud *2, Henri Baudrand #3

# Université de Toulouse, INPT, Laplace, ENSEEIHT, CNRS, 2 rue Charles Camichel, 31071 Toulouse cedex, France
1 [email protected], 3 [email protected]
* INRIA Bordeaux – Sud Ouest, joint INRIA CERFACS lab. on high performance computing, 42 avenue Coriolis, 31057 Toulouse cedex
2 [email protected]

Abstract: The Wave Concept Iterative Procedure (WCIP) is an integral method based on the solution of a two-equation system. The original formulation can be interpreted as a Richardson procedure for which the convergence condition is always satisfied, although the convergence rate is very slow. In this work several improvements are presented: an automated mesh, preconditioning and interpolation of the Richardson procedure, and similar work on a GMRES solution of the system. Performance is significantly improved, with a reduction of up to 96% in the number of iterations; accuracy is attested through comparison with HFSS results.

Index Terms: Integral method acceleration, preconditioning, interpolation, GMRES solution.

I. INTRODUCTION

For the last ten years, the Wave Concept Iterative Procedure (WCIP) has been a very promising modeling method for multilayered and multilevel circuit characterization [1]. Although it always converges, its main drawback remains its computation time. Several improvements were investigated to reduce it, such as avoiding higher order modes, inserting partially metal-insulator meshes, or modifying physical parameters [2]; however, performance was not significantly improved and the computation remains relatively slow. In this work we investigate an alternative matrix-free solver used in conjunction with acceleration techniques to improve computation time at a moderate extra memory cost. Computation time and number of iterations are compared between the different numerical techniques, and accuracy is ensured through comparison with HFSS, the Finite Element Method reference software for circuit design.

\[
\left\{
\begin{aligned}
&\left( \Delta + k_0^2 \varepsilon_r \right) u = 0 \\
&u_{\delta\Gamma_1} = 0 \quad \text{or} \quad \frac{\partial u_{\delta\Gamma_1}}{\partial n} = 0 \\
&u_{\delta\Gamma_2} = 0 \quad \text{or} \quad \frac{\partial u_{\delta\Gamma_2}}{\partial n} = 0 \\
&u_{\delta\Gamma} = 0 \quad \text{or} \quad \frac{\partial u_{\delta\Gamma}}{\partial n} = 0 \\
&u_{\delta S} = 0 \quad \text{and} \quad \frac{\partial u_{\delta S}}{\partial n} = 0
\end{aligned}
\right.
\]

(1)

where u stands for the electric and magnetic fields, εr is the relative permittivity of the medium, k0 is the free-space wave number, and n denotes the normal vector to the considered surface. The boundary conditions depend on the circuit to solve.

II. WCIP PRINCIPLES

The WCIP has been developed to solve the Helmholtz equation for circuits printed on a homogeneous substrate and confined in metallic or magnetic boxes, as represented in Fig. 1. The system under consideration is therefore described by (1).

Fig. 1. Circuit configuration

Operators are defined on incident waves A and reflected waves B (2) instead of the electromagnetic fields ET and HT, to ensure convergence through a spectral radius lower than 1. Both pairs satisfy system (1). Z0 is generally chosen equal

Copyright 2010 IEICE


III. IMPROVEMENTS

to the free space impedance and nδS is the normal vector to the interface δS.

A. Mesh refinement automation

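\[
\left\{
\begin{aligned}
A &= \frac{1}{2\sqrt{Z_0}} \left( E_T + Z_0\, H_T \wedge n_{\delta S} \right) \\
B &= \frac{1}{2\sqrt{Z_0}} \left( E_T - Z_0\, H_T \wedge n_{\delta S} \right)
\end{aligned}
\right.
\]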

Accurate results are directly linked to the mesh conditions. Several rules must be met:
- The minimum mesh size must be fixed around λ/100. Usually the mesh is built once, at the highest frequency of the band.
- Printed metallization should be described on a minimum of 5 meshes for an accurate current description.
For a large bandwidth, a single mesh built this way is over-constrained at the lower frequencies, which costs computation time without improving accuracy. As the calculation is performed at each frequency independently, the mesh is now adapted at each frequency to these two rules, with respect to the circuit size.
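The two mesh rules can be turned into a per-frequency cell size. The sketch below is only illustrative: it assumes that λ is the free-space wavelength and that the cell size is the smaller of the two limits; the function names and the example geometry (30 mm box side, 10 mm strip) are hypothetical and do not come from the paper.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum (m/s)

def cell_size(freq_hz, strip_width_m):
    """Largest cell size meeting both rules (assumption: free-space wavelength)."""
    wavelength = C0 / freq_hz
    by_wavelength = wavelength / 100.0  # rule 1: cell size around lambda/100
    by_strip = strip_width_m / 5.0      # rule 2: at least 5 cells across the metallization
    return min(by_wavelength, by_strip)

def cells_per_side(box_side_m, freq_hz, strip_width_m):
    """Number of cells along one box side, rounded up to a whole cell."""
    return math.ceil(box_side_m / cell_size(freq_hz, strip_width_m))

# Hypothetical geometry: 30 mm box side, 10 mm wide strip.
for f_ghz in (4, 11):
    n = cells_per_side(0.03, f_ghz * 1e9, 0.01)
    print(f"{f_ghz} GHz: {n} cells per side")
```

Under these assumptions the mesh is coarser at the lower frequencies, which is where a per-frequency mesh saves computation time compared with a single mesh built at the top of the band.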

(2)

The model is therefore built at the interface δS and separated into two kinds of operators: one accounts for the homogeneous media around the interface, the other for the boundary conditions at the interface.

A. Representation of the homogeneous media around the interface

B. Initial Guess choice

As the circuit is enclosed in a defined box, the electromagnetic fields can be described above and below the interface on a modal basis whose expression depends on the boundary conditions on the boundary δΓ (electric, magnetic or periodic). The surrounding homogeneous media are described by modal reflection coefficients through (3), which takes into account the boundaries δΓ1 and δΓ2 according to [1]. On those interfaces, electric, magnetic or open boundaries can be considered. Finally, around the interface δS, system (1) is

The initial guess of the process was previously chosen equal to the source term, which is the same at each frequency. Several other initial guesses have been tested (zeros, random values, and stationary waves on the printed metallization) without any reduction in the number of iterations. As results are calculated at regularly spaced frequencies, the solution at the previous frequency is now used as the initial guess whenever the mesh remains identical. This choice allowed a considerable reduction in computation time.
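The effect of reusing the previous solution can be illustrated on a toy frequency sweep. This is a minimal sketch, not the WCIP operator: the diagonal operator `toy_operator`, its frequency dependence, the vector size and the frequency grid are all assumptions made for the example. Only the stopping criterion (relative residual below 0.01) and the two initial guesses (source term versus previous solution) follow the text.

```python
import math

def richardson(diag_m, b, x0, tol=1e-2, max_iter=20000):
    """Iterate x <- M x + b for a diagonal toy operator M; return (x, iterations)."""
    x = list(x0)
    norm_b = math.sqrt(sum(v * v for v in b))
    for k in range(max_iter):
        # Residual of (I - M) x = b
        r = [d * xi + bi - xi for d, xi, bi in zip(diag_m, x, b)]
        if math.sqrt(sum(v * v for v in r)) <= tol * norm_b:
            return x, k
        x = [xi + ri for xi, ri in zip(x, r)]
    return x, max_iter

def toy_operator(freq):
    """Hypothetical frequency-dependent operator with spectral radius below one."""
    return [(0.5 + 0.05 * i) * (1.0 - 0.02 * freq) for i in range(10)]

def sweep(freqs, warm_start):
    """Total iteration count over the sweep for a cold or warm initial guess."""
    b = [1.0] * 10  # source term, identical at every frequency
    x_prev, total = None, 0
    for f in freqs:
        x0 = x_prev if (warm_start and x_prev is not None) else b
        x_prev, iterations = richardson(toy_operator(f), b, x0)
        total += iterations
    return total

freqs = [0.1 * k for k in range(11)]
cold_total = sweep(freqs, warm_start=False)
warm_total = sweep(freqs, warm_start=True)
print("source term as guess:", cold_total, "iterations")
print("previous solution as guess:", warm_total, "iterations")
```

The gain in this toy case comes from the solution changing little between neighbouring frequencies, so the previous solution starts much closer to the new one than the source term does.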

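partially described on two spectral operators Γ̂1 and Γ̂2, respectively for media 1 and 2:

\[
\hat{\Gamma}_i = \sum_{n,m} \; \sum_{\alpha = TE,\,TM} \left| f_{nm}^{\alpha} \right\rangle \Gamma_{nm}^{i,\alpha} \left\langle f_{nm}^{\alpha} \right|
\]

C. Interpolation between spectral and spatial domain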

In all previous works, the number of meshes and the number of modes are generally equal; however, the modal coefficients are then approximated directly through a Fourier Transform, whose accuracy for the highest modes is questionable. To better approximate these coefficients, an interpolation is performed at each iteration so that the waves are described on twice as many modes as meshes. First, the incident waves are linearly interpolated in the spatial domain before the Fourier Transform is applied. The modal reflected waves are then evaluated through an operator defined on twice the number of modes, the inverse Fourier Transform is performed, and the wave amplitudes are kept on the original mesh. One interpolation is added per iteration; its cost in computation time is offset by a very significant decrease in the number of iterations.
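The interpolated spectral step can be sketched in one dimension. The code below is an illustration under strong assumptions: a periodic 1-D mesh, a plain discrete Fourier transform standing in for the modal transform, and a diagonal reflection operator `gamma_fine` given on twice as many modes as meshes. All function names are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), enough for a small example."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
           for j in range(n)]
    return [v / n for v in out] if inverse else out

def interpolate_twice(a):
    """Linear interpolation onto twice as many points (periodic wrap is an assumption)."""
    n = len(a)
    fine = []
    for i in range(n):
        fine.append(a[i])                            # original mesh point
        fine.append(0.5 * (a[i] + a[(i + 1) % n]))   # midpoint between neighbours
    return fine

def spectral_step(a, gamma_fine):
    """Incident waves a -> reflected waves on the original mesh."""
    fine = interpolate_twice(a)                      # 1. interpolate in the spatial domain
    modes = dft(fine)                                # 2. transform to twice as many modes
    reflected = [g * m for g, m in zip(gamma_fine, modes)]  # 3. modal reflection
    b_fine = dft(reflected, inverse=True)            # 4. back to the spatial domain
    return b_fine[::2]                               # 5. keep the original mesh points

a = [0.0, 1.0, 0.0, 0.0]
identity_gamma = [1.0] * (2 * len(a))
b = spectral_step(a, identity_gamma)
print([round(abs(v), 6) for v in b])
```

With an identity reflection operator the step returns the incident waves unchanged, which is a convenient sanity check that the interpolation and the restriction to the original mesh are consistent.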

(3)

where f_nm^α is the modal basis function of order (n, m) and Γ_nm^α is the modal reflection coefficient of order (n, m).

B. Representation of the boundary conditions at the interface

The interface δS is regularly meshed so that the modal coefficients are easily obtained from the spatial ones. A specific property (metallic, dielectric or source) is associated with each mesh, according to the circuit representation, through a spatial operator Ŝ. The solution of system (1) can therefore be expressed by system (4) on incident and reflected waves.

\[
\left\{
\begin{aligned}
A &= \hat{S} B + A_0 \\
B &= \hat{\Gamma} A
\end{aligned}
\right.
\]

D. GMRES use instead of Richardson process

Although smaller than one [1], the spectral radius associated with the Richardson iteration scheme is often close to one. Consequently, the convergence is very slow. To reduce the computation time while still complying with the matrix-free solution constraint, we consider a scheme based on GMRES [3].
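The difference between the two solvers can be seen on a small matrix-free example. The sketch below is not the authors' Matlab implementation: `apply_m` is a toy neighbour-averaging operator chosen only because its spectral radius is just below one, and `gmres` is a minimal unrestarted GMRES (Arnoldi with Givens rotations) written for this illustration. Both solvers only call the operator, never a stored matrix, and both stop on a relative residual of 0.01 as in the experiments reported later.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def apply_m(x, rho=0.999):
    """Toy stand-in for the iteration operator: scaled neighbour average."""
    n = len(x)
    return [rho * 0.5 * ((x[i - 1] if i > 0 else 0.0) + (x[i + 1] if i < n - 1 else 0.0))
            for i in range(n)]

def apply_a(x):
    """Matrix-free product with (I - M)."""
    return [xi - mi for xi, mi in zip(x, apply_m(x))]

def richardson(b, tol=1e-2, max_iter=20000):
    """Fixed-point iteration x <- M x + b, started from the source term."""
    x = list(b)
    for k in range(max_iter):
        r = [bi - ai for bi, ai in zip(b, apply_a(x))]
        if norm(r) <= tol * norm(b):
            return x, k
        x = [xi + ri for xi, ri in zip(x, r)]
    return x, max_iter

def gmres(b, tol=1e-2, max_iter=200):
    """Unrestarted GMRES with zero initial guess; returns (x, iterations)."""
    n, beta = len(b), norm(b)
    basis = [[v / beta for v in b]]   # orthonormal Krylov vectors
    cols, cs, sn, g = [], [], [], [beta]
    for j in range(max_iter):
        w = apply_a(basis[j])
        h = []
        for v in basis:               # modified Gram-Schmidt
            hij = sum(wi * vi for wi, vi in zip(w, v))
            h.append(hij)
            w = [wi - hij * vi for wi, vi in zip(w, v)]
        h_next = norm(w)
        for i in range(j):            # apply earlier rotations to the new column
            h[i], h[i + 1] = (cs[i] * h[i] + sn[i] * h[i + 1],
                              -sn[i] * h[i] + cs[i] * h[i + 1])
        d = math.hypot(h[j], h_next)  # new rotation zeroes the subdiagonal entry
        cs.append(h[j] / d)
        sn.append(h_next / d)
        h[j] = d
        cols.append(h)
        g.append(-sn[j] * g[j])
        g[j] = cs[j] * g[j]
        if abs(g[j + 1]) <= tol * beta or h_next == 0.0:
            break
        basis.append([wi / h_next for wi in w])
    m = len(cols)
    y = [0.0] * m
    for i in reversed(range(m)):      # back substitution on the triangular system
        y[i] = (g[i] - sum(cols[k][i] * y[k] for k in range(i + 1, m))) / cols[i][i]
    x = [sum(y[k] * basis[k][i] for k in range(m)) for i in range(n)]
    return x, m

source = [1.0] + [0.0] * 29
x_r, it_r = richardson(source)
x_g, it_g = gmres(source)
print("Richardson iterations:", it_r)
print("GMRES iterations:", it_g)
```

The price of GMRES is the storage of one basis vector per iteration, which is the moderate extra memory cost mentioned in the introduction.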

(4)

A0 represents the source term. A Richardson process is generally used to solve system (4), and hence system (1); however, the convergence is very slow. Several modifications have been made to reduce the computation time by reducing the number of iterations.
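For reference, the Richardson process can be written out from system (4). The lines below are a standard reading of such a fixed-point scheme rather than a derivation quoted from the paper; the iteration index k, the identity operator I and the spectral radius symbol ρ are notations introduced here.

```latex
\begin{gather*}
\bigl(I - \hat{S}\hat{\Gamma}\bigr) A = A_0
  \qquad \text{(substituting } B = \hat{\Gamma} A \text{ into } A = \hat{S} B + A_0\text{)} \\
A^{(k+1)} = \hat{S}\hat{\Gamma}\, A^{(k)} + A_0
  \qquad \text{(Richardson iteration)} \\
A^{(k)} - A = \bigl(\hat{S}\hat{\Gamma}\bigr)^{k} \bigl(A^{(0)} - A\bigr)
  \qquad \text{(error after } k \text{ iterations)}
\end{gather*}
```

The error therefore vanishes whenever ρ(ŜΓ̂) < 1, but asymptotically it shrinks only by a factor of about ρ per iteration, which is why a spectral radius close to one gives slow convergence.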

IV. RESULTS AND REMARKS

The previously described numerical techniques are applied to the Tee junction of the suspended microstrip line described in Fig. 2. The reflection (S11) and transmission (S21) coefficients are directly linked to the line length and to the accuracy of the Tee junction description. This case is very sensitive and therefore attests to the accuracy of the numerical technique. The substrate height is 0.5 mm.

Fig. 2. Tee junction of the suspended microstrip line.

meshes are necessary with the WCIP at 4 GHz and 12000 meshes at 11 GHz. If a discrete simulation is performed with HFSS, which is the closest equivalent to the WCIP simulation, the computation time is around 6 minutes 34 s; however, the mesh is built only once, which reduces computation time.

For a stopping criterion on the relative residual lower than 0.01, the Richardson scheme did not converge within 20000 iterations for each frequency. Extrapolation without and with preconditioning is applied to the Richardson solution. Convergence is then reached at each frequency point; the total number of iterations (over 40 frequency points) is 101581 and 37489, respectively. Using preconditioning with Richardson reduces the number of iterations by 63%. As reported in Table I, the computation time is approximately 13 hours 52 minutes without preconditioning and 5 hours 7 minutes with preconditioning.

We also report numerical experiments on GMRES using extrapolation without and with preconditioning. Convergence is reached at each frequency point; the total number of iterations (over 40 frequency points) is 3316 and 2036, respectively. In summary, preconditioning by the initial guess enables a 38% saving in iteration count for GMRES (63% for Richardson). The improvement of GMRES with respect to Richardson is around 96% without preconditioning and 94.5% with the initial guess preconditioning. This saving in iteration count also translates into a significant saving in computation time (the experiments were run using Matlab).

TABLE I
ITERATIONS AND ELAPSED TIME

              Richardson                    GMRES
No Precond.   101581 iterations, 13h 52mn   3316 iterations, 1h 20mn
Precond.      37489 iterations, 5h 7mn      2036 iterations, 50mn
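The percentages quoted with Table I can be rechecked from the iteration totals alone. The short script below only redoes that arithmetic; the dictionary layout and the function name `saving_percent` are choices made for this example.

```python
# Iteration totals over the 40 frequency points, copied from Table I.
ITERATIONS = {
    ("Richardson", "no precond"): 101581,
    ("Richardson", "precond"): 37489,
    ("GMRES", "no precond"): 3316,
    ("GMRES", "precond"): 2036,
}

def saving_percent(before, after):
    """Relative reduction of the iteration count, in percent."""
    return 100.0 * (1.0 - after / before)

# Gain from preconditioning (initial guess), for each solver.
precond_gain = {
    solver: saving_percent(ITERATIONS[(solver, "no precond")],
                           ITERATIONS[(solver, "precond")])
    for solver in ("Richardson", "GMRES")
}

# Gain from replacing Richardson by GMRES, for each preconditioning setting.
gmres_gain = {
    setting: saving_percent(ITERATIONS[("Richardson", setting)],
                            ITERATIONS[("GMRES", setting)])
    for setting in ("no precond", "precond")
}

for solver, gain in precond_gain.items():
    print(f"preconditioning, {solver}: {gain:.1f}% fewer iterations")
for setting, gain in gmres_gain.items():
    print(f"GMRES vs Richardson, {setting}: {gain:.1f}% fewer iterations")
```

The recomputed values sit close to the rounded figures given in the text: the saving of about 63% belongs to Richardson and the saving of about 38% to GMRES.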

Fig. 3. Comparison of the reflection (S11) and transmission (S21) coefficients with the different numerical techniques

Results obtained with the Richardson solution are less accurate than those obtained with the GMRES solution. This difference is probably due to the smooth convergence of the GMRES solution compared with the Richardson one. Examples of convergence curves are presented in Fig. 4 at the frequency of 4 GHz. Convergence is reached in 83 iterations for preconditioned extrapolated GMRES, while 1744 iterations are necessary for preconditioned extrapolated Richardson.

[Fig. 4 a): S parameters |S11| (dB) and |S21| (dB) versus iterations number]

From a solution quality viewpoint, the WCIP results, presented in Fig. 3, are very accurate when compared with HFSS results (the Finite Element Method reference in circuit design). The number of meshes in HFSS is 6793, while only 3000


FDTLM can also be preconditioned by the WCIP solution of the homogeneous equivalent substrate to improve its computation time. Investigations are in progress.

VI. CONCLUSION

Several improvements have been made to the integral method called WCIP. Its performance gets closer to that of HFSS, the equivalent commercial Finite Element Method software; the number of meshes remains much lower while the accuracy of the results is similar. Computation time is of the same order.

Fig. 4. Comparison of the convergence curves (S parameters |S11| and |S21| in dB versus iterations number) obtained with a) GMRES and b) Richardson solution.

REFERENCES

[1] H. Baudrand and S. R. N. Gongo, "Application of the Wave Iterative Procedure in planar circuits", Transworld Research Network, Special issue on recent research developments in Microwave Theory and Techniques, 1: 187-197, 1999.
[2] N. Raveu and H. Baudrand, "Improvement of the WCIP convergence", IEEE AP-S International symposium on Antenna and Propagation, 1-5 June 2009, Charlestown, USA.
[3] Y. Saad and M. Schultz, "A generalized minimal residual algorithm for solving non symmetric linear systems", SIAM J. Sci. Stat. Comput. 7:856-869, 1986.
[4] A. Zugari, M. Khalladi, M.I. Yaich, N. Raveu, H. Baudrand, "New approach: WCIP and FDTLM hybridization", Microwave Symposium (MMS), p 1-4, 2009.

V. PERSPECTIVES

The WCIP is very similar to the FDTLM formulation. The main difference lies in the surface formulation of the WCIP, whereas the FDTLM treats volume cells; as a consequence, the FDTLM can characterize inhomogeneous media while the WCIP cannot. Therefore a hybrid FDTLM-WCIP formulation [4] has been developed to extend the WCIP to inhomogeneous media while improving the computation time of the FDTLM. The