Available online at www.sciencedirect.com

ScienceDirect Energy Procedia 75 (2015) 2080 – 2086

The 7th International Conference on Applied Energy – ICAE2015

A parallel FEM scheme for the simulation of large scale thermochemical energy storage with complex geometries using PETSc routines

Wenqing Wang a, Olaf Kolditz a,b, Thomas Nagel a,*

a Department of Environmental Informatics, Helmholtz Centre for Environmental Research – UFZ, Permoserstr. 15, 04318 Leipzig, Germany
b Applied Environmental Systems Analysis, Technische Universität Dresden, Germany

Abstract

Thermal energy storage technologies are of current interest for improving the integration of renewable energy sources as well as energy efficiency. Numerical simulations of thermochemical heat storage are especially challenging and time consuming due to the complexity of the mathematical description of the strongly coupled and highly nonlinear processes characteristic of such systems. These difficulties are exacerbated once practically relevant complex or large geometries are considered, as they can occur around heat exchangers or due to internal heterogeneities of the reactive bed. To allow a computationally efficient simulation of such applications, an existing finite element implementation of a thermochemical heat storage model was parallelised using PETSc routines. Input/output, global assembly and the linear solver all work in a distributed fashion. The approach is implemented in the open source framework OpenGeoSys. The performance of the present parallelisation approach is tested by simulating the discharge of a heat store based on calcium oxide and water as a benchmark problem. The computational time required for the simulation could be reduced significantly.

© 2015 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of Applied Energy Innovation Institute.

Keywords: Parallel finite element method; PETSc; thermal energy storage; reactive transport; porous media; OpenGeoSys

* Corresponding author. E-mail address: [email protected].
doi:10.1016/j.egypro.2015.07.317

1. Introduction

Nomenclature

c_pα    specific isobaric heat capacity of phase α
D       diffusivity tensor of the reactive component in the gas phase
Δh      specific enthalpy of reaction
FEM     finite element method
G, S    indices referring to the gas or solid phase
φ_α     volume fraction of phase α
λ_eff   effective heat conductivity tensor of the mixture
n       number of nodes
p       gas pressure
ρ_α     apparent density of phase α
ρ_αR    real/effective density of phase α
ρ̂_S     density production of the solid phase
T       absolute temperature
v_G     gas velocity
x_mR    reactive gas mass fraction

The current ambitions towards a transformation of the energy supply system to renewable resources have increased the need for technologies to decouple energy supply and demand as well as for means to increase energy efficiency. Building climatisation, hot water supply and industrial process heat generation consume a significant share of the primary energy supply and offer considerable potential for efficiency improvements. Intense research effort is invested into numerous kinds of thermal energy storage systems in order to improve our technological capabilities in this regard [1]. Among the technologies considered for heat storage are various chemical reactions as well as adsorption and absorption processes [2]. The simulation of such systems requires specialised software capable of capturing the strongly coupled nonlinear processes characteristic of their operation.

Computational models can be used to design and simulate heat storage devices based on a physical-chemical characterisation of the storage material in the laboratory. The model domains become large when practical applications are considered instead of laboratory examples. Heat exchangers and structures that manage flow patterns can have complex geometries. The reactive bed itself can develop inner structures and heterogeneities due to, e.g., adverse flow conditions. A detailed investigation of large scale thermochemical energy storage systems and their interaction with geometrically complex system components like heat exchangers and flow diverters requires simulations based on sufficiently refined grids. Most current simulations of thermochemical heat storage focus on the geometrically simple reaction bed and use either 1D [3, 4] or 2D [5, 6] mesh topologies. Only very few 3D simulations have been published; see [7] for hydrogen storage in metal hydrides and [8] for an example of thermochemical heat storage. One obstacle to more complex analyses certainly originates in computational limitations. The complexity of the mathematical description of the strongly coupled and highly nonlinear processes characteristic of thermochemical heat storage renders a computational solution numerically challenging and, above all, time-consuming. These difficulties are exacerbated when complex or large geometries are considered.

The objective of the present contribution is to provide a tool for the simulation of large thermochemical heat storage units with high computational efficiency. For this purpose, we present the first parallelisation of the finite element analysis of a reactive mass and heat transport model for thermochemical heat storage, which has been implemented in the open source FEM software OpenGeoSys (OGS) [9].


Similar to previous work on the parallelisation of FEM models of two-phase flow in porous media [10], the present scheme takes the domain decomposition approach and employs PETSc routines for global assembly and the solution of the linear equation system.

2. Methods

2.1. Governing equations

The governing equations describe non-isothermal two-component gas transport through a fixed bed of solid particles coupled to a chemical reaction. They consist of a gas mass balance, a mass balance for the reactive component, and an energy balance for the mixture:

\[ \frac{\partial \rho_\mathrm{G}}{\partial t} + \operatorname{div}\left(\rho_\mathrm{G}\,\mathbf{v}_\mathrm{G}\right) = -\hat{\rho}_\mathrm{S} \tag{1} \]

\[ \rho_\mathrm{G}\,\frac{\partial x_\mathrm{mR}}{\partial t} - \operatorname{div}\left(\rho_\mathrm{GR}\,\mathbf{D}\,\operatorname{grad} x_\mathrm{mR}\right) + \rho_\mathrm{G}\,\operatorname{grad} x_\mathrm{mR} \cdot \mathbf{v}_\mathrm{G} = -\left(1 - x_\mathrm{mR}\right)\hat{\rho}_\mathrm{S} \tag{2} \]

\[ \left[\rho_\mathrm{G}\,c_\mathrm{pG} + \rho_\mathrm{S}\,c_\mathrm{pS}\right]\frac{\partial T}{\partial t} + \rho_\mathrm{G}\,c_\mathrm{pG}\,\operatorname{grad} T \cdot \mathbf{v}_\mathrm{G} - \frac{\partial\left(\phi_\mathrm{G}\,p\right)}{\partial t} - \operatorname{div}\left[\boldsymbol{\lambda}_\mathrm{eff}\,\operatorname{grad} T\right] = \hat{\rho}_\mathrm{S}\,\Delta h \tag{3} \]

This is a simplified version of the model presented in reference [4], where local thermal non-equilibrium has been taken into account. The deviations from thermal equilibrium in the systems we currently consider are small enough to justify the use of only one energy balance in order to limit computing times. The gas pressure p, the temperature T, and the mass fraction of the reactive gas component x_mR are chosen as primary variables. Test functions that vanish on the Dirichlet boundary of their corresponding trial functions are used to construct the weak forms of Eqs. (1) to (3). The system is then discretised using the Galerkin mixed finite element approach, resulting in the monolithically solved equation system

\[
\begin{pmatrix}
\mathbf{M}_{pp} & \mathbf{M}_{pT} & \mathbf{M}_{px} \\
\mathbf{M}_{Tp} & \mathbf{M}_{TT} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{M}_{xx}
\end{pmatrix}
\begin{pmatrix} \dot{\hat{\mathbf{p}}} \\ \dot{\hat{\mathbf{T}}} \\ \dot{\hat{\mathbf{x}}}_\mathrm{mR} \end{pmatrix}
+
\begin{pmatrix}
\mathbf{K}_{pp} & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{K}_{TT} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{K}_{xx}
\end{pmatrix}
\begin{pmatrix} \hat{\mathbf{p}} \\ \hat{\mathbf{T}} \\ \hat{\mathbf{x}}_\mathrm{mR} \end{pmatrix}
=
\begin{pmatrix} \mathbf{f}_p \\ \mathbf{f}_T \\ \mathbf{f}_x \end{pmatrix}
\tag{4}
\]

Here, the coefficient matrices result from the discretised weak forms of Eqs. (1)-(3), and quantities with a hat denote nodal degrees of freedom. An implicit single-step time integration procedure is used, and the nonlinearities are resolved with Picard iterations. The reaction rate used to determine the density production term ρ̂_S is described by an ODE that is solved at the integration point level.

2.2. Parallel scheme

Nearly the entire solution process has been parallelised in OGS [10]. In the present approach, the domain decomposition method is applied by partitioning the mesh into subdomains in a node-wise manner. As a consequence, subdomains overlap in some elements. Special attention is paid to the numerical integration over these ghost elements so that the matrices and vectors of the subdomains remain without intersections, in order to reduce the communication load during the solution process. The mesh is a major contributor to memory usage in the computation of large problems.


Therefore, the partitioning of the mesh is performed before the start of the FE simulation so as to distribute the memory occupied by the mesh among the involved compute nodes.

The most time consuming parts of a finite element computation are the global assembly and the linear solver. In the current implementation, the local assembly of element matrices and vectors within each subdomain is performed by each compute core concurrently. Assembly into the global system based on PETSc data structures proceeds by nodal degrees of freedom, ordered as

\[ \left(\hat{p}_1, \hat{T}_1, \hat{x}_1, \hat{p}_2, \hat{T}_2, \hat{x}_2, \ldots, \hat{p}_n, \hat{T}_n, \hat{x}_n\right) \tag{5} \]
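To illustrate this interleaved ordering and the accumulation of local element contributions, a minimal C++ sketch against the PETSc C API is given below. It is not taken from the OGS sources; the helper name add_local_contribution, the three-unknowns-per-node index map of Eq. (5) and the dense row-major local matrix are assumptions made purely for illustration.

    #include <petscmat.h>
    #include <vector>

    // Hypothetical helper: scatter one element's local matrix into the global
    // distributed PETSc matrix. Global index = 3*node + component, i.e. the
    // unknowns (p, T, x_mR) of each node are stored contiguously as in Eq. (5).
    void add_local_contribution(Mat A,
                                const std::vector<PetscInt>& element_nodes,
                                const std::vector<PetscScalar>& local_matrix)
    {
        std::vector<PetscInt> dofs;
        for (PetscInt node : element_nodes)
            for (PetscInt c = 0; c < 3; ++c)   // p, T, x_mR
                dofs.push_back(3 * node + c);

        const PetscInt n = static_cast<PetscInt>(dofs.size());
        // ADD_VALUES accumulates contributions; in the scheme described above,
        // ghost elements are integrated such that each global entry is provided
        // by exactly one subdomain, which keeps communication low.
        MatSetValues(A, n, dofs.data(), n, dofs.data(),
                     local_matrix.data(), ADD_VALUES);
    }

    // After all subdomains have added their local contributions:
    //   MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    //   MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

Keeping the three unknowns of one node adjacent in the global numbering is what produces the near-diagonal sparsity pattern discussed next.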

This leads to a sparsity pattern of the stiffness matrix in which the non-zero entries are distributed close to the diagonal. Once the global assembly is finished, the global matrix and vector are passed to PETSc Krylov subspace routines to solve the system of linear equations. After the solution has been obtained, the only MPI-related operation that remains is the calculation of error norms for the convergence control of the nonlinear iterations and the time stepping; the communication effort for this operation is negligible. The obtained solution is written to either binary or ASCII files using parallel output functions. The described parallel process is illustrated in Fig. 1.
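A condensed sketch of this solve-and-check step with the PETSc C API (assuming PETSc 3.5 or newer) is shown below. It is not the OGS code: the GMRES default set here is only a placeholder that KSPSetFromOptions allows to be overridden at run time, and the simple linear residual norm merely stands in for whatever error measure the convergence control actually evaluates.

    #include <petscksp.h>

    // Solve A x = b with a Krylov method in parallel and return a residual
    // norm of the kind that can feed the nonlinear/time-step convergence
    // control. Sketch only; PETSc error checking omitted for brevity.
    PetscReal solve_and_check(Mat A, Vec b, Vec x)
    {
        KSP ksp;
        KSPCreate(PETSC_COMM_WORLD, &ksp);
        KSPSetOperators(ksp, A, A);      // same matrix builds the preconditioner
        KSPSetType(ksp, KSPGMRES);       // placeholder; overridable via -ksp_type
        KSPSetFromOptions(ksp);          // pick up -ksp_type/-pc_type at run time
        KSPSolve(ksp, b, x);

        PetscInt its;
        KSPGetIterationNumber(ksp, &its);
        PetscPrintf(PETSC_COMM_WORLD, "linear iterations: %d\n", (int)its);

        // The global reduction hidden in VecNorm is the only MPI communication
        // required after the solve.
        Vec r;
        VecDuplicate(b, &r);
        MatMult(A, x, r);                // r = A x
        VecAYPX(r, -1.0, b);             // r = b - A x
        PetscReal norm;
        VecNorm(r, NORM_2, &norm);

        VecDestroy(&r);
        KSPDestroy(&ksp);
        return norm;
    }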

Fig. 1: Process diagram of the parallel FEM scheme using PETSc.


Fig. 2: (a) Speedup of the short test. (b) Reaction rate profile over the mesh split into 12 domains after 3000 s.

2.3. Example simulations

A calcium hydroxide reaction system similar to the one considered in references [4, 6] was used for testing the parallel version of the code. A cylindrical lab scale reactor of 5.5 cm in diameter and 16 cm in height was represented with an axisymmetric mesh. The geometrical domain was initially discretised into 11,646 quadrilateral elements with 11,925 nodes. Material properties were as described in reference [6]. Boundary conditions were as follows: the outlet pressure was fixed at 1 bar, while a gas mass flux of 0.212 g/s was supplied at the inlet with a vapour mass fraction of 0.36 and a temperature of 620 K. The wall temperature was set to 620 K as well. Simulations were run with a time step size of 0.5 s. Computations were carried out on the Linux cluster EVE at the UFZ, which has 85 nodes with 12 Intel® Xeon® 2.67 GHz CPU cores per node, 5 TB RAM and a 40 Gbit/s InfiniBand network interconnect.

First, we conducted a short test of two time steps to investigate the computational efficiency of the parallel I/O, assembly, and linear solver independently of the time stepping behaviour. Since the stiffness matrix is not symmetric and large differences exist in the magnitude of the matrix and vector entries, a strong preconditioner was required to accelerate the convergence of the linear solver. In this short test, the GMRES solver was used in conjunction with the hypre BoomerAMG preconditioner. The relative tolerance was set to 10^-8, and the maximum number of nonlinear iterations to 20. The test was conducted with 2, 4, 12, and 24 compute cores, respectively. The average number of iterations for convergence of the linear solver was around 800. For different partitionings, the residual of the linear equation system differed, leading to different numbers of Picard iterations as well as rejected time steps. Wall clock times for the short test are listed in Table 1, which shows that the assembly exhibits excellent speedup, while the speedup of the entire computation remained lower due to the difficulties of the linear solver. The elapsed times for assembly and solver listed in Table 1 and hereafter are the accumulated times over the entire run. The speedup of the short test is shown in Fig. 2(a). Additionally, Fig. 2(b) shows the reaction rate profile within the reactor after 3000 s, based on a simulation using 12 cores that was run until 4000 s of simulated time.
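Expressed as standard PETSc run-time options, the solver configuration of this short test corresponds roughly to the line below (a sketch; the options actually passed in the runs are not reported in the paper, and the limit of 20 nonlinear iterations is handled by the FEM code rather than by PETSc):

    -ksp_type gmres -ksp_rtol 1.0e-8 -pc_type hypre -pc_hypre_type boomeramg

Here, boomeramg selects the BoomerAMG algebraic multigrid preconditioner from the hypre library, which requires PETSc to be built with hypre support.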


Table 1: Wall clock time during the short test (in seconds).

Compute cores   Equation assembly   Linear solver   Total computation
2               202.6               101.6           304.6
4               42.0                42.5            85.0
12              4.9                 16              21.4
24              1.4                 14.6            16.1

In order to check the performance of the computation with a higher number of compute nodes, we refined the mesh twice and used more compute cores for a second test. After refinement, the mesh had 186,367 quadrilateral elements with 187,473 nodes. The test simulated 100 s of the addressed problem in 559 time steps. For this test, the BiCGStab solver with an additive Schwarz (ASM) preconditioner and an LU sub-preconditioner was adopted (see the options sketch after Table 2). The assembly again showed a good speedup, while the linear solver lost speedup when the number of compute cores was increased above 32 (Table 2).

Table 2: Wall clock time during the second test with the refined mesh (562,419 degrees of freedom; times in seconds).

Compute cores   Equation assembly   Linear solver   Total computation
12              9159.78             4036.91         13629.46
32              1393.91             3143.29         4726.25
64              694.57              3815.71         4672.80
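As with the short test, this solver combination maps onto standard PETSc run-time options, roughly as follows (again a sketch, not the exact command line used):

    -ksp_type bcgs -pc_type asm -sub_pc_type lu

The option -sub_pc_type lu requests a direct LU factorisation on each additive Schwarz subdomain block.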

3. Concluding remarks

A parallel FE approach for the numerical modelling of large scale thermochemical heat storage using OGS and PETSc was presented. Initial verification and performance testing was done using a two-dimensional mesh for an axisymmetric simulation. The performance tests showed a decent speedup, which indicates that the present approach can be used for simulations of practical problems with high computational efficiency. However, the linear solver appeared to be the bottleneck for speedup with an increasing number of compute cores. A possible reason is the strong preconditioner required to accelerate linear solver convergence, which increases the inter-node communication load. Furthermore, the number of nonlinear iterations and rejected time steps varied between computations based on different partitionings, which is a known phenomenon [11].

Currently, the performance in three-dimensional simulations involving nontrivial geometries (cf. [8]) is being investigated. Additionally, the model linearisation is being changed from a Picard to a Newton-Raphson scheme in order to accelerate the convergence of the nonlinear iterations. This will be coupled with an advanced time stepping scheme to further decrease computation time [12]. Finally, alternative ways of performing the global assembly are being investigated with the aim of additional efficiency improvements. These measures will improve the performance of both the serial and the parallel implementation.


Acknowledgements

Funding was provided by the Helmholtz Initiating and Networking Fund through the NUMTHECHSTORE project.

References

[1] K. Kaygusuz. The viability of thermal energy storage. Energy Sources, 21(8):745–755, 1999.
[2] N. Yu, R.Z. Wang, and L.W. Wang. Sorption thermal storage for solar energy. Progress in Energy and Combustion Science, 39(5):489–514, 2013.
[3] F. Schaube, A. Wörner, and R. Tamme. High temperature thermochemical heat storage for concentrated solar power using gas–solid reactions. Journal of Solar Energy Engineering, 133(3):031006, 2011.
[4] H. Shao, T. Nagel, C. Roßkopf, M. Linder, A. Wörner, and O. Kolditz. Non-equilibrium thermo-chemical heat storage in porous media: Part 2 – A 1D computational model for a calcium hydroxide reaction system. Energy, 60:271–282, 2013.
[5] F. Schaube, I. Utz, A. Wörner, and H. Müller-Steinhagen. De- and rehydration of Ca(OH)2 in a reactor with direct heat transfer for thermo-chemical heat storage. Part B: Validation of model. Chemical Engineering Research and Design, 91(5):865–873, 2013.
[6] T. Nagel, H. Shao, C. Roßkopf, M. Linder, A. Wörner, and O. Kolditz. The influence of gas–solid reaction kinetics in models of thermochemical heat storage under monotonic and cyclic loading. Applied Energy, 136:289–302, 2014.
[7] J. Nam, J. Ko, and H. Ju. Three-dimensional modeling and simulation of hydrogen absorption in metal hydride hydrogen storage vessels. Applied Energy, 89(1):164–175, 2012.
[8] L. Bilke, T. Fischer, C. Helbig, C. Krawczyk, T. Nagel, D. Naumov, S. Paulick, K. Rink, A. Sachse, S. Schelenz, M. Walther, N. Watanabe, B. Zehner, J. Ziesch, and O. Kolditz. TESSIN VISLab – laboratory for scientific visualization. Environmental Earth Sciences, pages 1–19, 2014.
[9] O. Kolditz, S. Bauer, L. Bilke, N. Böttcher, J.O. Delfs, T. Fischer, U.J. Görke, T. Kalbacher, G. Kosakowski, C.I. McDermott, et al. OpenGeoSys: an open-source initiative for numerical simulation of thermo-hydro-mechanical/chemical (THM/C) processes in porous media. Environmental Earth Sciences, 67(2):589–599, 2012.
[10] W. Wang, T. Fischer, B. Zehner, N. Böttcher, U.-J. Görke, and O. Kolditz. A parallel finite element method for two-phase flow processes in porous media: OpenGeoSys with PETSc. Environmental Earth Sciences, accepted (online first), 2014.
[11] PETSc FAQ: How come when I run the same linear solver on a different number of processes it takes a different number of iterations? http://www.mcs.anl.gov/petsc/documentation/faq.html
[12] W. Wang, J. Taron, U.-J. Görke, and O. Kolditz. Automatic time stepping with Newton-Raphson method for two-phase fluid flow in porous media. AIP Conference Proceedings, 1558(1):1986–1988, 2013.

Biography

Dr. Thomas Nagel is a work group leader at the Department of Environmental Informatics of the Helmholtz Centre for Environmental Research. He holds a master's degree in Mechanical Engineering from Chemnitz University of Technology and a PhD in Bioengineering from Trinity College Dublin.