p4est canoP

Adaptive Mesh Refinement applications with CanoP, leveraging P4EST library capabilities ... and discussions on the evolution of simulation codes?

Pierre Kestener, CEA Saclay, DSM, Maison de la Simulation

15 July 2015


What are we talking about?

- A mini review of existing AMR codes / frameworks [1]
- An R&D project by Alex Fikl (M2 internship, 2014): implementing a scheme for two-phase flow (2 immiscible fluids, finite volumes, Suliciu relaxation)
- A short presentation of
  - P4EST (AMR layer: low level, no physics)
  - CanoP (application layer: high level)
- Example: implementing the hydro scheme (Godunov, 2nd order, HLLC) in canoP
- Finally, what really matters? Which ingredients does an AMR-type code need for new architectures / accelerators?

[1] Not dedicated to one application field

Block-structured AMR: AMRClaw

R.J. LeVeque / M. Berger. Sources: https://github.com/clawpack/clawpack

Euler 2D, piecewise constant per quadrant, problem #3. See ref: Lax and Liu, Solution of two-dimensional Riemann problems of gas dynamics by positive schemes, SIAM Journal on Scientific Computing, 1998, vol. 19, no. 2, pp. 319-340

- t0: grid of 40 × 40 coarse cells, 3 levels of AMR (×4^2, 640 × 640)
- tend: 265 000 cells


Cell-based AMR: p4est/canoP

Euler 2D, piecewise constant per quadrant, problem #3

- t0: coarse grid 32 × 32, up to 9 levels of AMR
- tend: 85 000 cells, 2500 time steps


Cell-based AMR: p4est/canoP

Euler 2D, piecewise constant per quadrant, problem #3. Side-by-side comparison: left, AMRClaw; right, canoP.

AMR frameworks: a brief survey

Patch-based or block-structured AMR:
- SAMRAI (LLNL), 300 kSLOC. Apps: ALE-AMR, radiation hydrodynamics, ICF. Mini-app: CleverLeaf (CloverLeaf + AMR)
- BoxLib (LBNL), 230 kSLOC. Apps: CASTRO (compressible radiation hydro, 60 kSLOC), MAESTRO (low-Mach hydro, 110 kSLOC)
- Chombo (LBNL), 230+70 kSLOC
- paramesh, ...
- astro-specific: Flash, Enzo, Pluto, AstroBEAR, NIRVANA, ...

These frameworks are able to solve hyperbolic, parabolic or elliptic problems.

ref: Donna Calhoun (with AMRClaw)

See ref: A survey of high level frameworks in block-structured adaptive mesh refinement packages, in Journal of ...

AMR frameworks: a brief survey (continued)

Cell-based AMR:
- RAMSES, R. Teyssier, 2002
- Fully Threaded Tree, Khokhlov, JCP, 1998

Difficulty: parallelizing the algorithms that underlie AMR.

What do we ideally want?
- separate out and stabilize a low-level layer for distributed parallelism and massive intra-node multi-threading (accelerators)
- a high-level layer to specify the numerical schemes
- decoupled software developments

ref: A. Fikl, transport / p4est, Zalesak disk advection

Cell-based AMR: a brief survey

Other AMR-capable frameworks, application-agnostic or nearly so:
- DENDRO: G. Biros (U. Austin), 2008-2010, geometric multigrid library for finite elements on octree meshes.
- p4est: Carsten Burstedde, since 2010, C, octree-based, Morton z-order curve for memory layout as well as distributed load balancing. The principal user of p4est is Deal.II (finite element framework).

Other AMR-capable frameworks, application-agnostic or nearly so (continued):
- PEANO: Tobias Weinzierl, since 2012, C++, based on the k-spacetree, an octree generalization
  - PEANO SFC on leaves; all parents (cells + vertices) are stored ≠ linearized tree
  - the SFC index is a key to a hash map
  - application: shallow water on a triangular mesh
  - tries to address issues related to XeonPhi hardware / SIMD / vectorization
  - software engineering: the Hollywood principle, which reads "Don’t call us, we call you!" (use of callback routines, as in GUI apps)

ref: Bader, SIAMPP 2014
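The Hollywood principle can be illustrated in a few lines of C. This is a minimal sketch, not PEANO or p4est code: `sweep_cells`, `cell_callback_t`, `sum_callback` and `sum_all_cells` are hypothetical names introduced only for the illustration. The framework side owns the traversal; the application side only supplies what to do with one cell.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: the framework hands over one cell value at a
 * time, plus an opaque pointer that belongs to the application. */
typedef void (*cell_callback_t) (double cell_value, void *user_data);

/* "Framework" side: owns the traversal order and calls the user back. */
void
sweep_cells (const double *cells, size_t n, cell_callback_t callback,
             void *user_data)
{
  assert (callback != NULL);
  for (size_t i = 0; i < n; ++i) {
    callback (cells[i], user_data);
  }
}

/* "Application" side: accumulate the cell values into a running sum. */
void
sum_callback (double cell_value, void *user_data)
{
  double *sum = (double *) user_data;
  *sum += cell_value;
}

/* Small wrapper that wires the two sides together. */
double
sum_all_cells (const double *cells, size_t n)
{
  double sum = 0.0;
  sweep_cells (cells, n, sum_callback, &sum);
  return sum;
}
```

The application never writes the loop itself, which is what lets a framework change the traversal (ordering, threading, distribution) without touching user code.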


P4est overview

Using slides from Carsten Burstedde (main author of the p4est library ⇒ https://github.com/cburstedde/p4est)

Synthesis: Forest of octrees

From tree...
- Limitation: cube-like geometric shapes

...to forest
- Advantage: geometric flexibility
- Challenge: non-matching coordinate systems between octrees

Parallel AMR algorithms (cluster level)

Patch-based (overlapping) AMR
- Optimal cache hit rate due to uniform patches
- Efficient vectorization due to uniform patches
- Load balance each mesh level individually
- Drawback: so far used with FV; hard to extend to non-cubic geometries

Octree-based (composite) AMR
- Optimal cache hit rate due to space-filling curve ordering
- Hanging nodes straightforward for finite elements (FE)
- Extended to complex geometries (forest of octrees)
- So far (MPI) scalability seems unlimited (in practice)

Octree-based AMR
- Octree maps to cube-like geometry
- 1:1 relation between octree leaves and mesh elements

Octree-based AMR, with leaves distributed over Proc 0, Proc 1 and Proc 2
- Space-filling curve (SFC): fast parallel partitioning
- Fast parallel tree algorithms for sorting and searching
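Why the SFC makes partitioning fast can be seen in a small sketch: once the leaves are ordered along the curve, giving each process a contiguous chunk only requires integer arithmetic. The code below assumes equal weights per leaf; `sfc_partition_cut` and `sfc_partition_count` are hypothetical helper names, not p4est functions.

```c
#include <assert.h>
#include <stdint.h>

/* First global leaf index owned by `rank` when `num_leaves` leaves, already
 * sorted along the space-filling curve, are split into `num_procs` contiguous
 * chunks of nearly equal size. Calling it with rank == num_procs returns
 * num_leaves, i.e. "one beyond" the last chunk. */
int64_t
sfc_partition_cut (int64_t num_leaves, int rank, int num_procs)
{
  assert (num_procs > 0 && rank >= 0 && rank <= num_procs);
  return (num_leaves * (int64_t) rank) / num_procs;
}

/* Number of leaves owned by `rank`: difference between consecutive cuts. */
int64_t
sfc_partition_count (int64_t num_leaves, int rank, int num_procs)
{
  assert (rank < num_procs);
  return sfc_partition_cut (num_leaves, rank + 1, num_procs)
    - sfc_partition_cut (num_leaves, rank, num_procs);
}
```

With 10 leaves on 3 processes the cuts fall at 0, 3, 6 and 10, so the chunk sizes differ by at most one leaf. A weighted partition would replace the leaf count by a cumulative sum of weights along the curve.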


Octree-based AMR: recursive subdivision and space-filling curves (SFC)

- 1:1 relation between leaves and elements → efficient encoding
- Map a 1D curve into 2D or 3D space → total ordering
- Recursive self-similar structure → scale-free
- Tree leaf traversal → cache-efficient
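The total ordering given by a z-order curve can be sketched by interleaving the bits of the integer cell coordinates. This is an illustrative 2D version limited to 16-bit coordinates; the function names are hypothetical, and the convention that x occupies the even bits (so x varies fastest) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Spread the 16 low bits of v so that bit b moves to bit 2*b. */
uint32_t
spread_bits_2d (uint32_t v)
{
  v &= 0x0000FFFFu;
  v = (v | (v << 8)) & 0x00FF00FFu;
  v = (v | (v << 4)) & 0x0F0F0F0Fu;
  v = (v | (v << 2)) & 0x33333333u;
  v = (v | (v << 1)) & 0x55555555u;
  return v;
}

/* 2D Morton (z-order) index: x bits at even positions, y bits at odd ones. */
uint32_t
morton2d (uint32_t x, uint32_t y)
{
  assert (x <= 0xFFFFu && y <= 0xFFFFu);
  return spread_bits_2d (x) | (spread_bits_2d (y) << 1);
}
```

On a 2 × 2 block the cells (0,0), (1,0), (0,1), (1,1) receive the indices 0, 1, 2, 3, which traces the "z" shape; sorting cells by this index yields the leaf traversal order.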


“p4est”: forest-of-octrees algorithms

p4est core API (for “write access”)
- p4est_new: Create a uniformly refined, partitioned forest
- p4est_refine: Refine per-element according to 0/1 callbacks
- p4est_coarsen: Coarsen 2^d elements according to 0/1 callbacks
- p4est_balance: Establish 2:1 neighbor sizes by additional refines
- p4est_partition: Parallel redistribution according to weights
- p4est_ghost: Gather one layer of off-processor elements

p4est “read access” partially formalized
- Loop through p4est data structures as needed
- p4est_iterate: over element volumes, faces, edges, corners


Adaptive geometric multigrid

Weak and strong scalability for Poisson’s equation [1]

[Figure, left panel: time (sec) versus cores, split into Smoother, Setup, Transfer and Coarse; efficiency labels 100%, 97%, 90%, 76%, 65%, 55% at 8, 64, 512, 4096, 32K and 262K cores. Right panel: time (sec) versus cores (10^3 to 10^5) for AMG strong and GMG strong scaling.]

[1] H. Sundar, G. Biros, B., et al., SC ’12

P4est: main ideas

Use the z-curve (Morton index) space-filling curve
- for the cell layout
- for indexing topological objects: faces, edges, corners, ...

Developing an entire app:
1. p4est connectivity: create / specify a coarse mesh; each hexahedral cell (quadrilateral in 2D) will be the root of an octree. The connectivity can be just a regular single square domain, or a more complex structured domain, e.g. cylindrical or spherical, or even unstructured at the coarse level (see below). A geometrical transformation is needed to go from logical octree space to physical space.
2. p4est: collective operations to build / modify the forest of octrees
3. p4est_iterate: sweep the forest of octrees (a kind of smart for loop)

p4est connectivity: a non-trivial 2D example

2D shell connectivity

Define 6 points (2 squares):

    A0 = (−1, 1, 0)    A1 = (0, 1, 0)    A2 = (1, 1, 0)
    A3 = (−1, 2, 0)    A4 = (0, 2, 0)    A5 = (1, 2, 0)

Define the geometric transformation from (α, β) to (x, y):

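$\beta \in [1, 2]$, $\alpha \in [-1, 1]$

$R = \frac{R_1^2}{R_2} \left( \frac{R_2}{R_1} \right)^{\beta} \Rightarrow R \in [R_1, R_2]$

$r = \tan\left( \alpha \frac{\pi}{4} \right)$, $q = \frac{R}{\sqrt{1 + r^2}}$

tree 0 and 1: $(x = q, \; y = q r)$

tree 2 and 3: $(x = -q r, \; y = q)$

...

tree 6 and 7: $(x = q r, \; y = -q)$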
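As a quick consistency check of this mapping (a sketch that assumes $q = R / \sqrt{1 + r^2}$ with $r = \tan(\alpha \pi / 4)$), every point of trees 0 and 1 lies on the circle of radius $R$, and the two ends of the $\beta$ interval give the two shell radii:

```latex
x^2 + y^2 = q^2 + q^2 r^2 = q^2 (1 + r^2) = R^2,
\qquad
R\big|_{\beta = 1} = \frac{R_1^2}{R_2} \cdot \frac{R_2}{R_1} = R_1,
\qquad
R\big|_{\beta = 2} = \frac{R_1^2}{R_2} \cdot \frac{R_2^2}{R_1^2} = R_2 .
```

The same identity holds for the other tree pairs, since they only exchange the roles of x and y and flip signs.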


p4est - quadrant data structure

    /** The 2D quadrant datatype */
    typedef struct p4est_quadrant
    {
      p4est_qcoord_t      x, y;         /**< coordinates */
      int8_t              level;        /**< level of refinement */
      union p4est_quadrant_data
      {
        void               *user_data;  /**< never changed by p4est */
        /* ... */
      }
      p;
    }
    p4est_quadrant_t;
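The integer coordinates and the level are enough to recover a quadrant's size and position inside its tree. The sketch below is self-contained and does not include p4est headers: the constant `SKETCH_MAXLEVEL = 30` is assumed to mirror the 2D p4est default, and all `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: integer coordinates span [0, 2^30) along each axis of a tree. */
#define SKETCH_MAXLEVEL 30
#define SKETCH_ROOT_LEN ((int32_t) 1 << SKETCH_MAXLEVEL)

/* Integer side length of a quadrant at the given refinement level. */
int32_t
sketch_quadrant_len (int level)
{
  assert (level >= 0 && level <= SKETCH_MAXLEVEL);
  return (int32_t) 1 << (SKETCH_MAXLEVEL - level);
}

/* Side length expressed in the tree's unit square [0, 1]. */
double
sketch_quadrant_size (int level)
{
  return (double) sketch_quadrant_len (level) / (double) SKETCH_ROOT_LEN;
}

/* Integer coordinate of a quadrant corner mapped to [0, 1). */
double
sketch_coord_to_unit (int32_t coord)
{
  assert (coord >= 0 && coord < SKETCH_ROOT_LEN);
  return (double) coord / (double) SKETCH_ROOT_LEN;
}
```

Storing coordinates as integers at the finest possible resolution keeps comparisons between quadrants of different levels exact; the conversion to floating point happens only when the geometry is needed.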


p4est - tree data structure

The high-level user does not need to know the details of this structure.

    /** The p4est tree datatype */
    typedef struct p4est_tree
    {
      sc_array_t          quadrants;         /**< locally stored quadrants */
      p4est_quadrant_t    first_desc,        /**< first local descendant */
                          last_desc;         /**< last local descendant */
      p4est_locidx_t      quadrants_offset;  /**< cumulative sum over earlier
                                                  trees on this process
                                                  (locals only) */
      p4est_locidx_t      quadrants_per_level[P4EST_MAXLEVEL + 1];
                                             /**< locals only */
      int8_t              maxlevel;          /**< highest local quadrant level */
    }
    p4est_tree_t;


p4est - forest of octrees data structure

    /** The p4est forest datatype */
    typedef struct p4est
    {
      sc_MPI_Comm         mpicomm;       /**< MPI communicator */
      int                 mpisize,       /**< number of MPI processes */
                          mpirank;       /**< this process’s MPI rank */
      size_t              data_size;     /**< size of per-quadrant p.user_data */
      void               *user_pointer;  /**< convenience pointer for users,
                                              never touched by p4est */

      p4est_topidx_t      first_local_tree;  /**< 0-based index of first local
                                                  tree, must be -1 for an
                                                  empty proc */
      p4est_topidx_t      last_local_tree;   /**< 0-based index of last local
                                                  tree, must be -2 for an
                                                  empty proc */
      p4est_locidx_t      local_num_quadrants;   /**< number of quadrants on all
                                                      trees on this processor */
      p4est_gloidx_t      global_num_quadrants;  /**< number of quadrants on all
                                                      trees on all processors */
      p4est_gloidx_t     *global_first_quadrant; /**< first global quadrant index
                                                      for each process and 1
                                                      beyond */
      p4est_quadrant_t   *global_first_position; /**< first smallest possible quad
                                                      for each process and 1
                                                      beyond */
      p4est_connectivity_t *connectivity;  /**< connectivity structure,
                                                not owned */
      sc_array_t         *trees;           /**< array of all trees */
    }
    p4est_t;

p4est - iterate over geometric items

    /** Execute user supplied callbacks at every volume, face, and corner in the
     * local forest.
     * \param[in] p4est          the forest
     * \param[in] ghost_layer    optional: when not given, callbacks at the
     *                           boundaries of the local partition cannot provide
     *                           quadrant data about ghost quadrants: missing
     *                           (p4est_quadrant_t *) pointers are set to NULL,
     *                           missing indices are set to -1.
     * \param[in,out] user_data  optional context to supply to each callback
     * \param[in] iter_volume    callback function for every quadrant’s interior
     * \param[in] iter_face      callback function for every face between
     *                           quadrants
     * \param[in] iter_corner    callback function for every corner between
     *                           quadrants
     */
    void p4est_iterate (p4est_t * p4est,
                        p4est_ghost_t * ghost_layer,
                        void *user_data,
                        p4est_iter_volume_t iter_volume,
                        p4est_iter_face_t iter_face,
                        p4est_iter_corner_t iter_corner);

p4est iterate example - compute CFL

    /* Volume callback: track the largest (wave speed / cell size) ratio. */
    void
    iterator_ramses_cfl (p4est_iter_volume_info_t * info, void *user_data)
    {
      p4est_t            *p4est = info->p4est;  /* get forest structure */
      solver_wrap_t      *solver_wrap = (solver_wrap_t *) p4est->user_pointer;

      /* geometry (if NULL, it means cartesian by default) */
      p4est_geometry_t   *geom = solver_wrap->geom;
      solver_t           *solver = solver_from_p4est (p4est);
      double             *max_cfl = (double *) user_data;

      qdata_t            *data = quadrant_get_data (info->quad);
      qdata_reconstructed_t w_prim;
      double              velocity = 0.0;

      /* get the primitive variables in the current quadrant */
      w_prim = reconstructed_variables (data->w);

      /* sum over dimensions of (sound speed + |velocity component|) */
      for (int d = 0; d < P4EST_DIM; ++d) {
        velocity += w_prim.c + fabs (w_prim.velocity[d]);
      }

      /* update the running maximum (over the quadrants swept so far) */
      double              dx = quadrant_length (info->quad);
      *max_cfl = max (*max_cfl, velocity / dx);
    } /* iterator_ramses_cfl */
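What the callback accumulates, and how a time step could be derived from it, can be reproduced without p4est on a plain array of cells. This is a minimal 2D sketch: `sketch_cell_t` and the `sketch_*` functions are hypothetical, and the convention dt = cfl / max_rate is an assumption rather than something stated on the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-cell state, reduced to what the CFL estimate needs. */
typedef struct sketch_cell
{
  double c;        /* sound speed */
  double vx, vy;   /* velocity components */
  double dx;       /* cell size */
} sketch_cell_t;

static double
sketch_abs (double v)
{
  return v < 0.0 ? -v : v;
}

/* Largest value of sum_d (c + |u_d|) / dx over all cells. */
double
sketch_max_rate (const sketch_cell_t *cells, size_t n)
{
  double max_rate = 0.0;
  for (size_t i = 0; i < n; ++i) {
    assert (cells[i].dx > 0.0);
    double speed = (cells[i].c + sketch_abs (cells[i].vx))
      + (cells[i].c + sketch_abs (cells[i].vy));
    double rate = speed / cells[i].dx;
    if (rate > max_rate) {
      max_rate = rate;
    }
  }
  return max_rate;
}

/* Time step from a Courant number (assumed convention: dt = cfl / max_rate). */
double
sketch_time_step (double cfl, double max_rate)
{
  assert (max_rate > 0.0);
  return cfl / max_rate;
}
```

In a distributed run each process only sees its local quadrants, so the local maximum would still have to be combined across processes (for example with an MPI max reduction) before computing the time step.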

Double Mach reflection test

Well-known test for compressible fluid simulation (e.g. Euler equations): a Mach 10 shock wave reflected by a wedge.

Full resolution (700k cells)

Double Mach reflection test with AMR

3 different simulations, with 20k, 30k and 50k cells at the end; the full-grid high-resolution run had 700k cells.

Double Mach reflection test: with / without AMR

Dam Break test

Gravity-driven two-phase flow, Suliciu relaxation scheme

$\partial_t W + \nabla \cdot F(W) = S(W)$, with $W = \left[ \rho, \rho Y, \rho u_x, \rho u_y, \rho u_z \right]^T$

work in progress: F. Drui, S. Kokh, A. Larat, M. Massot

P4EST - advection scheme - guided tour

cell data structure

    /** Per-quadrant data for this example.
     *
     * In this problem, we keep track of the state variable u, its
     * derivatives in space, and in time.
     */
    typedef struct step3_data
    {
      double              u;              /**< the state variable */
      double              du[P4EST_DIM];  /**< the spatial derivatives */
      double              dudt;           /**< the time derivative */
    }
    step3_data_t;


1. Create the inter-tree connectivity (usually predefined for complex ones) and the (optional) geometry
2. Create the p4est object with the initial condition
3. Refine / coarsen the initial grid
4. 2:1 balance + partition
5. Time loop
   - once every Δi_a: adapt cycle (refine / coarsen / balance)
   - once every Δi_r: repartition
   - once every Δi_io: dump data to file
   - every i: sync ghost
   - every i: sweep grid (p4est_iterate) to compute flux
   - every i: sweep grid (p4est_iterate) to update u
   - every i: sync ghost
   - every i: sweep grid (p4est_iterate) to compute du/dx
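The cadence of the time loop can be sketched as a skeleton that only counts how often each stage would run. Everything here is hypothetical (`sketch_loop_stats_t`, `sketch_time_loop`, and the choice of triggering each periodic stage when the step index is a multiple of its period); a real solver would call its adapt, partition, output, ghost-exchange and p4est_iterate routines where the counters are incremented.

```c
#include <assert.h>

/* Hypothetical counters recording how often each stage of the loop ran. */
typedef struct sketch_loop_stats
{
  int adapt, repartition, output, ghost_sync, sweeps;
} sketch_loop_stats_t;

/* Walk through `nsteps` iterations with periods di_adapt, di_repart, di_io. */
sketch_loop_stats_t
sketch_time_loop (int nsteps, int di_adapt, int di_repart, int di_io)
{
  sketch_loop_stats_t s = { 0, 0, 0, 0, 0 };
  assert (di_adapt > 0 && di_repart > 0 && di_io > 0);
  for (int i = 0; i < nsteps; ++i) {
    if (i % di_adapt == 0)  s.adapt++;        /* refine / coarsen / balance */
    if (i % di_repart == 0) s.repartition++;  /* redistribute the leaves */
    if (i % di_io == 0)     s.output++;       /* dump data to file */
    s.ghost_sync++;                           /* sync before the flux sweep */
    s.sweeps += 2;                            /* compute flux, then update u */
    s.ghost_sync++;                           /* sync before the gradient sweep */
    s.sweeps += 1;                            /* compute du/dx */
  }
  return s;
}
```

For 10 steps with periods 2, 5 and 10 this gives 5 adapt cycles, 2 repartitions, 1 output, 20 ghost exchanges and 30 grid sweeps, which makes the relative cost of the per-step stages easy to see.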


main routine: creating the p4est object

    p4est = p4est_new_ext (mpicomm,                       /* communicator */
                           conn,                          /* connectivity */
                           0,                             /* minimum quadrants per MPI process */
                           4,                             /* minimum level of refinement */
                           1,                             /* fill uniform */
                           sizeof (step3_data_t),         /* data size */
                           step3_init_initial_condition,  /* initializes data */
                           (void *) (&ctx));              /* context */


CanoP bug...

Kelvin-Helmholtz with a bug that mixed up local and global indices...

Possible cell-based AMR extensions on the application layer

- use patches on the octree leaves
- elliptic problems using a Finite Element approach (already done by the P4EST authors)
- high-order numerical schemes using a Discontinuous Galerkin approach
  - modal: $u(x^k, t) = \sum_{i=1}^{N_{dof}} u_i^k(t) \, \phi_i(x)$, where the $\phi_i$ are usually orthogonal polynomials (e.g. Legendre). Pro: changing the order. Con: boundary conditions, ...
  - nodal: $u(x^k, t) = \sum_{i=1}^{N} u_i^k(x^k, t) \, l_i(x)$, where the $l_i$ are Lagrange polynomials. Pro: multigrid transfer operator, collocated quadrature, ...
  - the modal approach is simpler than the nodal one in the context of AMR
  - approach used in http://arxiv.org/abs/1506.06140 (Schaal, Springel, et al.)
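To make the modal expansion concrete, the sketch below evaluates a 1D expansion on Legendre polynomials over the reference interval [-1, 1]. It is an illustration only: `sketch_modal_eval` is a hypothetical name, the coefficients are indexed from 0 (degree 0 first) rather than from 1 as on the slide, and the polynomials are not normalized.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluate u(x) = sum_{i=0}^{n-1} coeff[i] * P_i(x), where P_i is the
 * Legendre polynomial of degree i, using the three-term recurrence
 * (i + 1) P_{i+1}(x) = (2 i + 1) x P_i(x) - i P_{i-1}(x). */
double
sketch_modal_eval (const double *coeff, size_t n, double x)
{
  double p_prev = 1.0;   /* P_0(x) */
  double p_curr = x;     /* P_1(x) */
  double u = 0.0;

  assert (coeff != NULL || n == 0);
  if (n > 0) u += coeff[0] * p_prev;
  if (n > 1) u += coeff[1] * p_curr;
  for (size_t i = 1; i + 1 < n; ++i) {
    double p_next = ((2.0 * i + 1.0) * x * p_curr - i * p_prev) / (i + 1.0);
    u += coeff[i + 1] * p_next;
    p_prev = p_curr;
    p_curr = p_next;
  }
  return u;
}
```

Changing the order of the approximation amounts to changing `n`, since lower-order coefficients keep their meaning; this is one way to read the "changing the order" advantage listed for the modal form.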


Cell-based AMR on accelerators

CLAMR by R. Robey, Los Alamos NL. Cell-based AMR mini-app / proof of concept for new architectures (GPU, XeonPhi, ...) in OpenCL.

Some key ideas:
- maximum data locality
- use a hash map to translate neighbor information from physical space into memory space
- proof of concept: 2D only, shallow-water numerical scheme

Programming accelerators with smart libraries

Kokkos, Sandia NL, spun off from Trilinos (distributed linear algebra package)

Kokkos executive summary: a TBB-style library for accelerators
- parallel primitive algorithms (specialized for a given architecture)
- decouples the memory layout from the physical layout

Future of accelerator programming: Kokkos among others

Kokkos’ Layered Libraries
- Core
  - Multidimensional arrays and subarrays in memory spaces
  - parallel_for, parallel_reduce, parallel_scan on execution spaces
  - Atomic operations: compare-and-swap, add, bitwise-or, bitwise-and
- Containers
  - UnorderedMap: fast lookup and thread scalable insert / delete
  - Vector: subset of std::vector functionality to ease porting
  - Compressed Row Storage (CRS) graph
  - Host mirrored & synchronized device resident arrays
- Sparse Linear Algebra
  - Sparse matrices and linear algebra operations
  - Wrappers for vendors’ libraries
  - Portability layer for Trilinos manycore solvers

reference: slides by Edwards, Trott, Sunderland (SANDIA) at GTC2014, Kokkos, a Manycore Device Performance Portability Library for C++ HPC Applications

Kokkos multi-dimensional arrays
- map a multi-index (i, j, k, ...) ⇐⇒ memory location in a memory space [2]
- Kokkos chooses a default memory layout adapted to the target device
- decouple the logical index (i, j, k, ...) from the actual memory layout

[2] Along the same lines, see chapter 28 of the book High Performance Parallelism Pearls, Morton order improve performance
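The decoupling of logical index and memory location can be illustrated with the two classic layouts of a 2D array. The sketch is plain C rather than Kokkos code, and the `sketch_index_*` names are hypothetical; the comments relate each mapping to the layout names Kokkos uses (LayoutRight and LayoutLeft).

```c
#include <assert.h>
#include <stddef.h>

/* Row-major mapping: the rightmost index j is contiguous in memory
 * (the layout Kokkos calls LayoutRight). */
size_t
sketch_index_right (size_t i, size_t j, size_t n0, size_t n1)
{
  assert (i < n0 && j < n1);
  return i * n1 + j;
}

/* Column-major mapping: the leftmost index i is contiguous in memory
 * (the layout Kokkos calls LayoutLeft). */
size_t
sketch_index_left (size_t i, size_t j, size_t n0, size_t n1)
{
  assert (i < n0 && j < n1);
  return i + j * n0;
}
```

A kernel written against the logical index (i, j) stays the same whichever mapping is plugged in. When consecutive threads handle consecutive i, the left layout gives adjacent threads adjacent addresses, which favors coalesced GPU accesses, while the right layout keeps each thread's inner loop over j contiguous, which favors CPU caches.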

MiniMD is used to benchmark thread-scalable algorithms before integrating them into LAMMPS (2014).

MiniMD Performance: Lennard-Jones force model using atom neighbor list

- Solve Newton’s equations for N particles
- Simple Lennard-Jones force model: $F_i = \sum_{j, \; r_{ij} < r_{cut}} 6 \varepsilon \left[ \left( \frac{\varsigma}{r_{ij}} \right)^{7} - 2 \left( \frac{\varsigma}{r_{ij}} \right)^{13} \right]$
- Use atom neighbor list to avoid N^2 computations (pseudocode):

      pos_i = pos(i);
      for (jj = 0; jj < num_neighbors(i); jj++) {
        j = neighbors(i, jj);
        r_ij = pos_i - pos(j);          // random read of 3 floats
        if (|r_ij| < r_cut)
          f_i += 6*e*( (s/r_ij)^7 - 2*(s/r_ij)^13 );
      }
      f(i) = f_i;

- Moderately compute-bound computational kernel
- On average 77 neighbors, with 55 inside of the cutoff radius

MiniMD Performance: Lennard-Jones (LJ) force model using atom neighbor list

- Test problem: #Atoms = 864k, ~77 neighbors/atom
- Neighbor list array with correct vs. wrong layout
  - Different layout between CPU and GPU
  - Random read of neighbor coordinate via GPU texture fetch

[Figure: GFlop/s (0 to 200) on Xeon, Xeon Phi and K20x for three cases: correct layout (with texture), correct layout without texture, wrong layout (with texture).]

- Large loss in performance with wrong layout
  - Even when using GPU texture fetch
  - Kokkos, by default, selects the correct layout

Complex memory layout for performance

Chapter 28 of the book High Performance Parallelism Pearls, Morton order improve performance, by Kerry Evans (Intel): measurements of transpose and dense matrix multiplication on Xeon and XeonPhi.

Space-filling curve - memory layout - hash table

An SFC can be used for mapping geometric data to memory.

ref: lecture notes, CMSC 425, game programming

AMR with hash table data storage

- Each oct stores (i, j), the standard structured indices it would have if the entire grid were at the refinement level of the oct.
- The index $ID = \sum_{l=0}^{level-1} (2^l \cdot 2^l) + i \cdot 2^{level} + j$ is used as the hash key to retrieve cell data.
- Neighbors are retrieved using the hash map.

ref: Ji, Lien and Yee, A new adaptive mesh refinement data structure with an application to detonation, JCP 229 (2010), 8981-8993
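The hash key can be computed directly from (i, j, level). The sketch below follows the 2D formula, assuming a root level made of a single cell (2^0 × 2^0) and indices in the range 0 ≤ i, j < 2^level; the `sketch_*` names are hypothetical and not taken from the cited paper.

```c
#include <assert.h>
#include <stdint.h>

/* Number of cells on all levels coarser than `level`, for a 2D hierarchy
 * whose level l holds 2^l x 2^l cells: sum_{l=0}^{level-1} 4^l. */
uint64_t
sketch_level_offset (int level)
{
  uint64_t offset = 0;
  assert (level >= 0 && level < 31);
  for (int l = 0; l < level; ++l) {
    offset += (uint64_t) 1 << (2 * l);
  }
  return offset;
}

/* Hash key ID = offset(level) + i * 2^level + j. */
uint64_t
sketch_cell_id (uint32_t i, uint32_t j, int level)
{
  uint64_t offset = sketch_level_offset (level);   /* also validates level */
  uint64_t n = (uint64_t) 1 << level;
  assert (i < n && j < n);
  return offset + (uint64_t) i * n + j;
}
```

Because the offsets of successive levels do not overlap, every (i, j, level) triple maps to a distinct key, and the key of a neighbor at the same level is obtained by recomputing the ID with i or j shifted by one.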


Cell-based AMR - Poisson solver

Several refs:
- Finite Element discretization, e.g. p4est step4 or DENDRO
- Ji, Lien and Yee, Parallel Adaptive Mesh Refinement Combined with Additive Multigrid for the Efficient Solution of the Poisson Equation, ISRN Applied Mathematics, vol. 2012 (2012)

Cell-based AMR - Poisson solver - Multigrid method

DENDRO, ref: A parallel geometric multigrid method for finite elements on octree meshes, Sampath and Biros, SIAM J. Sci. Comput., vol. 32, no. 3, pp. 1361

- Finite element spaces using conforming trilinear basis functions
- Construct a sequence of octrees: coarser and coarser octrees are used in the multigrid V-cycle
- Hanging nodes are in effect removed: they do not represent independent degrees of freedom

Cell-based AMR - Poisson solver - p4est

ref: arXiv 1406.0089, Recursive algorithms for distributed forests of octrees.

- implements the same idea to remove non-conforming / hanging nodes
- data structure named p4est_lnodes (see step 4 in the examples); lnodes refers to Gauss-Lobatto quadrature points

p4est - finite elements applications

How hanging nodes are handled in p4est:


p4est - discontinuous Galerkin applications

Burstedde et al., A High-Order Discontinuous Galerkin Method for Wave Propagation Through Coupled Elastic-Acoustic Media, 2010.

Cell-based AMR - hash map - WENO - misc

Based on the same idea as Yee (hash table AMR), but using cell tri-partition instead of bi-partition (i.e. octree):
- Wang, Dong and Shu, Parallel adaptive mesh refinement method based on WENO finite difference scheme for the simulation of multi-dimensional detonation (Brown University)
- Qin, Shu and Yang, Bound-preserving discontinuous Galerkin methods for relativistic hydrodynamics (Brown University)