Chemical Engineering Science 121 (2015) 200–216


Engineering molecular dynamics simulation in chemical engineering

Ji Xu, Xiaoxia Li, Chaofeng Hou, Limin Wang, Guangzheng Zhou, Wei Ge*, Jinghai Li*

The EMMS Group, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing 100190, People's Republic of China

HIGHLIGHTS

- Traditional MD is generalized to accommodate discrete elements at different scales.
- Consistency among system, model, software and hardware to achieve high efficiency.
- Possibility of engineering MD into a virtual experiment platform is discussed.

ARTICLE INFO

Article history: Received 2 July 2014; Received in revised form 25 September 2014; Accepted 27 September 2014; Available online 5 October 2014

Keywords: Engineering MD simulation; Chemical processes; Meso-scale; Multiscale; EMMS paradigm; Particle methods

ABSTRACT

Chemical engineering systems usually involve multiple spatio-temporal scales, grouped into different levels, from the molecular scale of reactants to the industrial scale of reactors. Molecular dynamics (MD) simulation is one of the most fundamental methods for the study of such systems, but it is too costly to simulate large-scale behavior directly. However, there are two great potentials in extending this method. First, the logic and algorithms of traditional MD simulation can be generalized from the material level to higher levels, since the elements of each level are all discrete in nature and can be well defined, allowing an MD-style simulation based on different elements. Second, MD simulations can be accelerated by realizing the structural consistency among the problem, model, software and hardware (the so-called EMMS paradigm). These two potentials make it possible to engineer MD simulation to deal with the whole spectrum of chemical engineering phenomena. In this review, we summarize our discrete simulation studies exploring such potentials, from the establishment of a general software and hardware framework to typical applications at different levels, including the reactions in coal pyrolysis, the dynamics in a virion, the atomic behavior in silicon at the millimeter scale, and finally continuum flow. The possibility of engineering MD simulation into a virtual experiment platform is discussed at the end.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Chemical engineering systems typically involve multiple scales, from the material level of atoms, molecules and assemblies, to the reactor level of particles and clusters, and then to the system level of reactors and processes, as illustrated in Fig. 1 (Li et al., 2009, 2013).

Abbreviations: CPU, central processing unit; CUDA, compute unified device architecture; DEM, discrete element method; DPPC, dipalmitoylphosphatidylcholine; EMMS, energy-minimization multi-scale; GPU, graphics processing unit; LAMMPS, large-scale atomic/molecular massively parallel simulator; LINCS, linear constraint solver; MD, molecular dynamics; MPI, message passing interface; PE, polyethylene; PPM, pseudo-particle modeling; QM, quantum mechanics; RNA, ribonucleic acid; ReaxFF, reactive force field; SPC, simple point charge; SPH, smoothed particle hydrodynamics; VARxMD, visualization and analysis of reactive molecular dynamics; VPE, virtual process engineering

* Corresponding authors. E-mail addresses: [email protected] (W. Ge), [email protected] (J. Li). http://dx.doi.org/10.1016/j.ces.2014.09.051

Each level features an element scale, a global scale (which is also the element scale of the next higher level) and a meso-scale in between (Li et al., 2009). For example, atoms and molecules are at the element scale of the material level, where the bulk of the material corresponds to the global scale and the molecular assemblies in between are at the meso-scale. Computer simulation is becoming an increasingly important approach in chemical engineering, with its ever-growing demand for quantifying such systems with higher accuracy and spatio-temporal resolution, for which experimental methods alone are insufficient in many cases. Molecular dynamics (MD) simulation is one of the most fundamental choices in the available toolbox. In MD simulation, the whole system is discretized into interacting elements, typically atoms, to reflect the underlying mechanism. After more than half a century, it has developed from the study of interacting hard spheres (Alder and Wainwright, 1957) to the self-assembly of proteins (Lee et al., 2011; Dill and MacCallum, 2012;


Fig. 1. Multi-scale hierarchy of chemical engineering. Modified from Li et al. (2009, 2013).

Zhao et al., 2013). Many force fields have been well established to formulate the interactions between the elements, providing fruitful details of the studied systems at the material level. Although it is still computationally formidable to predict macro-scale behavior directly from the atomic scale, MD simulation can be applied over a much wider range of scales and play a more important role if its principle is understood in a general sense. Thanks to the discrete nature of the physical world, which is organized into multiple levels and scales, the elements of each level can be described as discrete "particles" with well-defined interaction models, as shown in Fig. 1. Therefore, the framework of MD simulation, that is, discretizing the investigated system into elements with interactions, can be extended to higher levels by substituting the atoms and molecules with more general "elements" (referred to as "particles" in the following). For example, the discrete element method (DEM) (Cundall and Strack, 1979) and the smoothed particle hydrodynamics (SPH) method (Gingold and Monaghan, 1977; Lucy, 1977) employ natural or fictitious material elements with Newtonian interactions for the simulation of macro-scale behavior. Perhaps the essential difference is that, from the modeling aspect, dissipation has to be considered in these interactions; computationally, however, almost the same algorithmic framework can be used, so that these methods can be "engineered" to become part of a generalized family of MD simulation. That is, the modeling in MD simulation can be generalized from the material level to higher levels. In this review, we will focus on the research of the first two levels in Fig. 1, namely material and reactor systems.

On the other hand, from the computational point of view, a large gap exists between the peak performance of supercomputers and the sustained performance of MD applications (Phillips et al., 2005; Hess et al., 2008), and it is actually not cost-effective to use mainstream central processing units (CPUs) as the main computing hardware for such applications. The problem can be tackled, from one aspect, by keeping the structural consistency between problem, physical model, numerical algorithm and hardware (Chen et al., 2009; Ge et al., 2011; Li et al., 2013), which follows the so-called "EMMS paradigm" (Li et al., 2009, 2013; Ge et al., 2011). This paradigm was proposed according to the common features of the multi-scale simulation methods based on the energy-minimization multi-scale (EMMS) model (Li et al., 1988; Li and Kwauk, 1994) and its extensions. The model was first proposed to describe the heterogeneous meso-scale structure in concurrent-up gas–solid flow, featuring the closure of hydrodynamic equations with stability conditions, and was then extended to gas–liquid/solid (Liu et al., 2001), gas–liquid (Zhao and Ge, 2007; Yang et al., 2010), turbulent (Li et al., 1999; Zhang et al., 2014) and some other systems (Ge et al., 2007). In this paradigm, multi-scale simulations start with the prediction of the global flow field distribution at the

steady state; the dynamic evolution of the system is then followed using the meso-scale models. In some cases, micro-scale details are described by carrying out discrete simulations concurrently. Algorithms with different computational features are then identified and carried out by corresponding processors of different architectures. Higher efficiency and scalability, as well as much lower cost, can be achieved when this consistency is kept. For the simulation of multiphase flows, this strategy has proven effective (Ge et al., 2011, 2013; Li et al., 2014a, b). In the spirit of these conceptions, we will review the studies of our group on the establishment of efficient and scalable MD simulation capabilities, under the name "engineering molecular dynamics simulation". The name stands both for the action of generalizing MD modeling to different scales and levels, together with improving its implementation, and for the resultant method that is capable of solving engineering problems. A series of applications enabled by this capability, ranging from the reactions in coal pyrolysis and the dynamics in a virion, to the atomic behavior in silicon at the millimeter scale, and finally continuum flow, are then revisited to demonstrate the significance of engineering MD simulations in a broader sense.

2. Engineering MD simulation

To engineer MD simulation into a general simulation approach for chemical engineering, it is important to extract the common features and structures of all related methods. For this purpose, we have outlined a flowchart shared by most MD-style simulations, as shown in Fig. 2. The general modules in it are also summarized in Table 1, namely Neighbor search, Interaction processing and Configuration update. In addition, algorithms exclusive to specific groups of methods are listed as Particular modules, which are shared within those groups. However, there are still slight functional differences in each general module. For example, in Neighbor search, the neighbor-list technique is widely used in traditional MD to accelerate the search of changing neighbors with short-range interactions. In the MD simulation of solid materials using many-body potentials, a much-simplified fixed neighbor list or even a stencil can be used instead, while in some DEM models where the interaction contains history-dependent forces (Zhang and Vu-Quoc, 2000; Rapaport, 2007), a contact history list should be added to the normal neighbor list. However, these variants are interrelated and can share a lot in implementation and programming. With such abstraction and sharing at different levels, we can minimize the programming effort in carrying out specific simulation cases. We will discuss the establishment of other modules with the same strategy in the following sections.
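To make this shared structure concrete, the following minimal sketch shows how the three General modules and the hooks for the Particular modules could be arranged in a single time-stepping loop. The data layout, function names and the rebuild interval are illustrative assumptions for this review, not the actual code of any package discussed here.

#include <vector>

// Illustrative particle state shared by the MD-style methods (Fig. 2).
struct Particle { double x[3], v[3], f[3]; };

// The three General modules; their bodies are method-specific and left as stubs here.
void neighbor_search(std::vector<Particle>& p)            { /* build cell/neighbor lists */ }
void interaction_processing(std::vector<Particle>& p)     { /* short-range, non-bonded forces */ }
void configuration_update(std::vector<Particle>& p, double dt) { /* integrate x and v */ }

// Hook for the Particular modules (thermostats, reactions, constraints, contact history, ...).
void particular_modules(std::vector<Particle>& p)          { /* method-specific physics */ }

void run(std::vector<Particle>& p, double dt, long nsteps, int rebuild_every) {
    for (long step = 0; step < nsteps; ++step) {
        if (step % rebuild_every == 0)   // lists often stay valid for several steps
            neighbor_search(p);
        interaction_processing(p);       // usually the most expensive module
        particular_modules(p);
        configuration_update(p, dt);     // independent per particle
    }
}

int main() {
    std::vector<Particle> particles(1000, Particle{});
    run(particles, 0.001, 100, 10);      // illustrative parameters only
    return 0;
}

Each concrete method in Table 1 then fills these stubs with its own neighbor-list variant, interaction model and update rule.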


2.1. Generality

Implementing all these modules efficiently is a non-trivial task. The "harmony" (consistency) in the "quartet" of system-model-software-hardware, as mentioned in Section 1, is the key to higher efficiency and scalability. To achieve this consistency, the structure and characteristics of each module should be analyzed further and exploited in its implementation. For the Neighbor search module, which aims to accelerate the Interaction processing, two general approaches, namely the cell-list and neighbor-list methods (Verlet, 1967), are available. In the cell-list method, the simulation space is divided into cells with sizes slightly larger than the cut-off distance of the interactions, so that for the particles in each cell, the interacting pairs only need to be searched in their own and neighboring cells. This method is more suitable for systems with fast-moving particles where the interacting pairs should be updated nearly every

Fig. 2. Generality of the flowchart for most MD-style simulations.

time step. On the contrary, the neighbor-list method is more favorable for slowly changing systems, where the interacting pairs remain valid over several time steps. The neighboring particles are searched via cells as in the cell-list method, and the indices of neighboring particles are grouped together and stored successively. Furthermore, searching units, rather than atoms or particles, are sometimes employed, especially in systems where not all particles need to be searched. For example, in the water models SPC (Berendsen et al., 1981) and TIP3P (Jorgensen et al., 1983), the whole water molecule can be represented by its oxygen atom, which reduces the searching cost between water/water and non-water/water particles to 1/9 and 1/3, respectively. This strategy not only improves the efficiency of neighbor searching, but also unifies the searching units of different methods, such as atoms in MD and particles in DEM. Moreover, neighbor searching offers fine-grain parallelism, as all searching units can be treated simultaneously at the thread and/or instruction level.

The Interaction processing module distinguishes each model fundamentally. Among all the interaction types between particles, only the short-range ones are classified into the Interaction processing module, and they take different forms in different systems. The SPH method describes the motion of fluids and/or solids with the discrete form of the continuity and momentum equations, and the numerical particles interact within the influence domain of the weighting function (Monaghan, 2005; Liu and Liu, 2010). The particles of granular materials interact with each other only when they collide, and there exist various contact models (Zhu et al., 2007), such as the spring, dashpot and friction interactions (Cundall and Strack, 1979). This module is the most computationally intensive part, so its efficiency and scalability should be ensured in the first place. Fortunately, some common properties of the models discussed, namely additivity and locality, give intrinsic parallelism and scalability that can greatly facilitate supercomputing (Ge and Li, 2000; Ge et al., 2011). Here, additivity means that the interactions among the particles can be calculated independently at the same time, and all the forces on one particle due to other particles can then be summed up to give the resultant force. It provides fine-grain parallelism at the thread and/or instruction level, which is important for reducing the computing time of a small- or medium-scale problem. On the other hand, locality means that the interactions are only effective in a short range and can be reasonably truncated beyond a certain distance. It facilitates the application of the most commonly used space-decomposition strategy for parallel computing at the process level on distributed systems.
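As an illustration of the cell-list approach described above, the following sketch bins particles into cells whose edge is at least the cut-off distance and then searches pairs only in the cell itself and its 26 neighbors. The data layout and names are illustrative; periodic minimum-image wrapping is omitted for brevity, and the box is assumed to be at least three cut-off distances wide in each direction.

#include <vector>
#include <utility>
#include <algorithm>

struct Vec3 { double x, y, z; };

// Build the interacting pair list with a cell list. Assumes at least three cells per
// direction, otherwise a neighbor cell can be visited twice through the periodic wrap.
std::vector<std::pair<int,int>>
cell_list_pairs(const std::vector<Vec3>& pos, const Vec3& box, double rcut) {
    int nx = std::max(1, int(box.x / rcut));   // cell edge >= rcut
    int ny = std::max(1, int(box.y / rcut));
    int nz = std::max(1, int(box.z / rcut));
    auto cell_index = [&](int cx, int cy, int cz) { return (cz * ny + cy) * nx + cx; };

    std::vector<std::vector<int>> cells(nx * ny * nz);
    for (int i = 0; i < (int)pos.size(); ++i) {
        int cx = int(pos[i].x / box.x * nx) % nx;
        int cy = int(pos[i].y / box.y * ny) % ny;
        int cz = int(pos[i].z / box.z * nz) % nz;
        cells[cell_index(cx, cy, cz)].push_back(i);
    }

    std::vector<std::pair<int,int>> pairs;
    double rc2 = rcut * rcut;
    for (int cz = 0; cz < nz; ++cz)
      for (int cy = 0; cy < ny; ++cy)
        for (int cx = 0; cx < nx; ++cx)
          for (int dz = -1; dz <= 1; ++dz)          // own cell and the 26 neighbors
            for (int dy = -1; dy <= 1; ++dy)
              for (int dx = -1; dx <= 1; ++dx) {
                int oc = cell_index((cx + dx + nx) % nx, (cy + dy + ny) % ny, (cz + dz + nz) % nz);
                for (int i : cells[cell_index(cx, cy, cz)])
                    for (int j : cells[oc])
                        if (i < j) {                 // i < j counts each pair exactly once
                            double ddx = pos[i].x - pos[j].x;
                            double ddy = pos[i].y - pos[j].y;
                            double ddz = pos[i].z - pos[j].z;
                            // minimum-image wrapping omitted for brevity
                            if (ddx*ddx + ddy*ddy + ddz*ddz < rc2) pairs.emplace_back(i, j);
                        }
              }
    return pairs;
}

A neighbor list can be built from these pairs and, for slowly changing systems, reused over several time steps, as discussed above.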

Table 1. Modules and corresponding algorithms for MD-style simulations (engineering MD: traditional MD, MD, PPM, SPH and DEM).

General modules (shared by all methods): (1) Neighbor search (neighbor/cell-list); (2) Interaction processing (short-range & non-bonded); (3) Configuration update (position, velocity etc.).

Particular modules (a):
- Traditional MD: –
- MD: temperature control, pressure control, reaction, many-body potential, long-range interaction, constraints
- PPM: –
- SPH: weighting function
- DEM: hysteresis effect, friction

Applications:
- Traditional MD: (not provided in this review)
- MD: coal pyrolysis, polyethylene crystallization, whole H1N1 virion, silicon surface reconstruction
- PPM: single phase flow
- SPH: liquid–liquid flow
- DEM: granular material

Note: "Traditional MD" refers to the models describing mono-spherical particles with symmetric short-range potentials (Alder and Wainwright, 1957; Rapaport, 2006), which are the basis of the other MD-style methods. "MD" refers exclusively to the models describing more complex molecular systems, such as polymers, surfactants and proteins, which have bonded and/or long-range interactions, or silicon and metal systems, which are modeled by many-body potentials.

(a) The algorithms in the Particular modules should be modified according to the algorithms used in the General modules.


In the Configuration update module, the state variables, such as the position, velocity and, if needed, rotation, are integrated over time. Similar to Interaction processing, these operations are also independent of each other, which guarantees fine-grain parallelism with one thread and/or instruction dealing with one particle (a minimal sketch is given at the end of this section). However, different from the Interaction processing module, the memory access operations (load/store) are usually more time-consuming than the arithmetic operations, making the speedup bounded by the memory bandwidth rather than the processor performance.

2.2. Particularity

In the Particular modules, specific algorithms are provided for each method. In MD simulations of complex molecular systems, such as polymers and proteins, bonded and/or long-range interactions are needed, while in the simulation of solid materials, such as silicon and metals, many-body interactions are required. To mimic the experimental conditions, thermostats (e.g., Nosé, 1984; Hoover, 1985; Berendsen, 1991; Bussi et al., 2007) and pressure controllers (e.g., Parrinello and Rahman, 1981; Berendsen et al., 1984) are needed. In addition, constraint algorithms such as SETTLE (Miyamoto and Kollman, 1992) for rigid water, and SHAKE (Ryckaert et al., 1977) and LINCS (Hess et al., 1997; Hess, 2007) for large molecules, should be adopted. In DEM simulations, when only contact forces are considered, the actually interacting pairs are fewer than those in MD simulations. However, various types of interactions, such as elasticity, friction and plastic deformation, may be present during the contact period (Cundall and Strack, 1979; Zhou et al., 1999, 2002; Zhu et al., 2007). Two types of friction models, the normal and the regularized Coulomb friction models (Han et al., 2000a, 2000b), are usually used. The latter introduces a tangential restoring force which depends on the cumulative relative displacement at the point of contact (history-dependent) to model the elastoplastic effect of friction (Vu-Quoc and Zhang, 1999a, 1999b; Zhang and Vu-Quoc, 2000). It has been shown that the history-dependent force plays an important role in correctly predicting the properties of granular systems (Rapaport, 2007; Zhu et al., 2007).

With regard to the hardware design, the traditional CPU-based architecture has previously been used indiscriminately for all modules. However, with the emergence of many-core processors, such as general-purpose graphics processing units (GPUs), and dedicated programming environments, such as the compute unified device architecture (CUDA) (NVIDIA, 2006), the fine-grain parallelism of MD algorithms, especially that in the Interaction processing and Configuration update modules, can be better exploited. We therefore also developed a customized supercomputer, the Mole-8.5 system (Wang et al., 2010a; Ge et al., 2011; Li et al., 2013; Wang and Ge, 2013). Mole-8.5 has a multi-layer architecture in which the top and middle layers are equipped with CPUs to execute relatively complex instructions and organize the communication between the nodes, while the middle and bottom layers are equipped with GPUs to process the dynamic evolution of the simulated system in detail, as shown in Fig. 3. Compared with other CPU–GPU hybrid supercomputers in the world, the Mole-8.5 system has the highest ratio of GPUs to CPUs (3:1). This ratio is not optimal for Linpack performance, but it is for MD-style simulations.
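As promised above, here is a minimal sketch of a Configuration update step in leapfrog form; each particle is updated independently, so mapping one GPU thread (or one SIMD lane) to one particle is natural. The data layout and names are illustrative and are not taken from any specific code discussed in this review.

#include <vector>

struct State { double x[3], v[3], f[3]; double mass; };

// Leapfrog ("kick-drift") update applied independently to every particle.
void configuration_update(std::vector<State>& p, double dt) {
    for (auto& s : p) {                       // fully independent: fine-grain parallel
        for (int d = 0; d < 3; ++d) {
            s.v[d] += dt * s.f[d] / s.mass;   // kick: v(t + dt/2)
            s.x[d] += dt * s.v[d];            // drift: x(t + dt)
        }
    }
    // Only a few arithmetic operations per particle, but every field is loaded and
    // stored once, so throughput is bounded by memory bandwidth rather than FLOP rate.
}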

3. Implementation of engineering MD

With the strategy for engineering MD simulation developed above, simulations of a variety of systems, from atoms to apparatus, become feasible. In this section, some of our previous

Fig. 3. Multi-layer architecture of Mole-8.5 and the computation on different layers.

work is reviewed from the two aspects discussed in Sections 3.1 and 3.2.

3.1. Engineering MD simulation in the EMMS paradigm

MD simulation is engineered to extend its computational capability by keeping the structural consistency between the system, model, software and hardware, following the EMMS paradigm, with the algorithms of fine-grain parallelism accelerated on GPUs and other tasks, such as communication, executed on CPUs. This CPU–GPU hybrid acceleration strategy has enabled studies of engineering problems such as coal pyrolysis, structural changes in complex macromolecular systems, and large-scale simulation of solid materials.

3.1.1. Simulation of coal pyrolysis with ReaxFF MD using CPU+GPU

Thermal decomposition of coal is a complex process in which coal is transformed at elevated temperature to produce gases, tar and char (Solomon et al., 1988). Coal pyrolysis, which accounts for up to 70% of the weight loss during coal transformation (Solomon et al., 1992), is the initial and fundamental step of this transformation and the one most dependent on the coal properties (Solomon and Hamblen, 1983); it therefore plays an important role in coal conversion processes such as hydrogenation, gasification and combustion. However, even with state-of-the-art experimental approaches, it is very difficult to uncover the mechanism of coal pyrolysis because of the heterogeneous nature of coal and the numerous reaction pathways involved (Solomon et al., 1993). Only recently has the strategy of constructing more realistic representations of coal shed new light on the details of coal properties at the atomic scale (Mathews et al., 2011). As a bridge between quantum mechanics (QM) (Friesner, 2005) and MD (Klein and Shinoda, 2008), the approach that combines MD with the empirical reactive force field (ReaxFF) developed by van Duin et al. (2001) and Russo and van Duin (2011) allows reactive molecular dynamics (ReaxFF MD) simulations of large and complex coal pyrolysis systems involving chemical reactions. ReaxFF employs bond-order-dependent potentials that require no predefined reactive sites or reaction pathways during the simulation. This is probably its most favorable feature for investigating the complicated chemical reactions in coal thermolysis, because a priori knowledge of the multiple reaction pathways is not available. However, such potentials certainly cost extra computation, due to the detailed description of and corrections to the bond-order-based energy terms, and the dynamic charge equilibration at each time step, which is essentially an expensive optimization process. Following the strategy described in Section 2, the ReaxFF


algorithms are analyzed further. With the additivity and locality of the atomistic interactions involved, the computationally intensive and memory-demanding features introduced by the bond order dependency and other complex calculations in ReaxFF can be tackled using GPUs, with their fine-grain parallelism and superior memory bandwidth. Therefore, GMD-Reax (Zheng et al., 2013a), the first ReaxFF code using GPU–CPU hybrid computing, was developed, which significantly improves the performance compared with CPU implementations. Running on one Nvidia C2050 GPU, GMD-Reax achieves a 4.5–12 fold speed-up over van Duin's FORTRAN codes in LAMMPS on a CentOS 5.4 server with 8 CPU cores of Intel Xeon E5620 2.4 GHz, and a 2.5–6 fold speed-up over the PuReMD-based C codes of LAMMPS, in the single-precision version when benchmarked with pyrolysis simulations on coal models containing 1378 to 27,283 atoms. In addition, VARxMD (visualization and analysis of reactive molecular dynamics) (Zheng et al., 2013b, 2014; Li et al., 2014b), the first software for reaction analysis of ReaxFF MD simulations, was created and applied to facilitate analysis of the reaction mechanisms in large coal pyrolysis systems, which is beyond the capability of existing tools and manual analysis.

Using GMD-Reax and VARxMD, the Liulin bituminous coal model containing 28,351 atoms (see Fig. 4), the second largest coal model ever used in ReaxFF MD, is simulated to explore the pyrolysis reaction mechanism (Zheng et al., 2014). The temperature effects (1000–2600 K) on the product profiles and the initial chemical reactions are investigated using canonical (NVT) ensembles for 250 ps with a time step of 0.25 fs. It takes roughly 5 days to simulate one case at a constant temperature on a CentOS 5.4 server with an Intel Xeon E5620 2.4 GHz and one C2050 GPU card attached. In comparison, the simulation of Castro-Marcano et al. (2012) took about 4–6 weeks for a coal char combustion system with 35,458 atoms using the ReaxFF program implemented in the ADF software on 4 Intel Xeon X5560 2.8 GHz quad-core CPUs (16 cores) at the Lion-XH HPCS. Using VARxMD, the analysis of the trajectories indicates that the evolution tendencies of the product distribution with time and temperature are broadly consistent with the experimental observations in the literature (Solomon and Hamblen, 1983; Solomon et al., 1988, 1992, 1993). As an example, the weight percentages of the three main classes of products in coal pyrolysis (char, tar and gases) obtained within 250 ps at a series of temperatures are displayed in Fig. 5. The macromolecule network of Liulin coal decomposes quickly to produce smaller fragments, resulting in the phenomenon that char production reaches its minimum at 2200 K,

Fig. 4. Liulin coal model containing 28,351 atoms with formula C14,782H12,702N140O690S37 (Zheng et al., 2014).

and numerous secondary reactions occur when the temperature is further increased. In addition, the evolution trends of other specific fragments (phenols, naphthalenes, CO2, CO, CH4, etc.) with time and temperature are also obtained. Apart from the product profiles, details of the complex chemical reactions in the early stage of Liulin coal thermolysis, which are hardly accessible with experimental techniques, are revealed with the aid of VARxMD. In particular, the decomposition pathways of the aromatic compounds (see Fig. 6) can be directly observed from the reaction lists and their 2D chemical structure diagrams generated by VARxMD. The 6-membered rings first transform to 5- or 7-membered rings and then decompose into other smaller fragments. Through this work, a methodology combining high-performance GPU–CPU hybrid computing with cheminformatics analysis in ReaxFF MD has provided additional insight into the complex reaction mechanisms of coal pyrolysis at the atomic level.

3.1.2. Simulation of macromolecular systems with CPU+GPU

At this scale, MD simulation can be used as a powerful "computational microscope" to optimize structures from the raw data obtained in experiments (Zhao et al., 2013) or to probe dynamic structural changes (Dill and MacCallum, 2012) in many complex macromolecular systems, especially when the spatio-temporal resolution of experimental facilities is insufficient. On the other hand, such simulations may involve millions or even billions of particles, resulting in huge computational cost. The interacting particles in macromolecular systems are the atoms, which are processed by the three general modules in the framework of Fig. 2. In addition, the bonded interactions between atoms can be described by the Particular modules. Unlike the non-bonded short-range interactions, bonded interactions have strong correlations among different particles. In the parallel algorithm, the bonded information in each domain needs to be reconstructed due to the movement of atoms among different domains. As illustrated in Fig. 7, two methods have been developed, termed SET and SEARCH (Xu, 2012). SET (Fig. 7a) generates the local bonded interaction lists directly from a mapping between the global atom indices and the locations of the atoms in memory (termed atom memory positions, AMPs), using an additional mapping array. The local bonded interaction lists are then filled with AMPs instead of the global atom indices. This method is applied to relatively small systems, because the mapping array is of "int" type with a length equal to the total number of atoms (N). If N = 10^7, 38.1 MB of memory is needed, which is acceptable; if N = 3.0 × 10^8, 1.14 GB is needed, which is not. Thus, the SEARCH method is developed for very large systems (Fig. 7b).

Fig. 5. Weight percentage of product compounds obtained from 250 ps ReaxFF MD simulation on Liulin coal pyrolysis using the canonical (NVT) ensemble at 1000–2600 K (Zheng et al., 2014).


Fig. 6. Decomposition pathway of aromatic rings directly observed in ReaxFF MD simulations with GMD-Reax and VARxMD (Zheng et al., 2014).

Fig. 7. Algorithms for local bonded interaction list generation: (a) SET and (b) SEARCH.

In SEARCH, the bonded interactions of the atoms residing in the real space (black atoms) are generated by searching the neighboring atoms, including the atoms in the ghost space (gray atoms). For example, the local bonded list of atom i is generated by searching its neighboring atoms within the radius Rc.
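For illustration, the following sketch shows the essence of the SET bookkeeping (Fig. 7a) under the memory assumption stated above; the structure and function names are hypothetical and only indicate how a global-to-AMP mapping array could be used to fill the local bond lists.

#include <vector>

struct Bond { int i, j; };   // a bonded pair, stored with global atom indices

// map_global_to_amp[g] holds the atom memory position (AMP) of global atom g on
// this process, or -1 if that atom is neither owned nor in the ghost layer.
// For N = 1e7 atoms this int array takes roughly 38 MB; for N = 3e8 it exceeds 1 GB,
// which is why SEARCH avoids the array for very large systems.
std::vector<Bond> build_local_bond_list(const std::vector<Bond>& global_bonds,
                                        const std::vector<int>& map_global_to_amp) {
    std::vector<Bond> local;
    for (const Bond& b : global_bonds) {
        int ai = map_global_to_amp[b.i];
        int aj = map_global_to_amp[b.j];
        if (ai >= 0 && aj >= 0)            // both ends resident locally (real or ghost)
            local.push_back({ai, aj});     // list filled with AMPs, not global indices
    }
    return local;
}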

Though the time needed by SEARCH is longer than that of SET, it saves the additional mapping array (Xu, 2012).

In addition, some other algorithms in the Particular modules are needed. To mimic the experimental conditions, thermostat (Nosé, 1984; Hoover, 1985) and pressure control (Berendsen et al., 1984) algorithms have been developed. The reaction field algorithm is introduced to deal with the long-range electrostatic interactions (Tironi et al., 1995). Integrating the three general modules for short-range interactions and the algorithms in the Particular modules, the computational capability of MD simulation is significantly expanded, making it possible to probe the dynamic structure of a complex macromolecular system.

Simulation of polymer entanglement and crystallization may serve as an example. Simulating the crystallization of 1200 polyethylene (PE) molecules, with only one GPU and one CPU core for management (Xu et al., 2010), is now possible. Compared to the work of Yu et al. (2008), the simulated scale is 8 times larger. Each PE molecule consists of 150 monomers and is modeled with two types of united atoms, 298 CH2 and 2 CH3 groups, respectively (Fig. 8a). The Dreiding II force field (Mayo et al., 1990) is adopted. The system studied contains 360,000 united atoms in total. The entanglement process and the subsequent crystallization process are captured. Fig. 8 presents a snapshot of the interpenetration and crystallization, and a microscopic view of a crystal domain. In this work, the simulation is about 39 times faster than the CPU-based serial code, and more crystallization domains with different orientations, linked by amorphous domains, are observed as compared to the small systems (Yu et al., 2008).

Another example is from biomolecular systems. For a long time, MD simulation at this scale usually focused on individual biological molecules, but it is now possible to tackle macro-biomolecular


Fig. 8. Interpenetration and crystallization of 1200 PE molecules: (a) one PE molecule with the head and tail particle in purple and blue respectively; (b) snapshot of the interpenetration after 30 ns under 1000 K; (c) snapshot of the crystallization after 50 ns under 600 K and (d) microscopic view of a crystal domain. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) Modified from Xu et al. (2010).

Fig. 9. Main procedure for probing the dynamic structures of a H1N1 virion at atomic scale. To view the interior, the pictured dynamic structure only shows 1/8 of the whole H1N1 virion. Modified from Xu et al. (2011b).

complexes or even biological agents with the strategy described in Section 2. The simulation of a whole influenza virion (H1N1) at the atomic scale has been carried out, following the main procedure depicted in Fig. 9 (Xu et al., 2011b). Based on the data from electron microscopy and the theoretical model of H1N1 (Lamb and Krug, 2001), the sequences and structures of all the protein and RNA components are collected and constructed on the computer; the model contains 2240 proteins, 63,471 DPPC (dipalmitoylphosphatidylcholine) molecules and 8 RNA strands. The whole H1N1 is solvated in SPC water with an appropriate concentration of ions, resulting in about

300 million atoms (Xu et al., 2011b). With 1728 C2050 GPUs of the 288 low-level hybrid computing nodes of the Mole-8.5 system as the main computational hardware, and the same number of CPUs to control data communication among the GPUs, the dynamics of the structure is reproduced at 770 ps/day with an integration time step of 1 fs, that is, about 770,000 MD time steps per day. The analysis of the dynamic structure shows that the virion undergoes a significant change before reaching a stable structure. The virion sizes along the x/y/z directions are also analyzed, all of which are calculated from the lipid membrane and trans-membrane


proteins, showing similar trends. The virion shrinks sharply during the first 0.1 ns, suggesting a quick adjustment of the unfavorable structures given by the static model. Then it expands a little in the next 0.05 ns, and remains comparatively stable thereafter. The hardware and software used in this work serve as a virtual lab to keep the virion alive. This endeavor may help us to investigate the biological behaviors of the influenza virion, find its weakness, and thus provide valuable knowledge to prevent future influenza epidemics.

3.1.3. Simulation of covalent solid materials with many-body MD using CPU+GPU

Although the simulation of solid materials is also at the atomic scale, the covalent interactions in these materials require algorithms for many-body potentials, such as the Tersoff potentials (Tersoff, 1988a, 1988b), which are different from the bonded interactions in the simulation of macromolecular systems. In this case, some other features are commonly encountered. Dynamic deformation occurs frequently in zones containing surfaces and defects such as grain boundaries and dopants, resulting in frequent changes of bonded neighbors, while the particles in the bulk zones only vibrate slightly around their equilibrium positions, with nearly fixed bonded neighbors (Fig. 10). Taking these features into account, and to make full use of multi-scale computing systems such as Mole-8.5, a CPU–GPU hybrid algorithm is developed. Due to the complexity in memory access and instruction processing, the dynamic deformation zones with less computational workload are assigned to CPUs, and the algorithms for non-bonded interactions are adopted, including neighbor search (in

Fig. 10. EMMS paradigm for the simulations of solid materials. Modified from Hou et al. (2013b).


Table 1). On the other hand, the zones with regular crystalline structure and intensive computational workload are assigned to GPUs. Furthermore, to improve the cache hit rate and the computational efficiency, a supercell storage algorithm has been developed, where sorted internal particles are arranged in successive memory locations (Hou et al., 2013a). Moreover, the parallelization of the implementation is straightforward, with space decomposition and shift-style message passing interface (MPI) communications. In this way, the spatial dimension of the simulations on crystalline silicon has been elevated significantly with unprecedented performance. Weak scaling is ideal up to the full system scale on both Mole-8.5 and Tianhe-1A (Top500, 2010; Hou et al., 2013b), and strong scaling is also good as the number of atoms per node decreases. A sustained performance of 1.87 petaflops in single precision is achieved in the simulation of a bulk silicon crystal about 1.56 mm in length containing 110.1 billion atoms in total. This was, to our knowledge, the highest performance of an MD simulation at that time (early 2011; Hou et al., 2013b); perhaps the nearest work, on Roadrunner (Germann et al., 2009), reached 369 teraflops in double precision. The final sustained performance of the whole simulation coupling all CPUs and GPUs is 1.17 petaflops in single precision plus 92.1 teraflops in double precision.

The current CPU–GPU hybrid implementation has been applied successfully in materials science and engineering, for example in studies of surface reconstruction, polycrystalline silicon and silicon nanowires. Fig. 11 shows the simulation results for the surface reconstruction of crystalline silicon with 111.2 billion atoms (Hou et al., 2013b), where only the results from one node are displayed. The captured snapshots clearly


Fig. 11. Large-scale simulation of the 2×1 reconstruction on the Si(010) surface.

demonstrate the 2×1 reconstruction pattern with dimerization on the flat Si(010) surface, which is the well-known phenomenon for this silicon surface and is comparable to most reports in the literature (Buehler and Boland, 2000; Ohira et al., 2000; Jung et al., 2001; Cicero et al., 2004). The scalability provided by this implementation is extremely valuable for avoiding finite-size effects in non-equilibrium MD simulations, and for predicting the intrinsic thermal properties of silicon nanowires and of semiconductors with low doping concentrations (Hou et al., submitted for publication).

3.2. Engineering MD simulation by extending its element model

To deal with macro-scale engineering problems, including single and multiphase flows and granular flow, besides the CPU–GPU hybrid acceleration, the spatio-temporal scale of the simulations is elevated further by modeling higher-level systems with discrete particles.

3.2.1. Simulation of fluid flow with PPM

The Navier–Stokes equations are widely believed to be sufficient to describe macro-scale hydrodynamic behavior, including the long-lasting mystery of turbulence. However, it is still tempting to understand the deeper mechanisms behind it from the bottom layer of atoms and molecules. After all, the Navier–Stokes equations are based on the statistical mechanics of fluids and the continuum assumption, while MD is directly based on Newton's laws, the most explicit and reliable description within the realm of classical mechanics. MD has successfully reproduced macroscopic hydrodynamic behaviors such as flow past a single particle (Rapaport, 1987). However, for flow at larger scales, such as turbulence, its high computational cost is still a practical barrier. In traditional MD, most of the computation is spent on the elaborate description of molecular interactions. It is interesting to note that these MD models are already drastically simplified representations of real molecules and their interactions, yet this does not prevent them from expressing reasonable macro-scale flow behavior. We may therefore ask whether further simplifications can be made to lower the computational cost. This inspires us to take a swarm of molecules or atoms as the most fundamental element in fluid flows, while still retaining its discrete molecular nature. That is the so-called pseudo-particle modeling (PPM) (Ge and Li, 1996, 2003). PPM discretizes the fluid into a large number of smooth and rigid pseudo-particles (PPs), which are far smaller than real macroscopic particles but larger than molecules and atoms. Each pseudo-particle has four properties: mass, radius,

position and velocity. All PPs move synchronously in uniform time steps. Generally speaking, for the problems we are interested in, PPM omits the microscopic details of the fluid that have no essential contribution to its macroscopic properties. With PPM, it is now possible to reproduce the first stage of the transition from laminar to turbulent flow from the molecular scale with affordable computation (Ge and Li, 2003). In the following case, the fluid is driven by a constant axial body force (g) in a two-dimensional duct with an axially periodic boundary condition, forming the so-called plane Poiseuille flow. The setup of the simulation is illustrated in Fig. 12 and the major dimensionless parameters are listed in Table 2, taking the PP radius and mass and the time step as the units of length, mass and time, respectively. In Table 2, ν0 is the thermal velocity, r is the particle radius, N is the number of particles, and l is the pitch between PPs in the initial square array.

Fig. 13 presents the evolution of the flow field in this simulation. At the beginning, the flow is purely laminar with axial velocity only; it then becomes unstable and transverse velocities develop spontaneously and gradually, presenting the Tollmien–Schlichting (T–S) waves (Grossmann, 2000). This is comparable to the T–S waves agitated by artificial disturbances in CFD simulations (Hwang and Wu, 1998), but here they are amplified directly from the molecular thermal fluctuations, not a numerical result at the continuum level. The resulting Reynolds number Re_h = V_h W/(2ν) is estimated at 1400, where ν is the kinematic viscosity. It seems that at this very limited scale, the transition starts at an earlier stage than normal (Re ~ 2300). However, further development to more complicated flow structures is still impossible in this Re range, as can be seen in the later period of the evolution: the T–S waves are distorted but no new structure can be found. Fortunately, the parallel implementation of PPM has shown good scalability and speed-up (Lu et al., 2009), so more interesting scenarios of the laminar–turbulent transition can be expected in PPM simulations at larger scales.

For this purpose, 3D channel flow is then studied. As shown in Fig. 14, the simulation setup is similar to the 2D case. In total, 2.8 billion PPs are enclosed in a rectangular box of size H by W by D. Periodic boundary conditions are applied in the X and Z directions, with no-slip boundary conditions in the Y direction. A body force of intensity g is exerted in the X direction. The main parameters are listed in Table 3. At time t = 0 the fluid is at rest; the body force then takes effect, representing a pressure gradient. The velocity profile of the resulting unsteady flow is described by the analytical solution (Bird et al., 1960):

\frac{u(\xi,\tau)}{u_{\max}} = (1 - \xi^{2}) - 8\sum_{n=1}^{\infty}\frac{J_{0}(\alpha_{n}\xi)}{\alpha_{n}^{3}J_{1}(\alpha_{n})}\exp(-\alpha_{n}^{2}\tau) \qquad (1)


where ξ = |W − 2y|/W, τ = 4νt/W², u_max is the maximum velocity, J_n is the nth-order Bessel function of the first kind, and the α_n are the roots of the equation J_0(α_n) = 0. Fig. 15 presents the detailed flow field in the 3D case. Initially, the fluid velocity in both the core and the near-wall region is small. As time goes on, the fluid velocity in the core region becomes higher and higher, but it still decreases to zero towards the walls, meaning that the no-slip boundary condition is successfully maintained. In Fig. 16, the domain-averaged velocity profiles from the


simulation at different sampling instants are compared with the analytical solution of Eq. (1). One can observe that the PPM results agree well with the analytical solutions. The simulation above is carried out on the Mole-8.5 system using 160 CPUs with domain-decomposed parallelization, and it runs at 0.48 ns (2400 time steps) per day. Due to the time limit, the flow has not yet fully developed, and whether it will further destabilize to produce T–S waves is yet to be explored. However, it can be anticipated that such PPM-based simulations can reveal more details than continuum-based methods on the microscopic origin of macro-scale flow, including turbulence. Studies in this direction are still underway.
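For reference, the truncated series of Eq. (1) can be evaluated in a few lines of code for comparison against simulated profiles such as those in Fig. 16. This sketch uses the first five roots of J0 and the C++17 function std::cyl_bessel_j; more terms would be needed for very small τ, and it is an illustration rather than the authors' post-processing code.

#include <cmath>
#include <cstdio>

// u(xi, tau)/u_max from Eq. (1), truncated after the first five roots of J0.
double u_over_umax(double xi, double tau) {
    static const double alpha[] = {2.404826, 5.520078, 8.653728, 11.791534, 14.930918};
    double u = 1.0 - xi * xi;                         // steady-state parabolic profile
    for (double an : alpha)
        u -= 8.0 * std::cyl_bessel_j(0.0, an * xi)
                 / (an * an * an * std::cyl_bessel_j(1.0, an))
                 * std::exp(-an * an * tau);
    return u;
}

int main() {
    // Print the start-up profile at a few dimensionless times tau = 4*nu*t/W^2.
    for (double tau : {0.05, 0.2, 1.0})
        for (double xi = 0.0; xi <= 1.0; xi += 0.25)
            std::printf("tau=%.2f xi=%.2f u/umax=%.4f\n", tau, xi, u_over_umax(xi, tau));
    return 0;
}

Note that at the wall (ξ = 1) every series term vanishes because J_0(α_n) = 0, so the no-slip condition is recovered exactly.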

Fig. 12. Schematic diagram for 2D simulation of plane Poiseuille flow with PPM.

Table 2. Parameters for the simulation of plane channel flow.

r      l      H       W      N          ν0      g
1.0    3.0    4800    600    640,000    0.08    2.0 × 10⁻⁶

Fig. 14. Schematic diagram for 3D simulation of micro-channel flow with PPM.

Fig. 13. Evolution of plane Poiseuille flow simulated in PPM.


Table 3. Simulation parameters for the 3D micro-channel.

                 r       H          W       D        ν0           ν                       g                    Δt
Dimensionless    1.0     5400       1500    900      0.08         0.0893                  5.0 × 10⁻⁷           1
Dimensional      2 nm    10.8 μm    3 μm    1.8 μm   800 m s⁻¹    1.79 × 10⁻⁶ m² s⁻¹      2.5 × 10¹⁰ m s⁻²     0.2 ps

Fig. 15. Detailed flow field in 3D micro-channel flow: (a) 100 ps, (b) 500 ps, (c) 1.3 ns, (d) 3 ns, (e) 4.8 ns and (f) 5.3 ns.

3.2.2. Simulation of liquid–liquid flows with SPH

In Section 3.2.1, we used a discrete particle method for the simulation of fluid flows where their discrete origin or properties are of interest. In this section, we demonstrate that even if we are only interested in the macro-scale behavior of the flows, a discrete particle method can still serve as an advantageous approach in some cases. In contrast to PPM, which is based on simplifications of the microscopic interactions, the SPH method depicts the inter-particle forces directly using the discretization of the macroscopic Navier–Stokes equations. Multiphase flows of immiscible fluids exist in a wide range of engineering problems, including oil recovery, packed-bed chemical reactors, groundwater contamination, and so forth. For the simulation of liquid–liquid flows, a suitable description of the topological transitions of the interface is crucial for accurate results. There already exist a number of numerical algorithms based on Eulerian


grids, such as the volume of fluid (VOF) (Hirt and Nichols, 1981) and level set (Sussman et al., 1994) methods. However, due to problems of mesh adaptability and connectivity, these techniques generally encounter difficulties in coping with the complicated coalescence and fragmentation of dynamic interfaces. In contrast, SPH, as a purely meshless Lagrangian approach, has remarkable advantages in this regard. Although initially proposed for astrophysical applications (Gingold and Monaghan, 1977; Lucy, 1977), it has been successfully applied in various areas of fluid dynamics as well as solid mechanics (Monaghan, 2005; Liu and Liu, 2010). In the SPH methodology, the continuous medium is decomposed into a collection of discrete particles associated with their own physical properties, which constantly update their densities and velocities by explicit integration of the continuity and momentum equations. Consequently, the interfacial shapes can be readily identified and tracked based on the natural distribution of the particles in each phase. In addition, the SPH method can also be implemented under the strategy of engineering MD with modifications of the three general modules, particularly the Interaction processing module.

As illustrated in Fig. 17 (Zhou et al., 2010), the immiscible displacement of oil by water in a simple structure is investigated using SPH and so-called pseudo-periodic boundary conditions (inflow and outflow), where water and oil are denoted by blue and red colors, respectively. The revised model of surface tension proposed by Zhou et al. (2008, 2010, 2013) has been employed,

Fig. 16. Velocity distributions of channel flow at different sampling times (symbols represent PPM results and solid lines represent analytical solutions).


where a special repulsive force is exerted between adjacent particles of different phases. In contrast to some other surface tension models, this model has relatively simple formulations and is computationally more efficient. The rigid wall in Fig. 17 is neutral to both water and oil, while the inflow velocity is 5 × 10⁻⁴ m s⁻¹ and the width of the vertical fracture is 1.5 mm. The water injected from the left inlet sinks to the bottom of the vertical fracture in the form of droplets, resulting in the gradual elevation of the interface. A critical width of the vertical fracture is found, beyond which the oil can be displaced out of the cavity. Besides, a lower inflow velocity, a lower oil density and a water-wet wall are found to favor a smaller critical width. Moreover, the selective withdrawal of oil and water from a microcavity has also been simulated (Zhou et al., 2013), which is relevant to technological applications of microfluidic systems. As presented in Fig. 18, the initial interface between the upper and lower phases is flat, while the inner fluid is withdrawn at a prescribed flow rate V0. A critical withdrawal rate corresponding to each initial interface height is identified, beyond which the lower phase is entrained in a thin spout together with the upper phase. In addition, a smaller dynamic viscosity and density of the upper phase, as well as a larger surface tension, lead to a longer threshold time (the arrival of the lower phase at the outlet), which is in favor of the withdrawal of the upper phase in terms of both quantity and efficiency.

With regard to the large network of cavities and fractures shown in Fig. 19 (Wang et al., 2010b; Li, 2013), the cost of direct simulation with a uniform distribution of particles in both the cavities and the fractures is huge, due to the large size ratio of the cavities to the fractures. A multi-scale simulation strategy based on the EMMS paradigm (Table 4) is promising for this kind of problem, as it can effectively balance efficiency and precision. Some efforts have already been made in this direction, where the cavities are simulated with the SPH method while the fractures are simply depicted by a network model based on pressure balance (Li, 2013). The simulation of each cavity is implemented on one GPU device, but the cavities cannot directly exchange data with each other. The species and quantity of the outflow fluids from the cavities are transferred to the network model, which then sends back the species and quantity of the inflow fluids for each cavity. The information exchange between cavities and fractures is carried out through network communication or file access. Fig. 19 is a typical snapshot of the displacement process, where there are 40 cavities connected with fractures in various ways; the cavities also have different wettabilities, involving neutral, water-wet or oil-wet walls.

Fig. 17. Displacement process of oil by water with neutral wall. The oil and water are marked with red and blue colors, respectively. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) (a) 20 s, (b) 40 s, (c) 60 s and (d) 88 s. Adapted from Zhou et al. (2010).


Fig. 18. Sketch of the microcavity with a top outlet (a) and the interface evolution with an initial height of 7.68 mm and a withdrawal rate of 0.2 m s⁻¹ (b). Adapted from Zhou et al. (2013).

Fig. 19. Multi-scale simulation of the oil displacement process in a porous network.

Table 4. Consistency between the four aspects in the multi-scale simulation of the displacement process.

Problem      Model      Algorithm                              Hardware
Cavities     SPH        Intensive computation, low precision   Many-core (GPU), single precision
Fractures    Network    Less computation, high precision       Multi-core (CPU), double precision

The water is injected from the left inlet and gradually displaces the initial oil inside the cavities and fractures toward the outlet on the right-hand side. With such a multi-scale approach, larger and more complex networks of cavities can be simulated with very good scalability. A more detailed description will be available in our future publication (Wang et al., 2014).
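To make the particle-based discretization used above concrete, the following minimal sketch shows the standard SPH summation density with a cubic-spline weighting function. The kernel normalization is for 2D, and the data layout and names are illustrative rather than the authors' actual code.

#include <vector>
#include <cmath>

struct SphParticle { double x, y, mass, rho; };

// Standard cubic-spline kernel in 2D with support radius 2h.
double W_cubic_spline(double r, double h) {
    const double pi = 3.14159265358979323846;
    double q = r / h, sigma = 10.0 / (7.0 * pi * h * h);
    if (q < 1.0)      return sigma * (1.0 - 1.5 * q * q + 0.75 * q * q * q);
    else if (q < 2.0) return sigma * 0.25 * (2.0 - q) * (2.0 - q) * (2.0 - q);
    return 0.0;
}

// Summation density: rho_i = sum_j m_j W(|r_i - r_j|, h). In practice the inner loop
// runs only over neighbors supplied by the cell/neighbor list of Section 2.1.
void compute_density(std::vector<SphParticle>& p, double h) {
    for (auto& pi : p) {
        pi.rho = 0.0;
        for (const auto& pj : p) {
            double dx = pi.x - pj.x, dy = pi.y - pj.y;
            pi.rho += pj.mass * W_cubic_spline(std::sqrt(dx * dx + dy * dy), h);
        }
    }
}

The momentum equation, viscous terms and the inter-phase repulsive force for surface tension would be added in the Interaction processing module in the same pairwise fashion.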

3.2.3. Simulation of granular flows with DEM

Granular flows are common in nature and in various engineering practices, and usually exhibit complicated rheological phenomena different from those of conventional gases, liquids and solids (Aranson and Tsimring, 2006; Campbell, 2006). DEM simulation (Cundall and Strack, 1979) is advantageous in terms of flow details as compared to other methods. The elements in this method are the macro-scale particles, and suitable contact models depict their interactions. These interactions are dissipative, due to the inelastic collisions as well as the presence of sliding and rolling friction. There already exist many versions of DEM, including the traditional linear model and various non-linear models (Zhu et al., 2007). The contact models of DEM are generally rather complicated, and a huge number of particles are involved in industrial-scale problems, both of which result in considerable computational cost.

Fortunately, the computational speed and efficiency of DEM can also be significantly enhanced with our strategy. For a sophisticated DEM model which contains history-dependent forces, the neighbor-list algorithm should be modified to include the contact history list. For relatively simple DEM models without history-dependent forces, the neighbor-list approach of traditional MD can

be applied directly. What distinguishes the neighbor list used in DEM from that in MD is that the number of neighboring particles is usually smaller, due to the contact nature of DEM interactions. Moreover, the Configuration update module needs to incorporate the rotational movement of the macroscopic particles. All these modifications can be readily realized in the three general modules. With the CPUs taking on the computationally less intensive tasks, such as parallel communication and particle insertion and deletion, and the GPUs focusing on the main computational task of interaction processing, their operations can be overlapped and the overall performance can be boosted further. It should be noted that in this review only the contact forces in DEM are considered, although non-contact forces, such as van der Waals and electrostatic forces, can also be readily incorporated into DEM.

Rotating drums are ubiquitous in a wide range of industries, including mixers, reactors, dryers and kilns. As illustrated in Fig. 20 (Xu et al., 2011a), an industrial-scale rotating drum (rotating in the clockwise direction) is simulated, in which almost 10 million particles with a diameter of 0.01 m are involved. The length and diameter of the drum are 13.5 m and 1.5 m, respectively. The rigid wall and the eight blades are composed of "frozen" particles, which have the same particle sizes and material properties as the interior movable particles. Besides, a simplified contact model is used to reduce the computational cost. The simulation is carried out using 46 nodes (including 270 C2050 GPUs) of Mole-8.5 (Wang and Ge, 2013) with a time step of 1.0 × 10⁻⁴ s, leading to nearly 10% realtime speed. During the simulation, the results are processed at the middle layer (computing/displaying nodes) of Mole-8.5, and visualized in realtime on the LCD display wall. Fig. 20 gives a


photograph of the simulation result on the display wall, which presents just a segment of the rotating drum. It is found that the blades can effectively enhance the particle mixing in the rotating drum. Moreover, as illustrated in Fig. 21, the simulation of the interaction between two streams of granular flow, which has been studied previously (Ge et al., 2007), is further accelerated. Driven by body forces in different directions, the two streams of particles with identical size and material properties move head on

Fig. 20. Simulation of an industrial-scale rotating drum. Modified from Ge et al. (2011).


and then interact with each other. During the simulation, the particles are constantly inserted from the two top inlets, and immediately deleted when they leave the computational domain. The interaction of these two streams produces an oscillating structure in both space and time, which can be explained as a compromise between the two dominant mechanisms in the flows (Ge et al., 2007). Due to the efficient implementation on Mole-8.5 and the comparatively small number of particles, fully realtime simulation is achieved in this case. Moreover, with the combination of simulation, post-processing, display and control modules, a prototype of virtual process engineering (VPE) (Liu et al., 2012) is realized: the relevant parameters can be adjusted through the control module at runtime, including the flow directions, force intensities, and densities of the two streams. Following this example, other VPE applications could be implemented likewise.

The DEM simulation capabilities established so far are already of practical importance. For example, efficient mixing of granular materials is quite important for many processes in the chemical, mining and pharmaceutical industries, among others (Bridgwater, 2012). The information from DEM simulations is valuable for the optimal design of mixers as well as the choice of their operating conditions. As illustrated in Fig. 22 (Qi, 2014), an industrial screw conveyor is simulated. The influences of geometrical parameters and operating

Fig. 21. Snapshot from realtime simulation of two streams of granular flows: (a) simulation setup; (b) control panel; and (c) one of the results.


Fig. 22. Particle velocity distribution in a screw conveyor. Modified from Qi (2014).

Table 5
Parameters used in the screw conveyor simulations.

Conveyor variables               Base value     Explored range
Length (mm)                      1600           1600–2420
Outer diameter (mm)              300
Axis diameter (mm)               100
Pitch I (mm)                     200            200–350
Pitch II (mm)                    315
Inlet I diameter (mm)            230
Inlet II diameter (mm)           130
Inlet I center (mm)              115
Inlet II center (mm)             835            465–835
Feeding ratio of I–II            10
Outlet diameter (mm)             100
Screw speed (rpm)                12             12–32
Feeding rate (kg h⁻¹)            3176           1096–3176

Particle properties              Value
Density (kg m⁻³)                 1900
Diameter (mm)                    2.4
Young's modulus (N m⁻²)          5.0 × 10⁵
Sliding friction coefficient     0.3
Poisson's ratio                  0.5
Coefficient of restitution       0.8
Time step (s)                    5.0 × 10⁻⁶

The DEM simulation capabilities established so far are already of practical importance. For example, efficient mixing of granular materials is essential to many processes in the chemical, mining and pharmaceutical industries, among others (Bridgwater, 2012), and the information from DEM simulations is valuable for the optimal design of mixers as well as the choice of their operating conditions. As illustrated in Fig. 22 (Qi, 2014), an industrial screw conveyor is simulated, and the influences of geometrical parameters and operating conditions on mixing behavior are investigated, including the conveyor length, screw pitches, inlet positions, screw speed and feeding rate, as listed in Table 5. As depicted in Fig. 22, the longest conveyor contains about 4.5 million particles. It is found that the mixing quality is enhanced as the rotating speed and the length of the screw conveyor increase. In addition, the feeding rate has a remarkable influence on the mixing performance but a smaller impact on the residence time distribution.
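Mixing quality in such studies is usually quantified by a statistical index evaluated on spatial samples of the particle configuration. The Lacey index computed below is one common choice and is shown only as an illustration; the review does not state which measure was used for the screw conveyor.

import numpy as np

def lacey_index(pos, labels, box, n_cells=8):
    """Lacey mixing index: 0 = fully segregated, 1 = perfectly (randomly) mixed."""
    edges = [np.linspace(0.0, box[d], n_cells + 1) for d in range(3)]
    idx = np.stack([np.clip(np.digitize(pos[:, d], edges[d]) - 1, 0, n_cells - 1)
                    for d in range(3)], axis=1)
    flat = idx[:, 0] * n_cells**2 + idx[:, 1] * n_cells + idx[:, 2]
    p = np.mean(labels)                                   # overall fraction of tracer particles
    conc, counts = [], []
    for c in np.unique(flat):
        mask = flat == c
        counts.append(mask.sum())
        conc.append(labels[mask].mean())
    conc, counts = np.array(conc), np.array(counts)
    s2 = np.average((conc - p) ** 2, weights=counts)      # actual variance of cell concentrations
    s2_seg = p * (1.0 - p)                                # fully segregated limit
    s2_mix = p * (1.0 - p) / counts.mean()                # fully mixed (random) limit
    return (s2_seg - s2) / (s2_seg - s2_mix)

# Usage: two initially segregated species become indistinguishable after random shuffling.
rng = np.random.default_rng(1)
box = np.array([1.0, 1.0, 1.0])
pos = rng.random((20000, 3)) * box
labels = (pos[:, 0] > 0.5).astype(float)                  # segregated start: tracer on one side
print("segregated:", round(lacey_index(pos, labels, box), 3))
rng.shuffle(labels)                                       # mimic perfect mixing
print("mixed:     ", round(lacey_index(pos, labels, box), 3))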

4. Conclusions and prospect

In this review, we have demonstrated that MD simulation is not only a powerful tool for understanding the micro-scale behavior of materials and processes in chemical engineering, but can also be engineered to treat meso- and macro-scale systems in two respects. First, thanks to the multi-level, multi-scale structures common to most chemical engineering systems and the discrete nature of their constituent elements, elements other than atoms and molecules can be tracked, elevating the spatio-temporal scale of the simulation by modeling higher-level systems with discrete particles. The simulation methods can be realized in a framework similar to MD, with the three General modules of Neighbor search, Interaction processing and Configuration update, together with specific algorithms in the Particular modules. Second, the fidelity of the element-based representation of the system, and the parallelism and scalability provided by the locality and additivity of the interactions, facilitate fine-grained parallel acceleration with GPUs.
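The division into general and particular modules can be pictured as a driver whose neighbor search, interaction processing and configuration update are fixed, while the element type and interaction law are plugged in. The skeleton below is a schematic rendering of that organization, with assumed interfaces and a brute-force neighbor search standing in for the cell lists used in practice; it is not the code of the framework itself.

from dataclasses import dataclass, field
from typing import Callable, List
import numpy as np

@dataclass
class DiscreteSimulation:
    """Generic MD-style driver: the three general modules are fixed, the physics is plugged in."""
    pos: np.ndarray
    vel: np.ndarray
    pair_force: Callable          # particular module: inter-element interaction law
    cutoff: float
    dt: float
    neighbors: List[tuple] = field(default_factory=list)

    def neighbor_search(self):                     # general module 1 (brute force here; cell lists in practice)
        n = len(self.pos)
        self.neighbors = [(i, j) for i in range(n) for j in range(i + 1, n)
                          if np.linalg.norm(self.pos[i] - self.pos[j]) < self.cutoff]

    def interaction_processing(self):              # general module 2
        forces = np.zeros_like(self.pos)
        for i, j in self.neighbors:
            f = self.pair_force(self.pos[i], self.pos[j])
            forces[i] += f
            forces[j] -= f
        return forces

    def configuration_update(self, forces):        # general module 3 (unit mass, semi-implicit Euler)
        self.vel += forces * self.dt
        self.pos += self.vel * self.dt

    def step(self):
        self.neighbor_search()
        self.configuration_update(self.interaction_processing())

# Usage: a soft repulsive law stands in for any element-level interaction.
def soft_repulsion(ri, rj, k=10.0, d0=0.1):
    rij = ri - rj
    r = np.linalg.norm(rij)
    return k * max(d0 - r, 0.0) * rij / r

sim = DiscreteSimulation(pos=np.array([[0.0, 0.0], [0.05, 0.0]]),
                         vel=np.zeros((2, 2)), pair_force=soft_repulsion,
                         cutoff=0.2, dt=1e-3)
for _ in range(10):
    sim.step()
print(sim.pos)

Swapping pair_force for a contact law, a coarse-grained potential or a pseudo-particle rule, and extending the update to include rotation, changes the level being simulated without touching the driver.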

Combined with the CPUs, which handle the complex logic tasks and the parallel communication, the computational capability is elevated by this multiscale CPU–GPU hybrid strategy. Such capability is becoming more and more important with the ever-increasing demand for better quantification in chemical engineering, and more affordable with the rapid advance of computing technology.

An exciting future development of engineering MD simulation along this direction is interactive computing and virtual reality, or VPE (Ge et al., 2011; Liu et al., 2012), as we have demonstrated preliminarily in Section 3.2.3. On the one hand, the simulations are almost first-principles based and are therefore qualified candidates for virtual experimental systems in terms of accuracy and resolution; on the other hand, the simulation speed will eventually catch up with the evolution of the physical system, that is, run in realtime. For large elements, such as the solid particles in small systems, this can already be reached in a strict sense, even at modest computational cost. For real atoms and molecules, however, it can at present be realized only in a loose sense, even with the fastest computer in the world; that is, visually perceivable motion of the molecules can be observed during the computation if they are magnified sufficiently. In both cases, we are already able to operate the simulated system online. Considering the room for further improvement of the models and algorithms, and the continued development of computer hardware following the EMMS paradigm, VPE based on engineering MD simulation may become routine practice in both product engineering and process engineering within a decade or so, by which time exaflops performance should be the norm for supercomputers.

Recently, we have come a step closer to this prospect: a lab-scale gas–solid fluidized bed, 6 m in height and 0.1 m in diameter, containing 7.56 billion particles of 80 μm diameter, has been simulated successfully at a speed of 2 h/s using 120 Nvidia C2050 GPUs for the solids and the same number of Intel E5520 CPU cores for the gas flow (Lu et al., 2014a,b), an achievement made possible by an EMMS-based discrete particle method with coarse-graining. The scalability of the simulation suggests that smoothly interactive computing can be realized on our Mole-8.5 system with this method, and that strict realtime simulation should be attainable on exaflops systems in the future.

With the engineering MD simulation methods applied to the first two levels in Fig. 1, the complex structure in a reactor can be resolved at high resolution. Although the concept of engineering MD simulation cannot be applied directly at the system level, a single reactor, or a similar facility, can be treated as a black box with well-defined input–output functions, representing an element at this level. A whole chemical engineering process and its environment can then be treated as a network of such elements. Although agent-based simulations have long been used in process systems engineering (PSE) (Gotts et al., 2003; Grimm et al., 2006; Hao et al., 2006; Serrano et al., 2014), it remains very time-consuming to optimize a whole process and assess its ecological effects. Whether the concept of engineering MD can be applied at this level to speed up agent-based simulations and improve their accuracy is another exciting field to be explored.
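As a toy illustration of this last idea, the sketch below wires three black-box elements, a mixer, a reactor and a separator with a recycle, into a small network and iterates their input–output functions to a steady state; the conversion and split factors are invented for the example and carry no physical meaning.

def mixer(streams):                      # sums all inlet streams of components A and B (kg/s)
    a = sum(s[0] for s in streams); b = sum(s[1] for s in streams)
    return {"out": (a, b)}

def reactor(streams):                    # converts 60% of A into B (assumed conversion)
    a, b = streams[0]
    x = 0.6 * a
    return {"out": (a - x, b + x)}

def separator(streams):                  # sends 90% of unreacted A back as recycle (assumed split)
    a, b = streams[0]
    return {"recycle": (0.9 * a, 0.0), "product": (0.1 * a, b)}

# Network: fresh feed + recycle -> mixer -> reactor -> separator -> (product, recycle)
def simulate(feed=(1.0, 0.0), n_iter=50):
    recycle = (0.0, 0.0)
    for _ in range(n_iter):              # successive substitution until the recycle loop converges
        mixed = mixer([feed, recycle])["out"]
        reacted = reactor([mixed])["out"]
        outs = separator([reacted])
        recycle = outs["recycle"]
        product = outs["product"]
    return product, recycle

product, recycle = simulate()
print("product (A, B):", tuple(round(v, 4) for v in product))
print("recycle (A, B):", tuple(round(v, 4) for v in recycle))

Replacing these algebraic placeholders with input–output models extracted from reactor-level simulations would turn such a network into the system-level counterpart of the discrete methods discussed above.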


Acknowledgments

This work is financially supported by the National Natural Science Foundation of China under Grant nos. 21106147, 21206167, 21373227, 21225628 and 21106155, and by the Chinese Academy of Sciences under Grant nos. KGCX2-YW-124 and XDA07080102. The authors would like to thank Prof. Junwu Wang, Dr. Feiguo Chen, Dr. Huabiao Qi, Ms. Mo Zheng, Dr. Ying Ren, and Prof. Li Guo of IPE, CAS for many helpful discussions and assistance.

References Alder, B.J., Wainwright, T.E., 1957. Phase transition for a hard sphere system. J. Chem. Phys. 27, 1208–1209. Aranson, I.S., Tsimring, L.S., 2006. Patterns and collective behavior in granular media: theoretical concepts. Rev. Mod. Phys. 78, 641–692. Berendsen, H.J.C., 1991. Transport properties computed by linear response through weak coupling to a bath. In: Meyer, M., Pontikis, V. (Eds.), Computer Simulation in Materials Science. Springer, Netherlands, pp. 139–155. Berendsen, H.J.C., Postma, J.P.M., van Gunsteren, W.F., DiNola, A., Haak, J.R., 1984. Molecular dynamics with coupling to an external bath. J. Chem. Phys. 81, 3684–3690. Berendsen, H.J.C., Postma, J.P.M., van Gunsteren, W.F., Hermans, J., 1981. Interaction models for water in relation to protein hydration. Intermol. Forces 14, 331–342. Bird, R.B., Stewart, W.E., Lightfoot, E.N., 1960. Transport Phenomena. John Wiley & Sons, New York. Bridgwater, J., 2012. Mixing of powders and granular materials by mechanical means – a perspective. Particuology 10, 397–427. Buehler, E.J., Boland, J.J., 2000. Dimer preparation that mimics the transition state for the adsorption of H2 on the Si(100)-2  1 surface. Science 290, 506–509. Bussi, G., Donadio, D., Parrinello, M., 2007. Canonical sampling through velocity rescaling. J. Chem. Phys. 126, 014101. Campbell, C.S., 2006. Granular material flows – an overview. Powder Technol. 162, 208–229. Castro-Marcano, F., Kamat, A.M., Russo Jr, M.F., van Duin, A.C.T., Mathews, J.P., 2012. Combustion of an Illinois No. 6 coal char simulated using an atomistic char representation and the ReaxFF reactive force field. Combust. Flame 159, 1272–1285. Chen, F., Ge, W., Li, J., 2009. Molecular dynamics simulation of complex multiphase flow on a computer cluster with GPUs. Sci. China Ser. B: Chem. 52, 372–380. Cicero, G., Galli, G., Catellani, A., 2004. Interaction of water molecules with SiC(001) surfaces. J. Phys. Chem. B 108, 16518–16524. Cundall, P.A., Strack, O.D.L., 1979. A discrete numerical-model for granular assemblies. Geotechnique 29, 47–65. Dill, K.A., MacCallum, J.L., 2012. The protein-folding problem, 50 years on. Science 338, 1042–1046. Friesner, R.A., 2005. Ab initio quantum chemistry: methodology and applications. Proc. Natl. Acad. Sci. USA 102, 6648–6653. Ge, W., Chen, F., Gao, J., Gao, S., Huang, J., Liu, X., Ren, Y., Sun, Q., Wang, L., Wang, W., Yang, N., Zhang, J., Zhao, H., Zhou, G., Li, J., 2007. Analytical multi-scale method for multi-phase complex systems in process engineering – bridging reductionism and holism. Chem. Eng. Sci. 62, 3346–3377. Ge, W., Li, J., 1996. Pseudo-particle approach to hydrodynamics of particle–fluid systems. In: Kwauk, M., Li, M., J. (Eds.), Proceedings of the 5th International Conference on Circulating Fluidized Bed. Science Press, Beijing, pp. 260–265. Ge, W., Li, J., 2000. Conceptual model for massive parallel computing of discrete systems with local interactions. Comput. Appl. Chem. 17, 385–388. Ge, W., Li, J., 2003. Macro-scale phenomena reproduced in microscopic systems – pseudo-particle modeling of fluidization. Chem. Eng. Sci. 58, 1565–1585. Ge, W., Wang, W., Yang, N., Li, J., Kwauk, M., Chen, F., Chen, J., Fang, X., Guo, L., He, X., Liu, X., Liu, Y., Lu, B., Wang, J., Wang, J., Wang, L., Wang, X., Xiong, Q., Xu, M., Deng, L., Han, Y., Hou, C., Hua, L., Huang, W., Li, B., Li, C., Li, F., Ren, Y., Xu, J., Zhang, N., Zhang, Y., Zhou, G., Zhou, G., 2011. Meso-scale oriented simulation towards virtual process engineering (VPE) – the EMMS paradigm. 
Chem. Eng. Sci. 66, 4426–4458. Ge, W., Xu, J., Xiong, Q., Wang, X., Chen, F., Wang, L., Hou, C., Xu, M., Li, J., 2013. Multi-scale continuum-particle simulation on CPU–GPU hybrid supercomputer. In: Yuen, D.A., Wang, L., Chi, X., Johnsson, L., Ge, W., Shi, Y. (Eds.), GPU Solutions to Multi-scale Problems in Science and Engineering. Springer, Berlin Heidelberg, pp. 143–161. Germann, T.C., Kadau, K., Swaminarayan, S., 2009. 369 Tflop-s molecular dynamics simulations on the petaflop hybrid supercomputer Roadrunner. Concurr. Comput.: Pract. Exp. 21, 2143–2159. Gingold, R.A., Monaghan, J.J., 1977. Smoothed particle hydrodynamics– theory and application to non-spherical stars. Mon. Not. R. Astron. Soc. 181, 375–389. Gotts, N.M., Polhill, J.G., Law, A.N.R., 2003. Agent-based simulation in the study of social dilemmas. Artif. Intell. Rev. 19, 3–92. Grimm, V., Berger, U., Bastiansen, F., Eliassen, S., Ginot, V., Giske, J., Goss-Custard, J., Grand, T., Heinz, S.K., Huse, G., Huth, A., Jepsen, J.U., Jørgensen, C., Mooij, W.M., Müller, B., Pe’er, G., Piou, C., Railsback, S.F., Robbins, A.M., Robbins, M.M., Rossmanith, E., Rüger, N., Strand, E., Souissi, S., Stillman, R.A., Vabø, R., Visser, U.,


DeAngelis, D.L., 2006. A standard protocol for describing individual-based and agent-based models. Ecol. Model. 198, 115–126. Grossmann, S., 2000. The onset of shear flow turbulence. Rev. Mod. Phys. 72, 603–618. Han, K., Peric, D., Crook, A.J.L., Owen, D.R.J., 2000a. A combined finite/discrete element simulation of shot peening processes – Part I: studies on 2D interaction laws. Eng. Comput. 17, 593–620. Han, K., Peric, D., Owen, D.R.J., Yu, J., 2000b. A combined finite/discrete element simulation of shot peening processes – Part II: 3D interaction laws. Eng. Comput. 17, 680–702. Hao, Q., Shen, W., Zhang, Z., Park, S.-W., Lee, J.-K., 2006. Agent-based collaborative product design engineering: an industrial case study. Comput. Ind. 57, 26–38. Hess, B., 2007. P-LINCS: a parallel linear constraint solver for molecular simulation. J. Chem. Theory Comput. 4, 116–122. Hess, B., Bekker, H., Berendsen, H.J.C., Fraaije, J.G.E.M., 1997. LINCS: a linear constraint solver for molecular simulations. J. Comput. Chem. 18, 1463–1472. Hess, B., Kutzner, C., van der Spoel, D., Lindahl, E., 2008. GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. J. Chem. Theory Comput. 4, 435–447. Hirt, C.W., Nichols, B.D., 1981. Volume of fluid (VOF) method for the dynamics of free boundaries. J. Comput. Phys. 39, 201–225. Hoover, W.G., 1985. Canonical dynamics: equilibrium phase-space distributions. Phys. Rev. A 31, 1695–1697. Hou, C., Ge, W., Xu, J., Li, J., 2014. Molecular dynamics simulation overcoming finite size effects of thermal conductivity. Appl. Phys. Lett. (submitted for publication). Hou, C., Xu, J., Wang, P., Huang, W., Wang, X., 2013a. Efficient GPU-accelerated molecular dynamics simulation of solid covalent crystals. Comput. Phys. Commun. 184, 1364–1371. Hou, C., Xu, J., Wang, P., Huang, W., Wang, X., Ge, W., He, X., Guo, L., Li, J., 2013b. Petascale molecular dynamics simulation of crystalline silicon on Tianhe-1A. Int. J. High Perform. Comput. Appl. 27, 307–317. Hwang, G.J., Wu, S.J., 1998. Direct numerical simulation of the amplification of a 2D temporal disturbance in plane Poiseuille flow. Int. J. Numer. Methods Fluids 26, 443–457. Jorgensen, W., Chandrasekhar, J., Madura, J., Impey, R., Klein, M., 1983. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 79, 926–935. Jung, Y., Choi, C.H., Gordon, M.S., 2001. Adsorption of water on the Si(100) surface: an Ab initio and QM/MM cluster study. J. Phys. Chem. B 105, 4039–4044. Klein, M.L., Shinoda, W., 2008. Large-scale molecular dynamics simulations of selfassembling systems. Science 321, 798–800. Lamb, R.A., Krug, R.M., 2001. Orthomyxoviridae: the viruses and their replication. In: Knipe, D.M., Howley, P.M. (Eds.), Field’s Virology, fourth ed. Lippincott, Philadelphia, pp. 1487–1531. Lee, O.-S., Stupp, S.I., Schatz, G.C., 2011. Atomistic molecular dynamics simulations of peptide amphiphile self-assembly into cylindrical nanofibers. J. Am. Chem. Soc. 133, 3677–3683. Li, X., Zheng, M., Liu, J., Guo, L., 2014a. Revealing chemical reactions of coal pyrolysis with GPU-enabled ReaxFF molecular dynamics and cheminformatics analysis. Mol. Simul. 10.1080/08927022.2014.913789. Li, B., Zhou, G., Ge, W., Wang, L., Wang, X., Guo, L., Li, J., 2014b. A multi-scale architecture for multi-scale simulation and its application to gas–solid flows. Particuology 15, 160–169. Li, J., Cheng, C., Zhang, Z., Yuan, J., Nemet, A., Fett, F.N., 1999. 
The EMMS model – its application, development and updated concepts. Chem. Eng. Sci. 54, 5409–5425. Li, J., Ge, W., Kwauk, M., 2009. Meso-scale Phenomena from Compromise – A Common Challenge, Not Only for Chemical Engineering. 〈arXiv:0912.5407〉. Li, J., Ge, W., Wang, W., Yang, N., Liu, X., Wang, L., He, X., Wang, X., Wang, J., Kwauk, M., 2013. From Multiscale Modeling to Meso-science: A Chemical Engineering Perspective Principles, Modeling, Simulation, and Application. Springer, Berlin Heidelberg. Li, J., Kwauk, M., 1994. Particle-fluid Two-phase Flow: The Energy-minimization Multi-scale Method. Metallurgical Industry Press, Beijing, PR China. Li, J., Tung, Y., Kwauk, M., 1988. Multi-scale modeling and method of energy minimization in particle–fluid two-phase flow. In: Basu, P., Large, J.F. (Eds.), Circulating Fluidized Bed Technology II. Pergamon Press, Compiègne, France, pp. 89–103. Li, X., 2013. Multi-scale Simulation of Fractured-vuggy Reservoirs (Ph.D. thesis). Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China. Liu, M., Li, J., Kwauk, M., 2001. Application of the energy-minimization multi-scale method to gas–liquid–solid fluidized beds. Chem. Eng. Sci. 56, 6805–6812. Liu, M.B., Liu, G.R., 2010. Smoothed particle hydrodynamics (SPH): an overview and recent developments. Arch. Comput. Methods Eng. 17, 25–76. Liu, X., Guo, L., Xia, Z., Lu, B., Zhao, M., Meng, F., Li, Z., Li, J., 2012. Harnessing the power of virtual reality. Chem. Eng. Prog. July, 28–32. Lu, J., Zhang, J., Wang, X., Wang, L., Ge, W., 2009. Parallelization of pseudo-particle modeling and its application in simulating gas–solid fluidization. Particuology 7, 317–323. Lu, L., Ge, W., Li, J., 2014a. EMMS-based DPM simulation of gas–solid flows with coarse-graining. In: Proceedings of the 7th World Congress of Particle Technology, Oral and Poster Presentation. Lu, L., Xu, J., Yue, Y., Liu, X., Ge, W., Li, J., 2014b. EMMS-based discrete particle method (EMMS-DPM) for simulation of gas–solid flows. Chem. Eng. Sci. 120, 67–87.


Lucy, L.B., 1977. A numerical approach to the testing of the fission hypothesis. Astron. J. 82, 1013–1024. Mathews, J.P., van Duin, A.C.T., Chaffee, A.L., 2011. The utility of coal molecular models. Fuel Process. Technol. 92, 718–728. Mayo, S.L., Olafson, B.D., Goddard, W.A., 1990. DREIDING: a generic force field for molecular simulations. J. Phys. Chem. 94, 8897–8909. Miyamoto, S., Kollman, P., 1992. Settle: an analytical version of the SHAKE and RATTLE algorithm for rigid water models. J. Comput. Chem. 13, 952–962. Monaghan, J.J., 2005. Smoothed particle hydrodynamics. Rep. Prog. Phys. 68, 1703–1759. NVIDIA, 2006. NVIDIA CUDA Compute Unified Device Architecture-CUDA Programming Guide, 1.0 ed, Santa Clara, CA. Nosé, S., 1984. A molecular dynamics method for simulations in the canonical ensemble. Mol. Phys. 52, 255–268. Ohira, T., Ukai, O., Noda, M., 2000. Fundamental processes of microcrystalline silicon film growth: a molecular dynamics study. Surf. Sci. 458, 216–228. Parrinello, M., Rahman, A., 1981. Polymorphic transitions in single crystals: a new molecular dynamics method. J. Appl. Phys. 52, 7182–7190. Phillips, J.C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., Chipot, C., Skeel, R.D., Kale, L., Schulten, K., 2005. Scalable molecular dynamics with NAMD. J. Comput. Chem. 26, 1781–1802. Qi, H., 2014. Application of GPU-based Discrete Simulation to Flow and Mixing Mechanisms of Granular Materials (Ph.D. thesis). Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China. Rapaport, D.C., 1987. Microscale hydrodynamics: discrete-particle simulation of evolving flow patterns. Phys. Rev. A 36, 3288–3299. Rapaport, D.C., 2006. Hexagonal convection patterns in atomistically simulated fluids. Phys. Rev. E 73, 025301. Rapaport, D.C., 2007. Radial and axial segregation of granular matter in a rotating cylinder: a simulation study. Phys. Rev. E 75, 031301. Russo, M.F., van Duin, A.C.T., 2011. Atomistic-scale simulations of chemical reactions: bridging from quantum chemistry to engineering. Nucl. Instrum. Methods Phys. Res. Sect. B – Beam Interact. Mater. Atoms 269, 1549–1554. Ryckaert, J.-P., Ciccotti, G., Berendsen, H.J.C., 1977. Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys. 23, 327–341. Serrano, E., Moncada, P., Garijo, M., Iglesias, C.A., 2014. Evaluating social choice techniques into intelligent environments by agent based social simulation. Inf. Sci. 286, 102–124. Solomon, P.R., Fletcher, T.H., Pugmire, R.J., 1993. Progress in coal pyrolysis. Fuel 72, 587–597. Solomon, P.R., Hamblen, D.G., 1983. Finding order in coal pyrolysis kinetics. Prog. Energy Combust. Sci. 9, 323–361. Solomon, P.R., Hamblen, D.G., Carangelo, R.M., Serio, M.A., Deshpande, G.V., 1988. General-model of coal devolatilization. Energy Fuels 2, 405–422. Solomon, P.R., Serio, M.A., Suuberg, E.M., 1992. Coal pyrolysis – experiments, kinetic rates and mechanisms. Prog. Energy Combust. Sci. 18, 133–220. Sussman, M., Smereka, P., Osher, S., 1994. A level set approach for computing solutions to incompressible two-phase flow. J. Comput. Phys. 114, 146–159. Tersoff, J., 1988a. Empirical interatomic potential for silicon with improved elastic properties. Phys. Rev. B 38, 9902–9905. Tersoff, J., 1988b. New empirical approach for the structure and energy of covalent systems. Phys. Rev. B 37, 6991–7000. Tironi, I., Sperb, R., Smith, P., van Gunsteren, W., 1995. 
A generalized reaction field method for molecular dynamics simulations. J. Chem. Phys. 102, 5451–5459. Top500, 2010. 〈http://www.top500.org/list/2011/11/100〉. van Duin, A.C.T., Dasgupta, S., Lorant, F., Goddard, W.A., 2001. ReaxFF: a reactive force field for hydrocarbons. J. Phys. Chem. A 105, 9396–9409. Verlet, L., 1967. Computer “Experiments” on classical fluids. I. Thermodynamical properties of Lennard–Jones molecules. Phys. Rev. 159, 98–103. Vu-Quoc, L., Zhang, X., 1999a. An accurate and efficient tangential force–displacement model for elastic frictional contact in particle-flow simulations. Mech. Mater. 31, 235–269.

Vu-Quoc, L., Zhang, X., 1999b. An elastoplastic contact force–displacement model in the normal direction: displacement-driven version. Proc. R. Soc. Lond. Ser. A: Math. Phys. Eng. Sci. 455, 4013–4044. Wang, P., Zhang, Y., Li, X., Ge, W., Li, J., 2014. A multi-scale simulation approach to multi-phase flow in heterogeneous porous media. Particuology, IPE Internal Report. Wang, X., Ge, W., 2013. The Mole-8.5 supercomputing system. In: Vetter, J.S. (Ed.), Contemporary High Performance Computing: From Petascale toward Exascale. Chapman & Hall/CRC, Boca Raton, FL, pp. 75–98. Wang, X., Ge, W., He, X., Chen, F., Guo, L., Li, J., 2010a. Development and Application of a HPC System for Multi-scale Discrete Simulation – Mole-8.5. In: Proceedings of the International Supercomputing Conference. Germany. Wang, X., Ge, W., Zhou, G., Li, X., Li, B., Wang, P., 2010b. Internal Report. Institute of Process Engineering, Chinese Academy of Sciences. Xu, J., 2012. GPU Accelerated MD Simulation for Macromolecular Systems (Ph.D. thesis). Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China. Xu, J., Qi, H., Fang, X., Lu, L., Ge, W., Wang, X., Xu, M., Chen, F., He, X., Li, J., 2011a. Quasi-realtime simulation of rotating drum using discrete element method with parallel GPU computing. Particuology 9, 446–450. Xu, J., Ren, Y., Ge, W., Yu, X., Yang, X., Li, J., 2010. Molecular dynamics simulation of macromolecules using graphics processing unit. Mol. Simul. 36, 1131–1140. Xu, J., Wang, X., He, X., Ren, Y., Ge, W., Li, J., 2011b. Application of the Mole-8.5 supercomputer: probing the whole influenza virion at the atomic level. Chin. Sci. Bull. 56, 2114–2118. Yang, N., Chen, J., Ge, W., Li, J., 2010. A conceptual model for analyzing the stability condition and regime transition in bubble columns. Chem. Eng. Sci. 65, 517–526. Yu, X., Kong, B., Yang, X.Z., 2008. Molecular dynamics study on the crystallization of a cluster of polymer chains depending on the initial entanglement structure. Macromolecules 41, 6733–6740. Zhang, L., Qiu, X., Wang, L., Li, J., 2014. A stability condition for turbulence model: From EMMS model to EMMS-based turbulence model. Particuology 16, 142–154. Zhang, X., Vu-Quoc, L., 2000. Simulation of chute flow of soybeans using an improved tangential force–displacement model. Mech. Mater. 32, 115–129. Zhao, G., Perilla, J.R., Yufenyuy, E.L., Meng, X., Chen, B., Ning, J., Ahn, J., Gronenborn, A.M., Schulten, K., Aiken, C., Zhang, P., 2013. Mature HIV-1 capsid structure by cryo-electron microscopy and all-atom molecular dynamics. Nature 497, 643–646. Zhao, H., Ge, W., 2007. A theoretical bubble breakup model for slurry beds or threephase fluidized beds under high pressure. Chem. Eng. Sci. 62, 109–115. Zheng, M., Li, X., Guo, L., 2013a. Algorithms of GPU-enabled reactive force field (ReaxFF) molecular dynamics. J. Mol. Graph. Model. 41, 1–11. Zheng, M., Li, X., Liu, J., Guo, L., 2013b. Initial chemical reaction simulation of coal pyrolysis via ReaxFF molecular dynamics. Energy Fuels 27, 2942–2951. Zheng, M., Li, X., Liu, J., Wang, Z., Gong, X., Guo, L., Song, W., 2014. Pyrolysis of Liulin coal simulated by GPU-based ReaxFF MD with cheminformatics analysis. Energy Fuels 28, 522–534. Zhou, G., Chen, Z., Ge, W., Li, J., 2010. SPH simulation of oil displacement in cavityfracture structures. Chem. Eng. Sci. 65, 3363–3371. Zhou, G., Ge, W., Li, B., Li, X., Wang, P., Wang, J., Li, J., 2013. SPH simulation of selective withdrawal from microcavity. Microfluid. Nanofluid. 15, 481–490. 
Zhou, G., Ge, W., Li, J., 2008. A revised surface tension model for macro-scale particle methods. Powder Technol. 183, 21–26. Zhou, Y.C., Wright, B.D., Yang, R.Y., Xu, B.H., Yu, A.B., 1999. Rolling friction in the dynamic simulation of sandpile formation. Phys. A: Stat. Mech. Appl. 269, 536–553. Zhou, Y.C., Xu, B.H., Yu, A.B., Zulli, P., 2002. An experimental and numerical study of the angle of repose of coarse spheres. Powder Technol. 125, 45–54. Zhu, H.P., Zhou, Z.Y., Yang, R.Y., Yu, A.B., 2007. Discrete particle simulation of particulate systems: theoretical developments. Chem. Eng. Sci. 62, 3378–3396.