Molecular characterization of petroleum fractions

Chemical Engineering Science 164 (2017) 81–89

Contents lists available at ScienceDirect

Chemical Engineering Science journal homepage: www.elsevier.com/locate/ces

Molecular characterization of petroleum fractions using state space representation and its application for predicting naphtha pyrolysis product distributions

Hua Mei a,b,⇑, Hui Cheng a, Zhenlei Wang a, Jinlong Li a,⇑

a Key Laboratory of Advanced Control and Optimization for Chemical Processes of the Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
b Department of Chemical & Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1R1, Canada

Highlights

- Definition of the state space representation of petroleum fractions.
- Basis fractions and their acquisition via non-negative matrix factorization.
- Prediction of naphtha pyrolysis products based on the products of the basis fractions.

Article info

Article history: Received 10 September 2016; received in revised form 28 December 2016; accepted 3 February 2017; available online 7 February 2017.

Keywords: Molecular characterization; State space representation; Basis fractions; Non-negative matrix factorization (NMF); Naphtha pyrolysis prediction

Abstract

Molecular models of petroleum fractions play an important role in the design, simulation and optimization of petrochemical processes such as pyrolysis, catalytic reforming and fluid catalytic cracking (FCC). Despite many efforts, however, it remains very difficult to characterize their composition distributions exactly, because the fractions are internally complex and the available data contain a great deal of redundant information and measurement error. As an improvement on the work of Mei et al. (2016), this paper develops a molecular-based representation method within a multi-dimensional state space. Each pure component in a petroleum mixture is defined as a state variable, so that any petroleum fraction can be represented geometrically as a point in a multi-dimensional linear state space. The concept of basis fractions is then introduced by defining a group of linearly independent vectors such that any petroleum fraction within a specified range (e.g. naphtha) can be obtained as a linear combination of these basis fractions. The basis fractions are calculated with a non-negative matrix factorization (NMF) algorithm, a procedure that eliminates redundant information and measurement errors in the predetermined petroleum fraction samples and greatly reduces the scale of the feedstock database. As an application of the basis fractions, a quick approach to predicting naphtha pyrolysis product distributions is developed by linearly combining the pyrolysis products of the basis fractions. In contrast to mechanistic models, the proposed method is better suited to real-time control and optimization, with little loss of accuracy.

© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Petroleum fractions are complex mixtures of thousands of hydrocarbon compounds and can be roughly categorized into several fractions, e.g. liquefied petroleum gas (LPG), straight-run gasoline, naphtha, gas oil, diesel, etc., according to their boiling ranges. However, petroleum fractions within a similar boiling range may possess different physical and/or chemical properties because of the variety of their internal molecular weights and structures. For instance, a typical straight-run medium naphtha contains 40–70 wt.% paraffins, 20–50 wt.% naphthenes, 5–20 wt.% aromatics and only 0–5 wt.% olefins, while the naphtha produced by fluid catalytic cracking (FCC) might contain 30–50 wt.% olefins (Antos and Aitani, 2004). To obtain the detailed molecular composition of petroleum fractions, several modern analytical techniques, such as gas chromatography (GC), gas chromatography-mass spectrometry (GC-MS) and nuclear magnetic resonance (NMR), have been developed during the past decade.

⇑ Corresponding authors at: Key Laboratory of Advanced Control and Optimization for Chemical Processes of the Ministry of Education, East China University of Science and Technology, Shanghai 200237, China (H. Mei & J. Li). http://dx.doi.org/10.1016/j.ces.2017.02.005. 0009-2509/© 2017 Elsevier Ltd. All rights reserved.


However, all these direct methods are time-consuming and error-prone, so they are hardly ever implemented in simulation and real-time optimization. As non-experimental alternatives, a number of "soft" modeling methods have been proposed to determine the molecular composition of a mixture merely from easily obtainable bulk properties, e.g. the specific density, the H/C ratio, paraffins-olefins-naphthenes-aromatics (PONA) mass fractions, ASTM boiling points, etc. These methods are easily integrated into simulation software and have therefore become more and more attractive as information technology has improved. Conventional characterization methods, including pseudo-component characterization (Lee and Kesler, 1975), compound class characterization (Van Nes and Van Westen, 1951; Kurtz et al., 1958; Riazi and Daubert, 1980, 1986) and average structure parameter characterization (Ruzicka, 1983; Boduszynski, 1988), represent a hydrocarbon mixture in terms of pseudo-components or lumped components. Although they are widely used in commercial packages such as PRO/II, HYSYS and Aspen Plus, their application range is rather limited and they are not easily extended, because they lack detailed molecular information, especially about molecules with different structures but similar properties. To represent petroleum fractions explicitly at the molecular level, a number of molecular reconstruction methods have been developed in recent decades. According to the way in which the representative molecules are chosen, these methods can be broadly categorized into stochastic and deterministic methodologies (Wu, 2010).
In the former methods (Neurock et al., 1990; Neurock et al., 1994; Campbell, 1998; Hudebine et al., 2002; Hudebine and Verstraete, 2004), a Monte Carlo technique is used to sample probability distributions randomly and construct molecules, thereby forming a set of random molecules that fit the analytical data. Although the stochastic methodology can reconstruct mixtures, especially heavy oil fractions, its time-consuming computation makes it less attractive for process optimization, because a new library of possible molecules must be generated for each new simulation. Instead of using a variable library, some researchers proposed deterministic methods based on a predefined molecular library (Jabr et al., 1992; Wahl et al., 2002; García et al., 2010), which are significantly faster than the stochastic methods. However, the accuracy of the deterministic methodology depends strongly on the amount of analytical information contained in the predefined library. If a certain component is not already included in the library, there must be a policy for extending the library according to some criterion, e.g. the total enthalpy criterion (Allen and Liguras, 1991), the Shannon entropy criterion (Hudebine and Verstraete, 2004; Verstraete et al., 2010; Van Geem et al., 2007; Pyl et al., 2010) or subspace identification by principal component analysis (PCA) (Joo et al., 2001; Pyl et al., 2010). The complexity of petroleum lies mainly in the astronomical number of hydrocarbon molecules and their isomers, and handling such a tremendous amount of information imposes a heavy computational burden. Therefore, to improve computational efficiency, a proper representation that integrates both structural and compositional information should be selected.
The structure-oriented lumping (SOL) approach developed by Quann and Jaffe (1992) formulates individual molecules as vectors of structural increments and then derives reaction rules from fundamental reaction routes to track the changes in the set of structural vectors and thus build a reaction network. Building on the SOL approach, Peng (1999) proposed a molecular type homologous series (MTHS) matrix characterization approach to represent the composition of a petroleum fraction. In this methodology, a petroleum mixture is visualized as a matrix in which the columns represent the homologous series, the rows represent the carbon number, and the entries represent the molar or weight percent of each component or lump of all possible molecular isomers. Based on Peng's work, many further studies within the framework of the MTHS matrix characterization approach were developed (Zhang, 1999; Aye and Zhang, 2005; Gomez-Prado et al., 2008; Wu, 2010; Ahmad et al., 2011) and widely used in molecular modeling and overall refinery optimization at the molecular level. The reader can refer to the published review for more details of this approach (Ahmad et al., 2011). From a geometric point of view, the MTHS matrix approach represents a hydrocarbon mixture as many discrete points distributed in a three-dimensional space whose X-axis is a discrete sequence of homologous series, whose Y-axis is the sequence of carbon numbers and whose Z-axis is the weight of each component or lumped group. Despite the good visualization, it is hard to discover common features among these scattered points. Moreover, to cover a wide range of petroleum fractions, the number of structural variables in the representation matrix must be enlarged, so that the matrix becomes high-dimensional but sparse, which usually results in a high computational burden. To overcome these limitations of the matrix representation, our earlier work (Mei et al., 2016) proposed a molecular type homologous series vector representation approach to characterize naphtha fractions. Unlike the MTHS matrix representation, the group of hydrocarbon species with the same carbon number is defined as a state variable, and the state variables for all carbon numbers together constitute a vector representation of a naphtha fraction in a multi-dimensional linear space.
The drawback of that approach is that it does not consider how the molecular distribution within each group influences the physical and chemical properties of the mixture. In this paper, the approach is extended into a molecular-based methodology for representing petroleum fractions, as described in detail in Section 2. In Section 3, the concept of basis fractions and their acquisition via non-negative matrix factorization are developed. In Section 4, on the basis of such a linear combination of basis fractions, a quick prediction of naphtha pyrolysis products is made from the pyrolysis product distributions of the basis fractions. Finally, a conclusion is given.

2. State-space representation of petroleum fractions

2.1. Definition of the state-space of substances

A petroleum fraction is a complex mixture of thousands of hydrocarbon compounds with different molecular structures. In general, its thermodynamic and physical properties can be calculated through the relations among four basic observable or measurable state functions: pressure, volume, temperature and amount of substance (Hu, 2013). This paper focuses on characterizing the composition of a hydrocarbon mixture from a geometric viewpoint by constructing a state space of substances, so changes in pressure, volume and temperature are not taken into account. Define each pure component in the mixture as a state variable and express the components as binary vectors as follows:

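$$
\begin{aligned}
\text{component } 1:&\quad e_1 = [1, 0, \ldots, 0]^T \\
\text{component } 2:&\quad e_2 = [0, 1, \ldots, 0]^T \\
&\quad\ \vdots \\
\text{component } n:&\quad e_n = [0, 0, \ldots, 1]^T
\end{aligned}
\qquad (1)
$$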


where a unit entry indicates the presence of the component molecule and a zero indicates its absence. As defined in Eq. (1), the unit coordinate vectors $e_1, e_2, \ldots, e_n$ are mutually orthogonal and therefore span an n-dimensional state space denoted by $S \triangleq \mathrm{span}(e_1, e_2, \ldots, e_n)$. Because the amount of a substance cannot be negative, $S$ is a Euclidean space restricted to non-negative real coordinates $\mathbb{R}_+$.

2.2. Vector representation of a mixture in S

Suppose that a mixture consists of the specified components. It can then be characterized by a vector $v$ in $S$ whose projection onto $e_i$ ($i = 1, \ldots, n$) is the amount of the corresponding pure component; i.e., the mixture $v$ can be expressed as a weighted sum of the pure components:



$$
v = \sum_{i=1}^{n} w_i e_i = (e_1, e_2, \cdots, e_n) \cdot \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix}, \quad (w_i \ge 0) \qquad (2)
$$

or simplified to a column vector as

$$
v = (w_1, w_2, \cdots, w_n)^T \qquad (3)
$$

in which $w_i$ is the amount of the pure component $e_i$. Moreover, suppose that the mixture $v$ is a blend of several distinct mixtures $v_1, v_2, \ldots, v_m$; then $v$ can be formulated as

$$
v = \sum_{j=1}^{m} v_j = \sum_{j=1}^{m}\sum_{i=1}^{n} w_{i,j}\, e_i = \sum_{i=1}^{n}\sum_{j=1}^{m} w_{i,j}\, e_i = \sum_{i=1}^{n} w_i\, e_i \qquad (4)
$$

where $w_i = \sum_{j=1}^{m} w_{i,j}$. Define the one-norm of $v$ as

$$
\|v\|_1 \triangleq \sum_{i=1}^{n} |w_i|, \qquad (5)
$$

which is the total amount of substance in the mixture. The normalization of $v$ with respect to this norm is

$$
\bar v \triangleq \frac{v}{\|v\|_1} = \sum_{i=1}^{n} \frac{w_i}{\|v\|_1}\, e_i = \sum_{i=1}^{n} \bar w_i\, e_i, \qquad (6)
$$

in which $\bar w_i \triangleq w_i / \|v\|_1$ $(0 \le \bar w_i \le 1)$ is the molar or mass fraction of the i-th pure component. Note that

$$
\|\bar v\|_1 = \sum_{i=1}^{n} |\bar w_i| = 1. \qquad (8)
$$

This identity means that all the normalized vectors $\bar v$ satisfying Eq. (8) form an (n − 1)-dimensional subspace of $S$, denoted by $\bar S$. For instance, $\bar S$ is the line passing through the two points (1, 0) and (0, 1) in $S$ if n = 2, the plane determined by the three points (1, 0, 0), (0, 1, 0) and (0, 0, 1) in $S$ if n = 3, and so on.

2.3. Lumping rules in a subspace configuration

As defined in Section 2.1, the dimension of the proposed state space is determined by the number of pure components. However, the large number of possible isomers leads to an extremely high-dimensional space and the resulting computational complexity. A convenient lumping rule is therefore introduced to group all the isomers with the same carbon number into a single lumped component, which is called 'horizontal' lumping by Ranzi et al. (2001) and Dente et al. (2007). Besides 'horizontal' lumping of species with the same molecular weight, 'vertical' lumping of homologous species with different carbon numbers is also allowed, since the reactivity and product distribution of large hydrocarbons with $n_C$ carbon atoms can be estimated accurately enough by linearly combining the reactivity and product distribution of the homologous species with $n_C - 1$ and $n_C + 1$ carbon atoms (Ranzi et al., 2015). From the viewpoint of the aforementioned state space representation, a pseudo-component obtained through either horizontal or vertical lumping can be interpreted as a mixture of several specified components. Suppose that the state space of substances $S$ contains $n + l$ components $\{e_1, \ldots, e_n, \tilde e_1, \ldots, \tilde e_l\}$ and that $\{\tilde e_1, \ldots, \tilde e_l\}$ are the components to be lumped. The space spanned by $\{\tilde e_1, \ldots, \tilde e_l\}$, denoted by $S'$, is then a subspace of $S$. Hence a pseudo-component lumped from $\{\tilde e_1, \ldots, \tilde e_l\}$ can be characterized by a weighted vector in $S'$ as

$$
\tilde e_L = \sum_{i=1}^{l} \tilde w_{i,L}\, \tilde e_i \qquad (7)
$$

or

$$
\tilde e_L = (\tilde w_{1,L}, \tilde w_{2,L}, \ldots, \tilde w_{l,L})^T, \qquad (9)
$$

where $\tilde e_L$ is the pseudo-component and $\|\tilde e_L\|_1 = 1$. Because the vectors in $S'$ are orthogonal to those outside $S'$, $S$ can be reorganized as $S = \mathrm{span}(e_1, \ldots, e_n, \tilde e_L)$ with its dimension reduced from $n + l$ to $n + 1$. Furthermore, the species in the state space of substances can be grouped into several disjoint subspaces, and $S$ can thus be reduced to $S = \mathrm{span}(S', S'', S''', \ldots)$ with its dimension equal to the number of these subspaces.

The weighting coefficients $\tilde w_{i,L}$ express the contribution of the lumped components in $S'$ as well as their impact on the properties of the pseudo-component. In general, the distributions of the lumped components differ among petroleum fraction samples from different origins, so the pseudo-component should be selected to possess properties similar to those of the samples by minimizing the distance between the vector of the pseudo-component and the vectors of all samples in the state space of substances. Given r lumping samples from different origins, $\tilde e_L^{\,j} = (\tilde w_1^{\,j}, \tilde w_2^{\,j}, \cdots, \tilde w_l^{\,j})^T$, $j = 1, 2, \ldots, r$, and letting k be the number of non-zero coefficients in $\tilde e_L$, the coefficients $\tilde w_{i,L}$ can be identified by solving a constrained optimization problem whose form depends on the chosen definition of distance. For computational convenience, the distance is defined as the L2 norm and the problem becomes

$$
\tilde w_{i,L} = \arg\min \sum_{j=1}^{r} \|\tilde e_L^{\,j} - \tilde e_L\|_2^2 = \arg\min \sum_{j=1}^{r}\sum_{i=1}^{k} (\tilde w_i^{\,j} - \tilde w_{i,L})^2, \quad i = 1, \ldots, k, \qquad \text{s.t. } \sum_{i=1}^{k} \tilde w_{i,L} = 1 \qquad (10)
$$

and its analytic solution for the k non-zero coefficients, the remaining coefficients being zero, is

$$
\tilde w_{i,L} = \frac{1}{r}\sum_{j=1}^{r} \tilde w_i^{\,j} + \frac{1}{k} - \frac{1}{rk}\sum_{j=1}^{r}\sum_{i=1}^{k} \tilde w_i^{\,j}, \quad i = 1, \ldots, k. \qquad (11)
$$
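The closed form in Eq. (11) can be checked with a Lagrange multiplier applied to the problem in Eq. (10). The short derivation below is a sketch added for the reader; the multiplier $\lambda$ is a symbol introduced here and does not appear in the original text.

```latex
\begin{aligned}
\mathcal{L} &= \sum_{j=1}^{r}\sum_{i=1}^{k}\bigl(\tilde w_i^{\,j}-\tilde w_{i,L}\bigr)^2
             + \lambda\Bigl(\sum_{i=1}^{k}\tilde w_{i,L}-1\Bigr),\\[4pt]
\frac{\partial\mathcal{L}}{\partial \tilde w_{i,L}}
  &= -2\sum_{j=1}^{r}\bigl(\tilde w_i^{\,j}-\tilde w_{i,L}\bigr)+\lambda = 0
  \;\Longrightarrow\;
  \tilde w_{i,L} = \frac{1}{r}\sum_{j=1}^{r}\tilde w_i^{\,j}-\frac{\lambda}{2r},\\[4pt]
\sum_{i=1}^{k}\tilde w_{i,L}=1
  &\;\Longrightarrow\;
  \frac{\lambda}{2r} = \frac{1}{k}\Bigl(\frac{1}{r}\sum_{j=1}^{r}\sum_{i=1}^{k}\tilde w_i^{\,j}-1\Bigr),\\[4pt]
\tilde w_{i,L}
  &= \frac{1}{r}\sum_{j=1}^{r}\tilde w_i^{\,j}+\frac{1}{k}
     -\frac{1}{rk}\sum_{j=1}^{r}\sum_{i=1}^{k}\tilde w_i^{\,j}.
\end{aligned}
```

In words, each lumped weight is the sample mean of that component, shifted by an equal amount for all k components so that the weights sum to one.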


To test the above lumping rules, the example of branched alkanes with 8 carbon atoms previously discussed by Ranzi et al. (2001) and Aye and Zhang (2005) is taken here. The internal distributions of the branched C8H18 isomers of three different naphthas are compared in Table 1. A clear regularity in composition is observed across the three origins; the relative predominance can be summarized as follows:
- monomethyl-heptanes > dimethyl-hexanes;
- monoethyl-hexanes > trimethyl-pentanes;
- 2-methyl-heptane is the most abundant of the methyl-heptanes;
- the quaternary structures are less abundant than the tertiary ones.
According to the proposed lumping rules, a lumped iso-C8H18 vector containing the same elements as the internal weights given by Ranzi et al. (2001) is calculated. The result appears much closer to the geometric center of the virgin fractions than the original internal weights, as shown in the right-hand column of Table 1.
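The lumping rule can be tried numerically on the Table 1 data. The sketch below is illustrative only: the function name `lump_weights` is hypothetical, the "–" entries of Table 1 are assumed to be zero, only the ten isomers that carry a non-zero internal weight are retained (k = 10), and the weights are kept in wt.% so the sum constraint is 100 rather than 1.

```python
def lump_weights(samples, total=1.0):
    """Least-squares lumped weights: the sample mean of each component,
    shifted equally so that the weights sum to `total` (cf. Eq. (11))."""
    r = len(samples)       # number of samples (origins)
    k = len(samples[0])    # number of retained lumped components
    means = [sum(s[i] for s in samples) / r for i in range(k)]
    shift = (total - sum(means)) / k   # equal correction restoring the sum constraint
    return [m + shift for m in means]

# Retained isomers (wt.%), in Table 1 order: 2-, 3-, 4-methylheptane;
# 2,3-, 2,4-, 2,5-, 3,4-dimethylhexane; 2,3,4-trimethylpentane;
# 3-ethylhexane; 2-methyl-3-ethylpentane. Missing entries are taken as 0.
ponca      = [46.3, 15.4, 10.3, 3.6, 3.1, 3.1, 6.7, 0.3, 4.6, 3.1]
occidental = [36.9, 28.5, 10.2, 5.4, 5.5, 5.7, 2.6, 0.0, 3.5, 0.0]
texas      = [42.1, 23.4,  9.3, 6.3, 4.2, 4.0, 3.7, 1.1, 3.1, 1.5]

lumped = lump_weights([ponca, occidental, texas], total=100.0)
print([round(w, 2) for w in lumped])
```

Under these assumptions the computed weights agree with the "This work" column of Table 1 to within 0.1 wt.%.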

3. Basis fractions

The vector representation of a hydrocarbon mixture describes its composition distribution in a high-dimensional non-negative Euclidean space. Nevertheless, it is hard to ascertain the properties of a mixture merely from its molar or mass fractions without any other analytical tools. A common approach is to compare it with other petroleum fractions whose compositions and properties are available in a database and to predict its properties from those similar samples. Zhang (1999) proposed an interpolation method in her thesis to obtain molecular information from bulk properties, based on the principle that the unknown molecular composition of an oil mixture can be derived by treating it as a blend of several well-characterized oil mixtures. Clearly, the more well-characterized oil samples the database contains, the better the prediction accuracy. However, apart from the high expense and time needed to gather the information on these well-characterized oil samples, many similar samples introduce a great deal of redundant information, which usually complicates the computation because the data become singular. In this section, we propose the concept of basis fractions in a state space framework and a methodology for simplifying the sample database via the non-negative matrix factorization technique.

3.1. Conception of basis fractions

According to linear space theory, besides the unit orthogonal basis $e_1, e_2, \cdots, e_n$ mentioned above, independent vectors $v_1, v_2, \cdots, v_n$ satisfying

$$
\sum_{j=1}^{n} a_j v_j = 0 \quad \text{if and only if} \quad a_i = 0,\ \forall\, i = 1, \ldots, n
$$

also form a basis of $S$, and hence an arbitrary vector in $S$ can be expressed as a linear combination of $v_1, v_2, \cdots, v_n$. Assume that there exist several well-defined petroleum fractions $v_1, v_2, \cdots, v_n$ that are independent, in the sense that their properties are so distinctive that none of them can be obtained from the others. We then define them as basis fractions, because an arbitrary petroleum fraction in $S$ can be represented as a linear combination of them. A general way to obtain the basis fractions is to select them from well-determined petroleum fractions. Let $v_1, v_2, \cdots, v_n$ be the potential basis fractions and $v^{(i)}$, $i = 1, \ldots, N$, be N well-determined petroleum fractions blended from $v_1, v_2, \cdots, v_n$. From Eqs. (4) and (6), their normalized vectors are related by

$$
\bar v^{(i)} = \frac{v^{(i)}}{\|v^{(i)}\|_1}
= \frac{\sum_{j=1}^{n} v_j}{\sum_{j=1}^{n} \|v_j\|_1}
= \frac{\sum_{j=1}^{n} \|v_j\|_1\, \bar v_j}{\sum_{j=1}^{n} \|v_j\|_1}
= \sum_{j=1}^{n} h_j^{(i)}\, \bar v_j
= (\bar v_1, \bar v_2, \ldots, \bar v_n) \cdot h^{(i)} \qquad (12)
$$

where $h^{(i)} \triangleq (h_1^{(i)}, h_2^{(i)}, \cdots, h_n^{(i)})^T$ and $h_j^{(i)} \triangleq \|v_j\|_1 / \sum_{j=1}^{n} \|v_j\|_1$ $(0 \le h_j^{(i)} \le 1)$ is the blending ratio of the j-th fraction $v_j$ in the final blend. Eq. (12) can also be written as the matrix equation

$$
V = W \cdot H \qquad (13)
$$

where

$$
V \triangleq (\bar v^{(1)}, \bar v^{(2)}, \ldots, \bar v^{(N)}) =
\begin{bmatrix}
w_1^{(1)} & w_1^{(2)} & \cdots & w_1^{(N)} \\
w_2^{(1)} & w_2^{(2)} & \cdots & w_2^{(N)} \\
\vdots & \vdots & \ddots & \vdots \\
w_n^{(1)} & w_n^{(2)} & \cdots & w_n^{(N)}
\end{bmatrix}_{n \times N}
$$

is the sample matrix,

$$
W \triangleq (\bar v_1, \bar v_2, \ldots, \bar v_n) =
\begin{bmatrix}
w_{1,1} & w_{1,2} & \cdots & w_{1,n} \\
w_{2,1} & w_{2,2} & \cdots & w_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
w_{n,1} & w_{n,2} & \cdots & w_{n,n}
\end{bmatrix}_{n \times n}
$$

is the basis-fraction matrix,

Table 1. Relative amount of branched isomers C8H18 (wt.%).

| Isomers | Ponca | Occidental | Texas | Internal weights | This work |
|---|---|---|---|---|---|
| 2-Methylheptane | 46.3 | 36.9 | 42.1 | 45.8 | 42.0 |
| 3-Methylheptane | 15.4 | 28.5 | 23.4 | 22.9 | 22.7 |
| 4-Methylheptane | 10.3 | 10.2 | 9.3 | 11.5 | 10.2 |
| 2,3-Dimethylhexane | 3.6 | 5.4 | 6.3 | 3.4 | 5.3 |
| 2,4-Dimethylhexane | 3.1 | 5.5 | 4.2 | 3.4 | 4.5 |
| 2,5-Dimethylhexane | 3.1 | 5.7 | 4 | 3.4 | 4.5 |
| 3,4-Dimethylhexane | 6.7 | 2.6 | 3.7 | 3.4 | 4.6 |
| 2,2-Dimethylhexane | 0.5 | – | 0.3 | – | – |
| 3,3-Dimethylhexane | 1.5 | 1.7 | 0.4 | – | – |
| 2,3,4-Trimethylpentane | 0.3 | – | 1.1 | 1.2 | 0.7 |
| 2,2,3-Trimethylpentane | 0.2 | – | – | – | – |
| 2,3,3-Trimethylpentane | 0.3 | – | 0.6 | – | – |
| 3-Ethylhexane | 4.6 | 3.5 | 3.1 | 3.8 | 4 |
| 2-Methyl-3-ethylpentane | 3.1 | – | 1.5 | 1.2 | 1.8 |
| 3-Methyl-3-ethylpentane | 1 | – | – | – | – |

$$
H \triangleq (h^{(1)}, h^{(2)}, \cdots, h^{(N)}) =
\begin{bmatrix}
h_1^{(1)} & h_1^{(2)} & \cdots & h_1^{(N)} \\
h_2^{(1)} & h_2^{(2)} & \cdots & h_2^{(N)} \\
\vdots & \vdots & \ddots & \vdots \\
h_n^{(1)} & h_n^{(2)} & \cdots & h_n^{(N)}
\end{bmatrix}_{n \times N}
$$

is the blending coefficient matrix, and V, W and H are all non-negative matrices. Obviously, if there are n linearly independent samples among $\bar v^{(1)}, \bar v^{(2)}, \ldots, \bar v^{(N)}$, the basis fractions can be determined directly; however, large representation errors will arise because the original fraction samples contain a great deal of measurement error. On the other hand, for specific petroleum fractions such as naphtha, the compositions vary within such a relatively narrow range that they can be represented with fewer basis fractions than the number of their internal molecular species. Suppose that W and H can each be partitioned as

$$
W = (\bar v_1, \bar v_2, \ldots, \bar v_m \mid \bar v_{m+1}, \bar v_{m+2}, \ldots, \bar v_n) \triangleq (W_{n \times m} \mid W_{n \times (n-m)}) \qquad (14)
$$

$$
H = \left[\begin{array}{cccc}
h_1^{(1)} & h_1^{(2)} & \cdots & h_1^{(N)} \\
h_2^{(1)} & h_2^{(2)} & \cdots & h_2^{(N)} \\
\vdots & \vdots & \ddots & \vdots \\
h_m^{(1)} & h_m^{(2)} & \cdots & h_m^{(N)} \\
\hline
h_{m+1}^{(1)} & h_{m+1}^{(2)} & \cdots & h_{m+1}^{(N)} \\
h_{m+2}^{(1)} & h_{m+2}^{(2)} & \cdots & h_{m+2}^{(N)} \\
\vdots & \vdots & \ddots & \vdots \\
h_n^{(1)} & h_n^{(2)} & \cdots & h_n^{(N)}
\end{array}\right]_{n \times N}
\triangleq
\begin{pmatrix} H_{m \times N} \\ \cdots \\ H_{(n-m) \times N} \end{pmatrix} \qquad (15)
$$

Substituting them into Eq. (13) gives

$$
V = (W_{n \times m} \mid W_{n \times (n-m)}) \cdot \begin{pmatrix} H_{m \times N} \\ \cdots \\ H_{(n-m) \times N} \end{pmatrix}
= W_{n \times m} \cdot H_{m \times N} + W_{n \times (n-m)} \cdot H_{(n-m) \times N} \qquad (16)
$$

If $W_{n \times (n-m)} \cdot H_{(n-m) \times N}$ is small enough, the sample matrix V can be approximated with m independent vectors $\bar v_1, \bar v_2, \ldots, \bar v_m$, fewer than n. Hence $\bar v_1, \bar v_2, \ldots, \bar v_m$ are defined as basis fractions in terms of the subspace specified by the sample matrix V, and a petroleum fraction can be approximated by

$$
\bar v \approx \sum_{j=1}^{m} h_j\, \bar v_j \qquad (17)
$$
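The blending identity behind Eq. (12), on which the approximation in Eq. (17) rests, can be checked on a toy example. The sketch below is illustrative: the two three-component fractions and all function names are hypothetical, and the amounts are arbitrary numbers rather than data from the paper.

```python
def one_norm(v):
    """Total amount of substance in a non-negative mixture vector (Eq. (5))."""
    return sum(abs(x) for x in v)

def normalize(v):
    """Convert amounts to fractions that sum to one (Eq. (6))."""
    total = one_norm(v)
    return [x / total for x in v]

def blend(parts):
    """Component-wise sum of the blended amounts (Eq. (4))."""
    return [sum(ws) for ws in zip(*parts)]

def combine(ratios, fractions):
    """Linear combination sum_j h_j * v_j of normalized fractions."""
    size = len(fractions[0])
    return [sum(h * f[i] for h, f in zip(ratios, fractions)) for i in range(size)]

# Hypothetical amounts (e.g. kg) of three components in two fractions.
v1 = [6.0, 3.0, 1.0]     # 10 kg in total
v2 = [4.0, 8.0, 18.0]    # 30 kg in total
parts = [v1, v2]

total = sum(one_norm(p) for p in parts)
h = [one_norm(p) / total for p in parts]          # blending ratios h_j
lhs = normalize(blend(parts))                     # normalized blend
rhs = combine(h, [normalize(p) for p in parts])   # sum_j h_j * normalized v_j
print(h, lhs, rhs)
```

Both sides give the same composition, which is why a blend of known fractions can be described by its blending ratios alone.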

3.2. Numeric computation for the basis fractions

According to Eq. (16), the sample matrix V is decomposed into the product of the basis-fraction matrix and the corresponding blending ratio matrix. However, because $W_{n \times m}$ and $H_{m \times N}$ must be non-negative, they cannot be calculated with typical matrix decomposition methods such as orthogonal decomposition or singular value decomposition; an iterative numerical method known as non-negative matrix factorization (NMF) is required instead. The NMF algorithm was initiated by Paatero and Tapper (1994, 1997), together with Lee and Seung (1999, 2001), and has been developed in recent decades by introducing regularization terms and/or constraints into the cost function (Laurberg, 2007; Mørup and Hansen, 2009; Arora et al., 2012; Esser et al., 2012; Benachir et al., 2013; Kumar et al., 2013; Huang et al., 2014; Liu et al., 2016). NMF-based methods are now widely used in many areas, such as pattern recognition, antenna array processing and environmental data processing. The problem of calculating the basis fractions can be transformed into the following optimization problem with non-negativity constraints:

$$
(W, H) = \arg\min_{W \ge 0,\ H \ge 0} \|V - W \cdot H\|_F^2 \qquad (18)
$$

where $V \in \mathbb{R}_+^{n \times N}$, $W \in \mathbb{R}_+^{n \times m}$, $H \in \mathbb{R}_+^{m \times N}$ and $\|\cdot\|_F$ denotes the Frobenius norm of a matrix. This optimization problem can be solved with the multiplicative update rules proposed by Lee and Seung (2001):

$$
H \leftarrow H \odot \frac{W^T \cdot V}{W^T \cdot W \cdot H}
\quad \text{and} \quad
W \leftarrow W \odot \frac{V \cdot H^T}{W \cdot H \cdot H^T}, \qquad (19)
$$

where $A \odot B$ and $\frac{A}{B}$ denote element-wise matrix multiplication and division, respectively; the optimizing procedure is given in Table 2. It is noteworthy that the NMF problem has been proved to be NP-hard (Vavasis, 2009). The standard NMF algorithm treats the problem as an instance of general non-convex programming, so it cannot guarantee convergence to the global optimum; its solution is not unique and depends on the initialization of W and H. Regarding uniqueness, Donoho and Stodden (2003) proved that a specific set of conditions must hold for NMF to determine the parts of a dataset, one of which is that the data must contain samples with every feature in every allowable combination, as summarized by Ingram (2015). In practice, an approximate factorization of sufficient accuracy can be obtained by repeating the computation and keeping the result with the minimal square error, or by using step-wise NMF methods (Bittorf et al., 2012; Tepper and Sapiro, 2016; Wang et al., 2016) that improve the robustness of the iterative procedure, according to the requirements of the specific engineering practice.

3.3. Case study

To illustrate the proposed representation method based on basis fractions, a detailed PIONA dataset of 59 naphtha samples from a Chinese ethylene plant, previously used in Mei et al. (2016), is employed. These data are summarized in Table S1 in the supplementary materials. By lumping homologous species with the same carbon number, each naphtha fraction is reduced to 35 lumped components; the distributions of all naphtha samples are shown in Fig. 1. Applying the standard NMF algorithm described above yields 21 basis fractions, which are also given in the supplementary materials. Their composition distributions are shown in Fig. 2.

Table 2. The optimizing procedure of standard NMF.

| Step | Action |
|---|---|
| 1 | Determine the dimension of W, which is commonly selected as m ≤ nN/(n + N) |
| 2 | Initialize W and H by setting random non-negative values as their entries |
| 3 | Update W and H by the multiplicative update rules given in Eq. (19) |
| 4 | Evaluate ‖V − W·H‖_F. If it is smaller than a predetermined small value or the number of iterations exceeds a preset limit, halt the iteration; otherwise go to Step 3 |
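The four steps of Table 2 can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names, the small constant `eps` that guards against division by zero, the fixed random seed and the toy matrix are all choices made here.

```python
import random

def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius(A, B):
    """Frobenius norm of A - B."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb)) ** 0.5

def max_rank(n, N):
    """Step 1: largest integer m with m <= n*N/(n+N)."""
    return (n * N) // (n + N)

def nmf(V, m, iters=500, tol=1e-6, seed=0, eps=1e-12):
    """Standard NMF by multiplicative updates; returns (W, H, residual)."""
    rng = random.Random(seed)
    n, N = len(V), len(V[0])
    # Step 2: random positive initialization
    W = [[rng.random() + 0.1 for _ in range(m)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(N)] for _ in range(m)]
    err = frobenius(V, matmul(W, H))
    for _ in range(iters):
        # Step 3a: H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # Step 3b: W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
        # Step 4: stop once the residual is small enough
        err = frobenius(V, matmul(W, H))
        if err < tol:
            break
    return W, H, err

# Toy data: 4 "species" x 5 "samples" blended from 2 hypothetical basis columns.
W_true = [[0.7, 0.1], [0.2, 0.1], [0.1, 0.3], [0.0, 0.5]]
H_true = [[1.0, 0.8, 0.5, 0.2, 0.0], [0.0, 0.2, 0.5, 0.8, 1.0]]
V = matmul(W_true, H_true)
W, H, err = nmf(V, m=2)
print(max_rank(35, 59), err)
```

For the case study dimensions (n = 35 lumped components, N = 59 samples), `max_rank` gives 21, which matches the number of basis fractions reported. Because the result depends on the initialization, a practical run would repeat the factorization with several seeds and keep the smallest residual.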


Fig. 1. Distributions of 59 naphtha samples. [Three-dimensional plot of composition (wt %) against species and naphtha samples; legend: n-paraffins, i-paraffins, olefins, naphthenes, aromatics.]

Fig. 2. Distributions of 21 basis fractions. [Three-dimensional plot of composition (wt %) against species and basis fractions; legend: n-paraffins, i-paraffins, olefins, naphthenes, aromatics.]

Compared with the original samples shown in Fig. 1, a great deal of redundant information and measurement error is eliminated in the basis fractions. Fig. 3 shows the square errors of the entries between V and W·H, indicating that 21 basis fractions are enough to approximate the original samples with high accuracy.

4. Application for predicting the naphtha pyrolysis products

Steam cracking of hydrocarbon fractions is one of the most important industrial routes for producing ethylene and propylene. To predict the product distribution of hydrocarbon pyrolysis, many kinetic models based on the Rice-Herzfeld free radical mechanism, e.g. SPYRO (Dente and Ranzi, 1979) and COILSIM (Van Geem et al., 2007), have been established in the past decades. These kinetic models provide a more rigorous representation of the pyrolysis process as well as good interpolation and extrapolation ability (Ranzi et al., 2001), but their mathematical difficulty and computational burden argue against their use for predicting cracking products in real-time control and optimization. Unlike complicated mechanistic modeling, data-driven modeling techniques aim at constructing a


Fig. 3. Square errors between representing models and the original samples. [Three-dimensional plot of square error against species and naphtha samples.]

direct relationship between the pyrolysis product yields and operating data, such as the composition of the feedstock, the coil inlet temperature (CIT) and outlet temperature (COT), the coil inlet pressure (CIP) and outlet pressure (COP), the feedstock flow rate, the steam-to-hydrocarbon ratio (SHR), and so on. Because they involve no internal mechanistic analysis, data-driven models compute much faster and are therefore suited to real-time control and optimization, at the cost of some extrapolation performance. Suppose $X_{P_i}$ is the function giving the yield of product $P_i$ with respect to the petroleum fraction $\bar v$ and the process condition $h$, that is

$$
y_{P_i} = X_{P_i}(\bar v \mid h), \qquad (20)
$$

where $h$ is an assemblage of functions of the operating parameters such as CIT and COT, CIP and COP, feedstock flow rate and SHR. Assume that both the basis fractions $\bar v_1, \bar v_2, \ldots, \bar v_m$ and their blending ratios in $\bar v$ are known. Substituting Eq. (17) into Eq. (20) gives

$$
y_{P_i} \approx X_{P_i}\Bigl(\sum_{j=1}^{m} h_j\, \bar v_j \Bigm| h\Bigr). \qquad (21)
$$

In general, $X_{P_i}$ is nonlinear with respect to the compounds in the feedstock. However, $X_{P_i}$ is nearly linear under the same mixing rule of the feedstock, following the horizontal and vertical lumped reaction rules for naphtha feedstock (Ranzi et al., 2001, 2015). Thus, the yields of naphtha pyrolysis products can be approximated by linearly combining those of the basis fractions under the same operating conditions, as given in Eq. (22):

$$
y_{P_i} \approx \sum_{j=1}^{m} h_j\, X_{P_i}(\bar v_j \mid h). \qquad (22)
$$
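The linear combination in Eq. (22) amounts to a weighted sum of yield tables that were computed in advance. The sketch below is illustrative only: the function name `predict_yields`, the two basis fractions, their yields and the blending ratios are hypothetical numbers, not values from the paper, and all basis yields are assumed to refer to one fixed operating condition.

```python
def predict_yields(ratios, basis_yields):
    """Eq. (22): yield of each product as sum_j h_j * (yield of basis fraction j)."""
    products = basis_yields[0].keys()
    return {p: sum(h * y[p] for h, y in zip(ratios, basis_yields)) for p in products}

# Hypothetical yields (wt.%) of two basis fractions at one fixed operating
# condition, e.g. precomputed off-line with a mechanistic model.
basis_yields = [
    {"C2H4": 30.0, "C3H6": 16.0},
    {"C2H4": 22.0, "C3H6": 20.0},
]
ratios = [0.25, 0.75]   # hypothetical blending ratios h_j, summing to one

yields = predict_yields(ratios, basis_yields)
print(yields)
```

Since only multiplications and additions are involved at prediction time, the cost is negligible compared with running a mechanistic model; a separate set of basis yields would be needed for each operating condition of interest.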

If $X_{P_i}(\bar v_j \mid h)$ is obtained off-line in advance from the mechanistic model, then the product yield $y_{P_i}$ of a petroleum fraction can be calculated immediately from its blending ratios. The main product yields of one of the naphtha samples, given in Table 3, are first obtained from a mechanistic model (COILSIM) at four COTs. To validate the proposed approximation based on the basis fractions, the product yields are then calculated by linearly combining the yields of the basis fractions under the same conditions. The predicted results and their relative errors with respect to the COILSIM predictions are listed in Table 3. It can be seen that the calculated product yields are fairly accurate but

Table 3
Comparison between the product yields from the COILSIM model and the calculation based on the basis fractions. Operating conditions: flow rate = 40 t/h; SHR = 0.55; CIP = 0.24 MPa; COP = 0.21 MPa; CIT = 590 °C. COILSIM and calculated (Calc.) yields are in wt.%; errors are in %.

COT (°C)      805                        815                        825                        840
Product       COILSIM  Calc.    Error    COILSIM  Calc.    Error    COILSIM  Calc.    Error    COILSIM  Calc.    Error
H2            0.93     0.97     4.34     0.99     1.03     3.95     1.11     1.13     1.95     1.27     1.30     2.37
CH4           14.66    14.92    1.81     15.75    15.98    1.43     17.45    17.51    0.33     19.17    19.25    0.43
C2H4          25.47    25.50    0.14     26.53    26.52    0.04     27.75    27.07    2.45     28.29    28.21    0.29
C2H6          3.25     3.37     3.85     3.17     3.29     3.62     3.00     3.01     0.47     2.72     2.80     2.91
C3H6          19.09    19.27    0.97     18.01    18.08    0.38     15.58    15.38    1.30     12.30    12.21    0.77
C3H8          0.45     0.47     2.90     0.38     0.39     1.43     0.25     0.25     2.91     0.14     0.14     0.72
1,3-C4H6      5.34     5.21     2.34     5.30     5.15     2.90     4.95     4.77     3.65     4.25     4.18     1.71
C4 Raffinate  4.93     4.79     2.75     3.94     3.83     2.90     2.57     2.53     1.74     1.53     1.54     0.52
Benzene       3.72     3.56     4.24     4.15     4.00     3.70     4.81     4.76     0.97     5.36     5.24     2.28


few computations, showing that the proposed method can easily be applied for real-time control and optimization purposes.

5. Conclusion

A molecular type and homologous series (MTHS) vector representation methodology for petroleum fractions is proposed in this paper by defining all compounds in the mixture as state variables, in the form of a series of unit vectors that span a multi-dimensional state space. Under this definition, a petroleum fraction can be represented as a point with an explicit geometric interpretation, whose coordinates are the quantities of the components. In addition, because of the separability of a linear space, a lumping rule is developed so that the lumped components span a state subspace and their internal distribution can be obtained by minimizing the sum of the squared distances between the samples and the lumped component. To obtain the basis fractions, the NMF approach is employed to search numerically for a finite set of independent vectors from abundant petroleum fraction samples, and 21 basis fractions are selected from 59 potential naphtha samples. Finally, the concept of basis fractions is applied to predict the product distribution of naphtha pyrolysis, and good predictions are obtained by linear combination. In contrast to calculation by mechanistic models, this interpolation method is much more convenient for real-time control and optimization purposes, with little loss of accuracy.

Acknowledgements

This work is supported by the Natural Science Foundation of Shanghai (16ZR1407300) and sponsored by the China Scholarship Council (201606745013). The authors are sincerely grateful to Professor Eliseo Maria Ranzi for assistance on the naphtha pyrolysis mechanism and the lumping rules of species.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ces.2017.02.005.

References

Ahmad, M.I., Zhang, N., Jobson, M., 2011. Molecular components-based representation of petroleum fractions. Chem. Eng. Res. Des. 89, 410–420.
Allen, D.T., Liguras, D., 1991. Structural models of catalytic cracking chemistry: a case of a group contribution approach to lumped kinetic modeling. In: New York: Mobil Workshop: Chemical Reactions in Complex Mixtures.
Antos, G.J., Aitani, A.M., 2004. Catalytic Naphtha Reforming, Revised and Expanded. CRC Press, Florida.
Arora, S., Ge, R., Kannan, R., Moitra, A., 2012. Computing a nonnegative matrix factorization–provably. In: Proceedings of the 44th Annual ACM Symposium on Theory of Computing, STOC, pp. 145–162.
Aye, M., Zhang, N., 2005. A novel methodology in transforming bulk properties of refining streams into molecular information. Chem. Eng. Sci. 60, 6702–6717.
Benachir, D., Hosseini, S., Deville, Y., Karoui, M.S., Hameurlain, A., 2013. Modified independent component analysis for initializing non-negative matrix factorization: an approach to hyperspectral image unmixing. In: Electronics, Control, Measurement, Signals and their Application to Mechatronics (ECMSM), 2013 IEEE 11th International Workshop of, Toulouse, France, pp. 1–6.
Boduszynski, M., 1988. Composition of heavy petroleums: 2. Molecular characterization. Energy Fuels 2, 597–613.
Bittorf, V., Recht, B., Re, C., et al., 2012. Factoring nonnegative matrices with linear programs. In: Advances in Neural Information Processing Systems 2012, Nevada, USA, pp. 1214–1222.
Campbell, D.M., 1998. Stochastic Modeling of Structure and Reaction in Hydrocarbon Conversion, Doctoral Dissertation. University of Delaware, Newark.
Dente, M., Ranzi, E., 1979. Detailed prediction of olefin yields from hydrocarbon pyrolysis through a fundamental simulation program (SPYRO). Comput. Chem. Eng. 3, 61–75.
Dente, M., Bozzano, G., Faravelli, T., Marongiu, A., Pierucci, S., Ranzi, E., 2007. Kinetic modelling of pyrolysis processes in gas and condensed phase. Adv. Chem. Eng. 32, 51–166.

Donoho, D., Stodden, V., 2003. When does non-negative matrix factorization give a correct decomposition into parts? In: Advances in Neural Information Processing Systems (NIPS 2003).
Esser, E., Moller, M., Osher, S., Sapiro, G., Xin, J., 2012. A convex model for nonnegative matrix factorization and dimensionality reduction on physical space. IEEE Trans. Image Process. 21 (10), 3239–3252.
García, C.L., Hudebine, D., Schweitzer, J.M., Verstraete, J.J., Ferré, D., 2010. In-depth modeling of gas oil hydrotreating: from feedstock reconstruction to reactor stability analysis. Catal. Today 150, 279–299.
Gomez-Prado, J., Zhang, N., Theodoropoulos, C., 2008. Characterisation of heavy petroleum fractions using modified molecular-type homologous series (MTHS) representation. Energy 33, 974–987.
Hu, Y., 2013. Physical Chemistry. High Education Press, Beijing, China.
Huang, K., Nicholas, D., Swami, S.A., 2014. Non-negative matrix factorization revisited: uniqueness and algorithm for symmetric decomposition. IEEE Trans. Signal Process. 62 (1), 211–224.
Hudebine, D., Vera, C., Wahl, F., Verstraete, J.J., 2002. Molecular representation of hydrocarbon mixtures from overall petroleum analysis. In: AIChE 2002 Spring Meeting, 27a.
Hudebine, D., Verstraete, J.J., 2004. Molecular reconstruction of LCO gasoils from overall petroleum analyses. Chem. Eng. Sci. 59, 4755–4763.
Ingram, S., 2015. An Improved Projected Gradient Method for Nonnegative Matrix Factorization.
Jabr, N., Alatiqi, I.M., Fahim, M.A., 1992. An improved characterization method for petroleum fractions. Can. J. Chem. Eng. 70, 765–773.
Joo, E., Park, S., Lee, M., 2001. Pyrolysis reaction mechanism for industrial naphtha cracking furnaces. Ind. Eng. Chem. Res. 40, 2409–2415.
Kumar, A., Sindhwani, V., Kambadur, P., 2013. Fast conical hull algorithms for near-separable non-negative matrix factorization. Proceedings of the 30th International Conference on Machine Learning 28 (1), 231–239.
Kurtz, S., King, R., Stout, W., Peterkin, M., 1958. Carbon-type composition of viscous fractions of petroleum. Density–refractivity intercept method. Anal. Chem. 30, 1224–1236.
Laurberg, H., 2007. Uniqueness of non-negative matrix factorization. In: IEEE/SP 14th Workshop on Statistical Signal Processing, pp. 44–48.
Lee, B., Kesler, M., 1975. A generalized thermodynamic correlation based on three parameter corresponding states. AIChE J. 21, 510–527.
Lee, D.D., Seung, H.S., 1999. Learning the parts of objects by non-negative matrix factorization. Nature 401 (6755), 788–791.
Lee, D.D., Seung, H.S., 2001. Algorithms for nonnegative matrix factorization. In: Advances in Neural Information Processing Systems 13: Proceedings of the 2000 Conference, pp. 556–562.
Liu, Y., Liao, Y., Tang, L., Tang, F., Liu, W., 2016. General subspace constrained nonnegative matrix factorization for data representation. Neurocomputing 173, 224–232.
Mei, H., Du, Y., Wang, Z., Qian, F., 2016. Naphtha characterization based on a molecular-type homologous series vector representation. J. Tsinghua Univ. (Sci. Technol.) 56 (7), 723–727 (In Chinese).
Mørup, M., Hansen, L.K., 2009. Tuning pruning in sparse non-negative matrix factorization. In: 17th European Signal Processing Conference (EUSIPCO 2009), Glasgow, Scotland.
Neurock, M., Nigam, A., Libanati, C., Klein, M.T., 1990. Monte Carlo simulation of complex reaction systems: molecular structure and reactivity in modelling heavy oils. Chem. Eng. Sci. 45, 2083–2088.
Neurock, M., Nigam, A., Trauth, D., Klein, M.T., 1994. Molecular representation of complex hydrocarbon feedstocks through efficient characterisation and stochastic algorithms. Chem. Eng. Sci. 49, 4153–4177.
Paatero, P., Tapper, U., 1994. Positive matrix factorization: a nonnegative factor model with optimal utilization of error estimates of data values. Environmetrics 5 (2), 111–126.
Paatero, P., Tapper, U., 1997. Least squares formulation of robust non-negative factor analysis. Chemomet. Intell. Lab. 37, 23–35.
Peng, B., 1999. Molecular Modelling of Refinery Processes. UMIST, Manchester.
Pyl, S.P., Van Geem, K.M., Reyniers, M., et al., 2010. Molecular reconstruction of complex hydrocarbon mixtures: An application of principal component analysis. AIChE J. 56 (12), 3174–3188.
Quann, R.J., Jaffe, S.B., 1992. Structural-oriented lumping: describing the chemistry of complex mixtures. Ind. Eng. Chem. Res. 31, 2483–2497.
Ranzi, E., Dente, M., Goldaniga, A., Bozzano, G., Faravelli, T., 2001. Lumping procedures in detailed kinetic modeling of gasification, pyrolysis, partial oxidation and combustion of hydrocarbon mixtures. Prog. Energy Combust. Sci. 27 (1), 99–139.
Ranzi, E., Pierucci, S., Dente, M., Van Goethem, M.W.M., 2015. Accurate molecular reconstruction of cracking feeds improves the predictions of ethylene yields. In: 2015 Spring Meeting & 11th Global Congress on Process Safety, Austin TX, USA.
Riazi, M., Daubert, T., 1986. Prediction of molecular-type analysis of petroleum fractions and coal liquids. Ind. Eng. Chem. Proc. Des. Dev. 25, 1009–1015.
Riazi, M., Daubert, T., 1980. Prediction of the composition of petroleum fractions. Ind. Eng. Chem. Proc. Des. Dev. 19, 289–294.
Ruzicka, V., 1983. Estimation of vapor pressures by a group-contribution method. Ind. Eng. Chem. Fund. 22, 266–267.
Tepper, M., Sapiro, G., 2016. Compressed nonnegative matrix factorization is fast and accurate. IEEE Trans. Signal Process. 64 (9), 2269–2283.
Van Geem, K.M., Hudebine, D., Reyniers, M.F., et al., 2007. Molecular reconstruction of naphtha steam cracking feedstocks based on commercial indices. Comp. Chem. Eng. 31, 1020–1034.

Van Nes, K., Van Westen, H., 1951. Aspects of the Constitution of Mineral Oils. Elsevier Publishing Co., Inc, New York.
Vavasis, S., 2009. On the complexity of non-negative matrix factorization. SIAM J. Optim. 20 (3), 1364–1377.
Verstraete, J.J., Schnongs, Ph., Dulot, H., Hudebine, D., 2010. Molecular reconstruction of heavy petroleum residue fractions. Chem. Eng. Sci. 65, 304–312.
Wahl, F., Hudebine, D., Verstraete, J.J., 2002. Reconstruction of the molecular composition of FCC Gasolines from overall petroleum analyses. Oil and Gas Science and Technology – Review IFP.
Wang, D., Nie, F., Huang, H., 2016. Fast robust non-negative matrix factorization for large-scale human action data clustering. In: Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI-16), New York, USA, pp. 2104–2110.
Wu, Y., 2010. Molecular Management for Refining Operations. The University of Manchester PhD thesis.
Zhang, Y., 1999. A Molecular Approach for Characterisation and Property Predictions of Petroleum Mixtures with Applications to Refinery Modelling. UMIST PhD thesis.