A structural weight estimation model of FPSO topsides using an ...

Ships and Offshore Structures, 2017 Vol. 12, No. 1, 43–55, http://dx.doi.org/10.1080/17445302.2015.1099246

A structural weight estimation model of FPSO topsides using an improved genetic programming method

Sol Ha^a, Tae-Sub Um^b, Myung-Il Roh^c,∗ and Hyun-Kyoung Shin^d

^a Department of Ocean Engineering, Mokpo National University, Muan-gun, Republic of Korea; ^b Maritime Research Institute, Hyundai Heavy Industries Co., Ltd., Ulsan, Republic of Korea; ^c Department of Naval Architecture and Ocean Engineering & Research Institute of Marine Systems Engineering, Seoul National University, Seoul, Republic of Korea; ^d School of Naval Architecture and Ocean Engineering, University of Ulsan, Ulsan, Republic of Korea

(Received 2 June 2015; accepted 21 September 2015)

The weight information of an FPSO (floating production, storage, and offloading) plant is needed to estimate the amount of production material (e.g., plates) required and to determine a suitable production method for its construction. In addition, the weight is a key factor that affects the building cost and the production period of the FPSO plant. Although the importance of the weight has long been recognised, the weight, especially that of the topsides, has been only roughly estimated from existing similar data and the designer's experience. To improve this task, a weight estimation model for FPSO plant topsides was developed in this study using an improved genetic programming (GP) method. For this purpose, various past records related to the weight of FPSO plants were collected through a literature survey, and the weight estimation model using GP was then established by fixing the independent variables based on these data. In addition, correlation analysis was performed to compensate for a weak point of GP, which is prone to overfitting when the number of data records is small relative to the number of independent variables. That is, by reducing the number of variables through analysis of the correlation between the independent variables, the number of weight data effectively increases relative to the number of variables. Finally, to evaluate the applicability of the suggested model, it was applied to an example of the weight estimation of FPSO plant topsides. Compared with the results of the multiple nonlinear regression analysis conducted in the previous study, the results showed that the suggested model can be applied to the weight estimation process of the FPSO plant at the early design stage.

Keywords: weight estimation; topsides weight; FPSO plant; genetic programming; correlation analysis; optimisation; statistics

∗Corresponding author. Email: [email protected]

© 2015 Informa UK Limited, trading as Taylor & Francis Group

1. Introduction

The engineering of FPSO (floating production, storage, and offloading) plants is divided into two phases: the front-end engineering design (FEED) phase and the detailed engineering phase (Hwang 2013). Of these two phases, the FEED phase is more critical for determining the feasibility of developing a specific well area, because the economic analysis of the development is based on the outputs of the FEED phase. The final outputs of the FEED phase are the weight, the total costs, and the layout of an offshore plant. It is essential to accurately estimate the weight of the FPSO plant topsides at the FEED phase in terms of both cost management and performance satisfaction. Accurately estimating the topsides weight as early as possible is critical in controlling the costs and schedules of building these facilities. Furthermore, weight estimation of the FPSO plant topsides is necessary to provide the information required for hull structural design, to estimate the equipment to be built and the amount of material that needs to be procured, to manage the stability of the platform, and to estimate the total cost and construction period of the project. If the topsides weight can be accurately estimated at the FEED phase, the weight can be efficiently controlled, and the material cost can be kept stable. Figure 1 shows the concept of weight control in FPSO plant design.

Figure 1. Concept of weight control in FPSO plant design.

It is not easy, however, to accurately estimate the weight of the FPSO plant topsides at the early design stage. In particular, when parent or similar FPSO plants are not available, it is necessary to choose a reliable method for weight estimation. Thus, this study proposes a method that can be used to develop a weight estimation model for FPSO plant topsides at the early design stage. Many approaches, such as optimisation and statistical methods, can be used for this purpose. In the authors' previous study (Ha et al. 2015), a simplified model was proposed for the weight estimation of FPSO plant topsides using the statistical method. That study used variable transformation to consider nonlinear forms of the independent variables, and it used correlation analysis and multiple regression analysis to generate a simplified model for the weight estimation of FPSO plant topsides. This method has the advantage of obtaining the result in a short time, but the accuracy of the estimated weight might be low compared with the actual data. Therefore, this study proposes a new method that uses both optimisation and statistical methods to address the accuracy and time consumption issues.

2. Related studies

There are many methods of estimating the weight of a system such as a ship or an offshore plant. Among these, the methods based on the particulars of the system (particulars-based methods) represent a simple approach based on the assumption that the weight of a system is related to its particulars, such as the length, height, volume, and density. An example of this approach is the volumetric density method, which estimates the weight of a detailed weight group by multiplying the space volume by the bulk factor (density). Bolding (2001) used this bulk factor to estimate the weight of an FPSO plant's topsides. The parametric method represents the weight as a function of several parameters. The weight of the hull structure, for instance, can be estimated as L^1.6(B + D) (Lee et al. 2001), and the tubular weight for carbon steel in a wind farm is defined as W = 24,660(D − n)nL (Kaiser and Snyder 2014). These methods, however, are based on the domain knowledge of the target system, and as such, it is not easy to determine which parameters the system weight depends on.

The statistical or optimisation method can help solve this problem. It can be used to develop a weight equation from the analysis of various past records, and then to estimate the weight using the equation. It helps in estimating the relationships among the variables of a system, and includes many techniques for modelling and analysing several variables. Based on the statistical method, several studies related to the development of models for weight estimation have been conducted in the naval architecture and ocean engineering fields. Koike and Minoura (2011) applied a statistical method to predict ship performance using onboard measurement data. Hussin et al. (2012) presented a systematic methodology, based on the statistical method, for analysing the maintenance data of an offshore system to gain insights into the system reliability performance and to identify the critical factors influencing the performance. Kaiser and Snyder (2013) developed a linear regression model to predict the rig weight using the hull length and breadth (width), water depth capability, designer, environmental class (harsh vs. moderate), and building years as predictor variables. Ha et al. (2015) applied nonlinear multiple regression analysis to the weight estimation of the FPSO plant topsides. An engineer who uses the statistical method can obtain the result in a short time, but if she/he does not have the needed past records of the target system, the estimated data can deviate from the actual data.

The genetic programming (GP) approach can address this weakness and can thus help improve the estimation result. Some researchers have applied the GP approach to the domains of naval architecture and ocean engineering. Gaur and Deo (2008) used GP to forecast waves in real time. They used samples covering a period of 15 years, and tested the model on the samples from the last 5-year period. Charhate et al. (2009) used GP to forecast offshore wind in real time. They suggested that the GP approach can be used to predict the wind speed and direction at two offshore locations along the west coast of India over future time steps of 3–24 hours based on a sequence of past wind measurements made by floating buoys. All of these studies using GP claimed that they could come up with a better estimation model for each domain, but it is time-consuming to derive a proper model, depending on options such as the population, independent variables, and generations.
Thus, this study combined correlation analysis as a statistical method with the GP approach to reduce the number of independent variables, and the resulting approach reduced the calculation time.

3. Weight estimation model using GP

In this study, a weight estimation model was developed using GP. To improve the calculation time, correlation analysis was performed prior to GP. This chapter gives a brief description of each of these analyses.

3.1. Overview

Figure 2 shows an overview of the weight estimation model proposed in this study. First, the past records of FPSO plants were inputted, and the initial variables for weight estimation were selected. Correlation analysis was performed to choose the dependent and independent variables based on any statistical relationship. Here, correlation refers to any of a broad class of statistical relationships involving dependence. Correlation analysis reduces the number of independent variables and thus shortens the calculation duration of GP. Next, GP was performed with the variables chosen in the correlation analysis. GP uses evolutionary-algorithm-based operators such as crossover, mutation, and reproduction. Finally, the output model for weight estimation was obtained. The following sections describe each of these steps in detail.

Figure 2. Overview of the weight estimation model using GP. (This figure is available in colour online.)

A computational program for generating a model for weight estimation was developed in this study through the procedure described in Figure 2. Compared with the conventional program, the program developed in this study can perform the procedure in Figure 2 automatically, and can also derive the model for weight estimation. The program has functions for correlation analysis and GP, and as such, it is necessary to verify its results. For this reason, the results of the correlation analysis in this program were compared with the results obtained from Microsoft Excel, a well-known conventional program. The applications to FPSO plant topsides, which will be discussed in the next chapter, were also verified by comparing them with the results of the nonlinear multiple regression analysis performed in the previous study (Ha et al. 2015).

3.2. Genetic programming

GP was developed by Koza (1992, 1994), with the original idea inspired by evolution to automatically develop computer programs without programming them. Essentially, GP is a set of instructions and a fitness function for measuring how well a computer has performed a task. It is a machine learning technique used to optimise a population of computer programs according to a fitness landscape determined by a program's ability to perform a given computational task (Banzhaf et al. 1997). GP is a specialised genetic algorithm (GA), where each individual is a computer program, and as such, it has many features in common with GA. The main difference between GP and GA is the representation of chromosomes. Table 1 shows the difference between the two in this regard. While GA uses fixed-length-string-based chromosomes, GP uses tree-based chromosomes with variable sizes and shapes. Its tree-based representation makes GP flexible, but unfortunately, it is not very efficient. Thus, GA is used for the task of optimising parameters for solutions when their structure is known, while GP is more often used to learn and discover both the contents and structures of solutions. It has produced many novel and outstanding results in areas such as quantum computing (Spector et al. 1998), electronic design (Koza et al. 1997), game playing (Alhejali and Lucas 2013), sorting (Wagner et al. 2015), and searching (Vidal et al. 2012) due to the improvements in the GP technology and the exponential growth of computing power.

Table 1. Difference between genetic algorithms and genetic programming.

Algorithm | Genetic algorithms (e.g., binary-string coding) | Genetic programming
Expression | Binary string of 0 and 1 | Function
Structure | String, fixed length (e.g., 1010110010101011) | Tree, variable length
Main operator | Crossover | Crossover

The main genetic operators in GP are reproduction, crossover, and mutation, which are similar to those in GA. Figure 3 shows the GP cycle using these operators. They change subtrees in the chromosomes. For instance, the crossover operator exchanges the subtrees of two chromosomes if they can be attached to the opposite tree. The genetic operators in GP change not only the values in the tree but also the structure of the tree. As such, compared with those in GA, the operators in GP affect the individuals more diversely.
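The tree-based chromosome and the subtree crossover described in Section 3.2 can be illustrated with a short Python sketch. It is only an illustration under our own assumptions: trees are encoded as nested tuples, only a few of the functions are included, and the helper names (`evaluate`, `subtrees`, `replace_at`, `crossover`) are hypothetical and do not come from the program developed in this study.

```python
import math
import random

# A chromosome is either a terminal (variable name or constant) or a tuple
# (operator, child, ...). This encoding is an assumption made for illustration.
BINARY = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
UNARY = {"sin": math.sin, "cos": math.cos}

def evaluate(tree, env):
    """Evaluate an expression tree for the variable values given in env."""
    if isinstance(tree, tuple):
        op, *args = tree
        vals = [evaluate(a, env) for a in args]
        return BINARY[op](*vals) if op in BINARY else UNARY[op](*vals)
    return env[tree] if isinstance(tree, str) else tree

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace_at(tree, path, new):
    """Return a copy of tree in which the node at path is replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b, rng):
    """Exchange one randomly chosen subtree of a with one of b."""
    pa, sa = rng.choice(list(subtrees(a)))
    pb, sb = rng.choice(list(subtrees(b)))
    return replace_at(a, pa, sb), replace_at(b, pb, sa)

parent1 = ("+", ("*", "B", "D"), 5.0)   # B * D + 5
parent2 = ("*", 2.0, ("cos", "T"))      # 2 * cos(T)
child1, child2 = crossover(parent1, parent2, random.Random(0))
print(child1, child2)
```

Because whole subtrees are exchanged, the children can differ from their parents in both size and shape, which is the structural flexibility that distinguishes GP from a fixed-length GA string. A practical implementation would also enforce the maximum tree depth after each crossover.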

3.3. GP and correlation analysis

GP is a domain-independent problem solving method, similar to GA. The fact that these stochastic, genetically inspired algorithms perform a global search and are robust can be regarded as both their advantage and their disadvantage, depending on the type of problem being solved (Takač 2003). GP does not know the domain of the problem to be solved and may thus generate an overfitted solution. Another disadvantage of GP is the time required to find a solution. The efficiency of the evaluation function greatly impacts the efficiency of the whole algorithm, and therefore also the application of GP. For this reason, it is important to implement fast evaluation of individuals. To reduce the overfitting problem and to improve the performance of GP, this study used correlation analysis, which can check the dependence among the variables and can reduce the number of independent variables.

In the field of statistics, dependence means any statistical relationship between two random variables or two sets of data, and correlation refers to any of a broad class of statistical relationships involving dependence. There are several correlation coefficients (often denoted as r) that measure the degree of correlation. The most common of these is the Pearson correlation coefficient, which is sensitive only to a linear relationship between two variables (which may exist even if one is a nonlinear function of the other). It is obtained by dividing the covariance of the two variables by the product of their standard deviations, as in Equation (1).

$$ r = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{n \sum X_i^2 - \left(\sum X_i\right)^2} \cdot \sqrt{n \sum Y_i^2 - \left(\sum Y_i\right)^2}} \qquad (1) $$

In Equation (1), r is Pearson's correlation coefficient, X_i is an independent variable, Y_i is the dependent variable, and n is the number of data. The correlation coefficient between two variables indicates the degree of their correlation with each other. Table 2 shows the relationship between variables according to the correlation coefficient.

Table 2. Relationship between variables according to the correlation coefficient.

Correlation coefficient (absolute value) | Relationship
1.0–0.7 | Strong relation
0.69–0.4 | Moderate relation
0.39–0.2 | Weak relation
0.19–0.0 | No relation

Figure 3. Cycle of GP and its main operators: a, reproduction; b, crossover; and c, mutation.

In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true (Goodman 1999). A researcher will often reject the null hypothesis when the p-value turns out to be lower than a certain significance level, often 0.05 or 0.01 (Dallal 2012). Such a result indicates that the observed result would be highly unlikely under the null hypothesis. To calculate the p-value, the following formula can be used:

$$ p\text{-value} = \operatorname{erfc}\left( \frac{1}{2} \ln\frac{1+r}{1-r} \cdot \frac{\sqrt{n-3}}{\sqrt{2}} \right) \qquad (2) $$

In Equation (2), erfc refers to the complementary error function.

4. Application of the weight estimation model to FPSO plant topsides

To examine the applicability of the proposed method for developing the weight estimation model, it was applied to an example of the weight estimation of the topsides of an FPSO plant.

4.1. Collection of past records

In this study, various past records for estimating the weight of an FPSO plant were collected through a literature survey (Kerneur 2010; Clarkson 2012). Table 3 shows these records. In Table 3, L, B, D, T, DWT, S_C, O_P, G_P, W_P, CREW, W_D, and T_LWT are the length, breadth, depth, draft, deadweight, storage capacity, oil production capacity, gas production capacity, water production capacity, complement (crew size), well depth, and light weight of the topsides (simply, topsides weight), respectively. N/C refers to whether the FPSO was newly built or was converted from other ships or offshore structures, and TM/SM indicates whether the FPSO's mooring type is turret mooring or spread mooring. MMBBL means million barrels; MMBOPD refers to million barrels of oil per day; MMCFPD is million cubic feet per day; and MMBWPD refers to million barrels of water per day. To make the simplified model for this FPSO example, 37 records in Table 3 were used as the sample data (training set), and the remaining three records (test set) were used as the validation data for testing the applicability of the model.

4.2. Selection of initial variables

From Table 3, 11 initial variables for estimating the topsides weight (T_LWT) were selected. As independent variables, the principal dimensions (L, B, D, T, and DWT), capacities (S_C, O_P, G_P, and W_P), and others (CREW and W_D) were initially selected, and the dependent variable to be estimated was the topsides weight (T_LWT).

4.3. Generation of weight estimation model using GP

In this section, the generation of a weight estimation model for FPSO plant topsides through GP will be presented.

Table 3. Principal particulars of FPSOs.

FPSO | N/C | TM/SM | Year | L (m) | B (m) | D (m) | T (m) | DWT (ton) | S_C (MMBBL) | O_P (MMBOPD) | G_P (MMCFD) | W_P (MMBWPD) | CREW (person) | W_D (m) | T_LWT (ton)
Captain | N | TM | 1996 | 214.7 | 38 | 23.7 | 18 | 88,326 | 0.56 | 0.06 | 12 | 0.25 | 50 | 105 | 6200
Balder | N | TM | 1996 | 211.1 | 36 | 20.8 | 14 | 88,420 | 0.38 | 0.083 | 39 | 0.085 | 60 | 127 | 3700
Bleo Holm | N | TM | 1997 | 242 | 42 | 21.2 | 14.9 | 119,000 | 0.69 | 0.1 | 58 | 0.135 | 80 | 110 | 9000
Petrojarl Varg | N | TM | 1998 | 214.7 | 38.2 | 22.2 | 16 | 60,000 | 0.42 | 0.057 | 53 | 0.057 | 77 | 84 | 2500
Jotun A | N | TM | 1999 | 232 | 41.5 | 23.5 | 16 | 92,800 | 0.58 | 0.089 | 38 | 0.122 | 60 | 126 | 6000
Northern Endeavour | N | TM | 1999 | 273 | 50 | 28 | 18.8 | 180,000 | 1.4 | 0.003 | 35 | 0.174 | 84 | 366 | 11,000
Asgard A | N | TM | 1999 | 276.4 | 45 | 26.6 | 19.7 | 105,000 | 0.94 | 0.22 | 840 | 0.063 | 116 | 320 | 15,000
Terra Nova | N | TM | 2000 | 292.2 | 45.5 | 28.2 | 20 | 120,000 | 0.96 | 0.125 | 300 | 0.115 | 120 | 95 | 32,000
Girassol | N | SM | 2001 | 300 | 59.6 | 30.5 | 22.8 | 343,000 | 2 | 0.27 | 280 | 0.18 | 140 | 1400 | 23,500
Kizomba A | N | SM | 2004 | 285 | 60 | 32.3 | 24.4 | 340,660 | 2.2 | 0.25 | 400 | 0.525 | 100 | 1180 | 23,000
Kizomba B | N | SM | 2005 | 285 | 60 | 32.3 | 24.4 | 340,660 | 2.2 | 0.25 | 400 | 0.525 | 100 | 1010 | 23,000
Erha | N | SM | 2005 | 296 | 63 | 32.3 | 24 | 375,600 | 2.2 | 0.21 | 340 | 0.15 | 100 | 1220 | 30,000
Dalia | N | SM | 2005 | 312.4 | 60 | 33.2 | 24.3 | 329,000 | 2 | 0.24 | 280 | 0.265 | 190 | 1365 | 30,000
Bonga | N | SM | 2005 | 305.1 | 58 | 32 | 23.4 | 312,500 | 2 | 0.225 | 170 | 0.1 | 70 | 1030 | 22,000
Belanak Natuna | N | SM | 2005 | 285 | 58 | 26 | 16.7 | 210,000 | 1.14 | 0.1 | 500 | 0.06 | 120 | 100 | 25,000
Nganhurra | N | TM | 2006 | 260 | 46 | 25.8 | 18.5 | 142,000 | 0.9 | 0.1 | 80 | 0.1 | 80 | 390 | 8000
Greater Plutonio | N | SM | 2006 | 319 | 58 | 31 | 23.4 | 360,000 | 1.77 | 0.24 | 400 | 0.45 | 120 | 1310 | 24,000
Agbami | N | SM | 2007 | 320 | 58.4 | 32 | 24 | 337,859 | 2.2 | 0.25 | 450 | 0.12 | 100 | 1462 | 35,000
Akpo | N | SM | 2008 | 310 | 61 | 30.5 | 23.5 | 321,000 | 2 | 0.225 | 530 | 0.42 | 240 | 1325 | 37,000
Usan | N | SM | 2011 | 320 | 61 | 32 | 24.7 | 353,200 | 2 | 0.18 | 176.6 | 0.1 | 180 | 750 | 27,700
Petrojarl Foilnaven | C | TM | 1988/1996 | 250.2 | 34 | 19.1 | 12.8 | 43,276 | 0.28 | 0.14 | 100 | 0.12 | 70 | 450 | 4500
Haewene Brim | C | TM | 1996/1997 | 253 | 42 | 23.2 | 15 | 103,000 | 0.6 | 0.07 | 110 | 0.022 | 55 | 85 | 5000
Kuito | C | SM | 1979/1999 | 334.9 | 43.7 | 27.7 | 21.4 | 228,033 | 1.5 | 0.1 | 52 | 0.02 | 76 | 383 | 3500
Perintis | C | TM | 1984/1999 | 245.4 | 39.6 | 20.6 | 14.7 | 94,238 | 0.65 | 0.035 | 100 | 0.018 | 85 | 75 | 1900
Fluminense | C | TM | 1974/2003 | 362 | 60 | 28.3 | 23 | 356,400 | 1.3 | 0.081 | 75 | 0.05 | 100 | 700 | 4500
Searose | C | TM | 1999/2004 | 271.8 | 46 | 26.6 | 18 | 150,000 | 0.94 | 0.13 | 150 | 0.18 | 80 | 120 | 12,000
Mystras | C | SM | 1976/2004 | 271 | 44 | 22.4 | 17 | 138,900 | 1.04 | 0.08 | 85 | 0.032 | 100 | 70 | 5500
Petrobras 43 | C | SM | 1975/2004 | 337 | 54.5 | 27 | 21 | 273,191 | 2 | 0.15 | 162 | 0.2 | 100 | 785 | 14,000
Petrobras 48 | C | SM | 1973/2005 | 328.6 | 54.5 | 27 | 21 | 273,622 | 1 | 0.15 | 210 | 0.251 | 194 | 1035 | 14,000
Global Producer 3 | C | TM | 1999/2006 | 217.2 | 38 | 23 | 17 | 85,943 | 0.45 | 0.1 | 75 | 0.3 | 90 | 113 | 6000
Maersk Ngujjima-Yin | C | TM | 1999/2008 | 332.9 | 58 | 31 | 22.7 | 308,492 | 1.2 | 0.12 | 100 | 0.2 | 80 | 420 | 7000
Cidade de Sao Mateus | C | SM | 1989/2009 | 322.1 | 56 | 29.5 | 19.8 | 276,000 | 0.7 | 0.035 | 353 | 0.01 | 85 | 790 | 20,000
Maersk Peregrino | C | TM | 2008/2010 | 345.2 | 58 | 31 | 20.6 | 277,450 | 1.6 | 0.1 | 13 | 0.023 | 100 | 100 | 6500
Petrobras 57 | C | SM | 1988/2010 | 318.8 | 56 | 29.5 | 19.8 | 255,271 | 1.6 | 0.18 | 71 | 0.232 | 110 | 1260 | 14,500
Pazflor | N | SM | 2011 | 325 | 61 | 32.5 | 25.6 | 320,000 | 1.9 | 0.22 | 150 | 0.382 | 240 | 800 | 32,000
CLOV | N | SM | 2013 | 305 | 61 | 32 | 24 | 350,000 | 2 | 0.22 | 250 | 0.319 | 240 | 1200 | 37,478
Petrobras 63 | C | SM | 1983/2010 | 346.3 | 57.3 | 28.5 | 22.9 | 322,911 | 2 | 0.14 | 35 | 0.325 | 46 | 1200 | 14,000
Skarv | N | TM | 2010 | 295 | 50.6 | 29 | 19.9 | 128,000 | 0.95 | 0.085 | 671 | 0.02 | 126 | 350 | 16,000
OSX 1 | N | TM | 2010 | 271.7 | 46 | 26.6 | 18.2 | 148,192 | 0.95 | 0.08 | 53 | 0.06 | 89 | 134 | 12,000
Glas Dowr | C | TM | 1995/2010 | 242.3 | 42 | 21.1 | 14.9 | 105,000 | 0.66 | 0.06 | 85 | 0.065 | 96 | 350 | 4500

The 37 records in Table 3 were used for generating the model. The generated model can change according to the optimisation parameters of GP, such as the population size and the number of generations. Thus, in this study, case studies were performed and suitable values for each parameter were selected. Table 4 shows examples of the parameters that can be used for developing a weight estimation model for FPSO plant topsides. The maximum depth means the maximum tree depth of the chromosomes in the tree structure. As for the range of constants, only the constants within this range can be chosen during the process. To determine how accurately the estimation model estimates the dependent variable, T_LWT in this example, the fitness of each generation was also calculated during the calculation. This study used the root mean square error (RMSE), which is generally used in statistics, to check the fitness of each generation.

Table 4. Examples of parameters for developing a weight estimation model for FPSO plant topsides using GP.

Parameter | Value
Function | +, −, ×, ÷, sin (sine function), cos (cosine function), exp (exponential), √ (square root)
Terminal (variables) | L, B, D, T, DWT, S_C, O_P, G_P, W_P, CREW, W_D, T_LWT
Population size | 100
Maximum generation | 10,000
Reproduction rate | 0.1
Crossover probability | 0.8
Mutation probability | 0.1
Maximum depth | 4
Range of constants | [−20, 20]

4.3.1. Case study

To choose the proper value for each parameter of GP, some case studies were performed. This section introduces one such case study, in which the effect of the maximum tree depth on the fitness of the final solution was checked. The case studies were performed using the parameter values in Table 5. All the values, except that of the maximum depth, were the same for all the cases.

Table 5. Four cases of parameters for selecting the maximum tree depth.

Parameter | Value
Function | +, −, ×, ÷, sin, cos, exp, √
Terminal (variables) | L, B, D, T, DWT, S_C, O_P, G_P, W_P, CREW, W_D, T_LWT
Population size | 100
Maximum generation | 1000
Reproduction rate | 0.05
Crossover probability | 0.85
Mutation probability | 0.1
Maximum depth | Case 1: 4 / Case 2: 5 / Case 3: 6 / Case 4: 7
Range of constants | [−20, 20]

Figure 4 and Table 6 show the results of the case study for choosing the maximum tree depth. Although they are presented as results of the final model, they actually come from intermediate models obtained through a short optimisation, not from the full optimisation used for the final model. The results show that both the RMSE and the variation improved when the maximum tree depth increased. Both the training and test sets showed this tendency. If the maximum tree depth increases, however, the final model for weight estimation may have a more complicated mathematical expression. The expression for each case is shown in Equations (3)–(6). The expressions were generated by simplifying the final chromosome in tree structure, and as such, they could have some constants outside the range of −20 to 20. Thus, a maximum tree depth of 6 was judged to be a proper compromise between the fitness and the complexity of the expression.

Table 6. Results of the case study for selecting the maximum tree depth.

Training set (37 records) | Case 1 (maximum tree depth: 4) | Case 2 (maximum tree depth: 5) | Case 3 (maximum tree depth: 6) | Case 4 (maximum tree depth: 7)
Error | 4968.20 | 4416.08 | 3802.56 | 3339.75
Variation | 79.60% | 83.22% | 88.05% | 90.78%
Computation time | 67 seconds | 76 seconds | 83 seconds | 98 seconds

Figure 4. Results of the case study for selecting the maximum tree depth: (a) maximum tree depth = 4; (b) maximum tree depth = 5; (c) maximum tree depth = 6; and (d) maximum tree depth = 7. Each panel plots the log of the best fitness against the generation (0 to 1000); the calculation times are 67 s, 76 s, 83 s, and 98 s, respectively. In panel (a), the best fitness of 4968.2028 was found at generation 655.

TLW T = 2.655 · D · SC · cos WP GP · (D − 15.88) + 5158, (3)

TLW T √ = 2.452 SC · GP · eT · cos WP + 4505, (4)

$$ T_{LWT} = 0.01182\left[ D^4 - S_C (23.22 L + DWT \cdot W_P) + B \cdot S_C (T + W_P)(G_P + CREW) \right] + 552.8, \qquad (5) $$

TLW T = 46, 100 − 4647 D + √ 2GP D · CREW − 12.36 · CREW + SC · GP − 20.8 + eSC + − 133.8. WD (6)

4.3.2. Final model generated by GP

The optimisation parameters for GP, which were derived from the case studies, are summarised in Table 4. The maximum tree depth was set to 6, as shown in the previous case study. The values of the other parameters were also chosen by performing case studies for each parameter. Figure 5 shows the results of GP for developing a weight estimation model for FPSO plant topsides. At this time, full optimisation was performed with GP, as opposed to the short optimisation for the case studies. As shown in Figure 5a, the best fitness decreased continuously during the optimisation. The RMSE of the training set with 37 data in Table 3 was 1971.28, and its variation was 96.79%. Additionally, the RMSE of the test set with three data in Table 3 was 1859.92, and its variation was 84.78%. The final expression of the weight estimation model is

TLW T = B · SC √ T GP (B − D) B − D + SC + CREW + 4.439Wcos + P B 0.0101 · OP + D − sin B 24.13·CREW WD sin D 0.0121 − 5256. (7)

Figure 5. Results of GP for developing a weight estimation model for FPSO plant topsides.

The final model has many variables, and as such, it is suspected that this equation has an overfitting problem, which commonly occurs in GP, as mentioned in Section 3.3. Moreover, it is very difficult to understand the relation between the nine variables (B, D, T, S_C, O_P, G_P, W_P, CREW, and W_D) and the dependent variable T_LWT.

4.4. Generation of weight estimation model using GP with correlation analysis

The final expression of the weight estimation model generated by GP in the previous section has too many independent variables for estimating the dependent variable T_LWT, and thus, the model may have an overfitting problem. This problem is caused by applying GP without regard to the dependence among the variables. Thus, this study adopted correlation analysis, which measures the relationship between two variables.

4.4.1. Correlation analysis

Correlation analysis was performed on the independent variables and the dependent variable T_LWT. Table 7 shows the results of the correlation analysis. The results in the table were verified by comparing them with the results obtained from Microsoft Excel. As shown in Table 7, all the independent variables, except L and W_P, were selected. The criteria for the selection were the following: (1) a correlation coefficient (r) of over 0.5; and (2) a p-value of less than 0.15. The general criterion for the p-value is that it should be less than 0.05 or 0.01 (Dallal 2012). A higher value was used here, however, so that the number of independent variables to be included in the weight estimation model would not be too small in this example.

4.4.2. Final model generated by GP

From the correlation analysis, nine variables were selected for GP. The GP parameter values were the same as those in the previous section. The results of the GP including correlation analysis are shown in Figure 6. As shown in Figure 6a, the best fitness decreased continuously during the optimisation. The RMSE of the training set with 37 data in Table 3 was 2258.85, and its variation was 95.78%. Additionally, the RMSE of the test set with three data in Table 3 was 2014.85, and its variation was 82.13%. The final expression of the weight estimation model is

TLW T = 2824 − 0.1489 2GP 0.3158 · DW T + (D − 17.63) (12.99B − D + GP ) × − sin D GP √ sin (OP · WD ) . × cos SC − CREW − sin B + sin D + 0.7049SC + sin WD (8)

This model consists of eight independent variables for estimating the dependent variable T_LWT.

4.4.3. Comparison with the results of GP without correlation analysis

Compared with the result of the GP without correlation analysis, Equation (8) has fewer variables for expressing the dependent variable T_LWT. Table 8 compares the results of the two GP methods. It shows that the GP with correlation analysis is somewhat less accurate than GP only, but the difference between the two GP methods is acceptable when the calculation time is considered.

Table 7. Results of the correlation analysis of the independent variables and of the topsides weight of all the FPSOs. Each cell gives r with the p-value in parentheses; the Criteria column gives the result for r / p-value (O: satisfied, X: not satisfied).

Item | L | B | D | T | DWT | S_C | O_P | G_P | W_P | CREW | W_D | T_LWT | Criteria
L | 1.00 (–) | 0.80 (0.00) | 0.73 (0.00) | 0.75 (0.00) | 0.81 (0.00) | 0.69 (0.00) | 0.39 (0.03) | 0.20 (0.30) | 0.14 (0.47) | 0.39 (0.03) | 0.59 (0.00) | 0.4158 (0.0218) | X / O
B | 0.80 (0.00) | 1.00 (–) | 0.91 (0.00) | 0.88 (0.00) | 0.96 (0.00) | 0.87 (0.00) | 0.62 (0.00) | 0.40 (0.03) | 0.42 (0.02) | 0.55 (0.00) | 0.77 (0.00) | 0.7176 (0.0000) | O / O
D | 0.73 (0.00) | 0.91 (0.00) | 1.00 (–) | 0.95 (0.00) | 0.90 (0.00) | 0.88 (0.00) | 0.68 (0.00) | 0.40 (0.03) | 0.45 (0.01) | 0.51 (0.00) | 0.74 (0.00) | 0.7448 (0.0000) | O / O
T | 0.75 (0.00) | 0.88 (0.00) | 0.95 (0.00) | 1.00 (–) | 0.92 (0.00) | 0.89 (0.00) | 0.71 (0.00) | 0.37 (0.04) | 0.53 (0.00) | 0.55 (0.00) | 0.78 (0.00) | 0.7089 (0.0000) | O / O
DWT | 0.81 (0.00) | 0.96 (0.00) | 0.90 (0.00) | 0.92 (0.00) | 1.00 (–) | 0.90 (0.00) | 0.65 (0.00) | 0.31 (0.09) | 0.45 (0.01) | 0.49 (0.01) | 0.83 (0.00) | 0.6561 (0.0001) | O / O
S_C | 0.69 (0.00) | 0.87 (0.00) | 0.88 (0.00) | 0.89 (0.00) | 0.90 (0.00) | 1.00 (–) | 0.75 (0.00) | 0.35 (0.06) | 0.51 (0.00) | 0.47 (0.01) | 0.80 (0.00) | 0.7104 (0.0000) | O / O
O_P | 0.39 (0.03) | 0.62 (0.00) | 0.68 (0.00) | 0.71 (0.00) | 0.65 (0.00) | 0.75 (0.00) | 1.00 (–) | 0.59 (0.00) | 0.60 (0.00) | 0.53 (0.00) | 0.78 (0.00) | 0.7341 (0.0000) | O / O
G_P | 0.20 (0.30) | 0.40 (0.03) | 0.40 (0.03) | 0.37 (0.04) | 0.31 (0.09) | 0.35 (0.06) | 0.59 (0.00) | 1.00 (–) | 0.25 (0.18) | 0.39 (0.03) | 0.40 (0.03) | 0.6330 (0.0002) | O / O
W_P | 0.14 (0.47) | 0.42 (0.02) | 0.45 (0.01) | 0.53 (0.00) | 0.45 (0.01) | 0.51 (0.00) | 0.60 (0.00) | 0.25 (0.18) | 1.00 (–) | 0.40 (0.03) | 0.56 (0.00) | 0.4648 (0.0093) | X / O
CREW | 0.39 (0.03) | 0.55 (0.00) | 0.51 (0.00) | 0.55 (0.00) | 0.49 (0.01) | 0.47 (0.01) | 0.53 (0.00) | 0.39 (0.03) | 0.40 (0.03) | 1.00 (–) | 0.48 (0.01) | 0.7032 (0.0000) | O / O
W_D | 0.59 (0.00) | 0.77 (0.00) | 0.74 (0.00) | 0.78 (0.00) | 0.83 (0.00) | 0.80 (0.00) | 0.78 (0.00) | 0.40 (0.03) | 0.56 (0.00) | 0.48 (0.01) | 1.00 (–) | 0.6893 (0.0000) | O / O

Table 8. Comparison of the results of the two genetic programming methods.

Item                          Genetic programming   Genetic programming with correlation analysis
No. of independent variables  9                     8
Training set: error           1971.28               2258.85
Training set: variation       96.79%                95.78%
Test set: error               1859.92               2014.85
Test set: variation           84.78%                82.13%
Computation time              2519 seconds          2045 seconds

Figure 6. Results of GP with correlation analysis for developing a weight estimation model for FPSO plant topsides.

Table 9. Difference between the actual and estimated weights of the FPSO topsides – 37 records.

A: actual weight. B: weight estimated by GP with correlation analysis. C: weight estimated by nonlinear model.

Name                     A        B        (A − B)/A   C        (A − C)/A
Captain                  6200     3474     0.44        4138     0.33
Balder                   3700     3586     0.03        3288     0.11
Bleo Holm                9000     4918     0.45        4355     0.52
Petrojarl Varg           2500     5010     −1.00       4478     −0.79
Jotun A                  6000     5910     0.02        4683     0.22
Northern Endeavour       11,000   9323     0.15        9793     0.11
Asgard A                 15,000   15,390   −0.03       25,402   −0.69
Terra Nova               32,000   32,760   −0.02       15,348   0.52
Girassol                 23,500   23,394   0.00        23,018   0.02
Kizomba A                23,000   29,065   −0.26       27,510   −0.20
Kizomba B                23,000   23,363   −0.02       27,510   −0.20
Erha                     30,000   29,209   0.03        26,238   0.13
Dalia                    30,000   33,189   −0.11       29,794   0.01
Bonga                    22,000   21,771   0.01        19,770   0.10
Belanak Natuna           25,000   24,315   0.03        18,416   0.26
Nganhurra                8000     8586     −0.07       7636     0.05
Greater Plutonio         24,000   26,295   −0.10       23,260   0.03
Agbami                   35,000   29,025   0.17        28,263   0.19
Akpo                     37,000   37,222   −0.01       39,247   −0.06
Usan                     27,700   27,458   0.01        25,324   0.09
Petrojarl Foilnaven      4500     4547     −0.01       4012     0.11
Haewene Brim             5000     6362     −0.27       6013     −0.20
Kuito                    3500     7872     −1.25       10,226   −1.92
Perintis                 1900     3166     −0.67       5047     −1.66
Fluminense               4500     4877     −0.08       10,884   −1.42
Searose                  12,000   10,683   0.11        9738     0.19
Mystras                  5500     6029     −0.10       6548     −0.19
Petrobras 43             14,000   13,784   0.02        15,925   −0.14
Petrobras 48             14,000   12,968   0.07        18,113   −0.29
Global Producer 3        6000     7837     −0.31       5630     0.06
Maersk Ngujjima-Yin      7000     7034     0.00        12,949   −0.85
Cidade de Sao Mateus     20,000   19,994   0.00        16,055   0.20
Maersk Peregrino         6500     8381     −0.29       13,293   −1.05
Petrobras 57             14,500   14,869   −0.03       13,488   0.07
Pazflor                  32,000   30,212   0.06        32,336   −0.01
CLOV                     37,478   34,763   0.07        34,763   0.07
Petrobras 63             14,000   9340     0.33        13,487   0.04

4.5. Comparison with the statistical method
In the authors’ previous study (Ha et al. 2015), a simplified model for the weight estimation of FPSO plant topsides was developed using a statistical method. In this section, to confirm the usability of the proposed method, its results are compared with those of that statistical method, namely nonlinear multiple regression analysis.

4.5.1. Nonlinear form of weight estimation model analysed using the statistical method
In brief, the nonlinear multiple regression analysis in the previous study consisted of three analysis steps: variable transformation, correlation analysis, and multiple regression analysis. The whole procedure of nonlinear multiple regression analysis was introduced in detail in the authors’ previous study (Ha et al. 2015). Nonlinear multiple regression analysis was performed with the 37 training sets presented in Table 3. The final nonlinear form of the weight estimation model is

$$TLW_T = -764.8522 + 0.3304\,D^3 + 720.6451\,SC^3 + 21.2018\,GP + 0.0009863\,CREW^3 \tag{9}$$
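Equation (9) can be evaluated directly once the four inputs are known. The following is a minimal sketch that uses the coefficients and exponents exactly as printed in Equation (9). The function name `topsides_weight_eq9` is hypothetical, the arguments are assumed to be expressed in the units of the paper's variable definitions, and the input values are illustrative only, not taken from any FPSO record.

```python
def topsides_weight_eq9(d, sc, gp, crew):
    """Evaluate Equation (9) as printed.

    The arguments correspond to the symbols D, SC, GP and CREW and are
    assumed to be given in the units used by the paper's definitions.
    """
    return (
        -764.8522
        + 0.3304 * d**3
        + 720.6451 * sc**3
        + 21.2018 * gp
        + 0.0009863 * crew**3
    )


# Hypothetical inputs, chosen only to exercise the function.
example = topsides_weight_eq9(d=30.0, sc=2.0, gp=100.0, crew=120)
print(f"estimated topsides weight: {example:.1f}")
```

Because D, SC and CREW enter as powers, the estimate is far more sensitive to changes in those inputs than to changes in GP, which enters linearly.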

The final model in Equation (9) shows that the topsides weight of the FPSO example can be represented as a nonlinear relationship between four independent variables (D, SC, GP, and CREW), with D, SC, and CREW entering as powers. It also satisfied the F-test and t-test criteria. In addition, the adjusted R² of the final regression model is 0.797. The adjusted R² is a statistic that gives some information about the goodness of fit of a model. An adjusted R² value close to 1.0 indicates that the regression line fits the data well.

Table 10. Differences between the actual and estimated weights of the FPSO topsides – three unused records.

A: actual weight. B: weight estimated by GP with correlation analysis. C: weight estimated by nonlinear model.

Name        A        B        (A − B)/A   C        (A − C)/A
Skarv       16,000   18,313   −0.14       24,111   −0.51
OSX 1       12,000   9560     0.20        7891     0.34
Glas Dowr   4500     5434     −0.21       5221     −0.16

4.5.2. Comparison of the results of the two methods
Using the two final models in Equations (8) and (9), the actual and estimated weights of the FPSO topsides were compared for the 37 records; Table 9 shows the differences. For the GP-based model, the average difference between the actual and estimated weights was 7.08%, the coefficient of variation (COV) was 0.324, and the variation explained (R²) was 95.78%. For the statistical-method-based model, the average difference was 16.89%, the COV was 0.566, and the R² was 81.93%. The GP-based model thus gave a smaller average difference and a smaller COV than the statistical-method-based model. The maximum difference was 125% for GP and 192% for the statistical method.

To validate the two models, the actual and estimated weights were also compared for the three unused records that form the test set (the last records shown in Table 3); Table 10 shows the differences. For the GP-based model, the average difference was 4.96%, the COV was 0.221, and the R² was 86.34%. For the statistical-method-based model, the average difference was 10.83%, the COV was 0.427, and the R² was 71.19%. On both the training and the test records, therefore, GP with correlation analysis yielded a better model than the statistical method.
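The text does not spell out how the average difference and the COV are computed. The following is a minimal sketch under two assumptions, both inferred from the tabulated values rather than stated in the paper: the average is taken as the magnitude of the mean signed relative difference, and the reported COV is taken as the sample standard deviation of those differences. The weights are transcribed from Table 10; the function names are hypothetical.

```python
from statistics import mean, stdev

# Actual and estimated topsides weights of the three test records in
# Table 10 (Skarv, OSX 1, Glas Dowr).
ACTUAL = [16000, 12000, 4500]
GP_ESTIMATE = [18313, 9560, 5434]          # column B
NONLINEAR_ESTIMATE = [24111, 7891, 5221]   # column C


def relative_differences(actual, estimated):
    """Signed relative differences (A - X) / A, as in the difference columns."""
    return [(a - x) / a for a, x in zip(actual, estimated)]


def summarise(actual, estimated):
    """Return (|mean difference|, sample standard deviation of the differences).

    Both definitions are assumptions made for this sketch.
    """
    diffs = relative_differences(actual, estimated)
    return abs(mean(diffs)), stdev(diffs)


gp_avg, gp_spread = summarise(ACTUAL, GP_ESTIMATE)
nl_avg, nl_spread = summarise(ACTUAL, NONLINEAR_ESTIMATE)
print(f"GP:        average {gp_avg:.2%}, spread {gp_spread:.3f}")
print(f"Nonlinear: average {nl_avg:.2%}, spread {nl_spread:.3f}")
```

Under these assumptions the Table 10 weights come out close to the test-set figures quoted above; small deviations in the last digit are expected because the tabulated weights are rounded. Note that a mean of signed differences lets over- and under-estimates cancel, so it measures bias rather than typical error size.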

5. Conclusions and future studies
Despite the importance of the topsides weight in the design of an FPSO plant, it has so far been estimated only roughly, using existing similar data and the designer’s experience. To address this problem, a weight estimation model for FPSO plant topsides was developed in this study using an optimisation method, specifically genetic programming (GP). Various past records of FPSO plants were first collected through a literature survey, and analysis using GP was then performed to develop the weight estimation model. To reduce the computation time and to mitigate the overfitting problem, correlation analysis was adopted. A comparative test of the models based on GP and on nonlinear multiple regression analysis was also performed; the GP-based model estimated the weight more accurately than the model based on nonlinear multiple regression analysis. Finally, to evaluate the applicability of the developed models, they were applied to an FPSO example. The results showed that the developed models can be used to estimate the topsides weight of future FPSOs.

The overall performance of the developed models was, however, shown to depend on the past records collected through the literature survey. Thus, if the past records contain noise or wrong information, the applicability and reliability of the models can be reduced. In addition, the engineering meaning of the developed models should be further investigated, and a parametric test for some variables used in the method should be performed to identify their impact on the developed models. In the future, the database of past FPSO records will be continuously updated and made error-free, and the developed models will be improved through their application to various examples.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was partially supported by the Global Leading Technology Program of the Office of Strategic R&D Planning (OSP) funded by the Minister of Trade, Industry & Energy, Korea [100425562012-11]; New & Renewable Energy of the Korea Institute of Energy Technology Evaluation and Planning (KETEP) funded by the Minister of Trade, Industry & Energy, Korea [number 20124030200110]; the BK21 Plus Program (Education and Research Center for Creative Offshore Plant Engineers) funded by the Ministry of Education, Korea; the Engineering Research Institute of Seoul National University, Korea; and the Research Institute of Marine Systems Engineering of Seoul National University, Korea.

References
Alhejali AM, Lucas SM. 2013. Using genetic programming to evolve heuristics for a Monte Carlo Tree Search Ms PacMan agent. Paper presented at: 2013 IEEE conference on Computational Intelligence in Games (CIG); Niagara Falls, ON, Canada.
Banzhaf W, Nordin P, Keller RE, Francone FD. 1997. Genetic programming: an introduction: on the automatic evolution of computer programs and its applications (The Morgan Kaufmann Series in Artificial Intelligence). 1st ed. San Francisco (CA): Morgan Kaufmann.
Bolding A. 2001. Bulk factor method estimates FPSO: topsides weight. Oil Gas J. 99:49–53.
Charhate SB, Deo MC, Londhe SN. 2009. Genetic programming for real-time prediction of offshore wind. Ships Offshore Struct. 4:77–88.
Clarkson. 2012. The mobile offshore production units register 2012. 10th ed. London (UK): Clarkson.
Dallal GE. 2012. The little handbook of statistical practice [Internet]. [cited 2012 December 31]. Available from: http://www.jerrydallal.com/LHSP/LHSP.HTM
Gaur S, Deo MC. 2008. Real-time wave forecasting using genetic programming. Ocean Eng. 35:1166–1172.
Goodman SN. 1999. Toward evidence-based medical statistics. 1: the p value fallacy. Ann Int Med. 130:995–1004.
Ha S, Seo SH, Roh MI, Shin HK. 2015. Simplified nonlinear model for the weight estimation of FPSO plant topside using the statistical method. Ships Offshore Struct. doi:10.1080/17445302.2015.1038870.
Hwang JH. 2013. Selection of optimal liquefaction process system considering offshore module layout for LNG FPSO at FEED stage [Ph.D. thesis]. [Seoul (Korea)]: Seoul National University.
Hussin H, Hashim FM, Mokhtar AA. 2012. Systematic approach to maintainability analysis at operational phase. J Appl Sci. 12:2562–2567.
Kaiser MJ, Snyder BF. 2013. Empirical models of jackup rig lightship displacement. Ships Offshore Struct. 8:468–476.
Kaiser MJ, Snyder BF. 2014. Offshore wind structure weight algorithms. Ships Offshore Struct. 9:551–556.
Kerneur J. 2010. 2010 Worldwide survey of FPSO units. Houston (TX): Offshore Magazine.
Koike K, Minoura M. 2011. Application of a statistical prediction method of ship performance by using onboard measurement data. J Jpn Soc Nav Arch Ocean Eng. 13:51–58.
Koza JR. 1992. Genetic programming: on the programming of computers by means of natural selection. Cambridge (MA): MIT Press.
Koza JR. 1994. Genetic programming. II. Automatic discovery of reusable programs. Cambridge (MA): MIT Press.
Koza JR, Bennett FH, Andre D, Keane MA, Dunlap F. 1997. Automated synthesis of analog electrical circuits by means of genetic programming. IEEE Trans Evol Comput. 1:109–128.
Lee KY, Roh MI, Cho SH. 2001. Multidisciplinary design optimization of mechanical systems using collaborative optimization approach. Int J Veh Des. 25:353–368.
Spector L, Barnum H, Bernstein HJ, Swamy N. 1998. Genetic programming for quantum computers. Genet Program. 365–373.
Takač A. 2003. Genetic programming in data mining: cellular approach [Master’s thesis]. [Bratislava (Slovakia)]: Physics and Informatics Comenius University.
Vidal T, Crainic TG, Gendreau M, Lahrichi N, Rei W. 2012. A hybrid genetic algorithm for multidepot and periodic vehicle routing problems. Oper Res. 60:611–624.
Wagner M, Neumann F, Urli T. 2015. On the performance of different genetic programming approaches for the SORTING problem. Evol Comput. doi:10.1162/EVCO_a_00149.