www.ietdl.org Published in IET Image Processing Received on 8th November 2006 Revised on 18th October 2008 doi: 10.1049/iet-ipr.2007.0213

ISSN 1751-9659

Multi-component image segmentation using a hybrid dynamic genetic algorithm and fuzzy C-means

M. Awad1, K. Chehdi2, A. Nasri3

1 Center for Remote Sensing, National Council for Scientific Research, Beirut, Lebanon
2 TSI2M-IETR-Lannion, Université de Rennes 1 (ENSSAT), Lannion, France
3 Department of Computer Science, American University of Beirut (AUB), Beirut, Lebanon
E-mail: [email protected]

Abstract: Image segmentation is an important task in image analysis and processing. Many of the existing methods for segmenting a multi-component image (satellite or aerial) are very slow and require a priori knowledge of the image that can be difficult to obtain. Furthermore, the success of each of these methods depends on several factors, such as the characteristics of the acquired image, resolution limitations, intensity inhomogeneities and the percentage of imperfections induced by the image acquisition process. Recently, fuzzy C-means (FCM) and genetic algorithms (GA) have been used separately to segment multi-component images, but neither has successfully addressed the above concerns. Here, GA is enhanced using Hill-climbing, randomising and modified mutation operators, leading to what is called the hybrid dynamic genetic algorithm (HDGA). Coupling HDGA and FCM creates an unsupervised segmentation method that successfully segments two types of multi-component images (Landsat ETM+ and IKONOS II). Comparison with four different methods, namely FCM, hybrid genetic algorithm (HGA), self-organizing maps (SOM) and the combination of SOM and HGA (SOM-HGA), reveals that the FCM-HDGA segmentation method gives robust and reliable results and is more time efficient.

1 Introduction

Image segmentation is a central area in image analysis and understanding [1]. It is widely used in higher-level image processing such as object recognition and edge detection. Existing methods segment an image using various criteria, such as grey level, colour or texture. Recently, researchers have investigated the application of genetic algorithms (GA) to the image segmentation problem; see [1–3] for a comprehensive survey. However, GA-based methods are slow and work efficiently only for particular types of images, and few of them can handle multi-component images.

Some of these methods combine GA with fuzzy clustering using a real-coded string of variable length, as in [4]. In this method, the chromosomes consist of cluster values only, and the objective is to obtain an optimal number of clusters. One of the drawbacks of this method is the additional computational cost needed to evaluate the fitness values using the Xie-Beni index [5]. In addition, this method uses a fixed mutation rate with different crossover operators to handle the variable-length strings. Another limitation is that only Indian remote sensing (IRS) multi-component images are used in the experiments, and the results are compared only with fuzzy C-means (FCM) methods.

Other methods use multi-objective GA to segment a multi-component image, such as the one in [6]. Again, this method is applied to one type of multi-component image, IRS images with 5 m resolution. Comparison with other methods showed that segmentation with multi-objective GA has an efficiency of 96%, while those based on single-objective GA and FCM have 82 and 31.2%, respectively. To solve the fuzzy clustering problem, multi-objective GA methods, such as in [7], use real-coded variable-length strings. Two fuzzy cluster validity indices are optimised simultaneously: the Xie-Beni index and the objective function J of FCM. The resultant set of near-Pareto-optimal solutions contains a number of non-dominated solutions. This type of research is very promising; however, convergence to the true (or global) Pareto-optimal front may not occur because of various characteristics of the problem, such as multi-modality, deception, isolated optima and collateral noise. All these features are known to cause difficulty in single-objective GA [8], and the difficulty is more intense in a multi-objective setting. It is known that multi-objective GA is better than single-objective GA if the parameters are well defined. Moreover, there is no attempt to test this method on very high resolution multi-component images, and the efficiency of such methods is assessed by ad hoc means, involving user judgement or index metrics, that are not reliable.

An intelligent genetic algorithm (IGA) is used to segment images in [9]. Basically, the method consists of applying an intelligent crossover (IC) based on orthogonal arrays (OAS). Experimental results showed that the proposed method can suppress Gaussian noise effectively, segment forward looking infra-red (FLIR) images properly, and is faster than the conventional GA and exhaustive search. However, IGA is still considered slow because it uses chromosomes of static size. Moreover, the experiments are conducted on one type of image.

Among the various clustering algorithms, FCM [10, 11] is one of the most popular methods used in image processing, since it is suited to the imprecise and uncertain nature of image data sets such as remote sensing images. The use of FCM clustering to segment a multi-component image into meaningful regions has been reported in the literature [12–14].

© The Institution of Engineering and Technology 2009
IET Image Process., 2009, Vol. 3, Iss. 2, pp. 52–62, doi: 10.1049/iet-ipr.2007.0213
FCM is a powerful tool for solving many clustering problems; its main drawback, however, is the requirement to specify the desired number of clusters beforehand, which may not be feasible. To improve the segmentation process, combinations of the above algorithms have been used. Some of these cooperative methods are unsupervised non-parametric methods, such as the combination of K-means and GA. The K-means method is used to segment an image into regions, while GA is used to control the process of splitting and merging regions to optimise the evaluation function [15]. This method is applied to 1 m high resolution IKONOS satellite images, where the results show that the GA method using the fuzzy-set evaluation function is robust because it is parameter free. However, it remains slow and has not been tested on lower resolution multi-component images. In addition, a combination of GA and other segmentation methods can be found in [16], where a self-organizing map (SOM) cooperates with hybrid GA to segment different types of satellite images. Again, this cooperative method is very slow.

FCM and OAS can also be combined to split and merge regions in the image segmentation process [17]. First, the FCM algorithm measures the amount of fuzziness of the spatial criteria, such as partition coefficient, classification entropy and separation indices. Then, the FCM algorithm is used to split the image into many small regions while measuring the contrast and the compactness between regions. OAS merge two adjacent regions r1 and r2 by measuring the difference between their average grey levels. The smaller the difference, the higher the probability that the regions are merged. However, this method can become trapped in a locally optimal selection.

In this paper, we overcome the above shortcomings by creating a new cooperative unsupervised non-parametric segmentation method. The method combines fuzzy C-means and the hybrid dynamic genetic algorithm (FCM-HDGA), which improves the segmentation process by obtaining the desired number of clusters. In the HDGA, the population consists of chromosomes with a variable number of cluster centres but a fixed string length. Since the image segmentation process can be considered a combinatorial problem [3], different heuristics are implemented to improve the process.

In Section 2, the modified FCM is described. HDGA is discussed and explained in detail in Section 3. In Section 4, the FCM-HDGA cooperative segmentation method is described. Experimental results are discussed in Section 5, the segmentation methods are evaluated in Section 6, and conclusions and recommendations are drawn in Section 7.

2 Modified FCM

The standard FCM clustering algorithm works as follows. Given a set of n data patterns, $x = x_1, \ldots, x_n$, the FCM algorithm minimises the weighted within-group sum of squared errors objective function J(U, V)

$$J(U, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} u_{ik}^{m}\, d^{2}(x_k, v_i) \tag{1}$$

where $x_k$ is the kth p-dimensional data vector, $v_i$ is the prototype of cluster centre i, $u_{ik}$ is the degree of membership of $x_k$ in the ith cluster, and m is a weighting exponent on each fuzzy membership. The function $d_{ik} = d(x_k, v_i)$ is a distance measure between the data vector $x_k$ and the cluster centre $v_i$, n is the number of data vectors and c is the number of clusters. A solution of the objective function J(U, V) can be obtained via an iterative process in which the degree of membership $u_{ik}$ and the cluster centre $v_i$ are updated via (2) and (3), respectively

$$u_{ik} = \frac{1}{\sum_{j=1}^{c} \left( d_{ik} / d_{jk} \right)^{2/(m-1)}} \tag{2}$$

$$v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m}\, x_k}{\sum_{k=1}^{n} u_{ik}^{m}} \tag{3}$$

subject to the following constraints: $u_{ik} \in [0, 1]$, $\sum_{i=1}^{c} u_{ik} = 1 \ \forall k$, and $0 < \sum_{k=1}^{n} u_{ik} < n \ \forall i$.

FCM is used to cluster a multi-component image by obtaining a variable number of cluster centres from the HDGA. This means that cluster centres are not computed or updated using (3), but the degree of membership is still updated using (2). The gain is less calculation and therefore higher speed. Higher accuracy is obtained when the optimal cluster centres are selected. Another substantial reason for using this method is to avoid having to specify the number of clusters, which is required by FCM.

3 Hybrid dynamic genetic algorithm

The optimisation procedure that mimics the process observed in natural evolution is called the GA [18]. The most important characteristic of GA is that it is a global optimisation technique, so it is able to find the global optimum solution without being trapped in local minima [19]. In addition, GA is a search process based on the laws of natural selection and genetics. Usually, a simple GA consists of four operations: Selection, Genetic Operation, Hill-climbing and Replacement (Fig. 1). The genetic operations are crossover (reproduction) and mutation. Replacement is the process of replacing parents with their newly evolved offspring.

The first step is to generate randomly an initial integer population (100 chromosomes in our case), that is, Pop(0) = {Ch_1, Ch_2, ..., Ch_n}, where each chromosome consists of the values of the random cluster centres Pc and the values (three bands) of the pixels px, which are selected from the image. These selected pixels are on the left side of each cluster centre (px_{i,1} px_{i,2} px_{i,3} Pc_j px_{i+1,1} px_{i+1,2} px_{i+1,3} Pc_{j+1}). A maximum number of cluster centres (MaxClus) is selected in the initial population to ensure sufficient diversity between different chromosomes. What makes HDGA different from traditional GA is the ability to use chromosomes with different numbers of clusters. The length of each chromosome is completed by ending it with a number of marks (Fig. 2).

The objective is to minimise the function in (1) and to optimise the number of clusters. The fitness Fit(Ch_i) of each chromosome Ch_i in the current population is the inverse of the objective function (see (4)). According to its fitness value, each chromosome is given a number of chances to be selected by the Roulette wheel selection process, in which chromosomes with lower cost are given more opportunities for mating. One common problem with the Roulette wheel is the possibility of premature convergence to unwanted local optima because of the predominance of fit structures. One way of decreasing the possibility of premature convergence is to keep the number of multiple replications of individuals as small as possible. This can be achieved by a strategy based on the idea that poor solutions require more corrections than better solutions. Thus the fitness value Fit(Ch_i) can be used to determine the mutation rate associated with each chromosome Ch_i. An upper and a lower limit are chosen for the mutation rate, and within those limits the mutation probability Mutrate(Ch_i) of each chromosome Ch_i is calculated according to its fitness value: the higher the fitness value, the lower the mutation rate. Equation (5) gives the mutation rate for each chromosome. In this equation, the fitness value is normalised between 0.0 and 1.0

$$Fit(Ch_i) = 1 / OF \tag{4}$$

$$Mutrate(Ch_i) = (1 - Fit(Ch_i)) \times (mutrate_{max} - mutrate_{min}) + mutrate_{min} \tag{5}$$

The maximum and the minimum mutation rates (mutrate_max and mutrate_min) are specified by the user. Based on previous experiments, we use mutrate_max = 0.15 and mutrate_min = 0.05. Two chromosomes are selected randomly and mated to create two new children. The probability of crossover is 60%, a value chosen after several tests. The newly created children replace their parents, their fitness values are computed again and the Hill-climbing method is applied. Two important remarks apply to this new algorithm:

1. In the mutation operation only cluster centres are mutated.
2. The position of crossover and mutation points should not be on the marks.
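The fitness-dependent mutation rate of (4) and (5) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the min-max scaling used to normalise the fitness to [0, 1] is an assumption, since the text states only that the fitness is normalised.

```python
def fitness(objective_value):
    """Fitness as the inverse of the objective function value, as in (4)."""
    return 1.0 / objective_value


def normalise(values):
    """Scale values to [0, 1]; min-max scaling is an assumption of this sketch."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All chromosomes equally fit: treat them all as fully fit.
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def mutation_rates(objective_values, mut_min=0.05, mut_max=0.15):
    """Per-chromosome mutation rate as in (5): the fitter, the lower the rate."""
    fits = normalise([fitness(of) for of in objective_values])
    return [(1.0 - f) * (mut_max - mut_min) + mut_min for f in fits]


# Three chromosomes with made-up objective function values.
rates = mutation_rates([120.0, 80.0, 200.0])
print(rates)  # roughly [0.106, 0.05, 0.15]
```

The chromosome with the lowest objective value receives the minimum rate and the one with the highest objective value receives the maximum rate, which matches the idea that poor solutions require more corrections.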

Figure 1 HDGA structure

Fig. 3 shows the selection procedure and the results of the crossover operator.

Figure 2 Chromosomes with different lengths representing cluster centres

Figure 3 Selection and crossover operation
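The Roulette wheel selection described in Section 3 can be illustrated with a short Python sketch. The function name and the optional `rng` argument are hypothetical; the sketch only shows the general technique of selecting with a probability proportional to fitness and is not taken from the authors' implementation.

```python
import random


def roulette_select(fitnesses, rng=random):
    """Return the index of one chromosome, chosen with probability
    proportional to its (non-negative) fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0.0, total)
    running = 0.0
    for index, fit in enumerate(fitnesses):
        running += fit
        if pick <= running:
            return index
    # Guard against floating-point round-off at the upper end of the wheel.
    return len(fitnesses) - 1


# A chromosome with three times the fitness is selected about three times as often.
rng = random.Random(0)
counts = [0, 0]
for _ in range(10000):
    counts[roulette_select([1.0, 3.0], rng)] += 1
print(counts)
```

Because fit chromosomes dominate the wheel, repeated selection tends to replicate them, which is the premature-convergence risk that the adaptive mutation rate is meant to reduce.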

The role of the Hill-climbing process is to investigate adjacent points in the search space and to move in the direction giving the greatest increase in fitness. The Hill-climbing process investigates the cluster centres in each chromosome in more detail and repeatedly modifies the chromosome in order to increase its fitness. Hill-climbing is an exploitation technique capable of finding local minima [20]. Another improvement to GA is the introduction of a new heuristic process. This process selects half of the infeasible solutions, that is, solutions whose objective function value (representing the distances between pixels and their cluster centres) is greater than or equal to the maximum unaccepted value. The selected infeasible solutions are then forced to become feasible by changing the values of their cluster centres. Note, however, that these newly feasible chromosomes are not modified by the Hill-climbing process. This process is repeated for a pre-defined number of iterations or until the objective function value of the fittest chromosome no longer changes. Fig. 4 shows the progress of HDGA optimisation until it converges to an optimal solution.

Figure 4 Evolution of HDGA solutions

One of the advantages of using HDGA is the reduction of the time required to find an optimal solution. This is because of the variable length of the chromosomes in terms of the number of cluster centres. In addition, the time consumed in reproduction operations and in computation (objective function, fitness function) is less than that needed in a GA with fixed-length chromosomes. Variable-length chromosomes ensure diversity in the population, which leads more easily to an optimal solution.
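As an illustration of the Hill-climbing idea, the following Python sketch repeatedly moves a set of cluster centres to the neighbouring point with the lowest cost. Everything specific here is an assumption made for the example: the neighbourhood (changing one band of one centre by one grey level), the 0 to 255 value range, and the toy cost function based on the nearest centre, which stands in for the FCM objective in (1).

```python
def hill_climb(centres, cost, step=1, max_iter=1000):
    """Steepest-descent hill-climbing over integer cluster centres.

    centres: list of per-band integer tuples; cost: callable to minimise.
    """
    current = [list(c) for c in centres]
    current_cost = cost(current)
    for _ in range(max_iter):
        best_nb, best_cost = None, current_cost
        for i in range(len(current)):
            for band in range(len(current[i])):
                for delta in (-step, step):
                    nb = [c[:] for c in current]
                    nb[i][band] = min(255, max(0, nb[i][band] + delta))
                    nb_cost = cost(nb)
                    if nb_cost < best_cost:
                        best_nb, best_cost = nb, nb_cost
        if best_nb is None:  # no neighbour improves: local optimum reached
            break
        current, current_cost = best_nb, best_cost
    return current, current_cost


def nearest_centre_cost(pixels):
    """Toy cost: sum of squared distances from each pixel to its nearest centre."""
    def cost(centres):
        return sum(
            min(sum((p - c) ** 2 for p, c in zip(px, ctr)) for ctr in centres)
            for px in pixels
        )
    return cost


pixels = [(10, 10, 10), (12, 10, 10), (200, 200, 200)]
start = [(8, 10, 10), (198, 200, 200)]
best, best_cost = hill_climb(start, nearest_centre_cost(pixels))
print(best, best_cost)
```

Since lower cost corresponds to higher fitness under (4), moving to the lowest-cost neighbour is the same as moving in the direction of the greatest fitness increase.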

4 FCM-HDGA cooperative segmentation method

FCM and HDGA cooperate as follows. FCM clusters the image, while HDGA finds the best combination of cluster centres that minimises the objective function of FCM (see (1)). In other words, the work of finding and then updating the cluster centres is taken over by HDGA. A further advantage of the FCM-HDGA cooperation is that if HDGA fails, FCM continues with the best cluster centres provided by HDGA so far. In this case, the segmentation result will not be of the same quality as that obtained when HDGA continues until it finds an optimal solution. The cooperative method consists of many processes (Fig. 5). For a better understanding of this new method, the pseudo code in Fig. 6 provides the steps involved in detail.

Figure 6 Pseudo code of HDGA

FCM and HDGA work in sequence. First, HDGA finds an optimal solution by going through all its processes (initial population, selection, reproduction, mutation, etc.). The solutions are provided to FCM, which in turn evaluates them and provides feedback to HDGA. If the FCM solution is stable and there is no further change in the result, HDGA terminates and FCM creates the new segmented image from the final clusters. The process can also be conducted in parallel, where the communication between HDGA and FCM is done by sending message tokens. If no termination token is sent by FCM, HDGA keeps running and the solutions are saved in a dynamic list. At the end, another process extracted from a known model is applied to verify the results; this is discussed in Section 6. The advantages of the new cooperative method include improved reliability and efficiency of HDGA and an increase in the speed of the clustering process.
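The FCM side of the cooperation, in which cluster centres are supplied externally and only the memberships in (2) and the objective in (1) are evaluated, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the Euclidean distance is assumed as the distance measure, m = 2 is an assumed default, and a pixel that coincides with a centre is given full membership of that centre to avoid division by zero.

```python
import math


def memberships(pixels, centres, m=2.0):
    """Membership u[i][k] of pixel k in cluster i for fixed centres, as in (2)."""
    c = len(centres)
    u = [[0.0] * len(pixels) for _ in range(c)]
    for k, x in enumerate(pixels):
        d = [math.dist(x, v) for v in centres]
        zeros = [i for i, di in enumerate(d) if di == 0.0]
        if zeros:
            # Pixel coincides with one or more centres: share membership among them.
            for i in zeros:
                u[i][k] = 1.0 / len(zeros)
            continue
        for i in range(c):
            u[i][k] = 1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
    return u


def objective(pixels, centres, u, m=2.0):
    """Objective J(U, V) as in (1), using the squared Euclidean distance."""
    return sum(
        u[i][k] ** m * math.dist(x, v) ** 2
        for k, x in enumerate(pixels)
        for i, v in enumerate(centres)
    )


def label_image(u):
    """Assign each pixel to the cluster with the highest membership."""
    return [max(range(len(u)), key=lambda i: u[i][k]) for k in range(len(u[0]))]


# Three-band pixels and two candidate centres, as a chromosome might supply them.
pixels = [(0, 0, 0), (10, 0, 0), (4, 0, 0)]
centres = [(0, 0, 0), (10, 0, 0)]
u = memberships(pixels, centres)
print(label_image(u), objective(pixels, centres, u))
```

In such a scheme the value returned by `objective` is what a GA would feed into the fitness in (4), so each candidate set of centres costs one membership pass instead of a full iterative FCM run.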

Figure 5 Flowchart of FCM-HDGA cooperative segmentation method

5 Experiments

In order to assess the efficiency of the new cooperative method, it is compared with four implemented segmentation methods: SOM, FCM, HGA and SOM-HGA. This is done using two different types of satellite images, one of low resolution (30 m) and one of very high resolution (1 m), from two different satellite platforms, Landsat ETM+ and IKONOS. In all the experiments, the size of the images is 480 × 480 pixels. The initial number of iterations for SOM and GA is 1000 and 30, respectively, in both the simple and the cooperative methods. SOM is composed of a two-dimensional network of 16 × 16 cluster units or neurons. Each neuron is associated with the three layers of the multi-component image. In addition, the crossover rate is the same for HGA, SOM-HGA and FCM-HDGA. FCM is a semi-supervised method, since the number of clusters must be provided a priori. To solve this problem, FCM-HDGA is run first and the number of clusters obtained by this method is provided to FCM.

In Experiment 1, the above methods are used to segment the Landsat ETM+ image. The original image and the results are shown in Figs. 7a–7f. In Experiment 2, the same methods are used to segment an IKONOS satellite image. This image consists of three spectral bands, pan-sharpened with a high-resolution panchromatic band. The original image and the results are shown in Figs. 8a–8f. Figs. 9a–9e show samples of four classes from the IKONOS satellite image segmented by the different segmentation methods.

It can be seen from the above figures that SOM produces a result with one very big class and three very small classes, which indicates that SOM is not a clustering technique and that it suffers from under-segmentation. FCM, on the other hand, is better than SOM, but the number of clusters must be provided a priori; in this experiment the number of classes is determined to be 6. HGA works better than SOM, but provides only 5 classes. Similarly, SOM-HGA provides 5 classes because it is affected by the SOM process. Finally, the FCM-HDGA segmentation method provides 6 classes without any a priori information. In order to show that FCM-HDGA is better, more efficient and more robust than the previous segmentation methods, field surveys are done and the results are presented in the next section.

Figure 7 Various segmentations of the Landsat ETM+ image: (a) original image, (b) HGA, (c) FCM, (d) SOM, (e) SOM-HGA, (f) FCM-HDGA

Figure 8 Segmentation of the IKONOS image: (a) original image, (b) HGA, (c) FCM, (d) SOM, (e) SOM-HGA, (f) FCM-HDGA

6 Evaluation of the segmentation methods

The field survey is done by taking several samples from each type of segmented image. In addition, a segmentation operator (OR), which is an essential part of the functional model (FM) [21], is used. The core of this method consists of five elementary blocks: Measure, Criterion, Control, Modification and Stop. The segmentation process is achieved through one or more iterations of these blocks. In the Measure block, sample windows of size W × W are selected from each segmented image and the variance is computed for each of these windows. This is part of checking region homogeneity. The Criterion block receives all the measures from the Measure block and builds a scalar criterion $C^{k}(n) = f(F^{k}(i, n))$ for each window n. The criterion is computed as

$$C^{k}(n) = \sum_{i=1}^{M} w^{k}(i, n)\, F^{k}(i, n) \tag{6}$$

where M is the number of scalar measures and n is the window number at iteration k (here M is equal to one because only the variance is computed). The weight $w^{k}(i, n)$ in this evaluation is set to 1. The scalar criterion is used to identify significant changes in the segmentation result from one iteration to the next. Note that, for most of the segmentation methods used in this research, there is barely any difference in the results from one iteration to the next. As such, a slight modification is introduced to evaluate the changes after a number of iterations. For instance, from iteration k − h to iteration k, the control value must decrease or increase as the segmentation map approaches good results. The Control block takes the criterion value as input and produces the control value $E^{k}(n)$ (see (7))

$$E^{k}(n) = \frac{C^{k}(n) - C^{k-h}(n)}{C^{k}(n) + C^{k-h}(n)} \tag{7}$$

The value is normalised between −1 and 1, where a positive number means that the region must be modified to reach high quality, and 0 means that the region has reached the required quality. A negative value means that the region has been modified beyond the expected quality. Table 1 shows the control block values after a certain number of iterations. The Landsat ETM+ image is used to demonstrate the effectiveness of FCM-HDGA segmentation.
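The Measure, Criterion and Control computations in (6) and (7) can be sketched in Python as follows. The function names are hypothetical, the population variance is assumed as the window measure, and returning 0 when both criterion values vanish is an assumption added to avoid division by zero; none of these details is specified in the text.

```python
def variance(window):
    """Measure block: population variance of the pixel values in one window."""
    n = len(window)
    mean = sum(window) / n
    return sum((v - mean) ** 2 for v in window) / n


def criterion(measures, weights=None):
    """Criterion block: weighted sum of the M scalar measures, as in (6)."""
    if weights is None:
        weights = [1.0] * len(measures)  # weights set to 1, as in the evaluation
    return sum(w * f for w, f in zip(weights, measures))


def control(c_now, c_prev):
    """Control block: normalised change between iterations k-h and k, as in (7)."""
    total = c_now + c_prev
    if total == 0:
        return 0.0  # assumption: no measurable variance means no change
    return (c_now - c_prev) / total


# The same window sampled at iteration k-h and at iteration k (made-up values).
window_prev = [10, 12, 14, 16]
window_now = [12, 13, 13, 14]
c_prev = criterion([variance(window_prev)])
c_now = criterion([variance(window_now)])
print(control(c_now, c_prev))
```

With non-negative criteria the result always lies between −1 and 1, and it is 0 when the criterion is unchanged between the two iterations.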

Figure 9 Classes obtained by various segmentation methods: (a) SOM, (b) FCM, (c) HGA, (d) FCM-HDGA, (e) SOM-HGA

Table 1 shows that SOM is the worst, as it suffers from under-segmentation, which improves with further iterations. In addition, it shows that HGA is better than FCM in the first 50 iterations, although both suffer from over-segmentation. SOM-HGA has a very low rate of under-segmentation compared to FCM-HDGA. Finally, FCM-HDGA shows good performance because its rate of under-segmentation is very low. It is up to the experts to decide whether a low rate of over-segmentation or of under-segmentation is preferable. Most of the time, over-segmentation requires further processing to aggregate small clusters. For further investigation, and in order to show that FCM-HDGA is better, the evaluation method is also used to evaluate the results of the IKONOS image segmentation by the different methods (Table 2).

Table 1 Control values after a certain number of iterations

Iterations   FCM      SOM      HGA      SOM-HGA   FCM-HDGA
50           -0.44    0.63     -0.26    0.33      0.32
100          0        0.99     0        0.0       0
250          0        0        0        0.0       0
1000         0        0        0        0.0       0

Table 2 Control values after a certain number of iterations

Iterations   FCM      SOM      HGA      SOM-HGA   FCM-HDGA
50           -0.5     0.62     -0.28    0.4       0.37
100          0.0019   0.002    -0.01    -0.06     -0.05
250          -0.01    -0.017   -0.02    0         0
1000         0        -0.02    -0.15    0         0

The results in Table 2 show that the worst under-segmentation is obtained by SOM, and the worst over-segmentation by FCM. HGA has the lowest rate of over-segmentation; however, after 1000 iterations over-segmentation is still dominant. On the other hand, the results of FCM-HDGA and SOM-HGA improve after 250 iterations and reach a stable state. Field verification is another way to demonstrate the efficiency of FCM-HDGA. For this reason three classes are selected

for verification: (i) urban (black colour), (ii) bare land (dark grey), (iii) agriculture (grey). Around 74 samples are taken from each segmented image. Tables 3–7 show the confusion matrices for the results of the five segmentation methods: HGA, FCM, SOM, FCM-HDGA, and SOM-HGA, all applied to the Landsat ETM+ image.

Table 3 HGA confusion matrix

Ground truth classes    1    2    3    Total
1. urban                20   5    0    25
2. bare land            0    19   3    22
3. agriculture          0    1    26   27
total                   20   25   29   74

Table 4 FCM confusion matrix

Ground truth classes    1    2    3    Total
1. urban                19   2    4    25
2. bare land            1    18   3    22
3. agriculture          2    1    24   27
total                   22   21   31   74

Table 5 SOM confusion matrix

Ground truth classes    1    2    3    Total
1. urban                17   6    2    25
2. bare land            1    15   6    22
3. agriculture          0    9    18   27
total                   18   30   26   74

Table 6 FCM-HDGA confusion matrix

Ground truth classes    1    2    3    Total
1. urban                23   2    0    25
2. bare land            0    22   0    22
3. agriculture          0    0    27   27
total                   23   24   27   74

Table 7 SOM-HGA confusion matrix

Ground truth classes    1    2    3    Total
1. urban                21   4    0    25
2. bare land            0    22   0    22
3. agriculture          0    0    27   27
total                   21   26   27   74

Table 8 HGA confusion matrix

Ground truth classes    1    2    3    4    Total
1. vegetation type I    23   3    0    0    26
2. urban                3    19   0    2    24
3. soil                 1    3    27   0    31
4. vegetation type II   0    1    1    17   19
total                   27   26   28   19   100

Table 9 FCM confusion matrix

Ground truth classes    1    2    3    4    Total
1. vegetation type I    21   2    2    1    26
2. urban                1    20   2    1    24
3. soil                 3    2    25   1    31
4. vegetation type II   2    2    0    15   19
total                   27   26   29   18   100

Table 10 SOM confusion matrix

Ground truth classes    1    2    3    4    Total
1. vegetation type I    15   9    2    0    26
2. urban                4    15   3    2    24
3. soil                 0    8    20   3    31
4. vegetation type II   1    6    2    10   19
total                   22   35   27   16   100

Table 11 FCM-HDGA confusion matrix

Ground truth classes    1    2    3    4    Total
1. vegetation type I    22   2    1    1    26
2. urban                0    21   0    3    24
3. soil                 0    2    29   0    31
4. vegetation type II   0    1    0    18   19
total                   22   26   30   22   100

Table 12 SOM-HGA confusion matrix

Ground truth classes    1    2    3    4    Total
1. vegetation type I    21   2    2    1    26
2. urban                2    21   0    1    24
3. soil                 1    1    29   0    31
4. vegetation type II   1    1    0    17   19
total                   25   26   30   19   100
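The overall accuracy of a segmentation method can be obtained from its confusion matrix as the number of samples on the diagonal divided by the total number of samples. The Python sketch below applies this standard definition to the values of Tables 5 and 6; the function and variable names are hypothetical, and the assumption is that the diagonal entries count the correctly labelled samples.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Class order: urban, bare land, agriculture (Landsat ETM+ image, 74 samples).
fcm_hdga_landsat = [
    [23, 2, 0],
    [0, 22, 0],
    [0, 0, 27],
]
som_landsat = [
    [17, 6, 2],
    [1, 15, 6],
    [0, 9, 18],
]

print(overall_accuracy(fcm_hdga_landsat))  # 72 of 74 samples, about 0.973
print(overall_accuracy(som_landsat))       # 50 of 74 samples, about 0.676
```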

Figure 10 Time complexity analysis: (a) with respect to the increase in space use, (b) with respect to the increase in iteration number

From the confusion matrices, the highest accuracy rate is 97%, which is obtained by FCM-HDGA. The second highest accuracy is 94%, obtained by SOM-HGA, followed by HGA with a rate of 87%, then FCM with a rate of 82%, and the lowest accuracy rate is 68% for SOM. The results of IKONOS image segmentation by the different methods were also field-verified, and the confusion matrices are shown in Tables 8–12. Again from the confusion matrices, one can conclude that the accuracy rates for SOM, FCM, HGA, SOM-HGA, and FCM-HDGA are 60, 81, 86, 88 and 90%, respectively.

The second comparison between FCM-HDGA and the other four methods is based on time complexity analysis. The term 'time complexity' usually refers to the upper bound of the asymptotic computational complexity of an algorithm or a problem, which is usually written in terms of the 'big Oh' notation. Let $f: \mathbb{N} \to \mathbb{R}^{\ge 0}$ be an arbitrary function from the natural numbers to the non-negative real numbers. Denote by $O(f(n))$ the set of all functions $t: \mathbb{N} \to \mathbb{R}^{\ge 0}$ such that $t(n) \le \delta f(n)$ for all $n \ge n_0$, where $\delta$ is a positive real constant and $n_0$ is the threshold. In other words

$$O(f(n)) = \{\, t: \mathbb{N} \to \mathbb{R}^{\ge 0} \mid (\exists \delta \in \mathbb{R}^{+})(\exists n_0 \in \mathbb{N})(\forall n \ge n_0)\,[\,t(n) \le \delta f(n)\,] \,\}$$

The maximum rule says that $O(f(n) + g(n)) = O(\max(f(n), g(n)))$.

SOM has a time complexity of the order O(N × M × Itr), where N and M are the dimensions of the self-organising map grid and Itr is the iteration number. HGA, which includes the Hill-climbing process, has a time complexity of the order max(O(Pop, Ch^2), O(Pop × Ch^2 × Gen)), where Gen is the number of generations, Ch is the size of the chromosome, and Pop is the size of the population. According to the maximum rule, SOM-HGA, which combines both SOM and HGA, is the slowest, with a time complexity of the order max(O(N × M × Itr), O(Pop × Ch^2 × Gen)). HGA is the next slowest segmentation method, followed by SOM. FCM is faster than the previous three methods, with a time complexity of the order O(cn^2), where c is the number of clusters and n is the size of the data. In FCM-HDGA, the estimation of cluster values is taken over by the HDGA heuristic process, which reduces the time complexity of FCM to O(cn). In addition, HDGA is of the order max(O(Pop), O(Pop × Gen)). This is because the number of cluster centres in the chromosomes is variable and small; Ch is considered small and negligible (of the order of 3) compared to the size of the chromosome in HGA (of the order of the size of the image). This means that FCM-HDGA is the fastest, because its time complexity is of the order max(O(cn), O(Pop × Gen)). To illustrate the above analysis, the graphs in Figs. 10a and 10b show the time complexity of each algorithm against the increase in the number of iterations and in space volume.

7 Conclusion

A new segmentation method is introduced in which FCM cooperates with HDGA. Comparisons of this method with four different segmentation methods, namely hybrid genetic algorithm (HGA), SOM, FCM, and SOM-HGA, reveal that FCM-HDGA has a better performance.

The evaluation of the final results is carried out using a quantitative metric method, based on the variance, that measures the progress of the segmentation methods from one iteration number to another. The results show that HGA and FCM-HDGA are comparable and considered the best. However, the performance of HGA gets worse as the number of iterations increases, whereas FCM-HDGA improves with further iterations. Field-based surveys and verifications are conducted by collecting several samples from the result of each segmentation method. The surveys show that FCM-HDGA is the best segmentation method, faster than the other four methods, and with more than 97% segmentation accuracy. This is mainly because of the variability in the size of the chromosome in HDGA and the elimination of the cluster centre computation in FCM. The improvement of FCM-HDGA to accept more than one image as input is the subject of future research.

8 References

[1] LO BOSCO G.: ‘A genetic algorithm for image segmentation’. Proc. 11th International Conf. Image Analysis and Processing (ICIAP’01), Palermo, Italy, 2001

[2] LAI C., CHANG C.: ‘A hierarchical genetic algorithm based approach for image segmentation’. Proc. IEEE International Conf. Networking, Sensing & Control, Taiwan, 2004, pp. 1284–1288

[3] RAMOS V., MUGE F.: ‘Image color segmentation by genetic algorithms’. Proc. of the 11th Portuguese Conference on Pattern Recognition RecPad 2000, Portugal, 2000, pp. 125–129

[4] MAULIK U., BANDYOPADHYAY S.: ‘Fuzzy partitioning using a real-coded variable-length genetic algorithm for pixel classification’, IEEE Trans. Geosci. Remote Sens., 2003, 41, (5), pp. 1075–1081

[5] XIE X., BENI G.: ‘A validity measure for fuzzy clustering’, IEEE Trans. Pattern Anal. Mach. Intell., 1991, 13, pp. 841–847

[6] MUKHOPADHYAY A., BANDYOPADHYAY S., MAULIK U.: ‘Clustering using multi-objective genetic algorithm and its application to image segmentation’. Proc. IEEE International Conf. Systems, Man and Cybernetics (ICSMC ’06), Thailand, 2006, pp. 2678–2683

[7] BANDYOPADHYAY S., MAULIK U., MUKHOPADHYAY A.: ‘Multiobjective genetic clustering for pixel classification in remote sensing imagery’, IEEE Trans. Geosci. Remote Sens., 2007, 45, (5), pp. 1506–1511

[8] DEB K., HORN J., GOLDBERG E.: ‘Multi-modal deceptive functions’, Complex Syst., 1993, 7, pp. 131–153

[9] WU J., LI J., LIU J., TIAN J.: ‘Infrared image segmentation via intelligent genetic algorithm based on 2-D maximum fuzzy entropy’. Proc. 2004 International Conf. (CDIC ’04), China, 2004, pp. 414–422

[10] BEZDEK J., EHRLICH R., FULL W.: ‘FCM: the fuzzy-C-means clustering algorithm’, Comput. Geosci., 1984, 10, (2–3), pp. 191–230

[11] PARK S., YUN I., LEE S.: ‘Color image segmentation based on 3-D clustering: morphological approach’, Pattern Recognit., 1998, 21, (8), pp. 1061–1076

[12] JIU-LUN F., WEN-ZHI Z., WEI-XIN X.: ‘Suppressed fuzzy C-means clustering algorithm’, Pattern Recognit. Lett., 2003, 24, pp. 1607–1612

[13] LIU H., LI J., CHAPMAN M.: ‘Automated road extraction from satellite imagery using hybrid genetic algorithms and cluster analysis’, J. Environ. Inform., 2003, 1, (2), pp. 40–47

[14] NOORDAM J., VAN DEN BROEK W., BUYDENS L.: ‘Geometrically guided fuzzy C-means clustering for multivariate image segmentation’. Proc. International Conf. Pattern Recognition (ICPR 2000), Barcelona, Spain, 2000, pp. 462–465

[15] XIAOYING J., DAVIS C.: ‘A genetic image segmentation algorithm with a fuzzy-based evaluation function’. Proc. IEEE International Conf. Fuzzy Systems, USA, 2003, pp. 938–943

[16] AWAD M., CHEHDI K., NASRI A.: ‘Multi-component image segmentation using SOM-HGA’, IEEE Geosci. Remote Sens. Lett., 2007, 4, (4), pp. 571–575

[17] HO S., LEE K.: ‘Efficient image segmentation using generic and non parametric approach’. Proc. Fourth International Conf. High Performance Computing in Asia-Pacific Region, Beijing, China, 2000

[18] HOLLAND J.: ‘Adaptation in natural and artificial systems’ (Ann Arbor University of Michigan Press, 1975)

[19] NG S., LEUNG S., CHUNG C., LUK A., LAU W.: ‘The genetic search approach – a new learning algorithm for adaptive IIR filtering’, IEEE Signal Process. Mag., 1996, pp. 38–46

[20] HAUPT R., HAUPT S.: ‘Practical genetic algorithm’ (Wiley Press, 2004, 2nd edn.), p. 253

[21] ZOUAGUI H., BENOIT-CATTIN H., ODET C.: ‘Image segmentation model’, Pattern Recognit., 2004, 37, pp. 1785–1795