January, 2016
Int J Agric & Biol Eng
Open Access at http://www.ijabe.org
Vol. 9 No.1
79
Object-based classification approach for greenhouse mapping using Landsat-8 imagery
Wu Chaofan, Deng Jinsong, Wang Ke*, Ma Ligang, Amir Reza Shah Tahmassebi
(Institute of Applied Remote Sensing & Information Technology, Zhejiang University, Hangzhou 310058, China)
Abstract: Suburban greenhouses with intensive agricultural productivity have increasingly influenced the daily diet and vegetable supply in Chinese cities. With their enormous input of fertilizers and pesticides, greenhouses have considerably changed the local soil quality and environmental risk factors. The ability to obtain timely and accurate information regarding the spatial distribution of greenhouses could make an important contribution to local agricultural management and soil protection. This paper attempts to present a practical framework for extracting suburban greenhouses, integrating remote sensing data from Landsat-8 and object-oriented classification. Inheritance classification was implemented, and various properties, including texture and neighborhood features in addition to spectral information, were investigated through the popular random forest technique for feature selection prior to SVM classification to improve the mapping accuracy. The results demonstrated that object-based classification incorporating non-spectral features yielded a significant improvement compared with the classification results obtained using only the spectral information in traditional per-pixel classification. Both the producer’s and user’s accuracy were higher than 85% for greenhouse identification. Although it remained a challenge to completely distinguish greenhouses from sparse plants, the final greenhouse map indicated that the proposed object-based classification scheme, providing multiple feature selections and multi-scale analysis, yielded worthwhile information when applied to a continuous series of the freely available Landsat-8 imagery data.
Keywords: greenhouse, mapping, Landsat-8, object-based classification, feature selection, multi-scale
DOI: 10.3965/j.ijabe.20160901.1414
Citation: Wu C F, Deng J S, Wang K, Ma L G, Tahmassebi A R S. Object-based classification approach for greenhouse mapping using Landsat-8 imagery. Int J Agric & Biol Eng, 2016; 9(1): 79-88.
Received date: 2014-11-06 Accepted date: 2015-11-10
Biographies: Wu Chaofan, PhD candidate, Research interests: remote sensing image classification and ecological applications, Email: [email protected]; Deng Jinsong, PhD, Associate Professor, Research interests: application of remote sensing and GIS, Email: [email protected]; Ma Ligang, PhD, Lecturer, Research interests: remote sensing image classification, Email: [email protected]; Amir Reza Shah Tahmassebi, Postdoctoral researcher, Research interest: impervious surfaces, Email: [email protected].
*Corresponding author: Wang Ke, PhD, Professor, Research interests: the application of remote sensing and GIS as well as land use planning. Address: 866 Yuhangtang Road, Hangzhou, China. Tel.: +86-571-88982272, Email: [email protected].
1 Introduction
Beginning in the 1970s, the greenhouse has a history of more than 40 years of rapidly increasing use in China. Greenhouses appear most commonly in suburban districts as a result of rapid urbanization and the urban population explosion, modifying the characteristics of seasonal agricultural production, reshaping the landscape, and even changing the local climate. As reported by the national State Statistics Bureau, the entire country of China contained 81 000 hm² of greenhouses as of 2006. The total greenhouse area worldwide reached 365 760 hm² in 2010, of which China accounts for 42.8%. The remarkable increase in the use of greenhouses reflects the development of modern agriculture, and the increasing rates of greenhouse production are gradually changing the daily lives of inhabitants. Simultaneously, the expansion of greenhouses is exerting controversial effects on the environment, such as soil degradation[1] and vegetable and plastic waste[2].
Furthermore, the greenhouse poses new challenges in land-use planning, as greenhouse regions can be confused with construction lands in certain cases. As a result, a reliable method for determining the number and spatial distribution of greenhouses from remote sensing imagery would contribute to land-use planning and agricultural management.
As a prompt and effective technique, remote sensing is playing an increasingly important role in land-use mapping. Among the operating remote sensing satellites, the Landsat satellites (NASA, National Aeronautics and Space Administration) have produced a series of images for more than 40 years that are widely applied in land cover classification[3-7]. To address the special case of greenhouses, Li et al.[8] created a greenhouse index for the extraction of greenhouses used as vegetable fields from TM (Thematic Mapper) images in 2004. Ma[9] used Landsat 5 TM data combined with additional information to perform an SVM (Support Vector Machine) classification of vegetable greenhouses. Both studies were conducted based on images of 30-meter resolution using the traditional per-pixel classification approach, and they achieved reasonable accuracy using Landsat data, which encouraged further research into applications using 15 m fused imagery.
An increasing number of studies worldwide are being conducted on the topic of greenhouse identification via remote sensing based on multiple types of imagery in addition to Landsat data. Carvajal et al.[10] compared different high-resolution satellite images (e.g., QuickBird and IKONOS) in a study of greenhouse detection in Southeastern Spain. Dilek Koc-San et al. compared the application of different classification techniques to WorldView-2 satellite imagery for the detection and discrimination of plastic and glass greenhouses[11]. An object-based classification scheme was applied by Tarantino et al.[12] to identify plastic-covered vineyards from true-color aerial data. Agüera performed greenhouse delineation through maximum likelihood classification and completed the extraction and classification of homogeneous objects combined with calibration and pseudo-calibration using images from the QuickBird and IKONOS satellites[13].
On the one hand, all of these greenhouse detection studies were generally conducted on high-resolution images, providing many object details. On the other hand, such images are acquired at higher cost, offer narrower spatial coverage and are less readily available than the latest Landsat-8 imagery, which offers a 15 m panchromatic band, is currently freely downloadable as a continuous record of 41 years of earth observations and offers novel opportunities for classification[14], especially for developing countries with rapid greenhouse growth, such as China.
In recent decades, improvements in the resolution of satellite images as well as the popularization and advancement of software have made object-based classification a priority[15]. Compared with traditional per-pixel classification, in which different objects are predominantly classified based on spectral features, object-oriented classification offers the following advantages: the classification is based on objects represented by combinations of several similar pixels rather than on single pixels to avoid the salt-and-pepper effect; instead of a single scale, multiple scales of vertically connected (super-objects and sub-objects) and horizontally connected (neighbor objects) inheritance relationships can be used to optimize the classification process; and the spatial relationships, textural properties and contextual information of objects in addition to the traditional spectral characteristics are all attractive features for classification[16,17].
A multitude of papers have utilized Landsat data for the application of object-based classification[18-22]. However, there are few papers concerning such applications for greenhouse classification, and in particular, there has been no research on greenhouse classification utilizing Landsat-8 imagery. Tarantino et al.[12] extracted plastic-covered vineyards using object-based classification, as mentioned above. Tarantino et al.[23] also monitored plastic-covered vineyards based on true-color aerial data using an efficient object-based classification approach. By contrast, the present study focused on object-based classification with an emphasis on testing both the limitations and advantages of fused Landsat-8 data for detecting greenhouses in Xiaoshan District.
2 Materials and methods
2.1 Study area
Xiaoshan District is located in the northeastern region of Hangzhou City, the capital of Zhejiang Province in China, and is the largest center for the growth of flower seedlings in the region as well as one of the largest vegetable planting areas. The region has a subtropical monsoon-type climate with four distinct seasons. Xiaoshan’s economic performance is among the highest of all districts in China. At the end of 2012, the local GDP reached 161.2 billion Yuan, and the GDP per capita was approximately $17 000. Figure 1 shows the study area, which is located in the northeastern part of Xiaoshan District, where most of the greenhouses are distributed. The study area consists of a rectangular experimental area of approximately 77 km².
Figure 1 Location of the study area in Hangzhou and the 7-5-4 composition of Landsat-8 satellite imagery
2.2 Experimental design
The objective of this study was to accurately extract greenhouses from Landsat imagery using object-based classification. We first downloaded an image of Xiaoshan District and preprocessed it to obtain the 15 m fusion image. Afterward, object-based classification was performed using the eCognition software suite. Multiple scales were considered to complete the segmentations, and the image was divided into different objects at respective suitable scales. Furthermore, multiple features were used synthetically to improve the accuracy of the segmentation through effective machine learning methods, namely, the random forest (RF) and SVM techniques for object-based feature selection and classification, respectively. Finally, an accuracy assessment of the classification results was performed through a comparison with the results of the traditional per-pixel SVM classification method.
2.3 Preprocessing of the remote sensing data
The Landsat-8 image remote sensing data were downloaded and geometrically corrected to Universal Transverse Mercator map projection zone 50 with the spheroid and datum of WGS 84. Panchromatic images with a spatial resolution of 15 m and multispectral images with a spatial resolution of 30 m were acquired on April 14, 2013, with a 16-bit radiometric resolution, as all Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS) spectral bands were stored as geo-located 16-bit digital numbers[14]. Unlike the Landsat 7 satellite, on which the main imaging instrument was ETM+, Landsat-8 carries two sensors, the OLI and the TIRS. The OLI offers the following multi-spectral bands: blue (0.45-0.51 μm), green (0.53-0.59 μm), red (0.64-0.67 μm), near-infrared (0.85-0.88 μm), shortwave infrared (1.57-1.65 μm), shortwave infrared (2.11-2.29 μm), and panchromatic (0.50-0.68 μm). It also records two additional reflective wavelength bands: a new, shorter-wavelength blue band (0.43-0.45 μm) and a new cirrus band (1.36-1.38 μm).
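The band set listed above can be kept as a small lookup table, which is convenient when stacking or weighting layers later. The sketch below is illustrative only: the wavelengths are those given in the text, while the band numbers follow the usual Landsat-8 OLI numbering and the names `OLI_BANDS` and `bands_covering` are hypothetical.

```python
# OLI reflective bands as listed in the text (wavelength limits in micrometres).
# Band numbers follow the usual Landsat-8 OLI numbering; this mapping is an
# assumption, because the text lists the bands by name only.
OLI_BANDS = {
    1: ("shorter-wavelength blue", 0.43, 0.45),
    2: ("blue", 0.45, 0.51),
    3: ("green", 0.53, 0.59),
    4: ("red", 0.64, 0.67),
    5: ("near-infrared", 0.85, 0.88),
    6: ("shortwave infrared 1", 1.57, 1.65),
    7: ("shortwave infrared 2", 2.11, 2.29),
    8: ("panchromatic", 0.50, 0.68),
    9: ("cirrus", 1.36, 1.38),
}


def bands_covering(wavelength_um, include_pan=False):
    """Return the band numbers whose wavelength range contains the given value."""
    hits = []
    for number, (name, low, high) in sorted(OLI_BANDS.items()):
        if name == "panchromatic" and not include_pan:
            continue  # the broad panchromatic band overlaps green and red
        if low <= wavelength_um <= high:
            hits.append(number)
    return hits
```

For example, `bands_covering(0.86)` picks out the near-infrared band, and passing `include_pan=True` for a red wavelength also reports the panchromatic band that overlaps it.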
Although the other two thermal bands provided by the TIRS were excluded from the original bands because of their reduced spatial resolution (100 m), the improvements of the remaining bands in terms of their higher radiometric resolution, narrower spectral wavelength and improved sensor signal-to-noise performance remain attractive. The Landsat-8 scientific team has detailed the promising properties of Landsat-8 in a previous paper[14]. No atmospheric correction of the imagery was performed because there were no clouds or shadows in the study area, and the analysis was based on a single-date image. We disregarded the new cirrus band, which is more suitable for cloud detection and which exhibited serious striping and yielded minimal information in our study. To acquire better spatial information, one of the most widespread and best performing fusion methods, Gram-Schmidt spectral sharpening, was utilized to fuse the panchromatic and multispectral Landsat-8 satellite images. Both the spectral characteristics of the multispectral image and the spatial resolution of the panchromatic image were successfully preserved, yielding clearer characteristics of greenhouses and other components in the fused image compared with the original multispectral imagery[24].
2.4 Object-based image classification
In this study, object-based classification was implemented using the Definiens® platform.
2.4.1 Image segmentation
Image segmentation is a preliminary step of object-oriented image classification in which the image is divided into homogeneous object primitives. Multi-resolution segmentation, which locally minimizes the average heterogeneity of image objects at a given resolution, was chosen for the segmentation of the study area. The scale parameter is an abstract quantity that determines the maximum allowed heterogeneity for the resulting image objects[25]. A larger scale value produces larger objects, and the inverse also holds. It is advisable that the image objects should be slightly smaller than the real objects, as overly large objects may be more highly subject to error. Once the scale has been determined, three other criteria define the heterogeneity of an object: its color, smoothness, and compactness. The compactness is a function of the object’s perimeter and the number of pixels within the object, whereas the smoothness is a function of the object’s perimeter and the perimeter of the object’s bounding box; both criteria determine the shape of the object. The shape and color together describe the homogeneity of the object.
Researchers have proposed numerous methods for segmentation assessment; however, manual interpretation is generally accepted to be the most accurate method. Tests of a variety of values for each parameter and for various combinations of parameters were conducted to evaluate their impacts on the segmentation accuracy. As a result, the parameters of the multi-resolution segmentation procedure were defined based on a trial-and-error analysis to ensure that the final segmentation matched the visual interpretation. After multiple attempts, we established two different sets of parameter values, as shown in Table 1.
Table 1 Sets of parameter values for two levels of segmentation
Level   Scale   Shape   Compactness   Num. of objects
L1      200     0.3     0.5           1239
L2      100     0.4     0.6           3175
In this study, we chose a scale value of 200 for the primary level of segmentation, Level 1. A correlation analysis was first conducted to reduce the redundancy of the original bands considered in the segmentation. Because of the high correlations between bands (for instance, the correlation coefficient between bands 1 and 2 was 0.998), the bands were weighted in the two-level segmentation procedure as follows: the weights of bands 2, 4, 5, and 7 were all set to 1, whereas the remaining bands (bands 1, 3, and 6 and the panchromatic band) were given a weight value of 0 and were used only for classification. Segmentation at this level was satisfactory for identifying “large” objects such as paddies, rivers, buildings and plants, as shown in Figure 2. Object features such as NDVI, NDWI and Brightness were calculated to separate the obvious vegetation (NDVI above 0.25), open water (NDWI above -0.054) and light buildings (Brightness above 13500). Because the spectral properties of paddy fields are similar to those of water and our focus was on the classification of greenhouses, paddy fields were simply classified as open water.
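The Level 1 thresholding described above can be sketched as a simple rule set applied to each image object. This is a minimal illustration rather than the eCognition rule set used in the study: the thresholds are those quoted in the text, the indices use the common definitions NDVI = (NIR - Red)/(NIR + Red) and NDWI = (Green - NIR)/(Green + NIR) on object mean layer values, and the order in which the rules are tested, the function names and the sample values are assumptions.

```python
# Thresholds quoted in the text for the Level 1 classification.
NDVI_MIN = 0.25          # "obvious vegetation"
NDWI_MIN = -0.054        # "open water" (paddy fields are merged into this class)
BRIGHTNESS_MIN = 13500   # "light buildings", in 16-bit digital numbers


def ndvi(mean_nir, mean_red):
    """Normalized difference vegetation index from object mean layer values."""
    total = mean_nir + mean_red
    return (mean_nir - mean_red) / total if total else 0.0


def ndwi(mean_green, mean_nir):
    """Normalized difference water index from object mean layer values."""
    total = mean_green + mean_nir
    return (mean_green - mean_nir) / total if total else 0.0


def classify_level1(mean_green, mean_red, mean_nir, brightness):
    """Assign a Level 1 label to one object.

    The rule order (vegetation, then water, then light buildings) is an
    assumption; the text only gives the three thresholds.
    """
    if ndvi(mean_nir, mean_red) > NDVI_MIN:
        return "vegetation"
    if ndwi(mean_green, mean_nir) > NDWI_MIN:
        return "open water"
    if brightness > BRIGHTNESS_MIN:
        return "light buildings"
    # Everything else, including all greenhouses, is passed on to Level 2.
    return "unclassified"


# Hypothetical object statistics, for illustration only.
label = classify_level1(mean_green=9000, mean_red=9500, mean_nir=11000,
                        brightness=10000)
```

Objects that fall through all three rules keep the "unclassified" label, which corresponds to the category that the finer Level 2 segmentation then works on.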
As a result of this procedure, the remaining objects, which contained all of the greenhouses in the area, were assigned to the unclassified category for further classification.
Figure 2 Typical objects obtained via segmentation at Level 1
Different classes are better adapted to different scale levels; therefore, determining the ‘best’ scale parameter using only one level of segmentation for classification is not advisable. For this reason, Level 2 segmentation was applied to separate greenhouse objects from mixed segments by using a smaller scale value of 100 for finer segmentation within the unclassified category inherited from Level 1. The finer segmentation at Level 2 addressed basic “land use” types (Open Water, Plants, Buildings & Soil, and Greenhouse) among the remaining unclassified objects. The Plants category was further divided into farmland (plants with moderate canopies), dense vegetation (plants with mostly thick canopies), and sparse areas (mostly consisting of plants with the presence of visible ground). Based on the different color properties observed when the image was displayed in the 7-5-4 band combination, Buildings & Soil was divided into dull residential, highlighted factory, colorful industrial and road regions. After the segmentation and in reference to the above classification system, 63 objects with strongly characteristic features were chosen as training samples based on interpretation of the image and on the spatial auto-correlation evident throughout the displayed image.
2.4.2 Feature selection
The object features extracted from a segmented image can potentially be incorporated into further analysis. Determining the most important features significantly contributes to the final classification. Many feature selection methods have been applied in object-based image classification to reduce the dimensionality of the data[26,27]. In addition to the basic spectral information, other attributes can also be utilized in object-based classification, unlike in traditional classification methods. In this study, the spatial relationships between image objects (such as the contrast with respect to neighboring pixels, which measures the difference in contrast between an object and the surrounding area) were incorporated into the object-based image classification. Because a greenhouse is an artificial facility, shape and texture information were also considered in the classification.
In total, 53 object features, including the layer values, shape and texture, were considered in this study: (1) customized object features, including the NDVI ((mean layer NIR - mean layer Red)/(mean layer NIR + mean layer Red)) and NDWI ((mean layer Green - mean layer NIR)/(mean layer Green + mean layer NIR)); (2) the mean value, standard deviation and ratio of each object in all input layers, including 7 fused multi-spectral bands and the panchromatic band; (3) the mean difference from neighbors and contrast with respect to neighboring pixels in all input layers; (4) shape features, including density, length/width and shape indices; and (5) the gray-level co-occurrence matrix (GLCM) features, including the homogeneity, contrast, entropy, second moment, mean and correlation of each object, calculated from the panchromatic band. These features were selected by considering the relationships among the segmented objects and the potential for greenhouses to be discriminated from the other categories based on previous research[2,16,28]. Details regarding these features can be found in the Reference Book documentation for the software[25].
To determine the effectiveness of the features mentioned above, all features were used to perform
feature selection using one of the most efficacious methods, the RF algorithm in the Waikato Environment for Knowledge Analysis (Weka) system, which is a collection of machine learning algorithms for data mining tasks[29-31]. The RF algorithm is a modern machine learning algorithm developed by Leo Breiman to improve the classification of diverse data. Multiple random trees were constructed by choosing a random number of attributes for each tree without pruning. The most important feature of the RF algorithm is that it estimates the importance of variables according to voting values during the classification process. In this study, a 10-fold cross-validation procedure was implemented within the Weka environment, meaning that in each fold 90% of the samples were used for training and the other 10% were used for testing. The number of trees was set to 100, and the number of features required to split the nodes was set to 8 based on the total number of input features[32].
2.4.3 Classification and accuracy assessment
The SVM classification method is a popular nonparametric classification technique with great potential for application in remote sensing[33]. It makes no assumptions about the data distribution and reduces the number of training samples required while providing higher accuracy. In this research, object-based supervised SVM classification was also performed in eCognition Developer 8.7[25].
In remote sensing, classification accuracy refers to the level of agreement between the selected reference materials and the classified data. In total, 294 points were created using the stratified random method to form the error matrix for the 4-category classification results of the applied object-based SVM classification approach. Based on a visual greenhouse analysis of the Landsat-8 satellite imagery, with verification from Google Earth and the high-resolution imagery with the closest temporal match, various accuracy statistics were calculated from the error matrix, including the class producer’s accuracy (PA), the class user’s accuracy (UA), the overall accuracy (OA), and the overall kappa (OK).
In addition, to assess the area accuracy of the greenhouse classification results, a total greenhouse area of 3.71 km² in 104 objects was checked for the following quality indices utilized in previous research[34]:
1) True positive (TP): labeled as greenhouse in both the classification and the manual interpretation.
2) False positive (FP): labeled as greenhouse only in the classification.
3) False negative (FN): labeled as greenhouse only in the manual interpretation.
The following statistics derived from the above three quantities were also considered:
1) Branching factor (BF): FP/TP. Measuring the rate of incorrect greenhouse labeling.
2) Miss factor (MF): FN/TP. Measuring the rate of greenhouse omission.
3) Greenhouse detection percentage (GDP): 100TP/(TP+FP). Measuring the percentage of correct greenhouse categorization achieved by the classification.
4) Quality percentage (QP): 100TP/(TP+FP+FN). Measuring the likelihood of correct classification.
3 Results and discussion
3.1 Feature selection
To select the most appropriate features for Level 2 classification, the RF analysis was conducted prior to the classification. As shown in Table 2, although object features such as texture and shape were expected to be important information in the classification, the feature selection results indicated that spectral properties composed the majority of the most important features. At a finer spatial resolution (such as 1 m), greenhouses could be easily recognized based on their regular shape and texture, but the usefulness of the shape and, especially, the texture information was considerably weakened because mixed pixels were still commonly present in the fused Landsat-8 data as a result of the heterogeneity of the landscape and the limitations imposed by the 15 m spatial resolution of the image. Moreover, because of the small value of the scale parameter used in the segmentation, most objects consisted of small numbers of pixels; therefore, neither the texture features nor the object geometry were particularly distinct[35].
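The 10-fold cross-validation used for the RF analysis holds out each tenth of the training objects once for testing while the remaining objects are used for training. The sketch below shows only that partitioning step in plain Python, applied to the 63 training objects mentioned in the text. It is not the Weka implementation: the function name `k_fold_indices`, the fixed seed and the unstratified shuffling are assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for a plain k-fold split.

    Samples are shuffled once with a fixed seed and dealt into k folds of
    near-equal size; each fold is used exactly once as the test set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 63 training objects, as in the study; object indices are placeholders.
splits = list(k_fold_indices(63, k=10))
fold_sizes = [len(test) for _, test in splits]
```

With 63 objects the folds cannot all be the same size, so the held-out share is roughly, not exactly, 10% in each round.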
Regarding the neighborhoods surrounding the greenhouses in the study area, most greenhouses are adjacent to farmlands and irrigation canals and ditches, benefiting from the well-developed water systems in these locations. To facilitate management, these greenhouses are also not far from the residents they serve. As a result, among the most significant features, neighborhood relationships clearly played an important role. However, based on the distribution of all greenhouses, there were no significant unified neighborhood relationships between the greenhouses and the other categories, reflecting the fact that the distribution of most greenhouses was not rigorously planned in the study area. Finally, from the 53 total features, the RF algorithm selected 24 features that yielded a correct classification rate of 0.96 in the Weka system, thereby reducing the number of attributes to be calculated.
Table 2 RF results indicating the most important features in terms of their relevance values
Order   Feature                                              Relevance value
1       NDVI                                                 2.8
2       Mean diff. from neighbors b5 (0)                     1.7
3       Ratio b5                                             1.7
4       Mean b7                                              0.9
5       Brightness                                           0.8
6       Mean diff. from neighbors b2 (0)                     0.8
7       NDWI                                                 0.8
8       Mean diff. from neighbors b7 (0)                     0.7
9       Contrast with respect to neighboring pixels b5 (3)   0.7
10      Mean diff. from neighbors b6 (0)                     0.6
11      Mean diff. from neighbors b1 (0)                     0.5
12      GLCM mean p (all dir.)                               0.4
13      Mean b4                                              0.4
14      Mean b5                                              0.4
15      Ratio b4                                             0.4
16      Mean b1                                              0.3
17      Standard deviation b4                                0.3
18      Density                                              0.3
3.2 Image classification
SVM classification, incorporating the features selected in the previous step, was applied to the training samples. The “Linear” kernel implemented in the eCognition software was used. It was observed that the SVM classification required considerable time when the objects’ texture information was included either in the training for feature determination or in the application of the classification procedure.
For comparison with the results of the object-based classification scheme, the same 7 fused multi-spectral bands and the panchromatic band were stacked to perform per-pixel SVM classification in the ENVI software using the “Linear” kernel.
3.3 Accuracy assessment
The results in Figure 3 reveal that, compared with the per-pixel classification map, which exhibits the inevitable salt-and-pepper effect, the object-based classification incorporating different features in addition to the original spectral properties yielded more integrated objects and improved accuracy, in terms of both the total kappa index of agreement (KIA) and the OA, as compared in Table 3 and Table 4. Furthermore, among the 60 greenhouse test samples, the object-based classification obtained a 100% user’s accuracy, whereas 6 sparse vegetation samples and 1 road sample were falsely classified as greenhouse, as shown in Table 3. The results in Table 4 exhibit a comparable producer’s accuracy with a much lower user’s accuracy, which further reveals that there is no rigorous spectral discrimination between roads and sparse plants in the study region, because both are covered with low vegetation (perhaps because several roads are located very close to greenbelts). These three categories could be readily confused because of their similar spectral properties, shown in Figure 4 for about 50 samples of each class, as most greenhouses carry vegetation information in April but this information is weakened by the reflection from different covering materials.
Agüera et al.[34] achieved the highest TP value, with a BF of 0.12, an MF of 0.09, a GDP of 91.45 and a QP of 83.49, after applying the Hough transformation for greenhouse discrimination to the best results obtained in multi-spectral image classification; these results were superior to those of all previous studies and are considered as a benchmark for satisfactory performance. From this perspective, the values of the indices presented in Table 5 are of suitable quality compared with other historical results using the same assessment method reported by those authors.
A total of 104 objects were selected to compare the correlations between the correct area of each object and its areas as determined via classification and manual identification.
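The area comparison relies on the Pearson correlation coefficient between two lists of per-object areas. A minimal sketch of that calculation is given below; the function name `pearson` and the five sample areas are hypothetical and are not taken from the study's 104 objects.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-object areas in square metres (multiples of a 15 m pixel).
reference_areas = [22500.0, 40500.0, 9000.0, 67500.0, 31500.0]
classified_areas = [24750.0, 38250.0, 11250.0, 65250.0, 33750.0]
r = pearson(reference_areas, classified_areas)
```

A value of `r` close to 1 means that objects with larger reference areas also receive larger classified areas; it does not by itself show that the two areas are equal, since a constant bias or scale factor leaves the coefficient unchanged.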
The Pearson correlation between the correct and classification areas in Figure 5a is 0.995, whereas the Pearson correlation between the correct and manual areas in Figure 5b is 0.996; both of these values indicate significant differences at p