Author(s)

Farhad Misaghi, M.Sc. Student, Tarbiat Modarres University, Iran. Email: [email protected]

Shadi Dayyanidardashti, Jamab Consulting Engineers Company, 46, Zartosht St. West, Tehran, Iran. Email: [email protected]

Kourosh Mohammadi, Tarbiat Modarres University, P.O. Box 14115-336, Tehran, Iran. Email: [email protected]

M. R. Ehsani, ASAE Member, The Ohio State University, 590 Woody Hayes Dr., Columbus, OH 43210, USA. Email: [email protected]

Publication Information

Pub ID: 041147
Pub Date: 2004 ASAE/CSAE Annual Meeting Paper

The authors are solely responsible for the content of this technical presentation. The technical presentation does not necessarily reflect the official position of the American Society of Agricultural Engineers (ASAE), and its printing and distribution does not constitute an endorsement of views which may be expressed. Technical presentations are not subject to the formal peer review process by ASAE editorial committees; therefore, they are not to be presented as refereed publications. Citation of this work should state that it is from an ASAE meeting paper. EXAMPLE: Author's Last Name, Initials. 2004. Title of Presentation. ASAE Paper No. 04xxxx. St. Joseph, Mich.: ASAE. For information about securing permission to reprint or reproduce a technical presentation, please contact ASAE at [email protected] or 269-429-0300 (2950 Niles Road, St. Joseph, MI 49085-9659 USA).

An ASAE/CSAE Meeting Presentation

Paper Number: 041147

Application of Artificial Neural Network and Geostatistical Methods in Analyzing Strawberry Yield Data

Farhad Misaghi, M.Sc. Student, Tarbiat Modarres University, Tehran, Iran, [email protected].

Shadi Dayyanidardashti, Jamab Consulting Engineers Company, Tehran, Iran, [email protected].

Kourosh Mohammadi, Assistant Professor, Tarbiat Modarres University, P.O. Box 14115-336, Tehran, Iran, [email protected].

M.R. Ehsani, Assistant Professor, Ohio State University, [email protected].

Written for presentation at the 2004 ASAE/CSAE Annual International Meeting
Sponsored by ASAE/CSAE
Fairmont Chateau Laurier, The Westin, Government Centre
Ottawa, Ontario, Canada
1 - 4 August 2004

Abstract. It is imperative that farm management decisions now be made based on the knowledge of interactions between plants, soil, and the environment on a site-specific basis to maximize yield, reduce costs, improve quality of products, and/or reduce the environmental impact of farming. Understanding the input-output relationship in a crop production system is the major step in implementing knowledge-based management decisions. The specific goal of this study was to determine the input-output relationship and develop a model to predict strawberry yield using different Artificial Neural Network (ANN) techniques. Aerial images, soil parameters, and plant parameters were collected from a 2-acre strawberry field in London, Ohio. Several ANN models were used to relate strawberry yield to soil and plant parameters. To enlarge the data set for training and testing the models, geostatistical methods were used to interpolate the measurements and produce additional inputs and the corresponding set of outputs. The predicted yield was compared with the actual yield map to determine the accuracy of the model. Results showed that using aerial images as input to these models can be a practical way to forecast yield relatively accurately. Among the tested models, the Modular Neural Network (MNN) model provided the best results.

Keywords. Artificial neural network, geostatistics, strawberries, yield map, aerial photography


Introduction

It is imperative that farm management decisions now be made based on the knowledge of interactions between plant, soil, and the environment on a site-specific basis to maximize yield, reduce costs, improve quality of products, and/or reduce the environmental impact of farming. Conventional modeling methods usually require many different input data that are not easily accessible, which makes the process expensive and time consuming. Artificial neural networks (ANN), on the other hand, develop a solution by training on the examples given to them. An ANN learns to solve a problem by developing a memory capable of associating a large number of input patterns with a resulting set of outputs or effects. Therefore, an ANN does not require algorithm or rule development, a feature that often significantly reduces the quantity of software that must be developed.

An innovative use of precision farming would be to apply technological advances to high-value crops, such as strawberries, which could be produced at a higher profit margin. Berries, a high-dollar crop, could be an alternative to corn, soybeans, or wheat for small to mid-size farms. Strawberries generate an annual return of $150 per acre as compared with $13 per acre for no-till corn and $16 for no-till soybeans. Conventionally grown corn and soybeans average a net annual profit of only $1 per acre. Even more profit can be earned with pick-your-own operations: about $500 per acre for strawberries and over $1,500 per acre for raspberries (Kuhn, 2000).

Farm income can be increased by utilizing better input-management techniques. Precision agriculture is an information-based management concept that aims to maximize profit by managing variability. When applied to higher value crops such as strawberry, precision agriculture has the potential to show more pronounced profits than for lower value crops because fruits and vegetables require more inputs.

Application of precision farming to high-value crops requires developing techniques for collecting information and analyzing data. The long-term goal of this research was to work towards techniques and methods that enhance the profitability of high-value crop production by helping growers make better management decisions. The specific goal of this study was to determine the input-output relationship and develop a model to predict strawberry yield using different Artificial Neural Network (ANN) techniques.

Artificial Neural Networks

Artificial neural networks are a relatively new approach to analyzing yield data. An ANN learns to solve a problem by developing a memory capable of associating a large number of input patterns with a resulting set of outputs or effects. An ANN is also capable of finding the input-output relationship in the presence of missing data. The ANN approach is based on the highly interconnected structure of brain cells. This approach is faster than its conventional counterparts, robust in noisy environments, flexible in terms of solving different problems, and highly adaptive to new environments (Jain et al., 1999). Because of these established advantages, ANNs currently have extensive applications in systems-engineering-related fields.

The basic structure of a network usually consists of three layers: the input layer, where the data are introduced to the network; the hidden layer or layers, where data are processed; and the output layer, where the results for given inputs are produced (Figure 1). The input values, $x_i$, are multiplied by weights, $w_{ij}$, and summed in the neuron, forming

$$\xi_j = \sum_{i=1}^{n} x_i w_{ij}.$$

This result is then acted upon by an activation function, $\sigma$, yielding the output of the jth neuron, $y_j = \sigma(\xi_j)$, as shown in Figure 2. Only when $\xi_j$ exceeds (i.e., is stronger than) the neuron's threshold limit (also called bias, b) will the neuron fire (i.e., become activated).

The architecture of an ANN is defined by the number of layers, the number of neurons in each layer, the weights between neurons, a transfer function that controls the generation of output in a neuron, and learning laws that define the relative importance of weights for input to a neuron (Caudill, 1987). Multilayer perceptrons, generalized feedforward networks, and modular neural networks were the three model architectures used in this study.

Multilayer perceptrons (MLPs) are layered feedforward networks typically trained with static backpropagation. These networks have found their way into countless applications requiring static pattern classification. Their main advantages are that they are easy to use and that they can approximate any input/output map. Their key disadvantages are that they train slowly and require lots of training data, typically three times more training samples than network weights (Lefebvre and Principe, 1998).

Generalized feedforward (GFF) networks are a generalization of the MLP in which connections can jump over one or more layers. In theory, an MLP can solve any problem that a generalized feedforward network can solve. In practice, however, generalized feedforward networks often solve the problem much more efficiently. A classic example of this is the two-spiral problem. Without describing the problem, it suffices to say that a standard MLP requires hundreds of times more training epochs than a generalized feedforward network containing the same number of processing elements.

Modular neural networks (MNNs) are a special class of MLP. These networks process their input using several parallel MLPs and then recombine the results. This tends to create some structure within the topology, which fosters specialization of function in each sub-module. In contrast to the MLP, modular networks do not have full interconnectivity between their layers. Therefore, a smaller number of weights is required for a network of the same size. This tends to speed up training and reduce the number of required training exemplars. There are many ways to segment an MLP into modules, and it is unclear how best to design the modular topology based on the data. There is no guarantee that each module specializes its training on a unique portion of the data.
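The neuron computation described above (a weighted sum followed by an activation function) can be illustrated with a short sketch. This is only a minimal illustration, not the software used in the study: the function names, the random initial weights, and the choice to add the bias inside the sum are assumptions made for the example, and no training step is shown. The tanh activation and the 4-10-1 layout follow the transfer function and one of the structures listed in Table 1.

```python
import math
import random

def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum xi_j = sum_i x_i * w_ij (plus an assumed bias term), passed through tanh."""
    xi = sum(x * w for x, w in zip(inputs, weights)) + bias
    return math.tanh(xi)

def layer_output(inputs, weight_rows, biases):
    """Apply one neuron per row of weights to the same input vector."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]

def mlp_forward(inputs, layers):
    """Pass the signal through each (weights, biases) layer in turn."""
    signal = inputs
    for weight_rows, biases in layers:
        signal = layer_output(signal, weight_rows, biases)
    return signal

def random_layer(n_in, n_out, rng):
    """Small random starting weights; a learning rule would adjust these during training."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

rng = random.Random(0)
# A 4-10-1 network: 4 inputs, 10 hidden neurons, 1 output neuron.
network = [random_layer(4, 10, rng), random_layer(10, 1, rng)]
# Hypothetical, already-scaled input values used only to exercise the code.
prediction = mlp_forward([0.2, -0.1, 0.4, 0.3], network)
```

Because the output neuron also uses tanh, the prediction always lies between -1 and 1, so a yield target would have to be scaled into that range before training and scaled back afterwards.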

Figure 1. The basic structure of an artificial neural network.

Figure 2. Signal interaction from n neurons to signal summing in the single layer perceptron. (Diagram labels: inputs X1, X2, ..., Xn; weights w; summation ξ = Σ w·x; threshold gate; output y = σ(ξ).)

Geostatistics

The main drawback in using ANN models is their need for a relatively large amount of data for training. For this study, geostatistical methods were used to interpolate point data and produce more data sets for training the ANN model. To interpolate the measured data, a robust and accurate method has to be selected. A branch of statistics known as geostatistics, when used in combination with a database of spatial information known as a geographic information system (GIS), offers a unique means of analyzing spatial relations between diverse data. Kriging, inverse distance weighted, and radial basis function methods were tested in ArcGIS software (ESRI, 2001). The results showed that the Kriging method performed best in predicting values for geographic point data. Using the Kriging method in ArcGIS, a set of 500 interpolated data points was produced from a set of 125 data points.

The assumption that makes interpolation a viable option is that spatially distributed objects are spatially correlated; in other words, things that are close together tend to have similar characteristics. The Kriging method fits a mathematical function to a specified number of points, or to all points within a specified radius, to determine the output value for each location. Kriging is a multistep process that includes exploratory statistical analysis of the data, variogram modeling, creating the surface, and (optionally) exploring a variance surface. This function is most appropriate when there is a spatially correlated distance or directional bias in the data. The experimental semivariogram, γ(h), is computed as half the average squared difference between the components of data pairs (Isaaks and Srivastava, 1989):

$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_i) - Z(x_i + h) \right]^2 \qquad (1)$$

where N(h) is the number of pairs of data locations a vector h apart and Z(x) is the measurement at point x. Before using a semivariogram in estimation, it is necessary to fit a suitable mathematical model. In this study, a Gaussian model performed best; it is defined as follows:

$$
\begin{aligned}
\gamma(h) &= C_0 + C\left[1 - e^{-h^2/a^2}\right] && \text{if } h < a \\
\gamma(h) &= C_0 + C && \text{if } h > a \\
\gamma(0) &= C_0 && \text{if } h = 0
\end{aligned}
\qquad (2)
$$

where $C_0$ is the nugget effect, a is the range of influence, and C is the difference between the sill and the nugget effect. The geostatistical estimate of the variable at an unsampled location $x_p$, $Z^*(x_p)$, as a linear combination of neighboring observations, $Z(x_i)$, is:

$$Z^*(x_p) = \sum_{i=1}^{n} \lambda_i Z(x_i) \quad \text{with} \quad \sum_{i=1}^{n} \lambda_i = 1 \qquad (3)$$

where $\lambda_i$ are the Kriging weights for the observation points.
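Equations (1) to (3) can be tied together in a small sketch. The study performed Kriging in ArcGIS, so the code below is not that implementation; it is a minimal illustration under stated assumptions. The lag tolerance used to collect pairs, the treatment of h = a as belonging to the sill branch, the small linear solver, and the sample coordinates and values are all hypothetical choices made for the example. The weights are obtained from the standard ordinary Kriging system, in which a Lagrange multiplier enforces the unit-sum constraint of equation (3).

```python
import math

def experimental_semivariogram(points, values, lag, tol):
    """Equation (1): half the mean squared difference over pairs about `lag` apart."""
    total, count = 0.0, 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if abs(math.dist(points[i], points[j]) - lag) <= tol:
                total += (values[i] - values[j]) ** 2
                count += 1
    return total / (2 * count) if count else float("nan")

def gaussian_model(h, nugget, c, a):
    """Equation (2): nugget C0, sill minus nugget C, range a (h == a treated as sill)."""
    if h == 0:
        return nugget
    if h >= a:
        return nugget + c
    return nugget + c * (1.0 - math.exp(-(h * h) / (a * a)))

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def ordinary_kriging(points, values, target, nugget, c, a):
    """Equation (3): estimate at `target` as a weighted sum with weights summing to 1."""
    n = len(points)
    def gamma(p, q):
        return gaussian_model(math.dist(p, q), nugget, c, a)
    # Semivariogram matrix bordered by ones; the extra unknown is the Lagrange multiplier.
    matrix = [[gamma(points[i], points[j]) for j in range(n)] + [1.0] for i in range(n)]
    matrix.append([1.0] * n + [0.0])
    rhs = [gamma(p, target) for p in points] + [1.0]
    weights = solve_linear(matrix, rhs)[:n]
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights

# Hypothetical sample: four grid corners with made-up yield values.
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
yields = [5200.0, 6100.0, 5800.0, 6900.0]
center_estimate, center_weights = ordinary_kriging(corners, yields, (0.5, 0.5), 0.1, 1.0, 3.0)
```

For a target at the center of the square, symmetry gives each corner the same weight, so the estimate equals the mean of the four values; at a sampled location the estimate reproduces the measurement.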

Materials and Methods

A two-acre site at Circle S farm in London, Ohio, about 13.2 km southwest of Columbus, was selected, and strip trials were conducted (Figure 3). Within the two acres, composite soil samples were obtained on a 30 ft × 30 ft grid. In the summer of 2002, an intensive data collection effort covering crop yield, soil characteristics, and plant measurements was undertaken. Soil samples were analyzed for mineral-N content (nitrate and ammonia), available phosphorus (Bray P), exchangeable K, calcium, magnesium, Cation Exchange Capacity (CEC), organic matter, and pH. Moreover, soil compaction, moisture, and electrical conductivity were mapped during the growing season.

Two multispectral aerial images were captured on May 14 and June 18, 2002, utilizing the services of the commercial firm M/S GeoVantage, Madison County, Ohio. The four spectral bands were filtered for observations in the blue (410 to 495 nm), green (510 to 590 nm), red (610 to 690 nm), and NIR (800 to 900 nm) wavelength intervals, respectively. The imagery had a ground pixel size of approximately 0.5 m. The NIR, red, green, and blue band images in each color infrared (CIR) composite image were registered to correct the misalignment among them. The registered images were geo-referenced or rectified to the Universal Transverse Mercator (UTM), World Geodetic System 1984 (WGS-84), Zone 17 coordinate system based on USGS maps. Registered images were supplied by the firm, whereas further image analysis to examine vegetative vigor was performed by converting the multispectral images to single vegetation indices, the normalized difference vegetation index (NDVI) and the soil adjusted vegetation index (SAVI), using ERDAS IMAGINE software (Leica Geosystems, Atlanta, GA).

The imagery was supplied in bitmap format (bmp extension) along with a separate header file containing the image geometry. Because ERDAS IMAGINE, which was used for image analysis, does not support the bmp file extension, the image was converted to tiff format. Before the conversion with Microsoft Photo Editor, the header file was saved in tfw format using Windows Explorer. This allowed the tiff image to retain the geometry information, which would otherwise be lost. Thereafter, the tiff image was imported into ERDAS IMAGINE using the import utility. A subscene representing the study plot area was extracted: a 437 × 212 pixel area for all three bands of the May image and a 477 × 234 pixel area for the June image. The extracted images were re-projected in the Universal Transverse Mercator (UTM) projection, Zone 17 North, using the NAD 83 datum with units in meters, and resampled to 0.5 m pixels using the nearest neighbor algorithm. These image-processing steps prepared the imagery for data extraction and analysis. Each band of both images was also exported in GRID format for statistical analysis. Thereafter, using IMAGINE Modeler, NDVI was derived as the ratio (NIR − R)/(NIR + R) and SAVI as the ratio (NIR − R)(1 + L)/(NIR + R + L), where L = 0.5.

A semi-mechanical strawberry harvesting aid was used for collecting yield data (Ehsani et al., 2001). Yield was determined by weighing the fruit in each box per grid point. The strawberries were picked five times during the 2002 season, which lasted from late May to mid-June. The weight of strawberries picked from each grid throughout the season was summed to determine the geo-referenced yield for each grid in kg/ha. Figure 4 shows the grid locations. The data were interpolated using the Kriging method and used as input for the ANN model. Different combinations of input data and model structures were tested to obtain a relatively accurate predictive tool.
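The two vegetation indices used above are simple per-pixel band ratios, and a short sketch may make the arithmetic concrete. The study computed them in ERDAS IMAGINE Modeler; the code below only restates the standard definitions, NDVI = (NIR − R)/(NIR + R) and SAVI = (NIR − R)(1 + L)/(NIR + R + L) with L = 0.5. The function names, the handling of a zero denominator, and the sample band values are assumptions for illustration, and the inputs are assumed to be reflectance-like values on a 0 to 1 scale.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    # Assumed convention: return 0.0 where both bands are zero to avoid dividing by zero.
    return (nir - red) / total if total != 0 else 0.0

def savi(nir, red, soil_factor=0.5):
    """Soil adjusted vegetation index for one pixel, with adjustment factor L."""
    return (nir - red) * (1.0 + soil_factor) / (nir + red + soil_factor)

def index_map(nir_band, red_band, index):
    """Apply a per-pixel index to two bands stored as equally sized lists of rows."""
    return [
        [index(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]

# Hypothetical 2 x 2 bands used only to exercise the functions.
nir_band = [[0.50, 0.45], [0.30, 0.60]]
red_band = [[0.10, 0.15], [0.20, 0.05]]
ndvi_map = index_map(nir_band, red_band, ndvi)
savi_map = index_map(nir_band, red_band, savi)
```

The constant L matters only when the bands are on a reflectance scale; with raw digital numbers in the hundreds, adding 0.5 to the denominator has almost no effect and SAVI reduces to roughly 1.5 times NDVI.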

Figure 3. Sampling location in the farm.


Figure 4. Produced yield map using Kriging method.

Results and Discussion

Several different neural networks were tested, and three network models gave the best results: multilayer perceptrons, generalized feedforward networks, and modular neural networks. Comparisons between the models are shown in Tables 1 and 2. In these tables, MSE is the mean square error, NMSE is the normalized mean square error, and r is the regression coefficient. It is evident that the models performed differently for different input data. The MNN model reproduced the results better than the other two models. The predictions agreed well with the measured output when AVG and SVG values were used as input data. Figure 5 shows the comparison between measured and predicted yields for the best ANN model.
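The paper does not state the formulas behind the three performance measures, so the following sketch rests on assumptions: MSE is taken as the mean of the squared errors, NMSE as the MSE divided by the variance of the measured values, and r as the linear (Pearson) correlation between measured and predicted values. The function names and sample numbers are hypothetical and are not results from the study.

```python
import math

def mse(measured, predicted):
    """Mean of the squared differences between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)

def nmse(measured, predicted):
    """MSE divided by the variance of the measured values (assumed normalization)."""
    mean = sum(measured) / len(measured)
    variance = sum((m - mean) ** 2 for m in measured) / len(measured)
    return mse(measured, predicted) / variance

def pearson_r(measured, predicted):
    """Linear correlation between measured and predicted values (assumed meaning of r)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)

# Made-up values on an arbitrary scale, used only to exercise the functions.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
scores = (mse(measured, predicted), nmse(measured, predicted), pearson_r(measured, predicted))
```

Under this normalization, an NMSE near 1 means the model does no better than predicting the mean of the measured values, which is one reason to read NMSE alongside r rather than on its own.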

Conclusions

In this study, an attempt was made to forecast strawberry yield in a practical and inexpensive way. ANN and geostatistical methods were applied to data gathered during the strawberry growing season in a two-acre field. Results showed that using aerial images as input to these models can be a practical way to forecast yield relatively accurately. Other measurements, such as soil and plant parameters, could help to increase the accuracy. However, because the increase in accuracy is not large enough to cover the expense of measurement and laboratory analysis, these measurements cannot be recommended for use in the model.

Figure 5. Comparison between measured and predicted yield by ANN model in test period. (Scatter plot of measured yield versus predicted yield, both in lbs/acre, with axes from 4000 to 8000; r = 0.940.)

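Table 1. ANN models' structures and inputs.

| Model Number | Model | Structure | Epoch | Transfer Function | Input Data |
|---|---|---|---|---|---|
| 1 | MLP | 4-10-1 | 1000 | Tanh | Average of AVG & SVG for May & June |
| 2 | GFF | 4-8-1 | 962 | Tanh | Average of AVG & SVG for May & June |
| 3 | MNN | 4-4(4)-4(4)-1 | 962 | Tanh | Average of AVG & SVG for May & June |
| 4 | MLP | 2-14-1 | 1000 | Tanh | Average of AVG & SVG for May |
| 5 | GFF | 2-14-1 | 1000 | Tanh | Average of AVG & SVG for May |
| 6 | MNN | 2-5(5)-4(4)-1 | 1000 | Tanh | Average of AVG & SVG for May |
| 7 | MLP | 21-4-1 | 319 | Tanh | All Soil Data |
| 8 | GFF | 21-4-1 | 917 | Tanh | All Soil Data |
| 9 | MNN | 21-4(4)-4(4)-1 | 1000 | Tanh | All Soil Data |
| 10 | MLP | 3-10-1 | 1000 | Tanh | K, Mg, Ca (Soil) |
| 11 | MLP | 21-10-1 | 268 | Tanh | All Plant Data |
| 12 | GFF | 21-4-1 | 1000 | Tanh | All Plant Data |
| 13 | MNN | 21-4(4)-4(4)-1 | 901 | Tanh | All Plant Data |
| 14 | MLP | 4-8-1 | 1000 | Tanh | N, P, K, Ca (Plant) |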


Table 2. ANN models' performances in training and test periods.

| Model Number | Training MSE | Training NMSE | Training r | Test MSE | Test NMSE | Test r |
|---|---|---|---|---|---|---|
| 1 | 0.0182 | 0.9380 | 0.952 | 0.0770 | 0.8010 | 0.883 |
| 2 | 0.0227 | 0.1210 | 0.937 | 0.1275 | 0.4440 | 0.874 |
| 3 | 0.0180 | 0.0985 | 0.949 | 0.0631 | 0.1660 | 0.936 |
| 4 | 0.0231 | 0.1238 | 0.936 | 0.0452 | 0.4707 | 0.918 |
| 5 | 0.0250 | 0.1370 | 0.931 | 0.1377 | 0.4801 | 0.881 |
| 6 | 0.0230 | 0.1256 | 0.935 | 0.0534 | 0.1541 | 0.940 |
| 7 | 0.0090 | 0.0528 | 0.973 | 0.0944 | 0.3291 | 0.892 |
| 8 | 0.0099 | 0.0544 | 0.972 | 0.1740 | 0.4597 | 0.848 |
| 9 | 0.1009 | 0.0443 | 0.977 | 0.1894 | 0.6730 | 0.7600 |
| 10 | 0.0884 | 0.4903 | 0.718 | 0.2090 | 0.7310 | 0.573 |
| 11 | 0.0099 | 0.5170 | 0.974 | 0.0286 | 0.2982 | 0.938 |
| 12 | 0.0130 | 0.0694 | 0.964 | 0.2071 | 0.7218 | 0.7140 |
| 13 | 0.0099 | 0.0544 | 0.972 | 0.0645 | 0.1701 | 0.933 |
| 14 | 0.0282 | 0.1410 | 0.929 | 0.2141 | 2.2800 | 0.940 |

References

Caudill, M. 1987. Neural networks primer: Part I. AI Expert, December: 46-52.

Ehsani, R.M., D. Durairaj, L. Xu, M. Sullivan, and J. Walker. 2001. Low cost techniques to predict strawberry yield in field. ASAE Paper No. 01-1103. St. Joseph, Mich.: ASAE.

ESRI. 2001. ArcGIS: GIS and Mapping Software. Redlands, Calif.: Environmental Systems Research Institute.

Isaaks, E.H. and R.M. Srivastava. 1989. An Introduction to Applied Geostatistics. New York, N.Y.: Oxford University Press.

Jain, S.K., A. Das, and D.K. Srivastava. 1999. Application of ANN for reservoir inflow prediction and operation. J. Water Resources Planning and Management, ASCE, 125(5): 263-271.

Kuhn, S. 2000. Berries: An Alternative Cash Crop that Packs a Healthy Punch. Farm Facts for Fairfield County, The Ohio State Extension, Dec. 8. http://fairfield.osu.edu/ag/farmfactsdec8.html

Lefebvre, C. and J. Principe. 1998. NeuroSolutions User's Guide. Gainesville, Fla.: Neurodimension Inc.
