An Example Based Image Retrieval System for the TRACE Repository

Robbie Lamb & Rafal Angryk
Department of Computer Science
Montana State University
Bozeman, MT 59715-3880
lamb, [email protected]

Piet Martiens
Harvard-Smithsonian Center for Astrophysics
60 Garden Street, MS 58
Cambridge, MA 02138
[email protected]

Abstract

The ability to identify particular features and structures in images, such as faces or types of scenery, is a topic with many applications and potential solutions. In this paper we discuss solar images and the results of our investigation of techniques that can be used to identify solar phenomena in images from the TRACE satellite for the creation of a search engine. Being able to automatically identify and search for various phenomena in solar images is of great interest to scientists studying the sun. We discuss a set of characteristics that can be quickly extracted from solar images. These characteristics are used to create classifiers for various phenomena contained in solar images. The classifiers are then used to create an example-based information retrieval system. There are many obstacles that need to be overcome when extracting features and creating these classifiers. These include the inherently unbalanced data sets that result from the varying rates at which different phenomena appear, and the multiple phenomena that can appear in each image.

1 Introduction

This paper discusses our investigation into creating an Information Retrieval (IR) system for solar images. Currently we are working with images from the TRACE satellite. The image repository size for TRACE is approximately 1 terabyte and is growing at the rate of 260 megabytes a day. Future satellites will be capable of taking higher resolution images at a faster rate than the current satellite. We are developing this system to allow fast example-based querying of this repository.

There are several difficulties that need to be accounted for with our IR problem. First, there are multiple classes to choose from for each image, as opposed to binary classification where only two labels occur. The second issue is that each image may be labeled multiple times with several different classes, as multiple phenomena can occur in the same image. Also, as with any real-world problem, the classes are significantly imbalanced, with some classes appearing more often than others. Finally, ranking images in order from the image closest to the query to the least similar one is also a challenge.

Having multiple classes is not necessarily a problem. [9] states that multi-class problems can always be reduced to multiple binary classification problems. For our solution we are creating a binary classifier for each potential label. As an image can have multiple labels, we use multiple binary classifiers to identify different phenomena.

We present a brief overview of our IR system in Section 3. In Section 4 we present the results of the classifiers we produced. Finally, in Section 5 we present results from our prototype IR system.

978-1-4244-2175-6/08/$25.00 ©2008 IEEE

2 Background

Automatically detecting phenomena in solar images has become a popular topic in recent years. Zharkova et al. [11] discuss several methods for identifying features in solar images, including Artificial Neural Networks, Bayesian inference, and shape correlation. Each technique was only used to find a single type of phenomenon. There was no single technique discussed that could find a variety of phenomena. Zharkov et al. [10] conducted a statistical analysis of sunspots during the years 1996–2004. Their study showed regular intervals for when sunspots can be expected to appear on the sun. Turmon et al. [8] use a statistical analysis for identifying and labeling active regions of the sun. De Wit [2] also used a Bayesian classifier for segmenting solar images and claimed to be able to track structures in near real time.

In the computer science domain, Hoiem et al. [4] broke down images into statistical structures for classification. They used a histogram and texture features from the entire image as well as from predefined scalings of the image. The image and downsampled images were also broken down into a dynamic grid of cropped images. Histograms and texture features were also extracted from the cropped images and used in the classification process. SnapFind [5] took this technique a step further and integrated it with the Diamond Framework. The Diamond Framework allows interactive searching of images by discarding candidate images early in processing.

3 System Overview

There are several components to our system. First we discuss how feature vectors are created from the images. Second we discuss the classifiers we created from the feature vectors. Finally we give a brief overview of our Information Retrieval (IR) system.

Every image that is put into the system is processed and a feature vector, $\vec{i}$, is created to represent the image. Feature vectors are commonly used in search engines [6] and allow us to speed up the search [7]. Representing images as attribute vectors also allows us to evaluate the similarity of two images by using a cosine similarity measure, which is based on the angle between the two image vectors. The vectors are stored in a database and associated with the original images.

To produce $\vec{i}$, the images are first segmented into smaller regions and texture information is extracted [3]. Our current segmentation technique breaks the image into 128 by 128 pixel blocks for extracting features. We call this technique Grid Segmentation. The values in our attribute vector, $\vec{i}$, reflect different types of texture information extracted from the intensity of the images and sub-images. The seven attributes we extract are: the mean intensity, the standard deviation of the intensity, the Third Moment, the Fourth Moment, Uniformity, Entropy, and Relative Smoothness. We have chosen these characteristics for the preliminary construction because the values they produce are not influenced by different orientations of the same kind of phenomenon in different images. These attributes can also be extracted from the images quickly, an important aspect when dealing with large sets of solar images. An additional attribute, in which the images were taken, is also added to this vector.

For the first version of the system, several types of classifiers have been trained for the six different labels. In prior investigations, C4.5 has been useful as a binary classifier for determining whether or not areas of images contain the empty sun. AdaBoost has been shown to increase the accuracy of many classifiers, and has been analyzed for improving our classifiers while training them. An SVM classifier is used as part of the work in [1] for classifying medical images and texture features, and SVM classifiers are trained and compared with the other classifiers in our system.

We have created a binary classifier for each type of phenomenon that has been labeled. These classifiers are able to determine whether or not a region of the image contains the particular phenomenon the classifier has been trained for. Because the majority of the regions of the images contain the empty sun, we use the Empty-Sun classifier to filter out those regions and apply the remaining classifiers only to determine which phenomenon is present in each region that is left. This configuration is shown in Figure 1.
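The Grid Segmentation step and the seven texture attributes can be illustrated with a short sketch. This is not the authors' implementation: it assumes the histogram-based definitions of these statistics found in standard image-processing texts such as [3], assumes pixel intensities have already been scaled to integers in the range 0 to `levels - 1`, and normalizes the variance by `(levels - 1)^2` before computing relative smoothness. The function names `grid_blocks` and `texture_features` are hypothetical.

```python
import math
from collections import Counter


def grid_blocks(image, size=128):
    """Yield (top, left, block) for each full size x size tile of a 2-D list.

    Partial tiles at the right and bottom edges are skipped (an assumption;
    the paper does not say how edges are handled).
    """
    height, width = len(image), len(image[0])
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            yield top, left, [row[left:left + size] for row in image[top:top + size]]


def texture_features(block, levels=256):
    """Histogram-based texture statistics for one block of integer intensities."""
    pixels = [value for row in block for value in row]
    count = len(pixels)
    # Normalized histogram p(z): fraction of pixels at each intensity z.
    hist = {z: c / count for z, c in Counter(pixels).items()}
    mean = sum(z * p for z, p in hist.items())

    def central_moment(order):
        return sum((z - mean) ** order * p for z, p in hist.items())

    variance = central_moment(2)
    # Assumed normalization so that smoothness stays in a comparable range.
    normalized_variance = variance / (levels - 1) ** 2
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "third_moment": central_moment(3),
        "fourth_moment": central_moment(4),
        "uniformity": sum(p * p for p in hist.values()),
        "entropy": -sum(p * math.log2(p) for p in hist.values()),
        "smoothness": 1 - 1 / (1 + normalized_variance),
    }


# Tiny demonstration: a 4x4 "image" cut into 2x2 blocks instead of 128x128.
demo_image = [
    [0, 255, 7, 7],
    [0, 255, 7, 7],
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
demo_vectors = [texture_features(block) for _, _, block in grid_blocks(demo_image, size=2)]
```

All seven values are computed from the intensity histogram alone, so rotating or flipping a block leaves them unchanged, which matches the orientation-independence the text asks of the attributes.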

Figure 1. The layout of the classifiers.

The layout of our IR system is presented in Figure 2. The first module, the one users interact with directly, is the Query Interface. The first step is to start a query by providing image(s) with an example phenomenon. Other constraints, such as dates and wavelengths, can also be provided to limit the scope of the search. The Information Retrieval module is responsible for analyzing the submitted sample image(s) and retrieving similar images from the TRACE image repositories. Distinct features are extracted from the sample image(s) during Image Preprocessing. Classification of the sample image(s) is performed based on the extracted information. After each sample image has been classified, we select similar images from the data catalogs related to the query and order them using a cosine similarity function with the extracted image vector, $\vec{i}$.

            C4.5    SVM     AdaB C4.5   AdaB SVM
AUC         0.915   0.921   0.963       0.900
Precision   0.916   0.902   0.910       0.910
Recall      0.930   0.941   0.935       0.902
F-Measure   0.922   0.921   0.922       0.920

Table 1. Average values for detecting the Empty-Sun.

Figure 2. Data flow through the IR system.

The backbone of the system is our Phenomena Catalogs maintained in our repository. They contain a collection of pointers to the original FITS images, features that have been extracted from the images, and results of our classification. The Searching & Ranking component uses the classification results and extracted attributes from the images to quickly select and rank images that are similar to the query.
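The ranking step performed by the Searching & Ranking component can be sketched as follows. This is a minimal illustration of cosine similarity ranking over stored feature vectors, not the system's actual code; the catalog is modeled as a plain dictionary from a hypothetical image identifier to its vector, and the default cutoff of 32 simply mirrors the number of results inspected in Section 5.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, catalog, top_k=32):
    """Return up to top_k (image_id, score) pairs, most similar first."""
    scored = [
        (image_id, cosine_similarity(query_vector, vector))
        for image_id, vector in catalog.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical two-dimensional catalog, for demonstration only.
demo_catalog = {"a": [2.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
demo_ranking = rank_by_similarity([1.0, 0.0], demo_catalog)
```

Because cosine similarity depends only on direction, "a" ranks first here even though its vector is twice as long as the query. In practice the classification results would presumably narrow the candidate set before this step, so that only images sharing the query's label are scored.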

4 Evaluation of Classifiers

Several techniques were used when training classifiers for use in the IR system. Two types of sampling were used: Random Under Sampling (RUS) and Random Over Sampling (ROS). For each kind of label, we created 10 random sample sets that were then used for training multiple classifiers. The results were analyzed using the Area Under the Curve (AUC) of ROC curves and the F-Measure. A 10-fold cross validation was used to validate the results. To help overcome our class imbalance problem, we use classifiers with and without AdaBoost in combination with ROS and RUS. The labels in the data sets are imbalanced because the phenomena in the images appear over time at different rates. From these classifiers, we have created our IR system.

Table 1 shows the results for the Empty-Sun labeled data. Overall, the classifiers we produce have a large AUC. For the C4.5 classifier, AdaBoost tended to increase the accuracy of the classifier. For the SVM classifier, in contrast, AdaBoost tended to decrease the accuracy. Overall, these results show that for recognizing the Empty-Sun, the C4.5 classifier with AdaBoost gives the best results.

Table 2 shows the results for the Coronal-Loop labeled data. The sample data sets and results were generated in the same manner as for the Empty-Sun data. Once again the classifiers generated have high accuracy. While SVM with AdaBoost does not lose accuracy relative to the standard version, as it did for the Empty-Sun labeled data, C4.5 with AdaBoost once again performs the best overall. Initially we had very high hopes for SVMs, as these classifiers tend to be fast and accurate. We now believe our results show that sampling, forced by our unbalanced data, can remove instances along the maximum-margin hyperplanes, which decreased the average accuracy of the SVMs.

            C4.5    SVM     AdaB C4.5   AdaB SVM
AUC         0.943   0.947   0.976       0.969
Precision   0.954   0.981   0.958       0.963
Recall      0.931   0.910   0.942       0.931
F-Measure   0.942   0.945   0.950       0.946

Table 2. Average values for detecting Coronal-Loops.

The Filament labeled data had the least accurate classifiers generated. All of the classifiers generated had an average AUC of under 0.90, except for the C4.5 classifier using RUS. The same sample data sets also created classifiers with the lowest AUC when AdaBoost was applied to C4.5. The Flare labeled data had the best classifiers generated. For this label, the RUS technique generated better classifiers than the ROS technique. The classifiers for the Coronal Loops are also quite good. Overall the values are quite high. The AUC is highest for the AdaBoost C4.5 at 0.976 and lowest for C4.5 at 0.943.
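The two sampling techniques used to build the training sets can be sketched as below. This is an illustrative sketch only: it assumes each sample set is balanced to a 1:1 class ratio and that the majority class is at least as large as the minority class, neither of which is stated in the text. The function names and the seeding scheme are hypothetical.

```python
import random


def random_under_sample(majority, minority, rng):
    """RUS: keep a random subset of the majority class, the same size as the minority class."""
    return rng.sample(majority, len(minority)) + list(minority)


def random_over_sample(majority, minority, rng):
    """ROS: add random duplicates of minority examples until both classes are the same size."""
    shortfall = len(majority) - len(minority)
    duplicates = [rng.choice(minority) for _ in range(shortfall)]
    return list(majority) + list(minority) + duplicates


# Hypothetical imbalanced data: 100 negative regions, 10 positive regions.
negatives = [("neg", n) for n in range(100)]
positives = [("pos", n) for n in range(10)]

# Ten random sample sets per technique, one independent seed per set.
rus_sets = [random_under_sample(negatives, positives, random.Random(seed)) for seed in range(10)]
ros_sets = [random_over_sample(negatives, positives, random.Random(seed)) for seed in range(10)]
```

RUS discards most of the majority class, so any discarded example that happened to lie near the decision boundary is lost to the learner. That is consistent with the explanation offered above for the reduced SVM accuracy.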

5 Evaluation of the Information Retrieval System

The query images each contain a particular phenomenon, such as a Coronal Loop or a Sun Spot. The particular region of interest in the image is selected and submitted as a query to the system. Each of the first 32 images returned was visually inspected to determine whether it contained a phenomenon similar to the one in the query image. Precision and Recall are used for the analysis of the IR system. The equations we use take into account the ranking of the returned images and return the recall or precision at rank k. For example, if the sample image contained a coronal loop and the returned image in question also contained a coronal loop, this result, r, was given a 1. If the returned image did not contain a coronal loop, the result was given a 0. The graph we present in our results shows recall vs. precision for the queries. Due to space limitations, we are only able to present the average performance of our IR system, as shown in Figure 3.
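The rank-based measures can be made concrete with a small sketch. The exact equations are not reproduced in the text, so the definitions here are assumptions: precision at rank k is the number of relevant images among the first k results divided by k, and recall at rank k is that same count divided by the total number of relevant images among all inspected results. The function name `precision_recall_at_k` is hypothetical.

```python
def precision_recall_at_k(relevance):
    """Return a list of (k, precision, recall) for k = 1..len(relevance).

    relevance holds the 0/1 judgments r for the returned images, in rank order.
    """
    total_relevant = sum(relevance)
    hits = 0
    curve = []
    for k, r in enumerate(relevance, start=1):
        hits += r
        precision = hits / k
        # Recall is taken against all relevant images in the inspected list.
        recall = hits / total_relevant if total_relevant else 0.0
        curve.append((k, precision, recall))
    return curve


# Hypothetical judgments for a query where only 4 results were inspected.
demo_curve = precision_recall_at_k([1, 0, 1, 1])
```

For the judgments `[1, 0, 1, 1]`, precision moves through 1, 1/2, 2/3 and 3/4 while recall moves through 1/3, 1/3, 2/3 and 1. Under these definitions recall always reaches 1 at the last inspected rank, provided at least one returned image is relevant.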

Figure 3. Average Recall vs. Precision for all of our sample images.

Initially recall is a low number, because it is computed against the total number of relevant images among all 32 returned images rather than just those up to rank k. A perfect graph would have a precision of 1 at every position along the x-axis. The results in the graphs show precision varying between 0.5 and 1.0. The average precision for our best query is 0.83, and the lowest average precision for any of our queries is 0.59. Averaged over all of our queries, the average precision is 0.75, meaning that on average we can expect 75% of the returned images to be relevant.

6 Conclusions

These results show us that for a given query image, we can expect more than half of the returned images to be relevant. For the best query, 83% of the returned images were relevant. The accuracy of our system seems to be phenomenon specific, as some solar phenomena have more distinctive features than others. For instance, images with coronal loops had a higher average precision than the images containing a sun spot. Part of the reason for this could be the unbalanced data set currently being used. Overall this IR system works and returns relevant images, but there is room for improvement. Experimenting with different ranking algorithms for this application is the next step, since cosine similarity ranking provides adequate results but leaves room for improvement. Now that this system is in place, it should be much simpler to add new features to the system and see how they affect its performance.

References

[1] P. Bhattacharya, M. Rahman, and B. C. Desai. Image representation and retrieval using support vector machine and fuzzy c-means clustering based semantical spaces. ICPR, 2:1162–1168, 2006.
[2] T. D. de Wit. Fast segmentation of solar extreme ultraviolet images. Solar Physics, 239(1):519–530, 2006.
[3] R. C. Gonzalez and R. E. Woods. Digital Image Processing. Prentice Hall, Upper Saddle River, New Jersey, 2002.
[4] D. Hoiem, R. Sukthankar, H. Schneiderman, and L. Huston. Object-Based Image Retrieval Using the Statistical Structure of Images. Computer Vision and Pattern Recognition, 2:490–497, June 2004.
[5] L. Huston, R. Sukthankar, D. Hoiem, and J. Zhang. SnapFind: Brute Force Interactive Image Retrieval. Proceedings of International Conference on Image Processing and Graphics, 00:154–159, 2004.
[6] T. Joachims and F. Radlinski. Search engines that learn from implicit feedback. Computer, 40(8):34–40, August 2007.
[7] Y. Rui, T. Huang, and S. Chang. Image retrieval: current techniques, promising directions and open issues. Journal of Visual Communication and Image Representation, 10(4):39–62, 1999.
[8] M. Turmon, J. Pap, and S. Mukhtar. Statistical pattern recognition for labeling solar active regions: application to SOHO/MDI imagery. Astrophysical Journal, 568:396–407, 2002.
[9] P. Zhang, J. Peng, and B. Buckles. Learning optimal filter representation for texture classification. ICPR, 2:1138–1141, 2006.
[10] S. Zharkov, V. V. Zharkova, and S. S. Ipson. Statistical properties of sunspots in 1996-2004: I. detection, north-south asymmetry and area distribution. Solar Physics, 228(1):377–397, 2005.
[11] V. Zharkova, S. Ipson, A. Benkhalil, and S. Zharkov. Feature recognition in solar images. Artificial Intelligence Review, 23(3):209–266, 2005.