Model-Based Neural Network for Target Detection in SAR Images

Leonid I. Perlovsky, Senior Member, IEEE, William H. Schoendorf, Member, IEEE, Bernard J. Burdick, and David M. Tye

Abstract—A controversial issue in the research on the mathematics of intelligence has been that of the roles of a priori knowledge versus adaptive learning. After discussing the mathematical difficulties of combining a priority with adaptivity that were encountered in the past, we introduce the concept of a model-based neural network, whose adaptive learning is based on a priori models. Applications to target detection in SAR images are discussed. We briefly review SAR principles, derive relatively simple physics-based models of SAR signals, and describe model-based neural networks that utilize these models. A number of real-world application examples are presented.

I. INTRODUCTION

A. Adaptivity, A Priority, and Model-Based Recognition

AUTOMATIC target recognition (ATR) in synthetic aperture radar (SAR) imagery, which is the specific subject of this paper, is intimately related to the development of mathematical approaches to modeling intelligence in several fields. A contemporary direction in the theory of intellect based on modeling neural structures of the brain was founded by McCulloch and his coworkers [1]. In search of a mathematical theory unifying neural and cognitive processes, they combined an empirical analysis of biological neurons with the theory of information and mathematically formulated the main properties of neurons. McCulloch believed that the material basis of the mind lies in complicated neural structures of a priori origin. Specialized, genetically inherited a priori structures have to provide for specific types of learning and adaptation abilities. An example of such a structure investigated by McCulloch was a group-averaging structure providing for scale-independent recognition of objects, which McCulloch believed serves as a material basis for ideas or concepts [2]. However, this investigation into the a priori aspect of the intellect was not continued during the neural network research of the 1950's and 1960's, and the neural networks developed at that time utilized simple structures. These neural networks were based on the concept of general, nonspecific adaptation using concrete empirical data. By emphasizing the adaptive aspect of intellect and neglecting its a priori aspect, this approach deviated from the program outlined by McCulloch. The simple structures of early neural networks and learning based entirely on concrete empirical data were in agreement with the behaviorist psychology dominant at the time.

Manuscript received June 14, 1996; revised September 13, 1996. L. I. Perlovsky, W. H. Schoendorf, and B. J. Burdick are with Nichols Research Corporation, Lexington, MA 02173 USA (e-mail: [email protected]). D. M. Tye is with Datacube, Inc., Danvers, MA 01923 USA. Publisher Item Identifier S 1057-7149(97)00348-5.

When the fundamental, mathematical character of the limited capabilities of perceptrons was analyzed by Minsky and Papert in 1969, interest in the field of neural networks fell sharply. Even relatively mild criticism had a devastating effect on the interest in artificial neural systems. Why did this happen? It is interesting to note that the crisis in the field of early neural networks coincided with the contemporaneous downfall of behaviorist psychology and philosophy. Thus, it is revealing to trace the metaphysical origins of the mathematical concepts of intellect developed by the mathematical and engineering community. Dissatisfied with the limited capabilities of the mathematical methods for modeling neural networks that existed at the time, Minsky suggested a different concept of artificial intelligence based on the principle of a priority. He argued that intelligence could only be understood on the basis of extensive systems of a priori rules [3]. This was the next attempt (after McCulloch) to understand the intellect on the principle of a priority. The main advantage of this method is that it explicitly incorporates detailed, high-level a priori knowledge into the decision making. This knowledge is represented in a symbolic form similar to the high-level cognitive concepts utilized by a human in conscious decision-making processes. The main drawback of this method is the difficulty of combining rule systems with adaptive learning; while modeling the a priori aspect of the mind, rule-based systems were lacking in adaptivity. Although Minsky emphasized that his method does not solve the problem of learning [4], attempts to add learning to rule-based artificial intelligence continued in various fields of modeling the mind, including linguistics and pattern recognition [5]–[9]. In linguistics, Chomsky proposed to build a self-learning system that could learn a language similarly to a human, using a symbolic mathematics of rule systems [10]. In Chomsky's approach, the learning of a language is based on a language faculty, which is a genetically inherited component of the mind containing an a priori knowledge of language. This direction in linguistics, which is known as the Chomskyan Revolution, was about recognizing two questions about the intellect as the center of linguistic inquiry and of a mathematical theory of the mind [7]: first, how is knowledge of language possible, and second, how is learning possible? However, combining adaptive learning with a priori knowledge proved difficult: Variabilities in the data required increasingly more detailed rules, leading to exponential complexity of logical inference [5], which is not physically realizable for complicated real-world problems.


Concurrent with early neural networks and rule-based intelligence, adaptive algorithms for pattern recognition have been developed based on statistical techniques [11]–[14]. In order to recognize objects (patterns) using these methods, the objects are characterized by a set of classification features that are designed based on a preliminary analysis of a problem and thus contain the a priori information needed for a solution of these types of problems. Applications of statistical pattern recognition methods have been limited by the fact that general mathematical methods for the design of classification features have not been developed. The design of classification features is based on a priori knowledge of specific problems and remains an art requiring human participation. When the problem complexity is not reduced to a few classification features in a preliminary analysis, these approaches lead to difficulties related to exorbitant training requirements. In fact, training requirements for these paradigms are often exponential in terms of the problem complexity [15]. Model-based approaches in machine vision have been used to extend the rule-based concept to 2-D and 3-D sensory data. The use of physically based models permits utilization of detailed a priori information on objects' properties and shapes in algorithms of image recognition and understanding [5], [8], [9], [16]–[24]. Models used in machine vision typically are complicated geometrical 3-D models that require no adaptation. These models are useful in applications where variabilities are limited and the types of objects and other parameters of the recognition problem are constrained. When unforeseen variabilities are a constant factor in the recognition problem, utilization of such models faces difficulties that are common to rule-based systems. Increasingly more detailed models are required, potentially leading to a combinatorial explosion. Parametric model-based approaches have been proposed to overcome the difficulties of previously used methods and to combine the adaptivity of parameters with the a priority of models. In these approaches, adaptive parameters are used to adapt models to variabilities and uncertainties in data. However, complicated adaptive models often lead to combinatorial explosion of the complexity of the recognition process. This situation is summarized in recent reviews as follows: "Much of our current models and methodologies do not seem to scale out of limited 'toy' domains" [22]; the key issues (are) "the inherent uncertainty of data measurements" and the "combinatorial explosion inherent in the problem" [25]. Factors of a priority and adaptivity ought to be combined by acceptable concepts of the intellect. Therefore, approaches to combining both factors are of paramount interest. A mathematical analysis of existing approaches to the design of systems and algorithms of mathematical intelligence leads to the conclusion that the computational concepts of most of today's neural networks originate in pattern recognition algorithms and that there are four basic concepts forming the foundation for all the multiplicity of learning algorithms and neural networks [15]. These are 1) the concept of rule-based systems [3], defined by the factor of a priority; 2) the concept of nearest neighbors;


3) the concept of discriminating surfaces, with 2) and 3) both defined by the factor of adaptivity [11], [13]; and 4) the concept of parametric models [5], [23], which attempts to combine a priority and adaptivity. While most neural network paradigms are related to concepts 2) and 3), this paper describes a concept of type 4), a model-based neural network. The concept of model-based neural networks is aimed at combining a priority and adaptivity and at eliminating the combinatorial explosion that seems to be inherent in other existing methods of modeling the intellect and, in particular, of performing automatic recognition. This concept originates from earlier work utilizing successively more complicated internal models. The first model-based neural network, Widrow's Adaline [26], was limited to a relatively simple model. During the last ten years, several neural networks have been developed utilizing much more complicated nonlinear models. Neural networks based on statistical models have been developed for recognition in [27]–[30]. This paper describes a general model-based neural network as well as its specific realizations using models for SAR image recognition. In Section II, we describe the properties of SAR signals and the data sets used to illustrate the applications. Section III describes the general model-based neural network (MBNN). The MBNN models combine deterministic and statistical aspects of the data, so that adaptation of the model parameters is used to learn expected variabilities in the data, whereas unpredictable variabilities are treated statistically. The MBNN interconnections provide for fuzzy association between the data and the models, and its learning mechanism provides for the adaptation of the model parameters. Section IV describes relatively simple, physically based models suitable for the detection of unresolved targets. This technique is applied to the detection of tactical targets and downed small aircraft in heavy clutter. Application examples are described in Sections V and VI. Section VII summarizes the results.

II. SAR AND DATA SET DESCRIPTION

A description of SAR operations and image formation can be found in [31]. The SAR signal contains in-phase and quadrature components, forming a complex processed signal. In order to preserve all the information contained in the radar return, every pixel in a SAR image must be characterized by a complex value. Most of the examples illustrated in this paper are taken from the analysis of data obtained from a SAR that was employed by NASA in its search-and-rescue mission of looking for small downed aircraft. This radar is described in [32]. It transmits two different polarizations of electromagnetic waves, and for each transmitted polarization, two polarizations are received. Altogether, four signals are received, so that four complex images are available for performing detection and identification: HH, HV, VH, and VV. Here, HH and VV are the two copolarized returns (vertical or horizontal transmitted and the same received), and HV and VH are the cross-polarized returns. It is usually assumed that HV = VH. In the examples considered below, the cross-polarized return is actually computed using both the HV and the VH returns and


performing phase equalization to account for the fact that the phase in the SAR images might not be properly calibrated. The radar return of the $n$th pixel may therefore be written as a 3-D complex vector
$$Z_n = \bigl(Z_n^{HH},\; Z_n^{HV},\; Z_n^{VV}\bigr)^T. \qquad (1)$$
The data provided by the NASA Goddard Space Flight Center have been recorded as a pixel covariance averaged over four higher resolution SAR subpixels,
$$C_n = \frac{1}{4} \sum_{s=1}^{4} Z_{ns}\, Z_{ns}^{\dagger}. \qquad (2)$$
Here, the subpixels are indexed by two indexes: a pixel index $n$ and a subpixel index $s$. The imaged areas contained simulated aircraft wrecks placed in forest and snow environments under foliage canopies.

The approaches to the NASA/JPL data study were dictated by the nature of the SAR data, in particular, by its resolution and by the occurrence of phase miscalibration. The pixel size of the SAR imagery is 12.0 m in azimuth and 6.67 m in range. Since the wavelength of the radar is roughly 0.75 m, a resolution pixel spans many wavelengths. A large, man-made object, such as a metal plane or a corner reflector several meters in size, can dominate the scattering from a pixel. In many applications, large man-made structures produce a strong specular reflection, which is a large-amplitude glint associated with the scattering of electromagnetic energy when the wavelength and pixel size are small compared with the characteristic length of the scatterer. These types of specular returns are often used as a key for detecting targets. In the example considered here, however, the area subsumed by a downed, destroyed aircraft is considerably less than the area of a resolution cell, so that the typical wreck of a small plane occupies only a fraction of a pixel. Due to the small size, most pieces of aircraft wreckage do not produce strong specular returns. As a result, "bright spots" in the SAR imagery considered here are not very useful for detection, resulting in a need to exploit polarimetric differences between the wreckage and the background.

Another complication in using the NASA/JPL SAR data is that it is not phase-calibrated, so the phase relationships between the scattered returns at the different polarizations are not consistent from image to image. The approach taken to circumvent the phase miscalibration problem is to perform adaptive target detection on each image independently of the other images. This also simplifies the problem in that we do not have to compensate for image-to-image variations arising from different depression angles, weather, and other measurement conditions. This approach does result in one serious limitation, however, in that it is no longer possible to use the data from previously obtained images to train an algorithm or a neural network for application to new images. Since there are many clutter pixels in each image, the clutter model adaptation can be performed using a single image, but target models cannot be estimated since there is only one target pixel in each image.
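As an illustration of the data model in (1) and (2), the following sketch builds the 3-D complex pixel vectors from the three polarimetric channels and averages the subpixel outer products. The array names and shapes are illustrative assumptions, not part of the NASA data format.

```python
# A minimal sketch of forming the pixel vectors (1) and the subpixel-averaged
# pixel covariances (2).  The inputs are assumed to be complex images hh, hv,
# vv of shape (H, W, 4), the last axis holding the four subpixels of a pixel.
import numpy as np

def pixel_covariances(hh, hv, vv):
    """Return C with shape (H, W, 3, 3): averaged outer products of (1)."""
    Z = np.stack([hh, hv, vv], axis=-2)                # shape (H, W, 3, 4)
    # C_n = (1/4) * sum_s Z_ns Z_ns^H, eq. (2)
    C = np.einsum('...is,...js->...ij', Z, Z.conj()) / Z.shape[-1]
    return C

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (8, 8, 4)
    hh, hv, vv = (rng.normal(size=shape) + 1j * rng.normal(size=shape)
                  for _ in range(3))
    C = pixel_covariances(hh, hv, vv)
    print(C.shape)                                     # (8, 8, 3, 3)
```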


In the approach taken, the clutter is characterized independently on each image, with the target being treated as an anomaly and chosen as the pixel with the smallest likelihood of belonging to the clutter class. An alternative approach to mitigating the unknown phase calibration could be based on the generalized likelihood ratio test (GLRT). This test can use a likelihood ratio with free parameters to account for the unknown change in conditions, the unknown phases in our case [33]. The GLRT would be justified if a significant database of target returns were available and if we could reliably parameterize all effects of image-to-image variations, which also arise from different depression angles, weather, and other measurement conditions. In our database, there were nine images, each containing a single target; thus, the GLRT was not applied to these data.

Altogether, the NASA data set contained nine images containing one target each. The target positions were unknown. Strong clutter was present due to the targets being under a heavy foliage canopy and, in addition, due to the presence of ice in many of the images. The image size varied from 30 × 50 = 1500 pixels to 100 × 100 = 10 000 pixels. Two images from the NASA data set, the Michigan data and the North Carolina data, are considered in more detail in Section V.

Another data set analyzed in this paper was acquired using a Lincoln Laboratory 1-ft resolution, fully polarimetric SAR. Records of the radar complex returns were available, and the polarization measurements were calibrated. The data set was acquired near Stockbridge, MA. We analyzed three strip maps (3 × 512 × 2048 pixels, about 1/6 km² at 0.75-ft/pixel sampling) of this data set. Available ground truth for this scene shows that, within the range coverage of the SAR, there were nine military vehicle targets and nine corner reflectors (all but one of which were pointing within 90° of the SAR line of sight). Ground-cover truth was also available, indicating five types of ground-cover clutter present in the scene, "tree lines," "forest," "hedges," "roads," and "fields," and a sixth type of clutter due to the radar: "shadows." Analyses of the Stockbridge data are provided in Section VI.

III. MODEL-BASED PATTERN RECOGNITION AND MBNN

The MBNN is developed with the goal of combining the available a priori knowledge of models of the data with adaptivity to changing data properties, while avoiding the combinatorial complexity difficulties encountered in past model-based approaches. The combinatorial complexity issue is resolved by utilizing fuzzy (or probabilistic) associations between the data and the models in place of a combinatorial search. In a neural network, fuzzy associations are accomplished via "synaptic" weighted connections. The model-based neural network described here is a further development of the maximum likelihood adaptive neural system (MLANS), which was first introduced by Perlovsky [27]. The architecture of MLANS was described in [28] and [34]. Similarly to MLANS, the MBNN consists of two main subsystems: the Association subsystem, which estimates data-to-model association weights, and the Modeling subsystem, which estimates model parameters; see Fig. 1. The internal dynamics of MBNN learning/adaptation consists of iterative estimation of association weights and


model parameters. The MBNN dynamics is described in this paper in algorithmic terms (we also explain the relationships of the algorithmic terms to neural ones, but an understanding of these references is not necessary for readers who are not interested in neural terminology).

Fig. 1. Model-based neural network architecture. The Association subsystem computes weights that associate data with models; the Modeling subsystem estimates parameters of the models.

Model-based pattern recognition is based on models. Models of the data should be developed for every class of objects $k$. These models $M_k(S_k)$ are functions of model parameters $S_k$ (such as pose), and they represent the data that we expect to observe. In the case of perfect models, when the observed object is of class $k$, there are values of the parameters $S_k$ such that the data perfectly match the model,
$$X_i = M_k(S_k). \qquad (3)$$
For imaging data, every data vector $X_i$ is either a subset of image pixels obtained in the process of segmentation or features extracted from this subset. The model is a prediction of this vector, and it should account for the properties of the objects and of the sensory system. In reality, a perfect match as in (3) cannot be attained because there are multiple sources of deviations between the model and the data. If all deterministic aspects of the problem are accounted for in the model $M_k(S_k)$, the deviations can be treated statistically; that is, the model can be considered to be a class-conditional statistical expectation of the observable vector,
$$M_k(S_k) = E\{X_i \mid H_k\}. \qquad (4)$$
Here, $H_k$ is the class-$k$ hypothesis, and the expectation is taken with respect to $\mathrm{pdf}(X_i \mid H_k)$, which is not known a priori. If the deterministic model $M_k(S_k)$ is accurate and deviations are caused by multiple random effects, according to the central limit theorem, the pdf can be approximated by a Gaussian distribution conditioned on the class and the model parameter values. Combining deterministic and statistical properties of the data, the combined model can be formulated as
$$\mathrm{pdf}(X_i \mid H_k) = G\bigl(X_i;\, M_k(S_k),\, C_k\bigr) \qquad (5)$$
where $G$ is a Gaussian distribution with the mean given by the deterministic model $M_k(S_k)$ and the covariance $C_k$, which should either be modeled or estimated from the data. In certain cases, including those considered later in this paper, variabilities in the data are caused by specific physical mechanisms that do not satisfy the conditions of the central limit theorem, and Gaussian distributions might not be appropriate; in these cases, appropriate distributions should be used, as discussed in the following sections. In this section, in order to be specific, we consider Gaussian distributions for the class-conditioned pdf's. Object recognition consists of finding the class and the values of the parameters that result in the best segmentation and match consistent with (3) or (4).

Before describing the concurrent estimation-segmentation process of MBNN, let us summarize a more conventional mathematical formulation of the model-based pattern recognition problem. Consider all possible segmentations or partitions of the data among all classes,
$$\{x_n,\; n = 1, \ldots, N\} = \bigcup_{i} X_i, \qquad X_i \leftrightarrow H_{k(i)}. \qquad (6)$$
This notation means that all the $N$ pixels comprising the image are partitioned into subsets (segments) $X_i$, and the pixels of $X_i$ are used to form a data vector to which the model $M_{k(i)}$ is then matched. For example, a "background" class would likely appear many times in this partition; therefore, there is no one-to-one correspondence between the indexes $i$ (data vectors) and $k$ (classes); we number $k = k(i)$. All possible segmentations, as described above, also include all possible combinations of segments and classes (models). In order to reduce the number of all these combinations, model-based approaches usually use indexing. The matching involves estimation of the unknown parameters $S_{k(i)}$ of the models for every segment $X_i$. The ML estimation is obtained by maximizing the likelihood, or overall pdf, of all pixels in the image. Given segmentation (6), the conditional likelihood can be written as
$$\mathrm{pdf}\bigl(\{X_i\} \mid \{H_{k(i)}\}\bigr) = \prod_{i} \mathrm{pdf}\bigl(X_i \mid H_{k(i)}\bigr). \qquad (7)$$
It is usually sufficient to consider the prior probability of the set of hypotheses as a product of individual prior pdf's, so that
$$\mathrm{pdf}\bigl(\{X_i\},\, \{H_{k(i)}\}\bigr) = \prod_{i} \mathrm{pdf}\bigl(H_{k(i)}\bigr)\, \mathrm{pdf}\bigl(X_i \mid H_{k(i)}\bigr). \qquad (8)$$
It is worth emphasizing that although (8) is a product of individual image-segment pdf's, it can account for intersegment deterministic relationships through the models, due to the fact that the expected values of observations for multiple segments can be modeled using a common set of model parameters [35], [36]. The ML estimate of the set of parameters for the $k$th model and $i$th segment is obtained by maximizing (8) over these parameters. The problem factors into maximizing the pdf of each individual segment, $\mathrm{pdf}(X_i \mid H_{k(i)})$, over the parameters of each individual model $M_{k(i)}$. This greatly simplifies the problem of parameter estimation, which still could be quite substantial for complicated models. After the best set of model parameters is obtained for every partition, the likelihood for every partition is computed, and the parameter values corresponding to the maximum likelihood partition


are selected so that the total likelihood is the maximal conditional likelihood over all segmentations. The above formulation of the model-based pattern recognition problem is fairly broad. It addresses the top level of the problem while omitting details that are important for specific application areas and for specific approaches to controlling the combinatorial explosion. Most existing approaches can be formulated within this framework. The need to consider all or many of the partitions or groupings (6) is the main source of the intrinsic combinatorial complexity of model-based pattern recognition.

An alternative mathematical formulation of the model-based pattern recognition problem, leading to a model-based neural network that solves the problem without the combinatorial explosion, can be approached as follows. The model-based neural network considers a joint problem of concurrent segmentation, model estimation, and matching. Correspondingly, the joint likelihood for this problem is obtained by considering segmentation as a part of the hypotheses, so that segmentation emerges in the process of model estimation. The likelihood for each pixel is, thus, a weighted sum of pdf's of alternative hypotheses,
$$\mathrm{pdf}(x_n) = \sum_{k} \mathrm{pdf}(H_k)\, \mathrm{pdf}(x_n \mid H_k) \qquad (9)$$
and the total likelihood is a product of the likelihoods of individual pixels,
$$\mathrm{pdf}\bigl(\{x_n\}\bigr) = \prod_{n} \mathrm{pdf}(x_n) = \prod_{n} \sum_{k} \mathrm{pdf}(H_k)\, \mathrm{pdf}(x_n \mid H_k). \qquad (10)$$
This expression is much more complicated than (8) because it contains a large number of items of the type of (8). Nevertheless, maximization of this likelihood can be achieved in the model-based neural network without combinatorial explosion. To simplify the following formulation, we consider the covariances in (5) as known and only the expected values $M_k(S_k)$ to be functions of the model parameters $S_k$. In this case, the equations describing the model-based neural network can be derived by gradient ascent maximization of the joint likelihood (10), and they can be expressed as follows. The Modeling subsystem equations for the parameter update are given by (11)–(13). Here, one set of indexes refers to the components of the vectors of model parameters $S_k$, and another to the components of the data vectors; summation is assumed over repeated indexes, and (;) denotes partial derivatives with respect to the parameters with the corresponding indexes, as defined in (14).


The Association subsystem estimates the weights $W_{nk}$ that associate each pixel $x_n$ with each model $k$; they are computed as the a posteriori Bayes probabilities according to
$$W_{nk} = \frac{\mathrm{pdf}(H_k)\, \mathrm{pdf}(x_n \mid H_k)}{\sum_{k'} \mathrm{pdf}(H_{k'})\, \mathrm{pdf}(x_n \mid H_{k'})}. \qquad (15)$$
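To make the association-modeling cycle of (11)–(15) concrete, here is a minimal sketch that assumes Gaussian class-conditional models as in (5) with fixed covariances and free means; the closed-form mean update stands in for the paper's gradient-ascent equations (11)–(13), and all variable names are illustrative rather than the paper's notation.

```python
# A minimal sketch of the MBNN iteration: Association step (15) followed by a
# Modeling step updating the free parameters (here, component means) under
# the Gaussian model (5).  This is a stand-in for the gradient-ascent updates
# (11)-(13), not a reproduction of them.
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Multivariate Gaussian density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    expo = -0.5 * np.einsum('nd,de,ne->n', diff, inv, diff)
    norm = (2 * np.pi) ** (-d / 2) / np.sqrt(np.linalg.det(cov))
    return norm * np.exp(expo)

def mbnn_iterate(X, r, M, C, n_iter=50):
    """Alternate the Association step (15) and a Modeling step."""
    N, K = X.shape[0], len(r)
    for _ in range(n_iter):
        # Association subsystem: Bayes posterior weights W[n, k], eq. (15).
        pdfs = np.stack([r[k] * gaussian_pdf(X, M[k], C[k]) for k in range(K)],
                        axis=1)
        W = pdfs / pdfs.sum(axis=1, keepdims=True)
        # Modeling subsystem: weighted ML update of the free parameters
        # (the means M_k); priors track the expected class occupancy.
        Nk = W.sum(axis=0)
        r = Nk / N
        M = (W.T @ X) / Nk[:, None]
    return r, M, W

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (500, 2)), rng.normal(4, 1, (30, 2))])
    r = np.array([0.9, 0.1])
    M = np.array([[0.0, 0.0], [3.0, 3.0]])
    C = [np.eye(2), np.eye(2)]
    r, M, W = mbnn_iterate(X, r, M, C)
    print("estimated priors:", r)
```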

The system of (11)–(15) defines the model-based neural network as a nonlinear, nonstationary system that exhibits stable learning in the face of a continuously changing flow of stimuli. It always converges [37]. The computational complexity of this system is only linear in the number of pixels times the number of classes. The following convergence issues are now briefly discussed: local versus global convergence, biases and precision, learning or adaptive efficiency, and the number of iterations. As is usually the case with complicated nonlinear systems, convergence to the global maximum of the likelihood function is not guaranteed. However, in our experience, convergence to local maxima does not represent a significant problem, for several reasons. The problem of convergence to local versus global maxima can be discussed in terms of the related problem of the accuracy and precision of the parameter estimation. We will consider, first, the parameter estimation bias (accuracy) and, second, the standard deviation (precision). The ML estimation is asymptotically unbiased; thus, if the probabilistic model corresponds to the data, the model-based neural network is guaranteed to converge to the global maximum for a sufficiently large number of pixels. The asymptotic requirement is well satisfied in this case, since a large number of clutter pixels is available for the model estimation, and the clutter model usually is a good approximation of the data. Thus, global convergence can be expected. Cases when the model is not sufficiently accurate are further considered in Section V. Of course, when only a small number of pixels is available, the problem of global versus local convergence can be easily mitigated by considering several initial guesses. We have investigated this problem in many examples and found that usually two or three different initial guesses are sufficient to find the global maximum. A fundamental issue related to convergence is adaptive or learning efficiency, that is, how precisely model parameters are estimated from a limited amount of data. The ML estimation is asymptotically efficient; it reaches the Cramer–Rao bound (CRB). We verified numerically in several cases that our neural network attains the CRB and that this asymptotic behavior does not require too many samples. An empirical rule of thumb is that on the order of 10 (real scalar) measurements are needed for each estimated (real scalar) parameter [28]. The learning requirements of this system are also moderate due to the utilization of a priori models. These models, for particular applications, are to be designed to strike a balance between the available a priori knowledge and the available training data. Thus, the mathematical apparatus described in this section outlines an approach to resolving the issue of the seemingly inescapable exponential complexity of the modeling of intellect discussed in Section I. The concept of the model-based neural network


promises to fulfill the McCulloch vision of learning based on complicated a priori neural structures.

The convergence process of the MBNN, which is considered to be a real-time evolution of a dynamical system, is determined by the flow of stimuli, the a priori models, and the ML estimation-learning principle. The ML learning principle drives this neural network to maximize the likelihood of the internal model. In other words, the model-based neural network possesses an internal ML drive to improve its internal representation of the world.

The next section describes the development of physically based models for SAR ATR applications. The models described in this paper are, on the one hand, relatively sophisticated in that they model radar returns from physical principles and account for multiple sources of signals present in the image. On the other hand, as far as modeling target geometry is concerned, the models are simple, as they address single-pixel detection and classification problems. Multipixel classification requires the development of more complicated multipixel geometrical models, which is beyond the scope of this paper.

IV. PHYSICALLY BASED CLUTTER AND TARGET MODELS

A. Background Clutter Model

This section describes the background clutter model for the case when there is no specific physical mechanism for a single or a few dominant scatterers in the pixel, so that a clutter pixel return is composed of multiple scatterers in this pixel with random relative phases. Therefore, both the in-phase and quadrature components of the signal are sums of multiple positive and negative values, so that their distributions across pixels of similar type (same type of terrain) can be modeled as Gaussian with zero mean [38]. The phase of the return from the pixel, thus, may be considered to be uniformly distributed between zero and $2\pi$, which is called a circularly symmetric distribution. Different types of terrain are described by Gaussian distributions with different covariances. Multiple types of terrain that might be present in an image are alternative sources of a signal; thus, a probability distribution of signals in an image is modeled as a superposition of alternatives. We consider the complex scattering amplitudes to be statistically independent from pixel to pixel. This assumption is not necessarily valid because nearby clutter pixels might contain contributions from the same scatterers. It is used here because the pixels subsume many clutter returns, so that the assumption of clutter-pixel independence may be used as an approximation. In addition, the prior probabilities $\mathrm{pdf}(H_k)$ are modeled as constants for each class and denoted $r_k$. Thus, the three complex scattering amplitudes $Z_n$ associated with a particular pixel are modeled with a zero-mean, circularly symmetric, multivariate, complex Gaussian mixture density,
$$\mathrm{pdf}(Z_n) = \sum_{k} r_k\, \mathrm{pdf}(Z_n \mid H_k), \qquad \mathrm{pdf}(Z_n \mid H_k) = \frac{1}{\pi^3 \det C_k}\, \exp\bigl(-Z_n^{\dagger} C_k^{-1} Z_n\bigr) \qquad (16)$$

where $Z_n^{\dagger}$ is the Hermitian conjugate vector (that is, complex conjugate and transposed), $C_k$ are the complex covariances for the components describing the different types of terrain present in an image, and $r_k$ are the priors, or relative occurrences, of the types of terrain. Adaptation of this model is achieved by estimation of the parameters $C_k$ and $r_k$. The circular symmetry property implies that the complex covariance matrices have a specific relationship between the real and imaginary parts of their elements, corresponding to the definition of these matrices,
$$C_k = E\{Z_n Z_n^{\dagger} \mid H_k\}. \qquad (17)$$
The exponent in (16) can be written as
$$Z_n^{\dagger} C_k^{-1} Z_n = \mathrm{Tr}\bigl(C_k^{-1} Z_n Z_n^{\dagger}\bigr), \qquad \sum_{s} Z_{ns}^{\dagger} C_k^{-1} Z_{ns} = 4\, \mathrm{Tr}\bigl(C_k^{-1} C_n\bigr) \qquad (18)$$
where $C_n$ is the pixel covariance introduced in (2). According to (16) and (18), $C_n$ is a sufficient statistic for the complex scattering amplitudes distributed according to the circularly symmetric complex Gaussian mixture distribution (16). An equivalent interpretation of this model is that $C_n$ is distributed according to a mixture of complex Wishart density functions,
$$\mathrm{pdf}(C_n) = \sum_{k} r_k\, \mathrm{pdf}(C_n \mid H_k), \qquad \mathrm{pdf}(C_n \mid H_k) \propto \frac{(\det C_n)^{L-3}}{(\det C_k)^{L}}\, \exp\bigl(-L\, \mathrm{Tr}(C_k^{-1} C_n)\bigr) \qquad (19)$$
where $L$ is the number of averaged subpixels (looks).

In the case of the NASA data sets, the data have been recorded as averaged over four higher resolution SAR subpixels (2). Our modeling procedure is still adequate since, in most cases, several adjacent subpixels belong to the same type of clutter and are distributed according to a single mixture component, so that $C_n$ is still a sufficient statistic for the data. The MBNN equations estimating the parameters of the above clutter model are obtained similarly to the general (11)–(15) by maximizing the likelihood (10), with the pdf's given by (19), over the parameters $C_k$ and $r_k$. The equations for the Association subsystem are exactly the same as the general (15), and the equations for the Modeling subsystem are simplified to
$$r_k = \frac{N_k}{N}, \qquad C_k = \frac{1}{N_k} \sum_{n} W_{nk}\, C_n, \qquad N_k = \sum_{n} W_{nk}. \qquad (20)$$
Here, $N$ is the total number of pixels, and $N_k$ can be interpreted as the expected number of pixels associated with model $k$. Equations (15) and (20) lead to the maximum likelihood (ML) estimation, which can be proved similarly to the real mixture models [28].
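The following sketch illustrates the clutter adaptation of (15), (19), and (20) and the detection of a target as the least likely clutter pixel. The per-component density keeps only the factors of the complex Wishart pdf that depend on the component covariance; the number of looks, the component count, and all variable names are illustrative assumptions.

```python
# A minimal sketch of Wishart-mixture clutter adaptation (15), (19), (20) and
# the "least likely pixel" anomaly test used for target detection.  Only the
# Sigma_k-dependent factors of the Wishart density, |Sigma_k|^{-L} *
# exp(-L tr(Sigma_k^{-1} C_n)), are kept inside the weights; the factor that
# depends only on C_n cancels there.
import numpy as np

def log_component(Cn, Sigma, L=4):
    """log of the Sigma-dependent part of the Wishart density per pixel."""
    inv = np.linalg.inv(Sigma)
    tr = np.einsum('ij,nji->n', inv, Cn).real        # tr(Sigma^{-1} C_n)
    _, logdet = np.linalg.slogdet(Sigma)
    return -L * (logdet + tr)

def fit_clutter_mixture(Cn, K=3, L=4, n_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    N = Cn.shape[0]
    r = np.full(K, 1.0 / K)
    Sigma = np.stack([Cn[rng.integers(N)] + 1e-3 * np.eye(3) for _ in range(K)])
    for _ in range(n_iter):
        logp = np.stack([np.log(r[k]) + log_component(Cn, Sigma[k], L)
                         for k in range(K)], axis=1)
        logp -= logp.max(axis=1, keepdims=True)       # numerical stabilization
        W = np.exp(logp)
        W /= W.sum(axis=1, keepdims=True)             # association weights (15)
        Nk = W.sum(axis=0)
        r = Nk / N                                     # rates, eq. (20)
        Sigma = np.einsum('nk,nij->kij', W, Cn) / Nk[:, None, None]
    return r, Sigma, W

def detect_anomaly(Cn, r, Sigma, L=4):
    """Index of the pixel least likely under the clutter mixture."""
    logp = np.stack([np.log(r[k]) + log_component(Cn, Sigma[k], L)
                     for k in range(len(r))], axis=1)
    # pixel-dependent Wishart factor |C_n|^(L-3); constant terms are dropped
    _, logdet_Cn = np.linalg.slogdet(Cn)
    mix_ll = np.logaddexp.reduce(logp, axis=1) + (L - 3) * logdet_Cn
    return int(np.argmin(mix_ll)), mix_ll
```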


For the above case of the clutter model, the number of parameters is 10 per model (clutter type), and every pixel measurement contains nine independent parameters; thus, on the order of 10 pixels per clutter type is sufficient for an accurate estimation of the model parameters. A typical number of pixels per clutter type used in the examples described below was on the order of hundreds or thousands. When estimating a model for the first time, the number of iterations could be from a few to a few tens, which is acceptable. Far fewer iterations are required in real-time operation, when the model is continuously updated as every few new pixels are acquired; a typical number of iterations is one, which is a very fast convergence.

B. Outlier Models

In some cases, relatively few Wishart components were sufficient to accurately model the clutter distribution, resulting in a few outliers, of which one was the target. However, there were cases in which the outliers could not be unambiguously selected, and additional processing had to be performed. Since these outliers are located in the tails of the likelihood distribution, an appropriate procedure must be brought to bear to obtain a more accurate representation of these tails. The Wishart mixture model provides the best representation of the central portion of the likelihood distribution because the majority of the observations are associated with that part of the distribution. For the Wishart mixture model to provide a characterization of similar fidelity in the tails, a large number of components might be required. However, if the outlier observations could be separated from the remainder of the observations and the density function of the outliers estimated in the absence of the central portion of the likelihood distribution, then the characterization of the tails could be accomplished with a much smaller number of components. In removing the central portion of the likelihood distribution, it is apparent that while the total distribution may have zero mean, the outlier distribution will have nonzero mean, because the outliers are displaced from the central portion of the zero-mean distribution. It is also possible that outliers are caused by large objects such as cliffs or by man-made objects such as cars or high-power line towers. Thus, it is important that the likelihood function of the outliers be characterized by a mixture of components with nonzero means. Two models chosen for this purpose of a parsimonious representation of the tails of the distributions of the likelihood function are described below.

1) Pixel Eigenvalue Model (PEM): The PEM uses a multivariate Gaussian mixture density for the three eigenvalues of the pixel covariance matrix (18). Because the pixel covariance matrix is Hermitian, the eigenvalues are real-valued and nonnegative; these considerations, on the one hand, permit using real-valued distribution mixtures and, on the other hand, indicate that Gaussian components may not be the best choice since they do not account for the nonnegativeness. The PEM does not convey all the information in the data because the eigenvalues provide only a portion of the available information. The MBNN implementation of the real Gaussian mixture model was described in [28]. It differs from the Wishart


mixture (19) by using Gaussian components, which are given by
$$\mathrm{pdf}(\Lambda_n \mid H_k) = (2\pi)^{-d/2} (\det D_k)^{-1/2} \exp\Bigl(-\tfrac{1}{2} (\Lambda_n - M_k)^T D_k^{-1} (\Lambda_n - M_k)\Bigr) \qquad (21)$$
where $d$ is the dimensionality of the model, $\Lambda_n$ is the vector of eigenvalues of the pixel covariance $C_n$, $M_k$ are the expected values of the eigenvalue vectors for each model component, and the covariance matrices of the eigenvalue distributions for each component are defined as $D_k = E\{(\Lambda_n - M_k)(\Lambda_n - M_k)^T \mid H_k\}$. The Association subsystem equations are given by (15) with Gaussian pdf's. The Modeling subsystem equations for the ML estimation of the parameters $r_k$, $M_k$, and $D_k$ are simplified to the following form [28]:
$$r_k = \frac{N_k}{N}, \qquad M_k = \frac{1}{N_k} \sum_{n} W_{nk}\, \Lambda_n, \qquad D_k = \frac{1}{N_k} \sum_{n} W_{nk} (\Lambda_n - M_k)(\Lambda_n - M_k)^T. \qquad (22)$$
An important difference between the PEM and the Wishart mixture model is that the PEM components form a complete set of functions in the functional space of all possible pdf's of the eigenvalues, whereas the Wishart components do not form a complete set of functions because they are based on the zero-mean amplitude model (16). This issue of completeness is important if the assumptions leading to the Wishart mixture are not exactly satisfied. In addition, the PEM can be used in a supervised target detection approach because the eigenvalues are unaffected by phase miscalibration.

2) Covariance Matrix Real Gaussian Mixture Model (CMM): The pixel covariance matrix $C_n$, being a Hermitian matrix, contains nine nonredundant real components: the three diagonal components and the real and imaginary parts of the three independent off-diagonal components. Here, we consider the pdf of a real-valued vector formed by these nine nonredundant components of the pixel covariance matrix,
$$V_n = \bigl(C_{11},\, C_{22},\, C_{33},\, \mathrm{Re}\,C_{12},\, \mathrm{Re}\,C_{13},\, \mathrm{Re}\,C_{23},\, \mathrm{Im}\,C_{12},\, \mathrm{Im}\,C_{13},\, \mathrm{Im}\,C_{23}\bigr)_n.$$
The covariance matrix real Gaussian mixture (CMM) models the pdf of this vector as a multivariate Gaussian mixture. The parameters of this model are the rates, means, and covariances of the mixture components. The MBNN equations for this model are exactly the same as those considered above in Section IV-B1, with dimensionality $d = 9$ and $V_n$ used in place of $\Lambda_n$. The components of the CMM form a complete set of functions; this model does not utilize any specific a priori knowledge of the scattering mechanism, and the CMM utilizes all the information present in the pixel covariance data.
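A short sketch of the two feature mappings follows, assuming the pixel covariances are stored as an (N, 3, 3) complex array; the resulting real-valued feature vectors can be fed to the same Gaussian-mixture iteration sketched in Section III. Function and variable names are illustrative.

```python
# A minimal sketch of the PEM and CMM feature mappings described above.
import numpy as np

def pem_features(Cn):
    """PEM: the three real, nonnegative eigenvalues of each pixel covariance."""
    return np.linalg.eigvalsh(Cn)             # shape (N, 3), ascending order

def cmm_features(Cn):
    """CMM: the nine nonredundant real components of each Hermitian matrix."""
    diag = Cn[:, [0, 1, 2], [0, 1, 2]].real   # C11, C22, C33
    off = Cn[:, [0, 0, 1], [1, 2, 2]]         # C12, C13, C23
    return np.concatenate([diag, off.real, off.imag], axis=1)   # shape (N, 9)
```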


C. Rician Model

The models developed above are suitable for detecting a single target pixel in a SAR image as the least likely clutter pixel. If several target measurements are available, a two-class classifier can be developed that utilizes the available information for each class. In the case of a target present in a SAR pixel, the assumptions that led to the zero-mean, circularly symmetric hypothesis for the clutter model may no longer be valid. A more appropriate distribution for target plus clutter has been derived in [39]. In this case, the radar pixel amplitude for the $k$th-type scatterer can be written as a sum of the random clutter scatterers and a nonrandom one,
$$Z_n = Z_c + T_k\, e^{i\phi} \qquad (23)$$
where $Z_c$ is a circularly symmetric, zero-mean complex Gaussian vector with covariance matrix $C_k$; $T_k$ is a complex, deterministic, but unknown 3-D scattering amplitude vector; and $\phi$ is the phase of this scatterer. The phase $\phi$ is related to the range from the radar to the pixel, and it is very sensitive to the altitude; therefore, it should be modeled as uniformly distributed over 0 to $2\pi$ and independent of $Z_c$. Therefore, the probability density of $Z_n$ conditioned on the component $k$ and on $\phi$ is
$$\mathrm{pdf}(Z_n \mid H_k, \phi) = \frac{1}{\pi^3 \det C_k}\, \exp\Bigl(-\bigl(Z_n - T_k e^{i\phi}\bigr)^{\dagger} C_k^{-1} \bigl(Z_n - T_k e^{i\phi}\bigr)\Bigr). \qquad (24)$$
Since the phase $\phi$ is random, the interest is in the probability density of $Z_n$ conditioned only on its component membership,
$$\mathrm{pdf}(Z_n \mid H_k) = \frac{1}{2\pi} \int_0^{2\pi} \mathrm{pdf}(Z_n \mid H_k, \phi)\, d\phi = \frac{\exp\bigl(-Z_n^{\dagger} C_k^{-1} Z_n - T_k^{\dagger} C_k^{-1} T_k\bigr)}{\pi^3 \det C_k}\, I_0\bigl(2\,\bigl|Z_n^{\dagger} C_k^{-1} T_k\bigr|\bigr) \qquad (25)$$
where $I_0$ is the zeroth-order modified Bessel function. It is seen that if $T_k$ is phase shifted, the probability density is unchanged, so that only the magnitudes and the relative phases of the components of $T_k$ are important. The distribution (25) is called the Rice pdf. The parameters of the Rician mixture include the rates $r_k$, the covariances $C_k$, and the scattering amplitudes $T_k$. These parameters are estimated by the Modeling subsystem. The ML neuronal estimation equations for the rates are the same as before (20),
$$r_k = \frac{N_k}{N}, \qquad N_k = \sum_{n} W_{nk}. \qquad (26)$$
The neuronal equations (27) for the estimation of the covariances $C_k$ and the scattering amplitudes $T_k$ are derived in [39]. In these equations, the weights $W_{nk}$ and additional Rician weights $V_{nk}$ are computed by the Association subsystem: the weights $W_{nk}$ are computed as previously, (15), using the Rice pdf, and $V_{nk}$ are complex numbers with an amplitude given by (28) and a phase factor given by (29), where $(\cdot)^*$ denotes the complex conjugate.
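A minimal sketch of evaluating the Rice density, in the form of (25) reconstructed above, for a single pixel vector follows; it uses the exponentially scaled Bessel function for numerical stability, and the variable names are illustrative.

```python
# A minimal sketch of evaluating log pdf(Z | H_k) of eq. (25) for one 3-D
# complex pixel vector Z, given a component covariance C and a deterministic
# scattering amplitude T.  Names and the explicit normalization follow the
# 3-D circular complex Gaussian assumed in (16).
import numpy as np
from scipy.special import i0e

def log_rice_pdf(Z, C, T):
    Cinv = np.linalg.inv(C)
    quad_Z = (Z.conj() @ Cinv @ Z).real          # Z^H C^{-1} Z
    quad_T = (T.conj() @ Cinv @ T).real          # T^H C^{-1} T
    cross = abs(Z.conj() @ Cinv @ T)             # |Z^H C^{-1} T|
    _, logdetC = np.linalg.slogdet(C)
    # log I0(2*cross) = log i0e(2*cross) + 2*cross  (scaled Bessel function)
    log_bessel = np.log(i0e(2.0 * cross)) + 2.0 * cross
    return -3 * np.log(np.pi) - logdetC - quad_Z - quad_T + log_bessel
```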

V. NASA DATA EXAMPLES

The NASA data set was processed using the Wishart mixture model (19) and (15). Recall that the NASA data set comprises nine images containing one target each, with unknown target positions. All nine images were processed. In all cases, the targets were identified without a single false alarm. In five cases, the Wishart-model processing results were deemed sufficient for target identification. In the four other cases, we determined that additional processing was required using the outlier models. Section V-A describes a typical example of the five cases using the Wishart model alone and the way we determined that additional processing was not needed in this case. Additional processing for the other four cases is discussed in Section V-B.

A. Michigan Data and Other Similar Cases

The upper row in Fig. 2 shows the results of applying the Wishart mixture model (19) to SAR data taken in a heavily wooded area in Michigan. The first three images display the diagonal components of the pixel-covariance matrix data (these are the powers in the two copolarized and in the cross-polarized signals). The likelihood function was computed from the entire complex pixel-covariance matrix data according to the Wishart model. The negative log-likelihood function image is shown next. The rightmost image shows the thresholded negative log-likelihood function, which indicates that there are two outliers that may be considered as possible targets; in fact, the pixel with the largest negative log-likelihood (i.e., the least likely pixel) was indeed the pixel containing the target. Thus, the target was detected without a single false alarm. The number of mixture components (clutter types) was estimated as follows. The data were processed with four components. The estimated expected covariances for each component were manually examined to detect any unexpected phenomenology. This was repeated using four, five, and six components. Nothing unusual was detected, but in the case of six components, two were very similar to each other. Therefore, it was decided to use five components in this example. A similar procedure was repeated for every image.


Fig. 2. Identification of the downed aircraft wreckage under a heavy foliage canopy for two examples from the NASA data set. The upper row illustrates an application of the Wishart mixture model to the Michigan data set. The P-band SAR, 50 × 30 pixel image part is shown. The Wishart mixture model results in target detection without false alarms. The lower row illustrates the North Carolina data set. The P-band SAR, 50 × 50 pixel image part is shown. The first five images show the same type of data as the upper row. In this case, the Wishart mixture model does not lead to a reliable detection (j). The outlier model is used to reduce the potential number of false alarms, resulting in a successful target detection without false alarms (k).

From three to five components were required for different images. The process of selecting the number of components can be easily automated. Detection thresholds were selected as follows. A negative log-likelihood (NLL) histogram was plotted for each image (that is, the number of pixels versus binned NLL values). Each histogram was manually examined. For five out of the nine images, the histograms could be divided into two nonoverlapping parts: most of the pixels on the left (low NLL) and one to three pixels on the right. The detection threshold was selected to separate these few outlier pixels. For the other four images, the histograms showed continuous distributions, where outliers could not be unambiguously detected. These four images were subjected to the additional processing described in the next subsection.

B. North Carolina Data and Other Similar Cases

The first four images in the lower row of Fig. 2 show the North Carolina data set processed using the Wishart model similarly to the Michigan data set in the upper row. The fourth image illustrates the results of this processing, with a threshold selected to yield 20 outliers. The PEM model with just a single component is applied to these outlier pixels. The rightmost image illustrates the results: The characterization of the tails of the likelihood function distribution by estimating the distribution of the outlier pixels results in few "outliers of the outlier distribution." Only three pixels exceed the threshold, and the target is located in the pixel with the lowest likelihood. The target is detected without a single false alarm. Altogether, we have applied the outlier model processing to four data sets. In each case, we chose between the PEM and CMM models based on which one resulted in fewer remaining outliers. In every case, the least likely clutter point happened to be the target; therefore, all the targets were detected with zero false alarms.
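A minimal sketch of this two-stage outlier processing follows: the least likely pixels under the clutter mixture are re-scored with a single-component Gaussian model of their PEM eigenvalue features. The counts and the regularization are illustrative assumptions, and pem_features and the clutter-likelihood scores refer to the earlier sketches.

```python
# A minimal sketch of the two-stage outlier processing: (i) take the ~20 least
# likely pixels under the clutter mixture, (ii) fit one Gaussian component to
# their PEM eigenvalue features, and (iii) keep the least likely pixels under
# that second-stage model ("outliers of the outlier distribution").
import numpy as np

def outlier_stage(Cn, mix_ll, n_outliers=20, n_keep=3):
    first = np.argsort(mix_ll)[:n_outliers]          # least likely clutter pixels
    feats = pem_features(Cn[first])                  # eigenvalue features
    mean = feats.mean(axis=0)
    cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(3)
    diff = feats - mean
    # Mahalanobis distance is equivalent to ranking by the single-component
    # Gaussian likelihood (largest distance = lowest likelihood).
    maha = np.einsum('nd,de,ne->n', diff, np.linalg.inv(cov), diff)
    return first[np.argsort(maha)[::-1][:n_keep]]
```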


VI. STOCKBRIDGE DATA EXAMPLE

A. Stockbridge Clutter Scene Segmentation

In the Stockbridge data case, clutter scene segmentation was of interest as well as target detection, and a detailed characterization of the clutter types was performed using the Wishart mixture model. The Wishart mixture model estimation automatically produces a probabilistic image segmentation according to (15). (For the previous NASA data set, segmentation is not discussed because little is known about the image-truth clutter types.) An example of the Stockbridge scene segmentation is shown in Fig. 3. The minimum number of required clutter-model mixture components was six because there were six classes of clutter given by the image truth. Six components were determined to be sufficient to characterize the clutter. This determination was based on comparing the six-component segmentation results versus the image truth. An examination of the a posteriori probability plots for these components in Fig. 3 indicates that components 1 through 6 correspond to the various types of clutter present in the scene. The components in this list are ordered according to the decreasing value of the estimated variance. Previous results reported by other researchers [40], [41] indicated that the covariance matrices can be approximated by the form (30), in which the parameters describing the polarimetric structure of the covariance are constant and fixed, and the only parameter that varies among components is the overall scale. Let us compare these findings with the covariance matrices estimated by MLANS and tabulated in Table I.
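For such a comparison, each estimated component covariance can be summarized by a few standard polarimetric descriptors; the descriptors computed in the following sketch (hv/hh and vv/hh power ratios and the hh-vv correlation) are illustrative stand-ins for the fixed parameters of the approximation (30), and the names are assumptions.

```python
# A minimal sketch of summarizing an estimated clutter covariance C_k by
# standard polarimetric descriptors, for a Table I-style comparison across
# components.  C is a (3, 3) Hermitian covariance ordered (hh, hv, vv).
import numpy as np

def covariance_descriptors(C):
    p_hh, p_hv, p_vv = C[0, 0].real, C[1, 1].real, C[2, 2].real
    rho = C[0, 2] / np.sqrt(p_hh * p_vv)          # complex hh-vv correlation
    return {"hv/hh": p_hv / p_hh, "vv/hh": p_vv / p_hh, "|rho_hhvv|": abs(rho)}
```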


Fig. 3. Wishart mixture model scene segmentation. The top image is the grey-scale composite of the a posteriori probabilities (shown below) for the six clutter components that were automatically segmented by the network. The mean RCS values of the components (bottom figures) show that the clutter requires a multicomponent model (e.g., component 5, while having the hh and vv RCS intermediate between components 4 and 6, has the largest hv RCS of all modes).

Examination of Table I shows that our results are generally consistent with the previous findings; however, the approximation of (30) is a very rough one. Significant differences exist in the covariance structures between components. In particular, some of the parameters assumed fixed in (30) vary among components by the same order of magnitude as the overall scale. These differences are exploited in our work for accurate scene segmentation and target detection.

B. Stockbridge Target Detection

A much larger image area was processed in this example, as compared with Section V. Correspondingly, our approach to

target detection in this case is more complicated than the one used there and is more representative of real-time operation, when it is not feasible to process all the data at once, but the data should be processed as they are acquired. A natural technique often used for this type of processing employs a sliding window. Within a sliding window, we estimate the parameters of models for both the local clutter and the target. Targets can then be identified by thresholding the clutter-mixture likelihood or by performing a likelihood ratio test. Implementation of such a real-time-like strategy, which adapts to local clutter regions, requires addressing several difficult issues.


TABLE I STATISTICAL PARAMETERS FOR SEGMENTED CLUTTER

How does one estimate the clutter type and avoid biasing this estimate when the target is present but has not yet been detected? One approach that has been used in the past incorporates a sliding window with a guard region in the center (to avoid the target). This approach brings with it a whole set of additional issues that must be addressed concerning the size of the window and the guard region (and their shapes and relative positions). In addition, how does one handle, for example, tree lines, when the window overlaps two quite distinct regions of clutter? Then, there are tradeoffs that must be made in choosing a window size that is large enough to collect meaningful statistics but small enough so that only one clutter type is present. This approach brings with it more issues than it resolves.

The approach that we use avoids nearly all of these problems. We begin processing by using a sliding window within which we estimate both the clutter and target models simultaneously using the Wishart mixture model. This solves the problem of biases that could be introduced by estimating only the clutter statistics when the target is present. Starting with initial estimates of the statistics of all the potential clutter and target types, the model-based neural network iterates and adapts to provide a local estimate of the clutter types present and determines their relative proportions. By thresholding the clutter-mixture likelihood, outliers (i.e., targets) can be identified on a pixel-by-pixel basis. Alternatively, a two-class classifier (likelihood ratio) can be invoked, since MLANS has estimates of the target statistics for this window. In addition to providing the desired adaptation to clutter regions, this procedure also circumvents the problems attendant with using a guard region, inasmuch as the target pixels are "captured" by the target components of the Wishart model (which can be thought of as adaptive, complex-shaped guard regions).
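A minimal sketch of this sliding-window strategy follows: the clutter mixture is re-estimated locally in each window and pixels falling below a likelihood threshold are flagged. The window size, stride, and threshold are illustrative assumptions, and only the outlier-thresholding alternative is shown, not the two-class likelihood ratio.

```python
# A minimal sketch of sliding-window detection; it reuses routines like
# fit_clutter_mixture and detect_anomaly from the earlier sketch, passed in
# as the `fit` and `score` arguments.
import numpy as np

def sliding_window_detect(Cn_image, fit, score, win=20, stride=10, thresh=-50.0):
    """Cn_image: (H, W, 3, 3) pixel covariances; returns flagged (row, col)."""
    H, W = Cn_image.shape[:2]
    detections = []
    for r0 in range(0, H - win + 1, stride):
        for c0 in range(0, W - win + 1, stride):
            patch = Cn_image[r0:r0 + win, c0:c0 + win].reshape(-1, 3, 3)
            rates, Sigma, _ = fit(patch)              # adapt clutter model locally
            _, mix_ll = score(patch, rates, Sigma)    # per-pixel clutter likelihood
            for idx in np.flatnonzero(mix_ll < thresh):
                detections.append((r0 + idx // win, c0 + idx % win))
    return sorted(set(detections))
```

For example, sliding_window_detect(C_image, fit_clutter_mixture, detect_anomaly) would run the earlier clutter-adaptation sketch window by window.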

We processed three strip maps of the Stockbridge data set containing 3 × 512 × 2048 pixels. The sliding window size was chosen as 20 × 20 = 400 pixels, so that it covers a target plus a sufficient number of pixels to estimate the clutter model. Available ground truth for this scene shows that, within the range coverage of the SAR, there were nine military vehicle targets and nine corner reflectors (all but one of which were pointing within 90° of the SAR line of sight). MLANS detected seven of the nine targets (the two that were not detected were behind a tree line, hidden in shadows). In addition, MLANS detected two targets of which we had no previous knowledge and that were later confirmed to be actually present in the scene. All eight corner reflectors were also detected and identified as distinct from the targets. There was one not very credible false alarm (that is, just a few pixels were detected, whereas in each case of a true target, a number of pixels were detected, indicating a target-like shape). Detection was performed by thresholding the clutter likelihood, that is, by detecting the clutter-model outliers, similarly to the discussion in Section V. Using the likelihood ratio resulted in the same performance. Simultaneously with target detection, MLANS also segmented the scene into clutter types that were consistent with the known ground truth.

The detection approach described in this section can also be used to collect data on targets in SAR images to facilitate the development of target models. As data are collected, the target model can be updated to improve the likelihood-ratio detection performance. The multipixel target data can then be related to physical scattering mechanisms to obtain a better understanding of the similarities and differences between target types, leading naturally to the development of multiple-pixel target models. The development of multiple-pixel target detection and identification algorithms could proceed in the same manner as the single-pixel algorithm described above, with the exception that multiple pixels are used. A simple approach would be just to use more than one pixel in a model, leading to a higher dimensional space, the dimensionality being higher in proportion to the number of pixels being considered.


Fig. 4. Comparison of clutter and target data and their models. The upper row illustrates clutter distributions and Wishart models; (a)–(c) comparison of Wishart models (Rayleigh in these axes) and Parzen density estimates of the individual clutter components demonstrates the multicomponent Wishart nature of clutter (two components, with a possible third component evident in the L-clutter distribution); (d) distributions of pixels classified by MLANS as vegetation and shadows; the Wishart mixture model (Gaussian in this axis) fits the data well. The lower row illustrates target data. Target and clutter distributions are compared in (e)–(g); Parzen density estimates of the hh, hv, and vv RCS magnitudes of the clutter and target single-pixel data for the example scene clearly show the structural differences between the target and clutter distributions; it is clear that the target distributions cannot be modeled as Wishart mixtures, whereas Rician mixtures could be a plausible alternative; (h) target data distributions are plotted along with a mixture model utilizing one Wishart and three Rician modes.

In this way, all pixel-to-pixel, polarization-to-polarization, and polarization-to-pixel correlations can be automatically accounted for, and no unwarranted assumptions or restrictions need be imposed on the structure of these correlations (for example, by assuming that the polarization statistics are independent of the pixel statistics). This approach, however, leads very quickly to a prohibitively high-dimensional classification space. In order to avoid the "curse of dimensionality," an analysis of correlations can be used to determine whether there are any structures in the covariance matrices (banded, bidiagonal, Toeplitz, etc.) that can be exploited to simplify the modeling. An alternative, model-based approach can utilize adaptive, physically based multipixel models, combining geometric and radiometric information. Such an approach can be based on the MBNN formulation in Section III; it is, however, beyond the scope of the current paper.

C. Stockbridge Data Analysis versus Rician and Wishart Mixture Models

An assessment of the validity of the Rician model for man-made targets and a comparison to the clutter models was performed using a subset of the Stockbridge data. The results are shown in Fig. 4. The upper row of Fig. 4 illustrates clutter distributions estimated using the Parzen method (solid line) and using the two-component Wishart mixture models (dotted line). The first three plots [Fig. 4(a)–(c)] show the distributions of the amplitude absolute values for the three polarizations, hh, hv, and vv,

The upper row of Fig. 4 illustrates clutter distributions estimated using the Parzen method (solid line) and using the two-component Wishart mixture models (dotted line). The first three plots [Fig. 4(a)–(c)] show distributions of the amplitude absolute values for the three polarizations, |hh|, |hv|, and |vv|, along the abscissa (Wishart components appear as Rayleigh in these axes). The last plot [Fig. 4(d)] shows similar distributions using the real part of the complex amplitude as the abscissa (Wishart components appear as Gaussian). The natural clutter is well characterized by the Wishart model and requires two modes: one for the small-cross-section clutter and one for the large-cross-section clutter. The target data are shown in the lower row [Fig. 4(e)–(h)]. The first three plots [Fig. 4(e)–(g)] compare Parzen estimates of the clutter and target distributions. The target data clearly show evidence of multimodal behavior that appears to be Rician rather than Wishart, as evidenced by peaks in the Parzen density estimates that are far from the origin. Some target data are shown in Fig. 4(h) using the real part of the complex amplitude as the abscissa. A mixture of one Wishart component and three Rician components models these target data well.

VII. SUMMARY

A novel mathematical technique of a model-based neural network has been developed along with several models for SAR target signatures. A distinguishing property of this technique is that it combines a priori knowledge of the physical laws of electromagnetic scattering with adaptation to the actual environment, and this combination is achieved with linear computational complexity, without considering multiple combinations of models and parameters. The technique has been successfully applied to detecting small, low-signature targets in heavy clutter environments under a foliage canopy and in ice backgrounds.


It was also successful in single-pixel detection of resolved multipixel targets. In the future, more complicated multipixel models can be developed combining adaptivity and a priori knowledge. Adaptive physics-based models can be constructed from target primitives that are relevant to different scattering phenomenologies (e.g., specular, diffuse, multiple-bounce) and geometric shapes (e.g., filled, linear, circular); estimation of the posterior probabilities of these target primitives can then be used for identifying target types. This could be advantageous over template matching, for example, by not requiring a large database covering a multitude of target aspect angles and by being able to adapt to variations in the target scattering geometries. This approach also has the potential for vastly improved performance by combining the functions of detection and identification. We have outlined the general mathematical formalism of a model-based neural network that can implement such an approach, combining a priori knowledge and adaptive learning and solving the problems of exorbitant training requirements and combinatorial explosion in computations.

REFERENCES

[1] W. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," Bull. Math. Biophys., vol. 7, pp. 115–133, 1943.
[2] W. Pitts and W. S. McCulloch, "How we know universals: The perception of auditory and visual forms," Bull. Math. Biophys., vol. 9, pp. 127–147, 1947.
[3] M. L. Minsky, Semantic Information Processing. Cambridge, MA: MIT Press, 1968.
[4] M. L. Minsky, "A framework for representing knowledge," in The Psychology of Computer Vision, P. H. Winston, Ed. New York: McGraw-Hill, 1975.
[5] P. H. Winston, Artificial Intelligence, 2nd ed. Reading, MA: Addison-Wesley, 1984.
[6] J. Koster and R. May, Levels of Syntactic Representation. Dordrecht, The Netherlands: Foris, 1981.
[7] R. P. Botha, Challenging Chomsky. Oxford, U.K.: Basil Blackwell, 1991.
[8] P. P. Bonnisone, M. Henrion, L. N. Kanal, and J. F. Lemmer, Uncertainty in Artificial Intelligence 6. Amsterdam, The Netherlands: North-Holland, 1991.
[9] H. R. Keshavan, J. Barnett, D. Geiger, and T. Verma, "Introduction to the special section on probabilistic reasoning," IEEE Trans. Pattern Anal. Machine Intell., vol. 15, no. 3, pp. 193–195, 1993.
[10] N. Chomsky, Language and Mind. New York: Harcourt Brace Jovanovich, 1972.
[11] N. J. Nilsson, Learning Machines. New York: McGraw-Hill, 1965.
[12] K. Fukunaga, Introduction to Statistical Pattern Recognition. New York: Academic, 1972.
[13] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[14] S. Watanabe, Pattern Recognition. New York: Wiley, 1985.
[15] L. I. Perlovsky, "Computational concepts in classification: Neural networks, statistical pattern recognition, and model based vision," J. Math. Imag. Vision, vol. 4, no. 1, 1994.
[16] R. Nevatia and T. O. Binford, "Description and recognition of objects," Artif. Intell., vol. 8, no. 1, pp. 77–98, 1977.
[17] R. A. Brooks, "Model-based interpretation of images," IEEE Trans. Pattern Anal. Machine Intell., vol. 5, no. 2, pp. 140–150, 1983.
[18] W. E. L. Grimson and T. Lozano-Perez, "Model-based recognition and localization from sparse range or tactile data," Int. J. Robot. Res., vol. 3, no. 3, pp. 3–35, 1984.
[19] R. T. Chin and C. R. Dyer, "Model-based recognition," ACM Comput. Surv., vol. 18, pp. 67–108, 1986.
[20] R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, Machine Learning: An Artificial Intelligence Approach, Vol. II. Los Altos, CA: Morgan Kaufmann, 1986.
[21] Y. Lamdan and H. J. Wolfson, "Geometric hashing," in Proc. 2nd Int. Conf. Comput. Vision, 1988.


[22] S. Negahdaripour and A. K. Jain, "Final report of the NSF workshop on the challenges in computer vision research," in Future Directions of Research. National Sci. Foundation, 1991.
[23] A. M. Segre, "Applications of machine learning," IEEE Expert, vol. 7, no. 3, pp. 31–34, 1992.
[24] A. Califano and R. Mohan, "Multidimensional indexing," IEEE Trans. Pattern Anal. Machine Intell., vol. 16, pp. 373–392, 1994.
[25] W. E. L. Grimson and D. P. Huttenlocher, "Introduction to the special issue on interpretation of 3-D scenes," IEEE Trans. Pattern Anal. Machine Intell., vol. 13, no. 10, pp. 969–970; also vol. 14, no. 2, pp. 97–98, 1991.
[26] B. Widrow, "Adaptive sample-data systems," 1959 WESCON Rec., vol. IV, pp. 74–85, 1959.
[27] L. I. Perlovsky, "Multiple sensor fusion and neural networks," DARPA Neural Network Study, MIT/Lincoln Lab., Lexington, MA, 1987.
[28] L. I. Perlovsky and M. M. McManus, "MLANS for adaptive classification and sensor fusion," Neural Networks, vol. 4, no. 1, pp. 89–102, 1991.
[29] D. F. Specht, "Probabilistic neural networks," Neural Networks, vol. 3, no. 1, pp. 109–118, 1990.
[30] R. L. Streit and T. E. Luginbuhl, "ML training of neural networks," NUWC, New London, CT, Tech. Memo TM 911277, 1990.
[31] W. W. Goj, Synthetic Aperture Radar. Norwood, MA: Artech House, 1993.
[32] H. A. Zebker, J. J. van Zyl, and D. N. Held, "Imaging radar polarimetry from wave synthesis," J. Geophys. Res., vol. 92, pp. 683–701, 1987.
[33] H. Urkowitz, Signal Theory and Random Processes. Dedham, MA: Artech House, 1983.
[34] L. I. Perlovsky, "Neural networks for sensor fusion and adaptive classification," in Proc. 1st Ann. Int. Neural Network Soc. Meet., Boston, MA, 1988.
[35] L. I. Perlovsky, "MLANS tracker," in Proc. ARPA Radar Workshop, Chesapeake, VA, A147-3, 1995.
[36] L. I. Perlovsky, W. H. Schoendorf, D. M. Tye, and W. Chang, "Concurrent classification and tracking using MLANS," J. Underwater Acoust., vol. 45, no. 2, pp. 399–414, 1995.
[37] L. I. Perlovsky, "Model-based intellect concepts," in Proc. World Neur. Net. Congr., San Diego, CA, 1996.
[38] W. H. Schoendorf et al., "MLANS target detection in SAR images," in Proc. SPIE Opt. Eng. Conf., 1994.
[39] T. L. Marzetta, "EM algorithm for complex Rician density for polarimetric SAR," in Proc. Int. Conf. Acoust., Speech, Signal Processing, Detroit, MI, 1995.
[40] E. Rignot and R. Chellappa, "Segmentation of SAR data," IEEE Trans. Image Processing, vol. 1, no. 3, 1992.
[41] L. M. Novak, M. C. Burl, and W. W. Irving, "Optimal polarimetric processing for enhanced target detection," IEEE Trans. Aerosp. Electron. Syst., vol. 29, no. 1, 1993.

Leonid I. Perlovsky (M’86–SM’95) received the M.S. degree (summa cum laude) in physics from Novosibirsk University, Novosibirsk, USSR, in 1971, and the Ph.D. degree in theoretical physics from the Joint Institute for Nuclear Research, Moscow, USSR, in 1975. From 1975 to 1978, he served as Assistant and then Associate Professor of Physics at Novosibirsk University and Professor of Applied Mathematics at the Siberian Engineering Institute. During 1979 and 1980, he served as a Research Professor at New York University. From 1980 until 1985, he was a Senior Research Specialist with Exxon Research. From 1985 to 1986, he was a Principal Research Scientist with Advanced NMR Systems. He then joined Nichols Research, Lexington, MA, where he is currently Chief Scientist and Senior Fellow Member of the Technical Staff. He has also served as a consultant to Software Foundry, the Massachusetts Institute of Technology (MIT) Computer Center, and the MIT Earth Resources Laboratory. He has worked in a number of areas including theoretical physics, quantum field theory, elementary particle physics, applied mathematics, information theory, estimation theory, operations research, signal processing, oil exploration, seismic wave propagation, psychology, vision, and medical imaging. His current interests include model-based neural networks; detection, recognition, prediction, tracking, and control; information fusion; philosophy of science; semiotics; and quantum computing. He has published more than 100 papers in scientific publications.


William H. Schoendorf received the Ph.D. degree in electrical engineering in 1963 from Purdue University, West Lafayette, IN. He was a Research Scientist at Conductron Corporation and was with the Astronomy Department of the University of Michigan, Ann Arbor, from 1963 to 1970. He was a leader in the Systems Analysis Group at the Massachusetts Institute of Technology (MIT) Lincoln Laboratory from 1970 to 1984. Currently, he is Corporate Vice President of Nichols Research Corporation, Lexington, MA. He has conducted and led research in the areas of target tracking, sensor fusion, signal and image processing, and neural network algorithm design for radar, laser, infrared, visible, and sonic sensors.

Bernard J. Burdick received the Ph.D. degree in elementary particle physics from Case Western Reserve University, Cleveland, OH, in 1970. He was a Member of the Technical Staff at the Massachusetts Institute of Technology (MIT) Lincoln Laboratory from 1969 to 1984, and joined Nichols Research Corporation, Lexington, MA, in 1984, where he is now Director of their Remote Sensing Division. He has performed and directed research in target detection, recognition and discrimination in several areas, including the development of algorithms for detecting and recognizing time-critical targets in polarimetric SAR imagery and the implementation of protocols and algorithms necessary for accurate and reliable evaluation of the effectiveness of strategic and tactical offensive countermeasures and defensive remote sensing systems.


David M. Tye received the M.S. degree in electrical engineering in 1993 and the B.S. degree in mathematical physics, both from the State University of New York at Binghamton. From 1969 to 1989, he was an Officer in the United States Marine Corps. He was with Nichols Research Corporation, Lexington, MA, from 1991 to 1995, where he performed research in neural networks, image and signal processing, and automatic object recognition. In January 1996, he joined Datacube, Inc., Danvers, MA, where he develops real-time imaging applications.