IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 3, MARCH 2012

Image Registration Under Illumination Variations Using Region-Based Confidence-Weighted M-Estimators


Mohamed M. Fouad, Member, IEEE, Richard M. Dansereau, Senior Member, IEEE, and Anthony D. Whitehead, Member, IEEE

Abstract—We present an image registration model for image sets with arbitrarily shaped local illumination variations between images. Any nongeometric variations tend to degrade the geometric registration precision and impact subsequent processing. Traditional image registration approaches do not typically account for changes and movement of light sources, which result in interimage illumination differences of arbitrary shape. In addition, these approaches typically use a least-square estimator that is sensitive to outliers, and interimage illumination variations are often large enough to act as outliers. In this paper, we propose an image registration approach that compensates for arbitrarily shaped interimage illumination variations, each of which is processed using a robust M-estimator tuned to that region. Each M-estimator for each illumination region has a distinct cost function by which small and large interimage residuals are unevenly penalized. Since the segmentation of the interimage illumination variations may not be perfect, a segmentation confidence weighting is also imposed to reduce the negative effect of mis-segmentation around illumination region boundaries. The proposed approach is cast in an iterative coarse-to-fine framework, which allows a convergence rate similar to competing intensity-based image registration approaches. The overall proposed approach is presented in a general framework, but the experimental results use the bisquare M-estimator with region segmentation confidence weighting. A nearly tenfold improvement in subpixel registration precision is seen with the proposed technique when convergence is attained, as compared with competing techniques, using both simulated and real data sets with interimage illumination variations.

Index Terms—Image registration, local illumination changes, M-estimation, segmentation confidence weighting.

I. INTRODUCTION

Manuscript received March 09, 2011; revised June 30, 2011; accepted August 17, 2011. Date of publication September 08, 2011; date of current version February 17, 2012. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Charles Creusere. M. M. Fouad is with the Department of Computer Engineering, Military Technical College, Cairo, Egypt (e-mail: [email protected]). R. M. Dansereau is with the Department of Systems and Computer Engineering, Carleton University, Ottawa, ON K1S 5B6, Canada (e-mail: [email protected]). A. D. Whitehead is with the School of Information Technology, Carleton University, Ottawa, ON K1S 5B6, Canada (e-mail: [email protected]). Digital Object Identifier 10.1109/TIP.2011.2167344

Geometric image registration (GIR) is the process of determining the point-by-point correspondence between two views of a scene and the transformation to align these correspondences. The GIR of a set of images is a common preprocessing step in many applications [1], such as generating panoramic image mosaics [2], performing super-resolution enhancement [3], image stitching [4], change detection [5], medical imaging [6], and remote-sensing applications [7]. For these and similar applications, subpixel registration accuracy is necessary for satisfactory postprocessing results.

A primary problem in many image registration scenarios is that there often also exist nongeometric variations within the image set. These variations arise from many factors, including changes in lighting, motion/deformations of features in the scene, changes in occluded and occluding objects, and changes in exposure, aperture, focus, lens distortion, or other photometric factors. In addition, multimodal data sets can include larger variations in terms of imaging modality, resolution, and information content. Literature exists on handling many of these confounding factors for GIR, although arguably few approaches have been proposed to flexibly handle local changes in illumination, with the multimodal techniques being a notable exception: they can attain good registration performance but lack the level of fine subpixel precision that we seek. The focus of this paper is hence on how interimage illumination variations in the scene can be compensated within an intensity-based GIR framework, with particular attention to the arbitrary shape typical of illumination variations resulting from shadows cast by one or more light sources. Note that the light sources are assumed to be far field from the scene.

Interimage variations in illumination do indeed impact the GIR, as discussed in the following literature. Refs. [8] and [9] present similar global-illumination models (GIMs) with scalar gain and offset modeling in their GIR to relate the intensity levels of an image pair and improve the registration performance. In [8], an additional smoothness constraint, with predefined neighborhoods, is also imposed.
In [10], an affine illumination model (AIM) is proposed with an extended gain and offset model that uses triangular or quadrilateral region support (not arbitrarily shaped) to allow improved modeling of illumination variations. In [11], a dual inverse compositional algorithm is presented based on the assumption that the geometric and photometric transformations can be interchanged, thus impeding the use of explicit local photometric transformations. Although Bartoli [11] models photometric changes rather than illumination variations, he does so using a scalar gain and offset when dealing with gray images, in much the same way as the GIM. In this paper, we will refer to the GIM and the AIM as our comparator group. The approaches in [8]–[11] do not take into account multiple shading levels with arbitrariness in the shape of the

1057-7149/$26.00 © 2011 IEEE


illumination regions, both of which are common in typical imaging scenarios. In turn, the resulting registration from these approaches tends to lack high subpixel geometric registration precision (GRP) when multiple interimage arbitrarily shaped regions with distinct illumination variations exist. In some of our preliminary work [12], interimage illumination variations were jointly considered with the subpixel GIR. However, those variations were limited to only two shading levels (i.e., shadowed and nonshadowed), whereas in [13], we generalized the model to handle image sets with interimage arbitrarily shaped local illumination variations (ASLIV). One drawback of the approaches in [8]–[13] is that a least-square estimator (LSE) is used for the cost function on the interimage residuals. As is well understood, the influence function (i.e., the first derivative) of the LSE is proportional to the residuals, thus being sensitive to outliers. Therefore, in [14], we proposed replacing the LSE with a more robust estimator, such as an M-estimator. Using M-estimators was touched upon in [15], which presents an image registration approach using an M-estimator correlation coefficient with an exhaustive branch-and-bound search. However, the direction in [14] allowed for a single cost function to be developed and solved using the Gauss–Newton method, which allows for improved subpixel GRP. Given a cost function with a tuning parameter, such as the Huber function as discussed in [14], the residuals were segmented into two classes, i.e., small residuals and large residuals. Then, each class is differently penalized based on the M-estimator to reduce the impact of outliers. A problem with this approach is that the selected threshold is not necessarily appropriate for all illumination regions, thus yielding larger error in some illumination regions compared with others.
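To make the contrast with the LSE concrete, the following sketch compares the quadratic penalty with a bounded-influence robust penalty; we use the Huber function [18] as a stand-in, and the function names and tuning constant are our own illustrative choices:

```python
import numpy as np

def rho_ls(r):
    """Least-squares penalty: its influence function rho'(r) = 2r
    grows without bound, so large residuals dominate the cost."""
    return r ** 2

def rho_huber(r, c=1.345):
    """Huber penalty: quadratic for |r| <= c, linear beyond it,
    so the influence function is bounded by the tuning constant c."""
    small = np.abs(r) <= c
    return np.where(small, 0.5 * r ** 2, c * (np.abs(r) - 0.5 * c))

residuals = np.array([0.1, 0.5, 1.0, 5.0, 50.0])  # last two act as outliers
print(rho_ls(residuals))
print(rho_huber(residuals))
# The robust penalty grows only linearly for the outlier-like residuals,
# so they no longer dominate the total cost as they do under least squares.
```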
Moreover, during the iterative registration process, the boundary of each illumination region may contain pixels that rightly belong to an adjacent region, thus impacting the GRP. The challenges are how to create a locally spatially adaptive threshold for each region and how to reduce the negative effects of such mis-segmented pixels to improve the GRP. In this paper, we propose a confidence-weighted region-based M-estimator approach that improves the subpixel GRP jointly with interimage illumination compensation. In principle, the proposed approach exploits, extends, and generalizes the ASLIV registration model presented in [13] and [14]. The proposed approach is introduced in a generalized form for both the objective cost functions and the confidence-weighting functions. However, for the experimental work, we exploit the bisquare function as the objective cost function, which we also extend from single threshold to multithreshold in order to cope with several illumination regions. Moreover, the proposed approach employs confidence weighting to reduce the negative impact of the mis-segmented areas located on the regions' boundaries. The proposed approach uses an iterative coarse-to-fine scheme [16] to allow for estimating larger geometric transformations and to reduce the computational cost. We would like to point out that the coarse-to-fine scheme is also used to increase the geometric registration accuracy (GRA) by improving the chance of convergence but stress that the focus of this paper is on improving subpixel precision in cases where convergence



already occurs versus addressing whether the image registration actually converges. As such, we do assume for the image sets tested that correct convergence does occur with approaches such as the GIM, the AIM, or similar image registration approaches, or that we have a priori knowledge that allows for a reasonable initial image alignment. From these initial rough registrations, our proposed approach seeks to improve the GRP of the final registration by discovering and compensating for illumination variations.

The rest of this paper is organized as follows: In Section II, we discuss the image formation model used in this paper. In Section III, we propose an intensity-based image registration approach that uses M-estimators with the ASLIV model. Section IV focuses on mathematically manipulating the proposed approach using the M-estimator with a specific objective function and a particular weighting function through an iterative multiresolution framework. In Section V, both real and simulated data sets used in the experiments are described. Section VI presents the performance evaluation measures and the implementation setup of competing models. In Section VII, we present experimental results using both simulated and real data sets. Finally, conclusions on the proposed approach are drawn in Section VIII.

II. IMAGE FORMATION MODEL

Here, we give an overview of the image formation model that is assumed in this paper. To present the image model, we list the assumptions related to capturing images of a scene, such as extrinsic and intrinsic camera assumptions, lighting assumptions, and scene assumptions, as follows:
1) Extrinsic camera assumptions: The camera is assumed to capture images of a scene at different times of the day or the year. The distance of the camera from the scene is large compared with the relief depth of the scene (i.e., an affine motion model would be reasonable to represent the motions between the captured images with negligible parallax). A narrow baseline is also assumed.
2) Intrinsic camera assumptions: The camera follows the pinhole model. The camera is assumed to be perfect so that photometric corruptions and lens distortions are assumed already corrected.
3) Lighting assumptions: Far-field light sources are assumed with respect to (w.r.t.) the scene. The scene exhibits a Lambertian model of illumination on its surfaces.
4) Scene assumptions: The scene potentially has objects with 3-D structure. The scene is static from the perspective that there are no moving objects. The scene is dynamic from the perspective of changes in illumination.
An example of a real image pair is shown in Fig. 1, which shows two photos of the same canyon taken at two different times of the day from slightly different camera positions.

III. IMAGE REGISTRATION WITH REGION-BASED CONFIDENCE-WEIGHTED M-ESTIMATORS

Here, we propose the general framework for an intensity-based image registration approach. In Section III-A, we present the ASLIV registration model [13], which results in improved GRP by incorporating interimage illumination compensation.
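The image formation assumptions of Section II can be made concrete with a short synthesis sketch: an affine warp composed with a piecewise-constant, per-region gain/offset illumination change. This is our own illustration; names such as `synthesize_pair` and the specific parameter values are not from the paper:

```python
import numpy as np
from scipy.ndimage import affine_transform

def synthesize_pair(i1, region_masks, gains, offsets, warp_matrix, warp_offset):
    """Simulate a second view of i1 under the stated assumptions:
    far-field lighting + Lambertian surfaces give each illumination
    region one constant gain and offset; camera motion is affine."""
    g = np.zeros_like(i1, dtype=float)
    o = np.zeros_like(i1, dtype=float)
    for mask, gj, oj in zip(region_masks, gains, offsets):
        g[mask] = gj      # constant gain within the region
        o[mask] = oj      # constant offset within the region
    illum = g * i1 + o    # element-wise (Hadamard-style) illumination change
    # Affine geometric transformation (negligible parallax assumed).
    return affine_transform(illum, warp_matrix, offset=warp_offset,
                            order=1, mode="nearest")

# Toy example: one shadowed quadrant with reduced gain.
rng = np.random.default_rng(0)
i1 = rng.uniform(0.0, 1.0, (64, 64))
shadow = np.zeros((64, 64), dtype=bool)
shadow[32:, 32:] = True
masks = [shadow, ~shadow]
i2 = synthesize_pair(i1, masks, gains=[0.5, 1.0], offsets=[0.05, 0.0],
                     warp_matrix=np.eye(2), warp_offset=(0.5, -0.3))
print(i2.shape)
```

Pairs generated this way exhibit exactly the arbitrarily shaped interimage illumination variations that the ASLIV model in Section III targets.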



Fig. 1. [(a) and (b)] Two images of a canyon taken at different times of the day, showing illumination variations between the images (real pair A).

Fig. 2. With a priori known information, [(a) and (b)] two different sets of distinct illumination levels of two input images (i.e., v_1 = v_2 = 3), yielding (c) a masked AID with seven arbitrarily shaped regions (i.e., J = 7), each with its own constant illumination, assuming far-field light sources and Lambertian surfaces. (a) Masked I_1. (b) Masked I_2. (c) Masked AID.


Then, we present the drawbacks of using the traditional LSE with such a registration model in Section III-B. A more robust estimator using M-estimators is proposed in Section III-C. Finally, Section III-D highlights the impact of mis-segmented areas and how to use segmentation confidence measures to form a weighting function that lessens their negative effects on the GIR.

A. ASLIV Registration Model

Here, we remind the reader of the ASLIV registration model [13] that will be exploited in our proposed approach. Consider two input images I_1 and I_2 captured of the same scene under different illumination configurations, following the image formation model outlined in Section II. The different illumination configurations could range from an outdoor scene at two different times of the day/year to a scene where light sources have moved or been added or removed. Thus, I_1 and I_2 have distinct illumination regions with shapes due to the structure of objects in the scene and the position of the light source(s). The image registration problem can be cast in an LSE framework, such as

$$\min_{\mathbf{h},\mathbf{s}}\;\sum_{\mathbf{x}\in\Psi} r^2(\mathbf{x};\mathbf{h},\mathbf{s}) \qquad(1)$$

where

$$r(\mathbf{x};\mathbf{h},\mathbf{s}) = I_2\big(T_g(\mathbf{x};\mathbf{h})\big) - T_p\big(I_1(\mathbf{x});\mathbf{s}\big) \qquad(2)$$

and where T_g and T_p denote the geometric and illumination transformations, respectively, between I_1 and I_2, with parameter vectors h and s to be estimated. The geometric transformation T_g maps pixel x in the region of interest Ψ defined in image I_1 to its corresponding pixel in image I_2. For our purpose, Ψ is the field of view common between I_1 and I_2. Moreover, the illumination transformation T_p with the parameter vector s could generally model the distinct (local) illumination regions created between I_1 and I_2 as

$$T_p(I_1;\mathbf{s}) = \mathbf{G}\circ I_1 + \mathbf{O} \qquad(3)$$

where ∘ represents the Hadamard product (i.e., an element-wise product) and G, O ∈ R^{M×N}. Note that gain G and offset O in (3) are matrices instead of scalars or parameterized transformations and are given as

$$\mathbf{G}=\begin{bmatrix}g_{1,1}&\cdots&g_{1,N}\\ \vdots&\ddots&\vdots\\ g_{M,1}&\cdots&g_{M,N}\end{bmatrix},\qquad \mathbf{O}=\begin{bmatrix}o_{1,1}&\cdots&o_{1,N}\\ \vdots&\ddots&\vdots\\ o_{M,1}&\cdots&o_{M,N}\end{bmatrix}. \qquad(4)$$
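The need for constraints on G and O (discussed next) can be verified numerically: for any arbitrary per-pixel gain matrix, an offset matrix can always be found that reproduces the observed pair exactly, so the unconstrained model (3) admits infinitely many solutions. The variable names below are our own:

```python
import numpy as np

rng = np.random.default_rng(4)
i1 = rng.uniform(0.1, 1.0, (8, 8))
i2 = 0.8 * i1 + 0.05              # "true" illumination change

# Pick ANY per-pixel gain matrix g; the offset o = i2 - g * i1 then
# satisfies i2 = g * i1 + o exactly, element-wise (Hadamard product).
for _ in range(3):
    g = rng.uniform(0.0, 2.0, (8, 8))   # arbitrary gain matrix
    o = i2 - g * i1                      # matching offset matrix
    assert np.allclose(g * i1 + o, i2)   # model fits perfectly every time
# Hence per-pixel unconstrained gain/offset is ill-posed, which is why
# the ASLIV model ties each region to a single (g_j, o_j) pair.
```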



We need to explicitly specify that (1) is only valid for all pixels that fall in the intersection of the two fields of view of I_1 and I_2. Moreover, the geometric transformation between I_1 and I_2 can be modeled by a motion parameter vector h. An obvious problem with G and O as matrices is that an infinite number of solutions exist for G and O relating points in I_1 to points in I_2. Hence, constraints must be placed on G and O to condition a solution. Given a scene having some objects with 3-D structure that are illuminated by a far-field light source, some regions with distinct illumination variations are created with a near piecewise constancy in gain and offset when Lambertian surfaces are assumed. This case is common for aerial and satellite imaging. By assuming these illumination changes, we can argue that each image will contain a small number of distinct levels of illumination. To exhibit the idea of the ASLIV model, we will start with the case where a priori known information of the distinct shading levels is available. Fig. 2(a) shows a masked image I_1 that is segmented into three distinct illumination levels (i.e., v_1 = 3). Similarly, Fig. 2(b) depicts the mask of another image I_2 in terms of a different set of three distinct illumination levels (i.e., v_2 = 3). Then, the mask of an absolute image difference (AID) between I_1 and I_2 has a set of overlapping regions R_j, 1 ≤ j ≤ J [see Fig. 2(c)], where J refers to the number of the resulting overlapping regions (i.e., illumination regions). Similarly, in a real imaging scenario, we propose a simple method of segmenting the AID of the roughly aligned images using a segmentation algorithm, such as the k-means algorithm [17]. Thus, an estimate of the illumination regions R_j can be obtained, where 1 ≤ j ≤ J. Note that the sum of the number of distinct illumination levels in both images (i.e., v_1 + v_2) does not necessarily equal the number of illumination regions J, as the latter depends on the former and on the intersections among these distinct levels.

We would like to stress that we assume that each illumination region R_j has its own constant gain g_j and offset o_j, where 1 ≤ j ≤ J, which is a reasonable assumption for far-field light sources and Lambertian surfaces in the scene. In addition, we note that, for far-field light sources and Lambertian surfaces, the number of distinct illumination levels would simply be selected as the number of far-field light sources plus one for ambient light. For example, for a typical daytime outdoor scene where the sun overpowers the lighting and creates shadows, a choice of two levels (i.e., shadowed and nonshadowed) is reasonable if surfaces are Lambertian. The departure from the assumption of


Fig. 3. [(a) and (b)] Simulated image pair (34) with three distinct illumination regions (i.e., J = 3) and (c) the corresponding AID between the overlapping areas of the registered images using the LS-ASLIV approach.


far-field light sources or Lambertian surfaces would increase J, but our experience is that J could still be kept relatively small for many image classes, as shown in [13, Sec. 5.3]. For pixel-domain mathematical manipulations, each region R_j can be represented by a binary mask B_j, such that

$$B_j(\mathbf{x})=\begin{cases}1,&\mathbf{x}\in R_j\\ 0,&\text{otherwise}\end{cases} \qquad(5)$$

where 1 ≤ j ≤ J. Thus, we can constrain G and O in (3) to J regions as follows:

$$\mathbf{G}=\sum_{j=1}^{J} g_j\,\mathbf{B}_j,\qquad \mathbf{O}=\sum_{j=1}^{J} o_j\,\mathbf{B}_j \qquad(6)$$

structuring the illumination parameter vector s = [g_1, o_1, …, g_J, o_J]^T. Therefore, we are looking to estimate the unknown vector φ = [h^T s^T]^T in (1), which contains the geometric and illumination parameters, respectively.

B. LSE Sensitivity to Outliers

Using a least-square (LS) cost function such as in (1) has its pitfalls, such as sensitivity to outliers. Here, we discuss and demonstrate this issue of the LSE from an image registration precision perspective. Let us refer to the approach in [8] and [9] as LS-GIM, the approach in [10] as LS-AIM, and the approaches in [12] and [13] as LS-ASLIV. It is worth noting that the approaches in [8]–[10], [12], and [13] use an LSE framework. Let us consider the image pair in Fig. 3(a) and (b) with simulated distinct illumination regions (i.e., the somewhat unrealistic cloud shadows). The resulting AID between the overlapping areas of the registered images using the LS-ASLIV approach [12], [13] is shown in Fig. 3(c). The AID reflects how well the two overlapping areas of the aligned images are registered, thereby giving an impression of the GRP and of the extent to which the interimage illumination variations are compensated. In Fig. 3(c), black indicates no difference in intensity, whereas nonzero differences are lighter and show some level of mismatch between the aligned images. This mismatch is due to either a subpixel geometric misalignment or a lack of perfect illumination compensation. Similar but worse performance was observed with LS-GIM and LS-AIM.

To further understand how the LSE creates additional issues when illumination compensation is attempted during GIR, such as in [12] and [13], let us first focus on the bottom-right quadrant depicted in Fig. 3(c). In this quadrant, we can make a number of observations. We see that remnants of the bottom-right shadow from Fig. 3(a) still exist. Throughout this shadowed region, the AID is larger, on average, than outside the shadowed region. This observation can also be seen in the other shadowed regions. The interimage local illumination variations affect the behavior of the LSE used in the LS-ASLIV approach. Recall that the influence function (i.e., the first derivative) of the LSE is proportional to the residuals without bound, thus being sensitive to outliers. Imperfect illumination compensation at early estimates of the GIR causes the interimage illumination variations to act like outliers. In turn, the resulting estimate of φ is expected to be imprecise and to affect the image registration precision.

C. General Framework Using M-Estimators

The previous subsection illustrates the drawbacks of using the LSE within the image registration process. Here, we suggest using an M-estimator [18] to refine the estimates of the geometric and illumination parameters under the umbrella of the ASLIV registration model. Moreover, we present a generic framework using various M-estimators in order to fine tune the unknown parameter vector φ. Recall that the LSE issue is that it squares both small and large residuals for each illumination region. One solution is to use a robust estimator that is not as susceptible as the LSE to outliers, such as the M-estimation presented in [18]. Then, the minimization problem in (1) to estimate φ can be reformulated using an M-estimator [18] instead of the LSE, such as

$$\min_{\boldsymbol{\varphi}}\;\sum_{\mathbf{x}\in\Psi}\rho\big(r(\mathbf{x};\boldsymbol{\varphi})\big) \qquad(7)$$

where ρ is a function that should have the following properties [19, Ch. 4]:
1) positive: ρ(r) ≥ 0 for all r and ρ(0) = 0;
2) symmetric: ρ(r) = ρ(−r);
3) increasing: ρ(r_1) ≥ ρ(r_2) for |r_1| ≥ |r_2|;
4) differentiable: ρ is continuously differentiable;
and where r(x) is a residual in matrix r at position x. Let ρ′ denote the derivative of ρ. Differentiating the objective function w.r.t. the parameters in φ and setting the partial derivatives to 0 produces a system of equations to estimate φ. It can be noticed that solving the estimation equations depends on the residuals. In addition, the residuals rely upon the estimated parameters. Therefore, an iterative scheme is required. In Section IV, we present an iterative framework using M-estimation with a chosen cost function. Since each illumination region should have its own gain and offset as formulated in (3) and since the LSE is sensitive to outliers, generally speaking, we believe that using a distinct M-estimator for each interimage illumination region would yield improvement in the GRP, as well as in the illumination compensation, compared with the LSE-based approaches. In other words, we



propose an approach that uses the ASLIV model in a framework of multiple M-estimators, such that a distinct objective function is assigned to each illumination region as

$$\min_{\boldsymbol{\varphi}}\;\sum_{j=1}^{J}\sum_{\mathbf{x}\in R_j}\rho_j\big(r_j(\mathbf{x};\boldsymbol{\varphi})\big) \qquad(8)$$

where, following (1) and (3), the residuals of each region can be expressed as

$$r_j(\mathbf{x};\boldsymbol{\varphi})=\Big[I_2\big(T_g(\mathbf{x};\mathbf{h})\big)-g_j\,I_1(\mathbf{x})-o_j\Big]\,B_j(\mathbf{x}). \qquad(9)$$

D. Confidence Weighting to Reduce the Impact of Segmentation Errors

As stated in Section III-A, a segmentation algorithm is used to segment the AID of the images to be aligned, yielding a binary mask B_j for each illumination region R_j. This mask is iteratively changed (refined) by applying the currently estimated geometric transformation at each iteration or by upscaling the mask to another resolution level when using a coarse-to-fine registration, as shown in [12]. During this iterative process in a coarse-to-fine framework, some of the boundary points of an illumination region may be mis-segmented and may actually belong to an adjacent region. Hence, we propose imposing segmentation confidence weighting so as to decrease the impact of segmentation errors, with the intent of improving the GRP.

There may be many solutions to reduce the negative effects of the mis-segmented boundary points of each region. One way is to let the iterative registration process account for the most confident points that are correctly segmented and then impose less weight on those points that are hesitatingly segmented between two adjacent illumination regions. This target may be approached by considering the distance of a point from the boundary of a region, such that the further a point is from the boundary, the more confident we are in that point. However, this approach may require estimating the maximum distance of a point from the boundary, after which some confident points may be considered as nonconfident. Another solution is to let the confidence of the points involved in the registration process rely on the number of iterations that a particular point has remained in a certain segmentation region. In other words, if a point keeps getting segmented into the same region at each iteration, then we should have higher confidence that we have segmented that point correctly. Similarly, if a point has changed which region it is in, then this should degrade our confidence in the segmentation for that point. This approach would be less computationally demanding than the former.

To illustrate the idea of the proposed confidence-weighting approach, let us consider two adjacent illumination regions R_A and R_B, a point x that is assumed to be located on the boundary between R_A and R_B and to belong to one of them, and a total number of iterations k. At iteration k, assume that point x has remained k_A iterations in region R_A and k_B iterations in region R_B, where k_A + k_B = k. Then, a confidence weight is assigned to each point of a region R_j, such that the weight matrix is

$$\mathbf{W}_j=\begin{bmatrix}w^{(j)}_{1,1}&\cdots&w^{(j)}_{1,N}\\ \vdots&\ddots&\vdots\\ w^{(j)}_{M,1}&\cdots&w^{(j)}_{M,N}\end{bmatrix},\qquad w^{(j)}(\mathbf{x})=\begin{cases}k_j(\mathbf{x})/k,&\mathbf{x}\in R_j\\ 0,&\text{otherwise}\end{cases} \qquad(10)$$

where k_j(x) denotes the number of iterations that x has remained in R_j. Thus, to include confidence weighting, only (9) needs to be updated to

$$r_j(\mathbf{x};\boldsymbol{\varphi})=w^{(j)}(\mathbf{x})\Big[I_2\big(T_g(\mathbf{x};\mathbf{h})\big)-g_j\,I_1(\mathbf{x})-o_j\Big]\,B_j(\mathbf{x}) \qquad(11)$$

where 0 ≤ w^{(j)}(x) ≤ 1. The minimization problem in (8) and all other related equations still hold without change using the geometric transformation T_g with the motion vector field. We can summarize the proposed registration approach as follows. First, we suggest using a distinct robust estimator, such as an M-estimator, for each illumination region to cope with its own residuals and ultimately improve the GRP. Meanwhile, the proposed approach imposes confidence weights on the points located on or near the boundary of each illumination region, which are prone to mis-segmentation.

IV. PROPOSED APPROACH USING BISQUARE M-ESTIMATORS WITH CONFIDENCE WEIGHTING

In the previous section, we have presented the proposed approach in a general framework assuming a suitably chosen objective cost function with certain attributes and a suitably chosen motion model. Here, we specify a cost function and a motion model to employ in the proposed approach. Moreover, we present a mathematical manipulation that allows us to estimate the unknown vector φ through an iterative coarse-to-fine scheme. First, we have to choose a function to act as ρ in (8). The properties of ρ can be achieved by many potential functions, such as Tikhonov [20], Geman–McClure [21], Green [22], Hebert–Leahy [23], Rouchouze [24], Charbonnier [25], Huber [18], and the bisquare [26]. Generally speaking, all aforementioned functions and similar ones can theoretically be used as an M-estimator for each interimage illumination region, as proposed in Section III-C. As shown in [27], we consider the bisquare function, i.e.,

$$\rho(r)=\begin{cases}\dfrac{c^2}{6}\left[1-\left(1-\left(\dfrac{r}{c}\right)^2\right)^3\right],&|r|\le c\\[1mm] \dfrac{c^2}{6},&|r|>c\end{cases} \qquad(12)$$

where c is a positive tuning threshold. Such a threshold adaptively controls the objective function's behavior to differently penalize small and large residuals. The advantage of the bisquare function is that its influence function assigns zero weight to the high residuals, which is expected to improve the GRP in (8).
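The zero-weight property of the bisquare function can be sketched directly; the formulas below are the standard Tukey bisquare penalty and its associated weight, written here for illustration with our own function names:

```python
import numpy as np

def bisquare_rho(r, c):
    """Tukey bisquare penalty: grows for |r| <= c, then saturates at
    c**2 / 6, so very large residuals add no further cost."""
    small = np.abs(r) <= c
    return np.where(small,
                    (c ** 2 / 6.0) * (1.0 - (1.0 - (r / c) ** 2) ** 3),
                    c ** 2 / 6.0)

def bisquare_weight(r, c):
    """Weight derived from the influence function: exactly zero beyond
    the threshold, so outlier-like residuals are ignored entirely."""
    small = np.abs(r) <= c
    return np.where(small, (1.0 - (r / c) ** 2) ** 2, 0.0)

r = np.array([0.0, 0.5, 1.0, 3.0])
print(bisquare_rho(r, c=1.0))
print(bisquare_weight(r, c=1.0))   # last entry is exactly 0
```

With a per-region threshold c_j, each illumination region gets its own copy of these functions, which is the multithreshold extension described next.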
Although the bisquare function in (12) can be applied within an M-estimator framework, a single threshold c controls the split between small and large residuals in the whole AID, as presented in [27]. A challenge is that a single threshold may be suitable for some regions but unsuitable for others. Therefore, we also


propose that each illumination region should have its own threshold. Since there are J regions, a certain threshold c_j should be assigned to each illumination region (i.e., each region should have its own bisquare M-estimator). In this paper, we would like to emphasize that we extend the ordinary formula of the bisquare function in (12) from the single threshold used in [27] to multithreshold (i.e., region based) as

$$\rho_j(r)=\begin{cases}\dfrac{c_j^2}{6}\left[1-\left(1-\left(\dfrac{r}{c_j}\right)^2\right)^3\right],&|r|\le c_j\\[1mm] \dfrac{c_j^2}{6},&|r|>c_j\end{cases} \qquad(13)$$

where 1 ≤ j ≤ J and r_j(x) denotes an element in matrix r_j located at position x. Note that r_j can be obtained from (11). Then, the bisquare M-estimator in (8) can be rewritten as

$$\min_{\boldsymbol{\varphi}}\;\sum_{j=1}^{J}\sum_{\mathbf{x}\in R_j}\rho_j\big(r_j(\mathbf{x};\boldsymbol{\varphi})\big). \qquad(14)$$

As presented in [28], the Huber cost function can be cast as a convex quadratic function. Similarly, the bisquare function in (14) can be cast as

$$\min_{\boldsymbol{\varphi}}\;\sum_{j=1}^{J}\sum_{\mathbf{x}\in R_j}\left[\frac{c_j^2}{6}\,h^{(j)}(\mathbf{x})+l^{(j)}(\mathbf{x})\,\rho_j\big(r_j(\mathbf{x};\boldsymbol{\varphi})\big)\right]. \qquad(15)$$

The high-residual selective matrix is

$$\mathbf{H}_j=\begin{bmatrix}h^{(j)}_{1,1}&\cdots&h^{(j)}_{1,N}\\ \vdots&\ddots&\vdots\\ h^{(j)}_{M,1}&\cdots&h^{(j)}_{M,N}\end{bmatrix} \qquad(16)$$

where

$$h^{(j)}(\mathbf{x})=\begin{cases}1,&|r_j(\mathbf{x})|>c_j\\ \text{d.c.},&\text{otherwise}\end{cases} \qquad(17)$$

and where d.c. denotes a "do not care" value. The low-residual selective matrix is

$$\mathbf{L}_j=\big[l^{(j)}_{m,n}\big],\qquad l^{(j)}(\mathbf{x})=\begin{cases}1,&|r_j(\mathbf{x})|\le c_j\\ \text{d.c.},&\text{otherwise.}\end{cases} \qquad(18)$$

Following [8]–[12] and [14], we opt to use the six-parameter affine motion model creating the motion vector field, such that

$$x'=a_1x+a_2y+a_3 \qquad(19)$$

$$y'=a_4x+a_5y+a_6. \qquad(20)$$

Then, r_j in (11) can be approximated as

$$\tilde r_j(\mathbf{x};\boldsymbol{\varphi}+\Delta\boldsymbol{\varphi})\approx r_j(\mathbf{x};\boldsymbol{\varphi})+\nabla_{\boldsymbol{\varphi}}\,r_j(\mathbf{x};\boldsymbol{\varphi})^{\mathrm T}\,\Delta\boldsymbol{\varphi} \qquad(21)$$

by substituting (6), (19), and (20) into (1) using (3) and then obtaining the first-order truncated Taylor series expansion at the pixel location x and the current estimate of φ. Note that other motion models could also be used, in which case (21) would need to be accordingly modified, with the limitation that the standard Gauss–Newton method only converges for small-scale problems. Modifications to the Gauss–Newton method for large-scale problems are possible [29], which could allow for motion models with a larger number of free parameters. We can estimate φ using the Gauss–Newton method [30, Ch. 10] to solve the nonlinear minimization problem in (15). Note that φ is updated by Δφ at iteration t, such that

$$\boldsymbol{\varphi}^{(t+1)}=\boldsymbol{\varphi}^{(t)}+\Delta\boldsymbol{\varphi}^{(t)}. \qquad(22)$$

Replacing r_j in (15) by its first-order truncated Taylor series expansion from (21), the cost in (15) becomes a convex quadratic function of Δφ, i.e., (23). Setting the gradient of (23) w.r.t. Δφ to zero, we obtain a system of linear estimation equations, i.e., (24). We can write (24) in matrix notation as

$$\mathbf{A}\,\Delta\boldsymbol{\varphi}=\mathbf{b} \qquad(25)$$

where A is formed from the confidence-weighted, low-residual-selected sums of products of the residual gradients, i.e., (26), and b from the corresponding sums of products of the residuals and their gradients, reshaped by the ColVector operation, i.e., (27) and (28).
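A minimal numerical sketch of one update of this kind, on a toy gain/offset-only problem, is given below. The function and variable names are our own, and the full method additionally updates the affine parameters and region masks; here a single zero weight mimics the bisquare high-residual cutoff:

```python
import numpy as np

def gauss_newton_step(params, residual_fn, jacobian_fn, weights):
    """One weighted Gauss-Newton update: solve the normal equations
    (J^T W J) delta = -J^T W r and return params + delta."""
    r = residual_fn(params)          # residual vector, shape (n,)
    jac = jacobian_fn(params)        # Jacobian, shape (n, n_params)
    jw = jac.T * weights             # J^T W without forming diag(W)
    delta = np.linalg.solve(jw @ jac, -jw @ r)
    return params + delta

# Toy problem: fit gain/offset (g, o) of i2 ~ g*i1 + o, with one outlier
# pixel given zero weight.
rng = np.random.default_rng(2)
i1 = rng.uniform(0.0, 1.0, 100)
i2 = 0.7 * i1 + 0.1
i2[0] += 10.0                        # outlier pixel
weights = np.ones(100)
weights[0] = 0.0                     # zero weight for the outlier
residual = lambda p: p[0] * i1 + p[1] - i2
jacobian = lambda p: np.stack([i1, np.ones(100)], axis=1)
p = gauss_newton_step(np.array([1.0, 0.0]), residual, jacobian, weights)
print(np.round(p, 3))                # -> [0.7 0.1]
```

Because this toy residual is linear in the parameters, a single step recovers the true gain and offset exactly once the outlier is down-weighted; the full nonlinear problem instead iterates the step within the coarse-to-fine scheme.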


Here, "ColVector" refers to reshaping the resulting matrix into a column vector, and an element of a matrix is indexed by its position (m, n). Equations (25)–(28) can be used to perform one iteration of finding a solution for Δφ to update φ in (22). However, the binary masks B_j in (26) must also be determined. Hence, an iterative framework is proposed by first solving for the unknown vector φ and then determining the distinct illumination regions to ultimately obtain their corresponding binary masks. These two steps can then be iterated to refine the parameters in φ and to refine the arbitrarily shaped regions and their masks. It is worth noting that segmenting the AID between the two images to be aligned affects the GRP at each iteration. This is because some points of a certain region become part of another region according to the result of the segmentation algorithm in the current iteration (i.e., the segmentation is iteratively adapted). In turn, the estimates of the geometric and illumination parameters are updated via (22).

Coarse-to-fine framework. It is necessary to realize that our proposed approach exploits the model in (1) and uses a coarse-to-fine framework [16] to cope with large geometric deformations. The main reason for using a coarse-to-fine scheme is to obtain higher GRA, whereas the primary focus of this paper is to improve the precision of the registration obtained after a rough alignment. A Gaussian pyramid can generally be created in a nondyadic structure. However, to simplify the implementation, we have built a Gaussian pyramid of resolution levels in a dyadic form for the input images. The registration process is initialized by setting the unknown vector φ to an identity initialization (as shown in Section VI-B) at the coarsest resolution level; φ is then updated by Δφ for a certain number of iterations per resolution level.
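The pyramid construction and per-level iteration can be sketched as follows; `refine_at_level` is a stand-in for the WB-ASLIV inner update, and the dyadic pyramid and cost-change stopping rule follow the description in the text:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_pyramid(img, n_levels):
    """Dyadic Gaussian pyramid; element 0 is the coarsest level."""
    levels = [img]
    for _ in range(n_levels - 1):
        levels.append(gaussian_filter(levels[-1], sigma=1.0)[::2, ::2])
    return levels[::-1]

def coarse_to_fine(i1, i2, n_levels, refine_at_level, max_iters=20, tol=1e-6):
    """Run a per-level refinement from coarsest to finest resolution,
    moving up once the cost change falls below tol or max_iters is hit."""
    phi = None                       # identity initialization
    for level, (a, b) in enumerate(zip(build_pyramid(i1, n_levels),
                                       build_pyramid(i2, n_levels))):
        prev_cost = np.inf
        for _ in range(max_iters):
            phi, cost = refine_at_level(a, b, phi, level)
            if abs(prev_cost - cost) < tol:
                break
            prev_cost = cost
        # (translation-like parameters would be rescaled here when
        #  moving to the next finer dyadic level)
    return phi

# Stub refinement that just reports the mean absolute difference as cost.
stub = lambda a, b, phi, level: (phi, float(np.mean(np.abs(a - b))))
img = np.random.default_rng(3).uniform(size=(64, 64))
coarse_to_fine(img, 0.9 * img, n_levels=3, refine_at_level=stub)
```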
The approach turns to the next finer level once one of the following criteria occurs: either the cost function in (15) is updated by less than a predefined threshold or a maximum number of iterations has been reached.

Convergence criteria. When the solution does not diverge, convergence is achieved at the last resolution level using the same criteria as aforementioned. In turn, the illumination parameters are adjusted to the resulting upscaled estimate as in (5). At each iteration, the estimated parameter vector is used to align the sensed image to the reference image with compensated illumination as in (1). In fact, since an approximate estimate is achieved by iterating at the lowest resolution, the subsequent refinement of that estimate requires only a few iterations at the larger resolution levels. The geometric and illumination parameters at each level of the pyramid are accumulated, yielding a single final estimate.

We refer to the proposed approach as region-based confidence-weighted bisquare M-estimators for ASLIVs, i.e., WB-ASLIV. In this notation, the number 6 refers to the number of unknown affine motion parameters from (19) and (20), while 2J refers to the number of unknown illumination parameters (i.e., a gain and an offset for each region). Moreover, the proposed approach, i.e., WB-ASLIV, imposes segmentation confidence weights on the boundaries of each interimage illumination region using a confidence approach to lessen the impact of misclassified areas.
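The coarse-to-fine iteration just described can be sketched as follows. This is a minimal skeleton under stated assumptions: `refine_step` is a hypothetical callable standing in for one parameter update via (25)–(28), and a 2x2 box average stands in for the Gaussian low-pass filter of the pyramid:

```python
import numpy as np

def gaussian_pyramid(img, levels):
    """Dyadic pyramid, coarsest level first. A 2x2 box average stands in
    for the Gaussian low-pass filter applied before each downsampling."""
    pyr = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        a = pyr[-1]
        h, w = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2
        a = a[:h, :w]
        pyr.append(0.25 * (a[0::2, 0::2] + a[1::2, 0::2]
                           + a[0::2, 1::2] + a[1::2, 1::2]))
    return pyr[::-1]

def coarse_to_fine(ref, sen, refine_step, num_regions=3,
                   levels=3, max_iter=20, tol=1e-3):
    """Skeleton of the proposed loop: refine at each level until the
    cost-function update drops below `tol` or `max_iter` is reached,
    then move to the next finer level. `refine_step(ref, sen, params)`
    is a hypothetical callable returning (params, cost)."""
    # Identity initialization: affine identity, unit gains, zero offsets.
    params = np.array([1, 0, 0, 1, 0, 0]
                      + [1] * num_regions + [0] * num_regions, dtype=float)
    for r, s in zip(gaussian_pyramid(ref, levels),
                    gaussian_pyramid(sen, levels)):
        prev_cost = np.inf
        for _ in range(max_iter):
            params, cost = refine_step(r, s, params)
            if abs(prev_cost - cost) < tol:   # cost update below threshold
                break
            prev_cost = cost
    return params
```

The identity initialization shown here (six affine parameters, then a unit gain and a zero offset per region) matches the vectors given later in Section VI-B.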

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 3, MARCH 2012
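The per-region machinery used in Algorithm 1 below, segmenting the AID with k-means, computing each region's median absolute residual (MAR), and down-weighting a boundary layer, can be sketched as follows. This is an illustrative sketch: the tiny 1-D k-means, the one-pixel boundary thickness, and the 0.5 boundary weight are assumptions, not the paper's exact (10) and (11) or the k-means of [17]:

```python
import numpy as np

def erode4(m):
    """One 4-neighbour binary erosion (pixels outside the image count as 0)."""
    p = np.zeros((m.shape[0] + 2, m.shape[1] + 2), dtype=bool)
    p[1:-1, 1:-1] = m
    return (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
            & p[1:-1, :-2] & p[1:-1, 2:])

def kmeans_1d(values, k, iters=20):
    """Tiny 1-D k-means used to split the AID into k illumination levels."""
    centers = np.quantile(values, np.linspace(0.0, 1.0, k))
    labels = np.zeros(values.shape, dtype=int)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels

def region_masks_from_aid(aid, k):
    """Segment the absolute intensity difference (AID) into k binary masks."""
    labels = kmeans_1d(aid.ravel(), k).reshape(aid.shape)
    return [labels == j for j in range(k)]

def confidence_weights(mask, thickness=1, boundary_weight=0.5):
    """Full confidence (1.0) inside a region, reduced confidence on a
    boundary layer of the given thickness, zero outside the region."""
    interior = mask.astype(bool)
    for _ in range(thickness):
        interior = erode4(interior)
    w = np.zeros(mask.shape, dtype=float)
    w[mask.astype(bool)] = boundary_weight
    w[interior] = 1.0
    return w

def region_mar(residual, mask):
    """Median absolute residual (MAR) of one illumination region."""
    return float(np.median(np.abs(residual[mask])))
```

The reduced weight on the boundary layer is what lessens the impact of pixels that the segmentation may have misclassified near region boundaries.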

Algorithm 1 summarizes the proposed approach for WB-ASLIV, which is cast in an iterative coarse-to-fine framework.

Algorithm 1 The proposed image registration approach WB-ASLIV

1: Given: two input images, as described in Section III-A.
2: Required: the vector that includes the finally estimated geometric and illumination parameters.
3: Initialize: the unknown vector by an identity initialization,
4: the number of resolution levels,
5: the maximum number of iterations per resolution level,
6: a predefined threshold of the cost-function update,
7: the number of illumination regions, and
8: the iteration counter, set to 1.
9: for each resolution level, from the coarsest to the finest, do
10: repeat
11: Apply the current parameter estimate to the sensed image.
12: Determine the AID between the two images.
13: Apply the k-means algorithm to the AID to obtain the binary masks of the illumination regions, setting the number of clusters to the number of regions.
14: Determine the residuals using (21).
15: Determine the boundary-layer pixels, of a given thickness, for each illumination region.
16: Determine the confidence-weighting matrix of each illumination region using (10).
17: Determine the weighted residual of each illumination region using (11).
18: Determine the median absolute residual (MAR) of each illumination region.
19: Set the threshold of the cost function (i.e., the bisquare function) of each illumination region proportional to its MAR.
20: Determine the large-residual selective matrix of each illumination region using (16).
21: Determine the small-residual selective matrix of each illumination region using (18).
22: Determine the corresponding value of the cost function using (23).
23: Determine the cost-function update.
24: Determine the unknown vector's update using (25).
25: Update the unknown vector using (22).
26: Increment the iteration counter.
27: until the cost-function update falls below the predefined threshold or the maximum number of iterations per resolution level is reached.
28: end for

Fig. 4. Real LANDSAT pair captured at different times of the year (real pair B).

V. DATA SET DESCRIPTION

Here, we describe the image data sets used in the experiments, which fall into two categories: real and simulated. The first category involves a 120 × 300 natural canyon pair (see Fig. 1) and nine real 400 × 400 and 600 × 600 LANDSAT satellite images [31] with unknown ground-truth motion [e.g., Fig. 4(a) and (b)]. Since the real data set does not provide a priori information on the actual number of illumination regions, we assume that each real pair has a fixed number of illumination regions, and the proposed model uses this assumed number during the alignment process for each real pair.

The second category of our data set includes 50 simulated image pairs with different numbers of regions. The size of each pair is 500 × 500 pixels, extracted from the 3000 × 3000 pixel IKONOS satellite image of the Pentagon and its surroundings. A simulated pair is constructed as follows. We have 50 reference images of 500 × 500 pixels. Each reference image is geometrically deformed using an a priori known affine transformation, with a random rotation angle and a random 2-D translation of subpixel accuracy within 6 pixels in both the horizontal and vertical directions. The approach in [32] is used to perform such geometric deformations. Since we assume that a rough registration can be obtained before running our algorithm, this Monte Carlo simulation setup provides the initial rough registration of the images. A sensed image is thus produced. Then, both the sensed and reference images are subjected to varying levels of illumination changes, similar to those shown in Fig. 3(a) and (b), producing shaded regions of varying size and leading to a simulated pair. The remaining pairs are produced similarly, yielding a data set with shaded regions of different sizes. These simulated data sets are used because they allow ground-truth verification of the estimated parameters against the a priori deformations.

In our experiments, we exploit two versions of the simulated data sets, defined as follows.

Definition 1: The "J = 3" data set is the 50 simulated image pairs constructed using the notation in Fig. 2(a) and (b). Given no intersections between the resulting regions, the number of regions is set to 3 [e.g., Fig. 3(a) and (b)].

Definition 2: The "J = 4" data set is the 50 simulated image pairs constructed using the notation in Fig. 2(a) and (b) with different settings. Given no intersections between the resulting regions, the number of regions is set to 4.

It is necessary to realize that, following this notation, the ASLIV model with a single region corresponds to the GIM, since both images are then assumed to have only global illumination differences. Moreover, the ASLIV model with two regions is a special case discussed in [12] that deals with image pairs having only two regions of distinct illumination levels.

VI. PERFORMANCE EVALUATION AND IMPLEMENTATION SETUP

Here, we briefly present the performance evaluation measures used to evaluate the competing models in Section VI-A. In addition, the implementation of the competing models and an analysis of the stopping conditions are presented in Section VI-B.
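The region-wise illumination changes used to build the simulated pairs of Section V can be sketched as follows. The gain/offset form follows the paper's region-based illumination model; the 8-bit clipping and the specific gain/offset values in the usage example are illustrative assumptions:

```python
import numpy as np

def apply_region_illumination(img, masks, gains, offsets):
    """Apply a distinct gain and offset to each region of an image,
    mimicking the arbitrarily shaped shaded regions of the simulated
    pairs. Clipping to the 8-bit range is an assumption."""
    out = np.asarray(img, dtype=float).copy()
    for mask, g, o in zip(masks, gains, offsets):
        out[mask] = g * out[mask] + o
    return np.clip(out, 0.0, 255.0)
```

For example, darkening the left half of an image while brightening the right half yields a two-region ASLIV-style pair from a single source image.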

A. Performance Evaluation

For the simulated data sets with a priori motion information, the average absolute errors (AAEs) between the estimated affine parameters and the corresponding ground-truth values are computed for each data set. These AAEs are determined to show any misestimation in the geometric registration using the competing models as

AAE_p = (1/N) * sum_{n=1}^{N} | a_hat_p^(n) - a_p^(n) |,   (29)

where N is the number of pairs per data set, a_p^(n) denotes the ground-truth value of affine parameter p for pair number n, and a_hat_p^(n) denotes its estimate. Note that, ideally, the AAEs are all zero for perfect geometric registration. Since the simulated data sets are designed to provide ground-truth verification, we can also evaluate the performance of the competing models using full-reference measures, such as the peak signal-to-noise ratio (PSNR) [33, Ch. 3] and the complex wavelet structural similarity (CWSSIM) index [34], where 1 is a perfect match and 0 is a mismatch. While we acknowledge that the PSNR is sensitive to shifts, the sensitivity is largely for shifts above a half-frequency period, and the subpixel shifts that we observe in Section VII are relatively small. The CWSSIM is less sensitive to small shifts [34] and so is suitable for these types of comparisons between registered images with only subpixel misalignments. For the real data set, no ground-truth verification is offered. Thus, we can instead use a closeness measure such as the normalized cross correlation (NCC) [35], by which the correlation between the overlapping areas of the aligned images can be obtained. An NCC value of 1 means perfect correlation, and 0 means no correlation.
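The evaluation measures can be sketched as follows (CWSSIM is omitted; the PSNR sketch assumes an 8-bit peak of 255):

```python
import numpy as np

def aae(estimated, ground_truth):
    """Average absolute error, as in (29), of one affine parameter
    across the N pairs of a data set."""
    estimated = np.asarray(estimated, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    return float(np.mean(np.abs(estimated - ground_truth)))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio between two registered images."""
    mse = np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def ncc(a, b):
    """Normalized cross correlation over the overlapping area:
    1 means perfect correlation, 0 means no correlation."""
    a = np.asarray(a, float).ravel(); b = np.asarray(b, float).ravel()
    a = a - a.mean(); b = b - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Note that the NCC, being mean-subtracted and normalized, is invariant to a global gain and offset, which is why it is a sensible closeness measure for the real pairs with illumination differences.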

TABLE I
AAEs (×10^-2) OF THE ESTIMATED AFFINE PARAMETERS OF THE J = 3 AND J = 4 DATA SETS. THE LOWER THE VALUE OF THE AAES, THE BETTER, WITH THE BEST SHOWN IN BOLD

B. Analysis on Implementation Setup

Our implementation runs on a 2-GHz Pentium IV Core 2 Duo with 2 GB of random access memory. We used a coarse-to-fine scheme [16] based on image pyramids in order to cope with large geometric deformations. Using more resolution levels in the coarse-to-fine scheme tends to improve the accuracy and the resulting precision of the estimates [16]. The biggest disadvantage of correlation-based image registration methods is that they have a limited range of convergence [36]. This subsection analyzes how some critical parameters should be chosen to try to obtain convergence within a reasonable computational time.

The proposed model is set with a stopping criterion of either the cost-function update becoming less than a stopping threshold or a maximum number of iterations being reached at that level of the coarse-to-fine processing. Note that the experiments have been tested with predefined thresholds down to 0.01 and numbers of iterations up to 80, resulting in negligible increases of 0.05%, 0.13%, and 0.16% in NCC, CWSSIM, and PSNR, respectively, with a salient increase in the computational time by a factor of 10.

The initialization of the parameter vector plays a central role in the registration convergence and in minimizing the computational time. With the simulated data sets, the ranges of the unknown geometric and illumination parameters are already known. For each simulated pair in both the "J = 3" and "J = 4" data sets, an experiment of 100 trials was performed with a random initialization using those a priori known ranges. This experiment yields an average success rate (ASR) in terms of convergence of 93% using random initializations, compared with 100% using an identity initialization, which assumes that the two input images are aligned and no illumination variations exist. Moreover, for the real data set, the ASR using random initializations is 8%, compared with 100% using the same identity initialization. These results suggest that a rough preregistration step is important to ensure near-100% convergence. In all subsequent experiments presented, we therefore use an identity initialization as follows: the parameter vector is initialized either by [1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0] when using the ASLIV model with three regions or by [1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0] with four regions. An identity initialization of the parameter vector is equivalent

to assuming that the image pair is already aligned, such as with a rough preregistration. The coarse-to-fine scheme [16], based on image pyramids, is used in order to cope with large geometric deformations and to obtain accurate estimates; the more resolution levels, the more accurate the estimates [16] (recall that the proposed approach is meant to improve the precision of the accuracy obtained using the coarse-to-fine scheme). We create the resolution levels for each real or simulated pair according to the image size. For instance, we used a fixed number of resolution levels for the 500 × 500 simulated image pairs, although moving to more resolution levels tended to degrade the convergence rate of the registration; fewer levels were used for the smaller 120 × 300 real image pair shown in Fig. 1.

There are many ways to choose the threshold for the bisquare function in (15), as previously shown in Section IV. Following [26], [30], and [37], we can set the threshold proportional to the MAR, where the MAR is the median absolute residual of each region at each iteration. For the bisquare function, a suitably chosen threshold allows 5% of the outliers to be eliminated, assuming a Gaussian distribution on the residuals; this threshold is employed in the WB-ASLIV approach. Similarly, the threshold for a Huber M-estimator (used in the HM-ASLIV approach [14]) can be set as a multiple of the standard deviation of the residuals at each iteration so as to eliminate 5% of the outliers, again assuming a Gaussian distribution [37].

While using a segmentation algorithm in the proposed approach, i.e., WB-ASLIV, some deviations can occur at each illumination region boundary. In real imaging scenarios, these deviations cannot be precisely determined. Thus, we need to run an experiment to test the robustness of the WB-ASLIV approach against segmentation errors, as opposed to the competing approaches, using the simulated data sets.
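For reference, the standard Huber and Tukey-bisquare weight functions discussed above (as in robust-regression texts [26], [37]) can be sketched as follows; the exact tuning constants used in the paper are not reproduced here:

```python
import numpy as np

def huber_weight(r, c):
    """Huber M-estimator weight: 1 for |r| <= c, then decaying as c/|r|,
    so large residuals are down-weighted but never fully rejected."""
    r = np.asarray(r, dtype=float)
    return np.where(np.abs(r) <= c, 1.0, c / np.maximum(np.abs(r), 1e-12))

def bisquare_weight(r, c):
    """Tukey bisquare weight: decays smoothly to zero at |r| = c;
    residuals beyond c get exactly zero weight (full rejection)."""
    r = np.asarray(r, dtype=float)
    w = (1.0 - (r / c) ** 2) ** 2
    return np.where(np.abs(r) < c, w, 0.0)

# In the proposed setting, the scale c would be chosen per illumination
# region, e.g., proportional to that region's MAR.
```

The hard rejection of the bisquare function is what makes large interimage illumination residuals act as discarded outliers rather than biasing the geometric estimate, which the least-squares comparators cannot do.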
Some morphological operations (i.e., dilation and erosion) can be applied to the boundary of each region to simulate such segmentation deviations. Each deviation, referred to as a segmentation noise level (SNL), can be simulated as follows. First, a dilation step is applied along the mask boundary, and some of the dilated pixels around the region boundary are randomly kept. Then, in a similar manner, an erosion step is performed on the region boundary, and some of the eroded pixels around the region boundary are randomly


TABLE II
PSNR, CWSSIM, AND EXECUTION TIME (IN SECONDS) FOR A RANDOM SUBSET OF THE J = 3 AND J = 4 DATA SETS. THE HIGHER THE PSNR OR CWSSIM VALUES, THE BETTER, WITH THE BEST SHOWN IN BOLD

removed. These two steps, i.e., dilation and erosion, are repeated for a number of iterations. We define each SNL by the summation of two parameters, which denote the numbers of dilated and eroded pixels, respectively. The results of this experiment are shown in Section VII-C.

VII. EXPERIMENTS AND RESULTS

Here, we report the experimental results obtained using the competing approaches applied to both simulated and real pairs in Sections VII-A and VII-B, respectively. Moreover, we present an experiment showing the resistance of the proposed approach to simulated segmentation deviations in Section VII-C. In our experiments, we compared the proposed approach, i.e., WB-ASLIV, to the following comparators:
1) An approach that employs the GIM with an LSE, as presented in [8] and [9], whose implementation can be found online at [38]; referred to as LS-GIM. Although the single-region LS-ASLIV is equivalent to LS-GIM in [8] and [9], we use the code at [38] for this comparison.
2) A second approach that employs the AIM with an LSE, as presented in [10]; referred to as LS-AIM. The code for LS-AIM was implemented by ourselves.
3) A third approach that exploits the ASLIV registration model, as presented in [13], with an LSE; referred to as LS-ASLIV. This approach corrects for interimage illumination variations during the registration.
4) A fourth approach that uses the ASLIV model with the ordinary formula of the Huber M-estimator, as introduced in [14]; referred to as HM-ASLIV. This approach corrects for interimage illumination variations and uses an M-estimator during the registration.
5) A fifth approach that uses the ASLIV model with the ordinary formula of the bisquare M-estimator, as introduced in [27]; referred to as BM-ASLIV. This approach corrects

for interimage illumination variations and uses an M-estimator during the registration.
6) A sixth approach that is identical to the WB-ASLIV approach, except that all confidence weights in (11) are set to 1 (i.e., no confidence weighting); referred to as W1-ASLIV. This approach corrects for interimage illumination variations and uses an M-estimator with multiple thresholds for different regions during the registration.

A. Experiments on Simulated Data Sets

The first set of experiments is performed on the "J = 3" and "J = 4" simulated data sets using the LS-GIM, LS-AIM, LS-ASLIV, HM-ASLIV, BM-ASLIV, W1-ASLIV, and WB-ASLIV approaches. Table I shows that the proposed approach, i.e., WB-ASLIV, provides more precise motion estimates compared with the LS-GIM, LS-AIM, LS-ASLIV, HM-ASLIV, BM-ASLIV, and W1-ASLIV approaches by averages of 90.4%, 86.5%, 33.7%, 18.4%, 12.6%, and 5.9%, respectively, using the "J = 3" data set and by averages of 86.9%, 81.5%, 29.2%, 15.8%, 10.9%, and 5.6%, respectively, using the "J = 4" data set. An important point to make is that, in all individual runs in our experiments, the rank order of the absolute error (AE) for each component in Table I was always (from best to worst) WB-ASLIV, W1-ASLIV, BM-ASLIV, HM-ASLIV, LS-ASLIV, LS-AIM, and LS-GIM.

To show the efficiency of the WB-ASLIV approach, consider Table II, which shows results for a random subset of individual runs. For the "J = 3" data set, Table II shows that the proposed approach, i.e., WB-ASLIV, outperforms the LS-GIM, LS-AIM, LS-ASLIV, HM-ASLIV, BM-ASLIV, and W1-ASLIV approaches by averages of 27.5%, 23.4%, 5.3%, 3.2%, 2.6%, and 1.2%, respectively, in terms of PSNR and by averages of 14.8%, 11.2%, 2.6%, 1.2%, 1.1%, and 0.7%, respectively, in terms of CWSSIM. The improvement of WB-ASLIV comes with a reasonable increase in computational time by averages of 33.1%, 29.2%, 11.6%, 4.7%, 3.2%, and 0.5%, respectively, as compared with the competing approaches. Again, in all individual runs in our experiments, the rank order for each component in Table II was always (from best to worst) WB-ASLIV, W1-ASLIV, BM-ASLIV, HM-ASLIV, LS-ASLIV, LS-AIM, and LS-GIM. Recall that these image quality measures express the similarity between the overlapping regions of the registered pairs.

Moreover, for the "J = 4" data set, Table II shows that the proposed approach, i.e., WB-ASLIV, outperforms the LS-GIM, LS-AIM, LS-ASLIV, HM-ASLIV, BM-ASLIV, and W1-ASLIV approaches by averages of 27.1%, 23.3%, 5.1%, 2.9%, 2.1%, and 1.0%, respectively, in terms of PSNR and by averages of 14.5%, 11.5%, 2.4%, 1.1%, 0.9%, and 0.6%, respectively, in terms of CWSSIM. A reasonable increase in computational time can be noticed with the WB-ASLIV approach, by averages of 33.4%, 30.5%, 12.0%, 5.1%, 3.4%, and 0.9%, respectively, as compared with the competing approaches.

Fig. 5(a)–(g) shows the normalized AIDs of the aligned image pairs of the "J = 3" simulated pair shown in Fig. 3(a) and (b), i.e., pair 34, using the competing approaches; the PSNR of each result is listed in the caption of Fig. 5, increasing from (a) to (g).

Fig. 5. Normalized AIDs of the resulting aligned pairs for the "J = 3" simulated pair in Fig. 3(a) and (b); pair 34. The darker the AID, the better the approach performance. (a) LS-GIM; PSNR = 21.91 dB. (b) LS-AIM; PSNR = 23.35 dB. (c) LS-ASLIV; PSNR = 27.01 dB. (d) HM-ASLIV; PSNR = 27.76 dB. (e) BM-ASLIV; PSNR = 28.12 dB. (f) W1-ASLIV; PSNR = 28.45 dB. (g) WB-ASLIV; PSNR = 29.57 dB.

TABLE III
NCC VALUES FOR THE REAL DATA SET, k = J = 3. NOTE THAT PAIRS A AND B REFER TO FIGS. 1(A) AND (B) AND 4(A) AND (B), RESPECTIVELY. RECALL THAT THE HIGHER THE NCC VALUE, THE BETTER THE PERFORMANCE OF THE APPROACH

Fig. 6. Gamma-corrected AIDs for the real pair in Fig. 4(a) and (b); gamma = 1.7. The darker the AID, the better the approach performance. (a) LS-GIM; NCC = 0.9642. (b) LS-AIM; NCC = 0.9691. (c) LS-ASLIV; NCC = 0.9789. (d) HM-ASLIV; NCC = 0.9802. (e) BM-ASLIV; NCC = 0.9809. (f) W1-ASLIV; NCC = 0.9818. (g) WB-ASLIV; NCC = 0.9857.

It is worth noting that many areas in Fig. 5(g) are darker than their corresponding areas in Fig. 5(a)–(f), particularly inside each illumination region and around its boundary layer.

B. Experiments on Real Data Set

Our next set of experiments was performed on the real data set, as described in Section V, using the LS-GIM, LS-AIM, LS-ASLIV, HM-ASLIV, BM-ASLIV, W1-ASLIV, and WB-ASLIV approaches, with the k-means algorithm for segmentation, setting k = J = 3. Table III shows NCC values that express the correlation between the overlapping areas of the two aligned images using the competing approaches. The WB-ASLIV approach surpasses the competing approaches.

Fig. 6(a)–(g) shows gamma-corrected AIDs of the registered pairs of the real pair in Fig. 4(a) and (b), i.e., pair B, using the LS-GIM, LS-AIM, LS-ASLIV, HM-ASLIV, BM-ASLIV, W1-ASLIV, and WB-ASLIV approaches (gamma = 1.7 for better visualization); the corresponding NCC values are listed in the caption of Fig. 6. One can notice that Fig. 6(g) has many areas that are darker than the corresponding areas in Fig. 6(a)–(f). Enlarged views of the white boxes in Fig. 6(a)–(g) are shown in Fig. 7(a)–(g), respectively, for further comparison. Since the AIDs may not display well in all reproductions, we also present the histograms for each AID from Fig. 6 in Fig. 8. The points in Fig. 8 represent bins of width 5 for the AID.

Fig. 7. Enlarged views of the white boxes in Fig. 6(a)–(g), respectively; gamma = 1.7 and J = 3. The darker the AID, the better the approach performance. (a) LS-GIM. (b) LS-AIM. (c) LS-ASLIV. (d) HM-ASLIV. (e) BM-ASLIV. (f) W1-ASLIV. (g) WB-ASLIV.

Fig. 8. Histogram distributions for the AIDs from Fig. 6.
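The gamma correction used to make the AIDs visible in print (gamma = 1.7 in Figs. 6 and 7) can be sketched as follows; that the correction brightens (i.e., raises to 1/gamma) and that the AIDs are 8-bit are assumptions:

```python
import numpy as np

def gamma_correct(aid, gamma=1.7, peak=255.0):
    """Brighten an 8-bit AID for display: normalize to [0, 1], raise to
    1/gamma, and rescale. Small (dark) differences become easier to see."""
    a = np.clip(np.asarray(aid, dtype=float), 0.0, peak) / peak
    return peak * a ** (1.0 / gamma)
```

The transform is monotonic, so the "darker is better" comparison between the subfigures is preserved.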

Although the histograms do not show the spatial distribution of the AIDs, they do show how the distribution of the AIDs changes across the compared approaches. Of particular note is that the higher AIDs are pushed to lower values as we progress through the LS-GIM, LS-AIM, LS-ASLIV, HM-ASLIV, BM-ASLIV, and W1-ASLIV techniques up to the final proposed technique, i.e., WB-ASLIV. This real example gives an impression of how well the WB-ASLIV approach yields joint improvements in GRP and illumination compensation, as well as how the AID is improved.

C. Experiments on Simulated Segmentation Deviations

Here, we demonstrate the robustness of the WB-ASLIV approach against possible segmentation deviations compared with the competing approaches. Changing the segmentation deviations in real imaging is unavailable, as there is no a priori information on the exact illumination region segmentation. Therefore, some morphological operations (dilation and erosion) are applied to the boundary of the a priori illumination regions in the simulated data sets to simulate the segmentation deviations (see Section VI-B). This experiment is performed on one of the simulated data sets at multiple SNLs using the LS-ASLIV, HM-ASLIV, BM-ASLIV, W1-ASLIV, and WB-ASLIV approaches.

Fig. 9. AE of the affine parameter a using the LS-ASLIV, HM-ASLIV, BM-ASLIV, W1-ASLIV, and WB-ASLIV approaches, respectively, at different SNLs. The LS-GIM and LS-AIM approaches are not included for better visualization. The lower the value of the AEs, the better the approach performance.
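The SNL perturbation described above (iterated dilation and erosion with random retention of boundary pixels) can be sketched as follows; the retention probability is an illustrative assumption:

```python
import numpy as np

def _dilate4(m):
    """4-neighbour binary dilation (zero padding at the border)."""
    p = np.zeros((m.shape[0] + 2, m.shape[1] + 2), dtype=bool)
    p[1:-1, 1:-1] = m
    return (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
            | p[1:-1, :-2] | p[1:-1, 2:])

def _erode4(m):
    """4-neighbour binary erosion (zero padding at the border)."""
    p = np.zeros((m.shape[0] + 2, m.shape[1] + 2), dtype=bool)
    p[1:-1, 1:-1] = m
    return (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
            & p[1:-1, :-2] & p[1:-1, 2:])

def add_segmentation_noise(mask, iterations=1, keep_prob=0.5, rng=None):
    """Simulate segmentation deviations (an SNL): randomly keep some of
    the pixels added by dilation and randomly remove some boundary pixels
    found by erosion, repeated for `iterations` rounds."""
    rng = np.random.default_rng(rng)
    noisy = mask.astype(bool).copy()
    for _ in range(iterations):
        grown = _dilate4(noisy) & ~noisy        # candidate dilated pixels
        noisy |= grown & (rng.random(mask.shape) < keep_prob)
        shrunk = noisy & ~_erode4(noisy)        # candidate eroded pixels
        noisy &= ~(shrunk & (rng.random(mask.shape) < keep_prob))
    return noisy
```

The SNL then corresponds to the total count of dilated plus eroded pixels actually flipped, so increasing `iterations` (or `keep_prob`) raises the noise level.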


Fig. 9 shows the AEs of the estimated affine parameters, compared with their ground-truth values, for the six affine parameters versus multiple SNLs. Note that each SNL corresponds to a certain summation of eroded and dilated pixels. The LS-GIM and LS-AIM approaches were considered in the experiment; however, they are not shown in the corresponding figure for better visualization. In Fig. 9, one can notice that the AE using the WB-ASLIV approach is significantly lower than those of the other approaches as the SNL increases. One can also observe that the slope of the AE using the WB-ASLIV approach is smaller than those of the other approaches as the SNL increases. This observation means that the proposed approach, i.e., WB-ASLIV, provides more resistance to segmentation deviations than the competing approaches, by average improvements of 90.8%, 83.4%, 36.5%, 21.0%, 16.4%, and 7.6%, respectively.

VIII. CONCLUSION

In this paper, we have presented an image registration approach to cope with images having arbitrarily shaped distinct illumination variations. In general, the proposed approach can exploit different M-estimators; in theory, each M-estimator employs a distinct cost function with certain properties. It is preferable for a candidate cost function to have a threshold or tuning parameter by which the residual's penalty can be controlled and an illumination region's residuals can be separated into small and large ones. The proposed approach is cast in an iterative coarse-to-fine scheme and imposes a weighting function to decrease the impact of possibly mis-segmented pixels located near the boundaries of the illumination regions. With no need for any feature-based approach for initialization, the proposed approach converges well when a rough registration is available.
The GRP has been significantly improved, jointly with illumination compensation, using the proposed approach, as opposed to the competing ones, on both real and simulated data sets, with only a reasonable increase in the computational time. Despite the robustness of the proposed approach against arbitrarily shaped interimage illumination variations and segmentation deviations, the approach is limited by certain constraints related to the camera model followed, the scene, the objects in the scene, the captured images, etc. The proposed approach has also been formulated such that many functions could be selected for many of its subparts; however, to implement the resulting mathematical equations, some functions and models have been specified, such as the affine motion model and the bisquare function.

REFERENCES

[1] B. Zitová and J. Flusser, "Image registration methods: A survey," Image Vis. Comput., vol. 21, no. 11, pp. 977–1000, Oct. 2003.
[2] L. Lucchese, S. Leorin, and G. M. Cortelazzo, "Estimation of two-dimensional affine transformations through polar curve matching and its application to image mosaicking and remote-sensing data registration," IEEE Trans. Image Process., vol. 15, no. 10, pp. 3008–3019, Oct. 2006.
[3] M. Gevrekci and B. K. Gunturk, "Superresolution under photometric diversity of images," Adv. Signal Process., vol. 2007, no. 1, p. 205, Jan. 2007.
[4] Y. Wang and L. Xu, "A global optimized registration algorithm for image stitching," in Proc. IEEE CISP, May 2008, vol. 3, pp. 525–529.


[5] D. P. Roy, "The impact of misregistration upon composited wide field of view satellite data and implications for change detection," IEEE Trans. Geosci. Remote Sens., vol. 38, no. 4, pp. 2017–2032, Jul. 2000.
[6] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vis., vol. 60, no. 2, pp. 90–110, Nov. 2004.
[7] L. Lou, F. Zhang, C. Xu, F. Li, and M. Xue, "Automatic registration of aerial image series using geometric invariance," in Proc. IEEE ICAL, Sep. 2008, pp. 1198–1203.
[8] S. Periaswamy and H. Farid, "Elastic registration in the presence of intensity variations," IEEE Trans. Med. Imag., vol. 22, no. 7, pp. 865–874, Jul. 2003.
[9] P. Aguiar, "Unsupervised simultaneous registration and exposure correction," in Proc. IEEE ICIP, Jun. 2006, pp. 361–364.
[10] Y. Altunbasak, R. M. Mersereau, and A. J. Patti, "A fast parametric motion estimation algorithm with illumination and lens distortion correction," IEEE Trans. Image Process., vol. 12, no. 4, pp. 395–408, Apr. 2003.
[11] A. Bartoli, "Groupwise geometric and photometric direct image registration," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 12, pp. 2098–2108, Dec. 2008.
[12] M. M. Fouad, R. M. Dansereau, and A. D. Whitehead, "Geometric registration of images with arbitrarily-shaped local intensity variations from shadows," in Proc. IEEE ICIP, Nov. 2009, pp. 201–204.
[13] M. M. Fouad, R. M. Dansereau, and A. D. Whitehead, "Geometric image registration under arbitrarily-shaped locally variant illuminations," Signal Image Video Process., Aug. 2010, pp. 10–18.
[14] M. M. Fouad, R. M. Dansereau, and A. D. Whitehead, "Geometric image registration under locally variant illuminations using Huber M-estimator," in Proc. ICISP, A. Elmoataz, O. Lezoray, F. Nouboud, D. Mammass, and J. Meunier, Eds., vol. 6134, LNCS. Heidelberg, Germany: Springer-Verlag, Jun. 2010, pp. 10–18.
[15] K. Arya, P. Gupta, P. Kalra, and P. Mitra, "Image registration using robust M-estimators," Pattern Recognit. Lett., vol. 28, no. 15, pp. 1957–1968, Nov. 2007.
[16] J. Jiang, S. Zheng, A. W. Toga, and Z. Tu, "Learning based coarse-to-fine image registration," in Proc. IEEE CVPR, Jun. 2008, pp. 1–7.
[17] T. Kanungo, D. M. Mount, N. Netanyahu, C. Piatko, and R. Silverman, "An efficient k-means clustering algorithm: Analysis and implementation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 881–892, Jul. 2002.
[18] P. J. Huber, Robust Statistics, 1st ed. New York: Wiley, 1981.
[19] S. Z. Li, Markov Random Field Modeling in Computer Vision, 2nd ed. New York: Springer-Verlag, 2000.
[20] A. Tikhonov, "Regularization of incorrectly posed problems," Soviet Math. Dokl., vol. 4, no. 6, pp. 1624–1627, 1963.
[21] S. Geman and D. E. McClure, "Bayesian image analysis: An application to single photon emission tomography," in Statistical Computation Section. Alexandria, VA: Amer. Stat. Assoc., 1985, pp. 12–18.
[22] P. J. Green, "Bayesian reconstructions from emission tomography data using a modified EM algorithm," IEEE Trans. Med. Imag., vol. 9, no. 1, pp. 84–93, Mar. 1990.
[23] T. Hebert and R. Leahy, "A generalized EM algorithm for 3-D Bayesian reconstruction from Poisson data using Gibbs priors," IEEE Trans. Med. Imag., vol. 8, no. 2, pp. 194–202, Jun. 1989.
[24] B. Rouchouze, P. Mathieu, T. Gaidon, and M. Barlaud, "Motion estimation based on Markov random fields," in Proc. IEEE ICIP, Nov. 1994, vol. 3, pp. 270–274.
[25] P. Charbonnier, L. Blanc-Féraud, G. Aubert, and M. Barlaud, "Deterministic edge-preserving regularization in computed imaging," IEEE Trans. Image Process., vol. 6, no. 2, pp. 289–311, Feb. 1997.
[26] J. Fox, "Robust regression," Sage, Thousand Oaks, CA, Tech. Rep., Jan. 2002.
[27] M. M. Fouad, R. M. Dansereau, and A. D. Whitehead, "Image registration under local illumination variations using robust bisquare M-estimation," in Proc. IEEE ICIP, Sep. 2010, pp. 917–920.
[28] O. L. Mangasarian and D. R. Musicant, "Robust linear and support vector regression," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 9, pp. 950–955, Sep. 2000.
[29] G. Fasano, F. Lampariello, and M. Sciandrone, "A truncated nonmonotone Gauss-Newton method for large-scale nonlinear least-squares problems," Comput. Optim. Appl., vol. 34, no. 3, pp. 343–358, Jul. 2006.
[30] J. Nocedal and S. Wright, Numerical Optimization, 3rd ed. New York: Springer-Verlag, 1999.
[31] [Online]. Available: http://vision.ece.ucsb.edu/registration/satellite/


[32] M. Guizar-Sicairos, S. T. Thurman, and J. R. Fienup, "Efficient subpixel image registration algorithms," Opt. Lett., vol. 33, no. 2, pp. 156–158, Jan. 2008.
[33] A. Bovik, Image and Video Processing, 2nd ed. New York: Academic, 2000.
[34] Z. Wang and E. P. Simoncelli, "Translation insensitive image similarity in complex wavelet domain," in Proc. IEEE ICASSP, Mar. 2005, vol. II, pp. 573–576.
[35] D. Barnea and H. Silverman, "A class of algorithms for fast digital image registration," IEEE Trans. Comput., vol. C-21, no. 2, pp. 179–186, Feb. 1972.
[36] R. Szeliski, "Image alignment and stitching: A tutorial," Found. Trends Comput. Graph. Vis., vol. 2, no. 1, pp. 1–104, Jan. 2006.
[37] P. J. Huber, Robust Statistical Procedures, 2nd ed. Philadelphia, PA: SIAM, 1996.
[38] [Online]. Available: http://users.isr.ist.utl.pt/~aguiar/mosaics

Mohamed M. Fouad (S’09–M’11) received the B.Eng. (honors, with great distinction) and M.Eng. degrees from the Military Technical College (MTC), Cairo, Egypt, in 1996 and 2001, respectively, and the Ph.D. degree from Carleton University, Ottawa, ON, Canada, in 2010, all in electrical engineering. He is currently a Staff Member with the Department of Computer Engineering, MTC. His research interests are in online handwritten recognition, image registration, image reconstruction, super-resolution, and video coding.

Richard M. Dansereau (S’96–M’01–SM’10) received the B.Sc. and Ph.D. degrees in computer engineering from the University of Manitoba, Winnipeg, MB, Canada, in 1995 and 2001, respectively. From 1996 to 2000, he was a Researcher with the Telecommunications Research Laboratories, dealing with wavelet and fractal image analysis. From 1999 to 2000, he was also with SpectraWorks, Inc., developing Motion Pictures Expert Group DB satellite communication systems. From 2000 to 2001, he was an Instructor and a Research Engineer with the Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, where he was also with the Center for Signal and Image Processing. Since September 2001, he has been with the Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada, where he is currently an Associate Professor.

Anthony Whitehead (SM’98–M’03) received the B.Sc. and Ph.D. degrees in computer science from Carleton University, Ottawa, ON, Canada, in 1996 and 2004, respectively. He has worked in the industry for several years and with the National Research Council of Canada prior to working toward his academic career. Since 2005, he has been with the School of Information Technology, Carleton University, where he is currently the Director. His current research interests are in the areas of computer vision, video analysis, and pattern matching.