
Gradient Domain Layer Separation under Independent Motion

Yunqiang Chen, Ti-Chiun Chang, Chunxiao Zhou and Tong Fang
Siemens Corporate Research, 755 College Rd. East, Princeton, NJ 08540
{Yunqiang.Chen, Ti-Chiun.Chang, Chunxiao.Zhou.ext, Tong.Fang} AT siemens.com

Abstract

Multi-exposure X-ray imaging can see through objects and separate different materials into transparent layers. However, layer motion makes the separation task under-determined. Instead of aligning the non-rigid motion, we address the layer separation problem in the gradient domain and propose an energy optimization framework that regularizes it by explicitly enforcing an independence constraint. We show that the gradient domain allows more accurate and robust independence analysis between non-stationary signals using mutual information (MI) and hence achieves better separation. Furthermore, the gradient fields contain sufficient information for full reconstruction of the separated layers by solving the Poisson Equation. For efficient regularization of the gradient separation, energy terms based on the Taylor expansion of MI are further derived. Evaluation on both synthesized and real datasets proves the effectiveness of our algorithm and its robustness to complex tissue motion.

1. Introduction

Images consisting of transparent layers are observed in many applications, such as emerging X-ray imaging devices, security surveillance, or industrial inspection, and pose great challenges for image analysis. Some recent research addresses transparency in stereo estimation [16], image registration [20] or layer separation [13, 14, 15, 4]. In this paper, we mainly address layer separation in a two-exposure X-ray imaging scenario where layers move between exposures, resulting in an under-determined problem. The proposed algorithm might also be extended to other source separation applications where there are fewer observations than sources.

In X-ray imaging, high energy photons can pass through objects before being absorbed. This allows detailed imaging of internal structures/organs for inspection or medical diagnosis. Unlike most photographic images, the internal structures of a 3D object are projected and overlaid onto a 2D image in which different structures appear as superimposed semi-transparent layers. Since X-rays at different

spectra are absorbed at different ratios by different materials, it is possible to separate the superimposed transparent layers if we have dual-energy exposures of the object, thereby allowing more accurate inspection and visualization. For imaging human bodies, there are mainly two types of tissue (i.e., layers): bone and soft tissue [19]. The logarithmically corrected X-ray images can be treated as weighted additive mixtures of the different layers. Assuming there is no motion between the two exposures, the dual-energy images I1 and I2 can be formulated as follows:

I1 = aB + bS
I2 = cB + dS   (1)

where constants a, b, c, and d are the known absorption coefficients of the bone B and soft tissue S at the different X-ray spectra. The B and S layers can then be simply reconstructed by weighted subtraction between I1 and I2. In practice, however, motion is usually inevitable between the two exposures. To make things worse, different layers usually move differently. In chest imaging, for example, rib (i.e., bone) motion is caused by breathing while the heart (soft tissue) is constantly beating. Bone motion might be stopped by holding the breath or compensated by registration algorithms. However, soft tissue has different motion and cannot be easily compensated. In this paper, we assume that the bone layer is already aligned in a preprocessing step but the soft tissue has residual motion. This simplified model still captures most of the challenges caused by moving layers but allows us to focus on the layer separation task. It leads to the following more realistic image formation model that accounts for independent soft-tissue motion:

I1 = aB + bS
I2 = cB + dST   (2)

where ST is the deformed soft tissue layer. Since ST ≠ S, the layer separation task becomes under-determined with three unknown layers but only two observations. To the best of our knowledge, state-of-the-art dual-energy applications and digital subtraction imaging rely on weighted subtraction between bone-aligned images, but they cannot handle independent layer motion ([7, 21]). The

2009 IEEE 12th International Conference on Computer Vision (ICCV) 978-1-4244-4419-9/09/$25.00 ©2009 IEEE

soft-tissue is estimated as Ŝ = cI1 − aI2 = cbS − adST, which results in a superposition of the deformed soft-tissue layer ST over the true solution S, as we will see in the experiments. To compensate for motion-induced errors, a layer separation method assuming known deformation between S and ST is proposed in [15], and the deformation of each layer might be estimated as described in [1]. In [20], an iterative optimization approach is proposed to address layer separation and registration jointly. However, 2D image registration in general is not suitable for this task because soft tissue moves in 3D space and cannot necessarily be registered by a 2D deformation field. In [14], Sarel and Irani consider the problem of separating transparent layers in an image sequence where one layer has repetitive dynamic behavior. Relying on the intrinsic image estimation proposed in [18], the algorithm requires a large number of image frames to perform median filtering and hence cannot be readily extended to the dual-energy imaging scenario where only two images are acquired. A tissue classification and separation method is proposed in [4], which utilizes eigen-analysis of the structure tensor under a sparseness assumption.

In this paper, we take a very different approach. Without assuming any pre-trained prior layer models or relying on soft-tissue registration, we employ the independence constraint between the layers to design a more general layer separation algorithm. The independence constraint is widely assumed in most layer separation algorithms, but it is usually not fully exploited. Compared with related work in the literature, we introduce the following novelties:

• an optimization framework for explicitly enforcing the independence constraint to regularize the under-determined problem in Eq. (2),
• an equivalent task in the gradient domain for layer separation, where we can achieve more accurate independence analysis between non-stationary layers based on mutual information and then fully reconstruct each layer after the gradient fields are separated,
• an efficient energy minimization scheme for the computationally expensive mutual information terms based on Taylor expansion, and
• an algorithm that assumes no domain-specific knowledge or prior models and requires no motion estimation, thus demonstrating robustness to complicated non-rigid deformation or even motion in 3D space.

The rest of this paper is organized as follows. Section 2 describes the derivation and implementation of the proposed separation algorithm. Section 3 provides comparison and evaluation of the new algorithm on both synthetic data and real images. Concluding remarks are given in Section 4.
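Before deriving the gradient-domain method, it helps to see the motion-free baseline concretely: Eq. (1) is an invertible 2×2 linear system per pixel, so weighted subtraction recovers both layers exactly. A minimal numpy sketch (the coefficients and layer images are synthetic stand-ins, not calibrated X-ray values):

```python
import numpy as np

# Hypothetical absorption coefficients; real dual-energy systems
# calibrate a, b, c, d for the two X-ray spectra.
a, b, c, d = 0.7, 0.3, 0.4, 0.6

rng = np.random.default_rng(0)
B = rng.random((64, 64))   # stand-in bone layer
S = rng.random((64, 64))   # stand-in soft-tissue layer

# Dual-energy observations with no motion (Eq. 1).
I1 = a * B + b * S
I2 = c * B + d * S

# Invert the 2x2 mixing system by weighted subtraction.
det = a * d - b * c
B_hat = (d * I1 - b * I2) / det
S_hat = (a * I2 - c * I1) / det

assert np.allclose(B_hat, B) and np.allclose(S_hat, S)
```

With independent motion (Eq. (2)), the same subtraction instead superimposes the deformed soft tissue onto Ŝ, which is exactly the failure mode the paper addresses.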

2. Layer Separation in Gradient Domain

To better understand our optimization framework, it is helpful to first elaborate, in Section 2.1, on the context in which the independence constraints regularize the solution. We then justify in Section 2.2 that using mutual information in the gradient domain leads to a more accurate measure of independence. In Section 2.3, the final cost function is derived based on these two ingredients. Section 2.4 describes an efficient method to solve the cost function. Finally, each layer is reconstructed from the separated gradient fields by solving the Poisson Equation in Section 2.5.

2.1. Independence Constraint

To regularize under-determined problems, Maximum a Posteriori (MAP) estimation based on generative models has achieved great success in many computer vision tasks [8]. In layer separation, according to Eq. (2), layer S and layer ST can be directly computed if we know layer B. A direct application of traditional MAP estimation assuming independent layers would look like the following:

B̂ = arg max_B P(B) P(I1, I2 | B)
  = arg max_B PB(B) PS((I1 − aB)/b) PST((I2 − cB)/d)   (3)

where PB(), PS() and PST() are the prior probability density functions of the different layers. In our case, however, this generative-model-based formulation cannot help much because different layers can have very similar statistical properties. Furthermore, we do not want to restrict our algorithm to a specific domain with pre-trained layer appearance models. To design a more general layer separation algorithm, we study the usually assumed independence constraint in more detail.

The independence constraint is the basis for factoring the joint distribution in almost all MAP frameworks. However, it is then usually ignored and rarely explicitly enforced or validated, so the final result is not guaranteed to satisfy it. With the careful analysis below, we can see that the independence constraint can be very powerful when no accurate models are available or different layers cannot be readily discriminated due to similar statistical properties.

Assume we have an inaccurate estimation of the bone layer, say B̂ = B + EB (where EB is the estimation error). Plugging B̂ into Eq. (2), we can estimate the soft tissue layers, i.e., Ŝ = (I1 − aB̂)/b and ŜT = (I2 − cB̂)/d. The estimation errors of Ŝ and ŜT are as follows:

ES = Ŝ − S = −aEB/b
EST = ŜT − ST = −cEB/d   (4)

We can see that the estimation error of the bone layer EB is directly reflected in the estimation errors of the soft tissue layers, which results in increased dependence between the estimated layers. Hence, explicitly enforcing independence between B̂ and Ŝ, and between B̂ and ŜT, should reduce the estimation error and effectively regularize the layer separation task.

It is worth noting that each layer usually contains non-stationary signals, and measuring independence accurately and robustly for them is not trivial. Mutual information has been successfully used to measure dependence/similarity in image registration [11, 9], stereo correspondence [6] and image restoration [3]. But as pointed out in [13], it might not be optimal when dealing with non-stationary signals. Sarel and Irani propose a new metric based on generalized normalized gray-scale correlation; however, its behavior is not as extensively studied as mutual information. Instead of designing new metrics for independence, we propose to convert the layer separation task into a different feature space rather than working on traditional grayscale values. The new feature space should:

• allow more accurate mutual information analysis;
• provide a complete representation so that the separated layers can be reconstructed from this feature space.

Gradient feature space satisfies these requirements, as we elaborate in the next sub-sections.
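The error propagation in Eq. (4) is easy to verify numerically. In the synthetic sketch below (all coefficients and layer values are made up for illustration), a shared bone-estimation error EB leaks into both soft-tissue estimates and makes the recovered layers measurably more correlated than the truly independent ground truth:

```python
import numpy as np

# Synthetic stand-ins, not real X-ray data.
a, b, c, d = 0.7, 0.3, 0.4, 0.6
rng = np.random.default_rng(1)
n = 10000
B, S, S_T = rng.random(n), rng.random(n), rng.random(n)

I1 = a * B + b * S
I2 = c * B + d * S_T

E_B = 0.3 * rng.random(n)          # error in the bone estimate
B_hat = B + E_B
S_hat = (I1 - a * B_hat) / b       # inherits error -a*E_B/b
S_T_hat = (I2 - c * B_hat) / d     # inherits error -c*E_B/d

# Both residuals are proportional to the same E_B, as in Eq. (4) ...
assert np.allclose(S_hat - S, -a * E_B / b)
assert np.allclose(S_T_hat - S_T, -c * E_B / d)

# ... so the estimated layers become more dependent than the
# independent ground-truth layers.
corr_true = abs(np.corrcoef(B, S)[0, 1])
corr_est = abs(np.corrcoef(B_hat, S_hat)[0, 1])
```

Here `corr_est` clearly exceeds `corr_true`, which is the extra dependence the proposed independence constraint penalizes.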

2.2. Mutual Information in Gradient Domain

The gradient is an important feature in image processing, emphasizing changes in the image. Most importantly, the gradient field contains sufficient information to reconstruct the original image by solving the Poisson Equation with a given boundary condition. Some recent image processing techniques (e.g., [5, 10, 17]) rely on this property to convert image editing and processing into the gradient domain and fully reconstruct the desired image from the processed gradient field.

The layer separation task can be converted into the gradient domain by applying the gradient operator to both sides of Eq. (2). Due to the linearity of the gradient operator, the equations still hold. Instead of working on grayscale values, we now need to separate the gradients into separate layers. Our previous reasoning about regularizing the under-determined problem by independence analysis is still valid in the gradient domain. After the gradients are separated into different layers, we can reconstruct each layer from the separated gradient fields by solving the Poisson Equation.

Different from the original task, where independence analysis is done on image grayscale values, the gradient separation task requires independence analysis on the gradient features. One way to look at independence in the gradient domain is that the tissue density change (i.e., gradient) in one layer should be irrelevant to the density change in the other layer. The subsequent analysis and comparison demonstrate that mutual information based on gradient features provides a much more accurate and robust measure

of independence than one based on grayscale values for spatially varying signals.

Mutual information (MI) is defined based on the probability distributions of two images X and Y as follows:

MI(X, Y) = Σ_x Σ_y Pxy(x, y) log [ Pxy(x, y) / (Px(x) Py(y)) ]   (5)

where Px(x) and Py(y) are, respectively, the probability density functions of X and Y. The joint density of X and Y is denoted by Pxy(x, y). Mutual information can be sensitive to non-stationarity and be biased by partially overlapped structures if the probability distribution is based on image grayscale values. When based on the distribution of the gradient features, on the other hand, it can measure even a slight increase of dependence caused by the estimation error in Eq. (4) much more accurately and robustly.

To compare the accuracy and robustness of mutual information based on grayscale values and gradient features, we use the images in our experiments to obtain quantitative measures. To simulate the error terms in Eq. (4), the first image is created by subtracting an error image from the bone image (Fig. 4(a)). The second image is obtained by adding the same error image to the soft tissue image (Fig. 2(b)). The error image is selected to be the typical error introduced by simple weighted subtraction (i.e., S − ST) and is multiplied by a coefficient α. If α = 0, there is no error and we should have the minimum mutual information. As the magnitude of α increases, the mutual information should increase due to the increasingly significant error term. The two mutual information terms are tested for a large range of α values from −0.5 to 0.5, as shown in Fig. 1. For the purpose of comparison, the maximum value is normalized to 1 for both curves. In the entire tested range, the gradient domain mutual information (blue solid line in Fig. 1) shows a clearly defined minimum at α = 0 and consistently increases with

Figure 1. Comparison between mutual information based on grayscale values and gradients: the red dashed line indicates grayscale mutual information; the blue solid line indicates gradient mutual information. The gradient MI shows a clearly defined minimum at α = 0, while the grayscale MI shows much more inconsistent behavior.


the absolute value of α. The mutual information based on grayscale values (red dashed line in Fig. 1) shows a similar trend at a large scale. However, close to the optimal solution (i.e., α ≈ 0), it fails to provide a correct and well-defined minimum. From this comparison, we can see that mutual information in the gradient domain is preferable as an optimization criterion to regularize our layer separation task. Next, we describe our new optimization framework for layer separation.
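Eq. (5) can be estimated with a joint histogram, and evaluating it on gradient magnitudes instead of grayscale values is then a one-line change. A hedged numpy sketch (the helper names and bin count are our choices, not the paper's implementation):

```python
import numpy as np

def mutual_information(x, y, bins=32):
    """Histogram estimate of Eq. (5) for two equally shaped arrays."""
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                      # skip empty bins (0 log 0 := 0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def gradient_mi(img1, img2, bins=32):
    """The same estimate on gradient magnitudes instead of grayscales."""
    gy1, gx1 = np.gradient(img1)
    gy2, gx2 = np.gradient(img2)
    return mutual_information(np.hypot(gx1, gy1), np.hypot(gx2, gy2), bins)

rng = np.random.default_rng(0)
x = rng.random((64, 64))
y = rng.random((64, 64))
mi_same = mutual_information(x, x)    # high: x fully determines itself
mi_diff = mutual_information(x, y)    # near zero for independent images
g_mi = gradient_mi(x, y)
```

The plug-in estimate is a KL divergence, so it is always non-negative; on independent images it approaches zero (up to histogram bias), and on identical images it equals the entropy.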

2.3. Energy Optimization for Gradient Separation

To work in the gradient domain, the input images I1 and I2 are first processed by a gradient operator ∇. Based on the linearity of the gradient operator, Eq. (2) can be reformulated as follows:

∇I1 = a∇B + b∇S
∇I2 = c∇B + d∇ST   (6)

Instead of directly estimating each layer's grayscale values, we now need to estimate the gradient fields of the three layers, i.e., ĜB ≈ ∇B, ĜS ≈ ∇S, and ĜST ≈ ∇ST. The analysis in subsection 2.1 still holds in the gradient domain, and independence analysis between the estimated gradient fields (ĜB, ĜS and ĜST) should help detect and even reduce the estimation errors.

As mentioned before, different layers can have very similar statistical properties and hence are difficult to discriminate by generative models. Instead of relying on any pre-trained domain-specific generative models, we design an optimization framework that explicitly enforces the independence constraint in the gradient domain. The gradient separation problem is equivalent to finding the optimal ĜB that minimizes the following objective function:

C = min_{ĜB} [ MI(ĜB, ĜS) + MI(ĜB, ĜST) + λe exp(−ĜB²/σ²) ]   (7)

The first and second terms are the mutual information terms that enforce the independence between the separated gradient fields, where ĜS = (∇I1 − aĜB)/b and ĜST = (∇I2 − cĜB)/d. As pointed out, incorrect gradient separation will increase the mutual information between ĜB and ĜS, or ĜB and ĜST; minimizing the mutual information terms thus reduces the separation error. The last energy term in the objective function is important to prevent the trivial solution where the bone gradient is all zero (i.e., ĜB = 0, ĜS = ∇I1/b and ĜST = ∇I2/d). This term applies a penalty λe to pixels that have zero bone gradient; it favors solutions where the estimated bone gradient is strong and still satisfies the independence constraints.

Mutual information is usually very computationally expensive to optimize. For efficient optimization, energy terms based on local neighborhood summation are further

derived in the next subsection based on the Taylor expansion of mutual information.
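A direct way to read Eq. (7): every candidate ĜB implies ĜS and ĜST through Eq. (6), and the cost combines two MI terms with the zero-gradient penalty. The sketch below is illustrative only (the histogram MI estimator and taking the per-image mean of the penalty are our simplifications; the paper applies the penalty per pixel):

```python
import numpy as np

def mi(x, y, bins=32):
    """Plug-in histogram estimate of mutual information (Eq. 5)."""
    pxy, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def cost(G_B, grad_I1, grad_I2, a, b, c, d, lam=1.0, sigma=1.0):
    """Objective of Eq. (7) for a candidate bone gradient field G_B."""
    G_S = (grad_I1 - a * G_B) / b        # implied soft-tissue gradient
    G_ST = (grad_I2 - c * G_B) / d       # implied deformed soft tissue
    penalty = lam * np.exp(-G_B ** 2 / sigma ** 2).mean()
    return mi(G_B, G_S) + mi(G_B, G_ST) + penalty

rng = np.random.default_rng(2)
gI1 = rng.standard_normal((32, 32))
gI2 = rng.standard_normal((32, 32))
# The all-zero bone gradient zeroes the MI terms but pays the full penalty.
C_trivial = cost(np.zeros_like(gI1), gI1, gI2, 0.7, 0.3, 0.4, 0.6)
C_random = cost(rng.standard_normal((32, 32)), gI1, gI2, 0.7, 0.3, 0.4, 0.6)
```

The trivial solution ĜB = 0 makes both MI terms vanish but incurs the maximal penalty λe, which is exactly why the last term in Eq. (7) is needed.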

2.4. Optimization and Implementation

To find the optimal solution of the objective function in Eq. (7), we use an iterative approach. The bone gradient field is initialized with ĜB(0) = 0. In each iteration, we update ĜB(k+1) to minimize the objective function. Since the mutual information terms are evaluated based on the distributions of ĜB, ĜS and ĜST, we use the result of the previous iteration to estimate the distributions, and they are refined after each iteration.

The challenge in the optimization is the complexity of calculating the exact mutual information terms [6, 2]. An approximation to the mutual information terms based on Taylor expansion is derived in this section for efficient optimization. The proposed approximation can be combined with other types of optimization procedures as well, e.g., the graph cut method in [6]. The MI terms use the standard mutual information definition based on the gradient distributions and can be expressed by the entropy terms H(ĜB), H(ĜS), and the joint entropy H(ĜB, ĜS):

MI(ĜB, ĜS) = H(ĜB) + H(ĜS) − H(ĜB, ĜS)
           = −Σ_z P1(z) log P1(z) − Σ_z P2(z) log P2(z) + Σ_{z1,z2} P1,2(z1, z2) log P1,2(z1, z2)   (8)

where P1(z), P2(z) and P1,2(z1, z2) are the probability densities of ĜB, ĜS and their joint density, respectively. These probabilities should be estimated directly from the images. A Parzen window is used to approximate each probability by a superposition of Gaussian densities:

P1(z) = (1/|D|) Σ_p Gψ(z − n1,p)
P2(z) = (1/|D|) Σ_p Gψ(z − n2,p)
P1,2(z1, z2) = (1/|D|) Σ_p Gψ[(z1, z2) − (n1,p, n2,p)]   (9)

where Gψ(z) = (1/√((2π)²|ψ|)) exp(−½ zᵀψ⁻¹z) and |D| is the number of samples. n1,p and n2,p are the estimated gradient magnitudes in the bone and soft-tissue layers at position p, respectively. When updating the estimates ĜB, ĜS and ĜST at iteration k for pixel p, the densities P1(z), P2(z) and P1,2(z1, z2) need to be re-estimated. Finding the best update to minimize MI(ĜB, ĜS) requires re-estimating these probabilities for each possible value of the new ĜB, ĜS and ĜST, which is computationally prohibitive.
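In one dimension, the Parzen estimate of Eq. (9) is simply an average of Gaussians centred on the samples. A small illustrative sketch (function name and bandwidth are ours):

```python
import numpy as np

def parzen_density(samples, grid, h=0.1):
    """Parzen-window estimate (cf. Eq. 9): a mean of Gaussians centred
    on the samples, evaluated on a 1-D grid of gradient values."""
    # (grid[:, None] - samples[None, :]) broadcasts to |grid| x |D|
    diff = grid[:, None] - samples[None, :]
    kernel = np.exp(-0.5 * (diff / h) ** 2) / (np.sqrt(2 * np.pi) * h)
    return kernel.mean(axis=1)

rng = np.random.default_rng(3)
samples = rng.normal(0.0, 1.0, 500)
grid = np.linspace(-5.0, 5.0, 201)
p = parzen_density(samples, grid, h=0.1)
mass = p.sum() * (grid[1] - grid[0])   # Riemann sum: ~1 for a density
```

The estimate is non-negative and integrates to approximately one over a grid that covers the samples, as a density should.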

For efficient computation, we use the Taylor expansion to approximate the mutual information, similar to [6]. The idea is to approximate the mutual information with summation operations, so the computational cost decreases dramatically. Since P1(z) depends on ĜB, we use the Taylor expansion x log x = −x0 + (1 + log x0)x + O((x − x0)²) to approximate H(ĜB) as:

H(ĜB) = −Σ_z P1(z) log P1(z)
      ≈ Σ_z [P1⁰(z) − (1 + log P1⁰(z)) · P1(z)]
      = Σ_z P1⁰(z) − Σ_z P1(z) − Σ_z log(P1⁰(z)) · P1(z)
      = −Σ_z log(P1⁰(z)) · P1(z)   (10)
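The practical payoff of this linearization is that, per iteration, the table f = log(P⁰) ⊗ Gψ is built once, after which every candidate update costs one table lookup per pixel instead of a full density re-estimation. A sketch under an assumed 1-D bin discretization (the function names and the clamping constant are ours):

```python
import numpy as np

def lookup_table(p0, kernel_sigma=2.0):
    """f(z) = (log p0) convolved with a Gaussian kernel.
    p0: previous iteration's density over a regular grid of gradient bins."""
    radius = int(4 * kernel_sigma)
    z = np.arange(-radius, radius + 1)
    g = np.exp(-0.5 * (z / kernel_sigma) ** 2)
    g /= g.sum()
    # Clamp to avoid log(0) in empty bins.
    return np.convolve(np.log(np.maximum(p0, 1e-12)), g, mode="same")

def entropy_from_table(f, bin_indices):
    """H ~= -(1/|D|) * sum_p f(n_p): one lookup per pixel."""
    return -f[bin_indices].mean()

rng = np.random.default_rng(4)
idx = rng.integers(0, 32, 5000)                 # per-pixel gradient bins
p0 = np.bincount(idx, minlength=32) / idx.size  # previous-iteration density
f = lookup_table(p0)
H = entropy_from_table(f, idx)
```

For a near-uniform distribution over 32 bins, the lookup-based value sits near log 32 nats (edge effects of the finite kernel aside), matching the plug-in entropy it approximates.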

where P1⁰(z) is the probability density of the estimated bone gradient from the previous iteration and P1(z) is the new probability density after the update. From Eq. (9) and (10), we can compute H(ĜB) as follows:

H(ĜB) ≈ −(1/|D|) Σ_p Σ_z log(P1⁰(z)) · Gψ(z − n1,p)
      = −(1/|D|) Σ_p f1(n1,p)   (11)

where fi(z) = log(Pi⁰(z)) ⊗ Gψ(z) and '⊗' denotes convolution. Similarly, we can approximate the joint entropy term in Eq. (8) as follows:

H(ĜB, ĜS) ≈ −(1/|D|) Σ_p Σ_{z1,z2} log(P1,2⁰(z1, z2)) · Gψ((z1, z2) − (n1,p, n2,p))
          = −(1/|D|) Σ_p f1,2(n1,p, n2,p)   (12)

where f1,2(z1, z2) = log(P1,2⁰(z1, z2)) ⊗ Gψ(z1, z2). Based on these approximations, we can find the cost value in Eq. (7) efficiently without re-estimating the probability density for every possible value. For iteration k+1, we first estimate P1⁰(z), P2⁰(z) and P1,2⁰(z1, z2) based on ĜB(k), ĜS(k) and ĜST(k). Then we convolve them with the Gaussian kernels Gψ(z) and Gψ(z1, z2) to obtain f1(z), f2(z) and f1,2(z1, z2). For any update ĜB(k+1), we can then calculate the cost of the MI terms by looking up the corresponding values in f() and find the update minimizing the cost. In our implementation, we set a search range r and search step δ so that in each iteration we update ĜB(k+1) = arg min_{−r≤δ≤r} C(ĜB(k) + δ) until convergence.

2.5. Layer Reconstruction

After the gradient fields are separated (i.e., we have ĜS and ĜST), we still need to reconstruct each layer from its gradient field. Suppose we have the gradient field of one layer, e.g., G = ĜS; our task is to reconstruct the layer Ŝ whose gradient field matches G. One natural way to achieve this is to solve the equation ∇Ŝ = G. However, since the gradient field is estimated, it is not necessarily integrable: some part of the estimated gradient may violate ∇ × G = 0 (i.e., the curl of a gradient field is 0). In such a case, our task is to find a potential function Ŝ whose gradients are closest to G in the sense of least squares by searching the space of all 2D potential functions, that is, to minimize the following integral in 2D space:

f = min_Ŝ ∬ (Ŝx − Gx)² + (Ŝy − Gy)² dx dy
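Setting the variation of that least-squares integral to zero yields the Poisson equation ∇²Ŝ = div G. A compact sketch that solves it with FFTs under a periodic-boundary assumption (the paper instead uses a given boundary condition, so this is an illustrative simplification):

```python
import numpy as np

def poisson_reconstruct(gx, gy):
    """Recover a layer S (up to an additive constant) whose gradient
    best matches (gx, gy) in the least-squares sense, by solving the
    Poisson equation lap(S) = div(G). Periodic boundaries are assumed
    so the FFT diagonalizes the Laplacian."""
    h, w = gx.shape
    # Divergence via backward differences (adjoint of forward differences).
    div = (gx - np.roll(gx, 1, axis=1)) + (gy - np.roll(gy, 1, axis=0))
    # Eigenvalues of the periodic 5-point Laplacian.
    fx = np.fft.fftfreq(w)[None, :]
    fy = np.fft.fftfreq(h)[:, None]
    denom = (2 * np.cos(2 * np.pi * fx) - 2) + (2 * np.cos(2 * np.pi * fy) - 2)
    denom[0, 0] = 1.0              # DC term: the mean of S is unconstrained
    return np.real(np.fft.ifft2(np.fft.fft2(div) / denom))

rng = np.random.default_rng(5)
S = rng.random((16, 16))
gx = np.roll(S, -1, axis=1) - S    # periodic forward differences of S
gy = np.roll(S, -1, axis=0) - S
rec = poisson_reconstruct(gx, gy)  # recovers S up to its mean
```

On a curl-free field (here, the exact forward differences of a periodic image) the reconstruction is exact up to the additive constant; on the estimated, possibly non-integrable gradient fields it returns the least-squares potential.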