NON-RIGID POINT SET REGISTRATION WITH MULTIPLE FEATURES HaoLin Tang 1, 2 , Yang Yang 1, 2, ∗ 1. School of Information Science and Technology, Yunnan Normal University Kunming,650092,China 2. The Engineering Research Center of GIS Technology in Western China of Ministry of Education of China, Yunnan Normal University,Kunming, 650092,China * Email: yyang [email protected](Corresponding Author)

ABSTRACT

edge connections [1], the applicabilities of shape context and graph based methods are limited, respectively. What’s more, it is also difficult to attain a good registration with only a single estimation for relatively large non-rigid deformation.

We present a new method for non-rigid registration with multiple features in this work. The proposed method is based on an alternating two-step process:correspondence estimation and transformation updating. We first define two vector features for measuring global and local structural differences between two point sets, respectively. We then combine the two features to build a multi-feature based energy function which provides a novel way to estimate correspondences by minimizing global or local feature differences using a linear assignment solution. To enhance the interaction between the two steps, we design an annealing scheme to gradually change the energy minimization from local to global feature differences and the thin plate spline transformation from rigid to non-rigid during the registration process. The registration results demonstrate that our method outperforms four state-ofthe-art methods in most experiments.

Iterative methods contain two alternating steps: (i) correspondence estimation and (ii) transformation updating. The TPS-RPM method [2] is one of the most notable methods in this area. It adopts softassign [3] and deterministic annealing [4] to estimate correspondences and control thin plate spline [5] (TPS) transformation updating, respectively. Myronenko et al. [6] introduced a Coherent Points Drift algorithm and the registration is treated as a maximum likelihood estimation problem with motion coherence constraint [7] for preserving the topological structure of the point sets. Later, they [8] (CPD) extended the former algorithm [6] for both rigid and non-rigid registration, and employed the fast Gauss transform [9] and the low-rank matrix approximation [10] for a fast registration. Recently, Jian and Vemuri[1] (GMMREG) proposed a Gaussian mixture model approach for both rigid and non-rigid registration. They treat the problem of point set registration is reformulated as the problem of aligning two Gaussian mixtures, and the transformation updating based on minimizing the L2 distance [11] between the two models. More recently, Yang et al.[12] proposed a method based on mixture distance and Ma et al.[13, 14] applied the shape context feature to establish rough correspondence, and estimated the transformation using a robust L2E estimator [15].

Index Terms— Non-rigid, point set registration, multiple features, correspondence estimation, transformation updating

1.

Introduction

Non-rigid point registration is a fundamental problem in many computer vision, machine learning, medical imaging, pattern recognition and geographic information system applications. At the present time, non-rigid point set registration methods have two major types of classification according to their methodological differences: (i) iterative or non-iterative methods, and (ii) learning or no learning methods. We mainly introduce current methods based on the first category because of concentrating on the iterative approach for non-rigid point set registration. Correspondences between two point sets in noniterative methods are recovered by a single estimation using high level structures such as shape context descriptors and graph relations which are two of the most popular features for non-iterative methods. Between two point sets, such methods seek to minimize the point distribution differences or graph relation differences for recovering correspondences. However, when neighboring points are close to each other and have similar 978-1-5090-0654-0/16/$31.00 ©2016 IEEE

In this work, we present a non-rigid point set registration method with multiple features. Our contributions in this work include the following three aspects: (i) We proposed a global feature descriptor which is defined by a sum of point to point vectors. (ii) We proposed a local feature descriptor which is defined by a sum of point to point vectors between two sets of neighboring points for measuring local similarities between two point sets. (iii) A multi-feature based energy function is designed to employ multiple features for correspondence estimation between two point sets. 268

ICALIP 2016

2.

Method

2.2.

Suppose we have two point sets {si , i = 1, 2, ..., n} and {tj , j = 1, 2, ..., m} for the source point set s and the target point set t, respectively.

Multi-feature based Energy Function

The multi-feature based energy function is defined as E(M) = ∑ ∑ Gsi tj Mij + α ∑ ∑ Lsi tj Mij i∈n j∈m

2.1.

Global and Local Feature Difference

where ∑i∈n ∑j∈m Gsi tj Mij and ∑i∈n ∑j∈m Lsi tj Mij represent the aforementioned global and local feature differences, and can be considered as linear assignment problems. The element values in Gst and Lst are resized in [0, 1], respectively. n and m indicate the point numbers in s and t, respectively. The binary correspondence matrix Mij represents the corresponding relationship between s and t, taking value 1 if si corresponds to tj and 0 otherwise. Mij still satisfies ∑ Mij = 1 for j ∈ m and ∑ Mij = 1 for i ∈ n. α is a weighting parameter that controls the balance between the ∑i∈n ∑j∈m Gsi tj Mij and ∑i∈n ∑j∈m Lsi tj Mij . During the registration, α is gradually reduced by a linear annealing scheme T = T ⋅ r where r is the annealing rate as 0.7.

The global feature difference designed for measuring global feature differences from s and t, and formulated as Ð→ Gsi tj =∥ Ð v→ si − vtj ∥

(1)

where Gst describes the global structural differences between s to t, The value of each element in Gst is the length Ð→ Ð→ of the subtraction of two sum vectors Ð v→ si and vtj . vsi and Ð → vtj are the presented global feature descriptor and defined as n Ð → Ð → Ð v→ (2) ∑ (sk − si ) si = k=1 and k≠i

Ð → Ð → ( tk − tj )

m

Ð v→ tj =

∑

(3)

k=1 and k≠j

2.3.

Ð → Ð → Ð → where (Ð s→ k − si ) and (tk − tj ) are the Euclidean vectors from the point si to the point sk and from the point tj to Ð→ the point tk , respectively. Ð v→ si and vtj are used to describe the global structural features for each point in the point sets s and t, respectively. The local feature difference is used to estimate local structural differences from s and t, and formulated as → → Lsi tj =∥ Ð v T (N(si )K ,tj ) − Ð v N(tj )K ∥

(4)

2.3.1. Correspondence Estimation At each iteration, correspondences between sw to t are estimated by minimizing the multi-feature based energy function (8). We treat the problem of the multi-feature based energy optimization is reformulated as a linear assignment problem, and the Jonker-Volgenant algorithm [16], which is used to solve the linear assignment problem, provides the shortest augmenting path and has worst-cost time O(N 3 ). For solving the integer cost problem, the global and local feature difference matrices Gsw t and Lsw t are rounded by ⌊Gsw t × R⌉ and ⌊Lsw t × R⌉, respectively, where R is set by 106 . To address the non-square cost matrix problem, the non-square matrices Gsw t and Lsw t are converted into a square matrix by assigning dummy entries [17]. Then, E(M) is solved as previously described and still show the best solution. At the current iteration, the new corresponding points tc ( in t) is then updated by

(5)

. We consider that each point and its neighboring points in the point set s and t construct a small local segment, point correspondences between point set s and t can be converted to estimate geometric similarities of such local segments. More specifically, we first move a set of neighboring points N (si )k=1,...,K to each point tj=1,2,...,m ÐÐ→ in t according to a displacement vector si tj . The corresponding point of si is then determined by the point tj has the minimum length of the subtraction of two sum vectors Ð → → v T ((N(si )K ,tj )) and Ð v N(tj )K . The two vectors are the proposed local feature descriptor and defined as K

Ð → v T ((N(si )K ,tj )) =

∑

Ð → (Ð s→ k − si )

(6)

k=1 and k≠i

Ð → v N(tj )K =

K

∑

Ð → Ð → (tk − tj )

Main Process

We employ a warping template sw and its initial location is defined as sw = s. During the registration, sw is gradually changed its locations and geometrical shapes to close to the target point set t. The main process of our method is first (i) to employ the aforementioned multi-feature based energy function (8) to estimate correspondences between sw and t at each iteration, and then (ii) to update the locations of sw using a regularized TPS transformation built by the recovered current correspondences in (i). (i) and (ii) are iterated so that sw gradually approaches t, and finally matches the exact corresponding points in t.

. K is the number of neighboring points. N (si )K and N (tj )K are two sets of closest point for the points si and tj , respectively. T is the translation function defined as T (N (si )K , tj ) = N (si )K + (tj − si )

(8)

i∈n j∈m

(7)

k=1 and k≠j

Ð → Ð → Ð → where (Ð s→ k − si ) and (tk − tj ) are the Euclidean vectors from the point si to the point sk and from the point tj to the point tk , respectively and used to describe the local structural features for each point in the point sets N (si )K and N (tj )K , respectively.

tc = M ⋅ t

(9)

2.3.2. Transformation Updating The transformation is bulit by tc and the original s after the current corresponding points tc updated. In this work, 269

we map s to tc using the TPS transformation f (s, d, w) = s ⋅ d + φ(s) ⋅ w

feature differences, and (ii) to reduce the regularization parameter λ in equation (11) and equation (13) to adjust the TPS transformation from a more rigid to a more non-rigid. Based on an initial Trial-and-Error Experiment using the Fish1 point set [2], Tinit is set to 1/10 of the largest squared distance between s and t, and Tf inal is chosen to be equal to 1/8 of the mean squared distance between the closest points in a. r is usually set to 0.7.

(10)

where w is a non-rigid warping coefficient and d is an affine coefficient. The TPS kernel function φ(s) is defined by φ(s) = ∥s − sc ∥2 log ∥s − sc ∥ and φ(s) = ∥s − sc ∥ for the 2D and 3D cases, respectively. sc is a set of control points chosen from s. For mapping s to its correspondence tc with the optimized d and w, the regularized TPS energy function is defined as

● Weighting parameters: α is reduced by α = αinit × T at each iteration, and the initial values αinit is set to the squared number of the neighboring points K 2 .

ETPS (d, w) = ∥tc − sd − Φw∥2 + λtrace(wT Φw) (11)

● Regularization parameters: λ is also reduced by λ = λinit × T at each iteration, and the initial value of λinit is set to the length of point set a,

where the regularization parameter λ penalizes the nonrigid warping coefficient w, and is controlled by the same annealing scheme applied to the aforementioned weighting parameter α in function (8). Φ is the kernel matrix calculated by the kernel function φ(s). For the least-squares solutions for the d and w, the QR decomposition [18] is used to separate the affine and nonrigid warping space by s = QR = [Q1 ∣Q2 ]

⎛ R1 ⎞ ⎝ 0 ⎠

● The number of neighboring points: K is set by the minimum number of points used for distinguishing local structures. For example, to distinguish between a corner (includes two closest points) and a cross (includes four closest points), at least four closest points are required. The default K is set to 5 for both 2D and 3D.

3.

(12)

We implemented the main process of our method in Matlab, and the Jonker-Volgenant algorithm in C++ as a Matlab mex function. All experiments are tested on a PC with 3.6GHz Intel Core CPU, 8GB memory. We tested the performance of our method in the four series of experiments: (i) 2D contour registration (2D synthetic point set) (ii) 3D contour registration (3D face point set) (iii) Image features (IMM Face Database) (iv) Sequence images (CMU house sequence) and compared against four state-of-the-art iterative methods: (i) CPD [8] (ii) L2E [13, 14] (iii) Caetano et al.[19] (iv) Leordeanu et al.[20]

where Q1 is an N × D matrix, Q2 is N × (N − D), and R1 is D × D. Both Q1 and Q2 have orthogonal columns. Thus (11) becomes ETPS (γ, d) = ∥QT2 tc − QT2 ΦQ2 γ∥2 +∥QT1 tc − R1 d − QT1 ΦQ2 γ∥2 + λγ T QT2 ΦQ2 γ

(13)

where w = Q2 γ and γ is (N − D − 1) × (D + 1). The leastsquares solution for equation (13) can first be minimized with respect to γ and then with respect to d. The final solutions for w and d are ˆ = Q2 γ = Q2 (QT2 ΦQ2 + λIN −D−1 )−1 QT2 tc w ˆ = R−1 (QT tc − Φw) d 1

(14)

3.1.

sw = s ⋅ d + Φ ⋅ w

In this work, we selected four point sets: Line [2], Fish1 [2], Hand and 3D Face [8] to run the experimental comparison. The generation of target point sets, error measurement and performance assessment were followed the same way in CPD [6, 8]. Since the proposed global and local feature descriptors were defined by the sum of vectors which are sensitive to the registration patterns including outliers, we only evaluated the performance of our method without outliers in this work.

(16)

w

After the location of s updated, we return to the first step (2.4.1) for continuing the registration process until the final temperature Tf inal of the annealing scheme is reached.

2.4.

Experimental Setup

(15)

The new location of the warping template sw is updated by

Experiments

Our Algorithm and Parameter Setting

● Target point sets:

There are four groups of free parameters in our method: ● Annealing parameters: T is gradually reduced by a linear annealing scheme T = T × r where r is the annealing rate. The objectives of designing the annealing scheme are: (i) to reduce the weighting parameter α in function (8) to control the multi-feature based energy minimization from local feature differences to global

(a) Deformation: eight control points (six control points for 3D) were set on the boundary of each source point set. Each control point was set with four free moving directions (nine moving directions in 3D) and 0.2 moving distance. The order and the moving directions of control points were 270

randomly determined. TPS transformation was employed to warp source point sets using the eight (or six) control points, and the order of the warped source points was then randomly rearranged. The degree of deformation was defined as the number of moved control points (the maximum degree is the eight and six degree in 2D and 3D, respectively) since the higher level of point perturbations produces the higher deformation. (b) Noise: We use Gaussian white noise with a zero mean and standard deviation from 0.01 to 0.05 to design five noise levels. (c) Rotation: Since the deformation often along with a rotation, we consider that evaluating performance under moderate rotations is essential. we focus on the rotation range from −30○ to 30○ with a 15○ interval (the target point ets in the 3D cases were rotated on the vertical axis) since beyond this range some methods showed an unstable performance.

Fig. 1. Comparison of our results (∗) against CPD (◯) and L2E (◻) on 2D or 3D contour point set registration.

For (b) and (c), each source point set is also randomly warped by a medium degree of deformation (the fourth degree for 2D and the third degree for 3D) before adding noise or rotations. ● Error measurement: In order to follow a direct and fair comparison with the other two methods, we adopted the same error measurement in CPD[6, 8]: the mean squared distance between the warped template sw and ground truth (i.e., their actual corresponding points). ● Performance assessment: The mean error was used to compare the performances between the methods. Experiments on each point set are random repeated one hundred times for each setting (i.e., each degree of deformation, noise level and rotation degree).

3.2.

Fig. 2. Registration examples on 2D or 3D contour point set registration. Initial pose in the first and fifth column. From the second column to the fourth column or the sixth column to the last colunm are examples made by Ours, CPD and L2E, respectively. From the top row to the third row or the fourth row to the last row are examples in the deformation (the 8th degree), noise(0.03) and rotation (−30○ ) experiments. Black rectangles are used to highlight registration errors.

Experiments on 2D Contour Registration

In this section, we evaluate the performance of our method against CPD and L2E on different 2D synthetic point sets. The comparisons results are discussed as follows: results are discussed as follows:

3.2.3.

Hand Point Set

The performance results are given in the third row of Fig.1. Our method shows accurate alignments in all experiments, and shows the best performances over all degrees of deformation and all degrees of rotation. All the three methods show very close accurate alignments on the performance of all noise levels.Registration examples are shown in Fig.2.

3.2.1. Line Point Set The performance statistics (the mean error and its standard deviation) are shown in the first row of Fig.1. Our method shows accurate alignments in all experiments, and gives the best performances over all degrees of deformation, all noise levels and all degrees of rotation. Registration examples are shown in Fig.2.

3.3.

3.2.2. Fish1 Point Set The performances of this section are shown in the second row of Fig.1. Our method shows accurate alignments in all experiments, and gives the best performances over all degrees of deformation and all degrees of rotation. All the three methods give very close accurate alignments on the performance of all noise levels. Registration examples are shown in Fig.2.

Experiments on 3D Contour Registration

In the second series of experiments, we evaluate the performance of our method in 3D registration patterns. The 3D face point set has been used to test the performances of CPD for 3D registration pattern. The performances of our method, CPD and L2E are given in the fourth row of Fig.1. Our method shows accurate alignments in all experiments. Registration examples are shown in Fig.2. 271

3.4.

Experiments on Image Features

Our method shows the best accuracies. We compared against four state-of-the-art iterative methods. The results are shown in Table 2. Registration examples of our method are shown in Fig.5. Table 2. Matching rates on the CMU house for all possible image pairs.

Method Ours CPD L2E Caetano et al Leordeanu et al

Fig. 3. Examples of images in the IMM Face Database.Each row is an individual with different expressions,poses and illuminations.

In this section, we test our method on image feature and compare. We choose the publicly available benchmark IMM Face Database for evaluation.Some examples of the face images are shown in Fig.3. In our experiments, we choose the first eight faces for evaluation. In the each faces, we construct 36 pairs sets without repeat were registered. The results are shown in Table 1. Our method gives higher registration accuracies with respecting to the performances of CPD and L2E. Some registration examples are shown in Fig.4. Table 1. The mean of matching rates on the first eight faces from IMM Face Database.

Method Ours CPD L2E

House 100% 99.57% 97.75%

ABSTRACT

edge connections [1], the applicabilities of shape context and graph based methods are limited, respectively. What’s more, it is also difficult to attain a good registration with only a single estimation for relatively large non-rigid deformation.

We present a new method for non-rigid registration with multiple features in this work. The proposed method is based on an alternating two-step process:correspondence estimation and transformation updating. We first define two vector features for measuring global and local structural differences between two point sets, respectively. We then combine the two features to build a multi-feature based energy function which provides a novel way to estimate correspondences by minimizing global or local feature differences using a linear assignment solution. To enhance the interaction between the two steps, we design an annealing scheme to gradually change the energy minimization from local to global feature differences and the thin plate spline transformation from rigid to non-rigid during the registration process. The registration results demonstrate that our method outperforms four state-ofthe-art methods in most experiments.

Iterative methods contain two alternating steps: (i) correspondence estimation and (ii) transformation updating. The TPS-RPM method [2] is one of the most notable methods in this area. It adopts softassign [3] and deterministic annealing [4] to estimate correspondences and control thin plate spline [5] (TPS) transformation updating, respectively. Myronenko et al. [6] introduced a Coherent Points Drift algorithm and the registration is treated as a maximum likelihood estimation problem with motion coherence constraint [7] for preserving the topological structure of the point sets. Later, they [8] (CPD) extended the former algorithm [6] for both rigid and non-rigid registration, and employed the fast Gauss transform [9] and the low-rank matrix approximation [10] for a fast registration. Recently, Jian and Vemuri[1] (GMMREG) proposed a Gaussian mixture model approach for both rigid and non-rigid registration. They treat the problem of point set registration is reformulated as the problem of aligning two Gaussian mixtures, and the transformation updating based on minimizing the L2 distance [11] between the two models. More recently, Yang et al.[12] proposed a method based on mixture distance and Ma et al.[13, 14] applied the shape context feature to establish rough correspondence, and estimated the transformation using a robust L2E estimator [15].

Index Terms— Non-rigid, point set registration, multiple features, correspondence estimation, transformation updating

1.

Introduction

Non-rigid point registration is a fundamental problem in many computer vision, machine learning, medical imaging, pattern recognition and geographic information system applications. At the present time, non-rigid point set registration methods have two major types of classification according to their methodological differences: (i) iterative or non-iterative methods, and (ii) learning or no learning methods. We mainly introduce current methods based on the first category because of concentrating on the iterative approach for non-rigid point set registration. Correspondences between two point sets in noniterative methods are recovered by a single estimation using high level structures such as shape context descriptors and graph relations which are two of the most popular features for non-iterative methods. Between two point sets, such methods seek to minimize the point distribution differences or graph relation differences for recovering correspondences. However, when neighboring points are close to each other and have similar 978-1-5090-0654-0/16/$31.00 ©2016 IEEE

In this work, we present a non-rigid point set registration method with multiple features. Our contributions in this work include the following three aspects: (i) We proposed a global feature descriptor which is defined by a sum of point to point vectors. (ii) We proposed a local feature descriptor which is defined by a sum of point to point vectors between two sets of neighboring points for measuring local similarities between two point sets. (iii) A multi-feature based energy function is designed to employ multiple features for correspondence estimation between two point sets. 268

ICALIP 2016

2.

Method

2.2.

Suppose we have two point sets {si , i = 1, 2, ..., n} and {tj , j = 1, 2, ..., m} for the source point set s and the target point set t, respectively.

Multi-feature based Energy Function

The multi-feature based energy function is defined as E(M) = ∑ ∑ Gsi tj Mij + α ∑ ∑ Lsi tj Mij i∈n j∈m

2.1.

Global and Local Feature Difference

where ∑i∈n ∑j∈m Gsi tj Mij and ∑i∈n ∑j∈m Lsi tj Mij represent the aforementioned global and local feature differences, and can be considered as linear assignment problems. The element values in Gst and Lst are resized in [0, 1], respectively. n and m indicate the point numbers in s and t, respectively. The binary correspondence matrix Mij represents the corresponding relationship between s and t, taking value 1 if si corresponds to tj and 0 otherwise. Mij still satisfies ∑ Mij = 1 for j ∈ m and ∑ Mij = 1 for i ∈ n. α is a weighting parameter that controls the balance between the ∑i∈n ∑j∈m Gsi tj Mij and ∑i∈n ∑j∈m Lsi tj Mij . During the registration, α is gradually reduced by a linear annealing scheme T = T ⋅ r where r is the annealing rate as 0.7.

The global feature difference designed for measuring global feature differences from s and t, and formulated as Ð→ Gsi tj =∥ Ð v→ si − vtj ∥

(1)

where Gst describes the global structural differences between s to t, The value of each element in Gst is the length Ð→ Ð→ of the subtraction of two sum vectors Ð v→ si and vtj . vsi and Ð → vtj are the presented global feature descriptor and defined as n Ð → Ð → Ð v→ (2) ∑ (sk − si ) si = k=1 and k≠i

Ð → Ð → ( tk − tj )

m

Ð v→ tj =

∑

(3)

k=1 and k≠j

2.3.

Ð → Ð → Ð → where (Ð s→ k − si ) and (tk − tj ) are the Euclidean vectors from the point si to the point sk and from the point tj to Ð→ the point tk , respectively. Ð v→ si and vtj are used to describe the global structural features for each point in the point sets s and t, respectively. The local feature difference is used to estimate local structural differences from s and t, and formulated as → → Lsi tj =∥ Ð v T (N(si )K ,tj ) − Ð v N(tj )K ∥

(4)

2.3.1. Correspondence Estimation At each iteration, correspondences between sw to t are estimated by minimizing the multi-feature based energy function (8). We treat the problem of the multi-feature based energy optimization is reformulated as a linear assignment problem, and the Jonker-Volgenant algorithm [16], which is used to solve the linear assignment problem, provides the shortest augmenting path and has worst-cost time O(N 3 ). For solving the integer cost problem, the global and local feature difference matrices Gsw t and Lsw t are rounded by ⌊Gsw t × R⌉ and ⌊Lsw t × R⌉, respectively, where R is set by 106 . To address the non-square cost matrix problem, the non-square matrices Gsw t and Lsw t are converted into a square matrix by assigning dummy entries [17]. Then, E(M) is solved as previously described and still show the best solution. At the current iteration, the new corresponding points tc ( in t) is then updated by

(5)

. We consider that each point and its neighboring points in the point set s and t construct a small local segment, point correspondences between point set s and t can be converted to estimate geometric similarities of such local segments. More specifically, we first move a set of neighboring points N (si )k=1,...,K to each point tj=1,2,...,m ÐÐ→ in t according to a displacement vector si tj . The corresponding point of si is then determined by the point tj has the minimum length of the subtraction of two sum vectors Ð → → v T ((N(si )K ,tj )) and Ð v N(tj )K . The two vectors are the proposed local feature descriptor and defined as K

Ð → v T ((N(si )K ,tj )) =

∑

Ð → (Ð s→ k − si )

(6)

k=1 and k≠i

Ð → v N(tj )K =

K

∑

Ð → Ð → (tk − tj )

Main Process

We employ a warping template sw and its initial location is defined as sw = s. During the registration, sw is gradually changed its locations and geometrical shapes to close to the target point set t. The main process of our method is first (i) to employ the aforementioned multi-feature based energy function (8) to estimate correspondences between sw and t at each iteration, and then (ii) to update the locations of sw using a regularized TPS transformation built by the recovered current correspondences in (i). (i) and (ii) are iterated so that sw gradually approaches t, and finally matches the exact corresponding points in t.

. K is the number of neighboring points. N (si )K and N (tj )K are two sets of closest point for the points si and tj , respectively. T is the translation function defined as T (N (si )K , tj ) = N (si )K + (tj − si )

(8)

i∈n j∈m

(7)

k=1 and k≠j

Ð → Ð → Ð → where (Ð s→ k − si ) and (tk − tj ) are the Euclidean vectors from the point si to the point sk and from the point tj to the point tk , respectively and used to describe the local structural features for each point in the point sets N (si )K and N (tj )K , respectively.

tc = M ⋅ t

(9)

2.3.2. Transformation Updating The transformation is bulit by tc and the original s after the current corresponding points tc updated. In this work, 269

we map s to tc using the TPS transformation f (s, d, w) = s ⋅ d + φ(s) ⋅ w

feature differences, and (ii) to reduce the regularization parameter λ in equation (11) and equation (13) to adjust the TPS transformation from a more rigid to a more non-rigid. Based on an initial Trial-and-Error Experiment using the Fish1 point set [2], Tinit is set to 1/10 of the largest squared distance between s and t, and Tf inal is chosen to be equal to 1/8 of the mean squared distance between the closest points in a. r is usually set to 0.7.

(10)

where w is a non-rigid warping coefficient and d is an affine coefficient. The TPS kernel function φ(s) is defined by φ(s) = ∥s − sc ∥2 log ∥s − sc ∥ and φ(s) = ∥s − sc ∥ for the 2D and 3D cases, respectively. sc is a set of control points chosen from s. For mapping s to its correspondence tc with the optimized d and w, the regularized TPS energy function is defined as

● Weighting parameters: α is reduced by α = αinit × T at each iteration, and the initial values αinit is set to the squared number of the neighboring points K 2 .

ETPS (d, w) = ∥tc − sd − Φw∥2 + λtrace(wT Φw) (11)

● Regularization parameters: λ is also reduced by λ = λinit × T at each iteration, and the initial value of λinit is set to the length of point set a,

where the regularization parameter λ penalizes the nonrigid warping coefficient w, and is controlled by the same annealing scheme applied to the aforementioned weighting parameter α in function (8). Φ is the kernel matrix calculated by the kernel function φ(s). For the least-squares solutions for the d and w, the QR decomposition [18] is used to separate the affine and nonrigid warping space by s = QR = [Q1 ∣Q2 ]

⎛ R1 ⎞ ⎝ 0 ⎠

● The number of neighboring points: K is set by the minimum number of points used for distinguishing local structures. For example, to distinguish between a corner (includes two closest points) and a cross (includes four closest points), at least four closest points are required. The default K is set to 5 for both 2D and 3D.

3.

(12)

We implemented the main process of our method in Matlab, and the Jonker-Volgenant algorithm in C++ as a Matlab mex function. All experiments are tested on a PC with 3.6GHz Intel Core CPU, 8GB memory. We tested the performance of our method in the four series of experiments: (i) 2D contour registration (2D synthetic point set) (ii) 3D contour registration (3D face point set) (iii) Image features (IMM Face Database) (iv) Sequence images (CMU house sequence) and compared against four state-of-the-art iterative methods: (i) CPD [8] (ii) L2E [13, 14] (iii) Caetano et al.[19] (iv) Leordeanu et al.[20]

where Q1 is an N × D matrix, Q2 is N × (N − D), and R1 is D × D. Both Q1 and Q2 have orthogonal columns. Thus (11) becomes ETPS (γ, d) = ∥QT2 tc − QT2 ΦQ2 γ∥2 +∥QT1 tc − R1 d − QT1 ΦQ2 γ∥2 + λγ T QT2 ΦQ2 γ

(13)

where w = Q2 γ and γ is (N − D − 1) × (D + 1). The leastsquares solution for equation (13) can first be minimized with respect to γ and then with respect to d. The final solutions for w and d are ˆ = Q2 γ = Q2 (QT2 ΦQ2 + λIN −D−1 )−1 QT2 tc w ˆ = R−1 (QT tc − Φw) d 1

(14)

3.1.

sw = s ⋅ d + Φ ⋅ w

In this work, we selected four point sets: Line [2], Fish1 [2], Hand and 3D Face [8] to run the experimental comparison. The generation of target point sets, error measurement and performance assessment were followed the same way in CPD [6, 8]. Since the proposed global and local feature descriptors were defined by the sum of vectors which are sensitive to the registration patterns including outliers, we only evaluated the performance of our method without outliers in this work.

(16)

w

After the location of s updated, we return to the first step (2.4.1) for continuing the registration process until the final temperature Tf inal of the annealing scheme is reached.

2.4.

Experimental Setup

(15)

The new location of the warping template sw is updated by

Experiments

Our Algorithm and Parameter Setting

● Target point sets:

There are four groups of free parameters in our method: ● Annealing parameters: T is gradually reduced by a linear annealing scheme T = T × r where r is the annealing rate. The objectives of designing the annealing scheme are: (i) to reduce the weighting parameter α in function (8) to control the multi-feature based energy minimization from local feature differences to global

(a) Deformation: eight control points (six control points for 3D) were set on the boundary of each source point set. Each control point was set with four free moving directions (nine moving directions in 3D) and 0.2 moving distance. The order and the moving directions of control points were 270

randomly determined. TPS transformation was employed to warp source point sets using the eight (or six) control points, and the order of the warped source points was then randomly rearranged. The degree of deformation was defined as the number of moved control points (the maximum degree is the eight and six degree in 2D and 3D, respectively) since the higher level of point perturbations produces the higher deformation. (b) Noise: We use Gaussian white noise with a zero mean and standard deviation from 0.01 to 0.05 to design five noise levels. (c) Rotation: Since the deformation often along with a rotation, we consider that evaluating performance under moderate rotations is essential. we focus on the rotation range from −30○ to 30○ with a 15○ interval (the target point ets in the 3D cases were rotated on the vertical axis) since beyond this range some methods showed an unstable performance.

Fig. 1. Comparison of our results (∗) against CPD (◯) and L2E (◻) on 2D or 3D contour point set registration.

For (b) and (c), each source point set is also randomly warped by a medium degree of deformation (the fourth degree for 2D and the third degree for 3D) before adding noise or rotations. ● Error measurement: In order to follow a direct and fair comparison with the other two methods, we adopted the same error measurement in CPD[6, 8]: the mean squared distance between the warped template sw and ground truth (i.e., their actual corresponding points). ● Performance assessment: The mean error was used to compare the performances between the methods. Experiments on each point set are random repeated one hundred times for each setting (i.e., each degree of deformation, noise level and rotation degree).

3.2.

Fig. 2. Registration examples on 2D or 3D contour point set registration. Initial pose in the first and fifth column. From the second column to the fourth column or the sixth column to the last colunm are examples made by Ours, CPD and L2E, respectively. From the top row to the third row or the fourth row to the last row are examples in the deformation (the 8th degree), noise(0.03) and rotation (−30○ ) experiments. Black rectangles are used to highlight registration errors.

Experiments on 2D Contour Registration

In this section, we evaluate the performance of our method against CPD and L2E on different 2D synthetic point sets. The comparisons results are discussed as follows: results are discussed as follows:

3.2.3.

Hand Point Set

The performance results are given in the third row of Fig.1. Our method shows accurate alignments in all experiments, and shows the best performances over all degrees of deformation and all degrees of rotation. All the three methods show very close accurate alignments on the performance of all noise levels.Registration examples are shown in Fig.2.

3.2.1. Line Point Set The performance statistics (the mean error and its standard deviation) are shown in the first row of Fig.1. Our method shows accurate alignments in all experiments, and gives the best performances over all degrees of deformation, all noise levels and all degrees of rotation. Registration examples are shown in Fig.2.

3.3.

3.2.2. Fish1 Point Set The performances of this section are shown in the second row of Fig.1. Our method shows accurate alignments in all experiments, and gives the best performances over all degrees of deformation and all degrees of rotation. All the three methods give very close accurate alignments on the performance of all noise levels. Registration examples are shown in Fig.2.

Experiments on 3D Contour Registration

In the second series of experiments, we evaluate the performance of our method in 3D registration patterns. The 3D face point set has been used to test the performances of CPD for 3D registration pattern. The performances of our method, CPD and L2E are given in the fourth row of Fig.1. Our method shows accurate alignments in all experiments. Registration examples are shown in Fig.2. 271

3.4.

Experiments on Image Features

Our method shows the best accuracies. We compared against four state-of-the-art iterative methods. The results are shown in Table 2. Registration examples of our method are shown in Fig.5. Table 2. Matching rates on the CMU house for all possible image pairs.

Method Ours CPD L2E Caetano et al Leordeanu et al

Fig. 3. Examples of images in the IMM Face Database.Each row is an individual with different expressions,poses and illuminations.

In this section, we test our method on image feature and compare. We choose the publicly available benchmark IMM Face Database for evaluation.Some examples of the face images are shown in Fig.3. In our experiments, we choose the first eight faces for evaluation. In the each faces, we construct 36 pairs sets without repeat were registered. The results are shown in Table 1. Our method gives higher registration accuracies with respecting to the performances of CPD and L2E. Some registration examples are shown in Fig.4. Table 1. The mean of matching rates on the first eight faces from IMM Face Database.

Method Ours CPD L2E

House 100% 99.57% 97.75%