
Automatic Multi-Camera Setup Optimization for Optical Tracking

Philippe A. Cerfontaine∗

Marc Schirski

Daniel Bündgens

Torsten Kuhlen

Virtual Reality Group, RWTH Aachen University

Figure 1: Camera setup optimization sequence for a five-sided CAVE with four cameras. The camera positions were constrained to remain on the open top side of the CAVE. Note the increased point sample density at head height, which improves head-tracking robustness.

ABSTRACT

We propose a method to determine the optimal camera alignment for a tracking system with multiple cameras, given the volume to be tracked and an initial camera setup. We use optimization strategies based on methods usually employed for solving nonlinear systems of equations. All approaches are fully automatic and take advantage of modern graphics hardware, since we also implement a GPU-based, accelerated visibility test. The algorithm automatically optimizes the whole setup by adjusting the given set of camera parameters. We can steer the optimization towards different goals depending on the desired application, e.g. the widest possible volume coverage or maximum camera visibility to overcome heavy occlusion problems during the tracking process. We also consider parameter constraints that the user may specify according to restrictions in the local environment where the cameras have to be mounted. This allows for a convenient definition of higher-level constraints for the camera setup.

CR Categories: G.1.6 [Numerical Analysis]: Optimization - Constrained optimization, Global optimization, Gradient methods, Simulated annealing; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism - Virtual reality; I.4.1 [Image Processing and Computer Vision]: Digitization and Image Capture - Camera calibration; I.4.8 [Image Processing and Computer Vision]: Scene Analysis - Motion, Tracking

Keywords: optical tracking, optimization

∗ e-mail: [email protected]

IEEE Virtual Reality 2006 March 25 - 29, Alexandria, Virginia, USA 1-4244-0223-9/06/$20.00 ©2006 IEEE

1 INTRODUCTION

The number of systems and methods using multiple cameras for three-dimensional reconstruction has rapidly increased over the past years [1, 2, 3, 4, 5]. These kinds of systems range from low-cost hardware like webcams to expensive tracking systems. All of these systems depend on the overlap between the view frusta of their cameras, so it is important to find the best possible alignment. If the setup is constructed incorrectly, the tracking quality will be poor. Therefore, it would be preferable to guarantee an optimal underlying camera setup from the start.

2 CAMERA SETUP DESCRIPTION

The aim is to track a moving object within a given volume without any data loss. This means that every position within this volume must be visible to at least two cameras at all times, which is a necessary condition for 3D reconstruction of positions throughout the volume. The problem is how to position the cameras such that this total coverage is reached. The goal is therefore to minimize the part of the volume that is unseen or covered by only one camera.

First, we need to describe the volume inside of which we want to perform tracking. To do so, we specify an arbitrary list of positions, which we will henceforth call the volume of interest. This approach gives us maximum flexibility for arbitrarily shaped tracking volumes and for regions of interest inside the volume that carry increased importance. Stressing the importance of such a region becomes almost trivial, since we can simply increase the number of positions in this area of the volume, or duplicate them.

The second important part of the tracking problem is the cameras themselves. To optimize the setup, our algorithm needs to know the cameras we use. In fact, optimizing the camera setup means optimizing the individual camera parameters. Thus our approach needs initial values for the parameters we want to optimize and fixed values for the constant parameters. To make this concrete, we name the parameters explicitly. The parameters necessary to decide whether a point in space is currently seen by a camera are its position and orientation, as well as the parameters describing the view frustum of the camera device. In addition, we need the six culling planes for maximum flexibility. This leaves us with twelve parameters per camera that we need to specify and optimize. Furthermore, six parameters concerning the camera's initial position and orientation have to be specified separately for each camera. We could use the algorithm described in [6] to compute them automatically from camera footage of known regular calibration shapes. Since these are the parameters we want to optimize, the measurements only need to be approximate. Providing an initial camera setup may seem complex, but it has the important advantage of guaranteeing the feasibility of the optimized camera setup. Since the initial setup is supposed to be feasible, the resulting solution, which represents the closest local minimum, is probably more realistic than a solution that deviates significantly from it.

Figure 2: Circular setting with central region of interest

3 CAMERA SETUP QUALITY

Once the camera parameter values are available, we need to evaluate how altering them affects the quality of the setup in terms of coverage of the specified volume of interest. To do this, we compute an n × m visibility matrix V, where n is the number of cameras and m is the number of positions. This matrix tells us whether the i-th camera sees the j-th position or not. This test can be performed efficiently on a modern GPU, because it is computationally simple and highly parallelizable. Otherwise, a simple view frustum test can be implemented on the CPU. Each position is tested for each camera, and the result is written to the visibility matrix V with v_ij ∈ {0, 1}.

Having computed this matrix, we can specify a scoring or evaluation function that takes the matrix as input. Depending on the given requirements and scenario, various evaluation methods can be used. The most common one for tracking is to count the number of positions seen by at least k cameras, with k ≥ 2:

1. For every camera i, compute the visibility v_ij ∈ {0, 1} of every point j.
2. For every point j, add up the visibility over all cameras i to obtain v_j.
3. For every point j, threshold v_j with the desired minimum number k of cameras seeing point j, and sum up the results.

A setup that achieves a higher number of traceable positions under this evaluation method has a better coverage of the specified volume of interest. If heavy occlusion problems are encountered during the tracking process, however, it is desirable to increase the number k of cameras required for a position to be classified as traceable.
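As a rough illustration of the visibility matrix V and the k-camera coverage score described in this section, the following Python sketch runs a CPU view frustum test for every camera and position and then counts the positions seen by at least k cameras. The `Camera` class, its yaw/pitch parameterization, the symmetric frustum, and the function names are assumptions made for this example; they are not taken from the paper, and occlusion by scene geometry is ignored.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    """Hypothetical camera model: pose plus a symmetric view frustum."""
    position: tuple  # (x, y, z) in world coordinates, z pointing up
    yaw: float       # heading about the z axis, in radians
    pitch: float     # elevation above the xy plane, in radians
    h_fov: float     # full horizontal opening angle, in radians
    v_fov: float     # full vertical opening angle, in radians
    near: float      # distance of the near plane
    far: float       # distance of the far plane


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sees(cam, point):
    """Return 1 if point lies inside the frustum of cam, else 0 (no occlusion test)."""
    cy, sy = math.cos(cam.yaw), math.sin(cam.yaw)
    cp, sp = math.cos(cam.pitch), math.sin(cam.pitch)
    # Orthonormal camera basis derived from yaw and pitch.
    forward = (cy * cp, sy * cp, sp)
    right = (sy, -cy, 0.0)
    up = (-cy * sp, -sy * sp, cp)
    d = tuple(p - c for p, c in zip(point, cam.position))
    depth = _dot(d, forward)
    if not cam.near <= depth <= cam.far:          # near and far planes
        return 0
    if abs(_dot(d, right)) > depth * math.tan(cam.h_fov / 2):  # left/right planes
        return 0
    if abs(_dot(d, up)) > depth * math.tan(cam.v_fov / 2):     # top/bottom planes
        return 0
    return 1


def visibility_matrix(cameras, points):
    """Step 1: n x m matrix with V[i][j] = 1 if camera i sees position j."""
    return [[sees(cam, p) for p in points] for cam in cameras]


def coverage_score(V, k=2):
    """Steps 2 and 3: number of positions seen by at least k cameras."""
    m = len(V[0]) if V else 0
    per_point = [sum(row[j] for row in V) for j in range(m)]
    return sum(1 for v_j in per_point if v_j >= k)


fov = math.radians(60)
cameras = [
    Camera((-5.0, 0.0, 0.0), 0.0, 0.0, fov, fov, 0.1, 20.0),          # looks along +x
    Camera((0.0, -5.0, 0.0), math.pi / 2, 0.0, fov, fov, 0.1, 20.0),  # looks along +y
]
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 0.0, 50.0)]
V = visibility_matrix(cameras, points)
print(V, coverage_score(V, k=2))
```

Because the score simply counts positions, listing a position twice in `points` makes it count twice, which is one way to express the increased importance of a region of interest. Raising `k` makes the score stricter, in line with the remark on heavy occlusion.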

4 CAMERA ALIGNMENT OPTIMIZATION

As the problem we are trying to solve is discrete (a camera either sees a position or it does not), we have neither a function nor a system of equations that we could try to minimize or solve. We do, however, have a score that indicates how far we are from the final goal. Our approach minimizes the difference between the maximum possible score and the current score by using a discrete, gradient-based steepest descent method, similar to a solver that minimizes the residual of a nonlinear system of equations. The gradient vector is determined by finite differences for all variable parameters.
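The following Python sketch shows one possible reading of such a finite-difference steepest descent on the residual between the maximum possible score and the current score. The central differences, the unit-length step direction, the step halving on failure, and the stopping rules are assumptions of this example, as are all function names; the paper does not specify these details. A one-dimensional toy score stands in for the visibility-based score.

```python
import math


def finite_difference_gradient(score_fn, params, h):
    """Central-difference estimate of the score gradient, one entry per parameter."""
    grad = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += h[i]
        minus[i] -= h[i]
        grad.append((score_fn(plus) - score_fn(minus)) / (2.0 * h[i]))
    return grad


def steepest_descent(score_fn, params, max_score, h,
                     step=1.0, max_iter=100, min_step=1e-3):
    """Reduce the residual max_score - score by stepping along the score gradient.

    Descending on the residual is the same as ascending on the score,
    so the update moves in the direction of +gradient(score).
    """
    params = list(params)
    best = score_fn(params)
    for _ in range(max_iter):
        if best >= max_score or step < min_step:
            break
        grad = finite_difference_gradient(score_fn, params, h)
        norm = math.sqrt(sum(g * g for g in grad))
        if norm == 0.0:
            break  # plateau: the discrete score does not change within +/- h
        candidate = [p + step * g / norm for p, g in zip(params, grad)]
        candidate_score = score_fn(candidate)
        if candidate_score > best:
            params, best = candidate, candidate_score  # accept the improvement
        else:
            step *= 0.5  # no improvement: retry with a shorter step
    return params, best


# Toy stand-in for the coverage score: a 1D "camera" centered at c
# covers the interval [c - 1, c + 1]; the score counts covered positions.
positions = [4.0, 4.5, 5.0, 5.5, 6.0]


def toy_score(params):
    c = params[0]
    return sum(1 for p in positions if c - 1.0 <= p <= c + 1.0)


result, score = steepest_descent(toy_score, [3.0], max_score=len(positions), h=[0.5])
print(result, score)
```

Since the score is piecewise constant, the finite-difference offsets `h` have to be large enough to cross at least one visibility change, otherwise the estimated gradient is zero and the loop stops on a plateau. Like any local method, this sketch returns the nearest local optimum of the initial setup.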

5 CONCLUSION AND FUTURE WORK

We developed an algorithm to increase the robustness of optical tracking systems. Based on initial values it computes an optimized camera setup automatically using mathematical methods to improve the tracking reliability. We plan to implement different optimization strategies like simulated annealing and genetic algorithms. To evaluate the benefit we will conduct case studies for different application scenarios.

REFERENCES

[1] O. Faugeras. Three-Dimensional Computer Vision. MIT Press, 1993.

[2] A.W. Fitzgibbon and A. Zisserman. Automatic 3d model acquisition and generation of new images from video sequences. In European Signal Processing Conference (EUSIPCO '98), pages 1261–1269, Rhodes, Greece, 1998.

[3] R. Kumar and A. Hanson. Robust methods for estimating pose and a sensitivity analysis. In CVGIP-Image Understanding, volume 60, pages 313–342, 1994.

[4] R. Kurazume, K. Nishino, Z. Zhang, and K. Ikeuchi. Simultaneous 2d images and 3d geometric model registration for texture mapping utilizing reflectance attribute. In Fifth Asian Conference on Computer Vision (ACCV), volume I, pages 99–106, January 2002.

[5] M. Pollefeys. 3D Modelling from Images. Tutorial notes, in conjunction with ECCV 2000, Dublin, Ireland, June 2000.

[6] Tomáš Svoboda, Daniel Martinec, and Tomáš Pajdla. A convenient multi-camera self-calibration for virtual environments. PRESENCE: Teleoperators and Virtual Environments, 14(4), August 2005.
