International Journal of Image and Graphics Vol. 4, No. 1 (2004) 1–27
© World Scientific Publishing Company

FAST AND ACCURATE STEREO VISION-BASED ESTIMATION OF 3D POSITION AND AXIAL MOTION OF ROAD OBSTACLES

GWENAËLLE TOULMINET∗, STÉPHANE MOUSSET† and ABDELAZIZ BENSRHAIR‡
Laboratoire Perception Systèmes Information, Université/INSA de Rouen, Place Emile Blondel, 76131 Mont-Saint-Aignan Cedex, France
∗[email protected]
†[email protected]
‡[email protected]

Received 11 June 2002
Revised 13 February 2003

In this article, we present a fast and accurate stereo vision-based system that detects and tracks road obstacles, as well as computes their 3D position and their axial motion. To do so, axial motion maps are constructed and the inclination angles of 3D straight segments are computed. 3D straight segments are obtained after the construction of 3D sparse maps based on dynamic programming and multi-criteria analysis. Axial motion maps are computed from a sequence of dense 3D maps without region matching.

Keywords: Fast obstacle detection and tracking; stereo vision; 3D straight segments; axial motion component.

1. Introduction

In the past few years, extensive research has been carried out in the field of driver assistance systems in order to increase road safety and comfort when driving. For instance, the comfort system ACC (Adaptive Cruise Control) and the ESP (Electronic Stability Program) already equip many recent vehicles. Prototypes of intelligent vehicles dedicated to road following1–3 have been conceived and extensively tested on highways in real conditions. Furthermore, in order to assist drivers in the urban traffic environment, some other promising work has been performed on automatic parking functionality4 and on the challenging Stop&Go system.13

One crucial task for driver assistance systems is that of obstacle detection. On this topic, a large variety of active and passive sensors has been used to perceive the environment. Active sensors such as lasers or radars are less computationally expensive for measuring quantities as compared to passive vision-based sensors. However, their low spatial resolution does not enable the detection of small obstacles. That is not the case for vision-based sensors.5

Different vision-based approaches can be used to solve the problem of obstacle detection. Bertozzi et al. concluded6 that 2D model-based approaches are a fast


solution to detect specific types of obstacles such as the other vehicles on highways.7,8 To detect any kind of obstacles (vehicles as well as road debris or animals), the computation of the optical flow field can be used.9,10 However, it fails when both the intelligent vehicle and the obstacle have small or null speed, which happens quite often in urban traffic environment. Stereo vision techniques also enable the detection of any kind of obstacles and are both adapted to the highways and the urban traffic environments. They enable the 3D reconstruction of the scene and the 3D motion computation11,12 that are necessary for the real time obstacle detection and tracking functionality of the Stop&Go system. 13 To construct 3D maps of the scene, correlation techniques,14,15 the Helmholtz shear16 and a Hopfield neural network17 can be used. Then, based on the cameras geometry and by computing the ground plane, the image features that belong to the road and those that belong to the obstacles are separated. However, 3D reconstruction can be computationally expensive. A real time solution to obstacle detection without 3D reconstruction is the Inverse Perspective Mapping18,19 method. In this article, we present a fast and accurate stereo vision-based system that detects and tracks road obstacles, and computes their 3D position and their axial motion. To do so, 3D straight segments and axial motion maps are constructed. Because road objects are rigid objects and the road environment is structured, the construction of 3D straight segments is valid. And, assuming that the road surface can be locally modeled by a flat plane, their inclination angle with the road surface is computed. 3D straight segments are obtained after the construction of accurate 3D sparse maps based on dynamic programming and multi-criteria analysis. Accurate axial motion maps are computed from a sequence of dense 3D maps without region matching. The axial motion is defined as the 3D motion component following the optical axis of the stereo vision sensor. This is of the highest importance since most 3D movements of highways or country road scenes are translations along axis which are parallel to the optical axis of the stereo vision sensor. Finally, the axial motion map and the inclination angles of the 3D straight segments are used to detect and track road obstacles whose edge details are kept. This article is organized as follows: Sections 2 and 3 detail the methods used respectively for the constructions of 3D maps and axial motion maps. The computation of the inclination angles of 3D straight segments, the detection and the tracking of road obstacles are described in Sec. 4. Some experimental results are presented in Sec. 5. Finally, a conclusion ends the article. 2. Fast and Accurate 3D Map Construction In order to construct accurate 3D maps, we propose a fast and automatic scanline to scanline stereo-vision matching algorithm based on dynamic programming and multi-criteria analysis. In the first step of the process, the points of vertical edges are detected and located in the images. Then, the edge points are matched based


Table 1. An evaluation of the performance of the matching algorithm obtained from the processing of the grey level images (a) and (b) of Fig. 3.

Results of the dynamic programming based matching
  Number of declivity associations                                               3657

Results of 3D curve construction
  Number of non-significant 3D curves                                            1611
  Number of significant 3D curves                                                 373
  Number of declivity associations belonging to significant 3D curves            1702
  Number of non-matched right declivities belonging to significant 3D curves      254

Results of the detection algorithm
  Number of certain declivity associations                                       1458
  Number of uncertain declivity associations                                      131
  Number of wrong declivity associations                                          113

Results of the correction algorithm
  Number of corrections of wrong declivity associations                            55
  Number of left declivities found for non-matched right declivities               82

Results of the advanced matching algorithm
  Number of certain declivity associations                                       1595
  Number of uncertain declivity associations                                      131
  Number of non-matched right declivities belonging to 3D curves                  230

Table 2. The processing time of the step results of Fig. 3 obtained with a Pentium III 800 MHz with Windows. The language used for implementation is C++. The grey level images have been acquired and processed at the format 720 × 284 × 8 bits.

  Extraction of edge points of both left and right images      31 ms
  Matching based on dynamic programming                         31 ms
  Construction of 3D curves                                     16 ms
  Advanced matching                                            < 1 ms
  Total computation time of 3D map construction                 78 ms

on a dynamic programming method, and the disparity information is calculated. Finally, to improve the accuracy of the disparity information, an advanced matching algorithm based on multi-criteria analysis is used. In the following sections, the various steps in the construction of 3D maps are described. Figures 3 and 4 show the step results obtained from the processing of a pair of stereoscopic images of an outside scene. From these step results, Table 1 gives an evaluation of the performance of the matching algorithm, and Table 2 gives the processing time of each step.


Fig. 1. Characteristic parameters of a declivity (grey level I versus x-coordinate): the declivity spans the pixels from x_i to x_{i+1}, with amplitude d_i and position X_i.

2.1. Segmentation

The segmentation step uses a self-adaptive and mono-dimensional operator called declivity. A declivity is defined as a set of consecutive pixels in an image line whose grey levels are a strictly monotonous function of their positions. Each declivity is characterized by its amplitude, defined by d_i = I(x_{i+1}) − I(x_i). Relevant declivities are extracted by thresholding these amplitudes. To be self-adaptive, the threshold value is defined by20

d_t = 5.6\,\sigma,   (1)

where σ is the standard deviation of a white noise component, which is assumed to be Gaussian and is computed from the histogram of the grey-level variations of the pixels in an image line. The coefficient value is fixed in order to reject 99.5% of the increments due to noise.

In order to obtain a good disparity map accuracy, an accurate localization of the relevant declivities is essential. The position of a declivity is calculated as the mean position of the declivity points weighted by the squared gradients:

X_i = \frac{\sum_{x=x_i}^{x_{i+1}-1} [I(x+1) − I(x)]^2 \,(x + 0.5)}{\sum_{x=x_i}^{x_{i+1}-1} [I(x+1) − I(x)]^2},   (2)

where X_i is the position of the declivity on an image line as shown in Fig. 1. It is computed with a precision of one pixel.

Let {R(i, l)} and {L(j, l)} be two sets of declivities ordered according to their coordinates in an arbitrary right epipolar line l and left epipolar line l. Each declivity is described by the following attributes (see Fig. 2):

• its x-coordinate in the image line (X_i) as defined above; in Fig. 2, X_{Ri} and X_{Lj} are the x-coordinates of declivities R(i, l) and L(j, l) respectively,
• the grey levels of its three left neighboring pixels; I_R(x_i − k) and I_L(x_j − k), k = 0, 1, 2, are those of declivities R(i, l) and L(j, l) respectively,
• the grey levels of its three right neighboring pixels; I_R(x_{i+1} + k) and I_L(x_{j+1} + k), k = 0, 1, 2, are those of declivities R(i, l) and L(j, l) respectively.
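To make the declivity operator concrete, the following C++ sketch extracts the relevant declivities of one image line, thresholds their amplitude according to Eq. (1) and computes their weighted position according to Eq. (2). It is only an illustrative reading of this section, not the original implementation; in particular, the noise standard deviation σ is assumed to be estimated beforehand and passed in by the caller.

```cpp
// Illustrative sketch of the declivity operator of Sec. 2.1 (not the original code).
// A declivity is a run of consecutive pixels whose grey levels are strictly monotonous.
#include <cmath>
#include <cstddef>
#include <vector>

struct Declivity {
    std::size_t xStart;   // x_i      : first pixel of the monotonous run
    std::size_t xEnd;     // x_{i+1}  : last pixel of the run
    double amplitude;     // d_i = I(x_{i+1}) - I(x_i)              (Eq. 1)
    double position;      // X_i : gradient-squared weighted mean   (Eq. 2)
};

// Extracts the relevant declivities of one image line.  'sigma' is the standard
// deviation of the grey-level noise, assumed to be estimated beforehand from the
// histogram of grey-level variations of the line.
std::vector<Declivity> extractDeclivities(const std::vector<double>& line, double sigma)
{
    const double dt = 5.6 * sigma;                  // self-adaptive threshold (Eq. 1)
    std::vector<Declivity> out;
    std::size_t i = 0;
    while (i + 1 < line.size()) {
        const double firstStep = line[i + 1] - line[i];
        if (firstStep == 0.0) { ++i; continue; }
        const int sign = firstStep > 0.0 ? 1 : -1;
        std::size_t j = i + 1;                       // extend the strictly monotonous run
        while (j + 1 < line.size() && sign * (line[j + 1] - line[j]) > 0.0) ++j;

        Declivity d{i, j, line[j] - line[i], 0.0};
        if (std::abs(d.amplitude) >= dt) {           // keep relevant declivities only
            double num = 0.0, den = 0.0;             // weighted position (Eq. 2)
            for (std::size_t x = i; x < j; ++x) {
                const double g2 = (line[x + 1] - line[x]) * (line[x + 1] - line[x]);
                num += g2 * (x + 0.5);
                den += g2;
            }
            d.position = num / den;
            out.push_back(d);
        }
        i = j;                                       // the next run starts where this one ends
    }
    return out;
}
```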



Fig. 2. A declivity pair with its attributes. (a) Declivity L(j, l) on the left epipolar line. (b) Declivity R(i, l) on the right epipolar line.

To estimate the photometric similarity between two declivities to be matched, we consider the left and right photometric distances21 defined by

lphdist = \sum_{k=0}^{2} |I_R(x_i − k) − I_L(x_j − k)|,   (3)

rphdist = \sum_{k=0}^{2} |I_R(x_{i+1} + k) − I_L(x_{j+1} + k)|.   (4)
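As a small illustration, the photometric distances of Eqs. (3) and (4) can be computed directly from the declivity attributes listed above. The sketch below is ours; it simply assumes that each declivity stores the grey levels of its three left and three right neighboring pixels.

```cpp
// Illustrative computation of the photometric distances of Eqs. (3) and (4).
// Each declivity is assumed to carry the grey levels of its three left
// neighbours I(x_i - k) and three right neighbours I(x_{i+1} + k), k = 0, 1, 2.
#include <array>
#include <cmath>

struct DeclivityAttributes {
    double x;                          // X_i, position of the declivity on the epipolar line
    std::array<double, 3> leftGrey;    // I(x_i - k),      k = 0, 1, 2
    std::array<double, 3> rightGrey;   // I(x_{i+1} + k),  k = 0, 1, 2
};

// lphdist (Eq. 3): sum of absolute grey-level differences on the left side.
double leftPhotometricDistance(const DeclivityAttributes& r, const DeclivityAttributes& l)
{
    double s = 0.0;
    for (int k = 0; k < 3; ++k) s += std::abs(r.leftGrey[k] - l.leftGrey[k]);
    return s;
}

// rphdist (Eq. 4): sum of absolute grey-level differences on the right side.
double rightPhotometricDistance(const DeclivityAttributes& r, const DeclivityAttributes& l)
{
    double s = 0.0;
    for (int k = 0; k < 3; ++k) s += std::abs(r.rightGrey[k] - l.rightGrey[k]);
    return s;
}
```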

2.2. The matching based on dynamic programming

2.2.1. The matching problem

Our matching algorithm based on dynamic programming aims to associate each relevant declivity in the line l of the right image with a relevant declivity in the epipolar line l of the left image. The matching problem can then be summarized as finding an optimal path on a two-dimensional graph whose vertical and horizontal axes respectively represent the declivities of a left line and the declivities of the stereo-corresponding right line. Axes intersections are nodes that represent hypothetical declivity associations. Optimal matches are obtained by the selection of the path which corresponds to a maximum value of a global gain.

Generally, classic methods tend to minimize a cost function. The main difficulty with this approach is that the cost value can increase indefinitely, which affects the computation time of the algorithm. To avoid this, on the one hand, we use a non-linear gain function which varies between 0 and a maximum value equal to 3 × gmax. gmax is defined as

gmax = 0.5 × (dtR + dtL),   (5)

where dtR and dtL are respectively the self-adaptive threshold values for the detection of relevant declivities on the right and left corresponding scanlines. On the other hand, this gain function is calculated as

gain = f(gmax) − lphdist − rphdist,   (6)

where f(gmax) is calculated as follows:



Fig. 3. Step results of the matching algorithm: (a) the left and (b) the right images. (c) the right and (d) the left edge points. (e) the 3D sparse map based on dynamic programming and coded with grey level values: the higher the grey level, the higher the disparity value. (f) the dense 3D map of (e). (g) the significant 3D curves. (h) the dense advanced 3D map obtained with significant 3D curves whose length l is l ≥ 4.



Fig. 4. Step results of the advanced matching algorithm on a 3D curve: (a) the projection of the 3D curve superimposed on the right image. (b) the results of the detection algorithm: the black left declivities belong to wrong or uncertain declivity associations, the white left declivities belong to certain declivity associations. (c) the representative projection of the 3D curve after the correction algorithm superimposed on the left image.

Case 1. If lphdist < gmax and rphdist < gmax, then f(gmax) is initialized to 3 × gmax.
Case 2. If only one of lphdist < gmax and rphdist < gmax holds, then f(gmax) is initialized to gmax.
Case 3. If lphdist > gmax and rphdist > gmax, then the corresponding hypothesis is rejected.

The major advantage of this non-linear gain is that it enables the self-adaptivity, robustness and rapidity of the matching algorithm.

2.2.2. The algorithm details

Our matching algorithm consists of three steps.

Step 1: In the first step, we construct all possible declivity associations (R(i, l); L(j, l)) taking into consideration a geometric constraint. The position X_{Ri}


of R(i, l) on the line l of the right image and the position X_{Lj} of L(j, l) on the line l of the left image must satisfy the geometric constraint 0 < X_{Ri} − X_{Lj} < dispmax, where dispmax is the maximum possible disparity value. dispmax depends on the length of the baseline and the focal length of the cameras.

Step 2: In this second step, nodes corresponding to the hypothetical declivity associations are positioned on the 2D graph. Each node in the graph is associated to a local gain calculated by Eq. (6) that represents the quality of the hypothetical declivity association. As a result, we obtain several paths from an initial node to a final node in the graph. During the construction of the graph, we take into account the non-reversal constraint in declivity correspondence, as well as the continuity of the disparity constraint. The gain of the path, i.e. the global gain, is the sum of the gains of its primitive paths. This gain is defined as follows. Let G(k, l) be the maximum gain of the partial path from an initial node to node (k, l), and let g(k, l, i, j) be the gain corresponding to the primitive path from node (k, l) to node (i, j), which in fact only depends on node (i, j). Finally, G(i, j) is computed as follows:

G(i, j) = \max_{(k,l)} [G(k, l) + g(k, l, i, j)].   (7)

Step 3: In the final step, the optimal path in the graph is selected. It corresponds to the maximum value of the global gain. The best declivity associations are the nodes of the optimal path taking the uniqueness constraint into account. 2.2.3. The result of the matching For each line l of the epipolar lines of the right and left images, and for each declivity association (R(i, l); L(j, l)) of the optimal path of line l, a disparity value δ(i, j, l) is computed. It is equal to XLj −XRi , where XRi and XLj are the respective positions in the l right and l left epipolar lines of R(i, l) and L(j, l). Then, the result of the matching algorithm based on dynamic programming is a 3D sparse map. To obtain a dense 3D map, a simple interpolation is done so that the disparity value of the points between two declivities is equal to the lower value of disparity of these two declivities. The matching algorithm based on dynamic programming is efficient in matching declivities and provides accurate 3D maps.21 However, in order to avoid false alarms when detecting obstacles, the accuracy of 3D maps needs to be improved (see Fig. 3). Error in disparity measurement is usually modeled as consisting of both a stochastic component and an outlier component.25 The first one comes from the finite pixel size, quantization effects and random noise in the intensity values. The second comes from wrong matching. Whereas the first component is usually in the order of one pixel or less, the second can be many pixels in magnitude. The advanced algorithm aims to eliminate the second component. It detects and corrects wrong declivity associations. It also matches right declivities that have not


been matched. For this, it uses criteria related to road environment and is applied on significant 3D curves that are built beforehand. 2.3. Construction of 3D edge curves An actual 3D edge curve can be constructed using its projections in the right and left images. Then, the 3D edge curves are built based on the result of the segmentation on the right image and the result of the matching algorithm based on dynamic programming. The construction of 3D curves starts with the construction of their projections in the right image. 2.3.1. Characteristics of the construction of 2D right curves By means of line by line processing, 2D right curves are made based on right declivities so that: • a right declivity belongs to one and only one 2D right curve • a 2D right curve starting at line ls and ending at line le (le ≥ ls ), has one and only one point from line ls to line le 2.3.2. The algorithm of 2D right curve construction On the first line of the right image, each declivity that has a stereo-correspondent generates a 2D right curve. For each declivity of other lines of the image, the following steps are performed. Step 1: Let R(i, l) be a right declivity whose coordinates in the right image are (XRi , l). The set S is constructed with declivities whose coordinates in the right image are (XRi + q, l − p), with q ∈ {−2, −1, 0, 1, 2} and p ∈ {1, 2}. Step 2: A priority level is computed for each set {R(i, l), R(iq , l − p)} with R(iq , l − p) ∈ S. The priority level evaluates the extension by R(i, l) of the curve to which R(iq , l − p) belongs. For this computation • we use the coordinates of R(i, l) and R(iq , l − p) in the right image. And if the stereo-correspondents of R(i, l) and R(iq , l − p) both exist, then we use their coordinates in the left image • we take into account the characteristics of the 2D right curve construction Step 3: The highest priority level of curve extension is considered. Let ER be the curve that must be extended. Step 4: If a highest priority level of extension of a curve has been computed, then R(i, l) and eventually a point at line (l − 1) extend ER . Otherwise, if R(i, l) has a stereo-correspondent, it generates a new 2D right curve.


Step 5: Due to the extension of ER by R(i, l) and eventually a point at line (l − 1), the algorithm of 2D right curve construction is applied on points that may have been taken off ER or other 2D right curves. The algorithm of the 2D right curve construction provides the projections in the right images of actual 3D curves. The result of the matching algorithm based on dynamic programming provides the estimations of their projections in the left image. Each 2D right curve and its associated 2D left curve define a 3D curve. A 3D curve is significant if its projection in the right image contains at least three right declivities. By their construction, the significant 3D curve may be incomplete and/or has partially false portions of edges of road objects. 2.4. The advanced matching algorithm In order to improve the accuracy of 3D maps, significant right and left declivities that construct significant 3D curves are selected, and non significant 3D curves are eliminated. This operation eliminates most of the wrong matching. Then, from the declivities of significant curves, the advanced matching algorithm aims to match significant right declivities that have not been matched, to detect and to correct wrong significant declivity associations. To detect the wrong declivity associations, we suppose that most of declivity associations are correct. And, as the edges of road objects are continuous 3D curves, if the coordinates of a 3D point belonging to a 3D curve does not validate a continuity criteria, then this 3D point is the result of a wrong edge points association. In order to match the right edge point that has not been matched or that has been wrongly matched, a left edge point validating a continuity criteria is searched for each of these right edge points (Fig. 4). 2.4.1. Detection of wrong declivity associations Let ER and EL be respectively the projection in the right image and the estimation of the projection in the left image of an actual significant 3D curve. At the beginning of the detection algorithm, we supposed that all declivity associations of all 3D curves are certain. Then, the detection algorithm is applied on each 2D curve E L , and is divided into two steps. Step 1: The first step aims to detect uncertain declivity associations. It is applied on each declivity L(j, l) of EL that is not the first point of EL . Let (XLj , l) be the coordinates in the left image of L(j, l). If there is a point L(jr , l − 1) of EL whose coordinates in the left image are (XLj + r, l − 1) with r ∈ Z − {−2, −1, 0, 1, 2}, then (R(i, l), L(j, l)) and (R(ir , l), L(jr , l)) are the uncertain edge points association. Step 2: The second step aims to detect among uncertain edge points associations those that are wrong edge points associations. It is applied on each point Lu (j, l) of EL whose association with its right stereo-correspondent Ru (i, l) is uncertain. Let (Rc (i1 , l − p1 ), Lc (j1 , l − p1 )) and (Rc (i2 , l + p2 ), Lc (j2 , l + p2 )) be two certain 3D points. Lc (j1 , l − p1 ) and Lc (j2 , l − p2 ) belong to EL and their coordinates in the


left image are (XLcj1 , l − p1 ) and (XLcj2 , l − p2 ) with p1 ∈ {1, 2}, p2 ∈ {1, 2} and (p1 +p2 ) ≤ 3. If Lc (j1 , l −p1 ) and Lc (j2 , l −p2 ) exist, a constraint is defined between Lu (j, l), Lc (j1 , l − p1 ) and Lc (j2 , l − p2 ). If a continuity criteria is validated then (Ru (i, l), Lu (j, l)) is a certain 3D point. Otherwise, (Ru (i, l), Lu (j, l)) is a wrong 3D point. If Lc (j1 , l − p1 ) or Lc (j2 , l − p2 ) does not exist, then (Ru (i, l), Lu (j, l)) is an uncertain 3D point. 2.4.2. Correction of wrong declivity associations The correction algorithm is applied on each point Rw (i, l) of Er that has not been matched or wrongly matched. Let (XRwi , l) be its coordinates in the right image. Let (Rc (i1 , l − 1), Lc (j1 , l − 1)) and (Rc (i2 , l + 1), Lc (j2 , l + 1)) be two certain 3D points. Rc (i1 , l − 1) and Rc (i2 , l + 1) belong to ER and their respective coordinates in the right image are (XRci1 , l − 1) and (XRci2 , l + 1). If (Rc (i1 , l − 1), Lc(j1 , l − 1)) and (Rc (i2 , l + 1), Lc (j2 , l + 1)) both exist, then we look for a left declivity Lc (j, l) that has not been matched or wrongly matched so that a constraint defined between Lc (j, l),Lc (j1 , l − 1) and Lc (j2 , l + 1) validates a continuity criteria. If Lc (j, l) exists, then Lc (j, l) is the correct left stereo-correspondent of Rw (i, l). For this certain declivity association (Rw (i, l), Lc (j, l)), a disparity value is computed. It is equal to XLcj − XRwi where (XLcj , l) are the coordinates of Lc (j, l) in the left image. 2.4.3. The results of the advanced matching algorithm The correction algorithm ends the advanced matching algorithm that provides declivity associations labeled as certain or uncertain. For each certain declivity association (R(i, l), L(j, l)), we consider that R(i, l) and L(j, l) are the projections of an actual 3D point respectively in the right and in the left images because the number of errors due to wrong matching has been considerably reduced. The result of the advanced matching algorithm is a 3D sparse map whose accuracy has been improved (see Table 1): In the particular case of Fig. 3, • the number of wrong declivity associations is 113 • the number of non-matched right declivities belonging to significant 3D curves is 254 • the number of uncertain declivities is 131. Note that this number does not change because uncertain declivities associations are not corrected by the correction process. The maximum number of non or wrongly matched left declivities to find is 367. The correction process finds 137 left declivities that were not or wrongly matched. These 137 corrections increase the number of correct declivities associations from 1458 to 1595. This corresponds to a 10% improvement of the 3D sparse map. This improvement seems to be low. Nevertheless, this improvement is interesting because of its computation time which is very low (see Table 2). In addition, it is of paramount importance in following sections.


Associated with the construction of 3D curves, the advanced matching algorithm also provides the right projections and parts of the left projections of the actual 3D curves in the right and left images respectively. The parts of the left projection of an actual 3D curve are defined as representative if they contain a minimum number of 3D points. They are used to construct 3D surfaces. The 3D surfaces are constructed so that their projections in the right image have the biggest possible surface areas, and they are characterized by:

• top and bottom frontiers that are two image lines,
• left and right frontiers that are parts of the projections in the right image of two different significant 3D curves,
• the points that belong both to the projection of a 3D surface and to the projection in the right image of a significant 3D curve belong to the left or the right frontier of the projection of the 3D surface.

An interpolation applied on the 3D points of the right and left frontiers of a 3D surface ends its construction. Among the constructed 3D surfaces, only those that have a minimum surface area are selected. By construction, a 3D surface is a portion of a rigid road object. The 3D surfaces are used in the construction of axial motion maps.

3. Fast Estimation of Axial Motion Component

In this section, we present a robust and fast method that computes axial motion maps. We define axial motion as the component of 3D motion along the optical axis of the stereo vision sensor. Our method estimates axial motion by low-level computation on image points from a sequence of dense disparity maps obtained with the 3D points of the advanced significant and non-significant 3D curves. Our stereo vision sensor provides dense 3D maps in real time in which obstacles are modeled by 3D surfaces. We can therefore consider that road obstacles have small displacements between two 3D maps, so that their 3D surfaces overlap in two consecutive dense 3D maps. Each pixel in the overlapping areas has a correct tracking, and axial motion is computed on these image points. For the computation, we use discrete approaches and motion-oriented methods,22 and we assume that motions are continuous and uniform translations along the optical axis of the stereo vision system. The specificity of our approach lies in two points: first, objects are not isolated before motion analysis; second, we only consider the axial motion transformation. The axial motion computation consists of two steps. In the first step, we estimate motion with a low-level computation for an image point by the detection-estimation structure presented in Fig. 5. In the second step, we use the neighborhood information of the image point for the morphology operation described in Sec. 3.5. The aim of this procedure is to compute axial motion without region matching.


Fig. 5. The detection-estimation structure (temporal observation vector, criterion and constraint modules, detection module, state of tracking, estimation module, partial axial motion map).

3.1. The detection module

The detection module is the pivot of the detection-estimation structure. It estimates the state of the tracking. This problem can be summed up by:

(i) correct tracking of an image point,
(ii) appearance of an object since the previous instant, correct tracking at instant t,
(iii) corrupted data at the previous instant, correct tracking at instant t,
(iv) state not determined: the computation of the axial motion for the image point is impossible. There are three possibilities for this state:
    • appearance of an object,
    • disappearance of an object,
    • corrupted data.
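One possible way to encode these four states is sketched below; the enumeration and the boolean inputs are ours and are only meant to illustrate the role of the detection module, not to reproduce the original implementation.

```cpp
// Illustrative encoding of the four tracking states of the detection module
// (Sec. 3.1).  The boolean inputs are our own, purely illustrative, choice.
enum class TrackingState {
    CorrectTracking,        // (i)   correct tracking of an image point
    NewObjectCorrect,       // (ii)  object appeared since the previous instant, correct tracking at t
    CorruptedThenCorrect,   // (iii) corrupted data at the previous instant, correct tracking at t
    NotDetermined           // (iv)  state not determined: appearance, disappearance or corrupted data
};

TrackingState classifyPixel(bool trackedAtPreviousInstant,
                            bool previousDataCorrupted,
                            bool correctTrackingNow)
{
    if (!correctTrackingNow)       return TrackingState::NotDetermined;        // (iv)
    if (!trackedAtPreviousInstant) return TrackingState::NewObjectCorrect;     // (ii)
    if (previousDataCorrupted)     return TrackingState::CorruptedThenCorrect; // (iii)
    return TrackingState::CorrectTracking;                                     // (i)
}
```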

For a pixel, the appearance and the disappearance of an object are an identical problem, since the disappearance of an object corresponds to the appearance of a new one. We assume that the context of the observed scene and the evolution of object motion is the road environment. Therefore the motions are seen as translations along the road, and the tracking evolution possibilities are limited. We describe this tracking evolution by the automaton of Fig. 6. The constraint and criterion modules of Fig. 5 enable the validation of the state of tracking for an image point. These modules are defined by the observed environment.

3.2. The criterion module

This module analyzes the temporal evolution of a pixel and defines the tracking problem. Two criteria can define the tracking problem: one is associated with the disparity measurement errors, the other with the errors of the dynamic model.


Fig. 6. Tracking evolution (automaton with the states: steady tracking of the same object, state not determined for the same object, and new object detected but not confirmed; the transitions depend on whether the tracking with the new measure is correct, whether the data at the previous instant were corrupted, and whether the tracking of the new object is correct).

The criterion associated with the disparity measurement is validated when the measured disparity verifies Eq. (8):

δ(p_R, t) ∈ [\hat{δ}(p_R, t) − 2, \hat{δ}(p_R, t) + 2],   (8)

where the width of the interval of Eq. (8) corresponds to the precision of the disparity measurement (two pixels); indeed, the position of a declivity is computed with a precision of 1 pixel (see Sec. 2.1). \hat{δ}(p_R, t) is the prediction of the disparity at instant t taking into account the hypothesis of a continuous and uniform translation for an axial motion V(t). It is given by:

\frac{1}{\hat{δ}(p_R, t + 1)} = \frac{1}{δ(p_R, t)} + \frac{V(t) × ∆t × p}{f × e}.   (9)

The second criterion is defined by the precision of the kinematic model of the observed scene. The displacements of the 3D surfaces are supposed to be continuous and uniform translations. There is a margin of error defined by the maximum acceleration Γ admissible for all observed points. Using Eq. (9), we can define the admissible values by:

\frac{1}{\hat{δ}(p_R, t)} ∈ [a_{min}; a_{max}],   (10)

a_{min} = \frac{1}{δ(p_R, t − 1)} + \frac{∆t × p}{f × e} × (V(t − 1) − Γ × ∆t),   (11)

a_{max} = \frac{1}{δ(p_R, t − 1)} + \frac{∆t × p}{f × e} × (V(t − 1) + Γ × ∆t).   (12)

The overall criterion is verified if at least one of the two criteria is verified.
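For one tracked pixel, the two criteria can be checked as in the following sketch of Eqs. (8)–(12). The structure and variable names (focal length f, baseline e, pixel pitch p, maximum acceleration Γ) mirror the notation of the text, but the code itself is only illustrative, not the original implementation.

```cpp
// Illustrative check of the criterion module (Sec. 3.2), Eqs. (8)-(12).
#include <cmath>

struct SensorParams {
    double f;        // focal length of the lenses (m)
    double e;        // baseline: distance between the two optical centres (m)
    double p;        // pixel pitch of the CCD (m)
    double dt;       // time between two image acquisitions (s)
    double gammaMax; // maximum admissible acceleration (m/s^2), 4 m/s^2 in Sec. 3.3
};

// Eq. (9): prediction of the disparity at the next instant for an axial motion V,
// under the hypothesis of a continuous and uniform translation.
double predictDisparity(double disparity, double V, const SensorParams& s)
{
    return 1.0 / (1.0 / disparity + (V * s.dt * s.p) / (s.f * s.e));
}

// Eq. (8): the measured disparity must lie within +/- 2 pixels of the prediction.
bool measurementCriterion(double measuredDisparity, double predictedDisparity)
{
    return measuredDisparity >= predictedDisparity - 2.0 &&
           measuredDisparity <= predictedDisparity + 2.0;
}

// Eqs. (10)-(12): the inverse of the predicted disparity must stay within the
// interval allowed by the kinematic model (speed V(t-1) +/- Gamma * dt).
bool kinematicCriterion(double previousDisparity, double previousV,
                        double predictedDisparity, const SensorParams& s)
{
    const double scale = (s.dt * s.p) / (s.f * s.e);
    const double aMin  = 1.0 / previousDisparity + scale * (previousV - s.gammaMax * s.dt); // Eq. (11)
    const double aMax  = 1.0 / previousDisparity + scale * (previousV + s.gammaMax * s.dt); // Eq. (12)
    const double inv   = 1.0 / predictedDisparity;
    return inv >= std::fmin(aMin, aMax) && inv <= std::fmax(aMin, aMax);                    // Eq. (10)
}

// The overall criterion is verified if at least one of the two criteria is verified.
bool overallCriterion(bool measurementOk, bool kinematicOk)
{
    return measurementOk || kinematicOk;
}
```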


3.3. The constraint module

This module defines the environment constraints. They are:

• the overlapping constraint,
• the maximal acceleration constraint,
• the maximal speed constraint.

The maximal acceleration constraint and the maximal speed constraint are fixed by the road environment. We define the maximum axial acceleration as the maximal acceleration of a standard vehicle, that is to say 4 m·s−2, and the maximal speed as 50 m·s−1 (180 km/h), which corresponds to two cars passing each other.

3.4. The estimation module

The estimation module estimates the axial motion component of each image point that has a correct tracking at instant t. It provides the partial axial motion map.

3.4.1. Axial motion for an image point

Consider a 3D point P at distances Z_P(t) and Z_P(t + ∆t) at times t and (t + ∆t) respectively. The axial motion component is given by a first-order Taylor expansion of the depth value:

Z_P(t + ∆t) = Z_P(t) + \dot{Z}_P(t) × ∆t + o(∆t),   (13)

where ∆t → 0. Assuming the axial velocity is constant during ∆t and using Eq. (13), it can be expressed as:

\dot{Z}_P(t) = \frac{f × e}{p × ∆t} \left( \frac{1}{δ(p_R, t + ∆t)} − \frac{1}{δ(p_R, t)} \right),   (14)

where p_R is the projection of the 3D point P in the right image and δ is the discrete disparity function. However, the disparity is an integer, and for distant 3D points the disparity values are very low and may not change between two image acquisitions. To face this problem we compute the axial motion on a variable interval equal to n × ∆t, corresponding to the time during which the disparity does not change:

\dot{Z}_P(t) = \frac{f × e}{p × n × ∆t} \left( \frac{1}{δ(p_R, t + n × ∆t)} − \frac{1}{δ(p_R, t)} \right).   (15)

3.4.2. Axial motion estimation for a long sequence of stereo vision images

To compute the axial motion of a pixel in a long sequence of stereo vision images, we use a simple iterative algorithm. With this algorithm, the values of the axial motion estimates change quickly when the actual motions change. For the algorithm, we define the following variables:


• \hat{V}_Z(t_i, t_i): estimate of the axial motion at instant t_i.
• \hat{V}_Z(t_{i+1}, t_i): prediction of the axial motion estimate at instant t_{i+1} with the data of instant t_i. The instant t_i is the instant of the last measurement, and the instant t_{i+1} is defined when the disparity has changed since t_i.
• \dot{Z}_P(t): instantaneous axial motion of point P at instant t.

Assuming the axial velocity of a point is constant during ∆t, we can write:

\hat{V}_Z(t_{i+1}, t_i) = \hat{V}_Z(t_i, t_i).   (16)

The axial motion estimate at instant t_{i+1} is obtained by giving an equal weight to the prediction of the axial motion estimate and to the axial motion measurement at instant t_{i+1}:

\hat{V}_Z(t_{i+1}, t_{i+1}) = \hat{V}_Z(t_{i+1}, t_i) + \frac{1}{2} \left( \dot{Z}_P(t_{i+1}) − \hat{V}_Z(t_{i+1}, t_i) \right).   (17)

3.5. Spatio-temporal tracking

In the second step of the computation of the axial motion component, we use neighborhood information. In the first step, the detection-estimation structure provides partial axial motion maps by estimating the axial motion component of the image points that belong to the overlapping areas. In order to extend the result to image points that correspond to the appearance of an object or to corrupted data at the present instant, we use a morphology operation which is a geodesic linear dilation (see Fig. 7). The criterion of extension uses the depth information of the image points and the results of the 3D surface construction.
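One simple way to realize such an extension is sketched below as a line-wise geodesic dilation: a pixel without an axial motion estimate inherits the estimate of a neighboring pixel only if their disparities are close, i.e. only inside the same 3D surface. The marker value and the 2-pixel disparity tolerance are our own assumptions; the sketch is not the authors' exact morphology operation.

```cpp
// Illustrative line-wise geodesic dilation of a partial axial motion map
// (a simplified reading of Sec. 3.5).
#include <cmath>
#include <cstddef>
#include <vector>

// Marker for pixels of the partial axial motion map that have no estimate yet
// (appearance of an object or corrupted data); this convention is ours.
constexpr double kNoEstimate = -1e9;

void dilateAxialMotionLine(const std::vector<double>& disparity,
                           std::vector<double>& axialMotion,
                           double disparityTolerance = 2.0)
{
    const std::size_t n = axialMotion.size();
    if (n < 2 || disparity.size() != n) return;

    // Forward pass: propagate an estimate to the right neighbour when the two
    // pixels have close disparities, i.e. when they are likely to belong to
    // the same 3D surface.
    for (std::size_t x = 1; x < n; ++x)
        if (axialMotion[x] == kNoEstimate && axialMotion[x - 1] != kNoEstimate &&
            std::abs(disparity[x] - disparity[x - 1]) <= disparityTolerance)
            axialMotion[x] = axialMotion[x - 1];

    // Backward pass: same propagation towards the left neighbour.
    for (std::size_t x = n - 1; x > 0; --x)
        if (axialMotion[x - 1] == kNoEstimate && axialMotion[x] != kNoEstimate &&
            std::abs(disparity[x] - disparity[x - 1]) <= disparityTolerance)
            axialMotion[x - 1] = axialMotion[x];
}
```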

Fig. 7. The geodesic linear dilation: the axial motion values of the partial axial motion map, computed on the area where the disparity maps at instants (t−1) and t overlap, are propagated to the non-overlapping areas of the object and of the background.


The extended axial motion maps are thus computed without region matching, which is computationally expensive. With our method, dense 3D maps at the format 720 × 284 × 8 bits are processed in 158 ms on a Pentium III 800 MHz with Windows.

4. Precise Depth and Axial Motion of Obstacles

In order to estimate precisely the depth and the axial motion of obstacles, the obstacles must first be detected. Because a road obstacle is a rigid object, the points of the 3D surfaces that compose it have the same axial motion and almost the same depth. In addition, the 3D surfaces of the obstacles have frontiers that may belong to the same 3D curves. Then, based on the 3D curves and the results of the geodesic linear dilation, sets of 3D surfaces are constructed. Each set of 3D surfaces is a road object. Like the 3D surfaces, the objects are characterized by top and bottom frontiers that are two image lines, and left and right frontiers that are portions or entire projections of significant 3D curves in the right image. In order to detect the road objects that are obstacles, we use an algorithm of 3D obstacle edge extraction and the results of the obstacle detection at the previous instant. The depth of an obstacle is the lowest depth value of its certain 3D points. Its axial motion is the mean value of the axial motions of its points.

4.1. Extraction of 3D obstacle edges

In this section, we present our algorithm of obstacle edge extraction, which has been installed and tested on the GOLD system.23 GOLD18 is a stereo vision software developed at the Dip. di Ingegneria dell'Informazione of the University of Parma, and which stands for Generic Obstacle and Lane Detection. Our algorithm starts with the decomposition of 3D curves into 3D straight segments by an iterative partition method. Indeed, as the road environment is structured, the representative parts of the advanced 3D curves can be approximated by means of one or several 3D straight segments. Usually a 3D curve is a single 3D segment. Then, in order to select the 3D segments that belong to the obstacles, we suppose that the road is a flat plane, and we calculate and threshold the inclination angles of the 3D segments.

To compute the inclination angle of the 3D segments, we suppose that the configuration of the stereo vision sensor is known. The configuration is such that the optical axes of the two camera-lens units are parallel, and the straight line joining the two optical centers is parallel to each image horizontal line in order to respect an epipolar constraint. We also suppose that the position of the stereo vision sensor is known and that the plane containing the two optical centers and the optical axes is parallel to the road, which is supposed to be locally a flat plane. To compute the inclination angle of a 3D segment, we use the equations of its projections in the right and left images. The equation of the right projection f_r, calculated in (R_r X_r Y_r) of Fig. 8, and of the left projection f_l, calculated in (R_l X_l Y_l), are:

f_r : x_r ↦ m_r × y + b_r,   f_l : x_l ↦ m_l × y + b_l,   (18)


Fig. 8. Inclination angle β of a 3D segment, shown with the right and left image frames (R_r X_r Y_r) and (R_l X_l Y_l), the optical centers O_r and O_l, and the sensor axes X, Y, Z.

with m_r, m_l, b_r, b_l calculated by a least squares method. Note that the advanced matching algorithm is of paramount importance for the calculation of m_r, m_l, b_r and b_l, and also for a reliable estimation of the inclination angle. The equation of the plane P_l that contains the projection of the 3D segment in the left image and the optical center O_l is calculated. We do the same with the plane P_r that contains O_r and the projection of the 3D segment in the right image. The intersection of P_l and P_r is a 3D straight line that contains the 3D segment, whose direction vector is

V = \begin{pmatrix} V_x \\ V_y \\ V_z \end{pmatrix} = \begin{pmatrix} p_x × \left( (m_r × b_l − m_l × b_r) + \frac{w}{2} × (m_l − m_r) \right) \\ p_y × \left( (b_l − b_r) + \frac{h}{2} × (m_l − m_r) \right) \\ f × (m_l − m_r) \end{pmatrix},   (19)

with p_x and p_y the width and height of a CCD pixel respectively, f the focal length of the two lenses, and w × h the resolution in pixels of the cameras. Then, the tangent of the inclination angle of the 3D segment is given by Eq. (20):

tan β = \frac{V_y}{\sqrt{V_x^2 + V_z^2}}.   (20)
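Given the line parameters (m_r, b_r) and (m_l, b_l) of the two projections and the calibration values, the inclination angle follows directly from Eqs. (19) and (20), as in the following illustrative sketch.

```cpp
// Inclination angle of a 3D segment from its two projections (Eqs. (19)-(20)).
// Illustrative sketch; the calibration values are assumed to be known.
#include <cmath>

struct Calibration {
    double px, py;   // width and height of a CCD pixel (m)
    double f;        // focal length of the lenses (m)
    double w, h;     // image resolution in pixels
};

// Projections: x_r = m_r * y + b_r in the right image, x_l = m_l * y + b_l in the left image.
double inclinationAngle(double mr, double br, double ml, double bl, const Calibration& c)
{
    // Direction vector of the 3D line supporting the segment (Eq. 19).
    const double Vx = c.px * ((mr * bl - ml * br) + (c.w / 2.0) * (ml - mr));
    const double Vy = c.py * ((bl - br) + (c.h / 2.0) * (ml - mr));
    const double Vz = c.f * (ml - mr);
    // Inclination angle beta with respect to the (locally flat) road plane (Eq. 20).
    return std::atan2(Vy, std::sqrt(Vx * Vx + Vz * Vz));
}
```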

The 3D straight segments whose inclination angle exceeds a threshold angle are identified as the edges of obstacles. The others are identified as the edges of the road (see Fig. 9).

4.2. Obstacle detection and tracking

For each object O constructed at instant t, we select the 3D segments or portions of 3D segments that belong to it. Indeed, it happens that one of the right or left frontiers of the object does not belong to it, for example when a vehicle is partially occluded by another one. If most of the 3D straight segments of O are identified as edges of obstacles by the algorithm of 3D edge extraction, then O verifies the obstacle criterion. The overlapping areas of O and of the objects constructed at instant (t−1) then enable the object tracking. From the results of the obstacle detection algorithm on O obtained at (t−1), and the results of the extraction of the edges of O obtained at instant t, there are four possibilities to conclude on O.


Fig. 9. Results of the extraction of obstacle 3D segments: the white segments are identified as the edges of obstacles, the black ones as the edges of the road.

(i) O belongs to the road with tracking confirmation,
(ii) O is an obstacle with tracking confirmation,
(iii) O is an obstacle without tracking confirmation,
(iv) O is an obstacle due to tracking.
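As a final illustration, the decision rule and the attributes of an obstacle described in this section can be summarized in a few lines; the data structure below is ours and assumes that the inclination angles, depths and axial motions of an object have already been computed.

```cpp
// Illustrative summary of the obstacle criterion and of the depth / axial
// motion attributes of a road object (Sec. 4); the data structures are ours.
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

struct RoadObject {
    std::vector<double> segmentInclinations;  // inclination angle of each 3D segment (rad)
    std::vector<double> pointDepths;          // depths of the certain 3D points (m), non-empty
    std::vector<double> pointAxialMotions;    // axial motions of the points (m/s), non-empty
};

// Obstacle criterion: most of the object's segments exceed the threshold angle.
bool isObstacle(const RoadObject& o, double thresholdAngle)
{
    std::size_t above = 0;
    for (double a : o.segmentInclinations)
        if (a > thresholdAngle) ++above;
    return 2 * above > o.segmentInclinations.size();
}

// Depth of an obstacle: the lowest depth value of its certain 3D points.
double obstacleDepth(const RoadObject& o)
{
    return *std::min_element(o.pointDepths.begin(), o.pointDepths.end());
}

// Axial motion of an obstacle: the mean value of the axial motions of its points.
double obstacleAxialMotion(const RoadObject& o)
{
    return std::accumulate(o.pointAxialMotions.begin(), o.pointAxialMotions.end(), 0.0) /
           o.pointAxialMotions.size();
}
```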

5. Experimental Results and Discussions

Our obstacle detection algorithm has been tested on many real images. In this article, an example of experimental results obtained at instant t = 5.6 s from a sequence of 60 stereo pairs of grey level images is presented in Fig. 10. For this sequence, a car is coming toward the stereo vision system, which is stationary and is placed in such a way that the hypothesis of driving on a flat plane is validated. The computation time of the entire process is 270 ms on a Pentium III 800 MHz with Windows, with the processed images at the format 720 × 284 × 8 bits.

Figure 11 is a graph where the computed axial motion of the moving car is compared to the actual one for each pair of images of the sequence. The axial motions presented in Fig. 11 correspond to the actual and estimated axial motions of one pixel of the image. The jump of the actual axial motion from 0 m/s to 8.2 m/s corresponds to the appearance of the vehicle at this pixel, and the jump from 8.2 m/s to 0 m/s corresponds to its disappearance at this pixel. For constant axial motions, the computed axial motions converge to the actual one, as expected from the process. But the system needs about 5 s to converge, which is too long. This result must however be qualified, because it was difficult to drive the car at a constant speed. Other experimental results are presented in Fig. 12 (the second grey level image has been acquired by the Istituto Elettrotecnico Nazionale, Italy). From all these experimental results, we



Fig. 10. Experimental results: (a) the right grey level image. (b) the results of the extraction of obstacle 3D segments: the white segments are identified as edges of obstacles, the black ones as edges of the road. (c) the entire 3D map. (d) the entire axial motion map. (e) the 3D position of obstacles. (f) the axial motion of obstacles.


conclude that the use of 3D surfaces has smoothed the peaks of axial motion. Nevertheless, the axial motion measurements are still noisy. Indeed, the interpolation step is simple and fast, but the constructed dense 3D maps may not be consistent with reality, and the temporal accumulation of these errors leads to noisy axial motion maps. In addition, the assumption that motions are continuous and uniform translations along the optical axis of the stereo vision system may not be valid for all road scenarios. Additional information such as lateral motion is therefore essential to reduce the noise of the axial motion estimation.

Fig. 11. Evolution of the vehicle axial motion: measured axial motion of the car compared with the real axial motion (speed in m/s versus time in s; the marker indicates t = 5.6 s).


Fig. 12. Experimental results: (a) the right grey level image. (b) the entire 3D map. (c) the entire axial motion map. (d) the results of the extraction of obstacles 3D segments: the white segments are identified as edges of obstacles, the black one as edges of road. (e) the 3D position of obstacles. (f) the axial motion of obstacles.



The extraction of 3D edges of obstacles by thresholding their inclination angle is promising, as also shown in Fig. 13. The stereo vision images have been acquired in twilight and in the daytime with our stereo vision system installed on the roof of our Laguna 2 experimental car. The sequences represent urban scenes, and for these sequences we drove at low speed. Our system features 40.5 cm between the two optical centers and lenses with a focal length of 16 mm. These experimental results show that reduced visibility conditions do not affect the extraction of 3D edges of obstacles. This is due to the self-adaptivity of the declivity operator, which makes an efficient segmentation of the grey level images possible even when the visibility conditions are reduced. With our method, the edges of a pedestrian are extracted as far as 50 m; beyond that distance, the 3D edges of pedestrians may still be extracted but the extraction is not guaranteed. This result is promising because pedestrians are difficult to detect due to their small width in the images. The edges of cars are extracted as far as 70 m; beyond that distance, the 3D edges of cars may still be extracted but the extraction is not guaranteed. From these experimental results, it should be noted that the 3D edges of obstacles may not all be extracted and there may be false extractions. This drawback usually appears on small 3D segments. Indeed, the errors in estimating the equation parameters of 3D segments in the two images are due to stochastic noise and are inversely proportional to the length of the 3D segment: the longer the 3D segment, the more accurate its inclination angle. In addition, in some cases the 3D edges of the obstacles may not all be vertical (security guardrails on highways, some edges of towing, . . .). However, when adapted to a non-flat ground plane, our method is not sensitive to an approximate modeling and detection of the ground. Classical methods


Fig. 13. Experimental results: (a) the right grey level image. (b) the 3D curves. (c) the results of the extraction of obstacles 3D segments: the white segments are identified as edges of obstacles.

that extract 3D edges of obstacles using stereo vision model the ground as a plane and extract the 3D points of obstacles by thresholding their disparity value. Consequently, classical methods are sensitive to an approximate modeling and detection of the ground, but are not sensitive to stochastic noise.25 In Ref. 23, we have proposed a cooperation of these two methods of extraction of 3D edges of obstacles; their different sensitivities, used in a complementary way, increase the reliability and the robustness of the extraction. In Ref. 23, the ground has been modeled by a flat plane. We are now working to adapt this cooperation to a non-flat ground plane.


6. Conclusion

In this article, we have presented a fast stereo vision system that detects and tracks road obstacles, and also precisely estimates their 3D position and their axial motion. It is based on the estimation of the inclination angles of 3D straight segments and on the construction of axial motion maps. A great effort has been made to obtain accurate data:

• the calibration of the stereo vision system (it has been installed and tested on the experimental autonomous vehicle ARGO24),
• the computation of the positions of the relevant declivities in the images,
• the advanced matching based on a multi-criteria analysis of 3D curves,
• the algorithm of axial motion map construction, which detects the remaining corrupted data and does not take them into account for the computation,

all contribute to accurate estimations of the 3D positions of obstacles. These efforts also contribute to improving the reliability of the detection and the tracking of obstacles when the hypothesis of driving on a flat plane is validated: pedestrians and cars are detected as far as 50 m and 70 m respectively. However, the axial motion maps are too noisy and additional information such as lateral motion is essential. One of our future research directions will aim at computing lateral motion. Other future research will aim at integrating our stereo vision system into our Laguna 2 experimental vehicle, which is already equipped with passive sensors that provide, for example, the speed of the Laguna 2, the position of the steering wheel, etc. The hypothesis of driving on a flat plane is not always validated, due to the pitch and roll movements of the intelligent vehicle and to changes in the slope of the road. Thus, our stereo vision system needs to be improved, as the computation of the ground plane is required. Finally, real driving conditions require real time computation. Even if our stereo vision system is fast, real time processing is not achieved due to the amount of data processed. It should however be noted that the algorithm has not yet been optimized and that the use of dedicated engines would considerably reduce the processing time.

Acknowledgments

A part of this work was supported by a GALILEE 2000 program. It was a French-Italian program between the INSA of Rouen and the Dip. di Ingegneria dell'Informazione of the University of Parma.

References

1. M. Maurer, R. Behringer, S. Fürst, F. Thomanek and E. Dickmanns, "A compact vision system for road vehicle guidance," International Conference on Pattern Recognition, Vienna, pp. 313–317 (1996).


2. A. Broggi, M. Bertozzi and A. Fascioli, "Argo and the millemiglia in automatico tour," IEEE Intelligent Systems 14(1), 55–64 (Jan–Feb 1999).
3. D. Pomerleau and T. Jochem, "Rapidly adapting machine vision for automated vehicle steering," IEEE Expert: Special Issue on Intelligent System and their Applications 11(2), 19–27 (1996).
4. I. Paromtchik and C. Laugier, "Motion generation and control for parking an autonomous vehicle," Proc. IEEE Int. Conf. Robotics and Automation (Minneapolis, USA, April 22–28, 1996), pp. 3117–3122.
5. T. Williamson and C. Thorpe, "A trinocular stereo system for highway obstacle detection," IEEE ICRA'99 (1999).
6. M. Bertozzi, A. Broggi and A. Fascioli, "Vision-based intelligent vehicles: State of the art and perspectives," Journal of Robotics and Autonomous Systems 32(1), 1–16 (June 2000).
7. S. Denasi and G. Quaglia, "Early obstacle detection using region segmentation and model-based edge grouping," Proc. IEEE Intelligent Vehicles Symp. (Stuttgart, Germany, October 1998), pp. 257–262.
8. M. Bertozzi, A. Broggi, A. Fascioli and S. Nichele, "Stereo vision-based vehicle detection," in IEEE Intelligent Vehicles Symposium (Detroit (MI), USA, October 2000), pp. 39–44.
9. S. M. Smith, "ASSET-2: Real-time motion segmentation and object tracking," Proceedings of the Fifth International Conference on Computer Vision (Cambridge, MA, 1995), pp. 237–244.
10. P. Batavia, D. Pomerleau and C. Thorpe, "Overtaking vehicle detection using implicit optical flow," Proc. IEEE Intelligent Transportation Systems Conference (Boston, MA, 1997), pp. 729–734.
11. R. Jain, S. L. Bartlett and N. O'Brien, "Motion stereo using ego-motion complex logarithmic mapping," IEEE PAMI 9(3), pp. 356–369 (1987).
12. L. Li and J. H. Duncan, "3D translational motion and structure from binocular images flows," IEEE PAMI 15(7), 657–667 (1993).
13. U. Franke, D. Gavrila, S. Görzig, F. Lindner, F. Patzhold and C. Wöhler, "Autonomous driving approaches downtown," IEEE Intelligent Systems 13(6), pp. 40–48 (1999).
14. S. Badal, S. Ravela, B. Draper and A. Hanson, "A practical obstacle detection and avoidance system," WACV94, pp. 97–104 (1994).
15. K.-H. Siedersberger, M. Pellkofer, M. Lützeler and E. D. Dickmanns, "Combining EMS-Vision and horopter stereo for obstacle avoidance and autonomous vehicles," ICVS'2001 (Vancouver, Canada, July 7–8, 2001).
16. Q. T. Luong, J. Weber, D. Koller and J. Malik, "An integrated stereo-based approach to automatic vehicle guidance," ICCV95, pp. 52–57 (1995).
17. Y. Ruichek and J. G. Postaire, "A new neural real-time implementation for obstacle detection using linear stereo vision," Real Time Imaging 5, 141–153 (1999).
18. M. Bertozzi and A. Broggi, "GOLD: A parallel real-time stereo vision system for generic obstacle and lane detection," IEEE Transactions on Image Processing 7(1), 62–81 (January 1998).
19. M. Bertozzi, A. Broggi and A. Fascioli, "An extension to the inverse perspective mapping to handle non-flat roads," IEEE IV'98 (Stuttgart, Germany, Oct. 1998), pp. 305–310.
20. P. Miché and R. Debrie, "Fast and self-adaptative image segmentation using extended declivity," Annals of Telecommunication 50(3–4), pp. 401–410 (1995).


21. A. Bensrhair, P. Miché and R. Debrie, "Fast and automatic stereo vision matching algorithm based on dynamic programming method," Pattern Recognition Letters 17, 457–466 (1996).
22. S. M. Haynes, "Detection of moving edges," Computer Vision Graphics and Image Processing 21 (1983).
23. A. Bensrhair, M. Bertozzi, A. Broggi, A. Fascioli, S. Mousset and G. Toulminet, "Stereo vision-based feature extraction for vehicle detection," IEEE IV'2002 (to appear, Versailles, France, June 2002).
24. A. Bensrhair, M. Bertozzi, A. Broggi, P. Miché, S. Mousset and G. Toulminet, "A cooperative approach to vision-based vehicle detection," IEEE ITSC'01 (Oakland, USA, 25–29 August 2001), pp. 207–212.
25. J. Weber and M. Atkin, "Further results on the use of binocular vision for highway driving," SPIE Conference on Intelligent Systems and Controls, SPIE Vol. 2902 (November, 1996).


Gwenaëlle Toulminet graduated with the M.Eng. degree in Electrical Engineering (1999) from the Superior School in Electrical Engineering, Rouen, France, discussing a Master's thesis on the reconstruction of the environment of a mobile robot equipped with sonars and a laser telemeter. This research work was performed when she was an exchange student researcher at the Perception and Robotics Research Group (GRPR) at the Polytechnic School of Montreal, Canada. In December 2002, she graduated with the PhD degree in Physics at the National Institute of Applied Sciences (INSA) of Rouen. Her PhD work focused on the extraction of 3D edges of obstacles using stereo vision for driving assistance. She is currently a researcher at the Laboratory of Perception, Systems, Information at the National Institute of Applied Sciences (INSA)/University of Rouen. Her research work focuses on intelligent vehicles, sensors and stereo vision analysis.

Stéphane Mousset is currently an Assistant Professor at the laboratory PSI-INSA/University of Rouen. He teaches at the Technology University Institute of Rouen. He graduated from the École Normale Supérieure of Cachan and obtained his PhD in Computer Science at the University of Rouen in 1997. His PhD studies focused on the estimation of axial motion using a stereo vision system. His research interests include stereo vision analysis, road applications, driving assistance systems and sensors.

Abdelaziz Bensrhair graduated with the Master of Science in Electrical Engineering (1989) and the PhD degree in Computer Science (1992) from the University of Rouen, France. From 1992 to 1999, he was an Assistant Professor in the Physics and Instrumentation Department, University of Rouen. He is currently a Professor in the Information Systems Architecture Department and the Head of the Vision Systems Division in the Perception Systems Information Laboratory (PSI) of the National Institute of Applied Sciences of Rouen (INSA). His research interests include vision systems, neural vision and real-time implementation on intelligent vehicles.