Semi-Automatic Linear Feature Extraction by

1 downloads 0 Views 3MB Size Report
Abstract. This paper deals with semi-automatic linear feature extrac- tion from digital images for GIS data capture, where the iden- tification task is pe$ormedĀ ...
Semi-Automatic Linear Feature Extraction by Dynamic Programming and LSB-Snakes Armin Gruen and Haihong Li

Abstract This paper deals with semi-automatic linear feature extraction from digital images for GIS data capture, where the identification task is pe$ormed manually on a single image, while a special automatic digital module performs the high precision feature tracking in two-dimensional (2-0) image space or even three-dimensional (3-0) object space. A human operator identifies the object from an on-screen display of a digital image, selects the particular class this object belongs to, and provides a very few coarsely distributed seed points. as an approximation of subseq;ently, with th;?sk seed the ~ o s i t i o nand s h a m the linear feature will be extracted automatically by either a dynamic programming approach or ~ ~ ~ S E-spline Snakes). With dynamic b y L S B - S ~[Least-Squares programming, the optimization problem is set u p as a discrete multistage decision process and is solved b y a "timedelayed" algorithm. It ensures global optimality, i s numerically stable, and allows for hard constraints to be enforced on the solution. In the least-squares approach, we combine three types of observation equations, one radiometric, formulating the matching of a generic object model with image data, and two that express the internal geometric constraints of a curve and the location of operator-given seed points. The solution is obtained b y solving a pair of independent normal equations to estimate the parameters of the spline curve. Both techniques can be used i n a monoplotting mode, which combines one image with its underlying DTM. The L S B - S ~approach ~ ~ ~ S i s also implemented in a multi-image mode, which uses multiple images simultaneously and provides for a robust and mathematically sound full 3 D approach. These techniques are not restricted to aerial images. They can be applied to satellite and close-range images as well. The issues related to the mathematical modeling of the proposed methods are discussed and experimental results are shown i n this paper too.

Introduction One of the most fascinating promises of digital photogrammetry is the highly automated acquisition and updating of spatial data from images. Remarkable progress has been made in areas involving image and template matching such as automatic interior orientation, relative orientation, tie point selection and transfer, digital terrain model (DTM) generation, and digital ortho-image generation. Although the current level of automation on most digital photogrammetric stations is still fairly low, a number of these developments are meanwhile available on some commercial systems (Gruen, 1996; Miller et al., 1996; Walker and Petrie, 1996). On the way towards automatic mapping or spatial data acquisition and update, automatic identification and localization of cartographic objects in aerial and satellite images has Institute of Geodesy and Photogrammetry, Swiss Federal Institute of Technology Zurich, ETH-Hoenggerberg, CH-8093 Zurich, Switzerland ([email protected]). PE&RS

August 1997

gained increasing attention i n recent years in digital photogrammetry and computer vision. Despite the fact that quite some achievements have been reported, the automatic extraction of man-made objects in essence is still an unresolved issue. As fully automatic methods for mapping are still far out of reach, semi-automatic methods for feature extraction that interact with a human operator are considered to be a good compromise, combining the mensuration speed and accuracy of a computer algorithm with the interpretation skills of a human operator. Although the level of automation is much higher in close-range applications, many approaches there are restricted to the measurement of point-shaped features. Automated linear feature extraction has been reported only occasionally (e.g., Gruen and Stallmann, 1991; Streilein, 1996). We have developed two semi-automated algorithms for linear feature extraction, to be used on any type of image (satellite, aerial, close-range). We have treated with these techniques images of topographical, architectural, industrial, and medical objects, and there are no restrictions concerning applications. Linear edge-type and ribbon-like features may be extracted. The techniques of choice are based cm a method of dynamic programming and on the estimation of energy-minimizing functions ("Snakes"). We have developed a new kind of Snake class, the LSB-snakes(Least Squares Bspline Snakes), which combine the powerful tools of leastsquares estimation with the determination of energy-minimizing functions. This article introduces our general semi-automatic linear feature extraction scheme. For both techniques, it summarizes the respective mathematical approaches. We focus here in particular on the extraction of roads and we emphasize the underlying generic road models. Monoplotting and multiple-image implementations will be shown. Due to space limitations not all details of the algorithms can be presented here. For more information, please refer to the relevant publications (Gruen and Li, 1995; Gruen and Li, 1996; Li, 1997).

General Semi-Automatic Feature Extraction Scheme Our semi-automatic feature extraction scheme is shown in Figure 1. In the current implementation, the main procedures of image preprocessing are Wallis filtering and road sharpening. The Wallis filter is used to enhance the images and facilitate the subsequent road extraction process by locally forcing the grey value mean and especially contrast (dynamic range) to fit certain target values (Baltsavias, 1991). This preprocessing stage is particularly important for panchromatic

Photogrammetric Engineering & Remote Sensing, Vol. 63, NO. 8, August 1997, pp. 985-995. 0099-11l2/97/6308-9&5$3.00/0

0 1997 American Society for Photogrammetry and Remote Sensing

Digital Image(s)

*

1 I

Image Preprocessing

1 Feature Identification & Classification I

@ @

6 Seed Points

3-D Object Outline

@ Input @ ~utomatic @ ~ a n u a l l ~@output Figure 1.A semi-automatic linear feature extraction scheme.

SPOT images which have a big contrast globally but most grey values stretch over a small range. We built a particular wavelet for road sharpening (Gruen and Li, 1995) based on multiresolution analysis and splinewavelet (Chui, 1992). More precisely, we used a shifted cubic B-spline as our scaling function and derived the associated wavelet with the consideration that the linear feature has to be enhanced in the wavelet domain. Using a cubic B-spline as the scaling function implies that the signal can be approximated by a cubic spline, which is a reasonable assumption. The basic wavelet and its associated filters h and g are shown in Figure 2. It is noticed that the function

---

h ( x ) has a very fast decay and g(x) a small compact support; in fact, the discrete coefficients are only for g(0) and g(+-1) non-zero. Therefore, the wavelet decomposition can be computed very efficiently by the pyramidal algorithm (Mallat, 1989). Because the Fourier transform of the basic wavelet has a zero of order 2 at OJ = 0,with the properties of Fourier transform it can be shown that our basic wavelet is the second derivative of a smoothing function (a function whose integral is equal to 1 and that converges to 0 at infinity). Therefore, the difference information obtained with such a particular wavelet is proportional to the second derivative of the smoothed signal. Thus, the zero-crossings of the wavelet transform indicate the location of a maximum gradient in the original signal and the local extrema correspond to ridges and valleys of the signal. After image preprocessing, a human operator identifies the object from an on-screen display of a digital (Wallis filtered) image, selects the particular class this object belongs to, and provides a very few coarsely distributed seed points. This is done through activation of a mouse in a convenient interactive graphics-image user interface. Subsequently, with these seed points as an approximation of the position and shape, the linear feature will be extracted automatically. Two different techniques, the dynamic programming approach (Gruen and Li, 1995) and the LSB-snakes(Gruen and Li, 1996), are used to solve this problem. These techniques can basically be used either in a monoplotting mode (combining one image with the underlying DTM) or in a multi-image mode. However, our dynamic programming approach has not been implemented in a multi-image mode.

Generic Road Model A road is a well-defined concept. The definition of the word "road," as given by a dictionary, states: "A smooth prepared track or way along which wheeled vehicles can travel." This is a more or less functional description of a road, but cannot be directly applied to identify roads in a digital image. It is necessary for road extraction that a generic road model be formed which describes the appearance of a road in the digital image and whose generation makes the task programmable. Let 5 be a curve on a digital image, corresponding to a road in the object space. We make the following assumptions: The curve [ can be represented by a vector function f (s), which maps the arc-length s to points (x, y) in the image; The curve [has continuous derivatives, and a unit vector n(s) is normal to the curve; and The preprocessed digital image is represented by a ZD function G(x, y) which is measurable and has a finite energy and continuous derivatives.

We state the properties of our generic road model, according to our knowledge about the object "road," first in words and then give the equivalent mathematical formulation.

(a)

(1) A road pixel in the image is lighter than its neighbors on

both road sides. In other words, a road on a digital image is identified by a continuous and narrow region of high intensity with regions of lower intensity on each side. This suggests that a squared sum of the grey values (or their second derivatives in the direction normal to the road) along the curve attains a maximum: i.e., (b)

(c)

Figure 2. The basic wavelet and the associated filters h and g used for road sharpening. (a) the basic wavelet, (b) the function h(x), and (c) the function g(x). The discrete coefficients are indicated by small circles.

Epl

=

JIGU (sl)lzds * Maximum.

(2) Grey values along a road usually do not change very much

within a short distance. This stems from the knowledge that the road materials do not vary much and that their spectral properties are similar within a short distance. This is equivalent to August 1997 PE&RS

E,,

=

Z , J , ~ G ( ~ ( ~-)Gm(Asi)12 ) ds =. Minimum

where As, is a shnrt segment of the curve 5, and G,,, (As,)is the average value of G on As,: that is,

Objective Function In our actual implementation, a road segment is described as y,), and a polygon with n vertices P = { p , , p,,..., p,], p, = (x,, the generic road model developed i n the last section is represented i n discrete form. Equations 1, 2 , and 4 can be discretized i n the form

(3) A road is a light linear feature. In fact, this is a generaliza-

tion of the two previous properties and, as such, is a more global formulation. This property can be considered through the appropriate evaluation of the formula E,,

=

$nld(s))lG(f(s) + d(s)n(s))l2ds * Maximum

E,

(4)

where d(s) is the distance between curve t and the linear feature near it. w(d(s))is a Gaussian weight function that decreases as the distance d(s) increases. (4) In terms of basic geometric property, a road is usually smooth and does not have small wiggles. In fact, most roads consist of straight lines connected with smooth curves, normally circular arcs. This property can be represented mathematically as Eg =

and, for the Equations 5 and 6, we get

S l f..[s) I ' dr =. Minimum

This implies that the curve 5 can be represented as a cubic spline. (5) The local curvature of a road has an upper bound, which means that the local change of direction is upward bounded. This property follows from smooth traffic flow requirement: i.e.,

= C[2 - 2

cos (ai - aitl)l/ 1

and

where a, is the direction of vector between points p,-, and p,, and I As, I is the distance between them: i.e.,

The generic road model can be formulated by the following merit function and a n inequality constraint as E

=

?[E,,(P~, pi+,)

-

BE,

(pi, pi+,)

+ vEp, (pi, pi+,)]

with TI as a given threshold. (6) The width of a road does not change significantly. This is

kept for practical reasons. We don't give a mathematical representation for this property here because it will not be used explicitly in our implementation, but it is involved in defining a linear feature in property (3). These six properties given above, three i n photometric and three i n geometric terms, form the generic road model involved i n our road extraction scheme. Although this formulation of a road model is strictly used only i n our dynamic programming approach, it also forms the basis for the observation equations of the L~B-snakes, as explained later.

Linear Feature Extraction with Dynamic Programming Dynamic programming is a technique for solving optimization problems when not all variables in the evaluation function are interrelated simultaneously (Ballard and Brown, 1982). It is a solution strategy for combined optimization problems which involve a sequential decision-making process. It is an optimization process, expressed as a recursive search (Bellman and Dreyfus, 1962). The general idea of road extraction by dynamic programming is as follows: A curve 6 is described as a polygon with n vertices. The first four properties of the generic road model developed in the last section can be made discrete and combined into a merit function with property (5) as a hard constraint; Each point or vertex moves around its initial position (xp,y:), for example, in a 5 by 5 window, forming a number of polygons. The candidate among them for which the maximum of the merit function is achieved under the constraints is considered a road; and Thus, a road extraction can be treated as an optimization problem or multistage decision process in which the coordinate candidates of the ith vertex form the choice nodes of the ith decision stage. That can be solved by the dynamic programming procedure or time-delayed algorithm mentioned before (for more details, compare Gruen and Li (1995) and Li (1997)). PE&RS

August I997

where p and y are two positive constants and T, is a threshold for direction change between two adjacent vectors. The optimization problem defined by the objective function (Equation 11) can be solved by the "time-delayed" algorithm (Gruen and Li, 1995). To reduce the computational complexity and make the algorithm more efficient, the number of vertices used to approximate a curve and the number of candidates for each vertex should b e reduced to as few as possible. Coarse to Fine One-DimensionalVertex Search A polygon with n vertices can be written as P = (p,, p,, ...,p,] where p, denotes the image coordinates of the ith veris - tex, i.e., p, = (x,, y,). The road extracted by the algorithm represented as another polygon with n vertices {p,, p,, The ith vertex p, i n the new polygon is selected from a finite set of candidates in the neighborhood of p,. One can simply take a window around the vertex p,, with the candidates inside. To get a large convergence range, a large window has to be used. However, this may greatly increase the computational complexity because it is o n the order of O(nm3)where n is the number of candidates for the evaluation. To reduce the number of candidates, two strategies can be employed:

...,

Restrict the search to one dimension. Here the candidates are selected only in the direction normal to the initial curve at point p,. In such a way, the same convergence range can be obtained with a much lower computational complexity. For instance, if the window size is 5 by 5, then the number of candidates will be 25, but it is five for the one-dimensional search. Select the candidates from a coarse to fine pyramid. Instead of using an image pyramid explicitly, we can simply select the candidates with a certain interval, e.g., every three pixels, at the beginning. This results in a large convergence range. To get high accuracy, the interval is subsequently decreased. One can even use an interval smaller than one pixel to achieve the sub-pixel accuracy, but this was not implemented in this investigation.

Pi- I

------I-------

Figure 3. Collinear condition checking and blunder detection for the internal vertices.

the polygon; and the third one, together with the constraint of the objective function, ensures that the polygon is an approximation of a smooth curve and makes the algorithm more robust to bridge small gaps and resist the influence of distortions. implementation and Experimental Results This dynamic programming approach has been implemented in a monoplotting mode. A number of experiments for road extraction from SPOT and small scale aerial images have been performed (Gruen and Li, 1995; Li, 1997). Figure 4 shows a portion of a SPOT ortho-image of Kefalonia, Greece draped over its underlying DTM. The extracted road segments are overlaid. Figure 5 shows a road network extracted from a SPOT panchromatic image of Sydney, Australia. The contrast of roads is very low in this image. The result shows that our algorithm works very well even under these unfavorable conditions.

Dynamic Vertex Insertion and Deletion To describe a curve with a polygon, the most simple and widely used strategy is to use a polygon with equidistant vertices (Fua and Leclerc, 1990; Kass et al., 1988). This strategy is not optimal because more vertices are needed and their positions are not related to the shape of the curve. In our approach, a few starting vertices are given coarsely by the operator. Connecting these seed points, an initial polygon is formed. After the first iteration of the optimization procedure by dynamic programming on this polygon, two equidistant new vertices are inserted by a linear interpolation between every two adjacent vertices whose distance is larger than a threshold. Then the second iteration is applied on this new polygon. Each new inserted vertex is checked and those points are deleted which are collinear with their neighbor points and those which cause a "blunder," e.g., in the form of a wiggle. The iteration scheme runs until convergence is reached. Using this dynamic vertex insertion and deletion control strategy, the computational complexity is reduced and at the same time the algorithm is more robust in case of small gaps and other distortions. As shown in Figure 3, each internal vertex is checked after an iteration with three conditions: i.e.,

name from the fact that they are a combination of least-squares template matching (Gruen, 1985) and B-spline Snakes (Trinder and Li, 1995). B-spline Snakes have been applied to satellite and aerial images (Trinder and Li, 1995; Li, 1997). Figure 6 shows a B-spline Snakes application to images from the area of industrial quality control. It demostrates well the good performance of Snakes for linear feature extraction. For L S B - S ~ ~we~ use ~ S three types of observations, which are also based on the generic road model described earlier. These observations can be divided into two classes: photometric observations, that formulate the grey level matching of images with the object model; and geometric observations, that express the geometric constraints and the a priori knowledge of the location and shape of the feature to be extracted.

The first condition requires that the distance between the two neighbor vertices has to be larger than a threshold; the second condition ensures that every vertex is necessary for

Photometric Observation Equations Assume the photometric model of the feature to be extracted is formulated as a discrete two-dimensional function PM (x, y). It may be formed by a real or synthetic image. Its values can

LSB-Snakes L S B - S ~ ~derive ~ ~ S their

F Figure 4. A portion of a

SPOT

image draped over its underlying DTM. The extracted road segments are overla~d.

August 111'37 PE&RS

Figure 5. Road network extraction from a portion of a Australia.

be the intensities or other quantities derived from them, for instance, the gradient of the intensities. It can also be a vector function whose components express the different aspects of the a priori photometric knowledge (for examples, intensities, gradients, moments, and other derivatives) of the feature to be extracted. Suppose an image patch is given as a discrete two dimensional function I(x, y). In terms of leastsquares adjustment, this patch can be interpreted as an observation vector of the photometric model PM (x, y). A nonlinear observation equation is established as

where e(x, y) is a true error function and T(.) is an abstract transform which represents the radiometric relationship between the function values of the photometric model and grey levels of the image patch. This transform basically consists of two parts. The first part represents the functional operations for computation of the values of the photometric model by grey levbls. This part can be determined exactly based on the knowledge of the model and (:an then be appliod to the image prior to or during thc least-scruarcs adiustrnent. 'l'he seco k d l ~ a r models t the iorrections of radioketric distortions. The investigations on issues with regard to flexibility of implementation, control of the parameters, stability of the solution, quality analysis, and computational speed led to the conclusion that the radiometric corrections should not be included in the estimation model [Baltsavias, 19911. Instead, they should be applied to the g k y levels prior to the adjustment. Assume the transformed image patch is given as G(x, y). Applying a Taylor's series to Equation 13 and dropping second- and higher-order terms, the linearized form of the observation equation becomes

The relationship between the template and the image patch needs to be determined in order to extract the feature, i.e. the corrections Ax, Ay in Equation 14 have to be estimated. In the conventional least-squares template matching applied to feature extraction, an image patch is related to a template through a geometrical transformation, formulated normally

SPOT

image of Sydney,

by a six-parameter affine transformation to model the geometric deformation (Gruen, 1985). The template is typically square or rectangular, and sizes range from 5 by 5 to 25 by 25 pixels. Originally, the LSM technique was only a local operator used for high precision point measurement. It was extended to an edge tracking technique to automatically extract edge segments (Gruen and Stallmann, 1991), and a further extension was made through the introduction of object-typedependent neighborhood constraints between individual patches to enforce the local continuity in a global solution (Gruen and Agouris, 1994). In another approach, leastsquares template matching was combined with Kalman filtering of the parameters of the road for road tracking (Vosselman and Knecht, 1995). Instead of a square or rectangular template, we extend the least-squares template matching technique into LSB-

I

1

(2a)

(2b)

Figure 6. Edge extraction by B-Snakes: CCD images of a machine part with the initial polygons given by the operator (left) and the extracted edges (right).

snakes by using a deformable contour as the template. This is shown in Figure 7. The black ribbon (left) is our initial template and the right one is the final solution. Its center line defines the location of the extracted feature, in the example of the figure, the center line of a road segment. Suppose a linear feature, the center line of the template, is approximated by a spline curve and represented in parametric form as

tnrrd positlo

final pos1tIon

template

(a)

where X,and Y, are the coefficients of the B-spline curve in the x and y directions, respectively. In terms of the B-spline concept, they form the coordinates of the control polygon of the curve. N:" (s) is the normalized mth B-spline between knots u, and u,,,,, (Bartels e t al., 1987). While the knot sequence is given, feature extraction can be treated as a problem of estimation of the coefficients X,and Y, of the spline curve. The first-order differentials of the B-spline curve can be obtained as

z n

Ax

=

, = I

N,m (s) AXi,

Substituting the terms in Equation 14, we obtain the linearized photometric observation equations with respect to the coefficients of the B-splines. The linearization of the observation equations for all involved pixels can be expressed in matrix form as

(b)

Flgure 7. Vlsualizatlon of (a) least-squares template matching of an edge and (b) an Lss-snake of a road segment, with the initial position (left) and flnal solution (right).

the seed points in x and y directions, respectively, and P, and P, are the corresponding weight matrices, introduced as diagonal matrices. The linearization of the coordinates with respect to the coefficients of the B-splines can be expressed in matrix form as -

e,,

-e,,

with N r (s) N;" (s)

...N:

(s)

I

Because a linear feature is essentially unidirectional, the template would slide along that feature during matching. To ease this problem and simplify the implementation, the above equations are changed to

A pair of independent observation equations are thus formed for the x and y directions. The observation vectors l,, and l,, contain the differences of conjugate pixels, and P,, and P,, are the corresponding weight matrices, which are introduced as diagonal matrices. Geometric Observation Equations

In our semi-automatic feature extraction scheme, a set of seed points near the feature of interest is given by a human operator or other preprocessing procedures. In terms of a least-squares adjustment, these seed points can be interpreted as the control points which determine the location of the feature to be extracted. Because they are only coarsely given, a correction has to be estimated. Therefore, they should be considered as observations. Thus, the second type of observation equations can be established as -

ecx = x - xU; -e, = y - y , ; where x, and yo are the observation vectors of coordinates of

=

NAX - t, ;

=

NAY - t, ;

in which t, and t, are given by

With the seed points an initial curve is formed as a first shape approximation of the feature. In order to stabilize the local deformation of the template, we introduce the following smoothness constraints. Assume the initial curve is expressed by 9 ( s ) and yO(s). We establish the third type of observation equations based on the first and second derivatives of the curve as

Linearizing them with respect to the coefficients of the B-spline, they can be expressed in matrix form as -e, -e,

= =

N,AX N,AY

-

t, ; t,;

p, p, , 9

where N, and N,, are the first and second derivatives of N, as defined in Equation 18, and the terms t are given by

August 1997 PE&RS

Any other a priori geometric information of the feature can be formulated in this manner. A joint system is formed by all of these observation Equations 19, 21, 25, and 26. Solution of LSB-Snakes

In our least-squares approach, linear feature extraction is treated as the problem of estimation of the unknown coefficients X and Y of the B-spline curve. This is achieved by minimizing a goal function which measures the differences between the template and the image patch and which includes the geometrical constraints. The goal function to be minimized in this approach is the L2-normof the residuals of least-squares estimation. It is equivalent to the total energy of Snakes and can be written as vTPv = (vTPsv, + v ~ ) ? ~ v+, ,vz P, vm + vTP,v,) = E, + Ex + Ec + Minimum.

(29)

In terms of Snakes, E, denotes the internal (geometric) energy of the snakes derived from smoothness constraints, Ex denotes the external (photometric) energy derived from the object model and the image data, and E, represents the control energy which constrains the distance between the solution and its initial location. To minimize this goal function (the total energy of Snakes), we have the following necessary conditions:

A further development of these formulae will result in a pair of normal equations used for the estimation of AX and AY, respectively. Because of the local support property of Bsplines, it can be shown that the normal equations are banded (bandwidth b = m + 1)and the solution can be efficiently computed. The various tools of least-squares estimation with their familiar and well established mathematical formulations can be favorably utilized for the statistical analysis of the obtained results and the realistic evaluation of the algorithmic performance. So one can evaluate the covariance matrix of the estimated parameters and derived quantities therefrom. Also, one obtains an estimate of the system noise. In addition to the traditional least-squares estimation, a robust estimation can be efficiently computed too, if required.

where N is defined in Equation 18; X, Y, and Z are the coefficient vectors of the B-spline curve in 3D object space; and X,, Y,, and Z, are the object space coordinates of the feature. Assume that patches are used from k > 1 images (Figure 8). If the image forming process followed the law of perspective projection, a pair of collinearity conditions in parametric form can be formulated for each of the image patches as

If the interior and exterior orientation parameters of each image are given or can be derived, the unknowns to be estimated in Equation 32 together with Equation 31 are the coefficient vectors of a B-spline curve. The first-order differentials can be obtained as axi ax, ax, A X , = -~NXA T X + - N A~YY + -TN A Z , ~ Z T

Substituting Equations 33 into Equation 14, the linearization of the observation equations with respect to the coefficient vectors of a 3D B-spline c w e can be obtained. For the same reasons as for 2D LSB-snakes,the equations are changed such that they can be expressed in matrix form as

where F,, F,, and Fz are partial derivatives which can be written as

Fz

=

d dx. d ayi dx g(x, y) L + - g(x, y)

A

LSB-Snakes with Multiple Images

If a feature is extracted from more than one image, its coordinates in 3D object space can be derived. There are two main ways to perform multi-image matching: Object-Space Correlation Methods (Wrobel, 1987; Ebner and Heipke, 1988; Helava, 1988). The methods relate two or more digital images to an object space model. They are definitely of theoretical interest; however, they cannot be easily applied without extension of the algorithm. to LS~-~nakes Multiphoto Geometrically Constrained Matching (MPGC) (Gruen, 1985; Gruen and Baltsavias, 1985).This method connects the photometric observation equations of every image by means of external geometrical constraints. One class of most important constraints is generated by the imaging rays intersection conditions (Gruen, 19851. Because LSB-snakes deal with a curve instead of an individual point, a direct analogy to the MPGC technique may introduce many unknowns and therefore increase the complexity of the computation. A modified version of the MPGc algorithm is therefore used and developed into 3D LSB-Snakes. Suppose a Linear feature in 3D object space can be approximated by a spline curve and represented in B-spline parametric form as PE&RS

August 1 9 9 7

az,

ay

The geometric observation equations (Equations 21, 25, and 26) can be extended into three dimensions by introducing a new component for the Z-direction. Then the 3D LSBsnakes can again be solved by a combined least-squares adjustment. That is, a 3D linear feature is extracted directly from multiple images. The statistical analysis of the obtained results and the realistic evaluation of the algorithmic performance can be done through the use of the covariance matrix of the estimated parameters. Experiments with Road Extraction

The ~ s ~ - S n a kmethod es described in this paper has been successfully implemented on SUN and SGI computer workstations. We have tested the algorithm on a number of images. In this section, some experimental results of road extraction will be given. An imaged object is defined and identified by its characteristics, which can be classified into five groups: Photometric, geometric, topological, functional, and contextual characteristics. In our semi-automatic feature extraction scheme, the high level knowledge, which requires quite

X Figure 8. Multiple images arrangement for 3~ LSBsnakes. The linear feature is represented by a 3D B-splines curve in the object space.

some intelligence for the image interpretation process, is used by the human operator to identify and classify the object. The generic object model involved in the model driven feature extraction algorithms consists of some photometric and geometric characteristics. Some of these properties are mathematically formulated and used to generate the template and define the weight functions. For instance, the grey values of the template can be derived from the images through computations of the local contrast according to the first property, while the second property suggests that the weights of the photometric observations should be related to the local changes of the grey levels along the road. In the current implementation, only one image is displayed through the user interface. The algorithm can run both in a monoplotting and a multi-image mode. In the monoplotting mode, the L S B - S ~ ~extract ~ ~ S linear features in the image space. The 3D coordinates are obtained in real-time by intersecting the imaging rays through the use of a camera model with an underlying DTM. With multiple images, some very few seed points are given by the operator i n one displayed image, and the camera model is applied to project them into object space and onto a coarse DTM. The 3D feature is then extracted automatically and precisely by the LSBsnakes algorithm. Figure 9 shows an example of 2D ~ s ~ - s n a kused e s for extraction of road segments from a portion of an aerial image of Avenches, Switzerland. The scale of the original photograph is about 1:15,000. The pixel size of the digitized image is 100 micrometres. Thus, the footprint of the image is about 1.5 metres. Roads in this image have different widths and are disturbed by buildings, trees, cars, and other objects. The results show that the algorithm of LSB-snakesworks very well even under these unfavorable conditions. Figure 10 shows one example of simultaneous 3-D extraction of a road segment from four images, which are portions of aerial images of Heinzenberg, Switzerland. The four images are from two strips with about 60 percent end and side lap. The scale of the original photographs is about 1: 15,000. The negative films were scanned on a LeicaIHelava scanner with a 10-micrometre pixel size and were later subsampled to a 40 micrometre pixel size. Thus, the footprint of the images is about 0.6 metres. Most roads in the test region belong to the third or fourth class and are about 5 pixels

wide on the images. The exterior orientation parameters are taken from the results of a block adjustment. The VirtuoZo software was used to perform the interior orientation of the digitized images and to generate a very coarse DTM (interval 30 m, about 50 pixels). The seed points (initial polygon), provided manually by the operator, are displayed in black, overlaid on the first image. The extracted road center line is shown as a black curve. Visual tests prove the successful performance of the algorithm. Figure 11 focuses on another portion from the same aerial images. At this time, the problematic areas marked with white rectangles show occlusions caused by trees in two out of four images. In terms of a least-squares adjustment, the photometric observations in the problematic areas are blunders. They have to be detected and rejected. This is achieved in our implementation by using adaptive weight functions, in which the weights of observations are related to the ratio of their residuals and the variance factor. To get a large pullin range, the algorithm starts with a flat weight function. To reduce the influence of blunders, it becomes steep after three iterations. In such a way, the weights of observations with big residuals will become smaller and smaller. The results shown in Figure 11 prove that the blunders are successfully rejected and the algorithm bridges gaps in a robust manner. For comparison the results without blunder detection are shown in Figure 11 as (c') and (d'). Because the extracted road on images (a) and (b) is the same in both cases, they are not displayed again. It is also verified by this example that more than two images are required for 3D linear feature extraction. Using only two images cannot give reliable 3D results. Reliability suffers in places where the linear feature extends in the direction of the image base.

Conclusions In this paper we have presented new methods for linear feature extraction from satellite and aerial images for GIS data capture. We also have shown a close-range application in in-

Figure 9. Road extraction in 2D mode from an aerial image of Avenches. The extracted road center lines are displayed in black, overlaid on the original image. Note that the extracted center lines represent raw data, which, in a second step, has to be refined and converted into a consistent road network structure.

August 1997 PE&RS

dustrial quality control (this, however, with the simpler Bspline version of the Snakes). The investigations are focused on semi-automatic approaches in which high level decisions, e.g., identification of the object, assignment of attributes to the measured feature, and grouping of features in higher level entities, are made by the human operator while precise measurement tasks are handled by the computer software. Such use of a human operator within a system is considered optimal because humans perform the identification task flawlessly and almost effortlessly. The precise feature extraction, which experience shows to be a time-consuming part of photogrammetric data collection, is performed fully automatically and very fast with efficient algorithms. The generic road model which consists of six photometric and geometric properties has been successfully formulated by a hard constraint and a merit function for a model-driven feature extraction algorithm based on dynamic programming. This new technique differs from the conventional use of dynamic programming for edge extraction because a model of a road segment is explicitly applied in our approach. We do not optimize every pixel on the road segment but only the parameters of the model, the vertices of the polygon in the current implementation, with a time-dePE&RS

August 1997

layed algorithm. It is very robust in the case of gaps and other distortions because of the use of global photometric information and geometric constraints. With the strategy of coarse-to-fine one-dimensional vertex search and dynamic vertex insertion and deletion, the computational complexity could be reduced successfully and even a long road segment can be handled efficiently. ~ s ~ - s n a kconsiderably es improve active contour models by using three new elements: (1)the exploitation of any a priori known geometric and photometric information to constrain the solution, (2) the simultaneous use of any number of images, and (3) the solid background of least-squares estimation. Through the connection of image and object space, assuming that the interior and exterior orientation of the sensors are known, any number of images can be simultaneously accommodated and the feature can be extracted in a 2D as well as in a fully 31, mode. Thus, blunders in image data, like occlusions, can be controlled very well. Instead of a set of points on the feature, a B-spline representation of the linear feature is estimated. The least-squares approach allows for precision and reliability assessment of the estimated 3D feature utilizing covariance matrix evaluation. This is in clear contrast to conventional Snakes, which, due to their

Figure 11.Simultaneous 3~ extraction of a road segment from four images (a), (b), (c), and (d). The white rectangle denotes the problematic area of occlusion and the extracted road segments are displayed as a black curve. (c') and (d') show the results without blunder detection.

particular theoretical background and formulation, do not provide any measures for the quantitative control of their results. At the same time, LSB-snakescan be considered as a new application and extension of the least-squares template matching (LSM) techniques. Our LSB-snakesconcept is not restricted to road extraction. Other linear features, e.g., edges, can be modeled and extracted. In fact, anything which can be geometrically modeled by B-splines can be handled. This makes it a powerful general concept for semi-automated feature extraction, not only for the processing of aerial and space images, but also for a variety of close-range (machine vision) applications. The results obtained so far are very encouraging. Further studies will make use of more extensive data sets and will focus on the quality assessment and automated performance evaluation.

References Amini, A,, T. Weymouth, and R. Jain, 1990. Using dynamic programming for solving variational problems in vision, IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(9):855867. Ballard, D.H., and C.M. Brown, 1982. Computer Vision, PrenticeHall, Englewood Cliffs, New Jersey.

Baltsavias, E.P., 1991. Multiphoto Geometrically Constrained Matching, PhD thesis, Report No. 49, Institute of Geodesy and Photogrammetry, ETH-Zurich, Switzerland. Bartels, R.H., J.C. Beatty, and B.A. Barsky, 1987. A n Introduction to Splines for Use i n Computer Graphics and Geometric Modelling, Morgan Kaufmann Publishers, Los Altos, California. Bellman, R., and S. Dreyfus, 1962. Applied Dynamic Programming, Princeton University Press, Princeton, New Jersey. Chui, C.K., 1992. A n Introduction to Wavelets, Academic Press, Boston, Massachusetts. Ebner, H., and C. Heipke, 1988. Integration of digital image matching and object surface reconstruction, International Archives of Photogrammetry and Remote Sensing, 2 7(B11):534-545. Fua, P., and Y. Leclerc, 1990. Model driven edge detection, Machine Vision and Applications, 3:45-56. Gruen, A., 1985. Adaptive least squares correlation - A powerful image matching technique, South African Journal of Photogrammetry, Remote Sensing and Cartography, 14(3):175-187. , 1996. Digital photogrammetric stations revisited, International Archives of Photogrammetry and Remote Sensing, 31(Part B2):127-134. Gruen, A., and P. Agouris, 1994. Linear feature extraction by least squares template matching constrained by internal shape forces, International Archives of Photogrammetry and Remote Sensing, 30(Part 3/1):316-323. Gruen, A., and E.P. Baltsavias, 1985. Adaptive least squares correla-

August 1997 PE&RS

I

tion with geometrical constraints, SPIE Proceedings of Computer Vision for Robots, 595:72-82. Gruen, A., and H. Li, 1995. Road extraction from aerial and satellite images by dynamic programming, ISPRS Journal of Photogrammetry and Remote Sensing, 50(4):11-20. , 1996. Linear feature extraction with LSB-Snakes from multiple images, International Archives of Photogrammetry and Remote Sensing, 31(Part B3):266-272. Gruen, A., and D. Stallmann, 1991. High accuracy edge matching with an extension of the MPGC-matching algorithm, SPIE Proceedings of Industrial Vision Metrology, 1526:42-45. Helava, U.V., 1988. Object-Space Least-Square Correlation, Photogrammetric Engineering 6 Remote Sensing, 54(6]:711-714. Kass, M., A. Witkin, and D. Terzopoulos, 1988. Snakes: Active contour models, International Journal of Computer Vision, 1(4):321331. Li, H., 1997. Semi-Automatic Road Extraction from Satellite and Aerial Images, PhD thesis, Report No. 61, Institute of Geodesy and Photogrammetry, ETH-Zurich, Switzerland. Mallat, S., 1989. A theory of multiresolution signal decomposition: The wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7):674-693. Miller, S.B., F.C. Paderes, and A.S. Walker, 1996. Automation in dig-

ital photogrammetric systems, International Archives of Photogrammetry and Remote Sensing, 3 l(Part B2):250-255. Samadani, R., 1991. Adaptive snakes: control of damping and material parameters, SPIE Proceedings of Geometric Methods i n Computer Vision, 1570:202-213. Streilein, A., 1996. Utilization of CAD models for the object oriented measurement of industrial and architectural objects, International Archives of Photogrammetry and Remote Sensing, 31(Part B5):548-553. Trinder, J., and H. Li, 1995. Semi-automatic feature extraction by snakes, Automatic Extraction of Man-Made Objects from Aerial and Space Images, Birkhaeuser Verlag, pp. 95-104. Vosselman, G., and J. de Knecht, 1995. Road tracing by profile matching and Kalman filtering, Automatic Extraction of ManMade Objects from Aerial and Space Images, Birkhaeuser Verlag, pp. 265-274. Walker, A.S., and G. Petrie, 1996. Digital photogrammetric workstations 1992-96, International Archives of Photogrammetry and Remote Sensing, 31(Part B2):384-395. Wrobel, B., 1987. Facet Stereo Vision (FAST Vision) - A new approach to computer stereo vision and to digital photogrammetry,

Proceedings of Intercommission Conference on Fast Processing of Photogrammetric Data, Interlaken, Switzerland, pp. 231-258.

O R T H C O M I N G

rn

AndrGs 1Vl. A b e y t c ~a n d Jcznet F ' r c l n k l i ~ ~Il'he , Accuracy of Vegetation S t a n d B o ~ l n d a r i e sD e r i v e d from Irnage S e g m e n t a t i o n i n a Desert E n v i r o n m e n t . Y e h u d a A f e k a n d Ariel B ~ , u r l d M , o s a i c k i n g of O r t h o r e c t i f i e d A e r i a l Irnages. Peter M . A t k i n s o n a n d Paul J . C L I ~ ~ (CIl I~Io,o s i n ga11 A p p r o p r i a t e S p a t i a l R e s o l u t i o n for R c m o t e Seilsiilg 111vestigations. Grorges B l a h u , A c c u r a c y of fllatc:s C a l i b r a t e d b y a n A u tomatic Monoconlpdralo~.

A R T I C L E S

1-

R o n g x i n g Li, M o b i l e M a p p i n g - A n E m e r g i n g T e c h n o l ogy for S p a t i a l Data A c q u i s i t i o n . D.D. L i r h t i a n d 12.I.A. C l ~ u p l n c ~Cn o, n s t r a i n e d FEM SelfCalil~ration. Johil G . L y o n , L)ing Y u a n , Ross S . Llinetta, anti Chris D. Elvidge, A C h a n g e D e t e c . i o n E x p e r i n l e n t U s i n g Vegetation I n d i c e s . Hc~ns-GerdMnrrs a n d Thomcrs Kersten. A e r o t r i a n g u l a tion a n d DEMIOrthophoto Gerleration from High Resol u t i o ~Still-Vitleo ~ Imagery.

Michel B o u l i u n n r , Clkrr~entN o l r t t e , Jec111-PnuIA g n a r d , a n d Martin Brindainour, Heriiispllerical Pllotographs U s e d for Mappirig Corlfined S p a c e s .

Fabio Maselli. Ljiljann Petkov, a n d Giarnpiero Martrcchi, Extension of Climate Paranleters Over t h e L a n d S u r v a c e b y t h e U s e of NOAA-AVFIRR a n d A n c i l l a r y Data.

H e n r y Ker-Chung C h a n g , Shing-Hurl L ~ L aI n, d C h e n g Kuan Tso, Two-Dimensional 'Template-Basecl Encoding for L i n e a r Q u a d t r e e K e p r e s e ~ i t a t i o n .

S c o t t M a s o n . H e u r i s t i c K e a s o r i n g S t r a t e g y for A u t o inateti S e n s o r P l a c e m e n t .

Gar,v R . C l a y a n d Stliart E. 1Llarsl1, S p e c t r a l A n a l v s i s for A r t i c u l a t i n g S c e n i c Color (:hangc:s i n a Coniferous Landscape.

Willinin K . M i c h e i ~ e ra n d Paula F. H O L I ~ I O LDetecI~~S, t i o n of Vegetation C h a n g e s A s s o c i a t e d w i t h E x t e n s i v e F l o o d i n g i n a F o r e s t e d Ecos!lstern.

Denis J. Dean, K e n n e t h H. LViIson, a n d Curtis H . Flather, S p a t i a l Error A n a l y s i s of S p e c i e s R i c h n e s s for a G a p Ailalysis M a p .

De111etr.i~~-Kleclnthis D. Hokos a n d Marc P. A r ~ ~ ~ s t r o n g , E x p e r i m e n t s i n t h e Identification of Terrain F e a t u r e s l l s i n g a PC-Based P a r a l l e l C ~ ~ m p u t c r .

R o l a n d J . Duhairne, Peter \/: A ~ i g l r s t ,a n d 'CVilliarn R . W r i g h t , A u t o m a t e d Vegetation M a p p i n g l l s i n g Digital O r t h o p h o t o g r a p h y .

F a j ~ e zS . S h a h i n , Efficient H a n d l i n g of Large Digital Irilages i n G e o g r a p h i c I n f o r r r ~ a t i o ~Sly s t e m s . Paul S u t t o n , 1101.Roberts, Chris E l v i d w , a n d H e n k Meij, A C o m p a r i s o n o f N i g h t t i m e ~ a t e y l i t eInlagerv a n d l'opulation Density for t h e (:ontinental U n i t e d States.

C l y d e C. Goad a n d Ming T7ang, A N e w A p p r o a c h t o Prec i s i o n A i r b o r n e G P S P o s i t i o n i n g for P h o t o g r a m m e try. Mark E. J a k u b a u s k a s a n d K e v i n P. P r i c e , Eillpirical R e l a t i o n s h i p s b e t w e e n S t r ~ ~ c t u r aa nl d S p e c t r a l Fact o r s of Y e l l o w s t o n e L o d g e p o l e Pine F o r e s t s .

J.E. ~ ~ o g e l i n a n n7'.. S o h l , a n d S . M . Hoctrard, Regional Characterization of Land Cove. Using Multiple S o ~ l r c e s of Data.

K a z ~ r oK o b n y a s h i a n d C h u j i Mori, R e l a t i o n s b e t w e e n t h e Coefficients i n P r o j e c t i v e T r a n s f o r n i a t i o n E q u a tions a n d the Orientation Elements o f a Photograph.

'I'inlothy A . I/Varner a n d iZlichar1 S h a n k , A n E v a l u a t i o n of t h e P o t e ~ l t i a lfor F u z z y (:lassification of M u l t i s p e c t r a l Data IJsing A r t i f i ~ i a lN e u r a l N e t w o r k s .

Jon Leachtencr~~er, Kenneth Daniel, a n d Tl~orncrsCrogl.Digitizing Satellite Imagery: Quality a n d Cost Considerations.

Hrrang I'ouccri a n d L ~ LIVenhao, I Builtling t h e E s t i m a t i o n M o d e l of Digitizing Error.

PE&RS

August 10517

995