To be presented at the Simulation of Adaptive Behavior Conference 1998

Look Before You Leap: Peering Behavior for Depth Perception

M. Anthony Lewis and Mark E. Nelson
Beckman Institute, 405 N. Mathews Avenue, University of Illinois, Urbana, Illinois 61801
[email protected] [email protected]

Abstract

When presented with a water or air gap barrier, animals often engage in peering, or side-to-side head movements, before leaping across the barrier. This strategy is used instead of depth recovery using stereopsis, and likely gives a much better estimate of distance. In this article we present a neurocomputational model of peering, hosted on a small robot, that explains the essential characteristics of peering reported in the literature. The model builds on recent evidence for non-direction-selective movement detectors in insects. Through a non-linear transformation of the retinal image, the model produces a 'leap' command without intermediate reconstruction of the external space of the animal.

1. Introduction

About 40 years ago, Wallace (1959) described an investigation into a depth perception mechanism in locusts called peering. During peering, the animal's body moves in a radial arc centered at the posterior tip of the animal's abdomen. The head translates and, as was shown by Collett (1978), remains rotationally stable during translation (see Fig. 1). These peering movements precede directed leaps by the insect (but apparently do not occur before non-directed escape responses). Wallace hypothesized that the animal is using motion parallax to judge distance to an object. Wallace reports that Exner (1891) speculated that the articulate eyes of crabs may use the rate of movement of an image as a cue to the depth of an object. Evidence supporting this idea was presented by Wallace and later, in a similar experiment with computer-controlled equipment, by Sobel (1990). In those experiments, a target was made to move either with or against the direction of the peering animal's head movement. When the target was moved in opposition to the animal's head movement, the animal underestimated the distance to the object (as assessed by its leap velocity or leap distance). When the target was moved with the head movement, the distance to the target was overestimated. Sobel also tried moving the target in the same direction as the head movement of the animal but faster. When he did this, there was a very surprising result: the estimated distance to the object decreased. This is a curious reversal phenomenon.

Sobel's explanation for how the animal computes distance is based on the static geometry of the problem. After a peer, the animal uses two angular measurements made at the extents of the peer to triangulate the object's position. A neurally plausible mechanism, however, must be able to explain the computation of depth at different speeds of peering, since the speed of peering varies and is not constant from peer to peer. It must also explain the reversal effect, and it should rely on known characteristics of image speed transduction in insects.

In this article we present a biologically plausible model for peering. The model is consistent with experiments reported in the literature on peering, as well as with new behavioral and anatomical evidence for movement detection pathways in insects. Furthermore, the model does not involve an intermediate representation of external space, nor does it require any memory process on the part of the animal. The processed image can be mapped directly to motor behavior without intermediate representation. In the following section we present the experimental evidence for this hypothesis. In section 3 we describe a model of this behavior. In section 4 we describe a robotic implementation of this behavior. This is followed by experimental results.

2. Experimental Evidence

2.1 Peering Behavior

In Sobel (1990), locusts were placed on a platform in the middle of a pie tin filled with water. The visual environment was plain with the exception of a single target, a vertical black stripe against a white background. The head of the locust was monitored and the stripe could be moved under computer control, tracking the movement of the insect's head. If the target was moved in the opposite direction of the head movement, the peering effect was exaggerated and the target gave the illusion of being closer. In fact, the locusts sometimes reached with their forelimbs to touch the seemingly proximal target. If the target was moved with the direction of the head movement (but slower), the target appeared farther away. If the target moved faster than the head movement, a curious reversal illusion was observed: it was inferred from the behavior of the animal that it perceived the target to be closer as speed increased. In monocular locusts, the estimated distance was doubled. This result would be expected if the animal simply summed the computations from the two sides of the body. This result is not seen in similar experiments in the mantis (Walcher and Kral, 1994).

Sobel suggests that the principle of distance estimation in locusts is based on measurements of angle at the extents of the peers. By triangulation, the distance to the object can be estimated. However, the suggestion of Exner (1891) has merit, and can be applied to this case (Exner was discussing the principle of distance estimation in crustaceans, not insects). In fact, we can place his original suggestion in the context of recent evidence on movement detection systems in insects.

Figure 1. Peering behavior in a locust. The head of the locust is held rotationally stable while it translates perpendicular to the gaze direction. Retinal speed of the object, normalized by the head translational speed, can predict the distance to the object.

2.2 The Non-Direction-Selective Movement Detectors and Motion Parallax

Recently, evidence for non-direction-selective movement detector (NDSMD) pathways in insects, first implied by behavioral evidence in bees (Srinivasan and Zhang, 1993), has been supported by anatomical evidence in flies (Douglass and Strausfeld, 1996). The idea is that there are at least two movement detection pathways. One pathway is direction selective and supports behaviors such as the optomotor response (turning the head or body to track a rotating environment). A second pathway supports obstacle avoidance and the so-called centering response, where an insect flying down a corridor with textured walls will center itself. The non-direction-selective pathway has a higher frequency response than the direction-selective pathway. Because of this difference in frequency response, behavioral experiments designed by Srinivasan and Zhang (1993) were able to show that the two pathways are distinct. This was followed by anatomical evidence provided by Douglass and Strausfeld (1996).

3. Neurocomputational Model

A model of obstacle avoidance using non-direction-selective movement detectors is presented in Lewis (1998). In that work, the NDSMD is modeled as a filtered temporal derivative of the retinal brightness, and an argument is made that simple temporal differentiation of brightness will give a reasonable estimate of image motion given sufficiently rich texture in the observed environment. Another approach is to use a Reichardt-type detector (Borst and Egelhaaf, 1989) modified to have a symmetrical response. Figure 2 shows an array of these detectors. The layer 'T' is an array of photodetectors. A delayed and shifted version of the retinal image is correlated with the current image, resulting in layer 'S'. Layer 'S' is an estimate of the speed of an object's image on the retina. Because the circuit is symmetric, a given cell in layer 'S' responds equally to leftward or rightward image movement. Once we have this detector, we can estimate the distance to an object. This can be shown in the following way.
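To make the symmetric correlator concrete, here is a minimal sketch in Python/NumPy. The one-pixel shift, the product nonlinearity, and the summing of two mirror-image half-detectors are our illustrative assumptions; the text specifies only that a delayed, shifted image is correlated with the current one and that the response is symmetric.

```python
import numpy as np

def layer_s(prev_frame, curr_frame, shift=1):
    """Sketch of a symmetric (non-direction-selective) Reichardt-type
    detector array. A delayed copy of the retinal image, shifted one
    pixel left and one pixel right, is correlated (multiplied) with the
    current image; summing the two mirror-image half-detectors makes
    the response depend on image speed but not on its direction."""
    left = np.roll(prev_frame, -shift, axis=1) * curr_frame
    right = np.roll(prev_frame, shift, axis=1) * curr_frame
    return left + right
```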

3.1 Depth From Parallax

Let us assume a perspective projection:

\[ \begin{bmatrix} x \\ y \end{bmatrix} = \frac{\lambda}{Z} \begin{bmatrix} X \\ Y \end{bmatrix} \tag{1} \]

where \(\lambda\) is the focal length of the lens, \(X, Y, Z\) is the position of a point in the environment, and \(x, y\) is the position of the projection of that point in retinal coordinates. The velocity of the image of a moving point in the world can be found by differentiating (1) with respect to time:

\[ \begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix} = \frac{\lambda}{Z^2} \begin{bmatrix} Z\dot{X} - X\dot{Z} \\ Z\dot{Y} - Y\dot{Z} \end{bmatrix} \tag{2} \]

If we assume that objects in the environment are fixed in relation to one another and that the observer is moving with relative translational velocity \(V_e = [V_x\ V_y\ V_z]^T\) and relative rotational velocity \(\Omega_e = [\omega_x\ \omega_y\ \omega_z]^T\) with respect to the environment, given in the observer's frame, a point in the environment has relative velocity:

\[ \dot{P} = [\dot{X}\ \dot{Y}\ \dot{Z}]^T = -(\Omega_e \times P + V_e) \tag{3} \]
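As a numeric cross-check of (1) and (3), a direct transcription in Python/NumPy (the choice of NumPy, and of \(\lambda = 1\) as a default, are ours):

```python
import numpy as np

def project(P, lam=1.0):
    """Eq. (1): perspective projection of a world point P = (X, Y, Z)."""
    X, Y, Z = P
    return np.array([lam * X / Z, lam * Y / Z])

def relative_velocity(P, V_e, Omega_e):
    """Eq. (3): velocity of point P relative to an observer that
    translates with V_e and rotates with Omega_e (observer's frame)."""
    return -(np.cross(Omega_e, P) + np.asarray(V_e))
```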

Figure 2. Neural network for jump parameter computation from the retinal stimulus. (Layers: 'T', photodetectors; 'S', retinal speed; 'S*', speed normalized by the peering speed; 'N', thresholded count; output, the jump parameter. Excitatory and inhibitory connections are indicated.)

Figure 3. Experimental setup. (A) A Khepera robot with a miniature camera mounted on top. The camera looks sideways, perpendicular to the direction of movement. (B) An example of processed data from layer 'S'. Here the robot peers at a soda can.

Now substituting (3) into (2), we have:

\[ \begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix} = \frac{1}{Z} \begin{bmatrix} -\lambda & 0 & x \\ 0 & -\lambda & y \end{bmatrix} V_e + \frac{1}{\lambda} \begin{bmatrix} xy & -(\lambda^2 + x^2) & \lambda y \\ \lambda^2 + y^2 & -xy & -\lambda x \end{bmatrix} \Omega_e \tag{4} \]

This describes the movement of the image point projected onto the retina, the motion field. If a point is on a richly textured surface, the motion field and the optic flow field will be well approximated by (4). We assume that our transducer gives the speed but not the direction of the flow field. Taking the absolute value of (4), we have an equation for the computation performed by layer 'S':

\[ s^2 = \frac{\lambda^2 V_z^2}{Z^2} \left[ \left( -\alpha + x - \frac{Z}{V_z}\bigl( xy\,\omega_x + (x^2 + 1)\,\omega_y + y\,\omega_z \bigr) \right)^2 + \left( -\beta + y + \frac{Z}{V_z}\bigl( -(y^2 + 1)\,\omega_x + xy\,\omega_y + x\,\omega_z \bigr) \right)^2 \right] \tag{5} \]

where we have made the substitution:

\[ \begin{bmatrix} \dot{X}/\dot{Z} \\ \dot{Y}/\dot{Z} \end{bmatrix} \rightarrow \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \tag{6} \]

We can see that the terms involving \(\omega_x, \omega_y, \omega_z\) confound our efforts to recover the depth of a point. In the case of peering, however, the head is stabilized and forward velocity is negligible. Therefore, we can set \(\omega_x = \omega_y = \omega_z = 0\) and \(T_z = 0\) (writing \(T_x, T_y, T_z\) for the translational components of \(V_e\)). We now have:

\[ s = \frac{\lambda}{Z} \sqrt{T_x^2 + T_y^2} \]

If \(s\) is the speed estimated by the photo transducer, then normalizing this quantity by the peering speed of the transducer yields an estimate of the reciprocal of depth for a point in the image. Let us call that quantity \(s^*\):

\[ s^* = \frac{s}{\lambda \sqrt{T_x^2 + T_y^2}} = \frac{1}{Z} \]

This is reflected in the layer \(S^*\) in Fig. 2; here 'Peering Speed' equals \(\sqrt{T_x^2 + T_y^2}\). If we take an average over the extent of the image, then we have the average inverse depth:

\[ \frac{1}{Z^*} = \frac{1}{N} \sum_{\forall i,j} \frac{1}{Z(i,j)} = \frac{1}{N} \sum_{\forall i,j} s^* \tag{7} \]

Inverting, we have:

\[ Z^* = N \Big/ \sum_{\forall i,j} s^* \tag{8} \]
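Eqs. (7) and (8) reduce to a few lines of array code. The sketch below assumes a NumPy speed image from layer 'S', a default \(\lambda = 1\), and a hard-threshold value chosen arbitrarily for illustration; none of these constants come from the paper.

```python
import numpy as np

def jump_parameter(s, peer_speed, lam=1.0, threshold=1e-3):
    """Estimate Z* from the layer-'S' speed image during a pure
    sideways translation (omega = 0, T_z = 0).

    s          : 2-D array of retinal speeds from layer 'S'
    peer_speed : sqrt(T_x**2 + T_y**2), the peering speed
    threshold  : cutoff for counting a unit as 'on' (an assumption)
    """
    s_star = s / (lam * peer_speed)   # layer S*: per-pixel estimate of 1/Z
    on = s_star > threshold           # layer N marks the 'on' units
    N = on.sum()
    return N / s_star[on].sum()       # eq. (8): Z* = N / sum of s*
```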

Figure 4. Targets used for peering experiments: #1 foam cup, #2 soda can, #3 utility tool, #4 roll of wire.

The computation in eqn (8) is performed by the 'Jump Parameter' neuron. To understand how this is done, we note that if we sum the activity, use that as a shunting inhibition signal, and apply a hard threshold to this layer (the computation in layer 'N'), we essentially count the number of 'on' neurons. The jump parameter neuron sums all of the 'on' neurons to compute N and then divides this by the summation of activity in each \(s^*\). Thus we compute the jump parameter, which is related to distance (up to a scale factor).
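As a sanity check of this counting-and-dividing scheme, a fronto-parallel surface at a known depth should return that depth. The numbers below are illustrative only (with \(\lambda = 1\)), using the jump_parameter sketch above:

```python
import numpy as np

Z_true = 10.0      # cm
peer_speed = 4.0   # cm/sec
# On a 60x80 retina, a surface at Z_true produces s = lam*speed/Z everywhere.
s = np.full((60, 80), peer_speed / Z_true)
print(jump_parameter(s, peer_speed))  # prints 10.0
```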

3.2 Reversal Phenomenon

The reversal phenomenon can be explained by the fact that our model uses NDSMDs. Because of this, a target moving faster than the head of an insect will produce the same effect in layer S as a target that is moving slightly slower than the head of the insect.
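A small numeric illustration of this point, under the simplifying assumption that the normalized response to a target at depth Z moving at v_target while the head moves at v_head is |v_head - v_target| / (Z * v_head); a stationary target gives 1/Z, and larger values read as nearer:

```python
def s_star_moving_target(v_head, v_target, Z):
    """Normalized NDSMD response for a target that moves with the head
    (same sign) or against it (opposite sign); the absolute value
    models the non-direction-selective transduction."""
    return abs(v_head - v_target) / (Z * v_head)

Z = 10.0
print(s_star_moving_target(4.0, 0.0, Z))  # 0.100: stationary target, true distance
print(s_star_moving_target(4.0, 2.0, Z))  # 0.050: with the head, seems farther
print(s_star_moving_target(4.0, 6.0, Z))  # 0.050: faster than the head, same response
print(s_star_moving_target(4.0, 8.0, Z))  # 0.100: faster still, seems nearer again
```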

4. Robot Experiment Setup

A miniature camera was mounted on a Khepera robot base. The camera was positioned so that it looked sideways relative to the direction of movement (see Fig. 3). Thus, as the robot moved in a straight line, the camera translated parallel to the image plane. Image processing from the camera was done on an SGI Indy workstation. All images were subsampled to 60x80 pixels. The robot was controlled from a dual Pentium II workstation via an RS-232 connection. Commands were issued to the workstation-based robot controller by a process on the Indy workstation. For all three processes (image processing, command issuing, and robot control), interprocess communication was handled through PVM (Geist et al., 1994).

Performance of the system was measured by varying the target, target distance, and peering velocity. Four different targets (Fig. 4) were placed at various distances from the robot. Targets #1 and #2 were roughly the same size but with different surface texture. Target #3 was about the same height as targets #1 and #2, but was narrower. Target #4 was about the same width as targets #1 and #2 but was much shorter. The targets were placed at various distances in front of the camera: 2, 4, 6, 8, 10, and 12 cm. Peering velocities of 4 cm/sec and 2 cm/sec were used. The duration of each peer was adjusted so that the total distance traveled was about 8 cm.
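For concreteness, a sketch of the peer motion itself. This is hypothetical: set_wheel_speeds stands in for whatever command path drives the Khepera over RS-232; the actual command protocol and units are not given in the text.

```python
import time

PEER_SPEED_CM_S = 4.0  # peering velocity (4 or 2 cm/sec in the experiments)
PEER_TRAVEL_CM = 8.0   # total distance traveled per peer

def peer_once(set_wheel_speeds, direction=1):
    """Drive both wheels at the same speed so the sideways-looking
    camera translates parallel to its image plane, then stop.
    direction is +1 or -1 for the two sweep directions."""
    v = direction * PEER_SPEED_CM_S
    set_wheel_speeds(v, v)
    time.sleep(PEER_TRAVEL_CM / PEER_SPEED_CM_S)
    set_wheel_speeds(0.0, 0.0)
```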

Figure 5. Response of Jump Parameter neuron for target 2 at different distances.

In all cases, the measured variable was the jump parameter as determined by the output of the jump neuron. During a single peer (either to the left or the right), the maximum value of the jump parameter was measured and recorded as the jump parameter for that particular peer. Other values, such as the average jump parameter over a scan, could have been used as well. At the ends of the peer, when the robot reversed direction, the velocity fell to zero; it was felt that the maximum value gives the most reliable measurement of distance. It was noticed that the maximum value was often established close to its final reading early in the peer. It seems likely that the peer distance of 8 cm is much more than is needed to determine depth. In the series of experiments described here, the minimum distance needed for a good estimate of depth was not determined.

5. Experimental Results

5.1 Variation of Jump Parameter with Target Distance

In this experiment, target #2 was placed at various distances in front of the camera. The maximum jump parameter was established for a series of 14 peers at each distance. In theory, the jump parameter should increase linearly with target distance. The data are plotted in Figure 5 along with a line of best fit (in a least-squares sense). As can be seen, the jump parameter did increase monotonically with distance, in a fairly linear fashion. During each peer, it was noticed that some residual rotation of the camera introduced spurious noise into the measurement. This undoubtedly accounts for some of the spread of the data at each peering distance.
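The line of best fit is an ordinary least-squares fit. A sketch with synthetic stand-in data (the paper's raw readings are not reproduced here; the slope and noise level below are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
distances = np.repeat([2.0, 4.0, 6.0, 8.0, 10.0, 12.0], 14)  # cm, 14 peers each
readings = 0.5 * distances + rng.normal(0.0, 0.2, distances.size)  # synthetic

slope, intercept = np.polyfit(distances, readings, 1)  # least-squares line
print(f"jump parameter ~ {slope:.2f} * distance + {intercept:.2f}")
```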

5.2 Variation of Jump Parameter with Different Targets

Four different targets were compared, and each was placed at the 6 different distances as before. During each peer about 14 data samples were taken. Because we normalize the activity of the distance measurement layer by the number of 'on' units, the jump parameter should be invariant to target size. In addition, if sufficient texture is present, the range estimation should be invariant to target texture. Figure 6 shows the regression lines for each target. Notice targets #1 and #2: these two objects have different texture but approximately the same size, and they have nearly identical regression lines. However, targets #3 and #4 appear to have significantly different regression lines. We can speculate that the reason for this is that the background may have been averaged into the reading; thus we compute a distance that is too large.

Figure 6. Comparison of jump parameter response versus distance for the 4 different targets. The regression lines are plotted and the data points are left out for clarity of presentation.
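The size-invariance claim can be checked directly against the jump_parameter sketch from section 3.1: only the 'on' pixels enter the normalized sum, so a wide and a narrow target at the same depth give the same estimate. Illustrative values only, with \(\lambda = 1\):

```python
import numpy as np

for width in (40, 8):                     # wide vs. narrow target on a 60x80 retina
    s = np.zeros((60, 80))
    s[:, :width] = 4.0 / 10.0             # target pixels: s = speed/Z for Z = 10 cm
    print(width, jump_parameter(s, 4.0))  # both widths print ~10.0
```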

5.3 Jump Parameter and Peering Speed Variation

The next experiment was to determine the variation in estimated peering distance with varying peering velocity. In theory, because we normalize the processed image by the peering velocity, the jump parameter should be invariant to peering velocity. Figure 7 shows the data from this experiment along with regression lines. Two targets were compared: target #2, a large target, and target #4, a small target. For both the large and the small target, the regression lines are lower at the lower scan speed. This means that distances at the slower scan speed are overestimated. An overestimate of distance indicates a reduced response of the movement detectors. Two possible explanations can account for this: either there is a non-linearity in the movement detectors, or increased peering velocities are associated with increased rotational noise. These two cases can be distinguished by examination of the curves. If there is a non-linearity in the movement detectors, then the regression line should show a pronounced curve. If the background noise is changed, then the line should shift parallel and downward at the lower peering speed. The second case seems to fit the data. That is, the results indicate that increased peering velocities are associated with increased rotational noise.

Figure 7. Comparison of jump parameter estimate versus distance at two peering speeds. (A) Target #2. (B) Target #4.
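The parallel-shift signature can be reproduced with a toy calculation: if a constant noise term n is added to the measured retinal speed, the estimated inverse depth gains an offset n / (lambda * speed) that depends on scan speed but not on distance, so the whole line shifts rather than bends. The value of n and the other numbers here are invented for illustration:

```python
n = 0.05                          # assumed constant additive speed noise
for peer_speed in (2.0, 4.0):
    for Z in (4.0, 8.0, 12.0):
        s = peer_speed / Z + n    # lam = 1: true retinal speed plus noise
        z_est = peer_speed / s    # invert s* = s / peer_speed
        print(peer_speed, Z, round(z_est, 2))
```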

6. Discussion

Our results indicate that it is possible to recover a distance estimate to a target using the computational model presented here. We infer from our results that rotational stability of the head is a problem and will introduce significant errors in distance estimation. Insects incorporate an optomotor response to rotationally stabilize relative to the environment. In the insect system, the eyes look sideways. During peering, the poles of the optic flow field are perpendicular to the target direction, so any translational movement at the poles must be due to head rotation. Incorporation of a rotation stabilization mechanism should increase the accuracy of target distance estimation and allow the system to accurately judge distances farther away.

Our model is able to estimate the distances to objects using biologically plausible mechanisms. It relies on non-direction-selective movement detectors, a pathway only recently discovered in insects. The model can explain the reversal phenomenon noted by Sobel (1990). In addition, the particular computation performed here is a simpler hypothesis than the computation proposed by Sobel. Sobel's explanation for how the animal computes distance is based on the static geometry of the problem: after a peer, the animal uses two angular measurements made at the extents of the peer to triangulate the object's position. His computation requires that (A) the animal memorizes an angle at the extremes of movement, presumably in some short-term storage; (B) the animal is able to 'zero out' this short-term storage between peers; (C) the animal is able to match different views of the same object (at the end of a peer it must realize that a portion of the image is the same thing that it memorized a short while ago); and (D) the animal then uses the two angular measurements to compute distance using a non-linear function. This computation appears difficult to explain using a neural substrate, and it is more complex than the original idea put forth by Exner (1891): that it is the apparent speed of the object that determines its perceived depth, not the static position of the object at the extents of movement.

Peering is an interesting phenomenon from several points of view. In the behaving animal, peering solves a particularly difficult problem in nature: depth perception at relatively large distances. At close distances, and especially for moving objects (Collett, 1996), stereopsis can recover the depth of a target and is used when an animal strikes its prey. The mechanism of stereopsis has the advantage of stealth and is compatible with a lie-in-wait strategy in which the animal is motionless and waits for unsuspecting prey to approach. Insects with compound eyes have relatively few photoreceptors, and their ommatidia are focused forward into a relatively small binocular region. Stereopsis for gauging distances to faraway objects is therefore not an option for insects. Instead, peering is used.

Peering is potentially of wide use in the robotics community. Peering could be used as a complement to stereopsis in gauging distance to objects. One could imagine a system where peering is invoked to confirm distance estimates from stereopsis.

7. Conclusion

A neurocomputational model of depth perception using peering was presented that explains how an insect can recover depth using peering behavior. The model is consistent with recent behavioral and anatomical evidence for non-direction-selective movement detectors. The model demonstrates that memorization of angles to target positions at the extents of a peer is not necessary to recover depth. The model can also estimate the distance to a variety of objects and at a variety of speeds.

It is speculated that the accuracy of the distance estimate will increase with better rotational stabilization of the camera. In addition, slower scanning speeds may give better results in a real, mechanical system. The theoretical model can also explain the reversal phenomenon described in the literature (Sobel, 1990).

Acknowledgments

The authors acknowledge the useful critique of this work by Narendra Ahuja, John Hart, and Lucia Simo. The authors acknowledge the support of ONR grant N000149610657. The authors also acknowledge the loan of the Khepera from UCLA (NSF grant CDA-9303148).

References

A. Borst and M. Egelhaaf (1989), Principles of Visual Motion Detection, Trends in Neurosciences, 12(8):297-306.

T. S. Collett (1978), Peering: A Locust Behavior Pattern for Obtaining Motion Parallax Information, J. Exp. Biol., 76, pp 237-241.

T. S. Collett (1996), Vision: Simple Stereopsis, Current Biology, 6(11):1392-1395.

J. K. Douglass and N. J. Strausfeld (1996), Visual Motion-Detection Circuits in Flies: Parallel Direction- and Non-Direction-Sensitive Pathways between the Medulla and the Lobula Plate, J. Neurosci., 16(15):4551-4562.

S. Exner (1891), Die Physiologie der facettirten Augen von Krebsen und Insecten, p 206, Leipzig and Vienna: Franz Deuticke.

A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek and V. Sunderam (1994), PVM: Parallel Virtual Machine. A Users' Guide and Tutorial for Networked Parallel Computing, MIT Press, Cambridge.

M. A. Lewis (1998), Visual Navigation in a Robot Using Zig-Zag Behavior, Advances in Neural Information Processing Systems (NIPS*10), The MIT Press, Cambridge.

E. C. Sobel (1990), The Locust's Use of Motion Parallax to Measure Distance, J. Comp. Physiol. A, 167, pp 579-588.

M. V. Srinivasan and S. W. Zhang (1993), Evidence for Two Distinct Movement-Detecting Mechanisms in Insect Vision, Naturwissenschaften, 80, pp 38-41.

F. Walcher and K. Kral (1994), Visual Deprivation and Distance Estimation in the Praying Mantis Larva, Physiological Entomology, 19, 230-240.

G. K. Wallace (1959), Visual Scanning in the Desert Locust Schistocerca gregaria Forskål, J. Exp. Biol., 36, pp 512-525.