Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems San Diego, CA, USA, Oct 29 - Nov 2, 2007

TuD8.1

Real-time robot vision for collision avoidance inspired by neuronal circuits of insects

Hirotsugu Okuno and Tetsuya Yagi

Abstract— A real-time vision sensor for collision avoidance was designed. To respond selectively to approaching objects on a direct collision course, the sensor employs an algorithm inspired by the visual nervous system of the locust, which can robustly avoid collisions by using visual information. We implemented the architecture of the locust nervous system with a compact hardware system that contains mixed analog-digital integrated circuits consisting of an analog resistive network and field-programmable gate array (FPGA) circuits. The response properties of the system were examined by using simulated movie images, and the system was also tested in real-world situations by mounting it on a motorized car. The system was confirmed to respond selectively to colliding objects even in complicated real-world situations.

H. Okuno and T. Yagi are with the Division of Electrical, Electronic and Information Engineering, Osaka University, 2-1 Yamadaoka, Suita, Osaka, Japan.

1-4244-0912-8/07/$25.00 ©2007 IEEE.

I. INTRODUCTION

Animals can detect and avoid approaching objects in real time using the visual nervous system [1]. In insects, this collision avoidance ability is achieved with the limited neural networks of a small cephalon. The comparatively simple neural networks of insects enable researchers to specify the functions and activities of individual neurons and of well-defined neural circuits [2]–[4]. Based on this background, artificial visual systems that imitate some features of the nervous system of flies have been developed to demonstrate the advantages of employing such bio-inspired systems in robotic vision [5]–[7].

Locusts have attracted attention for their ability to avoid collisions through the use of monocular cues. A neuron named the lobula giant movement detector (LGMD) has been identified as being responsible for triggering avoidance behavior in the locust visual nervous system [8]–[10], and a network model of the neuronal circuit has been proposed [11]. Collision avoidance algorithms inspired by these biological studies have been implemented on a personal computer (PC) [12]–[14], and a digital very-large-scale integrated (VLSI) vision chip has been designed to mimic the collision avoidance response of the locust neuronal circuits [15], [16].

In a previous study, we proposed a network model to implement the LGMD neuron with mixed analog-digital integrated circuits and demonstrated, using simulated video movies, that the system gave rise to responses similar to those of LGMD neurons [17]. In the present study, the system was implemented with a neuromorphic silicon retina [18] and field-programmable gate array (FPGA) circuits so as to take advantage of both real-time analog computation and programmable digital processing. The system was applied to control a motorized miniature car so that it avoids collisions in real time.

II. ALGORITHM FOR COLLISION AVOIDANCE

A. Neuronal Network for Collision Avoidance

The system implemented in this study is inspired by the computation that takes place in locust visual neurons. Fig.1 shows the neuronal network model of the locust visual circuit proposed by Rind et al. [11]. This model consists of four hierarchically arranged layers of neuronal networks: three retinotopically connected layers and an integrative layer.

A “P” unit of the first layer responds to changes in luminance, which are mainly induced by the motion of edges, and outputs a pulsed excitation, as indicated by the solid line in Fig.2. The excitation is then fed to the second layer. In the second layer, an “E” unit instantly transmits an excitatory signal to the next layer (dotted line in Fig.2), while an “I” unit sends delayed inhibitory signals (dashed line in Fig.2) to its neighboring units in the next layer. In the third layer, an “S” unit subtracts the aggregated inhibitory signals of the I units from the excitatory signal of the E unit at the same retinotopic position. If the result exceeds a particular threshold, the S unit is excited and transmits the result to the fourth layer, where the outputs of all S units converge on the LGMD neuron.

Fig. 1. LGMD neuronal network model proposed by Rind et al. (1996). (Diagram labels: Layer-1, P units; Layer-2, E and I units; Layer-3, S units; Layer-4, LGMD; feedforward F unit.)

Fig. 2. Response of the P, E, and I units to an illumination change at t = t0 in the neuronal network illustrated in Fig.1. The time lag between excitation by the E units and inhibition by the I units is the key to generating a selective response to an approaching object at close range.

Fig. 4. Spatial profile of the unit responses. An edge motion induces an instant localized response of the E unit (solid line) and broad responses of the I units with a delay (dashed line). (Diagram labels: past location of the edge; present location of the edge.)

There is another pathway for feedforward inhibition that is
mediated by an “F” unit. The function of this inhibition is to prevent the LGMD from responding to global excitations such as sudden changes in background illumination.

B. Visual Signal Computation

The computation carried out in the neuronal network of the locust visual neurons is shown in Fig.3. The fundamental computation required for generating the collision avoidance signal is the critical competition between the excitatory signal “E” induced by the moving edge (indicated by the thick arrow) and the inhibitory signal “I” spreading laterally in the second layer (indicated by the outline arrow). The spatial profile of signals E and I is shown in Fig.4. If the edge of the object moves faster on the retinal surface than the inhibitory signal spreads laterally in the second layer, the subtraction units in the third layer output positive values and the summation unit in the fourth layer generates a large response. Otherwise, the excitatory signal is cancelled or suppressed by the inhibitory signal, resulting in no response or only a weak response in the fourth layer.

Fig. 3. Visual signal flow diagram of LGMD. (Stages: Layer-1, transient response to the moving edge; Layer-2, excitation E and delayed, spreading inhibition I; Layer-3, subtraction S; Layer-4, summation of signals at the LGMD.)

C. Monocular Cues for Approach Detection

Fig.5 depicts how an approaching object is projected onto the retinal surface. D is the diameter of the object and d(t) is the distance of the object from a lens with focal length f. Now, consider that the object is approaching the lens at constant velocity V along the optical axis of the lens. In this case, the diameter a(t) of the object’s image projected onto the retinal surface and its derivative with respect to time are given by:

$$a(t) = \frac{f D}{d(t)} \qquad (1)$$

$$\dot{a}(t) = -\frac{f D V}{d(t)^{2}} \qquad (2)$$

Equation (2) follows from (1) when V is read as the rate of change of d(t), which is negative for an approaching object, so the projected image expands ($\dot{a}(t) > 0$).

We assume that the length of the surrounding edge of the projected image is approximately proportional to a(t) and that the moving velocity of the edge is proportional to $\dot{a}(t)$. The above equations indicate that as the object approaches the lens, the length of the surrounding edge and its velocity on the retinal surface increase in proportion to $d^{-1}$ and $d^{-2}$, respectively. In other words, the length and velocity of the edge increase drastically at close range. Therefore, the LGMD neuron generates a prominent response to an approaching object at close range, particularly on a direct collision course. This response is an effective cue for the locust to avoid collision by using its monocular visual field.

Fig. 5. Diagram of an approaching object focused by a lens onto a retinal surface. On the retinal surface, both the length and the moving velocity of the surrounding edge of the projected image increase drastically at close range. (Diagram labels: approaching object of diameter D moving at velocity V; lens; distance d(t); focal length f; image diameter a(t) on the retinal surface.)
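As a minimal numerical sketch of equations (1) and (2), the following Python snippet evaluates the projected diameter and the magnitude of its expansion rate at several distances. The function names, the focal length, and the sample distances are assumptions chosen for illustration; only the 20 cm object size and the 5 km/h (1.39 m/s) approach speed correspond to values used in the simulated experiments.

```python
def image_diameter(f, D, d):
    """Equation (1): diameter of the projected image, a = f * D / d."""
    return f * D / d


def expansion_rate(f, D, speed, d):
    """Magnitude of equation (2): |da/dt| = f * D * speed / d**2.

    `speed` is the positive approach speed, i.e. |V|.
    """
    return f * D * speed / d ** 2


if __name__ == "__main__":
    f = 0.008     # focal length in metres (assumed for illustration)
    D = 0.20      # object size in metres
    speed = 1.39  # approach speed in m/s (5 km/h)
    for d in (4.0, 2.0, 1.0, 0.5):
        a = image_diameter(f, D, d)
        a_dot = expansion_rate(f, D, speed, d)
        print(f"d = {d:4.1f} m   a = {a * 1e3:7.3f} mm   |da/dt| = {a_dot * 1e3:8.3f} mm/s")
```

Each halving of the distance doubles the image diameter but quadruples the expansion rate, which is the close-range cue exploited by the LGMD.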

III. HARDWARE IMPLEMENTATION

A. System Architecture

We have implemented the fundamental architecture of the collision avoidance circuit described in the previous section by using a mixed analog-digital system consisting of a silicon retina and FPGA circuits (Fig.6). In this system, the lateral spread of the inhibitory signal is achieved by using the resistive network. This is an efficient architecture for realizing the lateral spread of the signal since the inhibitory signal is conducted passively and instantaneously over the resistive network. However, an analog circuit is not appropriate for implementing the delay of the inhibitory signal, because the capacitor needed to realize the required delay time would occupy a significantly large area on the chip when fabricated in analog VLSI. Accordingly, the delay of the inhibitory signal is implemented in the FPGA circuit. The silicon retina used here has 100 × 100 pixels, each of which is composed of an active pixel sensor (APS), resistive networks, and differential amplifiers [19]. The architecture of the chip was originally designed by Kameda and Yagi [18].

Fig. 6. Block diagram of the system implemented using FPGA circuits and a silicon retina. The lateral spread is realized by exploiting the analog resistive network in the silicon retina. This implementation reduces the computational cost in the subsequent processing. The delay, subtraction, and summation are realized in the FPGA with RAM.

B. Response properties of resistive network

The transient response to light is obtained by the differential amplifiers in the silicon retina circuit, which subtract consecutive image frames received by the APS array. The resistive network connecting neighboring pixels is used to generate the laterally spread inhibitory signals. Fig.7 (a) and (b) show the two types of responses generated by the silicon retina to the moving image shown in Fig.7 (c). Fig.7 (d) shows the spatial profile of the pixel voltages measured along the 50th row of Fig.7 (a) and (b). The transient response exhibits a sharp peak at the border of the black-and-white areas (dotted line). The signal that is smoothed by the resistive network of the silicon retina has skirts with a decaying spatial profile on both sides, as shown in Fig.4. The size of the smoothing filter, or in other words, the degree of exponential decay of the smoothed image, can be easily controlled by a voltage applied externally to the resistors Rs because the resistive network of the silicon retina is implemented with metal-oxide-semiconductor (MOS) transistors [18].

Fig. 7. Outputs from the silicon retina. (a) Transient response to light. (b) Smoothed response. (c) Movie image used as visual stimuli. A black-and-white edge moves sideways. (d) Spatial profile of the pixel voltages measured along the 50th row of (a) and (b).

The delay in the inhibitory signal is generated in the FPGA circuits using random access memory (RAM). The difference between VE and VI gives VS in each pixel. The VS values
of all pixels are summed up to obtain the final output of this system. The amplitude and delay time of the inhibitory signal can be controlled in the FPGA circuits.

IV. SYSTEM RESPONSE

A. Response Property

The responses of the system to moving images have been examined, as shown in Fig.8. In this experiment, a movie simulating an approaching object is created on a computer and presented on a liquid crystal display (LCD) monitor. The frame sampling time of the silicon retina is set to 33 ms.
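A stimulus of this kind, an expanding white square on a dark background, can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the procedure used to create the original movie: the function name `approach_movie`, the focal length expressed in pixels (`focal_px`), the starting distance, and the stopping distance are all hypothetical. The 33 ms frame interval, the 100 × 100 pixel array, the 20 cm object, and the 1.39 m/s approach speed are taken from the text.

```python
import numpy as np


def approach_movie(size_m=0.20, speed=1.39, start_dist=3.0, frame_dt=0.033,
                   focal_px=100.0, n_pixels=100, min_dist=0.3):
    """Return a list of n_pixels x n_pixels frames of an expanding white square.

    The side of the square in pixels follows a = focal_px * size_m / d
    (pinhole projection), while d shrinks by speed * frame_dt per frame.
    """
    frames = []
    centre = n_pixels / 2.0
    rows, cols = np.mgrid[0:n_pixels, 0:n_pixels]
    d = start_dist
    while d > min_dist:
        half = 0.5 * focal_px * size_m / d           # half side length in pixels
        inside = ((np.abs(rows + 0.5 - centre) <= half)
                  & (np.abs(cols + 0.5 - centre) <= half))
        frames.append(inside.astype(float))          # 1.0 = white, 0.0 = black
        d -= speed * frame_dt                        # constant approach speed
    return frames


if __name__ == "__main__":
    movie = approach_movie()
    print(len(movie), "frames; white pixels in first and last frame:",
          int(movie[0].sum()), int(movie[-1].sum()))
```

Because the distance decreases linearly in time while the projected size grows as 1/d, the square expands slowly at first and rapidly in the last few frames.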

Fig. 8. Experimental environment to test the system response. Movie images are presented on an LCD monitor. The response of the system was recorded by a PC via Ethernet.

Fig. 9. Movie image simulating an approaching object. The white rectangle in the center of the screen expands.

Fig.10 shows the system response to the movie shown in Fig.9. In the figure, the amplitude of the output is normalized by the maximum response in each case. Fig.10 (a) compares the response of the system with inhibition (solid line) and that without inhibition (dotted line). Without inhibition, the output of the model is approximately proportional to the product of the edge length and the edge velocity because the number of pixels stimulated by the moving edge is proportional to this product. Here, the edge velocity refers to the velocity of a moving edge of an image projected onto the acceptance surface. With inhibition, however, the model responds little to the approaching object when the object is far, and the response increases drastically when the object approaches at close range.

Fig. 10. System responses to a simulated approaching object. In the movie images displayed here, it is assumed that an object in front of the retina approaches straight-on with a uniform velocity. (a) Comparison between the response of the system with inhibition (solid line) and that without inhibition (dotted line). The dashed line plots the product of the edge velocity and the edge length. (b) Effects of the amplitude of inhibition.

The effect of the inhibition is shown in Fig.10 (b). The response curve shifts in the direction of the arrow as the amplitude of inhibition increases. Similar effects were obtained by shortening the delay time and/or broadening the filter size [17]. These results indicate that the slope of the response curve to approaching objects can be controlled by these parameters.

Fig.11 shows the system response to simulated approaching objects of various sizes. The time at which the response increases drastically varies with the size of the object. The deflection point of the response curve is at about 0.7 s before contact for a 20 cm square object and about 0.5 s for a 10 cm square object. The drastic increase was induced only when the velocity of edge motion exceeded a certain value. In the present study, this velocity was 1.8 pixels/frame, as shown in Fig.12, and it was independent of the object size.
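The dependence of the response on edge velocity can be illustrated with a one-dimensional toy version of the excitation/inhibition competition. The sketch below is not the hardware implementation: it assumes a one-frame delay, an exponentially decaying lateral kernel, and arbitrarily chosen values for the inhibition amplitude (`inhibition_gain`), the decay factor, and the kernel radius, so the speed at which the response appears is not meant to match the 1.8 pixels/frame measured on the system.

```python
import numpy as np


def lgmd_response(edge_speed, n_pixels=64, n_frames=6,
                  inhibition_gain=2.0, decay=0.5, radius=8):
    """Summed S-unit output for a step edge moving at `edge_speed` pixels/frame.

    P: absolute frame difference (transient response).
    E: P of the current frame (undelayed excitation).
    I: P of the previous frame, spread by an exponential kernel (delayed inhibition).
    S: rectified E - gain * I; the return value is the sum of S over all pixels.
    """
    offsets = np.arange(-radius, radius + 1)
    kernel = decay ** np.abs(offsets)        # exponentially decaying lateral spread
    x = np.arange(n_pixels)

    prev_frame = np.zeros(n_pixels)          # all dark before the edge enters
    prev_p = np.zeros(n_pixels)
    response = 0.0
    for t in range(1, n_frames + 1):
        frame = (x < t * edge_speed).astype(float)   # bright region grows to the right
        p = np.abs(frame - prev_frame)               # pixels swept by the edge
        inhibition = inhibition_gain * np.convolve(prev_p, kernel, mode="same")
        s = np.maximum(p - inhibition, 0.0)          # subtraction and rectification
        response = s.sum()                           # summed output for this frame
        prev_frame, prev_p = frame, p
    return response                                  # output at the last frame


if __name__ == "__main__":
    for speed in (1, 2, 3, 4):
        with_inh = lgmd_response(speed)
        without = lgmd_response(speed, inhibition_gain=0.0)
        print(f"speed = {speed} px/frame   with inhibition = {with_inh:.4f}   without = {without:.1f}")
```

With these assumed parameters, an edge moving at 1 pixel/frame is cancelled completely because it never outruns the delayed inhibition, whereas faster edges produce a response that grows more steeply than the linear, uninhibited response. Increasing `inhibition_gain` or widening the kernel moves the onset to higher speeds in this toy model.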

Fig. 11. System responses to simulated approaching objects with various object sizes. The assumed approaching velocity is 5 km/h (1.39 m/s). The larger the object size is, the earlier the system responds.

Fig. 12. System response to a simulated approaching object, and edge velocity on the surface of the silicon retina. The system response increases drastically after the velocity exceeds 1.8 pixels/frame.

Fig. 13. Experimental environment to test the system response in a real-world situation. A motorized miniature car loaded with a silicon retina moves toward a beverage can. Two other beverage cans are placed on either side of the collision course. (Labels: beverage can on collision course; beverage cans on non-collision course; guard rail; car loaded with a silicon retina.)

Fig. 14. System response when the silicon retina moves forward in the environment shown in Fig.13. (Labels: threshold; stop signal; sampling times a, b, c, and d.)

B. Response in Real-World Situations

To examine the system response in real-world situations, the environment shown in Fig.13 was used. In this setting, a motorized miniature car loaded with a silicon retina moves toward a beverage can placed on a direct collision course, passing two beverage cans placed on a non-collision course. The brake of the car is designed to operate when the system response exceeds a particular value. Fig.14 shows the system response when the silicon retina moves forward in this environment. While the cans on the non-collision course increased the response only slightly at approximately t = 1.0 s and 1.3 s, the can on the direct collision course caused a much larger response, which activated the brake and successfully stopped the car just before the collision.

Fig. 15. Transient responses from the silicon retina. These images were sampled at the times indicated by the letters (a) to (d) in Fig.14.

V. DISCUSSION

The system response was affected by the object size, as shown in Fig.11. The larger the object was, the earlier the strong response was induced. However, the drastic increase of the response was induced at a particular velocity of edge motion irrespective of the object size; this velocity is determined by the spatiotemporal profile of the inhibition signals. From equation (2), the edge velocity of an approaching object is proportional to $D/d(t)^{2}$. Therefore, even if the size of the object is doubled, the drastic increase of the response emerges at a distance $\sqrt{2}$ times as large, not 2 times. The experimental results in Fig.11 are consistent with this estimate.

The frame sampling interval was set to 33 ms due to the low sensitivity of the photoreceptors in the silicon retina. The computation in the signal processing circuits of the system completes in a few milliseconds even at a slow clock frequency of 40 MHz. Therefore, the frame rate can be improved by employing active pixel sensors with high sensitivity or by increasing the intensity of the illumination.

In the present study, we have implemented a real-time vision system for collision avoidance inspired by neuronal circuits of locusts. Conventional digital computation often encounters the limitations of excessive power consumption, large-scale hardware, and high cost of computation in sensory information processing [20]. However, its programmable architecture enables a variety of image processing techniques to be executed. In contrast, an analog VLSI circuit executes parallel computation by using the physical properties of its built-in circuits, and the results of the computation are obtained instantaneously in the form of a voltage or current distribution in the circuit. Therefore, the analog VLSI circuit can provide high computational efficiency in sensory information processing, although the computation is not as flexible as that performed by its digital counterparts. The implemented system consists of the analog VLSI silicon retina [19] and FPGA circuits so as to take advantage of the properties of both analog and digital technologies, and was shown to be applicable in real-world scenes.
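The $\sqrt{2}$ estimate in the discussion can be made explicit with a short derivation. The symbols $v_{\mathrm{th}}$ (the edge velocity at which the response rises sharply), $k$ (an assumed proportionality constant between the edge velocity and $|\dot{a}|$), $d^{*}$ (the distance at which $v_{\mathrm{th}}$ is reached), and $t^{*}$ (the corresponding time before contact) are introduced here only for illustration and do not appear in the original text; the object is assumed to approach at constant speed $|V|$.

```latex
\begin{align*}
k\,\frac{f D\,|V|}{d^{2}} = v_{\mathrm{th}}
  &\;\Longrightarrow\;
  d^{*}(D) = \sqrt{\frac{k f D\,|V|}{v_{\mathrm{th}}}}, \qquad
  t^{*}(D) = \frac{d^{*}(D)}{|V|}, \\
\frac{d^{*}(2D)}{d^{*}(D)} &= \frac{t^{*}(2D)}{t^{*}(D)} = \sqrt{2} \approx 1.41 .
\end{align*}
```

For comparison, the deflection points reported for the 20 cm and 10 cm objects have a ratio of 0.7/0.5 = 1.4, which is close to this value.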

REFERENCES

[1] F. C. Rind and P. J. Simmons, “Seeing what is coming: building collision-sensitive neurons,” Trends Neurosci., vol.22, no.5, pp.215-220, May 1999.
[2] W. Reichardt and T. Poggio, “Visual control of orientation behaviour in the fly Part I,” Q. Rev. Biophys., vol.9, pp.311-375, 1976.
[3] T. Poggio and W. Reichardt, “Visual control of orientation behaviour in the fly Part II,” Q. Rev. Biophys., vol.9, pp.377-438, 1976.
[4] M. Egelhaaf and A. Borst, “A Look into the Cockpit of the Fly: Visual Orientation, Algorithms, and Identified Neurons,” J. Neurosci., vol.13, pp.4563-4574, 1993.
[5] N. Franceschini, J. M. Pichon and C. Blanes, “From insect vision to robot vision,” Philos. Trans. Roy. Soc. Lond. B, vol.337, pp.283-294, 1992.
[6] N. Franceschini, “Visual guidance based on optic flow: a biorobotic approach,” J. Physiol. Paris, vol.98, pp.281-292, 2004.
[7] B. Webb, “Robots in invertebrate neuroscience,” Nature, vol.417, pp.359-363, 2002.
[8] N. Hatsopoulos, F. Gabbiani and G. Laurent, “Elementary computation of object approach by a wide-field visual neuron,” Science, vol.270, pp.1000-1003, 1995.
[9] F. C. Rind, “Intracellular characterization of neurons in the locust brain signalling impending collision,” J. Neurophysiol., vol.75, pp.986-995, 1996.
[10] M. O’Shea and J. L. Williams, “The anatomy and output connections of a locust visual interneuron: the lobula giant movement detector (LGMD) neuron,” J. Comp. Physiol., vol.91, pp.257-266, 1974.
[11] F. C. Rind and D. I. Bramwell, “Neural network based on the input organization of an identified neuron signaling impending collision,” J. Neurophysiol., vol.75, pp.967-984, 1996.
[12] M. Blanchard, F. C. Rind and P. F. M. J. Verschure, “Collision avoidance using a model of the locust LGMD neuron,” Robot. Auton. Sys., vol.30, pp.17-38, 2000.
[13] S. Bermudez and P. Verschure, “A Collision Avoidance Model Based on the Lobula Giant Movement Detector (LGMD) neuron of the Locust,” Proceedings of the IJCNN, Budapest, 2004.
[14] S. Yue, F. C. Rind, M. S. Keil, J. Cuadri and R. Stafford, “A bio-inspired visual collision detection mechanism for cars: Optimisation of a model of a locust neuron to a novel environment,” NeuroComputing, vol.69, pp.1591-1598, 2006.
[15] J. Cuadri, G. Linan, R. Stafford, M. S. Keil and E. Roca, “A bioinspired collision detection algorithm for VLSI implementation,” Proceedings of the SPIE conference on Bioengineered and Bioinspired System, 2005.
[16] R. Laviana, L. Carranza, S. Vargas, G. Liñán and E. Roca, “A Bioinspired Vision Chip Architecture for Collision Detection in Automotive Applications,” Proceedings of the SPIE conference on Bioengineered and Bioinspired System, 2005.
[17] H. Okuno and T. Yagi, “Bio-inspired real-time robot vision for collision avoidance,” Journal of Robotics and Mechatronics, in press.
[18] S. Kameda and T. Yagi, “An analog VLSI chip emulating sustained and transient response channels of the vertebrate retina,” IEEE Trans. on Neural Networks, vol.14, pp.1405-1412, 2003.
[19] R. Takami, K. Shimonomura, S. Kameda and T. Yagi, “A novel pre-processing vision system employing neuromorphic 100x100 pixel silicon retina,” Proc. 2005 IEEE Intl. Symp. on Circuits and Systems, pp.2771-2774, Kobe, Japan, 2005.
[20] G. Indiveri and R. Douglas, “Neuromorphic Vision Sensors,” Science, vol.288, pp.1189-1190, 2000.