An Aplysia-like Spiking Neural Network Implementation of Sensory Fusion on an Autonomous Robot

Fady Alnajjar
Department of System Design Engineering, Graduate School of Engineering, University of Fukui, Fukui 910-8507, Japan
E-mail: [email protected]

Kazuyuki Murase
Department of Human and Artificial Intelligence Systems, Graduate School of Engineering; Research and Education Program for Life Science, University of Fukui, Fukui 910-8507, Japan
E-mail: [email protected]

Abstract - We introduce a novel biologically inspired adaptive controller for autonomous robots. The proposed controller binds N Aplysia-like spiking neural networks, each of which interacts with a particular type of sensory information and produces various motor outputs. The post-synaptic weights in each network are gradually updated by spike timing-dependent plasticity (STDP) and by the presynaptic modulation signal (synapse-on-synapse contact) from the sensory neurons. Information from different types of sensors is bound at the motor neurons. Experimental results show that a physical Khepera robot with the proposed controller quickly adapted to an open environment by evolving obstacle avoidance behavior while locating a target object using both its IR sensors and its linear camera. We believe that this approach opens up new applications for autonomous robots with various sensory and motor modalities.

1. Introduction

Computation in the brain is primarily carried out by pulse signals (spikes). A computational model built from spiking neurons is therefore called an artificial spiking neural network (SNN). Although a number of SNN models have been proposed by many research groups, their practical applications are still under investigation.1-3 The attractive features of SNNs include the ability to handle structured codes based on the timing of action potentials4 and a greater computational power than similar networks with sigmoidal-gate neurons.2 SNNs have been shown to perform novel types of computations, such as the recognition of temporal patterns using transient synchrony,5 and to support synaptic plasticity.6 In most SNNs, the neuron fires an electric pulse at certain points in time, which is commonly referred to as an action potential or spike. The size and the shape of a spike are independent of the input to the neuron, but the time at which a neuron fires depends on its input.

Recently, there have been many attempts, with various levels of success, to develop an effective SNN-based adaptive controller for autonomous mobile robots.7-8 Most of these works are based on an evolutionary approach, such as a genetic algorithm (GA),7 which is so far one of the appropriate tools for designing and evolving such a network. However, as each individual in such an algorithm must be evaluated in the environment in each generation, such applications are limited by several disadvantages: the necessity of a trial-and-error process, the long time needed to isolate a good individual, and the lack of any capability to adapt automatically to new environments. Some researchers have therefore tried to overcome these disadvantages by using robot simulators instead during the initial period of evolution.8 Other researchers have used the Hebbian rule as an effective mechanism for adapting SNNs for autonomous robots.9 The problem with these solutions is that they have not yet clarified the required number or the properties of the hidden-layer neurons for ideal robot movements. The size of the hidden layer can therefore only be defined experimentally or arbitrarily. Notice that the use of an excessive number of neurons results in slow adaptation and unnecessary calculations. Hebbian learning may also lead to runaway processes of potentiation and cannot account for the stability of neural function.

Given the problems outlined above, it is desirable to develop a method or an algorithm with which an autonomous robot with a minimal network structure learns incrementally and/or continuously in the environment, in accordance with a kind of fitness function representing the desired behavior. Such an algorithm was partially introduced in our previous work, in which we developed an Aplysia-like SNN controller (ASNN).10 In this study, we expand our previous work so that various types of sensory inputs and motor outputs can be easily processed, by building a novel combination of several ASNNs aimed at more complex robot tasks; we call it the Multi Aplysia-like Spiking Neural Network (MASNN).

This paper is organized as follows. Sections 2, 3, and 4 explain the ASNN adaptive controller, the synaptic plasticity mechanism STDP, and the spiking response model (SRM), the SNN model used in this study, respectively. Section 5 introduces the MASNN, a network structure that deals with various sensory modalities and motor outputs. Sections 6 and 7 explain the experimental setups and results. The last section discusses the whole work and introduces some of its future directions.

2. Aplysia-like Spiking Neural Network (ASNN)

The ASNN is a model inspired by the siphon-gill and tail-siphon withdrawal reflex circuits of Aplysia,11 and more specifically by the associative facilitation mechanism of the sensorimotor connection that underlies the short- and long-term sensitization of Aplysia. This mechanism is employed to construct an adaptive controller for an autonomous robot. The structure of the ASNN is explained in this section, while the mechanisms of plasticity and neural firing are presented in sections 3 and 4, respectively. The ASNN is described in full detail in Ref. 10.

Aplysia is a marine snail with a minimal structure for a basic mechanism of learning and memory. It has thus been considered well suited for the examination of the molecular, cellular, morphological, and network mechanisms underlying neuronal modification, plasticity, learning, and memory. The schematic model of heterosynaptic facilitation of the sensorimotor connection in Aplysia is shown in Fig.1A (modified from Ref. 11). A sensory neuron (SN) makes synaptic contacts on two motor neurons, MN1 and MN2. When US1, one of the unconditioned stimulus pathways, is activated by stimulation of the animal, the corresponding motor neuron MN1 is activated and a withdrawal reflex is elicited (unconditioned response, UR). The unconditioned stimulus US1 also activates a facilitatory neuron FN1 that makes a presynaptic contact onto the SN's synapse on MN1. The facilitation of the sensorimotor reflex is based on an activity-dependent heterosynaptic neuromodulation. The presynaptic terminals from the FNs release presynaptic neuromodulator(s), such as serotonin. As shown in Fig.1B, when a conditioned stimulus (CS) is paired with US1, the SN's terminal on MN1 is selectively sensitized because its activity coincides with the neuromodulator action. The CS alone then becomes sufficient to activate MN1 by releasing a larger amount of neurotransmitter, and elicits the conditioned response (CR). The unconditioned stimulus US2, which is unpaired with the CS, in contrast produces no change in the SN's synapse on the corresponding motor neuron MN2. The coincidence of CS and US is thus detected by a presynaptic mechanism. The detection by a postsynaptic mechanism that is also present in Aplysia is omitted here for simplicity.11

Several schemes are possible for implementing such a network directly in a physical mobile robot12,13 (e.g., Khepera). One of the simplest circuits is shown in Fig.1C. The sum of both sensors' activities, represented by the activity of the SN, plays the role of the CS. The coincidence with the left (right) sensor's activity, LS (RS), enhances the synaptic transmission from the SN to MN1 (MN2), so that the signal to the left (right) motor, LM (RM), is potentiated. Notice that the pathways that activate the MNs by the USs (dotted lines in Fig.1A and Fig.1B) are now present as the paths from the SNs to the MNs through the hidden neuron (HN) in Fig.1C.


In our earlier work,10 we demonstrated the validity of this circuitry as an adaptive controller in a real mobile robot, both for performing a simple obstacle avoidance behavior in an open environment and for following changes in the environment. For such a behavior, the robot requires at least left and right proximity sensors and left and right motors. While the robot is moving forward, the activation of the LS (RS) has to result in an increase of the LM (RM) activity and a decrease of the RM (LM) activity in order to avoid the obstacle on the left (right) side of the robot (Fig.2). The LS and RS signals thus have to act as conditioning stimuli (reflex mechanism) for the signals to the LM and RM, respectively.

[Fig. 1 diagram. Labels in (A) and (B): CS, US1, US2, FN1, FN2, SN, MN1, MN2, UR, CR. Labels in (C): LS, RS, SN (input layer), HN (hidden layer), MN (output layer), LM, RM. Legend: STDP synapses.]

Fig. 1. The schematic model of heterosynaptic facilitation of the sensorimotor connection in Aplysia and its implementation as an adaptive robot controller. (A) The conditioned stimulus (CS) is given to the sensory neuron (SN), which makes synaptic contacts with two motor neurons, MN1 and MN2. Each FN makes a presynaptic contact onto the respective SN terminal, and also sends an excitatory signal to the respective MN (dotted lines). (B) US1, which is paired with the CS, potentiates the synaptic transmission from SN to MN1, and produces a largely augmented conditioned response (CR) of MN1. US2, which is unpaired with the CS, produces no change in the MN2 output. (C) ASNN implemented as an adaptive mobile robot controller (adapted from Ref. 10).


Fig. 2. The robot's behavior when avoiding left (A) or right (B) obstacles using the ASNN adaptive controller suggested in Ref. 10. As a result of the synaptic weight modifications made by STDP (details in section 3), the firing rate of the left (right) motor neuron LM (RM) is larger than that of the right (left) motor neuron RM (LM) when the left (right) sensor LS (RS) is activated. The arrows inside the motors illustrate the motors' movement direction.

3. Spike Timing-Dependent Plasticity (STDP)

With the ASNN circuitry described above, we modeled the heterosynaptic facilitation at the sensorimotor connections with a form of synaptic plasticity called spike timing-dependent plasticity (STDP). STDP is a synaptic modification rule found recently in natural synapses.15 A number of experimental and modeling studies have associated STDP with a temporal interpretation of Hebbian learning, short-term prediction, gain adaptation, and boosting of temporally correlated inputs. Synapses with STDP properties can be strengthened or weakened depending on the time latency (TL). Inputs that fire the postsynaptic neuron with a short TL, or that act in a coherent manner, develop strong synaptic connections. Inputs of longer latency, or less effective inputs, in contrast weaken the synaptic strength.15 In other words, synapses modifiable by STDP compete for control of the timing of postsynaptic action potentials, so that the postsynaptic neuron becomes more sensitive to the presynaptic action potentials.

Although robotic applications of STDP have recently drawn much attention from researchers, most of the reported works are still limited, and an STDP rule suitable for real robotic applications is still under investigation.

In STDP, synaptic efficacy is modified persistently by a coherent activation of pre- and postsynaptic elements. The degree of potentiation is found to be a function of the time interval between the pre- and postsynaptic excitations. When the postsynaptic site is excited just after the presynaptic excitation, the synaptic efficacy is increased. Conversely, it is decreased, or even becomes negative, when the postsynaptic site is excited before the presynaptic excitation.

There are two types of computational models of STDP.15 The one we use in this study is based on the neurons' firing rates, as shown in Fig.3. If the presynaptic site and the postsynaptic neuron generate action potentials with time intervals of Δtpre and Δtpost, respectively, the synaptic weight between those two neurons is modified in accordance with the difference Δtpre-post. We estimated the value of Δtpre-post at each time interval as follows:

\[ \Delta t_{pre-post} = TL = \frac{\sum_{i=1}^{N} \Delta t_{pre}}{N-1} - \frac{\sum_{i=1}^{N} \Delta t_{post}}{N-1} \]   (1)

where N is the number of spikes in each time interval. We defined that STDP takes place only when |TL| > 0.5 (see section 6.2.3); otherwise no modification of the synaptic weights occurs. That is, during movement in the environment, the synaptic weights were renewed at every time interval according to the following equation:

\[ W_{post}(t+1) = \begin{cases} W_{post}(t), & -0.5 \le TL \le 0.5 \\ W_{post}(t) + 0.1, & TL < -0.5 \\ W_{post}(t) - 0.1, & TL > 0.5 \end{cases} \]   (2)

After the synaptic modification, the postsynaptic neuron tends to respond faster to the presynaptic spiking.

Fig. 3. A type of STDP based on the difference in the firing rates at the pre- and postsynaptic sites. (Diagram labels: Δtpre at the pre-synapse, Δtpost at the post-synapse.)

4. Spiking Response Model (SRM)

There are several models of spiking neurons with various degrees of detail. In this study, we focused on the Spiking Response Model (SRM),18 which is the easiest to understand and to implement, especially in combination with STDP. In the SRM, the state of a neuron is described by a single variable vi (the voltage, or membrane potential). In the absence of spikes, vi is at its resting value, which we assume here to be zero. Each incoming spike generates a postsynaptic potential that takes time to return to zero. Mathematically, vi is given by Eq.3 (more details can be found in Refs. 7, 15). As shown in Eq.3, vi is the weighted sum of the postsynaptic potentials from several inputs. An incoming spike generates a postsynaptic potential that follows the function ε(s), where s is the time elapsed since the spike. The weight of the synapse from the j-th input to the neuron is ωjt. The function η(s) expresses the refractory effect after spike generation by the neuron.

\[ v_i(t) = \sum_{j} \omega_j^{t} \sum_{f} \epsilon_j(s_j) + \sum_{f} \eta_i(s_i) \]   (3)

\[ \epsilon(s) = \exp[-(s - \Delta)/\tau_m] \cdot \{1 - \exp[-(s - \Delta)/\tau_s]\} \]   (4)

\[ \eta(s) = -\exp[-s/\tau_m] \]   (5)
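As a rough illustration of Eqs. 3-5, the following Python sketch evaluates the membrane potential of one neuron from lists of spike times. It is not the authors' implementation: the numerical values of the time constants and the delay are placeholders chosen for the example, the function names are hypothetical, and treating ε as zero before the spike arrives (s < Δ) and η as zero before the neuron's own spike are assumptions added here.

```python
import math

# Illustrative constants. Only THETA = 1 is stated in the paper;
# the other values are placeholders chosen for this sketch.
TAU_M = 4.0    # membrane time constant (assumed value)
TAU_S = 10.0   # synaptic time constant (assumed value)
DELAY = 1.0    # synaptic delay, Delta in Eq. 4 (assumed value)
THETA = 1.0    # firing threshold given in the text


def epsilon(s):
    """Postsynaptic response of Eq. 4 for an input spike that is s time units old.

    Assumption: the response is zero until the spike has arrived (s < DELAY).
    """
    if s < DELAY:
        return 0.0
    x = s - DELAY
    return math.exp(-x / TAU_M) * (1.0 - math.exp(-x / TAU_S))


def eta(s):
    """Refractory kernel of Eq. 5, s time units after the neuron's own spike.

    Assumption: spikes that lie in the future (s < 0) contribute nothing.
    """
    if s < 0:
        return 0.0
    return -math.exp(-s / TAU_M)


def membrane_potential(t, weights, input_spikes, own_spikes):
    """Eq. 3: weighted sum of input responses plus the refractory terms.

    weights[j] is the weight of input j, input_spikes[j] the list of its
    firing times, and own_spikes the firing times of the neuron itself.
    """
    v = 0.0
    for w, spikes in zip(weights, input_spikes):
        v += w * sum(epsilon(t - tf) for tf in spikes)
    v += sum(eta(t - tf) for tf in own_spikes)
    return v


if __name__ == "__main__":
    v = membrane_potential(3.0, [10.0], [[0.0]], [])
    print("v =", round(v, 3), "fires:", v >= THETA)
```

With these placeholder values, a single strongly weighted input spike brings the neuron above threshold, while the same input arriving shortly after the neuron's own spike does not, which is the refractory effect expressed by η.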

The function ε(s) describes the time course of the postsynaptic response generated by an incoming spike. If the summed effect of several incoming spikes reaches the threshold (θ = 1), an output spike is triggered. Once the neuron has emitted a spike, its membrane potential is set to a very low value for a period called the refractory period, and then it gradually recovers to its resting potential. Notice that during the refractory period spikes can hardly be evoked, whatever the magnitude of the inputs. In Eq.4, the effect ε of an incoming spike on the neuron membrane is a function of the difference s = t - t0 between the current time t and the firing time t0 of the presynaptic neuron. The properties of the function are determined by i) the delay Δ between the generation of a spike at the presynaptic neuron and its time of arrival at the synapse, ii) a synaptic time constant τs, and iii) a membrane time constant τm. The function η(s) in Eq.5 describes the refractory period that starts at the time when the neuron emits a spike. The recovery to the resting level is exponential with the membrane time constant τm.

5. Binding Multiple Sensory Modalities in a New MASNN Structure

In this section, we show that the ASNN10 can be expanded to accommodate various sensory and motor modalities, as shown in Fig.4. The network is a consortium of N ASNNs that handle N types of sensory inputs and motor outputs.

From the figure, each network can keep its minimal structure and can adapt independently, based on the appearance of its stimulus, toward performing its own task, which can be considered a sub-task of the robot's main task. Suppose, for instance, that we have a robot with 3 sensor types (IR sensor, camera, and light sensor), and that the activation of each of these sensors should generate a different motor response. In this case we have 3 different modalities (networks), each of which can operate independently to perform its own task whenever its related stimulus occurs. If one of the stimuli does not exist at training time, its associated network will not be trained. Note that, owing to the network structure, the adaptation mechanism is identical in every network of this circuitry.

[Fig. 4 diagram labels: Modality 1 (S1, S2, S3), Modality 2 (S4, S5), Modality N (SN-1, SN); motor outputs M0, M1, MN.]

Fig. 4. The proposed MASNN with multiple sensory modalities and motor outputs. Each of M0, M1 … Mn is the combination of the output neuron values in the same category.

6. Implementation into a Real Mobile Robot

6.1. A real mobile robot "Khepera-I"

We implemented the proposed controller directly in a physical miniature mobile robot, Khepera-I12 (Fig.5). In addition to the IR sensors that the robot uses for obstacle avoidance, namely the right sensor (RS) and the left sensor (LS), the Khepera is equipped with a linear-camera vision system to locate a stationary target object. The robot's camera has a viewing range of approximately 30° with a 64-pixel line CCD. The visual field is divided into three regions, right, front, and left, each of which feeds its own vision-sensor neuron, RVS, FVS and LVS, respectively. The size of the environment is 60 × 60 cm, and obstacles are distributed in the environment as shown in Fig.5B. The target object is covered by a black sponge, which is easily recognizable by the vision system and unrecognizable by the IR proximity sensors.

[Fig. 5 diagram labels: camera pixel axis marked 0, 22, 42, 63 Pix. under the regions LVS, FVS, RVS; sensors LS, RS; environment of 60 cm × 60 cm containing the robot and the target.]


Fig. 5. (A) The locations of the proximity sensors and motors (wheels) on the Khepera, and the viewing range of its camera. (B) The environment. (C) The miniature mobile robot Khepera equipped with a linear vision system on top.
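A minimal sketch of the three-region split of the 64-pixel camera line described in Section 6.1. The region boundaries (pixels 22 and 42) are read from the axis labels of Fig. 5A and are an assumption, as is the choice of half-open pixel ranges; the function names are hypothetical. Because the target is covered by a black sponge, the sketch takes the region with the lowest mean pixel value as the likely target direction, which is also an assumption about how the pixel values are scaled.

```python
# Region boundaries as half-open pixel ranges (assumed from the
# 0 / 22 / 42 / 63 tick marks of Fig. 5A; not stated in the text).
REGION_BOUNDS = {
    "LVS": (0, 22),   # left vision sensor neuron
    "FVS": (22, 42),  # front vision sensor neuron
    "RVS": (42, 64),  # right vision sensor neuron
}


def region_sums(pixels):
    """Sum the pixel values (0..255) that fall into each camera region."""
    if len(pixels) != 64:
        raise ValueError("expected a 64-pixel camera line")
    return {name: sum(pixels[a:b]) for name, (a, b) in REGION_BOUNDS.items()}


def darkest_region(pixels):
    """Return the region with the lowest mean pixel value.

    The mean is used instead of the raw sum because the assumed regions
    do not all contain the same number of pixels.
    """
    sums = region_sums(pixels)

    def mean_value(name):
        a, b = REGION_BOUNDS[name]
        return sums[name] / (b - a)

    return min(sums, key=mean_value)


if __name__ == "__main__":
    line = [255] * 64
    line[30:35] = [0] * 5          # a dark object in front of the robot
    print(region_sums(line), darkest_region(line))
```

In the paper the region values drive the spike generation of the corresponding vision-sensor neurons; the sketch stops at the region values and does not model the spike generation itself.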

6.2. Neuron properties in a single ASNN

6.2.1. Spike generation by the sensor neurons

For the IR sensors, the probability of emitting a spike at a particular time step (25 steps) is determined by the IR sensor's value (0~1023). That is, the IR sensor's neuron generates spikes randomly if the sensor value exceeds 300 (the proximity to the obstacle is then approximately 2 cm). For the vision sensors, each pixel (0~63) takes a value from 0 to 255 depending on the distance between the robot and the target. Whether a vision-sensor neuron emits a spike at a particular time step is determined by the summation of the group of pixels that covers its region. Lower values associated with a particular region reflect the target location.16

6.2.2. Synapses on the hidden-layer neuron

Each sensory neuron makes a synaptic contact with the hidden-layer neuron (RSW\LSW). These synaptic weights are randomly initialized in [0.0~1.0] and are not modified during the process (see Fig.6).

6.2.3. Synapses on the output-layer neurons

The hidden-layer neuron makes a synaptic contact on each of the output-layer neurons (RMW\LMW). These synapses are randomly initialized and have the properties of STDP. The postsynaptic strengths are modified in accordance with the difference between the average spike-time intervals at the pre- and postsynaptic sites, and with the presynaptic modulatory input from the input-layer neurons. That is, STDP takes place in accordance with TL (Eq.1), where Δtpre refers to the firing rate of the hidden-layer neuron and Δtpost to that of the output-layer neuron. During the lifetime of the robot, the postsynaptic weights are gradually modified so as to drive the value of TL into the short time latency region (STL), [-0.5 ~ +0.5] (Eq.2). This builds a strong synaptic connection, which makes the firing rate of the motor neuron respond quickly to its corresponding sensory input.

6.2.4. Motor control signals

Each output-layer neuron generates the signal for the corresponding motor in proportion to its membrane potential value. As a measure of the membrane potential value, for simplicity, we directly use the spike interval at the synapse instead (Eq.6).

\[ Motor\_value = (0.5 - TL) \times 10 \]   (6)
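The update cycle of Sections 3 and 6.2.3-6.2.4 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: Eq. 1 is read as the difference between the mean inter-spike intervals of the pre- and postsynaptic neurons, the inequality directions of Eq. 2 and the sign inside Eq. 6 are taken as they are read here from the damaged equations, the weight bounds [-10, 10] come from Table 1, and returning TL = 0 when a neuron has fired fewer than two spikes is a choice made only for this example. All function names are hypothetical.

```python
STL = 0.5                    # bound of the short time latency region (Section 6.2.3)
STEP = 0.1                   # weight increment of Eq. 2
W_MIN, W_MAX = -10.0, 10.0   # weight range listed in Table 1


def mean_interval(spike_times):
    """Sum of the inter-spike intervals divided by N - 1, for N spikes."""
    n = len(spike_times)
    if n < 2:
        return None
    intervals = [b - a for a, b in zip(spike_times, spike_times[1:])]
    return sum(intervals) / (n - 1)


def time_latency(pre_spikes, post_spikes):
    """TL of Eq. 1, read as mean pre interval minus mean post interval."""
    pre = mean_interval(pre_spikes)
    post = mean_interval(post_spikes)
    if pre is None or post is None:
        return 0.0  # assumption: too few spikes means no update
    return pre - post


def update_weight(w, tl):
    """Eq. 2: change the weight only when TL lies outside the STL region."""
    if tl < -STL:
        w += STEP
    elif tl > STL:
        w -= STEP
    return max(W_MIN, min(W_MAX, w))


def motor_value(tl):
    """Eq. 6 as read in this sketch (the sign of TL is an assumption)."""
    return (0.5 - tl) * 10


if __name__ == "__main__":
    hidden = [0.0, 1.0, 2.0, 3.0]   # presynaptic (hidden-layer) spike times
    motor = [0.0, 2.0, 4.0, 6.0]    # postsynaptic (output-layer) spike times
    tl = time_latency(hidden, motor)
    print(tl, update_weight(1.0, tl), motor_value(tl))
```

Calling update_weight once per time interval reproduces the gradual, fixed-step adaptation described in the text: the weight stops changing as soon as TL falls inside [-0.5, +0.5].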

7. Experimental Results

7.1. MASNN for complex robot tasks

We set up an experiment to investigate the validity of the MASNN controller (Fig.4) in adapting various sensory and motor modalities toward generating autonomous behavior on a physical robot in an open environment. For simplicity, the network shown in Fig.4 was reduced to the network shown in Fig.6, which is a consortium of two ASNNs. The first network, "Network A", performs the navigation and obstacle avoidance behaviors using the right and left IR proximity sensory inputs. The second, "Network B", locates a target object using the vision sensory inputs. The robot's task was to locate a stationary target object within the environment while learning the obstacle avoidance behavior. The generation of spikes from both types of sensory input depended on the sensor values, as explained in section 6.2. If the value of a sensor reaches a predefined threshold, its related sensory neuron randomly fires spikes within a certain period of time.

[Fig. 6 diagram labels: Network A (Modality 1) with inputs LS, RS; Network B (Modality 2) with inputs LVS, FVS, RVS; each network has SN, HN and MN layers with weights RSW, LSW, RMW, LMW ("A" and "B"); outputs LM, RM. Legend: STDP synapses, presynaptic modulation.]

Fig. 6. Two combined ASNNs. Network A is for navigation and obstacle avoidance behavior, while Network B is for locating the target object. RSW: right presynaptic weight, LSW: left presynaptic weight, RMW: right postsynaptic weight, LMW: left postsynaptic weight.
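The binding of the two networks at the motors can be sketched as follows. The paper states that the actual motor values are the combination (summation) of the motor values produced by Network A and Network B; the data layout and the function name below are hypothetical and serve only to illustrate that summation for any number of networks.

```python
def combine_motor_values(per_network):
    """Sum the left and right motor values contributed by every network.

    per_network is a list of dicts such as {"LM": 3.0, "RM": -1.0},
    one dict per ASNN (hypothetical layout chosen for this sketch).
    """
    left = sum(values["LM"] for values in per_network)
    right = sum(values["RM"] for values in per_network)
    return {"LM": left, "RM": right}


if __name__ == "__main__":
    network_a = {"LM": 3.0, "RM": -1.0}   # e.g. obstacle on the left
    network_b = {"LM": 2.0, "RM": 4.0}    # e.g. target seen on one side
    print(combine_motor_values([network_a, network_b]))
```

Because each network only contributes its own pair of motor values, adding a further modality in this sketch amounts to appending one more entry to the list.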

Both networks in Fig.6 operated simultaneously during the robot's navigation in the environment, and the activation of each network relied on the activation of its own sensory neurons (i.e., the existence of its stimulus). The activities of Networks A and B are shown separately in the left and right columns of Fig.7, respectively. We also show the robot's behavior during three randomly selected time periods, I, J and K, which are marked by the dotted areas in Fig.7. The actual motor values of the robot were the combination of the motor values in Fig.7G&H. The synaptic weights in both networks were initialized randomly, and the robot was left to navigate freely in the environment until it located the target.

The left (right) column of Fig.7 shows the proximity to the wall (target), the synaptic weights, the pre- and postsynaptic neuron firing rates, and the motor values of Network A (B) during the robot's navigation in the environment for 30 min. From Fig.7C&D, it is clear that the synaptic weights are gradually adjusted whenever the robot gets close to the obstacles and/or the target, Fig.7A(B) (see the arrows in the I, J, K areas). The changes in synaptic weights force the corresponding postsynaptic neurons, LMF and RMF, to fire with greater sensitivity to the firing time of the presynaptic neuron HF, Fig.7E(F). This sensitivity between the firing rates of the pre- and postsynaptic neurons strengthens the synaptic connection on each side of each network, and thus automatically generates the desired behavior in the mobile robot. For example, in the obstacle avoidance behavior of Network A, we can observe gradual changes in the firing rate of the left motor neuron LMF (right motor neuron RMF), which becomes sensitive to the firing rate of the hidden-layer neuron HF whenever LS (RS) is activated. These changes force the LM (RM) to respond faster than the RM (LM) to the LS (RS) activation. The robot therefore incrementally learns to turn right (left) to avoid left (right) obstacles, by increasing the wheel speed on the side of the obstacle and decreasing the wheel speed on the opposite side, and keeps moving forward in the environment. A similar mechanism, with the opposite motor response, also occurs in Network B, since the presynaptic modulation from the vision sensory inputs has the opposite connection to the motor neurons. Thus, by adjusting the synaptic weights in Network B, the robot incrementally learns to turn right (left) or to move forward toward a target on its right (left) or front side.

The learning of the behaviors of Networks A and B can be observed in Fig.8A&B, respectively. Each data point in these two figures is the average over 5 runs, with different initial synaptic weights and robot positions, of the root mean square (RMS) values of the robot's proximity and vision sensors. From Fig.8A, it can be seen that at the initial time the robot often collides with the obstacles, rms > 400 (i.e., the robot is at a distance of less than 1.5 cm from an obstacle), but within a short period of time the behavior is incrementally updated and the robot learns to keep a certain distance away from the obstacles, rms < 400. Simultaneously, in Network B, whenever the target is recognized by the camera, the robot incrementally corrects its movement toward the target. The rms therefore increased to a value > 380 (target less than 1 cm from the robot). In this experiment, the robot succeeded in performing the task and reached the target in less than 30 min.

7.2. Comparison with conventional GA

We tested whether the STDP mechanism for weight changes that worked in the proposed MASNN structure (Fig.6) would be comparable to genetic evolution, which involves a population of individuals at every generation. We applied the same network structure to both STDP and the GA. In the GA, each synaptic weight was encoded as a gene of four bits. The population in a generation was 20, and each individual was evaluated by a fitness function Ф (Eq.7) for a lifetime of 6 sec (plus a 2 sec random walk between generations). The same kind of fitness function was also used to evaluate the behavioral quality of the robot using STDP.

\[ \Phi = \frac{1}{T} \sum_{t=0}^{T} (OA \times TR) \]   (7)

\[ OA = [RM + LM] - [RS + LS] \]   (8)

\[ TR = [RVS + FVS \times 2 + LVS] + 1 \]   (9)

Here OA stands for the obstacle avoidance behavior and TR for the target locating behavior. RM and LM are the motor speeds, RS and LS are the IR proximity sensor values, and RVS, FVS and LVS are the vision sensor values. The best fitness occurs if both wheels rotate in the forward direction while the robot is far enough from the obstacles and directly facing the target object. The learning processes of both STDP and the GA are shown in Fig.9, which illustrates the average fitness values as well as the standard errors over 3 runs. As can be seen in the figure, at least 55 generations are required to evolve a good individual, which corresponds to approximately 146 min (55 generations × 20 individuals × 8 sec per individual ≈ 8800 sec). This is much longer than the time required for the robot with STDP to adapt to its environment and reach the target object.
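The fitness evaluation of Eqs. 7-9 and the GA time estimate can be sketched as below. This is only an illustration: the operators inside Eqs. 8 and 9 are taken as they are read here from the damaged equations (motor sum minus IR sum, and the weighted vision sum plus one), which is an assumption; the average in Eq. 7 is computed over the number of recorded samples; and the tuple layout and function names are hypothetical.

```python
def obstacle_avoidance(rm, lm, rs, ls):
    """OA of Eq. 8 as read in this sketch: fast wheels and low IR readings score high."""
    return (rm + lm) - (rs + ls)


def target_reaching(rvs, fvs, lvs):
    """TR of Eq. 9 as read in this sketch: the front vision region counts twice."""
    return (rvs + fvs * 2 + lvs) + 1


def fitness(samples):
    """Eq. 7: average of OA * TR over the samples recorded during a lifetime.

    Each sample is a tuple (rm, lm, rs, ls, rvs, fvs, lvs); this layout is
    an assumption made for the example.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    total = 0.0
    for rm, lm, rs, ls, rvs, fvs, lvs in samples:
        total += obstacle_avoidance(rm, lm, rs, ls) * target_reaching(rvs, fvs, lvs)
    return total / len(samples)


def ga_minutes(generations, population=20, seconds_per_individual=6 + 2):
    """Evaluation time of the GA: every individual is tested in every generation."""
    return generations * population * seconds_per_individual / 60


if __name__ == "__main__":
    print(fitness([(5, 5, 0, 0, 0, 1, 0)]))
    print(round(ga_minutes(55), 1))
```

The second function reproduces the time estimate quoted in the text from the population size and the per-individual lifetime given in Section 7.2.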

[Fig. 7 plots, panels (A) to (K). Horizontal axis: Time (min), 0 to 30, with the periods I, J and K marked. Vertical axes: Proximity to Wall (A, traces IR LS and IR RS), Proximity to Target object (B, traces LVS, FVS, RVS), Synaptic Weight (C, D; traces LSW, RSW, LMW), Firing Rate (E, F; traces HF, LMF, RMF), Motor Value (G, H; traces LM, RM). Panels (I), (J), (K): robot paths.]

Fig. 7. Events recorded during the adaptation process for Networks A (left) and B (right) during 30 min of robot navigation. Data were recorded every 30 sec. A & B show the values of the left and right IR proximity sensor inputs and of the left, front and right vision sensor inputs, respectively. C & D show the changes in the pre- and postsynaptic weights. E & F show the firing rates of the hidden-layer neuron HF and of the right and left postsynaptic neurons, RMF and LMF. Synaptic weights are modified if and only if one of the sensory inputs is activated and the difference between the pre- and postsynaptic neuron firing rates is outside the STL range. G & H show the left and right motor values. The sum of the G and H values represents the actual movement of the robot. I, J and K show the path of the robot in the environment within the periods marked by the dotted areas in panels (A~H).

The results in this section indicate that the robot with the proposed controller has two significant advantages over the conventional genetic approach. First, the adaptation is achieved in a single individual, regardless of the complexity of the network, since in such a combination of networks each network is adapted independently. Second, the number of trials and errors is minimal, because the correction of the synaptic weights in each network takes place only when the robot senses danger in its behavior, i.e., when it moves toward an obstacle or away from the target object.

We also compared the ability of the MASNN with that of the network introduced in our previous work (see Ref. 16 and Fig.10), where a self-organizing algorithm for SNNs had been applied to an autonomous mobile robot performing a similar task. The currently proposed network improves upon the previous one in three main aspects (see Table 1). It is clear from the table that the proposed MASNN architecture with STDP helps to simplify the network structure as well as to decrease the computational time: i) there is no conflict between networks, since each network is updated independently; ii) the synaptic weights are updated only if the stimulus exists (i.e., an obstacle or the target).

[Fig. 8 plots, panels (A) and (B): RMS versus Time (mins), 0 to 30.]

Fig. 8. (A) and (B) show the RMS of the IR proximity sensors and of the vision sensors, respectively, obtained on the physical Khepera robot. Each data point in A & B is the average of 5 runs with different random synaptic weights. The robot gradually reaches the target (the RMS value increases as the robot gets close to the target, as shown in B) while simultaneously performing the obstacle avoidance behavior (the RMS value decreases as the robot keeps its distance from the obstacles, as shown in A). The dashed horizontal line marks the threshold at which the robot hits an obstacle (A) or reaches the target object (B).

Table 1. A comparison between MASNN and our previous work

Network architecture
  MASNN: See Fig. 6. Two independent networks combined at the motor neurons. 2 neurons in the hidden layer.
  Previous work (Ref. 16): See Fig. 10. One single network. 8 fully connected neurons in the hidden layer.

Weight updating mechanism
  MASNN: Weight range [-10, 10]; updating rate ±0.1 based on the TL between the pre- and postsynaptic firing rates (STDP).
  Previous work (Ref. 16): Self-adaptation of the synaptic connections (On/Off) based on a predefined rule.

Time to complete the task
  MASNN: Less than 30 min to reach the target.
  Previous work (Ref. 16): 40~45 min to reach the target.

[Fig. 9 plots: (A) Fitness Ф versus Time (min); (B) Fitness Ф versus Generations.]

Fig. 9. (A) The adaptation process of the proposed MASNN (Fig.6) adapted by STDP. The fitness value is shown every 30 sec. (B) The evolutionary process of the MASNN with the GA. Both graphs show the average and standard errors of three runs. As (B) shows, it is hard for the GA to adapt both networks.

[Fig. 10 diagram labels: IR Sensor Input, Vision Sensor Input, Hidden Layer, Motors Output.]

Fig. 10. Architecture of the neural network used in the previous study. White circles represent excitatory neurons, while black circles represent inhibitory neurons. The neurons in the hidden layer are fully connected (adapted from Ref. 15).

8. Conclusion and Future Direction

At present, there is no rationale for identifying the SNN structure necessary for a given task on a physical mobile robot. Most of the earlier reported works therefore determined the structure experimentally or arbitrarily, which probably results in slow adaptation and unnecessary calculations. In our earlier work (Ref. 10), we presented a minimal SNN structure inspired by the withdrawal reflex circuit of Aplysia, which acts as a simple adaptive controller in an autonomous mobile robot. A physical mobile robot with that controller acquired the desired behaviors, navigating and avoiding obstacles in an open environment, in less time than that needed by a GA. In this study, we investigated the validity of a novel network structure that deals with multiple sensory and motor modalities by combining two ASNNs. The input layer thus has two types of sensory input: IR proximity sensors and vision sensors. This expanded version generated the robot's movement in real time and performed the robot task(s) in a short period of time, as shown in experiment 1. In experiment 2, the proposed MASNN adapted by STDP outperformed the one evolved by a GA, both in time and in behavioral quality, while using minimal neural circuitry. In conclusion, the experimental results highlight the flexibility of our simple model: it can be expanded to support a much larger number of sensory inputs and motor outputs for performing various robot tasks, as illustrated in Fig. 4. Generalization in this manner is planned for future investigation.

For future work, it is also essential to incorporate short-term and long-term plasticity in such a network in order to balance the stability and plasticity of more complex artificial systems and environments. One possibility is to use an SNN ensemble that is incrementally assembled. A constructive algorithm for SNN ensembles, similar to the one for conventional neural network ensembles (Ref. 17), is now under investigation. We are currently constructing a more complex MASNN network for building an autonomous reflex mechanism in an office-like robot's arm (Fig. 11), which has many more sensor modalities and various motor outputs.

[Fig. 11 diagram labels: Hand Sensor, Arm Sensor, Shoulder Sensor, Hidden Neuron, Motor Neuron, Motor, Synapse Connection.]

Fig. 11. A possible, more complex MASNN structure for an autonomous reflex mechanism in a mobile robot.

Acknowledgements

This work was supported in part by grants to KM from the Japan Society for the Promotion of Science (JSPS) and the University of Fukui.

References

1. D. Floreano and C. Mattiussi, Evolution of Spiking Neural Controllers for Autonomous Vision-Based Robots, Lecture Notes in Computer Science 2217 (2001) 38-61.
2. W. Maass, Networks of spiking neurons: the third generation of neural network models, Neural Networks 10(9) (1997) 1659-1671.
3. D. Floreano and C. Mattiussi, Bio-Inspired Artificial Intelligence: Theories, Methods, and Technologies, MIT Press (2008).
4. W. Gerstner, A.K. Kreiter, H. Markram and A.V.M. Herz, Neural codes: firing rates and beyond, in Proc. of the National Academy of Science of USA 94(24) (1997), pp. 12740-12741.
5. J.J. Hopfield and C.D. Brody, What is a moment? Transient synchrony as a collective mechanism for spatiotemporal integration, in Proc. of the National Academy of Science of USA 98(3) (2001), pp. 1282-1287.
6. F. Worgotter and B. Porr, Temporal sequence learning, prediction, and control – A review of different models and their relation to biological mechanisms, Neural Computation 17(2) (2005) 245-319.
7. D. Floreano, J.C. Zufferey and J.D. Nicoud, From Wheels to Wings with Evolutionary Spiking Neurons, Artificial Life 11(12) (2005) 121-138.
8. N. Kubota and H. Sasaki, Genetic algorithm for a fuzzy spiking neural network of a mobile robot, in Proc. IEEE International Symposium on Computational Intelligence in Robotic and Automation (2005), pp. 321-326.
9. T. Natschläger, B. Ruf and M. Schmitt, Unsupervised Learning and Self-organization in Networks of Spiking Neurons, in Self-Organizing Neural Networks (Springer Series on Studies in Fuzziness and Soft Computing, Springer-Verlag, Heidelberg, 2001), pp. 45-73.
10. F. Alnajjar and K. Murase, A simple Aplysia-like spiking neural network to generate adaptive behavior in autonomous robots, Adaptive Behavior 14(5) (2008) 306-324.
11. L. Squire, F. Bloom, S. McConnell, J. Roberts, N. Spitzer and M. Zigmond, Fundamental Neuroscience (2nd edn.), Academic Press (San Diego, 2003) 1277-1283.
12. F. Mondada, E. Franzi and P. Ienne, Mobile Robot Miniaturization: A tool for investigation in control algorithms, in Proc. 3rd Int. Conf. Experimental Robotics (Japan, 1994), pp. 501-513.
13. F. Alnajjar, I.B.M. Zin and K. Murase, A Hierarchical Autonomous Robot Controller for Learning and Memory: Adaptation in Dynamic Environment, Adaptive Behavior (2009), in press.
14. H. Yao and Y. Dan, Stimulus Timing-Dependent Plasticity in Cortical Processing of Orientation, Neuron 32(2) (2001) 315-323.
15. S. Song, K.D. Miller and L.F. Abbott, Competitive Hebbian Learning through Spike-Timing Dependent Synaptic Plasticity, Nature Neuroscience 3 (2000) 919-926.
16. F. Alnajjar and K. Murase, Self organization of spiking neural network that generates autonomous behavior in a real mobile robot, Int. J. Neural Systems 16(4) (2006) 229-239.
17. Md. Monirul Islam, X. Yao and K. Murase, A constructive algorithm for training cooperative neural network ensembles, IEEE Trans. Neural Networks 14(4) (2003) 820-834.
18. W. Gerstner, R. Ritz and J.L. van Hemmen, Why spikes? Hebbian learning and retrieval of time-resolved excitation patterns, Biol. Cybern. 69 (1993) 503-515.