Chaotic Neurodynamics for Autonomous Agents

Chaotic Neurodynamics for Autonomous Agents Derek Harter Member, Robert Kozma Senior Member Division of Computer Science, University of Memphis, TN, USA

Abstract— Mesoscopic level neurodynamics studies the collective dynamical behavior of neural populations. Such models are becoming increasingly important in understanding large-scale brain processes. Brains exhibit aperiodic oscillations with much richer dynamical behavior than fixed-point and limit-cycle approximations allow. Here we present a discretized model inspired by Freeman’s K-set mesoscopic level population model. We show that this version is capable of replicating the important principles of aperiodic/chaotic neurodynamics while being fast enough for use in real-time autonomous agent applications. This simplification of the K model provides many advantages, not only in efficiency but also in simplicity and in the ease with which its dynamical properties can be analyzed. We study the discrete version using a multi-layer, highly recurrent model of the neural architecture of perceptual brain areas. We use this architecture to develop example action selection mechanisms in an autonomous agent. Index Terms— neurodynamics, chaos, dynamic memory, autonomous agent

I. INTRODUCTION

A. Connectionist Models of Spatio-Temporal Neural Dynamics

Recent biologically inspired control architectures for adaptive agents utilize complex spatial and temporal dynamics to model cognition. Clark [1] categorizes such biologically inspired architectures as third generation connectionist models. Third generation connectionist models are characterized by increasingly complex temporal and spatial dynamics. More complex temporal dynamics are due, in part, to the use of feedback and recurrent connections in the models. Complex spatial dynamics are seen in the variety of connectionist architectures produced, usually meant to capture some aspect of the architecture of biological brains. Such simulations are no longer strictly three-layered, with input, hidden and output layers, but have many layers connected with specialized and complex relations. Examples of third generation connectionist models include the DARWIN series produced by Edelman’s research associates [2] and the Distributed Adaptive Control (DAC) models of Verschure and Pfeifer [3], [4]. Neural networks with recurrent connections are widely used in the literature. Such architectures have the potential of producing complex behavior, including chaos. However, the operating range of these systems has been predominantly selected in the fixed-point regime; see e.g., [5], [6]. This research has contributed to the explosive growth of neural networks with powerful generalization capabilities. More recently, chaotic models of neural processing have been introduced by a number of researchers. Biologically plausible dynamical models of neural systems have been developed for example in [7], [8],

[9], [10], [11]. Chaotic models have also been established in the field of computational neural networks [12], [13], [14], [15], [16], [17]. These works emphasized chaos control, meaning the suppression of chaos in the models [18], [19]. Some researchers in dynamical cognition and neurodynamics have discussed the role that aperiodic, chaotic-like dynamics may play in adaptive behavior [20], [21], [22], [23], [24]. Chaotic dynamics have been observed in the formation of perceptual states of the olfactory sense in rabbits [20]. Mathematical theories of the nonconvergent neurodynamics of perception and decision making have been proposed based on the principles of olfactory neurodynamics [25], [26]. Other researchers have analyzed activity patterns of primate and human cortex and reported on the dynamics of large-scale neural organization [27], [28], [29]. Hardware implementation of the proposed dynamical principles has been reported on VLSI circuitry [30]. Skarda and Freeman [20] have speculated that chaos may play a fundamental role in the formation of perceptual meanings. Chaos provides the right blend of stability and flexibility needed by the system, with swift and robust transitions from one cognitive state to another using first order phase transitions. According to Skarda and Freeman, the normal background activity of neural systems is a chaotic state. In the perceptual systems, input from the sensors perturbs the neuronal ensembles from the chaotic background. The result is that the system transitions into a new attractor that represents the meaning of the sensory input, given the context of the state of the organism and its environment. The normal chaotic background state, however, is not like noise. Noise cannot be easily stopped and started, whereas chaos can switch essentially immediately from one attractor to another. This type of dynamics may be a key property in the flexible production of behavior in biological organisms.
Based on the neurophysiological findings, Freeman [31], [32], [20] has developed a model of the chaotic dynamics observed in the cortical olfactory system, called the K-sets. K-sets have been used successfully for dynamic memory designs and for robust classification and pattern recognition [21], [33], [24], [34]. Principe and colleagues have developed a discrete implementation of Freeman’s K model using gamma processing elements followed by a nonlinearity. This approach has proved very efficient in transforming the K model to a discrete formalism that allows obtaining a solution without the need for Runge-Kutta integration. Based on this approach, efficient and accurate solutions have been obtained both on digital computers and in VLSI hardware domains [30], [35]. Discrete models of dynamical systems are widely used in the literature, and they provide an alternative to continuous time systems by


solving the discretized equations by recursive iterations; see, e.g., [36], [37]. In the present work we introduce an alternative discrete approach for solving Freeman’s K models, called the KA model. In KA we introduce a second order time difference equation to describe the dynamics of the basic processing elements, called KA-0. On this basis we build higher-level discrete KA-I, KA-II, and KA-III models. We solve the difference equation directly, without the need for Runge-Kutta integration.

B. Introduction to K Sets

The K-set dynamics are designed to model the dynamics of the mean field (i.e., average) amplitude of a neural population. A nonlinear, second order, ordinary differential equation was developed to model the dynamics of such a population. The parameters for this equation were derived by experimentation and observation of isolated neural populations of animals prepared through brain slicing techniques and chemical inhibition. The isolated populations were subjected to various levels of stimulation, and the resulting impulse response curves were replicated by the K-set equations. The basic ODE of a neural population in the K model is:

  (1/(αβ)) [ d²a_i(t)/dt² + (α + β) da_i(t)/dt ] + a_i(t) = net_i(t)    (1)

In this equation a_i(t) is the activity level (mean field amplitude) of the ith neural population. α and β are time constants (derived from observing the responses of biological populations to various amounts of stimulation). The left side of the equation expresses the intrinsic dynamics of the K unit (which captures a neural population’s characteristic responses). The right side of the equation is the external network input to the population, net_i(t). Stimulation between populations is governed by a nonlinear transfer function. The nonlinear transfer function used in the K models is an asymmetric sigmoid, again derived through measurements of the stimulation between biological neural populations:

  net_i(t) = Σ_j w_ij o_j(t)    (2)

  o_j(t) = ε {1 − exp[ −(e^{a_j(t)} − 1)/ε ]}    (3)

where ε is a parameter that indicates the level of arousal in the population (high values indicate a more aroused, motivated state), and a_j(t) is the activation of the jth population connected to the target unit. The asymmetry is an important property of the transfer function, as it means that excitatory input causes a destabilization of the dynamics of networks. This destabilization is essential in the collapse of aperiodic attractors observed in biological perceptual systems. These equations model the dynamic behavior of the activity of isolated neural populations. In Freeman’s K model, these are the basic units that are connected together to form larger cooperating components. Two excitatory or inhibitory units together form a K-I set. A K-I excitatory with a K-I inhibitory


pair form a K-II set of four units (see Figure 2). Freeman and associates used these neural population units to construct a model of the olfactory system that replicates the dynamics observed from EEG recordings. Three or more groups of K-II units connected together form a K-III unit. The K-III forms a multi-layer, highly-recurrent neural population model of biological perceptual systems. The K-III model was originally used to replicate the chaotic dynamics observed in the olfactory bulb of rabbits and rats. According to this view, the dynamics of the brain, as modeled by the K-III, is characterized by a high-dimensional chaotic attractor with multiple wings. The wings can be considered as memory traces formed by learning through the animal’s life history. In the absence of sensory stimuli, the system is in a high-dimensional itinerant search mode, visiting various wings. In response to a given stimulus, the dynamics of the system is constrained to oscillations in one of the wings, which is identified with the stimulus. Once the input is removed, the system switches back to the high-dimensional, itinerant basal mode [38]. These results from the study and development of the K models have led to the establishment of a dynamical theory of perception [39]. Recently, a new class of chaotic behavior, called chaotic itinerancy, has been introduced [40], [26], [38], which is related to the dynamical behavior of K-sets. Chaotic itinerancy is observed in high-dimensional dynamical systems with trajectories evolving through successions of “attractor ruins”, with each attractor being destroyed as soon as it is reached, so that the system continuously remains unstable, as in a search mode. Results obtained with the K-III model indicate that the complex, intermittent spatio-temporal oscillations in the K-III are possible manifestations of Tsuda’s attractor ruins and chaotic itinerancy in a biologically plausible neural network model [24], [41].

C.
Motivation of KA Modeling

The K sets are an attempt to model the aperiodic dynamics observed in cortical sensory systems, and to begin to explain how such dynamics contribute to the recognition and learning of sensory patterns in biological brains. Recent work on aperiodic dynamics in cortical systems [42], [43], [21], [44], [45], [46], [38] has begun to move beyond sensory systems to look at how such dynamics may also help us better understand the production of intelligent behavior in biological agents. The motivation behind the KA model is to develop a simplification of the original K-sets that is still capable of producing the essential dynamics, but is simpler and faster and therefore more suitable for use in large-scale simulations of more complete autonomous agent architectures. The KA model is to be used in developing autonomous agents that take advantage of aperiodic dynamics for perception, memory and action. The introduction of the KA model has many possible advantages. The KA simplification uses a discrete difference equation to replicate the original K-set dynamics. The discrete difference equations are more mathematically tractable and analyzable. Besides mathematical analyzability, the KA units are much more efficient. We will show that our KA based


method performs much faster than approximation techniques which solve the ODE equations using, e.g., the Runge-Kutta method. This gain in efficiency allows correspondingly bigger and more complex models to be built, and greatly expands the types of problems that can be investigated using these highly-recurrent neural models. The KA simplification offers units that we believe are at a very useful level of abstraction. The neural population model is more detailed and biologically plausible than standard ANNs and even simpler cellular automata models of neurodynamics. The simplification closely replicates the dynamics of the original K-sets while being simpler and more efficient. First we describe a version of the K-set model that we have developed for use in the creation of adaptive agent control architectures, referred to as KA-sets (K-sets for adaptive agents). We then present simulations using the KA-sets to model some of the important principles of chaotic neurodynamics. Finally we demonstrate the ability of the KA model to generate deterministic chaos and show how the KA units may be used to learn simple behaviors in an autonomous agent.

II. KA MODEL

A. Description

The purpose of the model presented here is to provide elementary units capable of the complex mesoscopic dynamics observed in the brains of biological organisms. These units model the dynamics of populations of neurons, rather than a single neuron. The units presented here are also designed to be computationally efficient, so that they may be used to build real-time control architectures for autonomous agents. At its heart the KA model uses a discrete time difference equation to replicate the dynamics of the original second order ordinary differential equations of the K-sets. A unit in the KA model simulates the dynamics of a neuronal population. Each KA unit simulates an activity level, which represents an average population current density.
The basic form of the difference equation can be given simply as shown in Eq. (4), which states that the current at time step t is a function of the current in the two previous time steps, as well as the external influence from the net input of units connected to the simulated unit:

  a_i(t) = F(a_i(t−1), a_i(t−2), net_i(t−1))    (4)

The evolution equation of the KA unit can be described by three components that are combined to compute the simulated current at time t from the current and the rate of change of the current at time t − 1 and t − 2. These three influences on the simulated current are 1) a tendency to decay back to the baseline steady state deci (t); 2) a tendency to maintain the momentum of the current in a particular direction momi (t); and 3) the influences of external excitation or inhibition as input to the unit neti (t). When isolated neural populations are externally stimulated away from their baseline steady state, once the external stimulation is removed the population experiences an exponential decay back to the baseline. In the KA model the tendency

to return to the baseline steady state is modeled by a decay term. The resting, or baseline, state of the current in these models is defined as an activity level of 0. The effect of decay is described as:

  dec_i(t) = −a_i(t) × α    (5)

Here α is a parameter that indicates the rate of decay. Since the difference is proportional to the current, the effect is that decay is rapid when the activity of the unit is far from the baseline, while the rate of decay slows as the activity approaches the steady state. Neural populations exhibit a certain amount of momentum in the dynamics of their activity over time. In essence, once a population’s current begins to move in a certain direction (positive or negative) it tends to keep moving in that direction for some time, even after any influence pushing it has been removed. In the original K-set models, this was observed when stimulating an isolated brain-slice population. After stimulation ceased, the population rapidly returned to its resting level. However, in the process of decaying back to the baseline it would undershoot and actually go below the baseline steady state for some time before returning to equilibrium. This slight oscillation in the neural populations is what necessitates the use of the second order term in the differential equations, as only second order equations are capable of capturing such oscillatory behavior. The momentum term is needed in the KA difference equation in order to capture this dynamic behavior of the population. Like the differential equation, the momentum term is second order, as it relies on the two previous time steps to calculate its influence. To simulate the momentum of a unit’s activity, we use a function of the previous two time steps, so that the momentum can be based on the rate of change of the activity of the unit. We first define the rate of change of the activity at time t, r_i(t), as the difference between the activity of the unit at time t and the activity at the previous time step t−1. The rate of change at time t is thus:

  r_i(t) = a_i(t) − a_i(t−1)    (6)

With the rate at time t defined, we can describe the momentum as:

  mom_i(t) = r_i(t) × β    (7)

where β is a parameter that controls how much influence the momentum has on the dynamics of the model. β can be thought of as a percentage indicating what portion of the momentum at the present time step should continue into the next time step. The effect of the net input at time t is the same as in the K model and is shown in Equation 8. This is the standard summation of the activity of the input units through a transfer function, multiplied by the connection strengths. The output or transfer function o_j(t) of a KA unit is a function of the activity of the unit. The KA model uses the same asymmetric sigmoid transfer function and summation mechanism as the original K-sets. The transfer function is shown in Equation 9.


TABLE I
KA MODEL VARIABLES

  Variable    Description
  a_i(t)      Simulated activity of the ith population at time t
  dec_i(t)    Difference at time t due to decay to baseline
  mom_i(t)    Difference at time t due to momentum
  r_i(t)      Rate of change of the activity at time t
  o_i(t)      Transfer function of the activity of the ith unit at time t
  net_i(t)    Difference at time t due to external net input

TABLE II
KA MODEL PARAMETERS

  Parameter   Description                       Default
  α           Rate of decay to baseline         0.1505
  β           Rate of momentum                  0.0985
  ε           Transfer function arousal level   5.0
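As a concrete illustration, the asymmetric sigmoid transfer function of Eqs. (3)/(9) can be sketched in Python with the default arousal level from Table II. This is a minimal reading of the equation, assuming the arousal parameter ε scales both the saturation level and the exponent denominator; the function name is ours, not from the paper:

```python
import math

def asym_sigmoid(a, eps=5.0):
    """Asymmetric sigmoid of Eq. (3)/(9): maps population activation a to
    an output, saturating at eps for large positive a."""
    return eps * (1.0 - math.exp(-(math.exp(a) - 1.0) / eps))

# The asymmetry: excitatory (positive) activation produces a larger
# magnitude output than inhibitory (negative) activation of the same size.
pos = asym_sigmoid(1.0)
neg = asym_sigmoid(-1.0)
```

Because the response to positive activation is steeper, excitatory input tends to destabilize network dynamics, which is the property the text above identifies as essential to the collapse of aperiodic attractors.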

The ε parameter is a scaling factor that indicates the level of arousal of the KA unit. Arousal in biological organisms is a function of history and experience, and can vary with things like surprise and familiarity with the current situation.

  net_i(t) = Σ_j w_ij o_j(t)    (8)

  o_j(t) = ε {1 − exp[ −(e^{a_j(t)} − 1)/ε ]}    (9)

The sum of the influences in Equations 5, 7 and 8 represents the total influence that will be applied to the activity of the unit in the next time step:

  a_i(t) = a_i(t−1) + dec_i(t−1) + mom_i(t−1) + net_i(t−1)    (10)

In other words, the activity of a KA unit is a function of the decay and momentum terms along with the influence from the net input of external units. We sum the values from these three influences and add them to the previous activity of the unit to determine the new activity of the KA unit. In Table I we summarize the variables used in the KA model. Table II provides a summary of the KA parameters and the values used in the experiments described in this paper. The decay and momentum rates were determined experimentally by fitting the dynamics of a single KA unit to those of the original K unit under various conditions of stimulation and inhibition. The determination of these time constants is discussed next.

B. Determination of Momentum and Decay Time Constants

We use an empirical method to determine the parameters of the KA model that allow it to closely approximate the original K model dynamics. Keep in mind, however, that the α and β parameters of the two models represent different time constants, and as such are set to different values in the two models. We take as our target the dynamics of a K model unit, and subject it to varying intensities of external stimulation and inhibition, for varying lengths of time. We then find the decay, momentum and other parameters that allow the KA responses

to best approximate the original K model, using a least-squares fit to measure the difference. We subjected a K unit to levels of stimulation ranging from -0.49 to 0.5 in 0.01 increments (intensity = [-0.49:0.01:0.5]). We also varied the time each stimulation level was applied to the K unit from 1 to 50 ms in 1 ms increments (time duration = [1:1:50]). We ran the simulation of the K unit for 500 ms in Matlab 6.5 using Runge-Kutta to solve the ODE, and captured its response to the 5000 different combinations of intensity and time duration. These 5000 time series represented the target dynamics we tuned the KA model to replicate. With the 5000 samples of the K unit dynamics, we then exhaustively searched the decay (α) and momentum (β) parameter space of the KA model to find a combination that replicates the dynamics of these 5000 samples of the K0 unit with a KA0 unit. We applied the same 5000 combinations of intensity and time duration of stimulation to a KA unit for the various α and β values. Through a systematic search we could reduce the difference in the dynamics to an arbitrarily small amount. We used a hill-climbing algorithm in order to zero in on the exact values of the parameters that provided very good approximations of the original dynamics. We found that a decay rate of α = 0.1505 and a momentum of β = 0.0985 produced a good fit of the KA to the K model. The parameter space defined by the momentum and decay parameters forms a smooth function in the KA model, with only one global minimum. This makes it easy to find the appropriate parameters to fit the KA single unit dynamics to the original K0 unit. For example, in Figure 1 we show a part of the decay and momentum parameter space of the KA model. Here we plot decay along the X axis with values ranging from 0.1 to 0.2, and momentum on the Y axis from 0 to 0.5. Color is used to indicate the error in the fit at each point in the α/β parameter space.
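The fitting procedure can be sketched as follows. We implement the single-unit KA recursion (Eqs. 5-7 and 10) and recover the decay and momentum parameters from a target response by a coarse grid search. For a self-contained example the target is generated by the KA recursion itself with known parameters; in the paper the targets were 5000 Runge-Kutta solutions of the K0 ODE. Function names, pulse shape, and grid resolution are our choices:

```python
def ka_trace(alpha, beta, steps=120, pulse=0.5, pulse_len=15):
    """Response of a single KA0 unit to a brief external pulse.
    Implements a(t) = a(t-1) + dec(t-1) + mom(t-1) + net(t-1) (Eq. 10)."""
    a2 = a1 = 0.0  # activity at t-2 and t-1; baseline is 0
    out = []
    for t in range(steps):
        net = pulse if t < pulse_len else 0.0
        dec = -a1 * alpha          # Eq. (5): decay toward baseline
        mom = (a1 - a2) * beta     # Eqs. (6)-(7): momentum
        a = a1 + dec + mom + net
        a2, a1 = a1, a
        out.append(a)
    return out

def sse(xs, ys):
    """Sum squared error between two equal-length time series."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys))

# Target response generated with the fitted parameters of Table II.
target = ka_trace(0.1505, 0.0985)

# Coarse grid search over the (alpha, beta) space. Because the error
# surface is smooth with a single global minimum, a simple search works.
best = min(
    ((a / 1000.0, b / 1000.0)
     for a in range(100, 201, 5)      # alpha in [0.100, 0.200]
     for b in range(0, 201, 5)),      # beta in [0.000, 0.200]
    key=lambda p: sse(ka_trace(p[0], p[1]), target),
)
```

In practice the best grid cell would then be refined with a local hill-climbing step, as the text describes.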
We can see visually that the space is smooth, and there is a global minimum in the error in the area of α = 0.15, β = 0.1. The global minimum depicted in this figure is the place where the momentum and decay parameters of the KA model yield the closest results to the dynamics of a K0 unit. Table II summarizes the parameters of the KA model discovered by the parameter fitting process and used for the simulations and experiments described in the rest of this paper. The arousal level parameter ε is only significant when we have networks of units connected together; it does not affect the dynamics of a single unit in isolation. Since we are using the same asymmetric sigmoidal transfer function in both the K and KA models, we have used a standard arousal level of 5.0 in the experiments described next. Future work is needed to explore the uses of the arousal level in models of cognition, and its possible relation to more global and slow-changing dynamics such as the neuro-chemical processes that affect the dynamics of populations in brains.

C. Learning Mechanisms

In this section we discuss in more detail the learning mechanisms used in the KA multi-layer recurrent neurodynamical models. In the simulations with autonomous agent


Fig. 1. A portion of the α/β parameter space of the KA model. Decay (α), plotted along the X axis, varies here from 0.1 to 0.2. Momentum (β), plotted along the Y axis, varies from 0.0 to 0.5. Intensity indicates the calculated error (using sum squared difference) between the dynamics of the KA0 and K0 unit over the 5000 sample time series. A global minimum is present in the area of α = 0.15, β = 0.1.

architectures we will be using two types of unsupervised learning: Hebbian synaptic weight modification and habituation of unreinforced stimuli.

1) Hebbian Mechanisms in the KA Population Model: The basic idea behind Hebbian mechanisms is that when the activity of two connected neural units co-occurs, the units have some statistical relationship to one another. We can exploit this relationship by increasing the likelihood that in the future, if one of the units is active, the other becomes active. This can be done by increasing the strength of the weight between the units. In other words, units that tend to fire together should have the weights between them strengthened so that they are more likely to fire together in the future. The converse of this rule also holds: if the units do not tend to fire together, then the strength of any connection between them should weaken over time. This simple mechanism defines a type of competitive process among the links between neural units. Hebbian learning is a simple concept, but it is very powerful in shaping the weight space of a neural model to process stimuli. Hebbian mechanisms allow the models to capture statistical regularities in the stimulation patterns that occur in the environment of the organism. In the simplest formal definition of the Hebbian learning mechanism we consider a pre-synaptic node A and a post-synaptic node B connected by a link with weight w_BA. The activity or firing rates of the nodes are represented by the values a_A and a_B respectively. For simple models where the activity of the units a_i is a measure of the mean firing rate of a neuron, we can correlate the activity between the units to determine the difference we wish to apply to the weight as [47]:

  Δw_BA = ε a_A a_B    (11)

Here the proposed change to the weight, Δw_BA, is simply a function of the product of the activity of the pre- and post-synaptic nodes times a learning rate parameter ε. In mean field models where the resting or normal level of the unit is not necessarily 0, we cannot simply use the activity level of the unit. Instead we must look at the firing rate of the unit over some time period. We can determine whether a unit is more or less active by comparing its current firing rate to its normal or average firing rate. The slightly more complex Hebbian rule thus becomes:

  Δw_BA = ε (a_A − ā_A)(a_B − ā_B)    (12)

Here ā_A and ā_B represent the average firing rates of the pre- and post-synaptic nodes respectively. In these firing rate models, notice that the current firing rate can be lower than the average, which can lead to negative, or decreasing, weight changes. This may or may not be what is wanted depending on the type of model being experimented with. For example, it may or may not make sense to strengthen the weight between two units when they both have less than average activity at the same time. The K family of models, including the KA model, are neural population models, not models of single neurons. Therefore the concept of the firing rate of a node is not relevant. The activity level in KA units represents an average current density for the population. However, unlike the simple case, this average population current can change rapidly, since the units are oscillatory in nature, which makes a simple Hebbian equation inadequate for our use. We instead need a notion of the activity of a unit over some time window. In the KA models we use the root mean square to calculate the activity over a time window:

  rms(i, a, b) = sqrt( (1/(b − a)) Σ_{t=a..b} a_i(t)² )    (13)

This states that the root mean square intensity of unit i over the time interval a to b is given by taking the sum of the squares of the activity over the time interval, dividing by the length of the interval, and taking the square root. The root mean square is a better measure of the activity of a unit over a time interval than simply taking the average of the unit’s activity, and it makes comparing the rms of two units more meaningful. Given the definition of the rms to calculate the activity of a unit over an interval, we can define the Hebbian equation for the KA model. Normal Hebbian rules compare the activity of a unit to its average activity. We instead compare the average activity of a unit over an interval to the average activity of some subset population of units. This is necessary because determining a base or average activity is not a straightforward proposition in the K family of neural population models. We therefore determine how the activity of a unit is varying by comparing it to the current average activity of a population. The Hebbian equation used in the KA experiments is given


by:

  Δw_BA = ε × (rms(A, a, b) − rms(sea, a, b)) × (rms(B, a, b) − rms(sea, a, b))    (14)

where rms(sea, a, b) is a spatial ensemble average of some population of units that the units A and B belong to. The learning rate parameter ε is determined experimentally for each simulation by tuning it for optimum performance as defined by the simulation. In a similar manner, the time window used is also determined experimentally for each problem. The normal time window is taken from the current time to some time in the past, which can vary from 50 to 250 time steps.

2) Habituation in the KA Model Experiments: The second learning mechanism used in the following experiments is habituation. Habituation is defined as a diminished response to sensory stimuli that are not reinforced. Sensory signals that are repeatedly encountered but never co-occur with appetitive or aversive signals become diminished in the organism. This phenomenon is very familiar to people: for example, we quickly “tune out” background noise such as an air-conditioner in our environment. Habituation is thus a type of cumulative rule-based process, whereby unreinforced stimuli are iteratively tuned out and ignored by sensory systems. In the KA models, Hebbian learning only occurs when reinforcement signals are generated in the organism. In the following experiments, reinforcement signals are usually hard-coded in the agent such that when it bumps into objects, pain signals are generated which signal opportunities for Hebbian modification. When a reinforcement signal is not currently being produced by the organism, habituation of stimuli is performed. Habituation of stimuli is performed in KA by lessening the strength of connections to neural units that are more active than an average population activity during times of non-reinforcement. The basic weight modification for habituation is defined as:

  Δw_BA = −η |rms(B, a, b) − rms(sea, a, b)|    (15)

Here the habituation change Δw_BA to the weight of a link from A to B is a function of how far unit B’s activity is above or below a spatial ensemble average of some subpopulation of units, times a habituation decay constant η. Again, η is determined experimentally for each simulation by tuning it for optimum performance. For cases where we only want to habituate nodes whose activity is higher than the average (not those that are lower), we can set Δw_BA = 0 if the rms of B is lower than the spatial ensemble average. Hebbian modification and habituation are usually performed on all plastic connections in the simulation. That is to say, some connections in a simulation are not variable, and therefore do not learn and change in response to environmental experiences. The internal links within a KA-II are examples of non-plastic connections in simulations using the KA model. Plastic connections are usually those links between units within a layer of, for example, a KA-III.
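The windowed learning rules of Eqs. (13)-(15) can be sketched as follows. The learning rates here are arbitrary placeholders (the paper tunes ε and η per simulation), and the function names are ours:

```python
import math

def rms(trace, a, b):
    """Root mean square activity over the time window [a, b), Eq. (13)."""
    window = trace[a:b]
    return math.sqrt(sum(x * x for x in window) / len(window))

def hebbian_dw(trace_a, trace_b, sea_trace, a, b, lr=0.01):
    """Weight change of Eq. (14): the connection between units that are
    both more (or both less) active than the spatial ensemble average
    (sea) is strengthened; mixed deviations weaken it."""
    da = rms(trace_a, a, b) - rms(sea_trace, a, b)
    db = rms(trace_b, a, b) - rms(sea_trace, a, b)
    return lr * da * db

def habituation_dw(trace_b, sea_trace, a, b, eta=0.005):
    """Weight change of Eq. (15): unreinforced activity away from the
    ensemble average is habituated, so the weight only ever decreases."""
    return -eta * abs(rms(trace_b, a, b) - rms(sea_trace, a, b))
```

In use, `sea_trace` would be the mean activity trace of the subpopulation containing A and B, and the window [a, b) would span the last 50 to 250 time steps, as described above.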

Fig. 2. The KA hierarchy. The KA-I are a combination of two excitatory or two inhibitory units connected with mutual feedback. The KA-II is a combination of a KA-Ie and a KA-Ii , connected with various weights between them. The KA-II level allows for both positive and negative feedback which can create oscillatory behavior. The KA-III level is a collection of three (or more) KA-II connected with various feedforward and feedback connections. When the three layers of the KA-III are nonhomogeneous, the resulting dynamics of the KA-III system is chaotic.
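A minimal numerical sketch of such a mixed excitatory-inhibitory loop is given below, using the general form of the KA difference equation (previous activation plus decay, momentum, and net-input terms). The parameter values, the use of the last activation change as the momentum input, and the ε in the transfer function are illustrative assumptions, not the paper's tuned settings:

```python
import math

def transfer(a, eps=5.0):
    # Freeman-style asymmetric sigmoid: saturates near 1 for large input,
    # but only mildly below 0 for strongly negative input.
    return 1.0 - math.exp(-(math.exp(a) - 1.0) / eps)

def simulate_ka2(steps=400, alpha=0.1, beta=0.5, scale=0.3):
    """Four KA-style units: 0,1 excitatory (E1, E2); 2,3 inhibitory (I1, I2)."""
    # Connection signs follow the source unit's type: E projects +, I projects -.
    W = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            if i != j:
                W[i][j] = scale if j < 2 else -scale
    a_prev2 = [0.0] * 4
    a_prev = [0.5, 0.0, 0.0, 0.0]  # small impulse on E1
    history = [a_prev[0]]
    for _ in range(steps):
        o = [transfer(v) for v in a_prev]
        a_new = []
        for i in range(4):
            dec = -alpha * a_prev[i]                  # decay toward rest
            mom = beta * (a_prev[i] - a_prev2[i])     # second-order momentum
            net = sum(W[i][j] * o[j] for j in range(4))
            a_new.append(a_prev[i] + dec + mom + net)
        a_prev2, a_prev = a_prev, a_new
        history.append(a_prev[0])
    return history
```

Starting from a small impulse on E1, the excitatory-inhibitory feedback makes the activity decay through a damped ringing rather than a simple exponential relaxation, which is the qualitative behavior the KA-II level is meant to provide.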

III. KA MODEL CHARACTERISTICS

A. Oscillatory Dynamics and KA-II Sets

Freeman [21] postulates ten building blocks of neurodynamics that help to explain how neural populations create the chaotic dynamics of intentionality. The first three principles deal with the formation of non-zero steady-state and oscillatory dynamics through various types of feedback in neural networks with excitatory and inhibitory connections. Figure 2 shows a particular configuration, called the KA-II set, with two excitatory and two inhibitory units connected together. Such a configuration provides excitatory-excitatory, inhibitory-inhibitory and excitatory-inhibitory feedback simultaneously. This is one of the simplest configurations with all possible connections between excitatory and inhibitory units, and it will be used as the basic model of a mixed excitatory-inhibitory population in this paper.

In Figure 3 we compare the behavior of an original K-II with a KA-II. In this comparison all the connections between units are set to the same values in the K and KA models. We can see that the 4 units in the K and KA models maintain similar activity levels. Moreover, each model reaches an approximate steady state after a transient time of around 10 ms. The KA-II configuration, in the vast majority of parameter settings, produces damped or sustained oscillatory behavior. Though some regimes of chaotic behavior may exist in the simple KA-II configuration (see, e.g., [48]), we restrict ourselves here to oscillatory KA-II sets. With such a configuration the KA model is capable of producing oscillatory behavior of varying frequencies, depending on the values of the ten internal weights. Table III gives the parameters and some major properties of three different KA-II sets. wee , wei , wie , wii are

Fig. 3. Comparison of K-II and KA-II (all internal parameters equal). [Figure: top panel, “Original K-II Model Simulation, ee=1.1, ei=0.5, ie=1.0, ii=1.8”; bottom panel, “KA-II Model Simulation” with the same weights. Each panel plots the simulated current of the e1, i1, e2 and i2 units against time.]

TABLE III
KA-II UNITS USED IN KA-III WEIGHT SCALING SIMULATION

Group   wee     wei     wie     wii     mx      σx      f0
1       0.94    1.41    0.80    1.33    -0.25   0.14    31
2       1.05    1.40    0.44    0.05    -0.12   0.30    27
3       1.29    1.27    0.65    1.19    -0.08   0.25    25

the connection weights, mx is the mean and σx is the standard deviation of the simulated current over a given 10-second window (excluding initial transients). The frequency (f0) is determined from the main peak of the power spectrum of the simulated current. Although there are 10 weights, we reduce these to 4 parameters by setting the weights between like pair types to be equal. For example, the weights between excitatory units (from E1 to E2 and from E2 to E1, correspondingly) are set equal and are shown by the value wee in the table; similarly for the 2 inhibitory-inhibitory (wii), 3 excitatory-inhibitory (wei) and 3 inhibitory-excitatory (wie) weights. Figure 4A shows a time series of the first excitatory unit from the first KA-II group in Table III. In Figure 4B we display a state-space representation of the same series, with a time delay of t vs. t + 5. We see a stable limit-cycle oscillation with frequency 31 Hz after the initial transients die out. The three groups shown in the table are naturally oscillatory; that is to say, they oscillate without external stimulation, as shown in the figure. The mean (mx) and standard deviation (σx) shown in Table III are measures of the behavior of the time series after initial transients have been discarded (the first 1000 steps in this case). The dominant frequency (f0) is the frequency at which the KA-II groups oscillate (in simulated cycles per second). The selected three parameter groups in Table III are the results of an extensive parameter search aimed at identifying KA-II sets with strong limit-cycle oscillations at various differing frequencies. In this approach we generated 500 KA-II groups at random, with different wee , wei , wie


Fig. 4. An example of the oscillatory behavior that can be generated by a KA-II configuration. In A) we show a time series of the first excitatory unit from a KA-II configuration, and in B) we display a delayed state-space plot of the same KA-II at t vs. t + 5.

and wii parameters uniformly distributed in the range [-2.0, 2.0]. From these candidates we selected three such that they 1) showed sustained oscillations and 2) oscillated at different, incommensurate characteristic frequencies. In the next section, we link the three KA-II sets into a network and show that, under certain conditions, the incommensurate frequencies compete with each other but none of them wins. As a result, a complex aperiodic oscillation emerges.

B. Chaotic Dynamics in KA-III Sets

Freeman's [21] fourth principle building block of neurodynamics concerns the formation of chaotic background activity: the genesis of chaos as background activity by combined negative and positive feedback among three or more mixed excitatory-inhibitory populations. We demonstrate the production of deterministic chaos by the KA model using the mixed excitatory-inhibitory KA-II populations described in the previous section in Table III. The KA-III set, shown in Figure 2, right, is an example of a configuration of three KA-II groups connected together in order to produce chaotic dynamics. In these simulations, the excitatory units from higher layers have projections to deeper


Fig. 5. An example of a chaotic time series generated by a KA-III configuration. We show the time series of the E1 unit of layer 1 (top), layer 2 (middle) and layer 3 (bottom).


Fig. 6. A state space plot of the activity of the E1 unit of group 1. We plot the activity of the unit at time t vs. the activity at time t + 12.
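Plots like Figures 4B and 6, and the dominant-frequency measurements (f0) reported in Table III, require only a few lines of analysis code; the sampling rate in the test below is an illustrative assumption:

```python
import numpy as np

def delay_coordinates(x, lag):
    """Pairs (x(t), x(t+lag)) for a delayed state-space (return) plot."""
    x = np.asarray(x, dtype=float)
    return x[:-lag], x[lag:]

def dominant_frequency(x, fs):
    """Frequency of the main peak of the power spectrum, ignoring DC."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs[1:][np.argmax(power[1:])]
```

Plotting the two arrays returned by `delay_coordinates` against each other (lag 5 or 12, as in the figures) renders a limit cycle as a closed loop and a chaotic attractor as a non-repeating orbit.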

layers. Recurrent back-projections are also present from lower layers back up to higher layers. These back-projections may have delays associated with them, reflecting the delayed nature of such back-projections in biological neural tissue. Figures 5 and 6 display a time series and a state-space representation generated using this KA-III configuration. A calculation of the first Lyapunov exponent of the time series using Wolf's method [49] shows a strictly positive exponent of around 0.1 for this series, indicating strong chaotic behavior. We now demonstrate the effects of changing the weights between the groups on the calculated Lyapunov exponent. In this simulation, the projection weights between layers were varied from 0% to 100% of their original connection strengths, in 5% increments. Ten simulated time series were generated

Fig. 7. Effects of scaling the excitatory weights between the KA-II layers of the KA-III on the calculated Lyapunov exponent. The intergroup excitatory weights are scaled from 0.0 to 1.0 in 0.05 increments. We show the average calculated Lyapunov exponent for 10 experiments at each scaling factor along with an indication of the variation (error bars). Above the figure are examples of time series and state spaces generated by the KA-III at weight scaling factors of 0.0, 0.6 and 1.0 from left to right respectively.

for each weight setting, and the Lyapunov exponent was calculated on each resulting time series. Figure 7, bottom, plots the effects of scaling the projection weights on the Lyapunov exponent for this KA-III. When the projection weights are reduced to 0% of their original value, the KA-II layers become isolated and no longer affect one another. In this case we observe the damped oscillatory behavior of the KA-II in layer 1 (Figure 7, top left, showing both the time series and a state-space plot of the delayed activity of the unit against itself). At a 100% scaling factor we show the dynamics of the KA-III, where the measured Lyapunov exponent is close to 0.06 (Figure 7, top right). In general, as the projection weights between layers are increased, the behavior of the KA-III becomes incrementally more chaotic. Even very small projection weights between layers are enough to push the damped oscillatory dynamics of a KA-II into a sustained quasi-periodic orbit. Some initial conditions at some scaling factors, however, produce stronger chaotic interactions; for example, at a scaling factor of 0.6 we show one example with a measured Lyapunov exponent of 0.15 (Figure 7, top middle).

C. Comparison of Power Spectra of KA Models and Rat EEG Signals

In the original K model, the purpose of the K-III set was to model the chaotic dynamics observed in rat and rabbit olfactory systems [32], [45], [50]. The K-III set was not only capable of producing time series similar to those observed in the olfactory systems under varying conditions of stimulation and arousal, but also of replicating major power spectrum characteristics of these time series. The power spectrum is a measure of the power of a signal (for example, a time series obtained from an EEG recording of a biological brain) at varying frequencies. The typical power spectrum of a rat EEG (see

Figure 8, top) shows a central peak in the 30-40 Hz range, and a 1/f^α form of the slope. The measured slope of the power spectrum varies around α = −2.0. 1/f^α type power spectra are abundant in nature and are characteristic of critical states, between order and randomness, at which chaotic processes operate. The atypical part of the experimental EEG spectra is the central peak, indicating stronger oscillatory behavior in the γ frequencies. This central peak in the 30-60 Hz range is known as the γ frequency band, and is associated with cognitive processes in biological brains. In Figure 8 we show an example of the KA-III model's ability to replicate these types of dynamics. In particular, the power spectrum analysis (Figure 8, bottom) shows the typical “1/f” power spectrum with a slope of around -2 and a central frequency peak, similar to that produced from the EEG recordings of a rat olfactory bulb.

Fig. 8. The power spectrum of a rat Olfactory Bulb EEG is simulated with the KA-III model (bottom panel: KA-III model, Group 3). The calculated “1/f” slope of the EEG and model is approximately -2.0. Rat OB data from [51].

D. Comparison of KA to K-Sets and FFNNs

In Table IV we compare the features and equations of the K and KA models. The KA model is a discrete, 2nd-order difference equation, as opposed to the original continuous 2nd-order ordinary differential equation of the K-sets. Both the K and KA models use time constants as parameters (α and β) in order to tune the dynamics of the models to those observed from real neural populations. It should be noted, however, that these time constant parameters differ between the two models and will take on different values in order to achieve the same dynamics. Both models use the same net input and asymmetric sigmoidal transfer function to describe the influence of activation passed between the population units. The final item of Table IV shows the total time needed to run a simulation of a K/KA-III that contained a total of 513 units and over 10,000 connections. The simulation was of 10 seconds of activity in the neural model, and both versions were coded and executed using Matlab 6.5 on a 1.0 GHz Pentium class computer. The KA implementation used the discrete equations described in the previous section, while the original K unit implementation uses the Matlab Runge-Kutta method for approximating the solution to the coupled ODEs. The KA model executes the simulation in just under 10 seconds, while the K model takes over three times as long to run the same simulation. We will discuss more results of this type, comparing the efficiency of the two models, in coming sections. The simulations described in the next sections use an implementation ported to C++, which is faster still than the Matlab implementation.

The generic form of the difference equation used in standard FFNN models can be stated as:

ai(t) = F(ai(t − 1), neti(t − 1))    (16)

Here the activation of unit i in an FFNN model is a function of the unit's activation in the previous time step along with the net influence of input from other units in the previous time step. In the vast majority of ANN models, however, the influence of the unit's own activity in the previous time step is ignored, and thus the normal usage simplifies to:

ai(t) = neti(t − 1)    (17)

In other words, the activity of a unit in the next time step depends solely on the net input to the unit from externally connected units. This is a reasonable simplification in strictly feed-forward networks, since only a single time step is being simulated: on the introduction of the input to the first layer, activity simply flows forward in one direction through the network. However, this simplification becomes less useful in the realm of recurrently connected networks, where the dynamics of a unit over time can be simulated, and such dynamics may affect the performance of the network. Most research in recurrent ANNs still uses only the simplified equation. This means that even in recurrent ANN research, the dynamics of the units depend solely on the activity of connected units in the previous time step; the units do not have, nor use, any intrinsic dynamics of their own.

The KA model is a simplification of the K-sets. One of the purposes of both models is to capture the dynamics of an isolated neural population in response to external stimulation. As such, both the K and KA models associate intrinsic dynamics with a neuronal unit, such that in the absence of external stimulation they will continue to modify their activity levels as a function of the passage of time. This can be seen most clearly in the KA difference equations, which include terms that depend on the previous activity of a unit in determining its activity at the next time step, which differentiates these models from the vast majority of ANN modeling.
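The contrast can be made concrete by driving both kinds of unit with a single input pulse and then removing all input; the parameter values below are illustrative:

```python
def ffnn_unit(net_prev):
    # Eq. (17): activity is just the previous step's net input.
    return net_prev

def ka_like_unit(a_prev, a_prev2, net_prev, alpha=0.1, beta=0.5):
    # Second-order update: decay plus momentum terms give the unit
    # intrinsic dynamics that outlive its input.
    return a_prev - alpha * a_prev + beta * (a_prev - a_prev2) + net_prev

# Drive each unit with one pulse, then zero input.
ffnn, ka = [], []
a_f = 0.0
a1, a2 = 0.0, 0.0
for t in range(20):
    net = 0.5 if t == 0 else 0.0
    a_f = ffnn_unit(net)
    a1, a2 = ka_like_unit(a1, a2, net), a1
    ffnn.append(a_f)
    ka.append(a1)
```

The FFNN unit falls silent as soon as its input does, while the KA-style second-order unit keeps evolving, showing the intrinsic damped ringing discussed above.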
Further, both the K and KA models depend on a second order term in order to correctly replicate the


TABLE IV
COMPARISON OF FEATURES AND EQUATIONS OF KA AND K-SET NEURAL POPULATION MODELS

KA: discrete 2nd-order difference equation
    ai(t) = ai(t−1) + deci(t−1) + momi(t−1) + neti(t−1)
    deci(t) = −ai(t) × α
    momi(t) = ri(t) × β
    neti(t) = Σj wij oj(t)
    oj(t) = 1 − exp[−(e^aj(t) − 1)/ε]
    Simulation time: 9 sec.

K: continuous 2nd-order ordinary differential equation
    αβ d²ai(t)/dt² + (α + β) dai(t)/dt + ai(t) = neti(t)
    neti(t) = Σj wij oj(t)
    oj(t) = 1 − exp[−(e^aj(t) − 1)/ε]
    Simulation time: 32 sec.

dynamics of neural populations. The second-order term is necessary in the K-model ODE in order to capture the damped oscillations of neural populations. Similarly, in the KA model, the momentum term depends on the two previous time steps of the activation of the unit in order to capture this type of behavior.

Another difference between the K and KA models on one side, and ANN models on the other, is the form of the transfer function. In all cases, the nonlinearity of the transfer function is an important feature in capturing the nonlinear nature of neural functioning. ANN research uses many different transfer functions, though the most popular is the standard sigmoid used in models with real-valued activations:

oi(t) = 1 / (1 + e^(−ai(t)))    (18)

However, the K and KA models use a particular asymmetric sigmoidal transfer function (Equation 9) that has a firmer basis in biological networks. This asymmetric transfer function was derived by Freeman and associates by studying the nonlinear passing of activation between biological neural populations [50]. The asymmetry is an important property of the transfer function, as it means that excitatory input causes a destabilization of the dynamics of networks. This destabilization is essential in the collapse of aperiodic attractors observed in biological perceptual systems.

In Table V we summarize the comparison of the KA and feed-forward neural network (FFNN) models. Both use discrete difference equations to describe the activity of units and its changes over time. The vast majority of research in standard FFNNs uses a discrete equation that simply depends on the activity of connected units at a previous time step to determine the activity of the unit in the current time step. The KA model (and the original K-sets) model the intrinsic dynamics of isolated neural populations, and both need a second-order term in order to capture these dynamics. In the discrete KA model case, two previous time steps are needed in order to describe the momentum of a neural population. Both FFNN and KA models use nonlinear transfer functions; the form used in the K and KA models is the asymmetric sigmoid, which has a firmer basis in biological observations. The asymmetry is important in the K family of models as it allows for the destabilization of populations of units in response to inputs [21]. The biological

models of the K family of equations are always multi-layered, highly recurrent models that capture the architecture of brain regions. A final difference between the KA and FFNN models is the learning rule. Backpropagation is the main learning mechanism used in standard FFNN research, while the KA and K models use Hebbian learning, habituation and homeostasis to adjust the weight space in simulations [24]. These learning mechanisms have a firmer basis in biology and have been directly observed as processes in brains. The learning mechanisms used by the KA model are discussed more thoroughly in later sections, where we describe simulations using the KA model to control autonomous agents.

IV. KA CONTROL OF AUTONOMOUS AGENT

The continuous K sets have been shown to be good models of olfactory cortical dynamics. They can replicate the complex dynamics and power spectra of biological cortical EEG recordings. The K sets can learn, using unsupervised methods such as Hebbian modification and habituation, to replicate some of the behavior of rabbits when learning new olfactory stimuli. The K sets have also been extended to more abstract domains to demonstrate their use in standard pattern recognition tasks [45], [23], [24]. We are currently extending the KA model to not only perform perceptual tasks, but also to model the complete behavior of an organism, from perception to action and the steps in between [42]. One of the purposes of producing the KA model was to provide a simplified and efficient system still capable of producing the types of dynamics deemed important for biological organisms in producing generally intelligent behavior. The KA model is a discrete version of the original K sets and is used in autonomous agent experiments to replicate and explain the dynamics of cortical systems in organizing and producing behavior.
Because of the efficiency gains made possible by the discrete simplification, much larger neuronal models may be explored in the context of building control mechanisms for autonomous agents. In this section we describe some simple examples of how KA units can be used to produce behavior in autonomous agents. We will show a simple example of learning with the KA units and compare the results to other dynamical neural architectures.


TABLE V
COMPARISON OF FEATURES AND EQUATIONS OF KA AND FFNN MODELS

KA: discrete difference equation
    ai(t) = F(ai(t−1), ai(t−2), neti(t−1))
    neti(t) = Σj wij oj(t)
    oj(t) = 1 − exp[−(e^aj(t) − 1)/ε]
    Architecture: multi-layer, highly recurrent
    Learning: Hebbian, habituation

FFNN: discrete difference equation
    ai(t) = F(neti(t−1))
    neti(t) = Σj wij oj(t)
    oj(t) = 1 / (1 + e^(−aj(t)))
    Architecture: multi-layer, feed-forward
    Learning: backpropagation

A. Learning Object Avoidance Behavior

In this experiment we use a Khepera robotic agent in a virtual environment. The task we choose is similar to that explored in the original Distributed Adaptive Control models of Verschure, Kröse and Pfeifer [52]. Figure 9 illustrates the morphology of the Khepera robot and the internal architecture used to perform the experiment. The Khepera is a simple robot with 8 infra-red distance sensors (labeled DS1−8 in the figure). In this task, the simulated Khepera robot is initially endowed with a set of basic reflexive behaviors that allow it to wander around its environment, bumping into obstacles and turning away from them. For example, if the robot bumps into an object on the left side of its body, it turns to the right until it is no longer touching the obstacle and then attempts to continue forward. We used a virtual simulation of a physical Khepera robot to perform these experiments [53]. The Khepera robot is equipped with two independent motors attached to wheels that allow the robot to move forward, move backward and turn. We use only the 6 forward-facing distance sensors in this experiment.

In Figure 9 we show the architecture used to perform the experiment. KA-0 units are used in the Reflex, Sensory and Motor areas to build the architecture. A set of three reflexive behaviors is hardwired to perform appropriate actions that allow the robot to wander in the environment. The Left Obs and Right Obs reflexes are connected to the three sensors on the left and right sides of the robot respectively. If any of the three connected sensors is at its maximum value (indicating the sensor is touching an obstacle), then the Left Obs or Right Obs unit will be stimulated accordingly. The No Obs unit is similarly connected to the four forward-facing distance sensors, and it is stimulated only when all four sensors are below maximum, indicating that the robot is not bumping into an obstacle in front of its body.
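The reflex wiring just described can be sketched as a pure function from the sensor array to reflex and motor activations. The sensor index groupings below are assumptions about the Khepera layout rather than values taken from the paper:

```python
def reflex_activations(ds, max_val=1.0):
    """Map 8 distance-sensor readings (max_val = touching) to reflexes.

    Index groupings are assumed: ds[0:3] left side, ds[3:6] right side,
    ds[1:5] the four forward-facing sensors.
    """
    left_obs = any(d >= max_val for d in ds[0:3])
    right_obs = any(d >= max_val for d in ds[3:6])
    no_obs = all(d < max_val for d in ds[1:5])
    return {"left_obs": left_obs, "right_obs": right_obs, "no_obs": no_obs}

def motor_command(reflex):
    """Reflex-to-motor wiring: a bump on the left triggers a right turn, and
    so on. The mutual-inhibition deadlock (both sides bumped equally) is
    resolved here by issuing neither turn command."""
    if reflex["left_obs"] and not reflex["right_obs"]:
        return "turn_right"
    if reflex["right_obs"] and not reflex["left_obs"]:
        return "turn_left"
    if reflex["no_obs"]:
        return "move_forward"
    return "stop"
```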
The Left Obs and Right Obs behaviors respond to the robot bumping into an obstacle on the left or right side of its body, respectively. They are hardwired to the Turn Left and Turn Right motor behaviors. For example, Left Obs, which detects the presence of an obstacle on the left, is wired to stimulate the Turn Right behavior in order to turn away from the detected obstacle. In a similar manner, the No Obs reflex, which detects the condition of no obstacle currently impeding the robot, is hardwired to the Move Forward behavior, which causes the robot to move in a forward direction. The Turn Left and Turn Right motor behaviors are wired as would be expected to

Fig. 9. (Bottom Left) The morphology of the Khepera agent, with 8 infrared distance sensors positioned around the body and 2 motors for movement. Above is a graph of the response of the distance sensors (dashed line, labeled DS) and the inverse distance sensors (solid line, labeled DI) to an obstacle. (Center) The internal architecture of the Khepera agent. Reflexes are hardcoded such that the agent moves around and bumps into obstacles in the environment. When the agent bumps into an obstacle, it triggers motor units to turn away from the obstacle and continue in a new direction. Units in the Sensor area gradually learn to trigger avoidance behaviors, avoiding objects at a distance before running into them.

the Left Motor and Right Motor units to produce appropriate left-turn and right-turn behavior. The values of the Left Motor and Right Motor units are read out at discrete intervals to set the speeds of the robot's left and right wheels. The Turn Left and Turn Right behaviors are connected together with mutually inhibitory connections in order to avoid a conflict situation, which can result in an impasse when both left- and right-turn behaviors are equally stimulated. In this experiment, the goal of the agent is to learn to associate long-range distance sensory information with behaviors, so that avoidance behaviors are triggered at a distance, before the agent actually bumps into an obstacle. Therefore the robot's behavioral architecture also contains a set of units connected to the long-range infra-red distance sensors (labeled ’Sensory’ in Figure 9). The distance sensors can sense obstacles at a distance from the robot. Six KA-0 units are connected to the normal output of the distance sensors (DS1−6 connected to S1−6 ), while six other KA-0 units are connected to the inverse of the corresponding distance sensors (DI1−6 connected to


S7−12 ). The inverse of a distance sensor is maximally active when no obstacle is detected, and minimally active when the sensor is right next to an obstacle. Initially, the 12 sensory KA-0 units are fully connected to each other with small random weights (not shown in the figure). The 12 KA-0 units are also fully connected to each of the 3 basic motor behaviors (Turn Left, Turn Right and Move Fwd), again with small random weights. We use Hebbian learning and habituation on the connections between the ’Sensory’ units and from the ‘Sensory’ to the ’Motor’ units. Since these connections are initially random, they typically do not affect the behavior of the robot in the beginning; the reflexes cause the robot to move around in the environment. Later on, the robot may bump into something on its left. This will cause some of the Motor behaviors to be performed, such as turning right. Since the Sensory units that are connected to sensors on the left side of the body have become stimulated while approaching the obstacle, they remain highly active when the right-turn behavior is activated. This allows the connection between the Sensory unit for detection of obstacles on the left and the right-turn behavior to be strengthened by Hebbian modification, because of their co-occurring excitation. Similar strengthening happens between units that sense the absence of obstacles on the right and the right-turn behavior as well. Hebbian modification is only performed in response to collisions, and therefore collisions produce a type of pain valence signal. Habituation is performed at other times, which lessens extraneous responses between the long-range sensors and motor behaviors in the absence of important stimuli. Gradually, the links between the long-range sensors and the motor units become strong enough to activate behavior when an object is sensed at a distance, before the robot actually bumps into it.
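The pain-gated learning scheme reduces to one routine: Hebbian strengthening when a collision signal is present, habituation otherwise. The vectorized form and the constants below are a sketch of our reading of the scheme, not the original implementation:

```python
import numpy as np

def update_weights(W, pre_rms, post_rms, sea, bumped, eps=0.05, eta=0.01):
    """W[i, j]: weight from pre-synaptic unit j to post-synaptic unit i.

    pre_rms/post_rms: windowed rms activities; sea: spatial ensemble average.
    """
    if bumped:
        # Hebbian: strengthen where pre and post both deviate from the
        # spatial ensemble average in the same direction.
        dW = eps * np.outer(post_rms - sea, pre_rms - sea)
    else:
        # Habituation: weaken connections onto units that are more active
        # than the ensemble average (deviation of the post-synaptic unit).
        dW = -eta * np.abs(post_rms - sea)[:, None] * np.ones_like(W)
    return W + dW
```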
Therefore the robot has learned a type of object-avoidance behavior through the coupling of the activity of its sensors with its motor behaviors.

B. Results

In Figure 10 we show the results of learning object avoidance using the architecture and methods described above. In this figure we display the average performance of the robot over 50 independently conducted simulations. We plot both the results with only reflexive behavior (No Learning) and with the Sensory unit connections being modified through Hebbian learning and habituation (Learning). Along the X axis we show the time (in seconds) that the simulation has been running; along the Y axis, the total number of times that the robot has bumped into an object in the environment. In the ’No Learning’ condition, the robot continues to move around and bump into obstacles in the environment. In the Learning condition, the robot quickly begins to avoid objects, and eventually learns to move through the environment without bumping into anything at all. These results are comparable, in terms of performance and learning rate, to those obtained by the original DAC architecture [52]. The KA-0 units using unsupervised learning methods, as shown, can learn to avoid obstacles at a distance. This simple example also shows that KA units can be used to build and control the behavior of autonomous agents.


Fig. 10. Results of Khepera simulation. As time goes by, the robot learns to bump into things less and less. This figure represents the cumulative results of 50 simulations. Time (in seconds) is plotted along the X axis, and the average cumulative bumps is plotted along the Y axis. We show the results without learning (only reflexive behavior) and with learning turned on.

As another example, consider the simple dynamical neural Schmitt trigger [54]. Hülse and Pasemann have shown that a simple architecture of 2 units is capable of producing object avoidance and exploration behavior in a Khepera robot. In their paper they used a genetic algorithm to learn appropriate weights to solve the avoidance and exploration task. Their architecture contains two input units, which receive the average activation from the three left and three right distance sensors respectively, and two motor units. The motor units are connected with mutually inhibitory connections, similar to how our Turn Left and Turn Right motor units are mutually inhibitory. We use the weight settings they evolved and describe in [54] to compare their performance to our KA-0 units on this similarly learned task.

Figure 11 displays a comparison of typical paths generated in an environment by the KA architecture described previously and by the dynamical Hülse-Pasemann Schmitt Triggers (HPST). We use the KA units after they have adequately learned obstacle avoidance, at which point we freeze the weights, analogous to the evolved weights of the HPST. The path of an HPST is shown on the left, while the path produced by the KA units is shown on the right. In general, the KA exhibits performance comparable to the HPST in this environment. For example, we ran 10 simulations each of the HPST and KA architectures, with each trial simulating 60 minutes of activity by the Khepera robot. These results are summarized in Table VI, where we show the distances and standard deviations obtained over the 10 trials for each architecture in this first experiment. The results indicate that the KA traveled a somewhat shorter distance over the same time. The main goal of this section is to demonstrate that KA can perform at the same level, and in some cases better, than alternative control algorithms, like the HPST.
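A mutual-inhibition pair of the kind used in both architectures can be sketched as follows; the weight values are hypothetical stand-ins chosen to produce a clean switch (the actual evolved HPST weights are given in [54] and not reproduced here):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def trigger_step(m1, m2, s_left, s_right,
                 w_in=2.0, w_self=5.0, w_inh=-6.0):
    """One update of two recurrently coupled motor units.

    Strong self-excitation plus mutual inhibition can yield a hysteretic,
    Schmitt-trigger-like switch between the two turn directions.
    """
    o1, o2 = sigmoid(m1), sigmoid(m2)
    n1 = w_in * s_left + w_self * o1 + w_inh * o2
    n2 = w_in * s_right + w_self * o2 + w_inh * o1
    return n1, n2

# Iterate from a neutral state with slightly stronger left-side input.
m1 = m2 = 0.0
for _ in range(50):
    m1, m2 = trigger_step(m1, m2, s_left=0.6, s_right=0.4)
```

After settling, one motor unit saturates high and the other low, so the pair acts as a winner-take-all switch rather than producing the impasse that equal stimulation of uncoupled units could cause.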
This is a proof-of-principle of the feasibility of the K-based control

Fig. 11. A comparison of typical paths created by the Hülse-Pasemann neural Schmitt trigger (Left) and the KA units (Right).

TABLE VI
RESULTS OF KA AND HÜLSE-PASEMANN SCHMITT TRIGGER KHEPERA SIMULATIONS

            Experiment 1            Experiment 2
Arch    dist        std         dist      std-d     time       std-t
KA      246.11 m    1.08        3.62 m    0.04      48.51 s    0.66
HPST    250.52 m    0.27        3.68 m    0.08      51.13 s    1.31

approach. In Figure 12 we study how much time it takes to move from the top of a long corridor to the bottom end. It is seen that the trajectory produced by KA is more smooth, while HPST control gives trajectories with sharp corners. In order to analyze this behavior, an additional experiment has been designed with 10 trials of trajectories. We display the results of 10 trials for the HPST architecture (Left) and the KA architecture (Right), starting at the same location and orientation (the orientation was varied over the 10 trials). The results for the second experiment are summarized in Table VI. By both measures, in this environment, the KA is more efficient because it travels to the end in less time using less distance. This is mainly a result of the form of the path taken by the KA architecture. The KA units trigger the turning behavior in a more smooth manner, and at a greater distance from the obstacles, resulting in smoothed, curved turns. It is not claimed, however, that the KA architecture developed here is in any way superior to the HPST for the given simple task. Other performance criteria, such as area explored and covered or mean times to revisit areas, may give different results. But, given appropriate evaluation functions in the case of the HPST architecture, and value signals for the KA architecture, these differing tasks could be learned equally well by either approach. The dynamics used in this experiment by the KA units are relatively simple. We use a homogeneous collection of KA-0 units. The various recurrent connections, in the ’Sensory’ and ’Motor’ areas do produce KA-I and KA-II level behaviors. The real power of the K and KA family of models comes

Fig. 12. Paths created by 10 trials of the Hülse-Pasemann neural Schmitt trigger (Left) and the KA architecture (Right). We measure how much time it takes to reach the end of the corridor and what distance the agent travels during the traversal.

when we use and exploit chaotic dynamics to form perceptual categories and produce complex learned behaviors. We have begun work along these lines of using such chaotic dynamics in autonomous agents; this research is in progress [55], [56], [57].

V. DISCUSSION

The above task serves to demonstrate that the KA units can effectively be connected together to form the control mechanism for an autonomous agent. The performance of the KA units is comparable to that achieved by Hülse and Pasemann with their HPST for the object avoidance task [54]. The learning of object avoidance by the KA is also comparable with Verschure, Kröse and Pfeifer's results in their original distributed adaptive control experiments [3], [4]. The dynamics of the KA units can be shaped by Hebbian modification and habituation to reliably associate the conditioned stimuli from the long-range sensors with the unconditioned, instinctual motor responses that turn the agent away from collisions. This type of learning is an example of classical conditioning, using an unsupervised learning mechanism to associate stimuli with instinctual behaviors.

We have not yet shown in this simulation how a full implementation of an aperiodic KA-III might be used to form a control mechanism for an autonomous agent. We believe that mechanisms based on the formation and dissolution of an aperiodic attractor landscape have great potential for improving the cognitive abilities of autonomous agents; demonstrating this with KA-III remains our ultimate goal. The performance of the KA units for control in this simulation is by no means meant to be an example of what we believe is ultimately achievable by the application of aperiodic dynamics to the control problem. Much simpler architectures exist that effectively solve the obstacle avoidance problem in complex environments. For example, Hülse and Pasemann [54] show how one can evolve the connection weights between two recurrently connected units to perform obstacle avoidance. The recurrent nature of the connections is also important in our models, as they form the basis for generating the oscillatory and aperiodic dynamics. The ultimate goal of developing the KA model, however, is to explore biologically motivated architectures for complete intentional systems embodied in autonomous agents. In this paper we have demonstrated the basic ability of the KA model to replicate the important dynamics of the original K sets developed by Freeman et al. [32], [45]. The KA model is a discretized simplification of the cortical dynamics first developed to model the sensory systems of biological brains. We are now beginning to extend the original K sets not only to model cortical sensory dynamics, but also to explain the production and selection of behavior in complete autonomous systems. Towards that end, we are using the KA sets to build more complicated architectures that capture pieces of the important areas believed to contribute to basic intentional behavior [46]. In our view of cognition and the production of intelligent behavior, aperiodic dynamics plays an important role in the process. Chaotic dynamics provides many advantages to a system that needs to balance between stability and flexibility in the actions it produces.
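The classical-conditioning scheme discussed above, Hebbian strengthening of co-active sensor-motor pairings combined with habituation, can be caricatured in a few lines. The learning rate, habituation rate, and activation threshold below are illustrative placeholders, not the actual KA learning parameters.

```python
def hebbian_habituation(w, pre, post, eta=0.1, gamma=0.02, theta=0.5):
    """One update of a conditioned-stimulus -> motor association weight.

    w:    connection weight from a long-range sensory (CS) unit to a motor unit
    pre:  activation of the presynaptic sensory unit
    post: activation of the postsynaptic motor unit, driven by the
          unconditioned, instinctual collision response
    eta, gamma, theta: illustrative learning rate, habituation rate,
          and activation threshold
    """
    if pre > theta and post > theta:
        # Hebbian strengthening: co-activation associates the long-range
        # (conditioned) stimulus with the instinctual turn response.
        w += eta * pre * post
    else:
        # Habituation: weights of inactive or unpaired connections
        # slowly decay toward zero.
        w -= gamma * w
    return w

# Repeated pairing of an active sensor with the instinctual turn
# response grows the association; unpaired activity habituates it.
w = 0.0
for _ in range(20):
    w = hebbian_habituation(w, pre=0.9, post=0.9)   # conditioning trials
for _ in range(20):
    w = hebbian_habituation(w, pre=0.9, post=0.0)   # stimulus alone
```

After learning, the long-range sensor alone can drive the turn through the strengthened weight, which is the sense in which the behavior becomes anticipatory rather than purely reactive.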
Aperiodic dynamics have been observed in the sensory cortices of biological brains, and have been speculated to be useful in the sensory recognition process. The K-III and KA-III are capable of replicating the types of dynamics observed in these cortical regions. But perceptual systems alone, though very important, are not the only component necessary for the production of intelligent behavior. In [42] we have speculated on the essential pieces necessary for the production of general intelligent behavior. Besides sensory and motor systems, organisms need at least a basic memory system (provided by the Hippocampus) and a motivational system. There is biological and experimental evidence [21], [33], [58] that the same types of dynamics observed in the perceptual system, and modeled by the original K-III, may also be the essential building blocks used in these other three areas. The K-IV architecture is a model of a complete intentional system, comprising sensory, motor, memory and motivational systems. Each individual system is modeled by some form of a K-III, and the K-III together form a complete agent. We have taken steps towards modeling the complete K-IV. In this paper we presented an example of using KA units to form the perceptual and motor systems. We are currently working on KA-III models for the simulation of Hippocampal functions such as place cell formation and cognitive map

14

building [42], [58], [59], [55], [56], [57]. These steps are essential to better understanding how observed cortical dynamics participate in the production of intentional behavior in biological brains. VI. C ONCLUSION In this work we have developed a discrete time model of neural dynamics in neural networks with excitatory and inhibitory connections. We have built a hierarchy of KA models, starting from the KA-I and KA-II units with fixed point and limit cycle dynamics, to the KA-III model with complex aperiodic dynamics. We have demonstrated the feasibility of generating chaotic oscillations in KA-III and compared the dynamics of the KA model to the original K sets. The developed KA units can be used to build an adaptive autonomous system that explores an environment and generates behavioral strategies in order to solve a given task. The K and KA series of models represent steps to a better understanding of how aperiodic dynamics observed in the cortical systems of biological brains play a part in the production of intelligent behavior. ACKNOWLEDGMENT This work was supported by NASA Intelligent Systems Research Grant NCC-2-1244 and by the National Science Foundation Grant NSF-EIA-0130352.

Derek Harter (Member) is an Assistant Professor of Computer Science and Information Systems at Texas A&M University - Commerce. He received his Ph.D. in 2004 from the University of Memphis for research on neurodynamical models and their applications to autonomous agents. His research interests are in AI, Cognitive Science and the study of Complex Systems.

Robert Kozma (Senior Member '98) holds a Ph.D. in applied physics from Delft University of Technology, The Netherlands (1992). Presently he is Professor of Computer Science, Department of Mathematical Sciences, The University of Memphis, and Director of the Computational Neurodynamics Laboratory. He has published 3 books, over 50 journal articles, and 100+ papers in conference proceedings. His research interests include autonomous adaptive brain systems, mathematical and computational modeling of spatio-temporal neurodynamics, and the emergence of intelligent behavior in biological and computational systems. Dr. Kozma serves on the Board of Governors of the International Neural Network Society (INNS), chairs the Special Interest Group on NeuroDynamics, and is a member of the Neural Networks Technical Committee of the IEEE Computational Intelligence Society. He has been Program Chair and has acted as Program Committee member of a number of international conferences on neural networks, fuzzy systems, and computational intelligence.


REFERENCES
[1] A. Clark, Mindware: An Introduction to the Philosophy of Cognitive Science. Oxford, NY: Oxford University Press, 2001.
[2] G. M. Edelman and G. Tononi, A Universe of Consciousness: How Matter Becomes Imagination. New York, NY: Basic Books, 2000.
[3] R. Pfeifer and C. Scheier, Understanding Intelligence. Cambridge, MA: The MIT Press, 1998.
[4] P. F. M. J. Verschure and P. Althaus, “A real-world rational agent: Unifying old and new AI,” Cognitive Science, vol. 27, no. 4, pp. 561–590, 2003.
[5] S. I. Amari, “Neural theory of association and concept formation,” Biological Cybernetics, vol. 26, pp. 175–185, 1977.
[6] J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proceedings of the National Academy of Sciences, vol. 79, pp. 2554–2558, 1982.
[7] A. Babloyantz and A. Destexhe, “Low-dimensional chaos in an instance of epilepsy,” Proceedings of the National Academy of Sciences, vol. 83, pp. 3513–3517, 1986.
[8] I. Tsuda, “Can stochastic renewal maps be a model for cerebral cortex?” Physica D, vol. 75, pp. 165–178, 1994.
[9] X. Wu and H. Liljenstrom, “Regulating the nonlinear dynamics of olfactory cortex,” Network: Computation in Neural Systems, vol. 5, pp. 47–60, 1994.
[10] I. Aradi, G. Barna, and P. Erdi, “Chaos and learning in the olfactory bulb,” International Journal of Intelligent Systems, vol. 10, pp. 89–117, 1995.
[11] M. A. Sanchez-Montanes, P. Konig, and P. Verschure, “Learning sensory maps with real-world stimuli in real time using a biophysically realistic learning rule,” IEEE Transactions on Neural Networks, vol. 13, pp. 619–632, 2002.
[12] K. Aihara, T. Takabe, and M. Toyoda, “Chaotic neural networks,” Physics Letters A, vol. 144, pp. 333–340, 1990.
[13] Y. V. Andreyev, A. S. Dmitriev, and D. A. Kuminov, “1D maps, chaos, and neural networks for information processing,” International Journal of Bifurcation and Chaos, vol. 6, pp. 627–646, 1996.
[14] L. P. Wang, “Oscillatory and chaotic dynamics in neural networks under varying operating conditions,” IEEE Transactions on Neural Networks, vol. 7, no. 6, pp. 1382–1388, 1996.
[15] R. M. Borisyuk and G. N. Borisyuk, “Information coding on the basis of synchronization of neuronal activity,” BioSystems, vol. 40, pp. 3–10, 1997.
[16] M. Nakagawa, “Chaos associative memory with a periodic activation function,” Journal of the Physical Society of Japan, vol. 67, pp. 2281–2293, 1998.
[17] H. Nakano and T. Saito, “Grouping synchronization in a pulse-coupled network of chaotic spiking oscillators,” IEEE Transactions on Neural Networks, vol. 15, pp. 1018–1026, 2004.
[18] T. L. Carroll and L. M. Pecora, “Stochastic resonance and chaos,” Physical Review Letters, vol. 70, pp. 576–579, 1993.
[19] W. L. Ditto, S. N. Rauseo, and M. L. Spano, “Experimental control of chaos,” Physical Review Letters, vol. 65, pp. 3211–3214, 1990.
[20] C. A. Skarda and W. J. Freeman, “How brains make chaos in order to make sense of the world,” Behavioral and Brain Sciences, vol. 10, pp. 161–195, 1987.
[21] W. J. Freeman, How Brains Make Up Their Minds. London: Weidenfeld & Nicolson, 1999.
[22] W. J. Freeman, R. Kozma, and P. J. Werbos, “Biocomplexity: Adaptive behavior in complex stochastic dynamical systems,” BioSystems, vol. 59, pp. 109–123, 2000.
[23] R. Kozma and W. J. Freeman, “Encoding and recall of noisy data as chaotic spatio-temporal memory patterns in the style of the brains,” in Proceedings of the IEEE/INNS/ENNS International Joint Conference on Neural Networks (IJCNN’00), Como, Italy, July 2000, pp. 5033–5038.
[24] ——, “Chaotic resonance - methods and applications for robust classification of noisy and variable patterns,” International Journal of Bifurcation and Chaos, vol. 11, no. 6, pp. 1607–1629, 2001.
[25] H. Liljenstrom, “Global effects of fluctuations in neural information processing,” International Journal of Neural Systems, vol. 4, pp. 497–505, 1996.
[26] I. Tsuda and A. Yamaguchi, “Singular-continuous nowhere differentiable attractors in neural systems,” Neural Networks, vol. 11, pp. 927–937, 1998.
[27] S. L. Bressler and J. A. S. Kelso, “Cortical coordination dynamics and cognition,” Trends in Cognitive Sciences, vol. 5, no. 1, pp. 26–36, 2001.

[28] H. L. Liang, M. Z. Ding, and S. L. Bressler, “Detection of cognitive state transitions by stability changes in event-related cortical field potentials,” Neurocomputing, vol. 38, pp. 1423–1428, 2001.
[29] O. Manette and M. Maier, “Temporal processing in primate motor control: relation between cortical and EMG activity,” IEEE Transactions on Neural Networks, vol. 15, pp. 1260–1267, 2004.
[30] J. C. Principe, V. G. Tavares, and J. G. Harris, “Design and implementation of a biologically realistic olfactory cortex in analog VLSI,” Proceedings of the IEEE, vol. 89, pp. 1030–1051, 2001.
[31] W. J. Freeman, Mass Action in the Nervous System. New York, NY: Academic Press, 1975.
[32] ——, “Simulation of chaotic EEG patterns with a dynamic model of the olfactory system,” Biological Cybernetics, vol. 56, pp. 139–150, 1987.
[33] W. J. Freeman and R. Kozma, “Local-global interactions and the role of mesoscopic (intermediate-range) elements in brain dynamics,” Behavioral and Brain Sciences, vol. 23, no. 3, p. 401, 2000.
[34] R. Gutierrez-Osuna and A. Gutierrez-Galvez, “Habituation in the KIII olfactory model with chemical sensor arrays,” IEEE Transactions on Neural Networks, vol. 14, pp. 1565–1568, 2003.
[35] D. Xu and J. Principe, “Dynamical analysis of neural oscillators in an olfactory cortex model,” IEEE Transactions on Neural Networks, vol. 23, pp. 46–55, 2000.
[36] C. M. Marcus and R. M. Westervelt, “Dynamics of iterated-map neural networks,” Physical Review A, vol. 40, pp. 501–504, 1989.
[37] L. P. Wang, “On the dynamics of discrete-time, continuous-state Hopfield neural networks,” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 45, no. 6, pp. 747–749, 1998.
[38] I. Tsuda, “Towards an interpretation of dynamic neural activity in terms of chaotic dynamical systems,” Behavioral and Brain Sciences, vol. 24, no. 4, pp. 793–847, 2001.
[39] W. J. Freeman, “The physiology of perception,” Scientific American, vol. 264, no. 2, pp. 78–85, 1991.
[40] K. Kaneko and I. Tsuda, “Constructive complexity and artificial reality: An introduction,” Physica D, vol. 75, pp. 1–10, 1994.
[41] R. Kozma, “On the constructive role of noise in stabilizing itinerant trajectories of chaotic dynamical systems,” Chaos, vol. 13, no. 3, pp. 1078–1090, 2003.
[42] R. Kozma, W. J. Freeman, and P. Erdi, “The KIV model - nonlinear spatio-temporal dynamics of the primordial vertebrate forebrain,” Neurocomputing, vol. 52-54, pp. 819–826, 2003.
[43] D. Harter, R. Kozma, and S. P. Franklin, “Ontogenetic development of skills, strategies and goals for autonomously behaving systems,” in Proceedings of the Fifth International Conference on Cognitive and Neural Systems (CNS 2001), Boston, MA, May 2001, p. 18.
[44] W. J. Freeman, “Olfactory system: Odorant detection and classification,” in Building Blocks for Intelligent Systems: Brain Components as Elements of Intelligent Function, D. Amit and G. Parisi, Eds. Academic Press, 1997, vol. 3, ch. 1, pp. 1–1.
[45] K. Shimoide, M. C. Greenspon, and W. J. Freeman, “Modeling of chaotic dynamics in the olfactory system and application to pattern recognition,” in Neural Systems Analysis and Modeling, F. H. Eeckman, Ed. Boston: Kluwer, 1993, pp. 365–372.
[46] W. J. Freeman, “The neurodynamics of intentionality in animal brains may provide a basis for constructing devices that are capable of intelligent behavior,” in NIST Workshop on Metrics for Intelligence: Development of Criteria for Machine Intelligence, National Institute of Standards and Technology (NIST), Gaithersburg, MD, 2000.
[47] W. Gerstner and W. M. Kistler, Spiking Neuron Models. Cambridge: Cambridge University Press, 2002.
[48] F. Pasemann, “Complex dynamics and the structure of small neural networks,” Network: Computation in Neural Systems, vol. 13, pp. 5–35, 2002.
[49] A. Wolf, J. B. Swift, H. L. Swinney, and J. A. Vastano, “Determining Lyapunov exponents from a time series,” Physica D, vol. 16, pp. 285–317, 1985.
[50] W. J. Freeman and K. Shimoide, “New approaches to nonlinear concepts in neural information processing: Parameter optimization in a large-scale, biologically plausible cortical network,” in An Introduction to Neural and Electronic Networks, Zornetzer, Ed. Academic Press, 1994, ch. 7, pp. 119–137.
[51] L. Kay, K. Shimoide, and W. J. Freeman, “Comparison of EEG time series from rat olfactory system with model composed of nonlinear coupled oscillators,” International Journal of Bifurcation and Chaos, vol. 5, no. 3, pp. 849–858, 1995.
[52] P. F. M. J. Verschure, B. Kröse, and R. Pfeifer, “Distributed adaptive control: The self-organization of behavior,” Robotics and Autonomous Systems, vol. 9, pp. 181–196, 1992.

[53] O. Michel, “Webots v4.0 3-d physics based mobile robot simulator,” www.cyberbotics.com, 2003.
[54] M. Hülse and F. Pasemann, “Dynamical neural Schmitt trigger for robot control,” Lecture Notes in Computer Science, ICANN 2002, vol. 2415, pp. 783–788, 2002.
[55] D. Harter and R. Kozma, “Navigation and cognitive map formation using aperiodic neurodynamics,” in From Animals to Animats 8: The Eighth International Conference on the Simulation of Adaptive Behavior (SAB’04), Los Angeles, CA, July 2004, pp. 450–455.
[56] ——, “Aperiodic dynamics and the self-organization of cognitive maps in autonomous agents,” in Proceedings of the 17th International Florida Artificial Intelligence Research Society Conference (FLAIRS), Miami Beach, FL, May 2004, pp. 424–429.
[57] ——, “Aperiodic dynamics for appetitive/aversive behavior in autonomous agents,” in Proceedings of the 2004 IEEE International Conference on Robotics and Automation (ICRA), New Orleans, LA, April 2004, pp. 2147–2152.
[58] R. Kozma and W. J. Freeman, “Basic principles of the KIV model and its application to the navigation problem,” Journal of Integrative Neuroscience, vol. 2, no. 1, pp. 125–145, 2003.
[59] R. Kozma and P. Ankaraju, “Learning spatial navigation using chaotic neural network model,” in Proceedings of the IJCNN 2003 International Joint Conference on Neural Networks, Portland, OR, July 2003, pp. 1476–1479.
