Encoding Techniques for Complex Information Structures in ...

Encoding Techniques for Complex Information Structures in Connectionist Systems

John Barnden & Kankanahalli Srinivas

MCCS-90-186

Computing Research Laboratory
Box 30001
New Mexico State University
Las Cruces, New Mexico 88003

The Computing Research Laboratory was established by the New Mexico State Legislature, under the Science and Technology Commercialization Commission as part of the Rio Grande Research Corridor.

Encoding Techniques for Complex Information Structures in Connectionist Systems*

John Barnden & Kankanahalli Srinivas

Computing Research Laboratory and Computer Science Department
New Mexico State University
Box 30001-3CRL
Las Cruces, New Mexico 88003-0001, USA

(505) 646-6235/4535

jbarnden/srini@nmsu.edu

Running Head: Connectionist Encoding Techniques

ABSTRACT

Two general information-encoding techniques called "relative-position encoding" and "pattern-similarity association" are presented. They are claimed to be a convenient basis for the connectionist implementation of complex, short-term information processing of the sort needed in commonsense reasoning, semantic/pragmatic interpretation of natural language utterances, and other types of high-level cognitive processing. The relationships of the techniques to other connectionist information-structuring methods, and also to methods used in computers, are discussed in detail. The rich inter-relationships of these other connectionist and computer methods are also clarified. We detail the particular, simple forms that the relative-position encoding and pattern-similarity association techniques take in our own connectionist system, called Conposit, in order to clarify some issues and to provide evidence that the techniques are indeed useful in practice.

* This work was supported in part by grant AFOSR-88-0215 from the Air Force Office of Scientific Research, to Barnden, and grant NAGW-1592 under the Innovative Research Program of the NASA Office of Space Science and Applications, to Barnden and C.A. Fields.

1. INTRODUCTION

Our purpose is twofold: to present and discuss two somewhat atypical techniques for short-term information structuring in connectionist systems, and to use them as a starting point for a detailed examination of the space of such techniques. This examination also brings in the connections to basic information-structuring techniques used in computer science, connections which are often ignored and whose significance is underestimated.

The issue of short-term information structuring is one of the major problems facing the application of connectionism to "high-level cognitive processing". We use this phrase as a shorthand to cover, for instance, commonsense reasoning, action planning, and the semantic and pragmatic aspects of natural language understanding and generation. In these areas there is a need for a cognitive system to be able to deal with highly dynamic and unanticipated conglomerations of information. For instance, in understanding the sentence "Mike gets angry whenever Sally talks about going to Tibet", the system must cope with the particular combination of ideas presented by the sentence. Although we presume that the system in some sense knows who Mike and Sally are and is familiar with the notions of someone getting angry, someone talking, someone going somewhere, and so on, it is quite possible that the system has never before had to consider Mike being angry, Sally talking about going to Tibet, or someone's getting angry whenever (or because) someone talks about going somewhere. It is very likely that the system has never before encountered the particular proposition conveyed by the sentence.

Thus, it is important for a high-level cognitive system to be able to bring together, in some sense or other, some of its representations, concepts, knowledge or whatever, in a way that is unanticipated in detail, even though it may have had much experience in using those ideas in other combinations in the past. Moreover, the system must do the bringing together very rapidly, and the result of doing it must be that the system is now in a state that enables it to process the combined information in efficient ways. Strongly related to the unanticipatedness issue is the arbitrariness of the way information can be combined. For instance, although normally it is people who are the agents of speaking actions, in children's stories one might find a banana talking; similarly, the system has to have the ability to represent a situation in which an adult believes that a banana can speak. It is very difficult to think of any firm constraints on what can be combined with what, especially when one brings in metaphorical language. Although one may legitimately claim that it is too early to start devising detailed connectionist systems that deal with real children's stories, radically unusual beliefs, and metaphorical language (though see Weber 1989 for a start on the last of these), one cannot use this claim to shut one's eyes to the problems that such applications will obviously bring in when they are eventually considered.

To the issues of unanticipatedness and arbitrariness we can add the sheer fact that the information structures that need to be processed cover a wide range of complexity. For instance, our example sentence above conveys a much more complex information structure than does a sentence like "Mike is angry". The variability of complexity is also self-nourishing in the sense that a "slot", like the agent slot for a talking action, can itself be filled with information of widely varying complexity. A sentence might specify that it is Sally who is doing something, or Martin's mother, or the person down the road who always trips over her cat when coming back from work. The reason we stress the three issues of unanticipatedness, arbitrariness and variability of complexity is that much connectionist research focuses on tasks that to a large extent lack these qualities. For instance, in a printed letter recognition task, only certain highly-restricted, anticipated combinations of features need be recognized, and for any given letter the complexity

of tile (:oml)ination levels of complexity Many some

of features of feature

of the

or M1 of the

is essentiMlv combination).

techniques three

fixed

to be discussed

problems

in this

information

in connectionist

bringing out also the rich One motivation for doing between

connectionism The

are hope

our

the

system

purpose

(Conposit)

hinder

the

well

have in

by

aware

that

details of the treatment. In

rooul

the

ist information ships with our

used

in

SOME

The techniques

of this

which

we will

the

are

we

will

will

techniques. been placed

be putting

of our

forward

We emphasize although

that

than

occasionally techniques

our

that

we certainly

own

connectionist

techniques

and

first

review

the

address (a

to make

linked, (1984b)

some

but

must

defer

or

stressed

issue).

for an early

important

important facilitate

matter

"variable-binding"

indissolubly Barnden

upon

we

of processing

the

see

touch that

Section

4 will

describe

\\"e

are

attention

though

existing

relative-position

Section ,5 will explain association are used

how straightforward in Conposit, our

relationships

the

relate

by

to

detailed

connection-

aiming primarily for ones that have interesting relationin Section 3 we will briefly review the basic information-

computers.

among

various

INFORMATION

section

is boosted

computer that has

to others,

nature

underlies

However,

paper,

the

of complexity,

illumination

emphasize

"systematicity"

which

CONNECTIONIST

purI)ose

more

sufficient

structuring methods, own techniques. Then association. pattern-similarity

with

precise.

to do

processing

papers.

the

tem. Section 6 will discuss Section 7 will conclude.

2.

this

association".

superior

representational

1988,and and

of

methods

pattern-similarity encoding and

aJJow

to later

remainder

structuring

and

representation linkage

to dealing

variability

we ourselves

also

to clarifv more

paper

various

are

"We should

and

the

& Pylyshyn

techniques

in order

clearer

in the

which

c(mli)arable

ful]y solved. The intent of our of possible ways of structuring

that

"pattern-similarity

useful.

only

issues

processing,

Fodor

our

are

techniques

and

that

they

is presented

not

efficient

recently

encoding"

agree

ways

believe

approaches and

have been of the space

roughly

science.

information-structure

of generM

We do of

arbitrariness,

We

have

are promising

problems subtlety

systems.

computer

to argue

will

discussion

issue

and

short-term

reader

paper

letters

connections between connectionist techniques and this is to resist the excessive paradigmatic distance

"relative-position

it is not

the

two

called

different

of unanticipatedness,

though it cannot be claimed that any of these discussion is to illuminate the true richness and short-term

(and

is to outline to PSA

and

RPE,

techniques

described

STRUCTURING

some

major and

and

in the

paper.

TECIINIQUES

connectionist

to each

encoding

forms of relative-position connectionist rule-based sys-

information-structuring

other,

in Section

6.

We

defer

the

detailed discussion of the inter-relationships of the ideas to be found in the literature,

until that section. Although we will be covering many we will not be exhaustive, either with respect to ideas

or with

embodying

the

omit

interesting

respect

outline to.

each

Nor

will

techniques that

are

we be saying

partially

Our of the

particular

and

much

And,

throughout

technicM

challenge

will

about

the

but (see this

have e.g.

section

that

many

relative

we will be ignoring

connectionist

imt)lementation focus

svstems

as a whole,

portrayed.

connectionist

heart

to the system

ideas.

strengths

hybrid

will

high-level

1989, be on

review

aspects and

will

of

the

weaknesses

aspects

Lehnert the

cognitive

not

systems

that

attempt

systems

of the

symbolic/connectionist

symbolic-processing

Itendler

Our

purposely

to

referred

systems -are

and

systems given

no

1990).

following processing

basic

question,

presents

which

lies

to connectionism:

at

the How

arepiecesof informatio_*put system?

For

dynamically together?

t[ow

Positional

how plan

are

rule

are

temporary

the

different

of action, variables

association parts

label

"positionM"

of technique.

is our

It is especially

own

name

How

prevalent

for

each

other

are

i.

temporary

frame

a col_t,

_r

of a visual

slots

ctio_i._!

Im)l)osith)tl.

description

Rumelhart

& Rumelhart

(McClelland

be attended

to crop

up

Our hypothetical its visual fieht is always

in many

systenl is divided

bound

of a

scolie,

put

to vahms"

to each hi each

to indicating,

by

comlnonly

used,

systems

though

directed

rolatively

at

visual

perception.

at a simple, prototypical system. This example the word-perception system of McCl(qland and

Rumelhart

perceptual

& McClelland

1982).

The

basic

idresent is e(luality (1o wilhiIt the resolution of whi'ch activity-pattern comparators are capable), lit the terms o{" {}ur geliei'al description (tile

of PSA,

ones

the

simplest

containing

X),

view

in which

is to take

tile

case

"component

the

association

to be between subnetworks"

PSA are just the registers. be between tile temporary

However, it is often intuitively subconfigurations in which those

ponent

tile

subnetworks"

of component

Doln

subnetworks

in Conposit differ subconfiguration, registers other is no separate is implicit bv

than role

the the

tile part.

of PSA

means

structure

PSA

within

each

register of tile

of those

can

also

be deployed

denoting one.)

with

as before,

so does of the

fewer

than

seven

A rule that contiguous

tile

every red

white

other

and

4,

part.

PSA

register

register

green

The

register.

in

arguments

the

discussion. hlformation

It should

is of course tile subconfiguration

Also,

involving

this

If there

idea

the

Associator 1}art. There with resl)ec_ to others

thai

be

is a pair

part,

from

a Role

clear

that

tile

a rule is not

techniques and creates enough

be represented

local

stvuclu,'e

i>

have

the

part

in Conposit

point

Conposit

presented quantifiers. a new

data

space,

tile

of the encode

4, is the

Fol a

of single

if forced to })y' _h(' t(_ be used.

more

rule.

The

and

then

interpretation

situation

nested

deeply'

svmbol

of that

in Fig..5.

to realize

CM, do

tile part

of the

other

such

a.s that

are ones of resou_ces: simulations pt'ovide

tile equivalent

in Barnden

Associator

register.

propositions,

only limitations available (current

to

situation;

loving

the

part

fails

b v means

X is again

highlighting

is detailed

simply

seve,_

and PSA much like that discussed at subnetwork" from the point of view

of Section

in the

loving

of the

containing

be used

matter

structure rule

case

one

Mary. Tile symbols

can also

the

semantic

in the

The

of view

can

This

denotes

a mixture of RPE or "component

of registers.

than

neighbors other used to split up

subconfigurations. 5. although norlnally

only split it up subconfiguratiol_

symbol-sharing

emactl 9 as it was

consisting

more

ItERE***

X temporarily

X, by the

is then

mixture,

is also

connectives

\Vhen

this

love-proposition parts, namely

That

propositions

6 ABOUT

Tom believes that Bill hopes that .John loves size of the CM and the number of unassigned thirty).

of

subconIigurations

The

as ill Fig. 5 can only have seven of unassigned synlbols can be

would

containing

containing

registers

Information There

of view

\Ve met

although

earlier sever_

creates such a proposition would fl'ee space ill the CM for a single

\Vith the split up as shown, we have end of Section 4. Each subconfiguration,

of tile

l>oint

subconfigurations.

subconfigurations.

to encode

a situation The sharing

***FIGURE

.just

the

registers

to take the association to lie, in which case the "com-

representation of a single simple proposition (situation) into several sake of illustration, a split up of a loving situation is shown ill Fig.

subconfiguration. absence of enough

it.

the

in Section

X. That significance

from

reasonable registers

individual

of RPE.

(a white register black, class-denoting

proposition

the

are

subconfigurations

one containing Rather, the

local

Conposit's participants than the

being

of view

in detail from the ones assumed in that for instance, is best regarded as having

in the

encoded

point

two

of logical

(1988,

it must

find

enough

what

it should

be

the for

formulae

1990). "free doing.

space"

for

Conposit

as it currently stands does not try to avoid such failures or take special action when they occur. This is one respect in which the system lacks qualities of graceful degradation. It should be noted, however, that one could design enhanced versions of the system that would contain multiple "back up" CMs that would hold data structures not currently in the focus of attention and for which there is no room in the main CM. Indeed, the new version of Conposit that performs massively parallel case-based reasoning does have multiple CMs, and data structure processing occurs in many CMs simultaneously.

An idea of the capacity of a CM two-argument registers,

propositions and

(in

the

current

for another

proposition.

configured free space.

as a square In this way

a2 × a2 CM not

can

currently

in

and

capacity

hold

to pack

each

front

of the

can

up

to

100

registers

assuming

each

to touch

tightly

any

if each

of the

one

has

registers

its registers

be separated by at least one register width of consumes a square of nine registers, so that the

into

propositions.

a four-register

in a random but

the ma.vinmm number of proposition requires four

is allowed most

two-argument

proposition

exactly,

from considering A two-argument

be packed

groups must effectively

registers

to analvze

results

none

propositions

group, where each proposition

argument

difficult

propositions

Conposit)

The

principle

seek

relationship

can be gained conceivably hold.

it could

way

a crude

one consumes

square,

around

lower

However,

the

bound

but

head

a sixteen-register

instead

register.

of 64 for

the

Conposit arranges This

the

makes

ma_mum

square

does the

number

including

of

separating

space. Although Conposit. proposition consumes, it

as does

placed

to existing

as close

highlighting On

the

and

arrag

as possible

manipulations. Use

Our

of

a 2D

decision

Register

and

the

We do have applicability region

dimension_ that

(see

spatial

As Some may the

regards

simple "big

proposal

a PSA contain

scheme.

array

all,

that

simply

of RPE,

_.

an unassigned

away'

One

be achieved

by simple

with

of the symbol,

RPE registers while

we do not consists and

was 2D

simply

to treat

in a big tile other

the

register would

to use

insignificant

to choose some dimensionality, presentation and directness on the

Massively

Parallel

are interested in the a 2D projection of a

is that we to represent to confine we are

stands)

matrices

our

attention

initially

interested could

to

be

be the

possibility

basis

of a theory of regions

to incorporate potential

to two

in the

idealizations

desire the

of processors).

technique.

an unstructured

collection

of wha.t

registers.

Essentially,

big registers would

as component

be tile

Information

Associator part

and

of

as concep-

of that

of two of our actual

be the

of

One CM that

have

decision

arrays

to investigate

essentially

the

is relatively

simulated

reflects

in order

iudeed

two-dimensionM

it is simpler

as it currently the

at why

and

taking

an array

been

have

though. using the

by

a version

of which

has

motivation

Conposit

having

is to do

and

two-dimensionality

of which

on this),

other

not brain,

asked

dimensional,

The

motivations, reasoning by

The

each

two

(Conposit both

as possible have

be

we wanted an array, we had the point of view of graphical

processors

(but

in the

registers",

can easily

that from

6 for more

system

commentators call

placing

questioned.

substantive to spatial

reasoning.

of high-level cognition cerebral cortex.

This

been

Machine,

Section

a Conposit-like

tually

array

Connection

two more of Conposit

of space

Conposit's given both

on present-day

Processor

structures.

Array

sometimes

from some points of view: and two was a good choice simulation

data

implied, does not seek to minimize the space each them tightly, ill the sense that a new proposition is

_

to make

at a]], have

we have just seek to pack

subnetworks part would

and

we in

would

in simple

5 The free-space grabbing mechanism in current Conposit simulations is an advance over that reported in other papers, such as Barnden (1988, 1990). Conposit as described there did not try to place new propositions near to old structures, and always split a new proposition with at least one argument up into two-register subgroups in the way described above (although rules always had the ability to detect and process existing non-split propositions). Nevertheless, the special highlighting manipulations used to effect free-space grabbing are similar in kind and complexity to those reported before.

6 Though, in line with a point made in Section 4, there is still a limited form of RPE within individual component subnetworks.

cases contain a constant symbol. This is certainly a logically possible technique, but we see the following advantages in including the present array-based RPE. First, we again have the point that we are interested in using an array-structured CM as a spatial analogue. Second, the inclusion of array-based RPE makes for more efficient processing, in that it leads to PSA-based links being processed less often, both in data structure detection during rule-enabling and in data structure modification in rule-execution. For instance, without RPE it would be more time consuming to find the agent of a love proposition, given access to the "big register" L acting as the head of the proposition. This is because we would need both a broadcast

of the

Associator from

part

the

of an

associator

of the

Associator

big

pa.rl

RPE-encoded

symbol register

of A part

proposition

in L (combined

A encoding to its

merely

the

with agent

Infornlation

requires

a highlighting

of the

part.

a nlove

check)

loving,

On

the

a_d

other

of attention

hand.

from

mented

as

distance

a physical

on

Conposit

itself

(1988,

data

phic

show,

CM

have

had

inoves

of attention

between

different

a physical can

appear

is no

connected

analogous

in the

particular

difficulty

(Non-)Synchrony,

CM,

and

to each

other

sent

by

to all

the

problems purposes,

Time-Based

signal

the

variations

speeds the

of reaction

command

registers

and

signal

synchrony

times

both

a rule the

imple-

a much

shorter

version.

svstems

Although

might.

subconfiguration

for a given

In fact,

DetecBarnden

the

CM in a simple,

near-topographical

will

system

arise

no matter

to process within the

actually

ill any

how

unlike

that

"'spatial"

way. call

Conposit

recruit

it is in other

a given information network on any given

structure occasion.

to

distributor,

goes

registers.

and

at which to the

for

was

no

m;ttter

Each

the

CM's

register

systematic

level of description simultaneously, at

the the

between

different

reaction

steps may

take

biases

and

random

system detailed

for

itself

whereby

whether

and

or less simultaneously realization there are

occasions

occur.

register

parallel

decides

Also,

different

for

we assume

times

effects).

a given

and likely'

register) that

to reach

the

their

Therefore,

in the copies

there

of

destination

although

is best thought of as synchronous, level of connectionist circuitry

how more to be

at

the

with registers is no strict

assumption.

hldeed, in Conposit. satisfying

the

CM

Selection

registers,

distributed

(allowing

register-machine reacting literally

register

(between

the

be over

problem.

to react to the signal, and the reacting registers proceed more or less at the same rate. However, we assume that in a physical randoln

white

shapes, tile Subconfiguration However, as the details in

in solving

and/or

pattern-invariance

distributed

the

agent

tile service of I)SA mixed up_ whereas

in a RPE-free

Conposit-like

the

give it an advantage, in simplifying and regularizing tile Detection Module is an acyclic graph of matrices isomor-

representational

and

would

ro the

propositiol_'s

in which

to neighbors other

anywhere

respects: the system must have the ability where and exactly how it might be realized

A command

realization

"registers"

realization,

arrays actually Subconfiguration

for temporary

it is identically

a physical

have any one of possibly many different with a "pattern invariance" problem. there

and

important,

components

then moves

structures

of Conposit's needed. Tile

to'the

More

not

could is faced

1990)

regularity circuitry

than

does

structure Module

as it stands

array,

average

Since data tion

if Conposit

get

of at_eution

to find

the

head register to the neighboring red register. Worse, symbol broadcasts in based association-following have to be seri_dized on padn of getting the symbols RPE-based associations can be followed in parallel. Moreover,

to

a move

the

lack

of strict

A comn_and a highlighting

is achieved

signal condition

through

synchrony often

is exploited needs

transmitted

to

cause

ill tile

a "temporal-winner-take-all" 21

in an important, just signal,

one,

to react. (TWTA)

and

arbitraw This selection

apparently register, arbitrary

novel, out

of the

selection

mechanism.

\\:hen

way set of a a

register detects that it satisfies the signal's highlighting condition, it transmits an announcement to the parallel distributor. Because of the types of asynchrony noted (plus variation in the time taken to decide to send announcements), the announcements are spread out in time, and the parallel distributor tries to select the register sending the earliest arriving announcement. However, because of delays within the parallel distributor itself, this cannot always be done unambiguously, so that a further round of announcement may well be necessary (with a reduced set of registers), and so on. The TWTA mechanism is further described and analyzed in Barnden, Srinivas & Dharmavaratha (1990). It turns out that the number of rounds of announcements is roughly logarithmic in the number of registers initially sending announcements.

Variable Binding in Conposit

RPE and PSA allow arbitrary temporary associations among information items to exist within the CM. In particular, the examples of the use of PSA and unassigned symbols in Section 3 show that PSA/RPE-based associations provide a sort of variable binding: if an unassigned symbol is regarded as a variable, then placing it, say, at the head register of a love proposition "binds" it to that proposition. Another particular effect of RPE and PSA is to allow role-filling: for example, the agent of a love proposition is represented by whatever red-highlighted register is adjacent to the proposition's head register or to a register linked to that head register by PSA. We could, if we wanted to, regard role-filling as a type of variable binding, because a role could be viewed as a sort of variable. So, Conposit achieves a type of variable-binding (and role-filling) within the CM. But there is also the question of the variable-binding performed in the process of firing and executing hardwired rules: this is a type of binding operating between the CM and mechanisms outside the CM. Barnden (1990) looks at how Conposit achieves variable-binding for rules. Overall, Conposit's variable-binding facility is fully general, although special types of variable binding are effected particularly quickly by the Subconfiguration Detection Module, the remaining types being achievable only by the execution of rule action parts. We will confine attention in the present paper to the aspects of rule variable binding that can be described by reference only to the CM. One version of Conposit contains a rule that can be paraphrased as "If a man M loves a woman then M is hungry." In the sense that the action part of the rule, when it acts upon a particular detected love situation, is able to access the register representing the man, we can say that Conposit is binding the variable M to that man.

The action part of the rule sends a couple of command signals that have the effect of switching on a certain highlighting flag I at the agent register of an arbitrary one of the love propositions detected by the Subconfiguration Detection Module. Note that it is I highlighting that identifies the agent for the purposes of building the hungriness proposition. The installation of the register clump for that proposition copies whatever symbol is in the I-highlighted register. The marking of a register with I highlighting can therefore be viewed as binding the variable M in the rule paraphrase. Since bindings are represented by highlighting, sooner or later, and since highlighting is an intrinsic part of the encoding scheme within the CM, the usage of bindings is just a special case of the way CM data structures are manipulated by command signals.
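To make the idea of binding through highlighting more concrete, here is a minimal Python sketch, offered only as an illustration under stated assumptions and not as Conposit's actual circuitry. The class names `Register` and `CM`, the clump layout (class register to the right of the head, agent below it), the use of "white", "black" and "green" flags alongside the "red" agent flag, and the symbols `U1`/`U2` standing for unassigned symbols are all hypothetical. The sketch also takes the first detected love proposition instead of making an arbitrary selection, and omits any check that the participants are a man and a woman.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Set, Tuple

Pos = Tuple[int, int]


@dataclass
class Register:
    symbol: Optional[str] = None                   # None marks a free register
    flags: Set[str] = field(default_factory=set)   # highlighting flags


class CM:
    """A toy configuration matrix: a 2D array of registers (assumed layout)."""

    def __init__(self, rows: int, cols: int) -> None:
        self.rows, self.cols = rows, cols
        self.grid = [[Register() for _ in range(cols)] for _ in range(rows)]

    def neighbours(self, pos: Pos) -> Iterator[Pos]:
        r, c = pos
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr, dc) != (0, 0) and 0 <= rr < self.rows and 0 <= cc < self.cols:
                    yield rr, cc

    def put(self, pos: Pos, symbol: str, flag: str) -> None:
        reg = self.grid[pos[0]][pos[1]]
        reg.symbol, reg.flags = symbol, {flag}

    def add_proposition(self, head: Pos, head_symbol: str, relation: str,
                        agent: str, patient: Optional[str] = None) -> None:
        """Lay out a register clump around `head` (hypothetical geometry)."""
        r, c = head
        self.put((r, c), head_symbol, "white")     # head register
        self.put((r, c + 1), relation, "black")    # class-denoting register
        self.put((r + 1, c), agent, "red")         # agent register
        if patient is not None:
            self.put((r + 1, c + 1), patient, "green")

    def heads(self, relation: str) -> List[Pos]:
        """Heads with an adjacent class register holding `relation` (RPE)."""
        found = []
        for r in range(self.rows):
            for c in range(self.cols):
                if "white" in self.grid[r][c].flags and any(
                    "black" in self.grid[nr][nc].flags
                    and self.grid[nr][nc].symbol == relation
                    for nr, nc in self.neighbours((r, c))
                ):
                    found.append((r, c))
        return found


def fire_hungry_rule(cm: CM, free_head: Pos, new_symbol: str) -> bool:
    """Sketch of 'If a man M loves a woman then M is hungry'."""
    love_heads = cm.heads("LOVE")
    if not love_heads:
        return False
    head = love_heads[0]   # stand-in for an arbitrary selection among matches
    for nr, nc in cm.neighbours(head):
        if "red" in cm.grid[nr][nc].flags:
            cm.grid[nr][nc].flags.add("I")   # the binding: mark the agent register
    bound = next((reg for row in cm.grid for reg in row if "I" in reg.flags), None)
    if bound is None or bound.symbol is None:
        return False
    # Copy whatever symbol currently sits in the I-highlighted register.
    cm.add_proposition(free_head, new_symbol, "HUNGRY", bound.symbol)
    bound.flags.discard("I")
    return True
```

A possible usage: build `cm = CM(8, 8)`, call `cm.add_proposition((0, 0), "U1", "LOVE", "JOHN", "MARY")`, then `fire_hungry_rule(cm, (4, 4), "U2")`. Because the rule marks a register and only afterwards reads its contents, changing the symbol in the agent register before firing the rule again changes what M ends up denoting, which is the sense in which the binding is to a register and not to a fixed symbol.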

It is also important to observe that the variable binding achieved through highlighting is most simply viewed as a binding of rule variables to CM registers rather than either to the entities denoted by those registers or to the symbols in those registers. If we wish to view bindings as reaching out to the symbols (or denoted entities), as is often convenient, we must note the crucial role played by the fact that a register can have different values at different times. That is, the system's capability to bind rule variables to different symbols (or denoted entities) on different occasions depends

both upon

the fact

that

the

rule is able to iinpose 22

appropriate

highlighting

(by l in our

example) different

on selected times. In fact,

called

registers

variable-binding

';processing-locus

identify

and

CM

regarded

as

of the

as

the

as loci

a combination issue

of different

two

parts

ways

connectionist Most

within

svstem

discussioils

the

section

we discuss

information-structuring examine

the

techniques

relationships

techniques

and

discussing

RPE

the

the

and

PSA

RPE

Sequential

Allocation

RPE

is a natural

extrapolation

as easy the

Conposit's bidirectionality

to go from

opposite

versions

tile agent

direction.

register

This

_o be

be

The

realized

in a

places

in the

can requh'e d(/'f_.reTzt processingwould also be useful in discussing can

and

This

PSA

be represented proper

bear

gives

in mol'e

accou,l

to

is a particularly that sequential

to the

us tile

techniques,

and

of' thi>.

insight

allocation

opportunity

to the

be expected

also

ionis;

these

Ii_

l.echttiqltcs

ill

tlowever, we will oft in a

of the objects.

following:

IS-RIGHT?:

given that two registers are highlighted using some special flag f, where the registers may be widely separated in the CM, determine whether one of them is "to the right of" the other (assuming that some fixed direction in the CM corresponds to the direction "right" in the real spatial plane being represented).
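As an illustration only, the following Python sketch shows one way a test like IS-RIGHT? could be carried out by letting a highlighting wave spread step by step from one of the two registers. The function name `is_right`, the (row, column) coordinates, and the convention that an increasing column index means "right" are assumptions made for the sketch; it is not a description of Conposit's circuitry.

```python
from typing import Set, Tuple

Pos = Tuple[int, int]  # (row, column); column index grows towards "right" (assumed)


def is_right(rows: int, cols: int, source: Pos, target: Pos) -> bool:
    """Return True if a rightward wave started at `source` reaches `target`.

    At each step, every register that is one column further along and at most
    one row up or down from a wave-highlighted register joins the wave.
    """
    wave: Set[Pos] = set()          # registers highlighted by the wave flag so far
    frontier: Set[Pos] = {source}
    while frontier:
        next_frontier: Set[Pos] = set()
        for r, c in frontier:
            for dr in (-1, 0, 1):
                rr, cc = r + dr, c + 1
                if 0 <= rr < rows and cc < cols:
                    next_frontier.add((rr, cc))
        next_frontier -= wave
        if target in next_frontier:
            return True             # target now carries both highlighting flags
        wave |= next_frontier
        frontier = next_frontier
    return False                    # the wave ran off the edge without meeting target
```

For example, on an 8 by 8 grid, `is_right(8, 8, (3, 1), (5, 6))` succeeds after five steps, while swapping the two positions fails. Note that the wave only covers a widening cone, so a register that is one column along but many rows away, such as `(7, 1)` relative to `(0, 0)`, is not reached; and the number of steps grows with the distance between the registers, which is the cost that numerical address comparison in a computer memory would avoid.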

Currently, the way we propose that this operation be carried out is that the system make one of the registers send out a highlighting "wave" in a rightwards direction (using a flag w), and then wait to see if some register becomes highlighted in both w and f. The wave spreads by means of a series of steps, at each of which any register more or less to the right of a register highlighted in w becomes highlighted in w. By "more or less to the right" we mean the register whose position is one further along in one dimension of the CM and possibly one further along or back in the other dimension, so that each non-edge register has three registers more or less to the right of it. Notice also that the process can only be done if registers have some ability to distinguish among their neighbors on the basis of direction.

Clearly, in this version of Conposit it is no longer the case that only short-range relative position relationships are important. The question therefore arises of whether something akin to the global numerical addressing in a computer memory should be added in order to speed up the implementation of the operation just discussed. We observe here the obvious but important fact that arbitrarily long-range relative-position relationships can be discovered in a computer memory merely by numerically comparing addresses (assuming, crucially, that we already know what the relevant cells' addresses are; if we did not, we might need to engage in a very time consuming search process). Nevertheless, we have resisted the temptation of adding some sort of numerical addressing to speed up the spatial-analogue version of Conposit. The reason is that the CM

is assumed

iterative

to contain

"wave" These

tion

is the

allusions

practice,

tool

to support

in scientific

a computer

locations is not

analogue

a spatial as an

use

than

too

arrays,

analogue

can

of some

repl'eseatation

should

( 1985,

1987).

RPE

in

not

RPE.

This

from

Conposit's

only

form

out

earlier,

is again RPE.

connectionist directions

the

up-down

dimension,

with

the

a simple The

arrav

does;

us that

sequential

can paper.

together

be

in a computer

is no more using

hence,

tile

radical

than

alloca-

memory. the

very

as the

in real use

space.

array

Notice

of a particular

in a CM,

also

region

temporary. At other times, the Indeed, there is no reason why

simultaneously

The

('OlillllOl/

a conventionally-implemented

of points

just

remind

dimension,

that

itl the

of computer CNI both

as is suggested

CONPARSE

extrapolation

model's

system from

use of RPE

makes

of Cha.rniak

sequential

&: Santos

allocation,

it Conposit's

dimension tile

aid

of the

is for representing "binding

units",

constituency

might stylc's

be of

in Barnden

to regard a CONPARSE binding field is generally a unit that,

portions of the network, rather than a unit that is able to bind different times. The interesting thing about the binding units in CONPARSE RPE and the usual notion of pointer: a CONPARSE "pointer" from

immediately binding unit

by

r binding

more

sort,

as)

nodes

of the

RPE

and

defined

We turn now to the relationship between RPE in Section 2. To take an example mentioned

conventional

space

ol'

the

left-right

things together at different is that they are a. hybrid of a register in a given column

within the column each CONPARSE

in the

in detail

in the

of pointer, since a "binding is taken to bind two fixed

but rather an address out in Section 2 that

(or implemented

of

of constituents.

is not a global address, However, we also pointed rows

a form

different

neighbor

whereas

identity

unit as a sort when active,

uses

that of Conposit, in that significance. Recall that

relationships,

is for representing

(1987)

though

closest

systems, r CONPARSE's RPE is less uniform than in CONPARSE's array have different representational

It is probably best in the connectionist

unit"

mixed

memory

CONPARSE

As we pointed

e_sting different

be

of

be temporary,,

memory to support an array representation used in the way described earlier in this

computer

should

of any

rel)resentation

for instance,

isomorph

of a CM

an ordinary

time-consuming.

representations

for implementing

programming

memory

spatial-analogue

fewer

described to spa.tiM

standard

use of a CM

far

process

where

to the right of C'. could be replaced r is the

number

o[

array. Absolute-Positional

that

represents

the this

presence description

the of

the and

presence

Techniques

of the

letter 'It' in say tliat tile

letter

'T'

in position

and "absolute-positional" there, suppose there 1, and

a letter

unit

techniques, is a letter unit H: that

the next position, 2. Then we may certainly simultaneous activity of Tt and H., represents

as T_

represents

abstract from the presence of

r Conposit was developed independently of the Charniak & Santos system -- an early version, using much the same type of RPE (and PSA) as current versions do, was presented in Barnden (1985). We should also emphasize that Conposit bears very little relationship to the memory-field proposal of Kohonen, Oja & Lehtiö (1989). A similarity between memory fields and Conposit's CM has sometimes been claimed, but in fact the former make no use of RPE or PSA.

a 'T' and an 'H' together with the fact that the 'H' follows the 'T' in the word viewed. Hence, a contiguity relationship (a type of "abstract association") in a word is represented by making particular choices of the units for representing the two letters. This brings out a loose similarity to RPE:

an

component

RPE-based

representation

subnetworks

\Ve position

can

with

make

this

i as forming i as forming

word,

given

a component

which

point

more

precise

array

subnetwork

consists

has

either

unit. To represent two contiguous for some i. We have now described in Section 4.

extra

restriction

reflecting

the

a linear

array

correspond

of component

that

the

so that

word,

Conposit they

if the

only

highlighting

represent

positions

unit

is on

one

the

contiguous

than

the

relationships

letters,

scheme the

original,

positional

Let

the

unit

be the

case

somewhere,

to

which

one

be dedicated

would

is forced

to particular

be "less

Absolute-Positional

uses

time.

However,

the

a set

speciftc

positions

So.

that

--

word

and

unit

being

no

longer

in each

have

simply

include

violating RPE -of it. A modified Suppose we have C,, with

first

Then,

making

to the

position

extra letter

taken

unit

in

is like a

units

on then

to a represent

could be used. The absolute that this modified, RP£-based We are

in

.itl._t oue

at

we did not

the

viewed.

of a

pattern

vMue

to the

Cj+l

C) for

Mlowed to choose any Therefore, we have an

but

(In essence,

the

units in C, a,M C',+_ description of RPE

that

to correspond

if a Q

version.

of

_Lnv given

activity

ON

letter general

mentioned,

time.

for

representation the

or of an

an extra

taken

in the

to choose

The

a point

than

following

the

variant are

Instead,

for each

word.

the

binding

(We

purposes.

defer

a

positions scheme about

the

pair

For

example,

(as

(and PSA) portions

opposed

to

sake of illustration, portions of the CM

personal

relationships

would reign at will, the

in another, would depend

in each portion, relative-position

Nodes for

representing

each

of which into

ofsubnetworks

is active, later

subnetworks

positional is that o[

as it stands.

not organized

node until

the usual RPE cross between

Binder scheme

component

region, physical position relationships interpretation of a subconfiguration

in Conposit

subnetworks,

subnetworks

When

in the

and

of component

position. D.

relative"

to

between a truly absolute of variation in between

subnetworks), just for the scheme in which different

representational

although be able

possibilities dimension

particular

in one particular that the system's

Encoding

Consider again

to C and

on.

encoding

just

of N C/ subnetworks We are not saying

there are intermediate positional scheme.

on which region it was in. Then, and indeed PSA linking might

any

form

on at any

absolute-positional

would have to be represented and so on. Further, we assume

encoding

In a proper

us introduce

choosing a particular relative "position" of those one intermediate possibility would be a Conposit-like would

units

we regard

is the fact that we are not is at position 1 and so on.

of the

choices

of techniques.

As might be expected, and a fully relative

extent

of letter

at all umts

Ci is currently

Let it still

word of length N, any contiguous series in the array are no longer important. is better

G

in words.

then

Ci will have

mtit.)

particular

This is not to say that we are necessarily a very special, or perhaps degenerate, case RPE in a more typical sense is as follows.

subnetworks

to particular

intention

set

on the appropriate fits roughly in the

use of an absolute

in the general description of RPE. we could instead say that we have form of the scheme that would be

oll

Furthermore.

one of its units

values

here letters

the

Ci say.

subnetworks.

at most

letters we turn a situation that

Of course, what we have suppressed i must be 1 if the earlier of the two

rests

associated.

regarding called

of OFF

i: the

--

by

things

of component

subnetwork

association

the

subnetwork,

a linear

component

abstract

to represent

a component

increasing any

of an

the

C and issue 26

a word, represents

an array,

and

(C, D) there D are

of specifying

taken

based (at

on binding most)

no subnetwork is a binding to correspond

whether

it is the

one

nodes.

letter

It

at any

corresponds node

to

connected

to contiguous C position

that

follows tile D position or the other way round.) There node again, for specifying which conlponent subnetwork in the word.

is also some corresponds

meaus, porhal)S a high ghting currently to tile first position

Now, in the absolute-positional scheme, contiguity was represented by means of the activation of particular units (letter units). And, in the present binding-node scheme, contiguity is still represented by means of the activation of particular units: in this case, a coordinated choice of letter units and binding nodes. Hence, simply saying that particular units are activated does not distinguish the schemes. The obvious next step is to say that the critical difference from the absolute-positional scheme is that nodes other than the letter units are involved. However, matters are not as clearcut as they seem, since if one considers extra machinery that might be present in a system using the absolute-positional scheme, we are likely to find things similar to binding nodes elsewhere in the system. To take a simple instance of this, suppose there is a single output node O_THE that represents the word 'THE' and which therefore lights up when T_1 (i.e. the 'T' unit in C_1) is on, H_2 is on, E_3 is on, and no other letter unit is on. Then, O_THE can, if we wish, be considered as a sort of binding node, since it connects some component subnetworks and is active just when they are to be taken as being "bound together" -- in a rather specific way. A less extreme example of the same point can be made by considering possible digram units, or other subword units such as the triple units in Section 2, that might be present.

An important difference from the binding nodes postulated at the beginning of this subsection is perhaps that nodes like O_THE become active as a result of activity in a self-sufficient lower level representation, in this case the letter-level representation of the word 'THE'. Unit O_THE could be turned off without destroying our ability to say that the system is encoding this word. By contrast, the earlier binding nodes' activity was an essential part of the representation of the word -- they could not be turned off without destroying our view of the system as encoding 'THE'. Notice that the distinction holds even if O_THE dynamically contributes, by top-down feedback, to the establishment, and even the maintenance, of the letter-level representation. For, we can fall back on the following counterfactual statement: if it were possible to turn off O_THE without turning anything at the letter level on or off, then the system could still be seen as representing 'THE'.

The moral from this is that the description of a given system as using a given encoding technique can depend very much on one's view of the system and of the allowable variation in the technique -- on the level of description, on how one parcels up the parts of the system as to function (representation and recognition, for example), and, in our case, on what one is prepared to accept as a "binding node".

RPE and Binder Nodes

Having seen that absolute-positional techniques can implicitly bring in binding nodes, we should ask whether RPE can do so. We answer this by looking, for definiteness, at Conposit's version of RPE. Barnden (1990) shows that Conposit's RPE as manifested in the circuitry required in the Subconfiguration Detection Module brings in binding nodes in a somewhat straightforward way. The nodes temporarily bind nodes in the rule action parts to CM registers.

If, however, we look at Conposit's RPE as manifested in the role it plays in the CM, we find that it can again be seen as bringing binding nodes in, but only in a rather complex, forced and artificial sense. We observe first that an adjacency relationship within the CM is construed as representing a temporary association only if the registers concerned have suitable highlighting. For example, a register A adjacent to a register S that represents a love situation is only construed as representing the agent if it (A) is highlighted in red. Consider what happens if a rule wants to find the agent register A, on the assumption that S is already marked with "detected" highlighting, say. Let us say that the rule must mark A with highlighting l. Then what the rule does is to send a command signal to the CM, telling every red register adjacent to a "detected" register to turn highlighting l on.

some Thus,
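The effect of such a command signal can be sketched as follows. This is only an illustration under stated assumptions: the CM is modeled as a grid of flag sets, "adjacent" is taken to mean the eight surrounding cells, and the function name `apply_command` and the flag strings are hypothetical. Each register decides locally, by an OR over its neighbors' "detected" flags ANDed with its own red flag, whether to respond.

```python
def apply_command(grid):
    """Turn flag 'l' on in every red register adjacent to a 'detected' register.

    `grid` is a list of rows; each register is a set of flag names.
    Returns the positions of the registers that responded.
    """
    rows, cols = len(grid), len(grid[0])
    responders = []
    for r in range(rows):
        for c in range(cols):
            # OR over the 'detected' flags of the neighboring registers.
            neighborly = any(
                "detected" in grid[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < rows
                and 0 <= c + dc < cols
            )
            # AND with the register's own red flag.
            if neighborly and "red" in grid[r][c]:
                responders.append((r, c))
    # All responders switch on together, mimicking a parallel update.
    for r, c in responders:
        grid[r][c].add("l")
    return responders
```

Note that no register inspects a neighbor's red flag or its own "detected" flag here, which reflects the point that a register never confuses a neighbor's highlighting state with its own.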

Some

component

sort of connection path ONness at A's red flag

traced

through

sense

we might

binding

the

node

any

one

be tempted

the

of the

particular

red

to Fig.

of A aim

that

A's

unit

registers, of

7. The

red

figure

We

signal

l.s boxes

shows

register)

there

mentioned

not actually of activity should red

that

does

dictated

an

really

say

that unit

Highlighting that

has

causes

do

I highlighti,lg

marks

Although versions

RPE

and

Time

Phases

The

reader

will

phase

of RPE

method

mtdtaneity "space"

(CM

space

relationship

RPE

the

the

no doubt

the

adjacent

whereas

a green

association

in

to PSA

register typing

that

are

with

can

flag. be

S. Ih this

moment_

as a

have

than

roles

act

:mall

this

way

the

duality'

latter

uses

relative

We find,

however,

We look

at the

the

i,

units

,ev

(actually..,i-

to relative

relationship

seen

binding-node

superficia]

it is to

between

A's

unit.

For example,

"position"

as opposed

A (as

binding

to S's

subsection, analysis.

associations,

_)it to

"'h,calJvlo the ,_

actually by the co_nbi_+,tio_, highlighting unit. Thus. we

is a relatively

to a similar

2.

In

highlighting

on by a rule.

in

fighl g tti,t_

as a binding

processing

space/time

in network has

a more

below.

Associations

to specify

involved their

denoting to the

"orienting"

S. The

of A's "'tocally-satisfyitLg'" flag. change (namely, the switching

in propositions.

during

to RPE.

two

detail,

'_neighborly".

register to turn its the d aud ,' bo×,,s

then,

indirect

the

of Conposit).

more

n for is ON.

connections.)

Altogether,

noticed The

into

betw+,elL

a little

including

"neighborly"

together

Conposit

connected

"detected"

the value the state

to be worked

on

register

adjacent and

signal.

combinations

and

on the

occasions

of

highlighting to a white

the

lnediate

in

neighbors,

for direct

ANDed with that performs

specifying

to encode

case

register

stand

to be amenable

Section

pulses)

associations

register this

likely'

of its

in an elaborately

concentrated

are

does

account

on, and every red by the lines joining

not

of units

at some

Typing/Orienting

Clearly, to "type"

that

in the

substantive and

registers

mentioned

of periodic

pair

than

we have

other

(for

play

HERE***

every

command

other

flags matter.

normally

above

three

OR operation

connected

uses

view of some highlighting of a more fundamental

into

A and

preferentially

at A's red highlighting unit alone, but unit and activity at S's "detected"

it is this is only

node

7 ABOUT

lines

by the

encoded by activity at A's red highlighting

highlighting

bring

/ highlightit_g path that

between

is acting

A is not

the

A and

A's neighbors. The result of this OR is then If the AND result is ON, then A is a register on of I highlighting)

therefore

r standing for red, d for "detected". box has a double wall then the flag

actually

(These

is a unit

unit

register

pursue

register

flags, with If a small

respectively.

must

connections

highlighting

turn its so-c,'_ed "neighborly" highlighting flag satisfying" flag on. These effects are illustr_ted and

signal

the

a binding

must

***FIGURE command

S and

in a given

whereas

nodes.

boxes illustrate highlighting aim Is for "locally-satisfying".

The

command

S.

highlighting

or sets

in this

logic

to say

A and

neighboring

nodes

referring

register-internal

between

IIowever,

signal

P from S's "detected" higlLlighting ftag to A's acts as temporary facilitation of some connection

register

shouhl 28

not

RPE-based

or "orientations".

a love-situation

situation

facility

in Conposit's

"directions"

denotes denotes

the

the

agent

object.

be underestimated:

associations For

serve

example, of the

The

situation.

importance

it is evidently

a red of a very

important those that

capability, but seek to encode

by weight-enhancement

is one that associations

or activity

is not trivial as facilitated

at binding

to realize in connectionist connection paths, whether

especiaUy facilitation is

systelns,

the

nodes.

In a binding node scheme, there are difficulties enough just in orienting, and we will comment only on this point here. For definiteness, we can go back to our hypothetical introduction of binding nodes into a word-representing system. One solution might be to have, for each subnetwork-pair C, D, two binding nodes, connected in exactly the same way as each other to C and D, but having different interactions with the rest of the system, so that it is up to the rest of the system as a whole to "know" the differing significance of each of the binding nodes. A further possibility is to have two binding nodes, but connect them differently to C and D; for instance, one binding node could have a stronger connection to C than to D, and vice versa for the other binding node. This asymmetry might in principle be a suitable basis on which the rest of the system can proceed. (The obvious idea of using different directions for the connections is problematic, because the orientation of a binding has nothing to do with the directions in which the system might have to traverse it.)

Another proposal is as follows. There is a binding node b that is connected to both C and D, and in the same way to each. Another node b_C is connected to C and b, in the same way to each. Similarly, a third node, b_D, is connected to D and b. Node b is on if C and D represent contiguous positions, either way round. When the C position is meant to be the earlier one, node b_C is also on; and similarly for D. This scheme is an implementation of the standard set-theoretical device for representing ordered pairs by means of unordered sets: represent the pair (C, D) as the set {C, {C,D}} and the pair (D,C) as the set {D,{C,D}}.

We do not dwell further on these possible solutions, which all involve a considerable increase in the elaborateness of the whole system.
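The set-theoretic device mentioned above can be illustrated with a short sketch. It is not a model of any network; it only shows, using Python's `frozenset`, that the encoding {C, {C,D}} recovers order from unordered containers. The function names `encode` and `decode` are hypothetical, and the sketch assumes the two items being paired are not themselves sets.

```python
def encode(first, second):
    """Encode the ordered pair (first, second) as the unordered set {first, {first, second}}."""
    return frozenset({first, frozenset({first, second})})


def decode(encoded):
    """Recover (first, second) from an encoded pair.

    Assumes the paired items are plain values (e.g. strings), not frozensets.
    """
    inner = next(e for e in encoded if isinstance(e, frozenset))
    first = next(e for e in encoded if not isinstance(e, frozenset))
    rest = inner - {first}
    second = next(iter(rest)) if rest else first   # handles the pair (x, x)
    return first, second
```

In terms of the three-node proposal, the inner set plays the part of node b (which is indifferent to order) and the outer membership of `first` plays the part of b_C or b_D.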

The role-distinguishing highlighting flags in Conposit can be seen as binding nodes, as discussed a moment ago. The orientation problem is solved by the fact that a highlighting unit's relationship to the rest of the circuitry in its own register is different from its relationship to the circuitry in adjacent registers. To put it another way, no register confuses any neighbor's highlighting state with its own.

The orienting issue for binding nodes has received very little attention in the connectionist literature. There is an interesting reason for this: proposals generally confine binding nodes to mediate between subnetworks of markedly different types, or which have clearly different roles, the distinction between which is hardwired into the rest of the system. For instance, in Smolensky's tensor-based system (applied as presented in Section 2) a binding node sits between a frame-role subnetwork and a filler subnetwork. There is an assumption that the system already knows, so to speak, which subnetwork constitutes the role or rule and which the filler. Similarly, in DCPS (see Section 2) a binding node can be viewed as binding something in one clause space to something in another. Again, there is an assumption that the system already knows which clause space is which -- they permanently play different roles in the whole system. Thus in both systems the required asymmetry is built-in. However, there is no such built-in asymmetry in the case of the component subnetworks in the word-representation example.

PSA and Associative Addressing

Association by symbol sharing, which is Conposit's version of PSA, is a simple extrapolation from associative addressing in computers: clumps of CM registers linked by PSA are like concrete records linked by associative addressing. In fact, Conposit's PSA is probably more like associative addressing than its RPE is like sequential allocation in computers. Conposit's PSA obviously preserves the bidirectionality of association that is provided by (the simpler forms of) associative addressing in computers, as well.
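The analogy with associative addressing, and the bidirectionality just mentioned, can be made concrete with a minimal sketch. Here each clump of registers is reduced to the set of symbols it contains, and two clumps count as associated when they share a symbol. The function name `associates`, the clump names and the symbol names are all invented for illustration.

```python
def associates(clumps, name):
    """Return the names of clumps sharing at least one symbol with clump `name`.

    `clumps` maps a clump name to the set of symbols held in its registers.
    Lookup is by content (shared symbols), not by position or address.
    """
    shared = clumps[name]
    return {
        other
        for other, symbols in clumps.items()
        if other != name and symbols & shared
    }
```

Because the test is symmetric set intersection, following the association from either clump finds the other, unlike a one-way pointer.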

Ill a more other

parts relation

being

developed

The

complex than

in the

strict

form

of PSA,

equality.

of activity

activity

PSA).

pattern,

PSA

differences

PSA

and

Binder

Earlier bringing

in

this

the

saw

binding

that

strict

equality

several

association

would

was

the

(at

least

can

be made

CM

parallel

actually

through

A sharing state

state

witness 2. The

unassigned

fly by the and

systen,.

highlightitlg

computation of the to each component

relationships patterns,

(i.e but

ul_wanted

hlslead

allows

the

"node"

total

For every

of S is binary'

this view of Conposit's connectionist research

The

it, other

is even

unassigned

circuitry

signal

as

versions).

more

symbol

in each

formed

set

by

the

in question

other

strained. X. The

to cause

register

SR

way

a symbol othe1' The

R contains

networks

registers

symbol

association

(ON/OFF) we can

is equivalent

unassigned

temporary

Then

the

viewed

a

for

all the

sharing.

in the

state

binds

same

probably view

be

some change to take place in the are equal to the one just broadcast.

subnetwork

a Y-based

which

the

can

the symbol in the register. The symbol-broadcast paths joining each SR to every other .9_,, through

registers

is ON.

quite

that

RPE

is for a command

by symbol

over S.

of S that

CM

distributor.

to define

an activation portion

the

which

and

the

and subsequently their own symbols

associated

pattern

except

in

say

at each that

each

Y, anothel'

among unit,

to the the

so that

such

portion

same

presence activation registers.

we can identify is a complex,

in question.

PSA as involving has countenanced

binding nodes quite complex

& Hinton's distributed binding nodes is merely

binding node at an extreme

is a highly complex one, binding node arrangements, schemes point

at

reviewed in Section on the same path of

elaborateness.

There is going

is nothing very to need circuitry

component and It

reviewed some

be

Smolenky's and Touretzky view of PSA as involving

among PSA

S

sense

contain

constitutes connection

X bv the

deemed

binding

increasing

PSA

such

Clearly, However,

best.

be a

of Conposit

contain

symbols

and

might

version

on the

to the

as activity

version,

within

parallel

pattern indirect

of a symbol

by the

distributed

computed

symbol-sharing

PSA,

in Conposit

CM's

mutuaJly

activation

that

reasoning

the arguments. The random perturbation

strained

about

be used

Let

set

of S is similarly

Assume the

the

distributor. R in some

of a particular

Information. patterns

of propositions

applied

of sylnbols

in Conposit's

registers

S/_ whose activity therefore involves

registers

Associator,

it as

pattern.

a rather

to be broadcast from one of the registers, registers, bv virtue of their noticing that goes

to preserve

associator

case-based

automatically

unwanted

of the

separate

registers

transformation

to prevent

there

nodes

Suppose

broadcast

new, head

are

be expected

similarity

the relationship and addition of a small

component

observation

subnetwork machinery

have

can

Nodes

we

An analogous

still

in the

that

of PSA

required

to appear),

a hashing

requires

at each

the

Indeed,

in order

no longer

small

we might

patterns

is includes

versions

but

& Srinivas,

states in the registers representing unassigned symbols also involves of the

advanced

subnetworks,

(Barnden

computation

more

version

hi component

looser symbols

and

special to Conposit directly or indirectly

in the above argument, since allowing associator patterns

any system using to be transmitted

networks.

Signatures should

be

in Section

sort

an associator

of activity pattern

clear

that

a "signatures"

2 is a special pattern, in our

case

be it as simple terms.

technique

of PSA,

A special

such

if we assume

as an feature 30

activity

as that that value

of ROBIN's

used

in the

a signature at a single PSA

is that

ROBIN

system

is implemented node. the

A signature signature

as is node

for a concept(onetype of componentsubnetwork)containsa consta_tsignature pattern), whereas can contain any node plus semantics

subnetwork

Each

slot

component that

unassigned

part

of a ROBIN containing has

symbols,

Since for other

the

identifying, at like at; address

of an

or pointer

tul'es

address

or pointer

order

lnight

to

buttress

neuron Arndt

siblv widely information

their

clump

of a -1. No_ _

can

and

different

con._'trml

concept. associative

from

_he

col_tain

particular

Hence. a signature a.ddr_ssing key.

mechanism

siat_nlllr_'-:

.s_lbl_etu,ol"k.

for

using

as

well

as

is somewhat Certainly. the

using a system to access a concept machinery used in a computer.

to any

inclusion

appear

parts fl'om

of oscillation

possibility

in Section

clumps.

in the

of signatures,

brain

as

Lange

phase-locked

& Dyer patterns

assemblies and central pattern generators. & Dicke (1989) have proposed that observed

separated extracted

has a that

subneIwork Itowever, the'

a giw_u

address

it identifies.

conceivably

pacemaker Reitboeck,

similarity

wl,at

The

in a register

a pa.cticuIar,

system for address-decoding

part.

countenanced

is constant,

identifies

tied

was

registers

other

subnetwork

is not

associator

parts

several

to various

actu_y

in a signature-based in detail from the

to access

In

since

in a concept

a. signature

as a distinct

a higher level of description, a particular or pointer as well as being like an

machinery used mav be different notion

clumt_

subnetwork together wi*h bind nodes) generally in PSA there is no a.ssump_ion

associator

feature,

associator

subnetwork) (a) a concept

at. a]l.

counts

several

the

signa.ture

fixed

frame

a. similar

linking

concepts,

or (by a frame fixed, whereas

is semantically

subnetwork Conposit

(i.e.

node for a role of a fl'ame (another type of component Also, in this view, a component subnetwork (either

associated sigllature node, that is wholly or tm.rtialIy

a component

also

the binding signature.

of visual different

frequency

cortex (in the cat) parts of an image.

and

phase

can

(1989)

note

of oscillation

as a valid

notion

signa-

produced

We may _dso note that phase-locking of oscillations

might be used to transiently This fits in well with the

be taken

that

by

Eckhora. ]n po.s-

link together PSA id¢,a, slut>

of associator

pattern

similarity. PSA,

the

Time

In Section phases.

The

Phase

Technique,

2 we looked

last

paragraph

at

and

a connectionist

of the

previous

phase technique is a PS'k technique and where, furthermore, the temporal at different

nodes

in the

possibly at different that links the node In the

this

idea

view

the

information-structuring leads

phased

& Ajjanagadde

pulsations

an associator

time-phase

(1989)

as providing

pattern

technique

gener_d

are

in the

similarity

as

to each (the

patterns,

other similarity whereas

following

consisting PSA

exploits

sense.

all at the serves

as an

typical

earlier that the time-phase is just a special case of the

dual

are

that

of PSA

address

as a PSA

technique

technique fact that

RPE

in the

similarity

of pattern

similarity case

not

time-

in naluic. of activity

associator

we get

but pattern

away

of a particular, than

the

froln fixed

signature

dual of I{PE. and PSA in

of spatial

of Conposit)

but

ol_ lime the

frequency,

is a temporal RPE in general

exploits

in adjacency',

same

as an

a form

(signature)

is more

is. Now, we remarked now observe that this

patterns

based

observation

every associator pattern is p¢_rel!j temporal is purely a matter of phase. The pulsations

technique We can

of activity

technique

us to the

A node's pulsation at a given phase nodes pulsing at the same phase.

of using

so that

Duality

subsection

in which quality

of Shastri

of diversely

in ROBIN

subnetwork,

system

phases. to other

RPE/PSA

similarit.v

but

positio_ not

their

ot spat.ial

position.

The negations here should be taken as being partial, so that the duality is only approximate. For instance, it is possible to imagine a PSA-based system in which different modules used independent PSA-based representations, so that the presence of the same associator pattern in component subnetworks in two different modules did not constitute a cross-module association. In this case, the PSA in the system as a whole involves a crude type of similarity of spatial position, in the sense that an association requires the associator pattern instances to be in the same module and thereby to be in "similar" positions. Conversely, one could imagine a form of RPE in which two adjacent component subnetworks were only taken as being associated if they were in similar states in some sense -- for example, if they both had a certain highlighting flag on, if the system were somewhat Conposit-like. (We should recall that in Conposit adjacency only signifies association if the adjacent registers are in suitable highlighting states. However, it is difficult to see this dependence on highlighting as a type of pattern similarity.)

PSA and BoltzCONS

At a high level of description based on symbol triples, BoltzCONS uses an associative addressing scheme. Given that PSA is also associative, the question arises of what relationship the BoltzCONS technique bears to PSA. An active triple (a,b,c) in a space called Cons Memory is considered to be linked by its CAR field (its second component) to the (assumed unique) active triple having b in its TAG (i.e. first) field, and similarly by its CDR field (its third component) to the active triple having c in its TAG field (assuming there are such triples, i.e. that b and c are not just "basic" symbols denoting objects in the domain of discourse).

However, triples are not implemented by separate component subnetworks as would be required for a simple view of BoltzCONS as using PSA. Each triple is represented by about 28 units in Cons Memory, and the unit-sets for different triples can overlap. Therefore, the question arises of whether one can discern a form of PSA using distributed component subnetworks (= triple representations), a possibility mentioned in Section 4. The simple answer is that we cannot: Cons Memory is simply an unstructured set of units, and the 28 or so units representing (a,b,c) need have no overlap with the set of 28 or so units representing the triple starting with b. We fail to see any useful sense in which activity over the former set is similar to activity over the second, in general.

On the other hand, there does remain a sense in which BoltzCONS can be seen as using PSA. The way a triple in Cons Memory is actually used in processing is for it to be converted to patterns of activity over the so-called TAG, CAR and CDR spaces. See Fig. 8.

***FIGURE 8 ABOUT HERE***

The triple (a,b,c) is converted into the pattern in the TAG space representing the symbol a, the pattern in the CAR space representing the symbol b, and the pattern in the CDR space representing the symbol c. Similarly, on a different occasion the triple (b,d,e) could be converted into the b, d and e patterns over the TAG, CAR and CDR spaces respectively. Now, as far as we can determine from Touretzky (1986), the TAG, CAR and CDR spaces are isomorphic, with one-to-one connectivity between corresponding units, in order to support the simple copying of symbol-representing patterns between the spaces. Under this assumption, the (a,b,c) and (b,d,e) triples lead to the same symbol pattern (for b) being present, only the pattern is instantiated in the CAR space in the case of the former triple and in the TAG space in the case of the latter. A typical operation in BoltzCONS is to convert (a,b,c) in Cons Memory into the representation over the TAG/CAR/CDR spaces, copy the b pattern from the CAR space into the TAG space, set the other spaces to zero activity, use the TAG space to cause stimulation of the representation of (b,d,e) in Cons Memory, and then convert this representation into the corresponding representation over the TAG/CAR/CDR spaces.
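The traversal step just described can be sketched in ordinary code, purely as an illustration of the symbol-level logic. This is not Touretzky's implementation: the sketch assumes that Cons Memory can be stood in for by a plain set of symbol triples and that the TAG, CAR and CDR spaces can be stood in for by three variables. The names `retrieve_by_tag` and `follow_car` are hypothetical.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (TAG, CAR, CDR) symbols


def retrieve_by_tag(cons_memory: Set[Triple], tag: str) -> Optional[Triple]:
    """Associative retrieval: find the (assumed unique) active triple
    whose TAG field matches the pattern currently in the TAG space."""
    matches = [t for t in cons_memory if t[0] == tag]
    if len(matches) > 1:
        raise ValueError("TAG symbols are assumed to be unique")
    return matches[0] if matches else None


def follow_car(cons_memory: Set[Triple], triple: Triple) -> Optional[Triple]:
    """Follow the CAR link of `triple`, mirroring the operation in the text."""
    # Convert the triple into patterns over the TAG, CAR and CDR spaces.
    tag_space, car_space, cdr_space = triple
    # Copy the CAR pattern into the TAG space; set the other spaces to zero.
    tag_space, car_space, cdr_space = car_space, None, None
    # Use the TAG space to stimulate the matching triple in Cons Memory.
    return retrieve_by_tag(cons_memory, tag_space)


cons_memory = {("a", "b", "c"), ("b", "d", "e")}
next_triple = follow_car(cons_memory, ("a", "b", "c"))
```

Here `next_triple` is the triple tagged b. Following the CAR link of that triple in turn yields nothing, because d is a basic symbol with no triple of its own in this toy memory.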

The upshot of this is that the combined TAG/CAR/CDR space can be regarded as a single component subnetwork that represents different triples at different times, where triples appear dynamically through an operation based on pattern similarity. Thus, we might call this scheme a "diachronic" PSA scheme, where, instead of having several component subnetworks simultaneously present and containing triples synchronically inter-linked by pattern similarity, there is a single component subnetwork containing triples diachronically inter-linked by pattern similarity. Clearly, these are extreme ends of a spectrum, and we could presumably design schemes that intermix synchronic and diachronic PSA.

Suppose now that the TAG and CAR spaces are not isomorphic, and there is no obvious sense in which the pattern representing b in TAG space is similar to the pattern representing b in CAR space. Rather, there is some more or less complex arrangement of connections joining the two spaces, such that the presence of the b pattern in one space can be used to cause the presence of the b pattern in the other space. Can we still say we have a diachronic PSA scheme? Our inclination is to say yes, regarding the b patterns in the two spaces as being similar in an advanced sense. However, for the reader who objects to this liberal view of similarity we suggest that BoltzCONS can be viewed as using a diachronic "pattern-relationship association" (PRA) technique. PRA is just a simple generalization of PSA:

Pattern-Relationship Association

PRA is defined by modifying the general presentation of PSA in Section 4 as follows. An associator pattern X in one component subnetwork is considered to associate the patterns XX in other component subnetworks that are related to it in some specified way, rather than similar to it.

Then, PSA is simply an especially important species of PRA. We will in any case be bringing in PRA, for other reasons, in a moment.

Reduced Descriptions, PSA and PRA

In our description of the reduced descriptions technique (RDT) in Section 2, we talked as if the activity patterns D and DD must sit in distinct subnetworks. Although this is not in fact a necessary aspect of RDT, we will assume it to be the case for simplicity. Then, if we call these subnetworks component subnetworks, it should be clear that RDT falls clearly under the notion of pattern-relationship association (PRA) described at the end of the previous subsection. In RDT, a pattern is "related" to another if it is a reduced version of the other or vice versa, reduction being defined as the transformation effected by T. (Of course, we need to know which way round the reduction is going here, but that is no problem in the systems cited, since the set of component subnetworks is non-homogeneous.) In this way, RDT and PSA can be viewed as sibling techniques, with PRA as their parent.

Notice also that a new version of Conposit referred to earlier (Barnden & Srinivas, to appear) can be viewed as using a version of RDT that includes PSA as a subcomponent.
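The parent/sibling relationship among PRA, PSA and RDT can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than part of any system discussed here: patterns are short binary tuples, "similarity" is taken to mean at most one mismatching unit, and the reduction `toy_T` (pairwise exclusive-or) is a hypothetical stand-in for a real reduction mechanism T. The only point is that one association routine serves both techniques once the relation is supplied as a parameter.

```python
from typing import Callable, Dict, List, Tuple

Pattern = Tuple[int, ...]
Relation = Callable[[Pattern, Pattern], bool]


def associates(source: str, subnetworks: Dict[str, Pattern],
               related: Relation) -> List[str]:
    """PRA: the pattern in `source` is associated with every pattern in
    another component subnetwork that is `related` to it."""
    x = subnetworks[source]
    return [name for name, pattern in subnetworks.items()
            if name != source and related(x, pattern)]


def similar(p: Pattern, q: Pattern) -> bool:
    """PSA as a species of PRA: same size, at most one mismatching unit."""
    return len(p) == len(q) and sum(a != b for a, b in zip(p, q)) <= 1


def toy_T(p: Pattern) -> Pattern:
    """Hypothetical reduction: XOR adjacent pairs, halving the pattern."""
    return tuple(a ^ b for a, b in zip(p[0::2], p[1::2]))


def reduction_related(p: Pattern, q: Pattern) -> bool:
    """RDT as a species of PRA: one pattern is the reduction of the other."""
    return toy_T(p) == q or toy_T(q) == p


subnetworks = {"S1": (1, 0, 1, 1), "S2": (1, 0, 1, 0),
               "S3": (1, 0), "S4": (0, 1, 0, 0)}
psa_links = associates("S1", subnetworks, similar)
rdt_links = associates("S1", subnetworks, reduction_related)
```

With these toy patterns, the similarity relation links S1 to S2 only, while the reduction relation links S1 to S3 only, since S3 holds the reduced form of the pattern in S1.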

In Fig. 5 (for "Bill hopes that John loves Mary") the unassigned symbol X in the new version would be computed from the states of the registers holding the LOVES, JOHN and MARY symbols. The combined state of those registers can be taken as the activity pattern D in the previous paragraph. X is therefore the corresponding reduced representation d for "John loves Mary". The register clump for the "Bill hopes" aspect of Fig. 5 uses the reduced representation X (i.e. d). The symbol/highlighting activity pattern over this clump can be taken as the pattern DD. The reduction mechanism T is the subsystem for computing unassigned symbols from states of appropriate registers. The inverse

mechanism takes

t is less

an

obviously

unassigned

attention

(e.g.

by

view,

PSA

arising

tile

Reduced Earlier

that

to this

relationship,

strong

artificial

of X can

transformation

I) itself.

related.

The

entry

information

the

out

that

are

certainly

First, unlike

the

in the

second

access fact

that

via

the

hash

hashing

a. table-entry This method Stallings

key

for accessing

reflect

1987,

one

difference

similarity,

that

that system's

Undel'

also

Conposit

Ibis

attention

plays

hashing

was

RDT

can

bv conncctionists

in computer

in fu]l

be regarded

science

and

several

d that are used smaller datum

form.

toge_ her with

as a mechanism

can, however, picture, but

have

of

transhashing

as follows.

D itself

can

a type

as the reduction theft RDT and

t that

be _'collisions", when does not significantly

or many

D mapped

to the

will continue to suppress interests of abbreviating

that

is that

representations similar data

(Hinton structures

of collisions.

Ilowever, of task

the

as a way of building

patterns

this

it can reduced

For instance,

feature hashing in the

data

and

could

typical

which

our be

relatively

of data

structure llowever.

purpose. is ti_at locations, of a sort they

merely

the

came).

reflects

values

in essence, (in that they However,

the

fact

th_tt

rather than in associative memories. After associative ones. However, in a computer

d of a bit-string

addressing

D is used

D bit-strings

as an

are

in fact

key

rather

than

as

the d values in RDT. Vax 11/780 computer associative addresses

addressing of blocks

in

point.) desirable

descriptions be effected

of hashing tends

D from

(in essence)

are

structures,

hashing

in hashing look much more like scheme for cache memory in the (The

they

for this

as an associative

1988). On the other hand, hash to markedly dissimilar

to which

but a mode

used

RDT

addresses

be used

to D.

disturb

that

affording

it being

between

t to the

a subsequence

operations

merely

computer memories, more common than

not

as

RDT,

used

to

location

related

and

used

prevents

point

d can indeed

does

hashing

T are normally addresses of storage by T are associative addressing keys

mechanism

in ordinary are much

so that

associative

technique.

might

normally

p.114):

A third

the

are

information but

types

gener,_lly

function d produced

memory,

memory,

the

X.

D to integer hash keys key d is usually a much

table

between

address, making the d values is used in the "set-associative"

(see

pattern

hastfing and

with d. There complicates the

latter

little

contains

the

new

in our description of RDT, and on RDT and hashing, in the

technique

expansion keys

an associative

main

the

differences

is not

difference

hashing occurs mainly all, ordinary memories with

typically

is predonfinantly

hashing

d produced by the hashing whereas in RDT the values allow

since

some

hashing

RDT,

is nothing The

contain

the

)

superficial there

to RDT,

suppressed this complication of collisions in our comments

There storage:

switch

of t.

been

that

of RDT

Therefore

the

discussion.

to

can be regarded case of the fact

has

role

in HT

D.

weaken

relationship

tha.t

by T in the

T maps data structures table HT. The hash

d indexes with

there

similarity

other effects) delivers D on presentation different D map to the same d. This

same d. (We c_nsideration

of lnechanisnls

is able

as a part

performed

important

the

function in a hash

associated

collection

and

of registers

be seen

Surprisiugly,

despite

We bring

a hashing of entries

as the

Hashing

(among several

the

be taken "a_ registers,

to neighbors

sharing

the

strongly

intelligence. In hashing, positions

extra

can

it to

\Ve have also just said that this transformation in an application of RDT. This is just a special are

than

the and

we said

but

broadcasts changes)

from

in computers

as the

X,

highlighting

Descriptions

"hashing". formation

isolatable,

symbol

for

the

can

take

more

is really

case

when

driven

34

and the

part

in

elaborately

in hashing, hash keys,

to be applied,

reduction

mechanism

T to

associative and

preserve

operations

accurately

on

that the

full

one often tries hard to ensure that in order to minimize the probability by special has

hashed

nothing data

considerations to do structures

with

to do with the

esseuce

zLre progra,nming

of

language symbols, symbols which are similar as strings of characters usually have no semantic relationship which the language processor is expected to respect, so that there is no reason for wanting them to hash to the same or similar keys. However, such collisions or near-collisions could be desirable if they result from meaningful similarities among the hash[...] similarity by the hashing function could enable positions similar to a given one, P, to be found efficiently by looking at the positions indexed by hashing function values approximately equal to the hashing function value for P.

Hashing is used in computers precisely because it is a way of avoiding sequential search to a large extent. On average, it allows very fast retrieval, with almost no search of memory, since (ideally) the value d gives immediate access to D. (The possibility of search arises because of collisions, but with the right parameters and methods the effect of collisions can be made small on average.) This point is highly significant from the point of view of seeing how connectionism relates to other areas of computation science. It is too often assumed that one sharp contrast between connectionist systems and standard symbolic systems is that the former substitutes fast (parallel) memory access mechanisms for slow (sequential) memory-search mechanisms. Although there is a significant amount of truth in this, it greatly oversimplifies the true situation.

Finally, we observe that the new Conposit's hashing is appreciably more like conventional hashing in computers than prototypical RDT is. This is because we can regard the whole CM itself as the analogue of a hash table. Going back to the example we used in Fig. 5, the unassigned symbol (hash key) X provides access to the "hash table entry" consisting of the register clumps that involve X. In RDT as found in the systems cited, there is no such clear analogue of a hash table. Indeed, in those systems the mechanism t is not a straightforward access mechanism as it is in the new Conposit and in computer hashing, but is more straightforwardly speaking a reconstruction mechanism. Having said this, it would be difficult to draw a sharp line between the notions of access and reconstruction.

Pointers and Associative Addressing

Many people probably think of associative addressing and pointers in computers as very different techniques. However, they are more similar, both at a conceptual level and at the hardware level, than is usually observed. In discussing this issue briefly it will become apparent that the notion of pointers is less antithetical to connectionism than is often assumed.

To take the conceptual level first, we view an instantaneous state of a computer memory as a function π from the set M of memory cells to the set B of bit-strings of the right length. Now, a function from M to B is mathematically just a special type of relation from M to B; and a relation from M to B is in turn just some set of ordered pairs (m,b) where m is in M and b is in B. Also, there is a one-to-one correspondence between M and the set A of bit-strings that can be interpreted as addresses of memory cells. (A is a subset of B.) Therefore, we may conceptually recast a memory state π as a set of ordered pairs (a,b) where a is in A and b in B. We may further recast by replacing each ordered pair (a,b) by the concatenation ab of the strings a and b. We have therefore recast a memory state π as an unordered set of bit-strings. Hence, the following of a pointer p held in a memory cell is to be recast as finding the unique bit-string of the form pb (for some b) in the current (recast) memory state. Viewed this way, pointer following is just a type of associative addressing. It is, of course, a very special type, in that the set of possible associative tags p is regarded as a set of consecutive integers in some range, allowing arithmetical operations on the tags (and therefore, for instance, allowing the sequential allocation technique to be used as well).
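The recasting argument can be illustrated with a few lines of code. The sketch below is only a toy under stated assumptions: memory cells hold 2-bit strings, addresses are 2-bit strings, and the function names `recast` and `follow_pointer` are hypothetical. A memory state is turned into an unordered set of concatenated strings ab, and pointer following becomes a search for the unique member beginning with the tag p.

```python
from typing import List, Set


def recast(memory: List[str], address_bits: int) -> Set[str]:
    """Recast a memory state as an unordered set of bit-strings ab, where a
    is the address of a cell and b is the bit-string the cell holds."""
    return {format(address, "0{}b".format(address_bits)) + contents
            for address, contents in enumerate(memory)}


def follow_pointer(state: Set[str], p: str) -> str:
    """Follow pointer p by associative lookup: find the unique string of the
    form pb in the recast state and return b."""
    matches = [s for s in state if s.startswith(p)]
    if len(matches) != 1:
        raise LookupError("expected exactly one bit-string tagged " + p)
    return matches[0][len(p):]


# Four cells with 2-bit addresses; cell 00 holds the pointer 10.
memory = ["10", "11", "00", "01"]
state = recast(memory, address_bits=2)
contents = follow_pointer(state, "10")
```

Because every address has the same length, at most one string in the set can begin with a given tag, which is what makes the lookup behave like ordinary dereferencing. Nothing in `follow_pointer` relies on the tags being consecutive integers; that extra structure is what conventional pointers add on top of associative addressing.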

Our strike

suspicion

many

is that

readers

as

this

being

norinal views of such notions sort of low-level architecture tags

of some

are

in the

seeing to cut

sort,

but

business

more

explicit

and

such

as used

in Section

as names

about

what

we mean

exploited

is concerned

there

given

that

is not

in our The

view

of pointing,

question

then

low

of the

level

about the of pointer_

specific just as

tlowevcr,

if we

system, and

in particular

If we stick we may

conclude

that

pointers

and.

associative

between

to be something

to a different

to an abstract say.

view as far

connectionism

becomes we wish

is good

a qlfite natural special to include l¢_ss abaci'act

conclusion.

of computer

pointing,

if any.

might

w(, _t oJ"Massively Parallel 5'cientiJic C'om I)_tatio a. N A S :\ Con ['_,re_L ce

Barnden, J.A. (1988). Conposit, a neural net system for high-level symbolic processing: overview of research and description of register-machine level. Memoranda in Computer and Cognitive Science, No. MCCS-88-145, Computing Research Laboratory, New Mexico State University.

Barnden, J.A. (1989). Neural-net implementation of complex symbol-processing in a mental model approach to syllogistic reasoning. In Procs. 11th Int. Joint Conf. on Artificial Intelligence. San Mateo, CA: Morgan Kaufmann.

Barnden, J.A. (1990). Encoding complex symbolic data structures with some unusual connectionist techniques. In J.A. Barnden & J.B. Pollack (Eds.), Advances in Connectionist and Neural Computation Theory, Vol. 1. Norwood, N.J.: Ablex Publishing Corp.

Barnden, J.A. & Srinivas, K. (to appear). Overcoming rule-based rigidity and connectionist limitations through massively-parallel case-based reasoning. To appear in Int. J. Man-Machine Systems.

Barnden, J.A., Srinivas, K. & Dharmavaratha, D. (1990). Winner-take-all networks: time-based versus activation-based mechanisms for various selection goals. In Procs. IEEE International Symposium on Circuits and Systems, New Orleans, May 1990.

Charniak, E. & Santos, E. (1987). A connectionist context-free parser which is not context-free, but then it is not really connectionist either. In Procs. 9th Annual Conference of the Cognitive Science Society. Hillsdale, N.J.: Lawrence Erlbaum. A revised version appears in J.A. Barnden & J.B. Pollack (Eds.), Advances in Connectionist and Neural Computation Theory, Vol. 1. Norwood, N.J.: Ablex Publishing Corp.

Clossman, G. (1987). A model of categorization and learning in a connectionist broadcast-system. Ph.D. Thesis, Computer Science Dept., Indiana University, Bloomington, IN.

Eckhorn, R., Reitboeck, H.J., Arndt, M. & Dicke, P. (1989). Feature linking via stimulus-evoked oscillations: experimental results from cat visual cortex and functional implications from network model. In Procs. 1st Int. Joint Conf. on Neural Networks, Vol. I. IEEE.

Feldman, J.A. (1982). Dynamic connections in neural networks. Biological Cybernetics, 46, pp. 27-39.

Fodor, J.A. & Pylyshyn, Z.W. (1988). Connectionism and cognitive architecture: a critical analysis. In S. Pinker & J. Mehler (Eds.), Connections and symbols, Cambridge, Mass.: MIT Press, and Amsterdam: Elsevier. (Reprinted from Cognition, 28, 1988.)

Goddard, G.V. (1980). Component properties of the memory machine: Hebb revisited. In P.W. Jusczyk & R.M. Klein (Eds), The Nature of Thought: Essays in Honor of D.O. Hebb. Hillsdale, N.J.: Lawrence Erlbaum.

Hendler, J.A. (1989). Marker-passing over microfeatures: towards a hybrid symbolic/connectionist model. Cognitive Science, 13, pp. 79-106.

Hinton, G.E. (1981). A parallel computation that assigns canonical object-based frames of reference. In Procs. 7th Int. Joint Conf. on Artificial Intelligence, Vancouver, British Columbia.

Hinton, G.E. (1988). Representing part-whole hierarchies in connectionist networks. In Procs. 10th Annual Conf. of the Cognitive Science Society. Hillsdale, N.J.: Lawrence Erlbaum.

Hinton, G.E. & Plaut, D.C. (1987). Using fast weights to deblur old memories. In Procs. 9th Annual Conf. of the Cognitive Science Society. Hillsdale, N.J.: Lawrence Erlbaum.

Hwang, K. & Briggs, F.A. (1984). Computer architecture and parallel processing. New York: McGraw-Hill.

Jensen, K. & Wirth, N. (1974). PASCAL user manual and report. 2nd. ed. New York: Springer-Verlag.

Johnson-Laird, P.N. (1983). Mental models. Harvard University Press: Cambridge, Mass.

Kohonen, T., Oja, E. & Lehtiö, P. (1989). Storage and processing of information in distributed associative memory systems. In G.E. Hinton & J.A. Anderson (Eds), Parallel Models of Associative Memory, Updated Ed. Hillsdale, N.J.: Lawrence Erlbaum.

Lange, T.E. & Dyer, M.G. (1989). Dynamic, non-local role bindings and inferencing in a localist network for natural language understanding. In D.S. Touretzky (Ed.), Advances in Neural Information Processing Systems I. San Mateo, CA: Morgan Kaufmann, pp. 545-552.

Lange, T.E. & Dyer, M.G. (in press). High-level inferencing in a connectionist network. Connection Science, 1 (2).

Lehnert, W.G. (1990). Symbolic/subsymbolic sentence analysis: exploiting the best of two worlds. In J.A. Barnden & J.B. Pollack (Eds.), Advances in Connectionist and Neural Computation Theory, Vol. 1. Norwood, N.J.: Ablex Publishing Corp.

McClelland, J.L. (1986). The programmable blackboard model of reading. In J.L. McClelland, D.E. Rumelhart and the PDP Research Group, Parallel Distributed Processing, Vol. 2. Cambridge, Mass.: MIT Press.

McClelland, J.L. & Rumelhart, D.E. (1981). An interactive activation model of context effects in letter perception: Part 1. Psychological Review, 88, pp. 375-407.

Pinker, S. & Prince, A. (1988). On language and connectionism: analysis of a parallel distributed processing model of language acquisition. In S. Pinker & J. Mehler (Eds.), Connections and symbols, Cambridge, Mass.: MIT Press, and Amsterdam: Elsevier. (Reprinted from Cognition, 28, 1988.)

Pollack, J.B. (1987). Cascaded back-propagation on dynamic connectionist networks. In Procs. 9th Annual Conf. of the Cognitive Science Soc. Hillsdale, N.J.: Lawrence Erlbaum.

Pollack, J.B. (1988). Recursive auto-associative memory: devising compositional distributed representations. In Procs. 10th Annual Conf. of the Cognitive Science Soc. Hillsdale, N.J.: Lawrence Erlbaum.

Rumelhart, D.E. & McClelland, J.L. (1982). An interactive activation model of context effects in letter perception: Part 2. Psychological Review, 89, pp. 60-94.

l{umelhart, D.E. _ .hlcClelland,J.L. (1986).On learningthe past ten._cs ot"15ng]}:l_ v.IlL.J.l.. McClelland,D.E. l{ulnelhart andthe PDP Research Group, Partfllcl Di._t,ibut