Evolving Grounded Spatial Language Strategies


Michael Spranger


Abstract Each natural language phrase is evidence for a particular strategy of construing reality. One domain where this has been extensively studied is spatial language, which reveals an enormous amount of variation of conceptualization strategies both within a particular language and cross-culturally. This paper proposes a computational formalism for representing conceptualization strategies and shows how the formalism can be used to study and explain the evolution and emergence of spatial conceptualization strategies and their impact on shared grounded communication systems.

Keywords symbol grounding · language evolution · procedural semantics · conceptualization strategies · spatial language

Michael Spranger, Sony CSL Paris, 6, rue Amyot, 75005 Paris, France. E-mail: [email protected]

1 Introduction

Numerous studies and experiments have shown how grounded shared lexicons can emerge in populations of artificial agents that engage in communicative interactions about objects and actions in the physical world [25, 12, 32, 3, 27]. We now have a body of theories and mechanistic explanations for the emergence of sensorimotor category systems and their co-evolution with language, especially for perceptually grounded domains such as color [2], space [17] and actions [28]. These studies demonstrate the feasibility of artificial, open-ended, grounded communication systems, and they have contributed significantly to our understanding of the social and perceptual grounding of symbols in artificial systems.

Most of this work, however, considers only simple models of language in which utterances consist of one or multiple words, without taking into account the complex structure of language. Natural language involves more than just lexical meanings: in human language, syntactic structure conveys meaning. For instance, the following two examples from German involve the same spatial category, determiner and object class. However, they do not mean the same thing, and they will most likely have different referents when uttered in the same context.

(1) der      linke         Block
    the.NOM  left.ADJ.NOM  block.NOM
    'The left block'

(2) links          des          Blockes
    left.PREP.GEN  the.DET.GEN  block.GEN
    'to the left of the block'

Notice that the difference between these phrases is not one of compositionality vs no compositionality, which has itself been the subject of many symbol grounding studies [13, 4, 31]. The difference between these two phrases is one of syntactic structure. The lexical classes, word order, case and morphology are different. Most importantly for our purposes, the syntactic differences hint at significant differences in semantics. In particular, they communicate different instructions for how to process a particular real-world situation in order to identify a referent. In the first phrase, the speaker asks the hearer to find a set of blocks and then to identify the leftmost block of that set. The second phrase actually references a region, namely the region to the left of the block [30]. In summary, both phrases reveal different conceptualizations of reality using the same concepts and categories.

In this paper we investigate 1) how to represent different conceptualization strategies, 2) how different conceptualization strategies influence the development of lexical systems and 3) how conceptualization strategies themselves can evolve and emerge in artificial systems. First though, we give a brief account of the experimental setup used in subsequent sections.

2 Situated Spatial Interactions

One of the key methodologies for research and validation of theories of grounding are situated interactions called language games. Language games are interactions of two or more agents in the real (or in a simulated) world. For the experiments presented in this paper, we used Sony humanoid robots [5]. The robots encounter various objects in their environment and they are programmed to talk about these objects. Figure 1 shows the environment in which two robots interact. Robots are equipped with a vision system that fuses information from the robot's camera with proprioceptive sensors distributed across the body. The vision system singles out and tracks objects [15, 20]. Here, the environment contains four types of objects: blocks, boxes, robots and geocentric markers. The vision system extracts the objects (as blobs) from the environment and computes a number of raw, continuous-valued features such as x and y position, width, and height.

Fig. 1 Spatial setup. To the left and right, the world models extracted by each robot are shown.

Fig. 2 Schematic of the systems involved in autonomous processing of spatial scenes such as the one shown in Figure 1. Left are the processes the speaker goes through. Right shows the processes the hearer is running.

A spatial language game follows a script between two agents randomly drawn from the population P of agents. One acts as the speaker, the other as the hearer (see Figure 2 for an overview of the processing steps involved).

1. The speaker selects an object out of the context, further called the topic T.
2. The speaker tries to find a conceptualization strategy comprised of spatial relations, category and construal operations for describing the topic. This process is called conceptualization.
3. The speaker tries to express the conceptualization using his knowledge of the language (production). For example, he looks up the word associated with the spatial relation in his memory and produces the word.
4. The hearer parses the phrase (parsing). For instance, when hearing a spatial term he looks up in his memory which relation is associated with this word.
5. If the hearer was able to parse the phrase or parts of the phrase, he examines the context to find the object which satisfies the conceptualization strategy he thinks underlies the phrase (interpretation).
6. The hearer points to this object.
7. The speaker checks whether the hearer selected the same object as the one he had originally chosen. If they are the same, the game is a success and the speaker signals this outcome to the hearer.
8. If the game is a failure, the speaker points to the topic T.

Such an interaction can fail at different points in the interaction script. For instance, the speaker might be unable to discriminate the topic T because he has no adequate spatial relation or conceptualization strategy (step 2 fails). Success and failure of communication provide opportunities for agents to learn or adapt their internal representations. The exact mechanisms for evolving agent-internal representations are discussed later in this paper.
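The script and its failure points can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions: the `Agent` class, its methods, the one-dimensional prototype categories and the single shared context are all hypothetical and are not taken from the robot system (in particular, each robot actually builds its own world model).

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    # Hypothetical inventories: category name -> prototype x position,
    # and word -> category name.
    categories: dict = field(default_factory=dict)
    lexicon: dict = field(default_factory=dict)

    def interpret(self, category, context):
        # Pick the object whose x position is closest to the prototype.
        proto = self.categories[category]
        return min(context, key=lambda obj: abs(context[obj] - proto))

    def conceptualize(self, topic, context):
        # A category is usable if applying it singles out the topic.
        for name in self.categories:
            if self.interpret(name, context) == topic:
                return name
        return None

    def produce(self, category):
        for word, cat in self.lexicon.items():
            if cat == category:
                return word
        return None

    def parse(self, word):
        return self.lexicon.get(word)


def play_game(speaker, hearer, context, topic):
    """Run steps 2 to 7 of the script and report where the game ends."""
    category = speaker.conceptualize(topic, context)   # step 2
    if category is None:
        return "conceptualization-failed"
    word = speaker.produce(category)                   # step 3
    if word is None:
        return "production-failed"
    parsed = hearer.parse(word)                        # step 4
    if parsed is None:
        return "parsing-failed"
    guess = hearer.interpret(parsed, context)          # steps 5 and 6
    return "success" if guess == topic else "failure"  # step 7


speaker = Agent({"left": -1.0, "right": 1.0}, {"links": "left", "rechts": "right"})
hearer = Agent({"left": -1.0}, {"links": "left"})
context = {"obj-1": -0.8, "obj-2": 0.9}  # made-up x positions

print(play_game(speaker, hearer, context, "obj-1"))
print(play_game(speaker, hearer, context, "obj-2"))
```

In the second game the hearer does not know the word the speaker produces, so the game stops at parsing. This is the kind of failure that gives the hearer an opportunity to learn.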

    (identify-location-proximal ?target ?src ?cat)
    (bind proximal-category ?cat near)
    (geometric-transform ?src ?ctx ?landmark)
    (apply-selector ?landmark ?boxes ?selector)
    (bind selector ?selector unique)
    (apply-class ?boxes ?ctx ?class)
    (bind object-class ?class box)
    (get-context ?ctx)

Fig. 3 IRL-program representing the semantic structure of the phrase "near the box". The connections between the operations and bind-statements signify variables appearing in multiple places of the graph.

3 A Computational Framework for Complex Grounded Semantics

To represent conceptualization strategies, we developed a formalism called Incremental Recruitment Language (IRL) [21, 19, 24]. The key idea behind this formalism is that the semantics of natural language can be modeled as a semantic program [6]. In IRL, the meaning of an utterance consists of an algorithm and data pointers that, when executed by the hearer, will lead him to identify the referent or topic.

As an example, consider the phrase "near the box". The phrase consists of a spatial relation (near), the concept (object class) box, and a determiner. Moreover, the phrase encodes a particular conceptualization strategy where the spatial relation is used in conjunction with a landmark. An interpreter of the phrase has to construe reality from the landmark (the box) and apply the spatial relation. Figure 3 shows the IRL-program underlying the phrase. Such programs consist of two kinds of elements and the links between them.

Cognitive operations, also called semantic operations, are the algorithms used in conceptualization. They encode a particular cognitive function such as categorization using a spatial category, applying a selector or applying an object class. Cognitive operations are identified by their name, e.g. identify-location-proximal, and they have a set of arguments, which can be linked to other operations or semantic entities via variables (starting with ?).

Semantic entities are the data that cognitive operations work with. They can be prototypes, concepts and categories. Besides such long-term data, semantic entities can also be discourse representations, the representation of the current context and data exchanged between cognitive operations. They are introduced explicitly in the network via bind-statements, which are special operations for retrieving the actual data representation using a pointer or shorthand notation for it. For instance, the statement (bind proximal-category ?cat near) encodes the access to the proximal category near.

Linking via variables is important because it represents how data flows in the program. In the IRL-program in Figure 3, the first argument of the operation geometric-transform is linked via the variable ?src to the second argument of identify-location-proximal. This means that the category is applied to the scene transformed to the viewpoint of the box.

Evaluation. Let us assume the hearer wants to interpret the example phrase "near the box" and has decoded the IRL-program in Figure 3 from the utterance. In order to find the referent of the phrase, the hearer will execute, i.e. evaluate, the program. First all bind-statements are evaluated, after which the variable ?class is bound to the concept box, ?selector to the determiner unique, and ?cat to the spatial category near. After that, the evaluation engine will try to find cognitive operations that can be evaluated. Most likely, the first cognitive operation to be evaluated is get-context, which binds the variable ?ctx to all the objects identified by the vision system of the hearer robot. Next apply-class is called with the arguments ?ctx and ?class, that is, with the objects from the context and the box object class. The apply-class operation will filter all objects from the context which are boxes and bind the resulting set to the variable ?boxes. This is followed by the application of a uniqueness constraint through the operation apply-selector, which binds the object "the box" from the set of objects in the context to the variable ?landmark. The box is then passed to geometric-transform, which will use it to transform the context. After that the spatial relation is applied through identify-location-proximal. Here that operation identifies an object from the context and binds it to the variable ?target, the referent of the phrase.

The above description sounds similar to normal control flow in computer programs. However, language processing requires data flow rather than control flow [24]. The program in Figure 3, therefore, does not describe a necessary sequence of computations but just how the data is linked through variables. Consequently, cognitive operations differ from standard functions, e.g. in C/C++, in that they are multidirectional. For instance, the operation apply-class computes a set of boxes when given a set of objects and the object class box. But the same operation is also able to compute, given two sets of objects, the object class most likely to transform one set into the other.

Conceptualization strategies. The IRL-program in Figure 3 is part of a general conceptualization strategy, namely the proximal spatial strategy, which in English also includes the relation far. If we remove the spatial relation near from the semantic program depicted in that figure, we are left with a template which involves a landmark (the box) and an (unspecified) proximal spatial relation. That template would be equivalent to "X of the box", where X is some proximal category. We call such partial structures chunks [19]. Chunks are reified conceptualization strategies. They have a score, which represents how much the agent prefers the strategy over others (e.g., see [11] for preferences in perspective choice).

Spatial conceptualization strategies involve more than just the choice of a spatial relation. Landmarks, perspective and frames of reference [29] are all important aspects of the construal of spatial reality. Researchers are still mapping out the taxonomies and unifying theories for the vast number of spatial conceptualization strategies found in natural language [9]. For instance, which landmarks can be used with a particular spatial relation (just people, animals or also inanimate objects) is part of the choices manifest in a particular strategy. We can represent all these different factors using distinct cognitive operations, chunks and IRL-programs.
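The data-flow evaluation of the "near the box" program described above can be illustrated with a small Python sketch. It is not the IRL implementation: the world model, the coordinates, the distance-based reading of near and far, and the exclusion of the landmark from the transformed context are all assumptions made for this example, and the operations are evaluated in the forward direction only, so multidirectionality is not modeled.

```python
import math

# Hypothetical world model: object id -> (object class, x, y); values are made up.
WORLD = {
    "box-1": ("box", 2.0, 0.0),
    "obj-1": ("block", 2.5, 0.5),
    "obj-2": ("block", -3.0, 1.0),
    "robot-2": ("robot", 0.0, 4.0),
}


def get_context():
    return dict(WORLD)


def apply_class(ctx, object_class):
    # Keep only the objects of the given class.
    return {k: v for k, v in ctx.items() if v[0] == object_class}


def apply_selector(objects, selector):
    if selector != "unique" or len(objects) != 1:
        raise ValueError("only a unique referent is handled in this sketch")
    return next(iter(objects))


def geometric_transform(ctx, landmark):
    # Re-express all other objects relative to the landmark position
    # (dropping the landmark itself is an assumption of this sketch).
    _, lx, ly = ctx[landmark]
    return {k: (c, x - lx, y - ly) for k, (c, x, y) in ctx.items() if k != landmark}


def identify_location_proximal(src, category):
    # Assumption: "near" selects the closest object, anything else the farthest.
    def dist(k):
        return math.hypot(src[k][1], src[k][2])
    return min(src, key=dist) if category == "near" else max(src, key=dist)


# (operation, output variable, input variables), deliberately listed out of order.
PROGRAM = [
    (identify_location_proximal, "?target", ("?src", "?cat")),
    (geometric_transform, "?src", ("?ctx", "?landmark")),
    (apply_selector, "?landmark", ("?boxes", "?selector")),
    (apply_class, "?boxes", ("?ctx", "?class")),
    (get_context, "?ctx", ()),
]


def evaluate(program, bindings):
    """Repeatedly run every operation whose input variables are already bound."""
    bindings = dict(bindings)
    pending = list(program)
    while pending:
        ready = [op for op in pending if all(v in bindings for v in op[2])]
        if not ready:
            raise ValueError("no operation can be evaluated")
        for operation, output, inputs in ready:
            bindings[output] = operation(*(bindings[v] for v in inputs))
        pending = [op for op in pending if op not in ready]
    return bindings


# The bind-statements are evaluated first and provide the initial bindings.
BINDS = {"?class": "box", "?selector": "unique", "?cat": "near"}
result = evaluate(PROGRAM, BINDS)
print(result["?landmark"], result["?target"])
```

The order in which operations run is determined only by which variables are bound, not by the order in which the program lists them, which mirrors the point that the program specifies data links rather than a sequence of computations.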

[Bar chart: lexicon only vs grammatical marking. Vertical axis: communicative success (0 to 1). Experimental conditions: similar perspective, different perspective, many objects.]

Fig. 5 Communicative success of agents equipped with a number of German locative conceptualization strategies. Two populations are compared on three data sets. The data sets vary in number of objects and perspective of robots on the scene. In the left data set all spatial scenes contain only two objects, robots are positioned close to each other, and robots face the same direction. In the right data set there are multiple objects in scenes (up to 14) and the perspective of robots on the scene varies significantly. An example scene from each data set is shown. All populations interact 10000 times on each data set. The bars show the average communicative success (1.0 means complete success in all interactions).

Conceptualization and Interpretation. IRL includes mechanisms for the automatic and autonomous configuration of IRL-programs. Agents use these facilities in two ways. First, when the speaker wants to talk about a particular object in the environment, he conceptualizes a particular IRL-program for reaching that goal (see conceptualization in Figure 2). Secondly, a listener trying to interpret an utterance will construct and evaluate programs in order to find the best possible interpretation of the utterance (see interpretation in Figure 2). Interpretation and conceptualization are implemented as heuristics-guided search processes that traverse the space of possible IRL-programs. The basic building blocks for the search are IRL-programs packaged into chunks. The IRL search process progressively combines chunks of IRL-programs into more and more complex IRL-programs. Each program is tested for compatibility with the goal of the agent, as well as the context. Figure 4 shows an example of such a search process.

Fig. 4 Part of the search tree involved in conceptualizing the IRL-program seen in Figure 3. From left to right, nodes represent progressively growing IRL-programs, which are each tried out and in some cases lead to solutions (green nodes).

Grammar vs lexicon-only. One way to test the formalism is to see whether it can be used to formalize natural language processing so that artificial agents can autonomously produce and interpret natural language phrases. We have carried out such modeling for German locative language.

Figure 5 shows results of robots interacting in various spatial scenes. In these experiments all agents share German locative conceptualization strategies. This includes projective spatial relations (such as front, left), proximal relations (such as near, far) and absolute relations (such as north, west). Moreover, the robots and boxes can be used as landmarks and perspectives. We tested two types of populations. Grammar agents are given a full German locative grammar¹. Lexicon-only agents are not given German grammar but only operate a German lexicon. Their utterances are bag-of-words phrases. The numerical results show the success of communication and validate the whole system. Moreover, the higher success of grammar agents provides evidence for the importance of grammar for marking conceptualization strategies. Especially when the world gets complex, with many objects and robots having different perspectives on the scene, grammar is an important factor in communicative success.

¹ Obviously, there is more to the system than we have space to explain here. In particular, German requires the handling of complex syntactic phenomena such as morphology and complex word classes, and semantic ambiguity that fall beyond this paper (see [18, 16] for a detailed discussion).

4 Social Grounding of Spatial Conceptualization Strategies

Conceptualization strategies are not fixed or innate. The study of natural language reveals that there are in fact many different ways of construing spatial reality [8]. For instance, the Mayan language Tenejapan Tzeltal exclusively relies on an absolute strategy consisting of uphill-downhill distinctions for referring to objects [9]. English features another type of absolute strategy in the form of north-south, east-west distinctions. Recently the origins of conceptualization strategies have come under investigation using artificial systems (see [1] for preliminary investigations). Three important concepts guide our discussion.

Recruitment. Conceptualization strategies are grounded in general cognitive capabilities and operations [26]. For instance, the absolute strategy in English requires that agents are able to categorize objects using spatial categories that relate to particular geocentric features of the environment. In the English absolute system this is related to compass readings and map use [29]. The categorization of these objects themselves is a cognitive ability that needs to be present before a linguistic absolute spatial system can form.

Selection. Once a strategy has formed it is used to build a concrete system of spatial categories and linguistic means to express them. Systems and strategies are subject to selective pressures. Other strategies might compete in terms of success, expressivity and ecological significance. To organize competition and selection, the overall success of a strategy and the associated ontology and lexicons are tracked.

Alignment. Language is a phenomenon that occurs in the interactions of individuals of a group of language users. Language strategies or any linguistic material are invented in local interactions in which few members of a population participate. Different parts of the population might invent other strategies. This poses a problem. For language to be usable, it needs to be conventional, and it needs to be known to the complete population. Alignment is the process by which a strategy and the corresponding language systems spread in the population. Alignment operators organize the alignment of strategies by updating the same strategy scores that are used for orchestrating selection and competition.

5 Evolution of Grounded Lexicons

In order to build a shared grounded lexicon, agents need a conceptualization strategy. For instance, to build a projective communication system (similar to front, back, left, right in English) agents need a strategy that constructs angles from objects in the sensorimotor space and builds an ontology and lexicon of angle-based categories.

This section shows that, given a predefined projective conceptualization strategy (a chunk) and a set of invention, adoption and alignment operators, concrete systems of spatial relations can be coordinated by artificial populations. Projective categories are described by an angle a and a σ value. Together, they feed into a similarity function implemented in the cognitive operation identify-location-projective. The similarity function of an object o with respect to a category c is the following:

$$\mathrm{sim}_a(o, c) := e^{-\frac{1}{2\sigma_c}\,|a_o - a_c|}$$

with a_o being the angle to the object and a_c being the angle of the category. The similarity function is used in categorizing objects in the sensorimotor context.
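As a brief illustration, the similarity function can be written directly in Python. The category names, angles and σ values below are invented for the example, and the `categorize` helper (choosing the most similar category) is an assumed use of the function rather than the implementation of identify-location-projective. The sketch uses the plain absolute angle difference from the formula and does not handle wrap-around at ±π.

```python
import math


def similarity(angle_obj, angle_cat, sigma_cat):
    """sim_a(o, c) = exp(-|a_o - a_c| / (2 * sigma_c)), angles in radians."""
    return math.exp(-abs(angle_obj - angle_cat) / (2.0 * sigma_cat))


def categorize(angle_obj, categories):
    """Return the name of the category most similar to the object angle.

    `categories` maps a name to an (angle, sigma) pair; this helper is
    hypothetical and only illustrates how the similarity could be used.
    """
    return max(categories, key=lambda name: similarity(angle_obj, *categories[name]))


# Made-up projective categories: name -> (prototypical angle, sigma).
CATEGORIES = {
    "front": (0.0, 0.5),
    "left": (math.pi / 2, 0.5),
    "right": (-math.pi / 2, 0.5),
}

print(similarity(0.0, 0.0, 0.5))    # identical angles give similarity 1.0
print(categorize(1.2, CATEGORIES))  # an object at 1.2 rad is closest to "left"
```

A smaller σ makes the similarity fall off faster with angular distance, so categories with small σ are narrower.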

Table 1 Invention: Speaker cannot find a discriminating spatial category in production
Diagnostic: The speaker cannot conceptualize a meaning (step 2 of the spatial language game fails).
Repair: The speaker constructs a spatial relation R given a projective strategy and the topic pointed at. The new category is based on the angle observed for the topic object (the initial σ is small, 0.1). Additionally, the speaker invents a new construction (link) associating R with a new spatial term s.

Table 2 Adoption: Hearer encounters unknown spatial term s
Diagnostic: The hearer does not know a term (step 4 fails).
Repair: The hearer signals failure and the speaker points to the topic T. The hearer then constructs a spatial relation R based on the relevant strategy and the topic pointed at. Additionally, the hearer invents a new construction associating R with s.

Fig. 6 Results for a formation experiment in which agents develop a projective category system. [Plot: communicative success, # categories, # constructions and interpretation similarity against the number of interactions.]

Invention and learning operators. Initially, agents are given a projective conceptualization strategy that uses the angle of objects to build a category system. The concrete category system, however, is not given. Instead, agents are given invention and learning operators that in certain situations will add new projective categories and words to the inventory of an agent. Tables 1 and 2 detail the two operators given to agents.

Category alignment. After each interaction, agents update the angle of categories used in the interaction. Categories keep a sample set S of observed angles. S is updated with the new observation after an interaction involving the successful use of a category. Subsequently, angles are averaged over S. The new prototypical angle a_c of the category is computed using the following formula for averaging angles.

$$a_c = \operatorname{atan2}\left(\frac{1}{|S|}\sum_{s \in S} \sin a_s,\ \frac{1}{|S|}\sum_{s \in S} \cos a_s\right) \tag{3}$$

The new σ value σ'_c of the category is adapted using the following formula.

$$\sigma'_c = \sigma_c + \alpha_\sigma \cdot \left(\sigma_c - \sqrt{\frac{1}{|S| - 1}\sum_{s \in S} (a_c - a_s)^2}\right) \tag{4}$$

This formula describes how much the new σ of the category c is pushed in the direction of the angle standard deviation of the sample set by a factor² of ασ ∈ [0, ∞).

² ασ is given by the experimenter, and in all experiments described here ασ = 0.5.

Lexicon alignment. The invention and adoption operators introduce a particular problem: the problem of synonymy. Synonymy occurs when an agent explicitly represents that a spatial category can be named using different spatial terms. Allowing agents to track synonymy in their lexicons can be beneficial for overall lexicon size, but only if agents also have additional mechanisms for managing synonymy. Such a mechanism, called lateral inhibition, was introduced in [23]:

– If the interaction was a success, both speaker and hearer reward the winning construction (the one used in production and interpretation) by a score of δsuccess. Competing constructions are punished by δinhibit. There are two types of competing constructions. First, there are those constructions which associate the same spatial relation with a different word. Second, there are constructions that link the same word to different spatial relations.
– After a failed game, both speaker and hearer decrease the score of the used association by δfail.

Measures. To be sure that our approach to category formation works reliably, we test it by running multiple trials of the same experiment. In each trial agents start with an empty ontology and lexicon. Success, performance and language development of the population are tracked using the following measures.
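The two alignment steps described above, averaging the observed angles of a category (Eq. 3) and lateral inhibition on constructions, can be sketched as follows. This is an illustrative sketch only: the function names, the representation of a construction as a (word, relation) pair, and the values chosen for δsuccess, δinhibit and δfail are assumptions, since the text does not specify them.

```python
import math


def circular_mean(angles):
    """Prototypical angle of a category from its sample set S (Eq. 3)."""
    n = len(angles)
    mean_sin = sum(math.sin(a) for a in angles) / n
    mean_cos = sum(math.cos(a) for a in angles) / n
    return math.atan2(mean_sin, mean_cos)


def lateral_inhibition(scores, used, success,
                       d_success=0.1, d_inhibit=0.1, d_fail=0.1):
    """Return updated construction scores after one game.

    `scores` maps (word, relation) pairs to scores; the delta values are
    placeholders chosen for this example.
    """
    updated = dict(scores)
    if not success:
        updated[used] -= d_fail              # punish the used association
        return updated
    word, relation = used
    for (w, r) in scores:
        if (w, r) == used:
            updated[(w, r)] += d_success     # reward the winning construction
        elif w == word or r == relation:
            updated[(w, r)] -= d_inhibit     # punish both kinds of competitors
    return updated


# Averaging angles through sine and cosine handles samples around +/- pi,
# where an arithmetic mean of 3.0 and -3.0 would wrongly give 0.
print(circular_mean([0.1, 0.3]))
print(abs(circular_mean([3.0, -3.0])))

scores = {
    ("links", "left"): 0.5,
    ("lonks", "left"): 0.5,   # same relation, different word
    ("links", "front"): 0.5,  # same word, different relation
    ("vorne", "front"): 0.5,  # unrelated construction
}
print(lateral_inhibition(scores, ("links", "left"), success=True))
```

After the successful game in the example, only the unrelated construction keeps its score; both synonyms and homonyms of the winning construction lose ground, which is what drives the population toward one word per relation.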


Communicative Success is the most important measure as it reflects the overall performance of the population. Every interaction is either a success or a failure. Success is counted as 1.0 and failure as 0.0.

Number of Categories and Constructions measures the average number of categories and constructions known to each agent.

Interpretation Similarity tracks how similar the interpretation of each word is across the agents that know it. For this, the categories attached to the word in each agent are compared. Since projective categories are described by a direction and a similarity function width parameter σ, two categories are most similar (1.0) when both angle and σ are equal.

Results. Figure 6 shows the dynamics of experiments in which 10 agents start without any categories and constructions and gradually have to solve their communicative problems by invention and adoption of linguistic and semantic material (25 trials). In each trial 10000 spatial language games are played, with two agents randomly drawn from the population interacting and inventing, adopting and aligning linguistic knowledge. The graph shows that agents are able to form successful language systems that gradually become more and more similar in the population as the linguistic knowledge spreads from agent to agent. After 10000 interactions agents are communicating successfully in over 95% of the interactions. In all trials, the population agrees on using a total of three spatial relations and corresponding names.

Due to space constraints this section only examined a projective strategy. Similar propositions hold for other spatial category strategies such as absolute (similar to north, east, west and south) and proximal strategies (near, far) [17].

6 Selection and Alignment of Spatial Conceptualization Strategies

In this section we investigate how conceptualization strategies themselves can be negotiated through local interactions by agents in a community. The idea is that a particular strategy survives when it is relevant to an agent because it is efficient and useful in discriminating objects and it contributes to the agent's communicative success.

Selection and Alignment. Selection of a strategy is intricately linked to the success of the spatial category system it builds. For instance, if an agent is building a language system with an absolute strategy, this entails that the absolute relations and the strategy itself are subject to similar selective pressure. It is the success of the overall system, i.e. the spatial relations together with the performance of the strategy, that drives the organization of the syntactic and semantic repository of the agent.

The previous section talked about the invention and alignment of words and categories for a projective strategy. The same operators are used for building different language systems. Additionally, the success of a strategy (chunk) is tracked after every interaction by updating its score. If the conceptualization strategy was used successfully its score is increased by δsuccess, otherwise it is punished by δfailure. All other conceptualization strategies not used are punished by δcompetitor. The values of these deltas are typically an order of magnitude lower than the deltas for updating categories and words.

Measures. We test our approach by running experiments in which agents are given different conceptualization strategies. To monitor the alignment of conceptualization strategies we use additional measures.

Number of chunks averages the number of conceptualization strategies over all agents.

Strategy similarity signifies how similar the strategies of agents in the population are. A score of 1.0 means that all agents have the same strategies and all agents give the same score to each strategy.

Experimental Setup and Results. We test the power of strategy alignment using contexts which can be manipulated to feature absolute and intrinsic properties. More specifically, we manipulate the distribution of intrinsic and absolute properties in the environment.

Figure 7 shows the dynamics of an experiment where agents start equipped with two strategies: an absolute and an intrinsic one. The environment is such that it favors absolute systems. In 50% of the scenes both intrinsic and absolute features are present. In the remaining 50% of the contexts only absolute features are present and no intrinsic ones.

The environmental conditions have a strong effect on the development of the system. All 25 populations agree on using an absolute strategy. What is important is that the contexts where only absolute features are present reward the absolute strategy and punish the intrinsic conceptualization strategy. Consequently, even in a context where intrinsic and absolute features are present, the absolute strategy is preferred. The development of such a preference has important effects on the invention of categories. Because of the preference for the absolute strategy, invention of categories shifts to producing only absolute categories. The successful use of these categories reinforces the absolute strategy and leads to further punishment of the intrinsic strategy. The effect is that only the absolute strategy survives. Additionally, the graph shows that agents align their conceptualization strategy roughly together with the category system.

Fig. 7 Dynamics of a category formation experiment in which 10 agents align conceptualization strategies. [Two plots against the number of interactions: communicative success, # categories, # constructions and interpretation similarity; strategy similarity and # chunks.]
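The strategy score update and its interaction with the environment can be illustrated with a deliberately simplified, deterministic Python sketch. Everything beyond the three deltas is an assumption made for the example: the additive update, the clamping of scores to [0, 1], the delta values, the strict alternation of scene types, the rule that the agent always uses its highest-scored usable strategy, and the premise that every game succeeds.

```python
def update_strategy_scores(scores, used, success,
                           d_success=0.01, d_failure=0.01, d_competitor=0.01):
    """Return new chunk scores after one interaction (additive update assumed)."""
    updated = {}
    for name, score in scores.items():
        if name == used:
            score += d_success if success else -d_failure
        else:
            score -= d_competitor          # unused strategies are punished
        updated[name] = min(1.0, max(0.0, score))  # clamping is an assumption
    return updated


def simulate(rounds=20):
    """Alternate absolute-only scenes with scenes offering both feature types."""
    scores = {"absolute": 0.5, "intrinsic": 0.5}
    for i in range(rounds):
        absolute_only = (i % 2 == 0)
        usable = ["absolute"] if absolute_only else list(scores)
        used = max(usable, key=scores.get)   # prefer the highest-scored strategy
        scores = update_strategy_scores(scores, used, success=True)
    return scores


print(simulate())
```

In this toy run the very first absolute-only scene tips the balance, after which the absolute strategy is also chosen in scenes where both feature types are available, so the intrinsic strategy is only ever punished. This reproduces, in miniature, the preference dynamics described for Figure 7.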

7 Recruitment of Conceptualization Strategies

Conceptualization strategies are represented as IRL-programs made up of cognitive operations that encode a particular way of construing reality. Consequently, strategies originate in a process of recruitment which assembles cognitive operations into chunks (partial IRL-programs). Once a chunk is invented, it immediately extends the conceptualization capabilities of the inventing agent.

Strategy invention is deeply integrated into the processing of agents. If an agent is unable to conceptualize (steps 2 and 4 of the interaction script), or unable to conceptualize with sufficient confidence, he starts the search for new conceptualization strategies. The first reason for integrating strategy invention with other invention mechanisms such as category invention is that an agent inventing a new strategy also immediately has to invent new categories for it. In lexical languages the strategy itself is not verbalized; only the name of the spatial relation is. This sort of dual invention is especially important at the beginning of experiments, when agents have neither strategies nor categories.

There is a second reason for the deep integration of strategy invention. An agent that has already developed a strategy might also solve a particular communicative problem by inventing new categories for established strategies. Decisions whether to use a new category with an existing strategy, a new strategy with an existing category, or even a newly invented strategy with a newly invented category are made based on the discriminative power of each of these possibilities in the particular context. For instance, if an existing strategy has a low score, the probability of inventing a new strategy increases, whereas if the current topic can be sufficiently discriminated using an existing strategy, no invention occurs. Tables 3 and 4 detail the learning operators including recruitment. Moreover, agents are equipped with the selection and alignment operators for chunks, spatial relations and words discussed earlier.

Table 3 Invention: Speaker cannot find a meaning for referring to the topic
Diagnostic: When the speaker cannot conceptualize a meaning (step 2 of the spatial language game fails).
Repair: The speaker invents new conceptualization strategies by assembling cognitive operations such as identify-proximal and geometric-transform into chunks, which is immediately followed by the invention of categories for each new chunk (see section on co-evolution of categories and terms). At this point the speaker might have a number of new solutions to his conceptualization problem, consisting of new strategies and new corresponding spatial relations. Subsequently, the speaker selects the strategy and category which are most discriminating. Once selected, he invents a new word and construction for expressing the new strategy.
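The invention repair of Table 3 can be sketched roughly as follows. This is a minimal illustration and not the IRL implementation: chunks are reduced to two hypothetical feature functions (`proximal`, `angular`), a category is reduced to a single prototype value, the landmark is assumed to sit at the origin, and the discrimination measure is an assumed margin in feature space. Distances are assumed to be given in units comparable to angles divided by pi.

```python
import math


def proximal(position):
    """Hypothetical chunk: distance of an object from the landmark."""
    return math.hypot(position[0], position[1])


def angular(position):
    """Hypothetical chunk: direction of an object, scaled to [-1, 1].

    Angle wrap-around is ignored for brevity.
    """
    return math.atan2(position[1], position[0]) / math.pi


CHUNKS = {"proximal": proximal, "angular": angular}


def invent_category(chunk, topic):
    """Invent a category whose prototype is the topic's feature value."""
    return CHUNKS[chunk](topic)


def discrimination(chunk, prototype, topic, others):
    """Margin by which the topic is closer to the prototype than any other object."""
    feature = CHUNKS[chunk]
    topic_distance = abs(feature(topic) - prototype)
    closest_other = min(abs(feature(obj) - prototype) for obj in others)
    return closest_other - topic_distance


def repair_invention(topic, others):
    """Invent one category per chunk and return the most discriminating pair."""
    candidates = []
    for chunk in CHUNKS:
        prototype = invent_category(chunk, topic)
        score = discrimination(chunk, prototype, topic, others)
        candidates.append((score, chunk, prototype))
    score, chunk, prototype = max(candidates)
    return chunk, prototype, score


# Objects in one direction at different distances: the distance chunk wins.
print(repair_invention((1.0, 0.0), [(2.0, 0.0), (3.0, 0.0)]))
# Objects at equal distance in different directions: the angle chunk wins.
print(repair_invention((0.0, 1.0), [(1.0, 0.0), (-1.0, 0.0)]))
```

The sketch omits the final step of the repair, the invention of a new word and construction for the selected strategy.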

Table 4 Adoption: Hearer encounters unknown spatial term s
Diagnostic: When the hearer does not know a term (step 4 fails).
Repair: The hearer signals failure and the speaker points to the topic T. The hearer then constructs new strategies, i.e. chunks, and for each of them he invents a new spatial relation Ri based on the topic pointed at. The hearer then decides which of the strategies is most discriminating and stores that one. Additionally, the hearer invents a new word for Ri.

Results

Figure 8 shows the dynamics of invention and alignment of conceptualization strategies in a population of 10 agents (25 trials). Agents have a repository of 10 basic cognitive operations from which they can draw new building blocks whenever there are problems in communication. They can choose different landmarks (the robot or the box) and different category systems such as absolute, projective and proximal. The agents manage to agree on one particular strategy while at the same time developing a category system and a lexicon from scratch. Interestingly, the process does not show the same overall success as the previously discussed experiments. The reason is that conceptual alignment is a difficult process, complicated by the number of choices in strategies, the population size, and the variety of contexts and discriminative situations, which might all favor different strategies. In some contexts proximal is the best strategy; others allow absolute and/or projective categories to be invented. Nevertheless, agents do come to an agreement: on average, they agree on a single conceptualization strategy.

Fig. 8 Results for strategy invention, alignment and category development. A population of 10 agents develops both conceptualization strategies as well as lexical systems for spatial strategies corresponding to these strategies. (Plots not reproduced; one panel shows comm success, # categories, # constructions and interpretation similarity, the other strategy similarity and # chunks, both against number of interactions.)

For space reasons, we can only discuss one particular experiment, in which all trials share the same environmental conditions. Of course, once the system is set up, one can study the effect of varying conditions. The systems discussed here are adaptive and open-ended. They can adjust to different environmental conditions: agents react flexibly to different object distributions that favor distance-based or angle-based strategies.
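How an object distribution can favor a distance-based or an angle-based strategy may be illustrated with a toy computation. The measure used here (the smallest pairwise gap between feature values), the function names and the arena size `max_range` are assumptions for illustration and are not part of the experiments reported above; the landmark is assumed to sit at the origin.

```python
import math
from itertools import combinations


def min_pairwise_gap(values):
    """Smallest difference between any two feature values in a context."""
    return min(abs(a - b) for a, b in combinations(values, 2))


def favored_strategy(objects, max_range=2.0):
    """Name the feature family that separates the objects best.

    Distances are divided by the assumed arena size `max_range` and angles
    by pi, so that both gaps are on a comparable scale.
    """
    distances = [math.hypot(x, y) / max_range for x, y in objects]
    angles = [math.atan2(y, x) / math.pi for x, y in objects]
    gaps = {
        "distance-based": min_pairwise_gap(distances),
        "angle-based": min_pairwise_gap(angles),
    }
    return max(gaps, key=gaps.get)


# Objects lined up in one direction from the landmark.
in_a_row = [(0.5, 0.0), (1.0, 0.0), (1.5, 0.0)]
# Objects on a circle around the landmark, all at distance 1.
on_a_circle = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]

print(favored_strategy(in_a_row))
print(favored_strategy(on_a_circle))
```

In the first context all objects share a direction, so only distance separates them; in the second all objects share a distance, so only direction does.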

8 Discussion

This paper has proposed the IRL formalism for representing the semantics of phrases and the underlying conceptualization strategies. The formalism includes mechanisms for autonomous construction of semantic structure in production and interpretation of phrases and can be used to model the processing of complex natural language utterances, as shown for German locative phrases. Moreover, this paper has argued for selection, recruitment and alignment as the basic mechanisms explaining the evolution of conceptualization strategies together with corresponding lexical systems. All ideas were formalized and validated in robotic experiments.

The basic claim validated is that we can understand the evolution of strategies as a process of cultural negotiation fueled by the cognitive capabilities of agents, i.e. the cognitive operations available. The process is constrained by environmental factors such as the availability of geocentric landmarks. While cognition and ecology influence the selection process, the negotiation takes place within a single static population via linguistic interactions. This is one of the main differences between this model and other models of cultural evolution, which claim that intergenerational turnover is the main cause of language change [7, 14].

In this paper we have focussed on the origins of conceptualization strategies using lexical languages. Certainly, spatial language also shows variation in the kinds of syntactic material that is employed to convey distinct spatial semantics. A discussion can be found in [10] and [29]. Models of grammar evolution using the systems presented in this paper are detailed in [22]. Moreover, spatial language can feature other conceptualization strategies not discussed in this paper, such as toponyms, directional categories or body-centered spatial relations. Given a suitable implementation of cognitive operations, we claim that the same approach can be used to study the evolution of such strategies.

Acknowledgements I am greatly indebted to Masahiro Fujita and Hideki Shimomura for supporting this research. I thank Simon Pauw, Martin Loetzsch, Wouter van den Broeck, Joris Bleys and Luc Steels who have made contributions to aspects of IRL.

Biography

Michael Spranger received his Diploma from the Humboldt-Universität zu Berlin (Germany) in 2008 and a PhD from the Vrije Universiteit in Brussels (Belgium) in 2011 (both in Computer Science). For his PhD he was a researcher at Sony CSL. He then worked in the R&D department of Sony Corporation in Tokyo (Japan) for almost 2 years. He currently holds positions in Sony CSL and Sony Corporation. He is a roboticist by training with extensive experience in research on and construction of autonomous systems including research on robot perception, world modeling and behavior control. After his diploma he fell in love with the study of language and has since worked on different language domains from action language and posture verbs to time, tense, determination and spatial language. His work focusses on artificial language evolution, computational cognitive semantics and robotics.

References

1. Bleys, J.: Language strategies for the domain of colour. Ph.D. thesis, Vrije Universiteit Brussels (VUB), Brussels, Belgium (2010)
2. Bleys, J., Loetzsch, M., Spranger, M., Steels, L.: The Grounded Color Naming Game. In: Proceedings of the 18th IEEE International Symposium on Robot and Human Interactive Communication (Ro-man 2009) (2009)
3. Cangelosi, A.: Grounding language in action and perception: From cognitive agents to humanoid robots. Physics of Life Reviews (2010)
4. De Beule, J., Bergen, B.: On the emergence of compositionality. In: A. Cangelosi, A.D.M. Smith, K. Smith (eds.) The evolution of language (Evolang 6), pp. 35–42. World Scientific, Singapore (2006)
5. Fujita, M., Kuroki, Y., Ishida, T., Doi, T.: Autonomous behavior control architecture of entertainment humanoid robot SDR-4X. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 960–967 (2003)
6. Johnson-Laird, P.N.: Procedural semantics. Cognition 5(3), 189–214 (1977)
7. Kirby, S.: The evolution of language. Artificial Life 8, 185–215 (2002)
8. Levinson, S.C.: Language and space. Annual Review of Anthropology 25(1), 353–382 (1996)
9. Levinson, S.C.: Space in Language and Cognition: Explorations in Cognitive Diversity. No. 5 in Language, Culture and Cognition. Cambridge University Press (2003)
10. Levinson, S.C., Wilkins, D.: Grammars of Space. Cambridge University Press (2006)
11. Mainwaring, S., Tversky, B., Ohgishi, M., Schiano, D.: Descriptions of simple spatial scenes in English and Japanese. Spatial Cognition and Computation 3(1), 3–42 (2003)
12. Marocco, D., Cangelosi, A., Nolfi, S.: The emergence of communication in evolutionary robots. Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences 361(1811), 2397 (2003)
13. Smith, K., Brighton, H., Kirby, S.: Complex systems in language evolution: the cultural emergence of compositional structure. Advances in Complex Systems 6(4), 537–558 (2003)
14. Smith, K., Kirby, S., Brighton, H.: Iterated learning: A framework for the emergence of language. Artificial Life 9(4), 371–386 (2003)
15. Spranger, M.: World models for grounded language games. German diplom thesis, Humboldt-Universität zu Berlin (2008)
16. Spranger, M.: The evolution of grounded spatial language. Ph.D. thesis, Vrije Universiteit Brussels (VUB), Brussels, Belgium (2011)
17. Spranger, M.: The co-evolution of basic spatial terms and categories. In: L. Steels (ed.) Experiments in Cultural Language Evolution, pp. 111–141. John Benjamins (2012)
18. Spranger, M., Loetzsch, M.: Syntactic Indeterminacy and Semantic Ambiguity: A Case Study for German Spatial Phrases. In: L. Steels (ed.) Design Patterns in Fluid Construction Grammar, Constructional Approaches to Language, vol. 11, pp. 265–298. John Benjamins (2011)
19. Spranger, M., Loetzsch, M., Pauw, S.: Open-ended Grounded Semantics. In: H. Coelho, R. Studer, M. Wooldridge (eds.) Proceedings of the 19th European Conference on Artificial Intelligence (ECAI 2010), vol. Frontiers in Artificial Intelligence and Applications, pp. 929–934. IOS Press (2010)
20. Spranger, M., Loetzsch, M., Steels, L.: A Perceptual System for Language Game Experiments. In: L. Steels, M. Hild (eds.) Language Grounding in Robots, pp. 89–110. Springer (2012)
21. Spranger, M., Pauw, S., Loetzsch, M., Steels, L.: Open-ended Procedural Semantics. In: L. Steels, M. Hild (eds.) Language Grounding in Robots, pp. 153–172. Springer (2012)
22. Spranger, M., Steels, L.: Emergent Functional Grammar for Space. In: L. Steels (ed.) Experiments in Cultural Language Evolution, pp. 207–232. John Benjamins (2012)
23. Steels, L.: A self-organizing spatial vocabulary. Artificial Life 2(3), 319–332 (1995)
24. Steels, L.: The emergence of grammar in communicating autonomous robotic agents. In: W. Horn (ed.) ECAI 2000: Proceedings of the 14th European Conference on Artificial Intelligence, pp. 764–769. IOS Publishing (2000)
25. Steels, L.: Grounding symbols through evolutionary language games. In: Simulating the evolution of language, pp. 211–226. Springer-Verlag New York, Inc. (2002)
26. Steels, L.: The Recruitment Theory of Language Origins. In: C. Lyon, C. Nehaniv, A. Cangelosi (eds.) The Emergence of Communication and Language, pp. 129–151. Springer (2007)
27. Steels, L., Hild, M. (eds.): Language Grounding in Robots. Springer (2012)
28. Steels, L., Spranger, M.: Emergent mirror systems for body language. In: L. Steels (ed.) Experiments in Cultural Language Evolution, pp. 87–109. John Benjamins (2012)
29. Tenbrink, T.: Space, time, and the use of language: An investigation of relationships, Cognitive Linguistics Research, vol. 36. Walter de Gruyter, Berlin, DE (2007)
30. Tenbrink, T.: Reference frames of space and time in language. Journal of Pragmatics 43(3), 704–722 (2011)
31. Vogt, P.: Overextensions and the emergence of compositionality. In: A. Cangelosi, A.D.M. Smith, K. Smith (eds.) The Evolution of Language (Evolang 6), pp. 364–371. World Scientific (2006)
32. Vogt, P., Divina, F.: Social symbol grounding and language evolution. Interaction Studies 8(1), 31–52 (2007)
