Dottorato di Ricerca in Informatica, I ciclo Nuova Serie, Università di Salerno

A Grammar-based Approach to Specify and Implement Visual Languages

Vincenzo Deufemia

November 2002

Chairman: Prof. A. De Santis

Supervisor: Prof. G. Costagliola

Abstract

This thesis presents a methodology for modeling and implementing visual languages. The approach relies on the syntactic framework of eXtended Positional Grammars (XPG, for short), a formalism that models the basic elements (visual symbols) of a visual notation, their syntactic properties, the relations between them, and a set of syntactic rules that formally define the feasible visual sentences. We present a powerful LR-based (XpLR) methodology for parsing visual languages described by XPGs. As a result, a broad class of visual languages can be described and compiled while keeping most of the efficiency of LR parsing. We describe this new algorithm, named the XpLR(0) parser, and provide heuristics able to solve a number of conflicts that usually arose in previous applications of the LR methodology to visual languages. The expressive power of the formalism is highlighted by modeling UML state diagram languages, which represent one of the most complex visual modeling languages used in the software engineering field. An interesting feature of the XpLR methodology is the possibility of using standard compiler generation tools for the construction of compilers for visual languages. Indeed, we define mapping rules and conflict handling techniques to convert a generic XPG into an equivalent translation scheme. This conversion process allows a rapid implementation of compilers for XPGs through standard and well-known tools, such as YACC. Using the Visual Language Compiler-Compiler (VLCC) system extended with the proposed XpLR methodology, it is possible to automatically implement visual languages once their formal XPG specification is given. VLCC generates both the editor and the compiler for the specified visual language. This makes our methodology a sound basis for the definition of a new meta-CASE technology, since VLCC can be used for defining and automatically generating CASE tools. In fact, we have used it to model the diagrammatic notations of the Unified Modeling Language (UML), and to generate a set of flexible CASE tools supporting them.

One of the most interesting applications of VLCC is in the construction of meta-CASE analysis and design workbenches. Indeed, such workbenches are usually visually oriented, since they support the editing and manipulation of diagrammatic notations that allow engineers to prototype models of the system. Until recently, the main difficulty with their automatic generation derived from the lack of formal syntax and semantics specifications for the diagrammatic notations used as part of analysis and design methods. The formal specification methods proposed in the visual language research area can be profitably used to this aim. In this thesis, we show how the VLCC system can be used for the construction of meta-CASE workbenches. The meta-CASE generates a workbench by integrating a set of visual modeling environments in agreement with a required method, which includes a process model and suitable rules and guidelines, and is specified in terms of a suitable activity diagram.

Acknowledgements

I’d like to thank my advisor Gennaro Costagliola for the profitable discussions that contributed to my research work and for the suggestions he gave me during the preparation of this thesis. I would like to express my gratitude to Filomena Ferrucci for her careful support. She has been of great help throughout the doctoral program. I am very glad to have had the opportunity to work with her. I thank Carmine Gravino and Giuseppe Polese, who have been my co-authors for several recent research papers related to the results presented in this thesis. I want to thank Riccardo Distasi for his tireless LaTeX assistance and for his moral support. I also want to thank all my friends and my colleagues. A final word of thanks is due to my family for their constant and invaluable support.

Contents

Title Page
Abstract
Acknowledgements
Contents

1 Introduction
  1.1 Visual Languages
  1.2 A Framework for Describing Visual Languages
    1.2.1 Visual symbols
    1.2.2 Visual sentences
    1.2.3 Representation of visual sentences
  1.3 Thesis Outline

2 Extended Positional Grammars
  2.1 Modeling UML Statechart Diagrams through XPGs

3 The XpLR Methodology
  3.1 The XpLR Parser
    3.1.1 The input
    3.1.2 The stack
    3.1.3 The XpLR parsing table
    3.1.4 The XpLR parser
    3.1.5 Parsing time complexity
  3.2 Constructing XpLR(0) Parsing Tables
  3.3 XpLR parsing table conflicts
    3.3.1 Handling parsing table conflicts
    3.3.2 Building parsing tables with ordered substates
  3.4 Applicability of XpLR parsing

4 Building LR(0) Parsers for XPG Grammars
  4.1 Converting an XPG into a translation scheme based on string grammars
  4.2 Comparing the recognized languages
  4.3 Resolving conflicts in non-LR(0) translation schemes

5 An XPG-based Visual Environments Generator
  5.1 The Symbol Editor
  5.2 The VLCC Textual grammar editor
  5.3 The generated Visual Programming Environment

6 Constructing Meta-CASE Workbenches
  6.1 The proposed approach for the construction of meta-CASE workbenches
  6.2 The MEG Module
    6.2.1 The architectural design of MEG and the underlying methodology
    6.2.2 Using the VLCC system as a support to MEG construction
  6.3 The Workbench Generator

7 Related Work
  7.1 Picture Layout Grammars
  7.2 Constraint Multiset Grammars
  7.3 Relational Grammars
  7.4 Symbol-Relation Grammars
  7.5 Graph Grammars
    7.5.1 Layered Graph Grammars
    7.5.2 Hypergraph Grammars
  7.6 Visual Conditional Attributed Rewriting Systems

8 Conclusions and Further Research

References

Chapter 1

Introduction

Visual languages are widely used in several application fields: teaching, development of GUIs, and the software development process. Indeed, graphical notations allow for the description and understanding of complex systems, such as concurrent and/or real-time systems, for which traditional textual descriptions are inadequate. This thesis proposes practical grammar-based means for specifying and implementing visual languages. In this chapter, we first introduce the concept of visual language. Then, we provide a framework for describing visual notations. We formalize the concepts of attribute-based, relation-based and linear representation of a visual sentence. Finally, we describe the structure of the rest of this thesis.

1.1 Visual Languages

In the last decades a wide variety of activities have made use of icons and diagrams to allow for multimodal communication and interaction between humans and computers. The ability to use graphics as a means of communication in any activity that involves human-computer interaction is named visual programming. Typical activities that benefit from the use of visual languages are the interpretation of low-level media (e.g., handwriting recognition and image processing) and graphical user interfaces (e.g., interpretation of user input, design support, visual and multimedia databases) [45]. Thus, a large number of visual programming languages have been introduced. Such languages allow a user to communicate with the system by spatially arranging visual objects on the screen, so as to compose a “visual sentence” [31].

The use of visual programming is also rapidly growing in the industrial field. Indeed, productivity improvements derive from a more effective communication between developer and customer, and from the availability of visual environments that facilitate the combination of development and execution while promoting iterative design and interactive prototyping (see, e.g., [5, 31, 40]). Moreover, visual languages are widely employed to support many activities of the software development process, such as specification, analysis and design (e.g., Petri nets, FSAs, Statecharts, and Dataflow diagrams). For instance, UML [57, 50] diagrams are a common visual form of expressing and communicating design information; they are used for modeling, testing, specifying, and programming software systems. Visual notations are able to provide abstract models and different views of software systems, which allow software designers to devise solutions and design systems. Much effort is currently being devoted to developing formal techniques for specifying, designing and implementing visual (programming or modeling) languages [8, 31, 44, 47, 53]. Currently, there are three main approaches to visual language specification: the grammatical approach, the logical approach, and the algebraic approach. The grammatical approach is based on grammatical formalisms which extend the traditional rewriting mechanism used in string language specification by describing geometric relationships between the objects to be rewritten. Close to grammar-based approaches are formal specification methods based on rewriting systems [49, 8]. The logical approach uses first-order mathematical logic or other forms of logic from artificial intelligence. Logical techniques are usually based on spatial logics, which axiomatize the possible relationships between objects. Finally, the algebraic approach uses algebraic specifications consisting of composition functions that construct complex pictures from simpler picture elements. See [45] for an extensive survey.
In this thesis we will mainly rely on the grammatical approach. The literature offers several grammatical formalisms for the specification of visual languages, which differ from one another in several respects [31, 44, 47, 53]. In general, such formalisms extend traditional string grammars in that they rewrite sets or multisets of symbols rather than sequences, and specify several relationships between objects rather than mere concatenation. As a consequence, the analysis of visual languages is much harder, due to the high cost of parsing. Indeed, while strings naturally drive a sequential scan of the symbols, no precise scanning order is implicit in multi-dimensional structures. However, the effective use of grammars for specifying visual language syntax requires efficient parsing techniques. The parsing efficiency issue has been widely investigated in the literature. Several methods have been proposed that impose some restrictions on the form of the grammars, balancing the ability to express visual languages against the efficiency of the parsing techniques [44].

Several grammar-based visual environment generators have been proposed [35, 11, 15, 22, 30, 62, 58, 56, 70]. They are able to generate powerful visual environments within which visual languages are embedded, and in which the editor and the compiler components are tightly integrated. This characteristic derives from two specific issues. First, for any visual language a specialized editor is needed to assist users in the specification of visual models, by providing them with a set of visual symbols and relationships. Indeed, the use of general-purpose drawing editors would force users into the unacceptable task of drawing the possibly complex shapes representing the symbols and relationships of the language, significantly limiting the benefits of visual languages. The second issue is related to the lack of a standard representation for visual input expressions, analogous to the ASCII format for textual languages. As a result, any parser is based on the specific input representation that is used by the associated editor [31].

Visual environment generators can also be used for the construction of meta-CASEs, to generate workbenches supporting other phases of the software development process. In particular, they are suited to the generation of visually oriented workbenches (e.g., analysis and design workbenches), since they support the editing and manipulation of diagrammatic notations. Such notations are key elements in the software engineering field, since they effectively enhance human-to-human communication, which is essential for cooperative work.

1.2 A Framework for Describing Visual Languages

Recently, several techniques for describing the syntax of visual languages have been proposed, and they have been thoroughly analyzed and compared [45]. It turns out that two main methods can be used to represent visual language sentences: relation-based and attribute-based. The former describes a sentence as a set of graphical objects and a set of relations on them, whereas the latter conceives a sentence as a set of attributed graphical objects. In this section we provide a framework to formally describe visual languages [13, 21]. A visual language is formed by a set of visual symbols from an alphabet and a set of feasible visual sentences over these symbols. A feasible visual sentence of a language L is a spatial arrangement of visual symbols according to the syntax of L. The framework presented below can be used to describe many different classes of visual languages [13]. In this thesis we will mainly focus on its use to describe visual languages of software development methodologies.

1.2.1 Visual symbols

A visual symbol vs (vsymbol for short) is defined as a triple (M, S, L), where M is the physical component needed to materialize vs to our senses, S is the syntactic component used to relate vs to other vsymbols, and L is the semantic interpretation of the vsymbol, used to derive the meaning of the sentences in which it occurs. The syntactic component depends on how the visual symbol is used in a sentence. The three components are not necessarily disjoint. In particular, M is a set of attributes that specify the physical appearance of the vsymbol, including size, color, shape, etc.; S is a set of attributes, named syntactic attributes, whose values depend on the “position” of the vsymbol in the sentence; L is a conceptual structure defining the semantics of the vsymbol. When we say that the three components are not necessarily disjoint we mean that certain attributes may be part of more than one component of a vsymbol. In general, most syntactic attributes are also part of M. As an example, let us consider the decision vsymbol (a rhombus) of a flowchart. In this case, M describes its graphical aspect, i.e., a rhombus and three little circles (attaching points); S keeps track of the links connected to each attaching point. According to the syntax of this language, in order not to cause syntactic errors, 1) the vsymbol must be connected to other vsymbols through links leaving the attaching points, and 2) no two attaching points in it can be connected. The semantic component L qualifies the vsymbol as the condition block of a conditional or loop statement in a flowchart.
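To make the triple (M, S, L) concrete, the following is a minimal Python sketch of the flowchart decision vsymbol described above. The class name `VSymbol`, its field names, the link labels, and the helper `decision_is_well_formed` are illustrative assumptions and are not part of the framework; the semantic component is simplified to a plain string.

```python
from dataclasses import dataclass


@dataclass
class VSymbol:
    """A vsymbol as a triple (M, S, L); all field names are hypothetical."""
    name: str
    physical: dict    # M: appearance attributes (shape, size, color, ...)
    syntactic: dict   # S: attaching point number -> label of the plugged link, or None
    semantics: str    # L: conceptual meaning, simplified here to a description


def decision_is_well_formed(vs: VSymbol) -> bool:
    """Check the two syntactic rules stated for the decision vsymbol."""
    links = list(vs.syntactic.values())
    # Rule 1: every attaching point must carry a link to another vsymbol.
    all_connected = all(link is not None for link in links)
    # Rule 2: no link may join two attaching points of the same vsymbol.
    plugged = [link for link in links if link is not None]
    no_self_connection = len(set(plugged)) == len(plugged)
    return all_connected and no_self_connection


decision = VSymbol(
    name="decision",
    physical={"shape": "rhombus", "attaching_points": 3},
    syntactic={1: "in", 2: "yes", 3: "no"},
    semantics="condition block of a conditional or loop statement",
)
dangling = VSymbol("decision", {"shape": "rhombus"}, {1: "in", 2: None, 3: "no"}, "")
looped = VSymbol("decision", {"shape": "rhombus"}, {1: "in", 2: "x", 3: "x"}, "")

print(decision_is_well_formed(decision))
```

Note that the number of attaching points appears both in `physical` and, implicitly, in `syntactic`, which mirrors the remark that the components need not be disjoint.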

1.2.2 Visual sentences

A visual alphabet S is a set of vsymbols. A visual sentence (vsentence for short) on S is a set of vsymbols $\{x_1, x_2, \ldots, x_n\}$ with their physical and syntactic components completely instantiated. Examples of vsentences are given in Fig. 1.1(i) and (ii). In Fig. 1.1(i) the vsymbols are the blocks of the “flowchart”. The syntactic attributes of each block correspond to its attaching points and are used to keep track of the connections among the blocks. In the activity diagram of Fig. 1.1(ii) some of the vsymbols are characterized by syntactic attributes corresponding to attaching regions visualized, in this case, by thicker lines. It is easy to note that attaching points can be seen as a special case of attaching regions. Thus, definitions and discussions on attaching regions apply also to attaching points.

Figure 1.1: A flowchart (i) and an activity diagram (ii).

In general, the different types of syntactic attributes can be used to identify several classes of relations, which yield corresponding ways of modeling visual languages. In this thesis we will consider two classes of relations, namely the class of connection relations and the class of geometric relations [13]:

1. A connection relation is specified on a sequence of a finite number of attaching regions. Visual sentences can be built by connecting attaching regions of vsymbols through links or arrows. One or more links can be attached to an attaching region. As an example, the vsymbol On in Fig. 1.2, semantically representing a superstate in a statechart [34], has one attaching region represented by its border line, and the two arrows on and off are attached to it.

2. A geometric relation between vsymbols is specified on the coordinates of the upper-left and the lower-right vertices of their bounding boxes. Sentences can be built by composing vsymbols through relations such as containment, sibling, right-to, etc. As an example, the vsymbol NotOn in Fig. 1.2 is related to the two vsymbols Standby and Off in its bounding box through the containment relation. As another example, the vsymbol a in the string bacb is related to the vsymbol c through the right-to relation. In this case right-to is the visual counterpart of the string concatenation relation. As a consequence, in this framework we model string languages as a special case of visual languages.
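As a rough illustration of geometric relations computed on bounding boxes, the sketch below implements containment and right-to in Python. The `BBox` type, the coordinates, and the reading of right-to as "immediately adjacent on the right" are assumptions made for the example; screen coordinates with y growing downward are assumed as well.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box given by its upper-left (x1, y1) and lower-right (x2, y2) vertices."""
    x1: float
    y1: float
    x2: float
    y2: float


def contains(outer: BBox, inner: BBox) -> bool:
    """Containment: the inner box lies entirely within the outer box."""
    return (outer.x1 <= inner.x1 and outer.y1 <= inner.y1
            and inner.x2 <= outer.x2 and inner.y2 <= outer.y2)


def right_to(left: BBox, right: BBox) -> bool:
    """right-to: `right` starts where `left` ends and the two overlap vertically."""
    return left.x2 == right.x1 and left.y1 < right.y2 and right.y1 < left.y2


# Hypothetical coordinates for the NotOn superstate and its substates (Fig. 1.2).
not_on = BBox(0, 0, 100, 60)
standby = BBox(10, 10, 45, 50)
off = BBox(55, 10, 90, 50)

# The string "bacb" as a row of unit-width boxes, one per character.
boxes = [BBox(i, 0, i + 1, 1) for i in range(len("bacb"))]

print(contains(not_on, standby), contains(not_on, off))  # NotOn contains both
print(right_to(boxes[1], boxes[2]))                      # "a" right-to "c"
```

With this encoding a string is simply a visual sentence whose only relation is right-to, which is the sense in which string languages become a special case of visual languages.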

Notice that some relations have an explicit visual representation in vsentences, whereas others have only an implicit representation. The former are called explicit relations, whereas the latter are called implicit relations. For example, in Fig. 1.2 the connection relation is explicitly represented by edges, whereas the containment relation is implicitly represented. Moreover, a relation R is called univocal when, given a vsymbol x, there exists at most one vsymbol y such that x R y. Otherwise, R is called non-univocal. As an example, the right-to relation is univocal, while connection relations are usually non-univocal, as we will see in the following.

Figure 1.2: A visual sentence with both connection and geometric relations.

From the connection and geometric classes of relations it is possible to derive more specialized subclasses of relations, as shown in Fig. 1.3 [13]. In particular, the class Graph is suitable for modeling general graph-structured visual languages. Each vsymbol has a pre-defined set of attaching regions on its image. The relations are graph interconnections. Many of these languages have been used within software engineering methodologies. Examples include languages based on data flow graphs, state transition diagrams, Petri nets, entity-relationship diagrams, SADT diagrams, and Class and Object diagrams. The class Plex is suitable for modeling graph-structured visual languages, with the limitation that each vsymbol has a fixed number of attaching points [26], like flowcharts, chemical structures, Boolean and electric circuits, and so on. For instance, in flowchart languages each graphical object (boxes, diamonds, etc.) has a pre-defined number of attaching points (two for boxes, three for diamonds, etc.), and graphical objects can be connected only through links visualized as polylines. The vsymbols of the class Box are characterized by their bounding boxes. The relations are spatial compositions of three types: inclusion, intersection and spatial concatenation, which are all defined on the vertices of the bounding boxes of vsymbols. Most of the visual notations used in common software development methodologies can be successfully modeled through these relation classes. In fact, some notations can be modeled as pure geometric or pure connection, whereas others require a hybrid modeling, since they combine characteristics from both classes. As an example, Statecharts [34] use arrows between nodes, which correspond to connection relations, and the containment relation, which is a typical geometric relation.

[Tree diagram; nodes by level, top to bottom: Relation Classes; Connection, Geometric; Graph, Box; Plex, Iconic, String.]

Figure 1.3: A hierarchy of relation classes.

A particular type of relation is annotation: a vsymbol of a visual language can be annotated with a visual sentence from the same or a different visual language to provide a detailed description of it. As an example, Fig. 1.4 describes a UML class diagram containing two classes, Person and Traditional Person, combined in a generalization/specialization hierarchy. Each class is annotated with a statechart diagram that describes the dynamic behavior of the class. The vsymbols of the annotating language could in turn be annotated, yielding a hierarchy of visual languages. Note that the annotation need not be “visual”. As a matter of fact, a vsymbol could also be annotated by a sentence of a string language, such as a high-level textual specification.

Figure 1.4: A visual sentence with two annotations.

The concept of annotation leads to the definition of hierarchical visual notations, [13]. More formally, a hierarchical visual notation is a visual language whose vocabulary contains vsymbols that are annotated with textual or visual sentences from the same or from another language, yielding homogeneous or heterogeneous hierarchies of visual languages, respectively. Hierarchical visual language support is a vital aspect in software engineering. In fact, most software development methodologies rely upon a hierarchical combination of different visual notations. As an example, SADT diagrams are used to specify the functional part of a software system. They use boxes to represent activities. Each box can be annotated in turn with another SADT diagram to describe the details of the activity it represents [55]. Similarly, visual symbols of a flowchart can be annotated by textual Pascal-like code.
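The distinction between homogeneous and heterogeneous hierarchies can be sketched with a small recursive data structure. The classes `Sentence` and `Symbol`, the language names, and the helper functions below are hypothetical and only illustrate the definition given above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sentence:
    """A sentence of some (visual or textual) language; names are illustrative."""
    language: str
    symbols: list = field(default_factory=list)


@dataclass
class Symbol:
    """A vsymbol that may be annotated with a sentence of any language."""
    name: str
    annotation: Optional[Sentence] = None


def languages_used(sentence: Sentence) -> set:
    """Collect the languages of a sentence and of all nested annotations."""
    found = {sentence.language}
    for symbol in sentence.symbols:
        if symbol.annotation is not None:
            found |= languages_used(symbol.annotation)
    return found


def is_homogeneous(sentence: Sentence) -> bool:
    """A hierarchy is homogeneous when every annotation uses the same language."""
    return len(languages_used(sentence)) == 1


# A class diagram whose classes are annotated with statecharts, as in Fig. 1.4.
class_diagram = Sentence("class diagram", [
    Symbol("Person", Sentence("statechart", [Symbol("Single"), Symbol("Married")])),
    Symbol("Traditional Person", Sentence("statechart", [Symbol("Engaged")])),
])

# An SADT diagram whose box is refined by another SADT diagram.
sadt = Sentence("SADT", [Symbol("activity", Sentence("SADT", [Symbol("step")]))])

print(is_homogeneous(class_diagram), is_homogeneous(sadt))
```

Under these assumptions the class diagram yields a heterogeneous hierarchy, while the nested SADT diagram yields a homogeneous one.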

1.2.3 Representation of visual sentences

A visual sentence can be represented externally by materializing all of its vsymbols or internally by taking into account the M, S and L components of its vsymbols. Fig. 1.5 classifies all the possible representations. Usually, the M component of each vsymbol is represented as a set of attributes describing the physical characteristics of the vsymbol. In the case of visual languages, the attributes of the M component of a vsymbol may correspond to its position, a zoom factor, a list of elementary graphical objects, a bitmap file, etc. The L component may be represented through a conceptual structure such as a semantic network. In this thesis we mainly focus on the syntactic component S. We consider three ways to syntactically represent a visual sentence: attribute-based, relation-based and linear. We will see that these representations can be converted one into the other, even though the conversions may not be 1-to-1.

[Tree diagram: visual sentence representations are external or internal; internal representations cover M, S and L; S is represented as attribute-based, relation-based or linear.]

Figure 1.5: A classification of visual sentence representations.

Attribute-based representation

In the attribute-based case, a vsentence is represented by making explicit the syntactic attributes of the vsymbols composing it. Let us examine how this representation method applies to the visual languages in the class hierarchy of Fig. 1.3. In the case of visual languages from the class String, we can make explicit the value of the attribute position of a vsymbol, which represents the position index of the vsymbol in a string; in the case of Plex visual languages the attaching points of a vsymbol v are numbered and represented by an array ap[1], ..., ap[n]. The value of ap[i] is given by a unique label assigned to the link plugged into attaching point i of v; in the case of Graph visual languages the attaching regions of a vsymbol v are numbered and represented by an array aps[1], ..., aps[n] of sets. The value of aps[i] is the set of labels of the links plugged into attaching region i of v. Fig. 1.6(a) shows the attribute-based representation of an activity diagram by considering the link labeling provided in Fig. 1.6(b).

(a)
name      aps[1]    aps[2]    aps[3]
Start     {a}       -         -
Activ1    {a}       {b}       -
Sync1     {b}       {c,d}     -
Activ2    {c}       {e}       -
Activ3    {d}       {f}       -
Cond      {e}       {g}       {h}
Activ4    {g}       {i}       -
Activ5    {h}       {l}       -
Mux       {i}       {l}       {m}
Activ6    {f}       {n}       -
Sync2     {m,n}     {o}       -
Activ7    {o}       {p}       -
Halt      {p}       -         -

(b) [Diagram: the activity diagram with its links labeled a to p.]

Figure 1.6: Attribute-based representation (a) of the activity diagram in Fig. 1.1(ii) based on the link labeling in (b).

Relation-based representation

Given a visual sentence vs, let us consider a set R of binary relation identifiers. A labeled graph on R and vs, $G_{R,vs} = (N, E)$, is defined as follows:

• each node in N identifies a distinct vsymbol in the sentence vs;
• a labeled edge (x, y, REL) is in E iff REL ∈ R holds between subsets of syntactic attributes from the vsymbols x and y, respectively.

Definition 1.1 Let R and vs be a set of binary relation identifiers and a visual sentence, respectively. A relation-based representation of vs with respect to R is any labeled graph $G_{R,vs}$ that is connected.

In the following, we will denote a labeled graph $G_{R,vs}$ by listing its labeled edges in the format REL(x, y). As an example, let us consider the visual sentence in Fig. 1.6(b). It can be modeled according to the visual language class Graph, by using a class of relations of type $LINK_{i,j}$ defined as follows: a vsymbol x is in relation $LINK_{i,j}$ with a vsymbol y iff attaching point i of x is connected to attaching point j of y, i.e., iff $aps_x[i] \cap aps_y[j]$ is not empty. Under these assumptions, the relation-based representation of the activity diagram of Fig. 1.6(b) is given by the set:

{$LINK_{1,1}$(Start, Activ1), $LINK_{2,1}$(Activ1, Sync1), $LINK_{2,1}$(Sync1, Activ2), $LINK_{2,1}$(Sync1, Activ3), $LINK_{2,1}$(Activ2, Cond), $LINK_{2,1}$(Cond, Activ4), $LINK_{3,1}$(Cond, Activ5), $LINK_{2,1}$(Activ4, Mux), $LINK_{2,2}$(Activ5, Mux), $LINK_{3,1}$(Mux, Sync2), $LINK_{2,1}$(Activ3, Activ6), $LINK_{2,1}$(Activ6, Sync2), $LINK_{2,1}$(Sync2, Activ7), $LINK_{2,1}$(Activ7, Halt)}.

Linear representation

Definition 1.2 Given a visual sentence $vs = \{x_1, \ldots, x_n\}$ and a set of relation identifiers R, a linear representation of vs with respect to R is the pair $(G_{R,vs}, P)$ where:

1. $G_{R,vs}$ is a relation-based representation of vs with each relation in R invertible;
2. P is a permutation $(y_1, y_2, \ldots, y_n)$ of the vsymbols in vs such that for each $y_i$ with $1 < i \leq n$, there exists at least one index k such that $1 \leq k < i$ and an edge in $G_{R,vs}$ on $y_k$ and $y_i$.

A linear representation $(G_{R,vs}, P)$ will be denoted in the following as the string

$y_1 \, R_1 \, y_2 \, R_2 \, y_3 \ldots R_{n-1} \, y_n$

where each $R_j$ is a non-empty sequence of type

$REL_1^{h_1}, \ldots, REL_i^{h_i}, \ldots, REL_m^{h_m}$ with $m \geq 1$.

Each $REL_i^{h_i}$ denotes the pair $(REL_i, h_i)$ where $REL_i \in R$ or $REL_i = REL^{-1}$ with $REL \in R$; $REL_i$ labels the edge $(y_{j-h_i}, y_{j+1})$ in $G_{R,vs}$ and relates syntactic attributes of $y_{j+1}$ with syntactic attributes of $y_{j-h_i}$, with $0 \leq h_i < j$. In the rest of the thesis, we will denote $REL_1^0$ simply as $REL_1$. Notice that, since $G_{R,vs}$ is connected, whenever the relations are invertible it is possible to find a linear representation for vs. As an example, let us consider the activity diagram in Fig. 1.6. If we consider its relation-based representation given above, and the following permutation of vsymbols:

(Start, Activ1, Sync1, Activ2, Activ3, Cond, Activ4, Activ5, Mux, Activ6, Sync2, Activ7, Halt)

the corresponding linear representation is:

Start $LINK_{1,1}$ Activ1 $LINK_{2,1}$ Sync1 $LINK_{2,1}$ Activ2 $LINK^{1}_{2,1}$ Activ3 $LINK^{1}_{2,1}$ Cond $LINK_{2,1}$ Activ4 $LINK^{1}_{3,1}$ Activ5 $LINK^{1}_{2,1}, LINK_{2,2}$ Mux $LINK^{4}_{2,1}$ Activ6 $LINK^{1}_{3,1}, LINK_{2,1}$ Sync2 $LINK_{2,1}$ Activ7 $LINK_{2,1}$ Halt.

This linear representation fits the interpretation of an activity diagram well, since it follows the natural flow of the described activities. The same cannot be said if an alternative linear representation starting from Halt is chosen. Thus, the semantics of a visual sentence can drive the construction of one of its linear representations.
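The conversions between the three representations can be sketched in a few lines of Python. The code below starts from the attribute-based data of Fig. 1.6(a), derives the LINK relations, and then builds the relation sequences of a linear representation for a given permutation. The function names, the tuple encoding `(i, j, x, y)` for a LINK relation with indices i and j between x and y, the plain-text rendering `LINKi,j^h`, and the choice to orient every edge from the earlier to the later vsymbol of the permutation are assumptions made for this example, not part of the formal definitions.

```python
# Attribute-based representation transcribed from Fig. 1.6(a):
# each vsymbol maps to its list of attaching regions (sets of link labels).
APS = {
    "Start":  [{"a"}],
    "Activ1": [{"a"}, {"b"}],
    "Sync1":  [{"b"}, {"c", "d"}],
    "Activ2": [{"c"}, {"e"}],
    "Activ3": [{"d"}, {"f"}],
    "Cond":   [{"e"}, {"g"}, {"h"}],
    "Activ4": [{"g"}, {"i"}],
    "Activ5": [{"h"}, {"l"}],
    "Mux":    [{"i"}, {"l"}, {"m"}],
    "Activ6": [{"f"}, {"n"}],
    "Sync2":  [{"m", "n"}, {"o"}],
    "Activ7": [{"o"}, {"p"}],
    "Halt":   [{"p"}],
}

ORDER = ["Start", "Activ1", "Sync1", "Activ2", "Activ3", "Cond", "Activ4",
         "Activ5", "Mux", "Activ6", "Sync2", "Activ7", "Halt"]


def link_edges(aps, order):
    """Relation-based representation: tuples (i, j, x, y), x preceding y in `order`."""
    edges = []
    for a, x in enumerate(order):
        for y in order[a + 1:]:
            for i, region_x in enumerate(aps[x], start=1):
                for j, region_y in enumerate(aps[y], start=1):
                    if region_x & region_y:  # the two regions share a link label
                        edges.append((i, j, x, y))
    return edges


def linearize(edges, order):
    """Relation sequences R_1 .. R_{n-1} of the linear representation for `order`."""
    pos = {name: k for k, name in enumerate(order)}
    sequences = []
    for t in range(1, len(order)):
        rels = []
        for i, j, x, y in edges:
            if y == order[t]:
                h = (t - 1) - pos[x]  # how many positions back the source lies
                rels.append((h, f"LINK{i},{j}" + (f"^{h}" if h else "")))
        if not rels:  # condition 2 of Definition 1.2 is violated
            raise ValueError(f"{order[t]} is not linked to any earlier vsymbol")
        sequences.append([text for _, text in sorted(rels, reverse=True)])
    return sequences


EDGES = link_edges(APS, ORDER)
SEQS = linearize(EDGES, ORDER)

tokens = [ORDER[0]]
for rel, name in zip(SEQS, ORDER[1:]):
    tokens += [", ".join(rel), name]
print(" ".join(tokens))
```

A permutation that places a vsymbol before every vsymbol it is linked to (other than the first one) makes `linearize` raise an error, which corresponds to violating condition 2 of Definition 1.2.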

1.3 Thesis Outline

In Chapter 2, we present the eXtended Positional Grammars (XPG, for short), a grammar formalism for modeling visual notations, including those used in most software development methodologies. Then, in Chapter 3, we give an LR-based algorithm for the parsing of visual notations modeled through XPGs. This algorithm, named the XpLR(0) parser, is also able to solve a number of conflicts previously arising in pLR parsing tables. XPG and XpLR extend Positional Grammars (PG) [15] and the associated pLR parsing methodology. The extension enables us to model diagrammatic notations used in software engineering and to parse them efficiently. In Chapter 4 we describe how to construct parsers for XPGs by exploiting standard compiler generation tools, such as YACC. XPG and XpLR have been implemented within the latest version of the Visual Language Compiler-Compiler [14], a system for implementing visual languages, which is described in Chapter 5. This tool has been used for generating many practical visual environments, such as environments for the diagrammatic notations of UML, and for notations used in multimedia software engineering and in workflow management. In Chapter 6 we propose an approach for the construction of meta-CASE workbenches based on the technology of visual language generation systems and on UML meta-modeling. In Chapter 7, we review the related work. Finally, in Chapter 8 we present our closing remarks and discuss further directions for the research. Readers who are not familiar with the implementation of visual languages may find it helpful to glance over Chapter 7 before reading Chapters 2-6.

Chapter 2

Extended Positional Grammars

Extended Positional Grammars (XPGs) are a direct extension of Positional Grammars (PGs) [15]. The latter have been successfully used to model and implement several important visual languages, including languages from the classes Iconic, Plex, and Box. However, PGs were not suitable for modeling some critical visual languages, such as those belonging to the class Graph. This was a considerable limitation, since it prevented the application of the PG methodology to important application fields such as software engineering. In fact, most of the visual notations used in software development methodologies are based on the class Graph. Examples are UML class diagrams, Petri nets, Statechart diagrams, and Activity Diagrams. XPG overcomes this limitation, also thanks to a new parsing technique.

Moreover, we provide conflict handling techniques to simplify grammar design. As opposed to grammars for string languages, grammar formalisms modeling the two-dimensional space are more likely to run into ambiguities. Without efficient conflict handling techniques, such grammar formalisms could not be effectively used to model many practical visual languages: in order to avoid conflicts, the designer would have to produce complex grammars even for simple visual notations.

An Extended Positional Grammar is the pair (G, PE), where PE is a positional evaluator, and G can be seen as a particular type of context-free¹ string attributed grammar (N, T ∪ POS, S, P) where:

• N is a finite non-empty set of non-terminal vsymbols;
• T is a finite non-empty set of terminal vsymbols, with N ∩ T = ∅;
• POS is a finite set of binary relation identifiers, with POS ∩ N = ∅ and POS ∩ T = ∅;
• S ∈ N denotes the starting vsymbol;
• P is a finite non-empty set of productions having the following format:

  $A \to x_1 R_1 x_2 R_2 \ldots x_{m-1} R_{m-1} x_m,\ \Delta,\ \Gamma$

  where A is a non-terminal vsymbol, $x_1 R_1 x_2 R_2 \ldots x_{m-1} R_{m-1} x_m$ is a linear representation with respect to POS where each $x_i$ is a vsymbol in N ∪ T and each $R_j$ is partitioned in two sub-sequences

  $(\langle REL_1^{h_1}, \ldots, REL_k^{h_k} \rangle, \langle REL_{k+1}^{h_{k+1}}, \ldots, REL_n^{h_n} \rangle)$ with $1 \le k \le n$.

¹ Here "context-free" means that the grammar productions are in "context-free" format and does not refer to the computational power of the formalism.

The relation identifiers in the first sub-sequence of an $R_j$ are called driver relations, whereas the ones in the second sub-sequence are called tester relations. During syntax analysis, driver relations are used to determine the next vsymbol to be scanned, whereas tester relations are used to check whether the last scanned vsymbol (terminal or non-terminal) is properly related to previously scanned vsymbols. Without loss of generality we assume that there are no useless vsymbols, and no unit and empty productions [1].

∆ is a set of rules used to synthesize the values of the syntactic attributes of A from those of $x_1, x_2, \ldots, x_m$; Γ is a set of triples $(N_j, Cond_j, \Delta_j)_{j=1,\ldots,t}$, $t \ge 0$, used to dynamically insert new terminal vsymbols in the input visual sentence during the parsing process. In particular,

– $N_j$ is a terminal vsymbol to be inserted in the input visual sentence;
– $Cond_j$ is a pre-condition to be verified in order to insert $N_j$;
– $\Delta_j$ is the rule used to compute the values of the syntactic attributes of $N_j$ from those of $x_1, \ldots, x_m$.

Moreover, a property that guarantees the convergence of parsing algorithms based on XPGs is: "for each production $A \to x_1 \ldots x_m,\ \Delta,\ \Gamma$ the number of triples in Γ whose conditions can simultaneously evaluate to true must be less than m-1". This means that


no more than m-2 vsymbols can be inserted in the input during the application of a production.

Informally, a Positional Evaluator PE is a materialization function which transforms a linear representation into the corresponding visual sentence in the attribute-based representation and/or graphical representation.

In the following we characterize the languages described by an extended positional grammar XPG = ((N, T ∪ POS, S, P), PE). We write $\alpha \Leftarrow \beta$ and say that β reduces to α in one step, if there exist δ, γ, A, η such that

1. A → η, ∆, Γ is a production in P,
2. β = δηγ,
3. α = δA'πγ, where A' is a vsymbol whose attributes are set according to the rule ∆ and π results from the application of the rule Γ.

We also write $\alpha \overset{i}{\Leftarrow} \beta$ to indicate that the reduction has been achieved by applying production i. Moreover, we write $\alpha \overset{*}{\Leftarrow} \beta$ and say that β reduces to α, if there exist $\alpha_0, \alpha_1, \ldots, \alpha_m$ (m ≥ 0) such that

$\alpha = \alpha_0 \Leftarrow \alpha_1 \Leftarrow \ldots \Leftarrow \alpha_m = \beta$

The sequence $\alpha_m, \alpha_{m-1}, \ldots, \alpha_0$ is called a derivation of α from β.

• a positional sentential form from S is a string β such that $S \overset{*}{\Leftarrow} \beta$
• a positional sentence from S is a string β containing no non-terminals and such that $S \overset{*}{\Leftarrow} \beta$
• a visual sentential form (visual sentence, resp.) from S is the result of evaluating a positional sentential form (positional sentence, resp.) from S through PE.

The language described by an XPG, L(XPG), is the set of the visual sentences from the starting vsymbol S of XPG. Without loss of generality, let us assume that XPG has no empty productions.

Given the two pairs (x, k) and (y, j), where x ∈ N ∪ T, y ∈ T, k is a syntactic attribute of x, and j is a syntactic attribute of y, we say that (y, j) is reachable from (x, k) iff one of the following situations occurs:

1. x = y;
2. there exists a production $x \to x_1 R_1 x_2 \ldots x_i \ldots R_{m-1} x_m,\ \Delta,\ \Gamma$ in P such that attribute k of x is synthesized from attribute h of $x_i$ by means of ∆, and (y, j) is reachable from ($x_i$, h).
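The structure of an XPG production defined above can be mirrored in a small data model. The following is only a minimal sketch: the class names (`RelationSeq`, `GammaTriple`, `Production`), the relation strings and the method `statically_convergent` are hypothetical and not part of the formalism. The convergence check is deliberately conservative: it only counts the triples in Γ, which is an upper bound on how many conditions can hold simultaneously.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Attrs = Dict[str, object]  # syntactic attributes of one vsymbol (assumed encoding)


@dataclass
class RelationSeq:
    """One R_j: a non-empty list of driver relations plus optional tester relations."""
    drivers: List[str]
    testers: List[str] = field(default_factory=list)


@dataclass
class GammaTriple:
    """One triple (N_j, Cond_j, Delta_j) of a Gamma rule."""
    symbol: str                            # terminal vsymbol to insert
    cond: Callable[[List[Attrs]], bool]    # pre-condition over x_1..x_m
    rule: Callable[[List[Attrs]], Attrs]   # attributes of the inserted vsymbol


@dataclass
class Production:
    lhs: str
    rhs: List[str]                         # x_1 .. x_m
    relations: List[RelationSeq]           # R_1 .. R_{m-1}
    delta: Callable[[List[Attrs]], Attrs]  # synthesizes the attributes of lhs
    gamma: List[GammaTriple] = field(default_factory=list)

    def __post_init__(self) -> None:
        if len(self.relations) != len(self.rhs) - 1:
            raise ValueError("a production with m vsymbols needs m-1 relation sequences")
        if any(not r.drivers for r in self.relations):
            raise ValueError("each R_j needs at least one driver relation (k >= 1)")

    def statically_convergent(self) -> bool:
        # Sufficient condition only: fewer than m-1 triples in total implies
        # fewer than m-1 triples whose conditions hold at the same time.
        return len(self.gamma) < len(self.rhs) - 1


# A production with m = 3 vsymbols and a single Gamma triple (1 < m-1 = 2).
p = Production(
    lhs="Graph",
    rhs=["Graph", "EDGE", "Node"],
    relations=[RelationSeq(["LINK_1_1"], ["not LINK_1_2"]), RelationSeq(["LINK_2_1"])],
    delta=lambda xs: {},
    gamma=[GammaTriple("PLACEHOLD", lambda xs: True, lambda xs: {})],
)
print(p.statically_convergent())
```

A production with two right-hand-side vsymbols and one unconditional Γ triple fails this check, since 1 is not less than m-1 = 1.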

If (y, j) is reachable from (x, k), we also say that y is reachable from x.

The new features of Extended Positional Grammars, as opposed to Positional Grammars, include the use of multiple driver relations and the introduction of Γ rules to dynamically modify the input visual sentence. It is easy to show that these features dramatically improve the expressive power of positional grammars. In the following we show three examples of XPG grammars: the first presents a simple grammar describing a plex visual sentence, the second describes a context-sensitive string language, and the third models a State Transition Diagram language.

Example 2.1 Let us consider the following grammar:

N = {A, B}; T = {a, b, d, e, f}

POS = {LINKi,j}, where LINKi,j is defined as in Section 1.2.3, and will be denoted as h_k to simplify the notation. All the vsymbols have two attaching regions as syntactic attributes, except A, which has no attributes. In the following, the notation Vsym_i denotes the attaching point i of the vsymbol Vsym. The set of productions P is:

(1) A → a 1 1 B 11 1,2 2 b 11 1 d
(2) B → e 2 2 f
    ∆: (B_1 = e_1; B_2 = f_2)
    Γ: {(d; true; d_1 = e_1)}

Notice that d is a fictitious terminal vsymbol to be dynamically inserted in the input sentence during the parsing process. Fig. 2.1 shows how the picture described by the grammar is reduced to the starting non-terminal A by using productions 2 and 1.

Figure 2.1: A reduction process.

Example 2.2 Let us consider the context-sensitive language $L = \{a^n b^n c^n \mid n \ge 1\}$. It is generated by the string grammar with the following productions:

(1) S → a B S c
(2) S → a B c
(3) B c → b c
(4) B a → a B
(5) B b → b b

where the non-terminals are S and B, and the terminals are a, b and c. As a matter of fact, the sentence $a^2 b^2 c^2$ is obtained through the following derivation:

$S \overset{1}{\Rightarrow} aBSc \overset{2}{\Rightarrow} aBaBcc \overset{3}{\Rightarrow} aBabcc \overset{4}{\Rightarrow} aaBbcc \overset{5}{\Rightarrow} aabbcc$

The extended positional grammar which generates the context-sensitive language $a^n b^n c^n$ can be obtained by modifying this string grammar accordingly. In particular, the set of non-terminals is given by N = {S, B}, where each vsymbol has two syntactic attributes, named head and tail, both specifying a position in the plane. The set of terminals is given by T = {a, b, c}; each terminal has one syntactic attribute (the pair of coordinates of its centroid), referred to as head or tail interchangeably. As described in Section 1.2.2, the right-to relation is the visual counterpart of the string concatenation relation. Thus, the set of relations is given by POS = {right-to} and the right-to relation can be defined as:

x right-to y if and only if ∃! y | y_head = x_tail + 1

where x, y ∈ N ∪ T. The set of productions P is described below.

(1) S → a right-to B right-to S right-to c
    ∆: (S_head = a_head; S_tail = c_tail)
(2) S → a right-to B right-to c
    ∆: (S_head = a_head; S_tail = c_tail)
(3) B → b right-to c
    ∆: (B_head = b_head; B_tail = b_tail)
    Γ: {(c'; true; c'_head = c_head, c'_tail = c_tail)}
(4) B → a right-to B'
    ∆: (B_head = a_head; B_tail = a_tail)
    Γ: {(a'; true; a'_head = B'_head, a'_tail = B'_tail)}


(5) B → b right-to b'
    ∆: (B_head = b_head; B_tail = b_tail)
    Γ: {(b''; true; b''_head = b'_head, b''_tail = b'_tail)}

Notice that superscripts are used to distinguish different occurrences of the same vsymbol, and that the terminals in the left-hand side of the string grammar productions are moved into the Γ rules of the XPG productions. These productions do not satisfy the property that guarantees the convergence of the parsing algorithm, but it can easily be shown that the algorithm converges.

Example 2.3 Let STD = ((N, T ∪ POS, S, P), PE) be the XPG for State Transition Diagrams, characterized as follows. The set of non-terminals is given by N = {StateTD, Graph, Node}, where each vsymbol has one attaching region as syntactic attribute, and StateTD is the starting vsymbol, i.e. S = StateTD. The set of terminals is given by T = {NODEI, NODEIF, NODEF, NODEG, EDGE, PLACEHOLD}. The terminal vsymbols NODEI, NODEIF, NODEF, NODEG have one attaching region as syntactic attribute. They represent the initial, the initial and final, the final, and the generic node, respectively, of a state transition diagram. The terminal vsymbol EDGE has two attaching points as syntactic attributes, corresponding to the start and end points of the edge. Finally, PLACEHOLD is a fictitious terminal vsymbol to be dynamically inserted in the input sentence during the parsing process. It has one attaching region as syntactic attribute. The tokens are graphically depicted in Fig. 2.2. Here, each attaching region is represented by a bold line and is identified by the number 1, whereas the two attaching points of EDGE are represented by bullets and are identified each by a number.

Figure 2.2: The terminals for the grammar STD.

The set of relations is given by POS = {LINKi,j, any}, where the relation identifier any denotes a relation that is always satisfied between any pair of vsymbols. Moreover, we


use the notation $\overline{h\_k}$ when describing the absence of a connection between two attaching areas h and k. Next, we provide the set of productions for describing State Transition Diagrams.

(1) StateTD → Graph
(2) Graph → NODEI
    ∆: (Graph_1 = NODEI_1)
(3) Graph → NODEIF
    ∆: (Graph_1 = NODEIF_1)
(4) Graph → Graph' ⟨1_1, $\overline{1\_2}$⟩ EDGE 2_1 Node
    ∆: (Graph_1 = Graph'_1 - EDGE_1)
    Γ: {(PLACEHOLD; |Node_1| > 1; PLACEHOLD_1 = Node_1 - EDGE_2)}
(5) Graph → Graph' ⟨1_1, 1_2⟩ EDGE
    ∆: (Graph_1 = (Graph'_1 - EDGE_1) - EDGE_2)
(6) Graph → Graph' ⟨1_2, $\overline{1\_1}$⟩ EDGE 1_1 Node
    ∆: (Graph_1 = Graph'_1 - EDGE_2)
    Γ: {(PLACEHOLD; |Node_1| > 1; PLACEHOLD_1 = Node_1 - EDGE_1)}
(7) Graph → Graph' any PLACEHOLD
    ∆: (Graph_1 = PLACEHOLD_1)
(8) Node → NODEG
    ∆: (Node_1 = NODEG_1)
(9) Node → NODEF
    ∆: (Node_1 = NODEF_1)
(10) Node → PLACEHOLD
    ∆: (Node_1 = PLACEHOLD_1)

Notice that Graph_1 = Graph'_1 - EDGE_1 indicates set difference and is to be interpreted as follows: "the attaching area 1 of Graph has to be connected to whatever is attached to the attaching area 1 of Graph' except for the attaching point 1 of EDGE". Moreover, the notation |Node_1| indicates the number of connections to the attaching area 1 of Node.
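The set-difference reading of the ∆ rule and the |Node_1| > 1 test of the Γ rule can be made concrete with a short sketch for production 4. It rests on an assumed encoding that is not part of the grammar: an attaching region is a frozen set of edge endpoints, each endpoint being a pair (edge name, attaching point). The function name `apply_production_4` and the edge names are hypothetical.

```python
from typing import FrozenSet, Optional, Tuple

Endpoint = Tuple[str, int]      # (edge name, attaching point: 1 = start, 2 = end)
Region = FrozenSet[Endpoint]    # all edge endpoints attached to one attaching region


def apply_production_4(
    graph: Region, edge: str, node: Region
) -> Optional[Tuple[Region, Optional[Region]]]:
    """Reduce Graph' EDGE Node to Graph; return (Graph_1, PLACEHOLD_1 or None)."""
    start, end = (edge, 1), (edge, 2)
    # Relations: EDGE starts on Graph', does not end on Graph', and ends on Node.
    if start not in graph or end in graph or end not in node:
        return None
    new_graph = graph - {start}                           # Delta: Graph'_1 - EDGE_1
    placeholder = node - {end} if len(node) > 1 else None  # Gamma: |Node_1| > 1
    return new_graph, placeholder


# A node with four connections, reached from a graph that has two outgoing edges.
graph = frozenset({("ShowTips", 1), ("NoTips", 1)})
state2 = frozenset({("ShowTips", 2), ("NextTip", 1), ("NextTip", 2), ("ExitTips", 1)})
new_graph, placeholder = apply_production_4(graph, "ShowTips", state2)
print(sorted(new_graph))
print(sorted(placeholder))
```

Here the new Graph keeps only the pending start point of the second edge, while the inserted PLACEHOLD inherits the three remaining connections of the reduced node.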

According to these rules, a State Transition Diagram is described by a graph (production 1) defined as

• an initial node (production 2) or as
• an initial-final node (production 3) or, recursively, as
• a graph connected to a node through an outgoing (production 4) or incoming (production 6) edge, or as
• a graph with a loop edge (production 5).

A node can be either a generic node (production 8) or a final node (production 9). The need for productions 7 and 10 will be clarified in the following example.

Example 2.4 Suppose we are to model the help module of a software package. Fig. 2.3 shows the state transition diagram describing the behavior of this help module. When the user enters the software package, this goes into state 1. At this point a window pops up, letting the user access the help module by clicking on the Show Tips option, or refuse help support and directly go to the package functions (state 3) by clicking on the No Tips option. If he/she chooses the first option, the package enters state 2, where a help page is displayed. In this state the user can decide to exit tips and access other package functions (state 3), or to show the next help page and to remain in state 2.

Figure 2.3: The state transition diagram for a help module.

Figs. 2.4(a-i) show the steps to reduce the state transition diagram of this help module through the extended positional grammar STD shown above. In particular, dashed ovals indicate the handles to be reduced, and their labels indicate the productions to be used. The reduction process starts by applying production 2 to the initial state transition diagram. This causes the terminal NODEI representing state 1 to be reduced to the


non-terminal Graph. Due to the ∆ rule of production 2, Graph inherits all the connections of NODEI. Similarly, the application of production 8 replaces the unique NODEG of Fig. 2.4(a) with the non-terminal Node. Fig. 2.4(b) shows the resulting visual sentential form, and highlights the handle for the application of production 4. The vsymbols Graph, EDGE, and Node are then reduced to the new non-terminal Graph. Due to the ∆ rule of production 4, the new Graph is connected to all the remaining edges attached to the old Graph. Moreover, due to the Γ rule, since |Node| = 4 > 1, a new node PLACEHOLD is inserted in the input, and it is connected to all the remaining edges attached to the old Node. Fig. 2.4(c) shows the resulting visual sentential form. After the application of productions 9 and 4 the visual sentential form reduces to the one shown in Fig. 2.4(e). Then, production 7 reduces the non-terminal Graph and the terminal PLACEHOLD to a new non-terminal Graph. By applying the ∆ rule of production 7, the new Graph inherits all the connections to PLACEHOLD (see Fig. 2.4(f)). The subsequent application of productions 10, 5, 4 and 1 reduces the original state transition diagram to the starting vsymbol in Fig. 2.4(i), confirming that the visual sentence associated with the initial state transition diagram belongs to the visual language L(STD).

Figure 2.4: The reduction process for a state transition diagram.

Ambiguous grammars

An XPG may have two types of ambiguity. The first one says that an XPG G is structurally ambiguous if there exists at least one positional sentence having more than one derivation. This is similar to the definition of ambiguity for textual grammars. The second type of


ambiguity for an XPG G occurs when the application of PE to different positional sentences produces the same visual sentence. In this case G is said to be visually ambiguous. Obviously, an XPG may present both ambiguities simultaneously. As an example of visual ambiguity, let us consider the grammar for state transition diagrams given in Example 2.3. It is easy to verify that the visual sentence in Fig. 2.5 can be reduced by the grammar through two different reduction processes. In particular, one reduction applies production 4 and then production 5, while the other applies production 5 and then production 4.

Figure 2.5: A visually ambiguous sentence for the grammar of example 2.3.
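The two reduction orders can be enumerated mechanically. The sketch below is a brute-force search over productions 4, 5, 6 and 7 of the grammar STD, not the XpLR parser. It makes several assumptions: regions are encoded as sets of edge endpoints, the node reductions (productions 8-10) and the final production 1 are left implicit, production 7 is tried only when the current Graph has no pending connections, and the sentence of Fig. 2.5 is taken to be an initial node carrying a loop edge plus one edge to a second node. All names are hypothetical.

```python
from typing import FrozenSet, Optional, Set, Tuple

Endpoint = Tuple[str, int]        # (edge name, attaching point: 1 = start, 2 = end)
Region = FrozenSet[Endpoint]
Node = Tuple[bool, Region]        # (is it a PLACEHOLD?, attaching region)


def _consume(nodes: Tuple[Node, ...], point: Endpoint) -> Optional[Tuple[Node, ...]]:
    """Remove the node attached to `point`, applying the Gamma rule of productions 4/6."""
    for i, (_, region) in enumerate(nodes):
        if point in region:
            rest = nodes[:i] + nodes[i + 1:]
            if len(region) > 1:                     # |Node_1| > 1: insert PLACEHOLD
                rest += ((True, region - {point}),)
            return rest
    return None


def derivations(graph: Region, nodes: Tuple[Node, ...],
                edges: FrozenSet[str]) -> Set[Tuple[int, ...]]:
    """Every sequence of productions 4, 5, 6, 7 that consumes the whole diagram."""
    if not graph and not nodes and not edges:
        return {()}
    found: Set[Tuple[int, ...]] = set()

    def extend(prod: int, g: Region, n: Tuple[Node, ...], e: FrozenSet[str]) -> None:
        for tail in derivations(g, n, e):
            found.add((prod,) + tail)

    for edge in edges:
        start, end = (edge, 1), (edge, 2)
        if start in graph and end in graph:         # production 5: loop edge
            extend(5, graph - {start, end}, nodes, edges - {edge})
        elif start in graph:                        # production 4: outgoing edge
            rest = _consume(nodes, end)
            if rest is not None:
                extend(4, graph - {start}, rest, edges - {edge})
        elif end in graph:                          # production 6: incoming edge
            rest = _consume(nodes, start)
            if rest is not None:
                extend(6, graph - {end}, rest, edges - {edge})
    if not graph:                                   # production 7 (assumed restriction)
        for i, (is_placeholder, region) in enumerate(nodes):
            if is_placeholder:
                extend(7, region, nodes[:i] + nodes[i + 1:], edges)
    return found


# Assumed layout of Fig. 2.5: node 1 has a loop and an edge "e" to node 2.
graph = frozenset({("e", 1), ("loop", 1), ("loop", 2)})
nodes = ((False, frozenset({("e", 2)})),)
print(sorted(derivations(graph, nodes, frozenset({"e", "loop"}))))
# → [(4, 5), (5, 4)]
```

Under these assumptions the search finds exactly the two orders named in the text, which is what makes the sentence ambiguous for this grammar.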

2.1 Modeling UML Statechart Diagrams through XPGs

In this section we show how Extended Positional Grammars can be used to specify UML Statechart Diagrams [17]. This language is derived from the original proposal by Harel [34], with modifications in order to include object-oriented features. It models the states of an object and how an object moves from state to state for its entire lifetime. In particular the OMG Unified Modeling Language Specification states that “statechart diagrams represent the behavior of entities capable of dynamic behavior by specifying its response to the receipt of event instances. Typically, it is used for describing the behavior of class instances, but statecharts may also describe the behavior of other entities such as use-cases, actors, subsystems, operations, or methods” [50]. UML statechart diagrams are a very rich graphical specification formalism obtained as an extension of conventional finite state machines with more powerful concepts such as hierarchy of states, orthogonality, interlevel transitions, etc. [34]. There are five different kinds of entities in a statechart diagram namely statevertexes, transitions (arcs), events, conditions and actions. As described in [50], statevertexes may be states, pseudostates, stub states or synch states. The states (shown as rectangles with rounded corners) may be composite (such as Alive in Fig. 2.6), simple (such as Created and Dead ) or final.


Composite states may be OR-States or AND-States. OR-States contain a set of other states (substates) that are related to each other by "exclusive-or". AND-States contain at least two unnamed concurrent OR-States separated by dashed lines. Simple states are those that have no substates (they are at the bottom of the hierarchy). A final state (shown as a circle surrounding a small solid filled circle) represents the completion of an activity in the enclosing state. A state may have associated internal transitions that depict what activities the object will be doing while it is in that state. The general format for the internal transitions is: event-signature '[' guard-condition ']' '/' action-expression, where the event-signature describes an event, that is, an occurrence that may trigger an action, the guard-condition is a boolean expression, and the action-expression is executed if and when the event occurs. As an example, in Fig. 2.9 the state Working has two internal transitions. The first transition, entry/i++, specifies that the counter i is incremented upon entry to the state Working; the second, exit/i--, specifies that the counter i is decremented upon exit from the state Working.

Figure 2.6: Java Thread Life Cycle.

The pseudostate kinds are: initial, deepHistory, shallowHistory, join, fork, junction and choice. Each OR-State (as well as the statechart diagram itself) contains exactly one initial pseudostate, which is shown as a small solid filled circle. Transitions connect states and may be labeled by a transition string that has the same format as the internal transitions of the states. As an example, in Fig. 2.6 the transition from the state Running to the state Waiting is labelled with the transition string formed by the event wait and the action release lock. To simplify the description of the statechart diagram language, the deepHistory indicator, the stub states and the synch states will not be considered in the following. Moreover, we consider statechart diagrams where the initial pseudostates are connected only to states.

Thus, statechart diagrams are a hybrid visual notation since they use links between nodes to model transitions and spatial relations such as "containment" to model the hierarchy of states. It is widely recognized that the main difficulties in modeling statecharts derive from the presence of interlevel transitions (like the transition labeled with the event run returns in Fig. 2.6), multiple source/multiple target transitions, join/fork connectors, and history connectors. In order to capture the interlevel transitions, the extended positional grammar proposed in this section models statechart diagrams in two different phases. The first phase determines the hierarchy of states and the second phase analyzes the transitions amongst states. Thus, after the first phase, the statechart diagram is transformed into a graph formed by the states and the transitions of the statechart. As an example, Fig. 2.7 shows the graph corresponding to the statechart in Fig. 2.6.

Figure 2.7: The graph describing the transitions in the statechart diagram of Fig. 2.6.

Now, we focus our attention on the first phase. In order to describe how a state is contained in a superstate we introduce a spatial relation contains. To this aim we associate to the states a containment area as syntactic attribute. Such containment area corresponds to the rectangle area representing the state. Thus the containment relation is defined as: A contains B if and only if the containment area of A is the closest containment area that contains B, where A and B are states of a statechart diagram. As an example from Fig. 2.6 Alive contains Joined holds because there are no states that contain Joined and are contained in Alive. Moreover the states of a statechart diagram can be related through a sibling relationship whenever either the states are contained in the same superstate, or they are not contained in superstates. Such relationship can be described by a spatial relation, named sibling, defined as: A sibling B if and only if


1. there exists C such that C contains A and C contains B, or
2. there does not exist a C such that C contains A or C contains B

where A, B and C are states of a statechart diagram. As an example, Running sibling Waiting holds because Alive contains Running and Alive contains Waiting; moreover, Alive sibling Dead holds because condition 2 is satisfied.

The extended positional grammar SD = ((N, T ∪ POS, S, P), PE) for UML statechart diagrams has the following characteristics. The set of non-terminals is given by N = {StateDiagram, Hierarchy, HSeqState, SeqState, IState, Initial, Diagram, State, Connector, Component, Comp, MultiGraph, Graph, Sync, Node}, where the first eleven vsymbols have one containment area as syntactic attribute, and the others have one attaching region as syntactic attribute. The set of terminals is given by T = {INITIAL, STATE, CONCURRENT, FINAL, HISTORY, JUNCTION, CHOICE, FORK_JOINT, EDGE, NEWSTATE, NEW_FJ, PLACEHOLD}; the terminals are graphically depicted in Fig. 2.8.
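The contains and sibling relations defined above can be sketched over a simple geometric model. This is an illustration under stated assumptions only: containment areas are taken to be axis-aligned rectangles that are either disjoint or properly nested, so that the closest enclosing area is the enclosing rectangle with the smallest area. The function names and the coordinates are hypothetical.

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)


def encloses(outer: Rect, inner: Rect) -> bool:
    """True if `inner` lies strictly inside `outer`."""
    return (outer[0] < inner[0] and outer[1] < inner[1]
            and outer[2] > inner[2] and outer[3] > inner[3])


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def parent(name: str, states: Dict[str, Rect]) -> Optional[str]:
    """Closest enclosing state of `name`, or None for a top-level state."""
    enclosing = [s for s in states if s != name and encloses(states[s], states[name])]
    return min(enclosing, key=lambda s: area(states[s]), default=None)


def contains(a: str, b: str, states: Dict[str, Rect]) -> bool:
    """A contains B iff A's area is the closest containment area that contains B."""
    return parent(b, states) == a


def sibling(a: str, b: str, states: Dict[str, Rect]) -> bool:
    """Same closest superstate (condition 1) or no superstate at all (condition 2)."""
    return parent(a, states) == parent(b, states)


# Hypothetical coordinates loosely following the layout of the thread life cycle.
states = {
    "Alive": (0, 0, 100, 100),
    "Running": (10, 10, 30, 30),
    "Waiting": (40, 10, 60, 30),
    "Joined": (10, 40, 30, 60),
    "Dead": (120, 0, 140, 20),
}
print(contains("Alive", "Joined", states))    # → True
print(sibling("Running", "Waiting", states))  # → True
print(sibling("Alive", "Dead", states))       # → True
print(sibling("Running", "Dead", states))     # → False
```

Computing the closest enclosing state once per state yields the containment tree that the first parsing phase has to recover.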

Figure 2.8: The terminals for the grammar SD.

Here, each attaching region is represented by a bold line and identified by a number, each containment area is represented by a dotted filled area, while the attaching points are represented by bullets. The terminal vsymbol INITIAL has one attaching region as syntactic attribute and represents the initial pseudostate. The terminal vsymbol STATE has one attaching region and one containment area as syntactic attributes and represents the state of a statechart diagram. The terminal vsymbol CONCURRENT has one containment area as syntactic attribute, and it represents the concurrent substate contained in an AND-State. The terminal vsymbol FINAL has one attaching region as syntactic attribute and represents a final state. The terminal vsymbols HISTORY, JUNCTION and CHOICE have one attaching region as syntactic attribute and they represent the history


state indicator, the junction point (which is used to merge and split transitions), and the dynamic choice point, respectively. The terminal vsymbol FORK_JOINT has two attaching regions as syntactic attributes, corresponding to the incoming transitions and outgoing transitions, and it represents synchronization, forking, or both. The terminal vsymbol EDGE has two attaching points as syntactic attributes, corresponding to the start and end points of the transition. Finally, NEWSTATE, NEW_FJ and PLACEHOLD are fictitious terminal vsymbols to be dynamically inserted in the input sentence during the parsing process.

The set of relations is given by POS = {LINKi,j, sibling, contains, any}, where LINKi,j is as defined in Section 2.3 and will be denoted as h_k to simplify the notation. Next, we provide the set of productions of the first phase to determine the hierarchy of states in the statechart diagrams.

(1) StateDiagram → Hierarchy ⟨ ⟩ MultiGraph
(2) StateDiagram → Hierarchy
(3) Hierarchy → IState ⟨ ⟩ HSeqState
    ∆: (Hierarchy_area = IState_area)
(4) Hierarchy → IState
    ∆: (Hierarchy_area = IState_area)
(5) HSeqState → SeqState ⟨ ⟩ HISTORY
    ∆: (HSeqState_area = SeqState_area)
(6) HSeqState → SeqState
    ∆: (HSeqState_area = SeqState_area)
(7) SeqState → Diagram ⟨ ⟩ SeqState'
    ∆: (SeqState_area = Diagram_area)
(8) SeqState → Diagram
    ∆: (SeqState_area = Diagram_area)
(9) IState → Initial ⟨ ⟩ Hierarchy
    ∆: (IState_area = Initial_area)
(10) IState → Initial
    ∆: (IState_area = Initial_area)
(11) IState → Initial ⟨ ⟩ Component
    ∆: (IState_area = Initial_area)
(12) Initial → INITIAL ⟨ ⟩ EDGE ⟨ ⟩ STATE
    ∆: (Initial_area = STATE_area)
    Γ: {(NEWSTATE; |STATE_1| > 1; NEWSTATE_1 = STATE_1 - EDGE_2)}
(13) Diagram → State
    ∆: (Diagram_area = State_area)
(14) Diagram → Connector
    ∆: (Diagram_area = Connector_area)
(15) State → STATE ⟨ ⟩ Hierarchy
    ∆: (State_area = STATE_area)
    Γ: {(NEWSTATE; |STATE_1| > 0; NEWSTATE_1 = STATE_1)}
(16) State → STATE
    ∆: (State_area = STATE_area)
    Γ: {(NEWSTATE; |STATE_1| > 0; NEWSTATE_1 = STATE_1)}
(17) State → STATE ⟨ ⟩ Component
    ∆: (State_area = STATE_area)
    Γ: {(NEWSTATE; |STATE_1| > 0; NEWSTATE_1 = STATE_1)}
(18) State → FINAL
    ∆: (State_area = ∅)
    Γ: {(NEWSTATE; |FINAL_1| > 0; NEWSTATE_1 = FINAL_1)}
(19) Connector → JUNCTION
    ∆: (Connector_area = ∅)
    Γ: {(NEWSTATE; |JUNCTION_1| > 0; NEWSTATE_1 = JUNCTION_1)}
(20) Connector → CHOICE
    ∆: (Connector_area = ∅)
    Γ: {(NEWSTATE; |CHOICE_1| > 0; NEWSTATE_1 = CHOICE_1)}
(21) Connector → FORK_JOINT
    ∆: (Connector_area = ∅)
    Γ: {(NEW_FJ; |FORK_JOINT_1| > 0; NEW_FJ_1 = FORK_JOINT_1, NEW_FJ_2 = FORK_JOINT_2)}
(22) Component → Comp ⟨ ⟩ Component'
    ∆: (Component_area = Comp_area)
(23) Component → Comp ⟨ ⟩ Comp'
    ∆: (Component_area = Comp_area)
(24) Comp → CONCURRENT ⟨ ⟩ Hierarchy
    ∆: (Comp_area = CONCURRENT_area)


Productions 1 and 2 describe a statechart diagram as a hierarchy of states (first phase of the reduction process), represented by the non-terminal Hierarchy, and an optional set of transitions between states (second phase), represented by the non-terminal MultiGraph. A Hierarchy is formed by an initial state (IState) and an optional sequence of states and/or connectors (HSeqState), as described in productions 3 and 4. An initial state is obtained by linking an initial pseudostate to a state (production 12), and it can be:

• an initial simple state (production 10), or
• an initial OR-State (production 9), which is formed by the non-terminal Initial with a Hierarchy in its containment area, or
• an initial AND-State (production 11), which is formed by the non-terminal Initial with a non-terminal Component in its containment area. A Component is formed by at least two non-terminals Comp, where each one is a terminal CONCURRENT with a Hierarchy in its containment area (productions 22-24).

Productions 5 and 6 define a HSeqState as formed by the non-terminal SeqState and an optional history indicator. SeqState is a sequence of states and/or connectors related through the sibling relation (productions 7-8). A state can be a simple state (production 16), an OR-State (production 15), an AND-State (production 17), or a final state (production 18). A connector can be a junction connector (production 19), a dynamic choice connector (production 20), or a fork/join connector (production 21). During the first phase the terminals with incident links are reintroduced, as described in the Γ rules, and reanalyzed in the second phase.

Fig. 2.9 shows the statechart diagram describing the behavior of a job processing environment. In the following we show the steps to reduce such a statechart diagram through the extended positional grammar SD. Figs. 2.10(a-i) show the first phase of the reduction process.
In particular, the process starts by applying production 12 to the initial state Working. This causes the reduction of the vsymbols INITIAL, EDGE and STATE to the non-terminal IState. Due to the Γ rule of production 12, since |STATE_1| = 3, a new state NEWSTATE is inserted in the input, and it inherits all the connections of STATE except for the connection to EDGE. Similarly, the application of production 16 reduces the vsymbol STATE representing the state Waiting to the non-terminal State, and the Γ rule inserts in the input sentence a new state NEWSTATE. Fig. 2.10(b) shows the resulting visual sentential form, and highlights the handles for the application of productions 12, 16 and 18 respectively to the initial states, to the state Sending and to the final states contained in the AND-State, and for the applications of production 10 to the non-terminal Initial and of production 13 to the non-terminal State. The resulting visual sentential form is shown in Fig. 2.10(c). After the application of several productions, the reduction process reduces the AND-State to the non-terminal Diagram, which will be reduced with the non-terminals SeqState to the non-terminal HSeqState by applying productions 7 and 6. Fig. 2.10(g) shows the resulting visual sentential form with the handle for the application of production 3. Fig. 2.10(i) shows the visual sentential form resulting from the first phase of the reduction process of the statechart diagram in Fig. 2.9. It is composed of the non-terminal Hierarchy and a non-connected graph obtained by reintroducing new states during the reduction. In general, a node of the non-connected graph can be either the terminal NEWSTATE or the terminal NEW_FJ.

Figure 2.9: A statechart diagram that models the behavior of a job processing environment.

Now, we provide the set of productions of the second phase to analyze the transitions amongst such states. These can be easily obtained by modifying the productions of the grammar for state transition diagrams introduced in Example 2.3. As a matter of fact, with productions 25 and 26 we describe a non-connected graph as formed by one or more graphs, where each one is a state transition diagram with no initial node and no final node (productions 27-31). Finally, we add the productions for the fork/join connector (productions 32-37).

(25) MultiGraph → Graph 〈〉 MultiGraph ∆: (MultiGraph1 = Graph1)
(26) MultiGraph → Graph ∆: (MultiGraph1 = Graph1)
(27) Graph → NEWSTATE ∆: (Graph1 = NEWSTATE1)
(28) Graph → Graph’ 〈〈〉,〈〉〉 EDGE Node ∆: (Graph1 = Graph’1 - EDGE1); Γ: {(PLACEHOLD; |Node1| > 1; PLACEHOLD1 = Node1 - EDGE2)}
(29) Graph → Graph’ 〈〈〉,〈〉〉 EDGE ∆: (Graph1 = (Graph’1 - EDGE1) - EDGE2)
(30) Graph → Graph’ 〈〈〉,〈〉〉 EDGE Node ∆: (Graph1 = Graph’1 - EDGE2); Γ: {(PLACEHOLD; |Node1| > 1; PLACEHOLD1 = Node1 - EDGE1)}
(31) Graph → Graph’ 〈〉 PLACEHOLD ∆: (Graph1 = PLACEHOLD1)
(32) Graph → Graph’ 〈〈〉,〈〉〉 EDGE Sync ∆: (Graph1 = Graph’1 - EDGE1)
(33) Graph → Graph’ 〈〈〉,〈〉〉 EDGE Sync ∆: (Graph1 = Graph’1 - EDGE2)
(34) Sync → NEW_FJ 〈〈〉,〈〉〉 EDGE Node ∆: (Sync1 = NEW_FJ1, Sync2 = NEW_FJ2 - EDGE1); Γ: {(PLACEHOLD; |Node1| > 1; PLACEHOLD1 = Node1 - EDGE2)}
(35) Sync → NEW_FJ 〈〈〉,〈〉〉 EDGE Node ∆: (Sync1 = NEW_FJ1 - EDGE2, Sync2 = NEW_FJ2); Γ: {(PLACEHOLD; |Node1| > 1; PLACEHOLD1 = Node1 - EDGE1)}
(36) Sync → Sync’ 〈〈〉,〈〉〉 EDGE Node ∆: (Sync1 = Sync’1, Sync2 = Sync’2 - EDGE1); Γ: {(PLACEHOLD; |Node1| > 1; PLACEHOLD1 = Node1 - EDGE2)}
(37) Sync → Sync’ 〈〈〉,〈〉〉 EDGE Node ∆: (Sync1 = Sync’1 - EDGE2, Sync2 = Sync’2); Γ: {(PLACEHOLD; |Node1| > 1; PLACEHOLD1 = Node1 - EDGE1)}
(38) Node → NEWSTATE ∆: (Node1 = NEWSTATE1)
(39) Node → PLACEHOLD ∆: (Node1 = PLACEHOLD1)

Figure 2.10: The first phase of the reduction process for the statechart diagram in Fig. 2.9.

Figs. 2.11(a-h) show the steps to reduce the visual sentential form of Fig. 2.10(i) through the productions shown above. The reduction process starts by applying production 27 to one node of each graph. This causes the terminals NEWSTATE to be reduced to the non-terminal Graph. Due to the ∆ rule of production 27, Graph receives all the connections of NEWSTATE. Similarly, the application of production 38 reduces the remaining NEWSTATE of Fig. 2.11(a) to the non-terminal Node. Fig. 2.11(b) shows the resulting visual sentential form, and highlights the handles for the application of productions 28 and 30. The vsymbols Graph, EDGE, and Node are then reduced to the new non-terminal Graph. Due to the ∆ rules of productions 28 and 30, the new Graph is connected to all the remaining edges attached to the old Graph. Moreover, due to the Γ rule, since |Node| = 2 > 1 for the graph on the left, a new node PLACEHOLD is inserted in the input, and it is connected to all the remaining edges attached to the old Node. Fig. 2.11(c) shows the resulting visual sentential form. After the application of productions 28 and 26, the visual sentential form reduces to the one shown in Fig. 2.11(d). Then, production 31 reduces the vsymbols Graph and PLACEHOLD to a new non-terminal Graph. By applying the ∆ rule of production 31, the new Graph receives all the connections of PLACEHOLD (see Fig. 2.11(e)). Moreover, production 25 reduces the non-terminals MultiGraph and Graph to a new non-terminal MultiGraph, and production 39 reduces PLACEHOLD to a new non-terminal Node. The subsequent application of productions 30, 25 and 1 reduces the original statechart diagram to the starting vsymbol in Fig. 2.11(h), confirming that the visual sentence associated with the initial statechart diagram belongs to the visual language L(SD).

Figure 2.11: The second phase of the reduction process of the statechart diagram in Fig. 2.9.

It is worth noting that statechart diagrams that are not well-formed are also in L(SD). The language L(SD) can be restricted to well-formed statechart diagrams by modifying and adding new productions to the grammar SD. As an example, in a well-formed statechart diagram final states cannot have any outgoing transitions. Such a property is captured by modifying the Γ rule in production 18 so that a new state, named NEWFS, is reintroduced in the input sentence. In particular, the production becomes:

(18’) State → FINAL ∆: (Statearea = Ø); Γ: {(NEWFS; |FINAL1| > 0; NEWFS1 = FINAL1)}

Consequently, new productions must be introduced for the second phase of the reduction process to manage the vsymbol NEWFS. The productions consider only incoming transitions for such a vsymbol, as shown in the following.

(40) Graph → Graph’ 〈〈1_1〉,〈1_2〉〉 EDGE 〈2_1〉 NEWFS ∆: (Graph1 = Graph’1 - EDGE1); Γ: {(NEWFS’; |NEWFS1| > 1; NEWFS’1 = NEWFS1 - EDGE2)}
(41) Sync → NEW_FJ 〈〈2_1〉,〈2_2〉〉 EDGE 〈2_1〉 NEWFS ∆: (Sync1 = NEW_FJ1, Sync2 = NEW_FJ2 - EDGE1); Γ: {(NEWFS’; |NEWFS1| > 1; NEWFS’1 = NEWFS1 - EDGE2)}
(42) Sync → Sync’ 〈〈2_1〉,〈2_2〉〉 EDGE 〈2_1〉 NEWFS ∆: (Sync1 = Sync’1, Sync2 = Sync’2 - EDGE1); Γ: {(NEWFS’; |NEWFS1| > 1; NEWFS’1 = NEWFS1 - EDGE2)}

Now we show how to modify the grammar in order to analyse the text of the transitions associated with the edges and the states. In particular, the string transitions can be modelled by using textual annotations, so the terminal vsymbols STATE and EDGE also have one textual annotation as a syntactic attribute, and the set of relations POS is extended by adding the relations annotated-by and right-to. The former denotes a relation that is satisfied between a vsymbol and a sentence, whereas the latter is defined as in example 2.2. The productions must take into account the new annotation relation, so the terminal vsymbol EDGE must be substituted by the non-terminal Edge, which has two attaching points as syntactic attributes. As an example, production 12 becomes:

(12’) Initial → INITIAL 〈1_1〉 Edge 〈2_1〉 STATE ∆: (Initialarea = STATEarea); Γ: {(NEWSTATE; |STATE1| > 1; NEWSTATE1 = STATE1 - Edge2)}

In the following we provide the subset of productions for describing the string transitions annotating the terminal vsymbol EDGE.

(43) Edge → EDGE Label ∆: (Edge1 = EDGE1; Edge2 = EDGE2);
(44) Edge → EDGE ∆: (Edge1 = EDGE1; Edge2 = EDGE2);
(45) Label → Event Condition Action ∆: (Labeltext = Eventtext + Conditiontext + Actiontext);
(46) Event → EVENT ∆: (Eventtext = EVENTtext, Eventhead = EVENThead, Eventtail = EVENTtail);
(47) Event → EVENT ( PARAM ) ∆: (Eventtext = EVENTtext + ‘(’ + PARAMtext + ‘)’, Eventhead = EVENThead, Eventtail = PARAMtail+1);
(48) Condition → [ COND ] ∆: (Conditiontext = ‘[’ + CONDtext + ‘]’, Conditionhead = CONDhead-1, Conditiontail = CONDtail+1);
(49) Action → / ACT ∆: (Actiontext = ‘/’ + ACTtext, Actionhead = ACThead-1, Actiontail = ACTtail);
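The string part of productions 45-49 can be illustrated with a small Python sketch. It is not part of the XPG formalism: the regular expression `LABEL_RE` and the function `label_text` are hypothetical names, the token shapes (an identifier for the event, bracket-free text for the guard) are assumptions, and only the text attribute is synthesized, not head and tail.

```python
import re
from typing import Optional

# Hypothetical pattern mirroring productions 45-49 (token shapes are assumed).
LABEL_RE = re.compile(
    r"^\s*(?P<event>\w+)\s*(?:\((?P<param>[^()]*)\))?"  # Event (productions 46, 47)
    r"\s*\[(?P<cond>[^\[\]]*)\]"                        # Condition (production 48)
    r"\s*/\s*(?P<act>.+?)\s*$"                          # Action (production 49)
)


def label_text(annotation: str) -> Optional[str]:
    """Return the text attribute of Label (production 45), or None on mismatch."""
    m = LABEL_RE.match(annotation)
    if m is None:
        return None
    event = m["event"]
    if m["param"] is not None:
        event += "(" + m["param"] + ")"          # Eventtext of production 47
    condition = "[" + m["cond"].strip() + "]"    # Conditiontext
    action = "/" + m["act"]                      # Actiontext
    return event + condition + action            # string concatenation
```

As in production 45, the sketch requires all three parts, so a bare event such as FinishedWork() is rejected.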

Productions 43 and 44 describe a transition as formed by the terminal vsymbol EDGE and an optional annotating string transition (Label). The general format for the transition strings is event-signature ’[’ guard-condition ’]’ ’/’ action-expression, where event-signature is event-name ’(’ comma-separated-parameter-list ’)’. Thus productions 45-49 describe a string transition as the string concatenation of event, condition and action. The non-terminal Label has a string as syntactic attribute, referred to as text; the non-terminals Event, Condition and Action, together with the terminals, have three syntactic attributes: a string referred to as text, and two positions in the plane, named head and tail. Notice that in Labeltext = Eventtext + Conditiontext + Actiontext the symbol + indicates string concatenation, and the rule is to be interpreted as follows: “the text of Label is obtained from the concatenation of the text of Event, the text of Condition and the text of Action”. In order to describe the internal transitions associated with the states, the terminal vsymbol STATE must be substituted by the non-terminal AnnState, which has one containment area as syntactic attribute. As an example, the production 27 becomes:

(27’) State → AnnState 〈contains〉 Hierarchy ∆: (Statearea = AnnStatearea); Γ: {(NEWSTATE; |AnnState1| > 0; NEWSTATE1 = AnnState1)}

The productions describing internal transitions annotated to the terminal vsymbol STATE are shown in the following:

(50) AnnState → STATE SeqLabel ∆: (AnnStatearea = STATEarea); Γ: {(NEWSTATE; |STATE1| > 0; NEWSTATE1 = STATE1)}
(51) AnnState → STATE ∆: (AnnStatearea = STATEarea); Γ: {(NEWSTATE; |STATE1| > 0; NEWSTATE1 = STATE1)}
(52) SeqLabel → Label SeqLabel’ ∆: (SeqLabeltext = Labeltext + SeqLabel’text, SeqLabelhead = Labelhead, SeqLabeltail = SeqLabel’tail);
(53) SeqLabel → Label ∆: (SeqLabeltext = Labeltext, SeqLabelhead = Labelhead, SeqLabeltail = Labeltail);

The non-terminal SeqLabel has three syntactic attributes: a string referred to as text, and two positions in the plane, named head and tail.

Chapter 3

The XpLR Methodology

The XpLR methodology is an extension of the pLR methodology as presented in [15]. It is a framework for implementing visual systems based upon XPGs and LR parsing. As in pLR parsing, an XpLR parser scans the input in a non-sequential way, driven by the relations used in the grammar. In particular, the XpLR methodology differs from the pLR one in that it handles the Γ rules, and provides algorithms to eliminate conflicts arising during the construction of a pLR parsing table.

3.1 The XpLR Parser

The components of an XpLR parser are shown in Fig. 3.1 and are detailed in the following.

3.1.1 The input

The input to the parser is a dictionary, named Dp, storing the attribute-based representation of a picture as produced by the visual editor. No parsing order is defined on the graphical objects in the dictionary. The parser retrieves the objects in the dictionary through a find operation, driven by the relations in the grammar. The parser implicitly builds and parses a linear representation from the input attribute-based representation. If the input picture contains explicit relations, i.e., relations that have a graphical representation, its attribute-based representation is augmented with an array COUNTER containing an entry for each explicit relation. The entry COUNTER(r) for an explicit relation labeled r with degree n contains the value n-1. This value indicates the number of binary relations describing r in any relative representation of the picture. During the parsing phase, all the visited tokens and the traversed explicit binary relations are marked in order to guarantee that each object and each explicit relation is considered at most once. The marking of an explicit binary relation REL labeled r is done by decreasing the entry COUNTER(r) by 1.

The 0-entry of the dictionary always refers to the end-of-input symbol EOI. Similarly to the usual end-of-string marker, the end-of-input symbol EOI is returned to the parser if and only if the input has been completely visited, i.e., all the input tokens have been parsed and all the explicit relations have been traversed. These conditions are signaled by having all the tokens marked and COUNTER(r) = 0 for each explicit relation r, respectively.

Figure 3.1: The architecture of an XpLR parser. (Components: the input, from which the XpLR parsing program (driver program) requests the next vsymbol; the stack s0 X1 s1 ... Xm sm; the XpLR parsing table with its action, goto and next sections; the output.)
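The bookkeeping described above can be sketched in Python as follows. The class `InputDictionary` and the helper `build_dictionary` are hypothetical names introduced only for illustration, and objects are reduced to their vsymbol names (no syntactic attributes); the sketch only shows how the visited marks and the COUNTER array decide when EOI may be returned.

```python
from dataclasses import dataclass, field


@dataclass
class InputDictionary:
    """Toy stand-in for Dp: row 0 is EOI, the other rows are graphical objects."""
    objects: dict                      # row index -> vsymbol name
    counter: dict                      # explicit relation label -> binary relations left
    visited: set = field(default_factory=set)

    def mark(self, row: int) -> None:
        """Mark a token as visited."""
        self.visited.add(row)

    def traverse(self, label: str) -> None:
        """Mark one binary relation of the explicit relation `label` as traversed."""
        if self.counter[label] == 0:
            raise ValueError(f"relation {label!r} already fully traversed")
        self.counter[label] -= 1

    def end_of_input(self):
        """Return row 0 (EOI) only if every token and explicit relation was used."""
        tokens_done = all(row in self.visited for row in self.objects if row != 0)
        relations_done = all(left == 0 for left in self.counter.values())
        return 0 if tokens_done and relations_done else None


def build_dictionary(names, degrees) -> InputDictionary:
    """An explicit relation of degree n starts with COUNTER = n - 1."""
    objects = {0: "EOI"}
    objects.update({i: name for i, name in enumerate(names, start=1)})
    return InputDictionary(objects, {label: n - 1 for label, n in degrees.items()})
```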

3.1.2 The stack

An instance of the stack has the general format s0 X1 s1 X2 s2 . . . Xm sm , where sm is the stack top; Xi is a grammar vsymbol, and si is a generic state of the parsing table. The parsing algorithm uses the state on the top of the stack, and the vsymbol currently under examination, to access a specific entry of the parsing table in order to decide the next action to execute.

3.1.3 The XpLR parsing table

An XpLR parsing table (see Fig. 3.2) is composed of a set of rows and is divided into three main sections: action, goto, and next. The action and goto sections are similar to the ones used in LR parsing tables for string languages [1], whereas the next section is used by the parser to select the next vsymbol to be processed. An entry next[k] for a state sk contains a pair (Rdriver, x), which drives the parser in selecting the next vsymbol y that is reachable from x, by using the sequence of driver relations Rdriver. Two special pairs in the column next are (start, S) and (end, EOI), where S is the starting vsymbol and EOI is the end-of-input marker. The first is used at the beginning of the parsing process to retrieve the first vsymbol to be parsed. This vsymbol depends on the nature of the language. The latter is used to check whether the whole input sentence has been parsed. If all the vsymbols have been analyzed and all the explicit relations have been considered, then the query returns the EOI marker.

St. | Action                                      | Goto      | Next
    | a     b         d     e     f     EOI       | A     B   |
0   | :sh2                                        | :1        | (start, A)
1   |                                   acc       |           | (end, EOI)
2   |                       :sh6                  |       :3  | (1_1, B)
3   |       2_2:sh4                               |           | (1_1^1, b)
4   |                 :sh5                        |           | (1_1^1, d)
5   | r1    r1        r1    r1    r1    r1        |           | -
6   |                             :sh7            |           | (2_2, f)
7   | r2    r2        r2    r2    r2    r2        |           | -

Figure 3.2: An XpLR(0) parsing table.

An action entry has one of the following four values:
1. “Rtester : shift s”, where Rtester is a possibly empty sequence of tester relations and s is a state. As an example, the entry (3, b) in Fig. 3.2 contains 2_2: sh4;
2. reduce by a grammar production (i) A → β, shown in the table as ri;
3. accept;
4. error, shown as an empty entry.

A goto entry contains “Rtester : s”, where Rtester is a possibly empty sequence of tester relations and s is a state. A shift or goto action is executed only if all the relations in the corresponding Rtester are true, or if Rtester is empty. As an example, let us consider the XpLR(0) parsing table in Fig. 3.2. If the current state corresponds to row 3, and the vsymbol currently scanned is b, then the parser executes the action (2_2: sh4), that is, if the relation 2_2 holds between b and the vsymbol on the stack top, then the parser shifts b and goes to state 4. Once in state 4, the parser launches a query on the input, based on the entry next[4] = (1_1^1, d). The query will search the input for a terminal vsymbol d, such that d is in relation 1_1 with the first vsymbol below the stack top.

3.1.4 The XpLR parser

In order to illustrate the XpLR parsing program we define the two functions Fetch_Vsymbol and Test. The former uses the stack and the input as global data structures and takes its arguments from the column NEXT of the parsing table. The latter is used to validate the tester relations between vsymbols. It takes as input an action condition from the action or goto part of the parsing table and returns a boolean value.

Function Fetch_Vsymbol(NEXT)
begin
  case NEXT of
    NEXT = (start, S):
      return the row index in Dp of the first object to parse
    NEXT = (end, EOI):
      if all the objects have been marked as visited and COUNTER(r) = 0 for each explicit relation r
        then return the row index 0 in Dp pointing to the end-of-input symbol EOI
        else return null;
    NEXT = (Rdriver, x), where Rdriver = 〈REL1^h1, ..., RELn^hn〉 and each RELi^hi acts on a syntactic attribute ki of x:
      let zi be the hi-th object below the stack top
      for i = 1 to n
        let next_seti = { b | b is in Dp, it is not marked as visited, it has an attribute j such that (b, j) is reachable from (x, ki), zi RELi b holds, and the relation RELi acts on a syntactic attribute of zi and the syntactic attribute j of b, respectively }
      if ∩i=1...n next_seti contains exactly one object b
        then
          for each explicit relation RELi in Rdriver do
            decrease by 1 the entry in the array COUNTER corresponding to the explicit relation zi RELi b
          mark the corresponding entry in Dp as visited
          return the row index of b in Dp
        else if ∩i=1...n next_seti contains more than one object b
          then emit “run-time conflict” and exit
          else return null;
    NEXT = null:
      return null;
  endcase
end

Let us describe how the function works on the table in Fig. 3.2. In particular, the relations Rdriver in the NEXT column are as follows:
• the special relation start: in this case Fetch_Vsymbol returns the index in Dp of an instance of the vsymbol a in the visual sentence (in this simple example a plays the role of the starting symbol);
• the special relation end: in this case Fetch_Vsymbol returns the index in Dp of the EOI vsymbol only if all the vsymbols and all the explicit relations of the visual sentence have been visited;
• a relation h_k: this relation must hold between the vsymbol z on the stack top and exactly one non-visited vsymbol b in Dp. In particular, when NEXT = (h_k, x):
  1. if x is a terminal vsymbol, Fetch_Vsymbol returns the index in Dp of a non-visited vsymbol whose name is x and whose k-th syntactic attribute is linked to the h-th syntactic attribute of z.
  2. if x is a non-terminal vsymbol, Fetch_Vsymbol returns the index in Dp of a non-visited terminal vsymbol b whose j-th syntactic attribute is linked to the h-th syntactic attribute of z. The couples (x, k) and (b, j) are such that b is a terminal that begins a positional sentence derived from x and the k-th syntactic attribute of x is synthesized from j by successively applying the ∆ rules in the derivation.

If no object is found, then Fetch_Vsymbol returns null. On the other hand, if more than one vsymbol is found, then the parser cannot proceed because it cannot decide deterministically which token to analyze. As a consequence, the function issues a run-time conflict message and stops the execution of the parser. The occurrence of this type of conflict, named run-time conflict, might prevent the recognition of syntactically correct input visual sentences. In section 3.4 we analyze the run-time conflicts, and give some heuristics to solve this problem. The function Test shown below verifies that the grammar vsymbol to be pushed on the stack top is properly related to a grammar vsymbol already in the stack.

Function Test(COND)
  let COND = (REL^i, x), where x is either a terminal or a non-terminal
  let z be the i-th object below the stack top
  if z REL x holds then
    begin
      if REL is an explicit relation then
        decrease by 1 the entry in the array COUNTER corresponding to the explicit relation z REL x
      return true
    end
  else return false
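A simplified Python rendering of the two functions may help to see where run-time conflicts come from. It is a sketch under strong assumptions, not the implementation used in the thesis: the stack holds only row indices of vsymbols (states are omitted), x is assumed to be a terminal, relations are given as an explicit set of triples (relation, row of z, row of b) instead of being computed from syntactic attributes, and the COUNTER update is left out. All names (`fetch_vsymbol`, `check_tester`, `RunTimeConflict`, `below_top`) are hypothetical.

```python
class RunTimeConflict(Exception):
    """Raised when more than one unvisited object satisfies the driver relations."""


def below_top(stack, h):
    # The stack holds only vsymbol rows here; h = 0 denotes the stack top.
    return stack[-1 - h]


def fetch_vsymbol(name, driver, stack, objects, links, visited):
    """driver: list of (relation, h); links: set of (relation, row_z, row_b)."""
    next_sets = [
        {b for b, n in objects.items()
         if n == name and b not in visited
         and (rel, below_top(stack, h), b) in links}
        for rel, h in driver
    ]
    # The candidates are the objects satisfying every driver relation.
    found = set.intersection(*next_sets) if next_sets else set()
    if len(found) > 1:
        raise RunTimeConflict(sorted(found))
    if not found:
        return None
    (b,) = found
    visited.add(b)          # mark the entry as visited
    return b


def check_tester(relation, h, x, stack, links):
    """Counterpart of Test: does `relation` hold between the h-th object and x?"""
    return (relation, below_top(stack, h), x) in links
```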

In the following, we give the complete XpLR(0) parsing algorithm.

Algorithm 3.1 The XpLR(0) parsing algorithm.
Input: A visual sentence in attribute-based representation and an XpLR(0) parsing table.
Output: A bottom-up analysis of the visual sentence if this is syntactically correct, an error message otherwise.
Method: Start with the state s0 on the top of the stack.

repeat forever
  let s be the state on the stack top
  set ip = Fetch_Vsymbol(next[s])
  if ip is not null then   // there is only one next vsymbol
    let b be the grammar vsymbol pointed to by ip
    if action[s, b] = “accept” then “success” and exit;
    if action[s, b] is a conditioned shift of type “Rt : shift s’” then
      if Rt is empty or Test(RELh, b) is true for each RELh ∈ Rt
        then push b and then s’ on the stack;
        else emit “syntax error” and exit;
    else emit “syntax error” and exit;
  else if next[s] is empty then   // the state s is a reduce state
    let b be a terminal vsymbol
    if action[s, b] = reduce A → x1 R1 x2 R2 . . . Rm−1 xm, ∆, Γ then
      compute the syntactic attributes of the vsymbol A according to the synthesis rule ∆, apply rule Γ, if present, and pop 2*m symbols from the stack
      let s’ be the new state on the stack top
      if goto[s’, A] is a conditioned goto of type “Rt : s’’” then
        if Rt is empty or Test(RELh, A) is true for each RELh ∈ Rt
          then push A and then s’’ on the stack and output the production A → x1 R1 x2 R2 . . . Rm−1 xm, ∆, Γ
          else emit “syntax error” and exit;
      else emit “syntax error” and exit;
    else emit “syntax error” and exit;
  else emit “syntax error” and exit;
endrepeat

At each step, the XpLR parsing program checks the entry next[s] of the parsing table corresponding to the state s on the top of the stack. If Fetch_Vsymbol(next[s]) is not null, then the resulting pointer ip points to the next terminal b to be processed. In this case, either the input picture is accepted (action[s, b] = “accept”) with b being the end-of-input marker EOI, or b is shifted on the stack top (action[s, b] = “R: shift s’”). Whenever a shift action is required and the action condition R is not empty, then each relation RELi^hi in R is tested between the hi-th object below the stack top and the object b to be shifted. Otherwise, if next[s] is empty then a reduce action is required. In this case, the pointer ip is not updated and it points to the last terminal b shifted on the stack top. The reduction action[s, b] = “reduce A → x1 R1 x2 . . . Rm−1 xm, ∆, Γ” is accomplished by calculating the syntactic attributes of A as specified by ∆, applying the Γ rule, popping 2*m elements out of the stack, and pushing A on the stack top. If s’ is the state on the stack top after popping the 2*m elements, then the next state s’’ of the parser is given by the entry goto[s’, A]. Also in this case, the goto action may be triggered by an action condition to be verified between objects below the stack top and the object A.
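The control flow of Algorithm 3.1 can be tried out on the table of Fig. 3.2 with the Python sketch below. Several simplifications are assumptions made only for this illustration: relations are an explicit set of triples rather than being derived from syntactic attributes; the ∆ rules are replaced by the convention that a non-terminal is related to an object whenever one of the terminals it covers is; next[2] = (1_1, B) is resolved by hand to the terminal e that begins B; Γ rules and the COUNTER array are omitted; and a run-time conflict is reported like a syntax error. The names `parse`, `holds` and the table encodings are hypothetical.

```python
# Fig. 3.2 table; in (relation, h), h is the depth below the stack top (0 = top).
PRODUCTIONS = {1: ("A", 4), 2: ("B", 2)}          # number -> (lhs, |rhs|)
REDUCE = {5: 1, 7: 2}                             # reduce states
ACTION = {(0, "a"): ((), 2), (2, "e"): ((), 6), (3, "b"): ((("2_2", 0),), 4),
          (4, "d"): ((), 5), (6, "f"): ((), 7)}   # (state, terminal) -> (testers, state)
GOTO = {(0, "A"): 1, (2, "B"): 3}
# next[2] = (1_1, B) is pre-resolved to the terminal e that begins B (assumption).
NEXT = {0: "start", 1: "end", 2: ((("1_1", 0),), "e"), 3: ((("1_1", 1),), "b"),
        4: ((("1_1", 1),), "d"), 6: ((("2_2", 0),), "f")}


def parse(objects, links):
    """objects: {row: terminal}; links: {(relation, row_z, row_b)}.
    Returns the list of applied productions, or None on error."""
    stack = [0]      # s0 X1 s1 ... Xm sm; each Xi is (symbol, covered rows)
    visited, output = set(), []

    def holds(rel, h, b):
        # Stand-in for the Delta rules: a non-terminal is related to b
        # whenever one of the terminals it covers is.
        return any((rel, z, b) in links for z in stack[-2 - 2 * h][1])

    while True:
        s = stack[-1]
        nxt = NEXT.get(s)
        if nxt is None:                               # reduce state
            lhs, m = PRODUCTIONS[REDUCE[s]]
            covered = frozenset().union(*(stack[-2 - 2 * i][1] for i in range(m)))
            del stack[-2 * m:]                        # pop 2*m symbols
            target = GOTO.get((stack[-1], lhs))
            if target is None:
                return None
            stack += [(lhs, covered), target]
            output.append(REDUCE[s])
            continue
        if nxt == "end":                              # accept only if all visited
            return output if len(visited) == len(objects) else None
        if nxt == "start":
            cands = sorted(r for r, n in objects.items() if n == "a")[:1]
        else:
            driver, name = nxt
            cands = [r for r, n in objects.items()
                     if n == name and r not in visited
                     and all(holds(rel, h, r) for rel, h in driver)]
        if len(cands) != 1:
            return None                               # syntax error or run-time conflict
        b = cands[0]
        testers, target = ACTION[(s, objects[b])]
        if not all(holds(rel, h, b) for rel, h in testers):
            return None                               # tester relation failed
        visited.add(b)
        stack += [(objects[b], frozenset([b])), target]
```

With five objects a, e, f, b, d linked as the table expects, the sketch reduces by production 2 and then by production 1; removing the link checked by the tester 2_2 in state 3 makes the shift of b fail.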

3.1.5 Parsing time complexity

In this subsection we analyze the time complexity of the XpLR parsing algorithm. As described in Chapter 2, an XpLR parser may not converge in the analysis of a visual sentence, since the parser may get into a loop while reducing productions where the number of vsymbols introduced with Γ is greater than or equal to the number of vsymbols popped from the stack minus one. Thus, we restrict the analysis to the class of convergent XpLR parsers. In particular, we analyze the time complexity of parsing a visual sentence containing n vsymbols, with nt vsymbols inserted during the parsing. The worst case complexity is achieved for correct input pictures, when all the input vsymbols are visited. Given an extended positional grammar XPG, let
- na be the maximum number of attributes of a vsymbol,
- no be the maximum number of vsymbols in the right-hand side of a production,
- nr be the maximum number of relations in a tester, and
- t be the maximum number of triples in the Γ rules.

At each step, the parser performs a shift or a reduce action. Therefore, the total number of shifts will be n + nt, while the number of reductions will be O(n + nt). The parsing algorithm performs a shift action whenever next[s] is defined and a reduce action otherwise. Let us compute separately the time complexity for shift and reduce actions.

To perform a shift action the parsing program must first access the input and then test the action condition, if any. Let tq be the time required to perform the function Fetch_Vsymbol (on next[s]). Moreover, if an action condition is to be tested, the conditioned shift depends on the number nr of relations in a tester and on the time tr to test each relation. As the push operation on the stack takes time na, the total time complexity to perform a shift action is O(tq + nr ∗ tr + na).

To reduce a production, the parser has to perform the following steps:
(i) calculate the syntactic attributes of the left-hand side non-terminal;
(ii) apply the Γ rule;
(iii) pop the records corresponding to the right-hand side vsymbols from the stack;
(iv) test for conditioned gotos;
(v) push the record corresponding to the left-hand side non-terminal onto the stack.

The cost of step (i) depends on the particular function used to synthesize each syntactic attribute. Let O(t∆(no)) be the time required to perform this task; then the time complexity for step (i) will be O(na ∗ t∆(no)). The cost of step (ii) depends on the time c to compute the conditions and the time na required to insert new terminal vsymbols in the sentence. The total time is O(t(c + na ∗ t∆(no))). As the stack pop operation takes time na, step (iii) will cost O(no ∗ na). Similarly to a conditioned shift, a conditioned goto has time complexity O(nr ∗ tr + na). The final push operation (step (v)) takes time na. Therefore, the total time complexity for a reduce action is O(na(t∆(no) + t(c + na) + no) + nr ∗ tr).

Then, the time complexity of the parser is O((n + nt)(na(t∆(no) + t ∗ c + no) + nr ∗ tr + tq)). For a fixed grammar, na, nr, no, c and t are constants and the time complexity reduces to O((n + nt)(tq + tr + t∆)).

The parameters tq, tr, and t∆ depend on the particular class of visual languages. For example, for the graph languages the access time tq may vary from a constant to O(n), depending on the chosen implementation of the input dictionary, while the test time tr is constant. Finally, synthesizing the syntactic attributes of a vsymbol requires t∆ = O(n) time. Thus, for a fixed grammar modelling a graph language the time complexity is O((n + nt)(n ∗ tq)). By using proper hashing techniques to implement the dictionary Dp, the expected time complexity reduces to O(n(n + nt)).

3.2 Constructing XpLR(0) Parsing Tables

In this section we present the algorithms for the construction of an XpLR(0) parsing table. Let us start by providing the notion of item. An XpLR(0) item of an extended positional grammar is a production without the ∆ and Γ rules, and with a dot at some position of the right-hand side. However, a dot can never be placed between a relation identifier and the terminal or non-terminal vsymbol to its right. As an example, the production A → X R1 Y R2 Z, ∆, Γ leads to the following four XpLR(0) items:

[A → · X R1 Y R2 Z]
[A → X · R1 Y R2 Z]
[A → X R1 Y · R2 Z]
[A → X R1 Y R2 Z ·]

Intuitively, an item indicates how much of a production has already been examined during the parsing process and what is yet to come. For instance, the item [Graph → Graph’ · 〈〈1_1〉,〈1_2〉〉 EDGE] from example 2.3 means that the non-terminal Graph’ has already been seen and a terminal EDGE in relation 〈〈1_1〉,〈1_2〉〉 with Graph’ is expected next. A collection of sets of XpLR(0) items provides the basis for constructing XpLR(0) parsers. To construct such a collection for a grammar, we define an augmented grammar and two functions, closure and goto. Given an extended positional grammar G with start vsymbol S, its augmented extended positional grammar G’ is derived from G by adding the new start vsymbol S’ and the production S’ → S.

The Closure Operation

If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I by the two rules:
1. Initially, every item in I is added to closure(I).
2. If A → α · R Bβ with α ≠ ε, or A → · Bβ, is in closure(I) and B → γ is a production, then add the item B → · γ to closure(I), if it is not already there. We apply this rule until no more new items can be added to closure(I).

Intuitively, given a set of items I containing an item with a dot before a non-terminal B, the function CLOSURE adds to I all the items with B in the left-hand side and the dot preceding the first object of the right-hand side. This means that if the non-terminal object B is expected next, then any object starting a positional sentential form derived from B is expected next. The function closure can be computed as follows.

Function CLOSURE(I)
begin
  J = I;
  repeat
    for each item [A → α · R Bβ] with α ≠ ε or [A → · Bβ] in J and each production B → γ in G’ such that B → · γ is not in J do
      add [B → · γ] to J
  until no more items can be added to J
  return J;
end.
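As an illustration, the closure operation can be written in a few lines of Python. The encoding is an assumption of this sketch: the grammar is the one whose item sets are built in Example 3.1 below, with S’ → A added; each right-hand-side element is a triple (driver, tester, symbol); and an item is a pair (production index, dot position).

```python
# Each right-hand-side element is (driver, tester, symbol); the first has no relation.
GRAMMAR = [
    ("S'", ((None, (), "A"),)),
    ("A", ((None, (), "a"), ("1_1", (), "B"),
           ("1_1^1", ("2_2",), "b"), ("1_1^1", (), "d"))),
    ("B", ((None, (), "e"), ("2_2", (), "f"))),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add [B -> . gamma] for every non-terminal B that follows a dot."""
    result, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        rhs = GRAMMAR[p][1]
        if dot < len(rhs) and rhs[dot][2] in NONTERMINALS:
            for q, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot][2] and (q, 0) not in result:
                    result.add((q, 0))
                    work.append((q, 0))
    return frozenset(result)
```

For instance, the closure of the single item [A → a · 1_1 B ...] also contains [B → · e 2_2 f], because B is expected next.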

The Goto Operation

The second useful function is goto(I, x, R), where I is a set of items, x is a grammar vsymbol and R is a sequence of tester relations. goto(I, x, Rtester) is defined to be the closure of the set of all items [A → α 〈Rdriver, Rtester〉 x · β] such that [A → α · 〈Rdriver, Rtester〉 x β] is in I. Intuitively, once a vsymbol x has been seen, the function GOTO determines the ordered sequence of sets of items containing the vsymbols that can be seen next.

Function GOTO(I, x, Rtester)
begin
  if Rtester = ∅ then
    let J = { [A → α Rdriver x · β] | α ≠ ε and [A → α · Rdriver x β] ∈ I } ∪ { [A → x · β] | [A → · x β] ∈ I }
  else
    let J = { [A → α 〈Rdriver, Rtester〉 x · β] | α ≠ ε and [A → α · 〈Rdriver, Rtester〉 x β] ∈ I }
  return CLOSURE(J)
end

The Set-of-Items Construction

We are now ready to give the algorithm to construct C, the collection of sequences of XpLR(0) item sets for an augmented grammar G’; the algorithm is shown in the following.

Algorithm 3.2 Construction of the sets of XpLR(0) items.
Input: An augmented extended positional grammar G’.
Output: The collection of XpLR(0) item sets.
Method: Item sets are constructed by the main procedure ITEMS, which in turn calls the two functions CLOSURE and GOTO.

Procedure ITEMS(G’)
begin
  let C = { CLOSURE({[S’ → · S]}) }
  repeat
    for each set of items I in C, each vsymbol x such that there exists [A → α · 〈Rdriver〉 x β] ∈ I or [A → · x β] ∈ I and GOTO(I, x, ∅) is not included in C do
      C = C ∪ GOTO(I, x, ∅)
    for each set of items I in C, each vsymbol x and each sequence of tester relations Rtester ≠ ∅ such that [A → α · 〈Rdriver, Rtester〉 x β] ∈ I and GOTO(I, x, Rtester) is not included in C do
      C = C ∪ GOTO(I, x, Rtester)
  until no more sets of items can be added to C
end
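A compact Python sketch of GOTO and ITEMS is given below, again for the grammar with productions A → a 1_1 B 〈1_1^1, 2_2〉 b 1_1^1 d and B → e 2_2 f. The encoding (triples (driver, tester, symbol), items as (production index, dot position)) and all function names are assumptions made for illustration; the two loops of ITEMS are merged into one by treating the empty tester as an ordinary value.

```python
# Each right-hand-side element is (driver, tester, symbol); the first has no relation.
GRAMMAR = [
    ("S'", ((None, (), "A"),)),
    ("A", ((None, (), "a"), ("1_1", (), "B"),
           ("1_1^1", ("2_2",), "b"), ("1_1^1", (), "d"))),
    ("B", ((None, (), "e"), ("2_2", (), "f"))),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    result, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        rhs = GRAMMAR[p][1]
        if dot < len(rhs) and rhs[dot][2] in NONTERMINALS:
            for q, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot][2] and (q, 0) not in result:
                    result.add((q, 0))
                    work.append((q, 0))
    return frozenset(result)


def goto(item_set, x, tester):
    """Move the dot over x in every item where x is expected with this tester."""
    moved = {(p, dot + 1) for p, dot in item_set
             if dot < len(GRAMMAR[p][1]) and GRAMMAR[p][1][dot][1:] == (tester, x)}
    return closure(moved)


def items():
    """Collect all item sets reachable from CLOSURE({[S' -> . S]})."""
    collection = [closure({(0, 0)})]
    for item_set in collection:            # the list grows while it is scanned
        for p, dot in sorted(item_set):
            rhs = GRAMMAR[p][1]
            if dot < len(rhs):
                _, tester, x = rhs[dot]
                target = goto(item_set, x, tester)
                if target not in collection:
                    collection.append(target)
    return collection
```

The sketch yields eight item sets for this grammar; their order of discovery need not coincide with the numbering I0-I7 used in the text.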

Chapter 3. The XpLR Methodology

47

The collection of XpLR(0) item sets of an augmented extended positional grammar G’ are incrementally constructed by the main procedure ITEMS, starting from the initial set containing the item [S’ → · S]. Similarly to the LR case, the sets of XpLR(0) items correspond to the states of a finite automaton for viable prefixes [1] where the transitions are determined by the function GOTO. In the following we give an example of construction of the sets of XpLR(0) item sets. Example 3.1 The collection of XpLR(0) item sets for the grammar of example 2.1 is described in the following. The notation (goto j) to the right hand side of an item K=[A → α · Rdriver , Rtester  x β] indicates the item sets Ij returned by GOTO(K, x, Rdriver ). I0 = { S’ → · A A → · a 1 1 B 11 1,2 2 b 11 1 d

(goto 1) (goto 2)}

I1 = { S’ → A · } I2 = { A → a · 1 1 B 11 1,2 2 b 11 1 d B → · e 2 2 f

(goto 3) (goto 6)}

I3 = { A → a 1 1 B · 11 1,2 2 b 11 1 d

(goto 4)}

I4 = { A → a 1 1 B 11 1,2 2 b · 11 1 d

(goto 5)}

I5 = { A → a 1 1 B 11 1,2 2 b 11 1 d ·} I6 = { B → e · 2 2 f

(goto 7)}

I7 = { B → e 2 2 f ·} The XpLR parsing table Construction Now we shall show how to construct an XpLR parsing table from the collection of item sets. Algorithm 3.3 Constructing an XpLR(0) parsing table. Input: An augmented extended positional grammar G’. Output: The XpLR(0) parsing table for G’. Method: 1. Construct C = {I0 , I1 ,. . ., Im }, the collection of sets of XpLR(0) items as described in Algorithm 3.2. 2. State i of the parser is constructed from the set of items Ii . The entries for state i of the parsing table action and next parts are determined as follows:


SHIFT ENTRIES
• If [A → α · Rdriver a β] or [A → · a β] is in Ii and GOTO(Ii, a, ∅) = Ij, then set action[i, a] = “T: shift j” (a is required to be a terminal), where T stands for a condition that always returns true.
• If [A → α · Rdriver, Rtester a β] is in Ii and GOTO(Ii, a, Rtester) = Ij, then set action[i, a] = “Rtester: shift j” (a is required to be a terminal).

REDUCE ENTRIES
• If [A → α ·] is in Ii, then set action[i, a] = “reduce A → α” for each terminal a.

NEXT and ACCEPT ENTRIES
• Whenever [A → α · Rdriver, Rtester x β] is in Ii, insert (Rdriver, x) in next[i].
• If [S’ → · S] is in Ii, then insert (start, S) in next[i]. If [S’ → S ·] is in Ii, then insert (end, EOI) in next[i] and “accept” in action[i, EOI].

3. The entries for state i and the non-terminals X of the goto part are determined as follows:
• If [A → α · Rdriver, Rtester X β] is in Ii and GOTO(Ii, X, Rtester) = Ij, then insert “Rtester: j” in goto[i, X].
• If [A → α · Rdriver X β] or [A → · X β] is in Ii and GOTO(Ii, X, ∅) = Ij, then insert “T: j” in goto[i, X].

The action and goto parts of the XpLR parsing table are constructed as in the LR parsing tables. The action conditions and the entries in the column next are constructed as follows:
• a shift or goto action in state i has a sequence of tester relations Rtester as an action condition if and only if the set of items Ii corresponding to state i contains an item with a dot preceding a sequence Rtester;
• the entry next[i] contains the pair (Rdriver, x) if and only if the set of items Ii corresponding to state i contains an item with a dot preceding a sequence Rdriver, Rtester and the vsymbol x.

3.3 XpLR parsing table conflicts

A conflict in an XpLR parsing table arises when multiple actions are contained in a single entry of the action, goto or positional parts. An XpLR parsing table may present shift/shift, goto/goto and positional conflicts, besides the classical shift/reduce and reduce/reduce conflicts. A shift/shift conflict occurs whenever multiple shift actions are present in a single entry of the action part. Analogously, a goto/goto conflict occurs whenever multiple goto actions appear in a single entry of the goto part. Shift/shift or goto/goto conflicts are generated whenever a set of XpLR(0) items contains two or more items with the dot preceding the same grammar object with the same sequence of driver relations, but with different tester relations. As in the LR methodology, a shift/reduce (reduce/reduce, resp.) conflict occurs whenever a single entry of the action part contains both shift and reduce (multiple reduce, resp.) actions. A positional conflict occurs whenever multiple values (REL, x) are present in a single entry of the next column. This conflict is generated whenever a set of XpLR(0) items contains two or more items with the dot preceding pairs with different driver relations, or with the same driver relation but different grammar objects. An extended positional grammar for which it is possible to construct an XpLR parsing table without conflicts is said to be an XpLR grammar. As an example, the grammar of example 2.1 is an XpLR grammar, as shown by its parsing table in Fig. 3.2. As for LR parsing, every ambiguous grammar G fails to be XpLR. Indeed, if G is visually ambiguous then the corresponding parsing table has a positional conflict, whereas if G is structurally ambiguous then the parsing table may present any conflict.

3.3.1 Handling parsing table conflicts

Ambiguities in non-XpLR grammars are handled by exploiting heuristics. In particular, positional conflicts are solved by partitioning the conflicting state into a sequence of substates on the basis of the driver relations, and ordering the values (RELh, x) in the same entry of the column next. In this case, the XpLR parsing table has a slightly different structure with respect to the pLR parsing table.

As an example, Fig. 3.3 shows the parsing table of the ambiguous grammar STD of example 2.3. Note that state 4 is partitioned in four ordered substates. Thus when the parser is in state 4, it has recognized a non-terminal Graph and proceeds with the parsing of the visual sentence by looking for:
1. an outgoing edge or a self-edge of Graph (state 4.1), as shown in Figs. 3.4 (a)-(b); or
2. an incoming edge of Graph (state 4.2), as shown in Fig. 3.4 (c); or
3. a reintroduced vsymbol PLACEHOLD (state 4.3), as shown in Fig. 3.4 (d).
If none of these vsymbols is found then the parser executes a reduce operation (state 4.4).

      Action                                                                  Goto
St.   NODEI  NODEIF  NODEF  NODEG  EDGE                  PLACEHOLD  EOI   StateTD  Graph  Node   NEXT
0     :sh2   :sh3    -      -      -                     -          -     :1       :4     -      (start, StateTD)
1     -      -       -      -      -                     -          acc   -        -      -      (end, EOI)
2     r2     r2      r2     r2     r2                    r2         r2    -        -      -      -
3     r3     r3      r3     r3     r3                    r3         r3    -        -      -      -
4.1   -      -       -      -      ¬1_2: sh5; 1_2: sh6   -          -     -        -      -      (1_1, EDGE)
4.2   -      -       -      -      ¬1_1: sh7             -          -     -        -      -      (1_2, EDGE)
4.3   -      -       -      -      -                     :sh8       -     -        -      -      (any, PLACEHOLD)
4.4   r1     r1      r1     r1     r1                    r1         r1    -        -      -      -
5     -      -       :sh11  :sh10  -                     :sh12      -     -        -      :9     (2_1, Node)
6     r5     r5      r5     r5     r5                    r5         r5    -        -      -      -
7     -      -       :sh11  :sh10  -                     :sh12      -     -        -      :13    (1_1, Node)
8     r7     r7      r7     r7     r7                    r7         r7    -        -      -      -
9     r4     r4      r4     r4     r4                    r4         r4    -        -      -      -
10    r8     r8      r8     r8     r8                    r8         r8    -        -      -      -
11    r9     r9      r9     r9     r9                    r9         r9    -        -      -      -
12    r10    r10     r10    r10    r10                   r10        r10   -        -      -      -
13    r6     r6      r6     r6     r6                    r6         r6    -        -      -      -

Figure 3.3: An XpLR(0) parsing table with ordered substates.

The order of the substates in a state depends on the syntax of the language to be parsed. In general, the language implementer may need to modify the order of the substates accordingly. It is easy to show that partitioning a state into a sequence of ordered substates allows us to avoid all the conflicts caused by the introduction of Γ rules in the XpLR grammars, and also some of the conflicts that could occur when using the XpLR parsing table construction algorithm, such as shift/reduce and reduce/reduce conflicts.
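The way a parser might walk through the ordered substates of a partitioned state can be sketched as follows. This is a minimal illustration and not the thesis' implementation: the function name `step_in_partitioned_state`, the list encoding of the substates and the `fetch` callback (standing in for the lookup of the next vsymbol) are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

# A substate is (name, NEXT pair); a None pair marks the reduce substate.
Substate = Tuple[str, Optional[Tuple[str, str]]]

# Hypothetical encoding of state 4 of the table with ordered substates.
STATE_4: List[Substate] = [
    ("4.1", ("1_1", "EDGE")),
    ("4.2", ("1_2", "EDGE")),
    ("4.3", ("any", "PLACEHOLD")),
    ("4.4", None),
]

def step_in_partitioned_state(
    substates: List[Substate],
    fetch: Callable[[str, str], Optional[str]],
) -> Tuple[Optional[str], Optional[str]]:
    """Return the first substate, in order, whose NEXT pair yields a vsymbol."""
    for name, next_pair in substates:
        if next_pair is None:
            return name, None            # reduce substate: nothing to fetch
        found = fetch(*next_pair)
        if found is not None:
            return name, found
    return None, None
```

With a `fetch` that only finds an incoming edge, the walk skips substate 4.1 and stops at 4.2; when nothing is found, it falls through to the reduce substate 4.4.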

[Figure 3.4: panels (a) and (b) illustrate state 4.1, panel (c) state 4.2, and panel (d) state 4.3; each panel shows the non-terminal Graph, with edges labelled by the relations 1_1 and 1_2.]

Figure 3.4: A graphical representation of state 4.

The remaining shift/reduce and reduce/reduce conflicts are solved by using disambiguating rules such as those used by tools like YACC [39]. In particular, a shift/reduce conflict is resolved in favor of shift, and a reduce/reduce conflict is resolved by choosing the conflicting production listed first in the grammar specification. Finally, shift/shift and goto/goto conflicts are solved by ordering the conditioned actions present in the same entry. The parser tests the action conditions sequentially and executes the first action whose condition is verified. Similarly to YACC, the order of multiple values in the same entry of the parsing table depends on the order of the items in the same set. It is easy to reproduce the reduction process in Fig. 2.4 by applying Algorithm 3.1, modified with the previous heuristics, on the XpLR(0) parsing table in Fig. 3.3. Let us observe that the parsing time complexity on the grammar STD is O(n^2), where n is the number of vsymbols in the sentence, since the number of vsymbols introduced during the parsing is limited by the number of edges in the sentence.
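The sequential testing of conditioned actions described above can be sketched in a few lines of Python. The encoding is an assumption made for illustration only: a condition is a pair (relation, expected truth value), `None` plays the role of the always-true condition T, and `holding` is a hypothetical set of tester relations that hold for the vsymbol just fetched.

```python
from typing import List, Optional, Set, Tuple

Condition = Optional[Tuple[str, bool]]     # (relation, expected truth value)
Entry = List[Tuple[Condition, str]]        # ordered conditioned actions

def select_action(entry: Entry, holding: Set[str]) -> Optional[str]:
    """Return the first action whose condition is verified, or None."""
    for condition, action in entry:
        if condition is None:              # "T": always true
            return action
        relation, expected = condition
        if (relation in holding) == expected:
            return action
    return None

# Hypothetical entry with two conditioned shifts on the same terminal.
EDGE_ENTRY: Entry = [
    (("1_2", False), "shift 5"),   # tester: 1_2 must not hold
    (("1_2", True), "shift 6"),    # tester: 1_2 must hold
]
```

Because the two conditions in `EDGE_ENTRY` are mutually exclusive, the order of the pairs does not matter here; for overlapping conditions the first match wins, which is what makes the ordering a design decision of the language implementer.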

3.3.2 Building parsing tables with ordered substates

In the following we show how to modify the algorithm for the construction of the collection of XpLR(0) item sets in order to obtain a parsing table where the states can be partitioned into substates. To this aim, we introduce a new function called Partition in addition to Closure and Goto functions.

The Partition Operation

If J is a set of XpLR(0) items then Partition(J) splits J into an ordered sequence of XpLR(0) item sets. The function groups the items having the same driver relation following the dot, so each set of the sequence can be identified by a driver relation. The order of the sets in the sequence depends on the syntax of the language to be parsed, and the language implementer may need to modify it accordingly. Moreover, if J contains one or more complete items (i.e., of type [A → X R1 Y R2 Z ·]) then the function inserts the item whose production is listed first in the XPG specification at the end of the sequence. The function Partition can be computed as follows.

Function PARTITION(J)
begin
  let D be any ordered sequence ⟨Rdriver1, …, Rdrivern⟩ of all the different driver relations following the dots in the items in J
  if n ≥ 1 then
    for i = 1 to n do
      Ji = { items | items = [A → α’ x · Rdriveri, Rtester β’] ∈ J }
  let m be the number of complete items in J
  if m > 0 then
    Jn+1 = { [A → α’ x ·] | [A → α’ x ·] ∈ J, and A → α’ x is the conflicting production listed first in the grammar specification }
    return ⟨J1, …, Jn+1⟩;
  if n ≤ 1 then return {J}
  else return ⟨J1, …, Jn⟩;
end

This function is invoked by the function Goto as described in the following.
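A simplified Python sketch of the partition step is given below. It is an illustration under stated assumptions, not the algorithm verbatim: items are plain dictionaries with a hypothetical `driver` field (`None` for a complete item) and a `prod` field holding the position of the production in the grammar specification, and the order of the driver relations is passed in explicitly, since the text leaves it to the language implementer.

```python
from typing import Dict, List, Optional

Item = Dict[str, Optional[object]]

def partition(items: List[Item], driver_order: List[str]) -> List[List[Item]]:
    """Group items by the driver relation after the dot, in the given order.

    Complete items (driver None) contribute a final group holding only the
    item whose production is listed first in the grammar specification.
    """
    groups: List[List[Item]] = []
    for driver in driver_order:
        group = [item for item in items if item["driver"] == driver]
        if group:
            groups.append(group)
    complete = [item for item in items if item["driver"] is None]
    if complete:
        groups.append([min(complete, key=lambda item: item["prod"])])
    return groups

# Hypothetical kernel items reached after recognizing Graph in the STD grammar.
STATE_4_ITEMS: List[Item] = [
    {"prod": 4, "driver": "1_1"},
    {"prod": 5, "driver": "1_1"},
    {"prod": 6, "driver": "1_2"},
    {"prod": 7, "driver": "any"},
    {"prod": 1, "driver": None},
]
```

Applied to `STATE_4_ITEMS` with the order `["1_1", "1_2", "any"]`, the sketch returns four groups, corresponding to the four ordered substates of state 4.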

The new version of the Goto Operation

Goto(I, x, R) is defined to be the closure of the sequence of sets of items obtained by applying the partition operation to the set of all items [A → α Rdriver, Rtester x · β] such that [A → α · Rdriver, Rtester x β] is in I.

Function GOTO(I, x, Rtester)
begin
  if Rtester = ∅ then
    let J = { [A → α Rdriver x · β] | α ≠ ε and [A → α · Rdriver x β] ∈ I } ∪ { [A → x · β] | [A → · x β] ∈ I }
  else
    let J = { [A → α Rdriver, Rtester x · β] | α ≠ ε and [A → α · Rdriver, Rtester x β] ∈ I }
  set ⟨J1, …, Jm⟩ = PARTITION(J), where m is the length of the returned sequence
  return ⟨CLOSURE(J1), …, CLOSURE(Jm)⟩
end

Finally, the function Items can be easily modified in order to take into account sequences of item sets instead of item sets. In the following we give an example of the construction of the sequences of XpLR(0) item sets.

Example 3.2 The collection of sequences of XpLR(0) item sets for the grammar of example 2.3 is described in the following.

I0 = ⟨ J01 = { S’ → · StateTD                                (goto 1)
               StateTD → · Graph                             (goto 4)
               Graph → · NODEI                               (goto 2)
               Graph → · NODEIF                              (goto 3)
               Graph → · Graph’ 1_1, ¬1_2 EDGE 2_1 Node      (goto 4)
               Graph → · Graph’ 1_1, 1_2 EDGE                (goto 4)
               Graph → · Graph’ 1_2, ¬1_1 EDGE 1_1 Node      (goto 4)
               Graph → · Graph’ any PLACEHOLD                (goto 4) } ⟩
I1 = ⟨ J11 = { S’ → StateTD · } ⟩
I2 = ⟨ J21 = { Graph → NODEI · } ⟩
I3 = ⟨ J31 = { Graph → NODEIF · } ⟩
I4 = ⟨ J41 = { Graph → Graph’ · 1_1, ¬1_2 EDGE 2_1 Node      (goto 5)
               Graph → Graph’ · 1_1, 1_2 EDGE                (goto 6) }
       J42 = { Graph → Graph’ · 1_2, ¬1_1 EDGE 1_1 Node      (goto 7) }
       J43 = { Graph → Graph’ · any PLACEHOLD                (goto 8) }
       J44 = { StateTD → Graph · } ⟩
I5 = ⟨ J51 = { Graph → Graph’ 1_1, ¬1_2 EDGE · 2_1 Node      (goto 9)
               Node → · NODEG                                (goto 10)
               Node → · NODEF                                (goto 11)
               Node → · PLACEHOLD                            (goto 12) } ⟩
I6 = ⟨ J61 = { Graph → Graph’ 1_1, 1_2 EDGE · } ⟩
I7 = ⟨ J71 = { Graph → Graph’ 1_2, ¬1_1 EDGE · 1_1 Node      (goto 13)
               Node → · NODEG                                (goto 10)
               Node → · NODEF                                (goto 11)
               Node → · PLACEHOLD                            (goto 12) } ⟩
I8 = ⟨ J81 = { Graph → Graph’ any PLACEHOLD · } ⟩
I9 = ⟨ J91 = { Graph → Graph’ 1_1, ¬1_2 EDGE 2_1 Node · } ⟩
I10 = ⟨ J101 = { Node → NODEG · } ⟩
I11 = ⟨ J111 = { Node → NODEF · } ⟩
I12 = ⟨ J121 = { Node → PLACEHOLD · } ⟩
I13 = ⟨ J131 = { Graph → Graph’ 1_2, ¬1_1 EDGE 1_1 Node · } ⟩

Sequence I4 is the only one formed by more than one set. The function Partition has split the set of items into four subsets using the ordered sequence of driver relations D = { 1_1, 1_2, any }.

3.4 Applicability of XpLR parsing

In this subsection we show the properties that an extended positional grammar must satisfy in order to obtain a correct and complete XpLR parser. Moreover, we show that the XpLR methodology also provides means to handle grammars whose associated XpLR parsers are not correct and/or complete. Theorem 3.1 gives the conditions under which an XpLR parser is correct. The proof is derived from theorem 7.1 of [15].

Theorem 3.1 (Correctness) Let XPG be an XpLR grammar and P(XPG) its XpLR parser. If Π is a visual sentence accepted by P(XPG) then Π ∈ L(XPG).

Vice versa, the absence of conflicts in an XpLR parsing table for a language L does not guarantee that any visual sentence in L is accepted by the corresponding XpLR parser. Let Π be a visual sentence in L(XPG) where XPG = (G, PE), generated by applying PE to a positional sentence s ∈ L(G). At each step of the parsing process, the function Fetch_Vsymbol takes as argument the pair (REL_i^{h_i}, x) from the column next of the parsing table to inquire the input dictionary. For the parsing program to execute correctly in a deterministic way, there must be a single terminal xi reachable from x that is detected and returned by Fetch_Vsymbol. However, if Fetch_Vsymbol detects more than one terminal on the pair (REL_i^{h_i}, x), a “run-time conflict” message is returned and the parsing program halts. In this case, we say that a run-time conflict occurred. As an example, let us consider the visual sentential form in Fig. 3.5 obtained during the reduction process described in example 2.3 (see Fig. 2.4(e)). The parser produces this form by reaching state 4.3 containing the set of items I43 = {Graph → Graph’ · any PLACEHOLD}. In the next step, the execution of Fetch_Vsymbol on the pair (any, PLACEHOLD) retrieves two occurrences of the terminal PLACEHOLD and, as a consequence, detects a run-time conflict.

Graph

Figure 3.5: A visual sentential form.
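The run-time conflict just described can be illustrated with a small Python sketch. It is deliberately reduced to the counting argument and makes several assumptions: the dictionary is a list of records with hypothetical fields `name`, `visited` and `reachable_by`, reachability is precomputed, and the `tagged` flag anticipates relations for which any of the candidates may be chosen.

```python
from typing import Dict, List, Optional

class RunTimeConflict(Exception):
    """Raised when more than one vsymbol matches the NEXT pair."""

def fetch_vsymbol(dictionary: List[Dict], relation: str, symbol: str,
                  tagged: bool = False) -> Optional[int]:
    """Return the index of the next vsymbol, or None if there is none.

    For an untagged relation, more than one candidate is a run-time conflict;
    for a tagged relation the first candidate is taken (any choice would do).
    """
    candidates = [index for index, entry in enumerate(dictionary)
                  if not entry["visited"]
                  and entry["name"] == symbol
                  and relation in entry["reachable_by"]]
    if not candidates:
        return None
    if len(candidates) > 1 and not tagged:
        raise RunTimeConflict(f"{len(candidates)} candidates for ({relation}, {symbol})")
    chosen = candidates[0]
    dictionary[chosen]["visited"] = True
    return chosen
```

With two unvisited PLACEHOLD entries reachable through the relation any, the untagged call raises `RunTimeConflict`, whereas the tagged call returns one of them and marks it as visited.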

Definition 3.1 (XpLR parsable) Let P(XPG) be the XpLR parser of an XpLR grammar XPG. XPG is said to be XpLR parsable if for each visual sentence Π ∈ L(XPG), each execution of Fetch_Vsymbol invoked by P(XPG) during the parsing of Π detects and returns one and only one terminal.

In other words, the parser P(XPG) of an XpLR parsable grammar XPG never incurs run-time conflicts. The following theorem gives the conditions for completeness of the XpLR parsing algorithm. The proof is an extension of theorem 7.2 of [15].

Theorem 3.2 (Completeness) Let XPG be an XpLR parsable grammar and P(XPG) its XpLR parser. If Π ∈ L(XPG) then Π is accepted by P(XPG).

Grammars that exhibit run-time conflicts are undesirable because they are not suitable for XpLR parsing. In [16] an algorithm has been introduced to statically verify, during the construction of the parsing table, whether or not a positional grammar would produce run-time conflicts (such an algorithm can also be applied to XpLR grammars). In particular, whenever the algorithm detects a conflict it returns the set of items causing the conflict. Therefore, this technique gives the designer feedback in the early phases of the syntax definition of the visual language, together with information on how and where to intervene in order to solve the conflict. As a matter of fact, whenever the algorithm detects a run-time conflict the grammar designer analyzes the relation R causing the conflict and verifies whether the scanning order of the vsymbols producing the conflict, i.e. those belonging to the set detected by Fetch_Vsymbol(NEXT), is not relevant to the correct parsing of the sentence. In this case any of the detected vsymbols can be chosen as the next input. It is easy to show that the relation any in the set of items I43 is such that every PLACEHOLD can be chosen as the next vsymbol to be parsed. In general, when the algorithm statically detects a “non relevant run-time conflict” produced by a relation of this type in a particular set of items, the grammar designer must explicitly tag such a relation in the XPG with a ’*’. In order to support this approach, the function Fetch_Vsymbol must be modified to take into account the tagged relations. The modification consists in the addition of the following new case:

NEXT = (Rdriver*, x), where Rdriver* = ⟨REL_1^{h_1}, …, REL_n^{h_n}⟩* and each REL_i^{h_i} acts on a syntactic attribute ki of x:
  let zi be the hi-th object below the stack top
  for i = 1 to n do
    let next_seti = { b | b is in Dp, it is not marked as visited, it has an attribute j such that (b, j) is reachable from (x, ki), zi RELi b holds, and the relation RELi acts on a syntactic attribute of zi and the syntactic attribute j of b, respectively }
  if ∩i=1…n next_seti is non-empty then
    randomly select an object b from ∩i=1…n next_seti
    for each RELi in Rdriver that is an explicit relation do
      decrease by 1 the entry in the array COUNTER corresponding to the explicit relation zi RELi b
    mark the corresponding entry in Dp as visited
    return the row index of b in Dp
  else return null;

Although the relations used to model many popular visual languages are applied in contexts such that the relations are tagged, this technique cannot always be applied. In these cases, the grammar designer must modify the grammar in order to solve the conflict, analogously to what happens when using traditional compiler-compiler tools such as YACC [39]. Fig. 3.6 describes the steps that the designer follows to construct an XpLR parser.

XpLR methodology
1. Design or modify an extended positional grammar G
2. Construct the parsing table for G
3. if G has no conflicts then exit;
   else apply the following heuristics
   a. positional conflicts are handled by partitioning states in substates
   b. s/r and r/r are handled as with YACC
   c. s/s and g/g are handled by sorting the conflicting entries according to the associated tester relations.
4. if the modified parsing table suits the designer’s needs then exit; else goto 1;

Figure 3.6: The steps for the construction of an XpLR parser.

It is worth noting that for some classes of non-XpLR grammars the application of the heuristics leads to deterministic parsers. In particular, the conflicts that preserve the determinism are: the positional conflicts that involve a univocal relation, and the shift/shift (goto/goto, resp.) conflicts with mutually exclusive conditioned shift (goto, resp.) actions. In the first case, the univocal relation may be satisfied by at most one vsymbol, thus Fetch_Vsymbol may succeed on at most one conflicting entry. Similarly, in the second case only one condition is true, which guarantees that only one shift (goto, resp.) action can be performed. As an example, the grammar for the context-sensitive language a^n b^n c^n given in example 2.2 is non-XpLR (see Fig. 3.7), but the positional conflicts involve the relation right-to, which is a univocal relation. It is easy to prove that the time complexity of the parser is O(n^2) by analyzing the structure of the derivation steps needed to generate the word a^n b^n c^n. Thanks to the application of heuristics we could easily generate parsers for recognizing many practical visual languages that would otherwise require the specification of a complex grammar. As an example, the grammar for state transition diagrams given in example 2.3 is non-XpLR. Nevertheless, we preserve its simple structure by handling the conflicts through the partitioning and ordering of the parser states into substates.

[Figure 3.7: parsing table with columns St. (states 0 to 11), Action (a, b, c, EOI), Goto (S, B) and NEXT; the NEXT column contains the pairs (start, S), (end, EOI), (right-to, B), (right-to, S), (right-to, c) and (right-to, b).]

Figure 3.7: The XpLR(0) parsing table for the grammar of example 2.2.

Chapter 4

Building LR(0) Parsers for XPG Grammars

Given an extended positional grammar XPG, we can build an XpLR(0) parser that recognizes L(XPG) by using the algorithms described in the previous chapter. In this chapter, we show that it is also possible to construct a parser for an XPG grammar by using a translation scheme directly derived from XPG by means of special mapping rules. A translation scheme is a context-free string grammar in which attributes are associated with the grammar vsymbols and semantic actions enclosed between braces {} are inserted within the right sides of productions [1]. In the following we denote with map(XPG) the translation scheme SG derived by applying mapping rules to XPG, G(SG) the context-free grammar underlying SG, P(SG) the corresponding parser, and L(SG) the language recognized by P(SG). The conversion of an XPG into an “equivalent” translation scheme allows us to use standard and well-known compiler generation tools, like YACC [39], for the rapid implementation of compilers for visual languages. We also prove that given a translation scheme SG = map(XPG),
1. G(SG) is LR(0) iff XPG is XpLR(0), and
2. the LR(0) parser built on SG recognizes the same set of visual sentences as the XpLR(0) parser built on XPG.

However, in an attempt to keep the grammars simple, visual language designers often prefer to leave ambiguities within the grammar, and to solve them later by using conflict handling techniques to be specified when generating the parser. This means that we might frequently have to deal with ambiguous XPGs. Thus, P(SG) needs heuristics for conflict solving to preserve the equivalence between L(SG) and L(XPG). In particular, we will prove that to each type of conflict in an XPG corresponds a precise type of conflict in map(XPG). In this way, we devise conflict handling techniques for map(XPG) simulating the behavior of the techniques used for XPG, so that L(XPG) is still equivalent to L(map(XPG)). Fig. 4.1 graphically illustrates our approach. Let us observe that, if the translation scheme obtained from the conversion is non-LR(0), then it does not present shift/reduce conflicts but only reduce/reduce conflicts.

[Figure 4.1: flow diagram with the elements XPG, converter, Intermediate Translation Scheme, the test “are there reduce/reduce conflicts?”, Application of R/R Conflict Resolution Techniques, and LR Translation Scheme.]

Figure 4.1: An approach for the construction of LR grammars from XPGs.

The next section describes the mapping process, whereas section 4.2 shows the equivalence between L(XPG) and L(map(XPG)) for an XpLR grammar XPG. Finally, section 4.3 provides techniques for constructing a translation scheme map(XPG) for a non-XpLR grammar XPG, such that L(P(map(XPG))) = L(P(XPG)). In other words, we describe how to construct an LR parser simulating P(XPG), including its heuristics.

4.1 Converting an XPG into a translation scheme based on string grammars

In this section we define the mapping rules to convert a generic extended positional grammar into a translation scheme. The generated translation schemes have synthesized attributes, i.e. each grammar production “A → α” is associated with an action that calculates the attributes of the non-terminal A from the values of the vsymbols in the right hand side α. Let us consider a generic production of an extended positional grammar XPG:

A → α xi Rdriveri, Rtesteri xi+1 β, ∆, Γ        (4.1)

where xi, xi+1 are either terminals or non-terminals and Γ = {(N1, Cond1, ∆1), …, (Nt, Condt, ∆t)} with t ≥ 0. The syntactic attributes of each vsymbol in the production will be left unchanged in the final translation scheme SG. The ∆ and Γ rules will be emulated within the action sections of SG. In order to complete the mapping we need to introduce new non-terminals, productions and actions within SG to simulate the behavior of each sequence of relations Rdriveri and Rtesteri. The conversion of XPG into SG is accomplished through the four mapping rules given below, which are applied to the productions of XPG to derive the set of productions and semantic actions of SG. In them we refer to Dp as the dictionary storing the vsymbols. Moreover, the functions Fetch_Vsymbol and Test have the same behavior as the corresponding functions defined in subsection 3.1.4, with the only difference that they ignore some non-terminals when accessing the stack. In particular, they ignore non-terminals added during the conversion process, which did not belong to XPG. The four mapping rules follow:

Rule 1. Replace each sequence of driver relations Rdriveri with a new unique non-terminal DRki. Furthermore, build an empty production on DRki with an action emulating the fetching of the next vsymbol to parse. Such an action calls the function Fetch_Vsymbol on arguments Rdriveri and xi+1, where xi+1 is the vsymbol following Rdriveri in the XPG production. When DRki is reduced, the action retrieves the next vsymbol to be processed from Dp. In particular, the added production is:

DRki → ε  { ip = Fetch_Vsymbol(Rdriveri, xi+1);
            if ip is not null then next_vsymbol = Dp[ip]
            else {emit “syntax error”; exit;} }

Rule 2. Replace each sequence of non-empty tester relations Rtesteri with a sequence formed by a new unique non-terminal TRki followed by a fictitious unique terminal aki. Such a sequence must be placed after the vsymbol xi+1 following Rtesteri in the XPG production. Moreover, introduce a new empty production for TRki with an action emulating the tester relations. In particular, the action invokes the function Test for each relation RELh in Rtesteri to verify whether RELh holds between xi+1 and the previously scanned vsymbols. If Test always returns true, then the fictitious terminal aki is returned as the next vsymbol to be processed. The successful parsing of aki signals the correct recognition of xi+1.

The following productions are the result of applying rules 1 and 2 to Rdriveri and Rtesteri in the XPG production 4.1.

A → α xi DRki xi+1 TRki aki β’  { ∆;
                                  for j = 1 to t do
                                    if Condj is true then { insert(Dp, Nj); ∆j; } }

DRki → ε  { ip = Fetch_Vsymbol(Rdriveri, xi+1);
            if ip is not null then next_vsymbol = Dp[ip]
            else {emit “syntax error”; exit;} }

TRki → ε  { if Test(RELh, xi+1) is true for each RELh in Rtesteri
            then next_vsymbol = aki
            else {emit “syntax error”; exit;} }

Rule 3. Add the following two productions to SG in order to calculate the first vsymbol to be processed, and to verify that all the vsymbols in the input sentence have been processed:

S’ → SP S  { ip = Fetch_Vsymbol(end);
             if ip is not null then {emit “syntax error”; exit;}
             else {emit “parsing ok”; exit;} }

SP → ε  { ip = Fetch_Vsymbol(start);
          if ip is not null then next_vsymbol = Dp[ip]
          else {emit “syntax error”; exit;} }

Here, S is the starting vsymbol of XPG, whereas S’ is the starting vsymbol of SG. The following rule aims to reduce the number of productions and non-terminals in SG, so that the corresponding parser will have a reduced number of states.

Rule 4. Merge empty productions with identical actions to form a single production. This entails the elimination of the non-terminals on the LHSs of merged productions, and the introduction of a new non-terminal as the LHS of the resulting production. Moreover, merge empty productions having the same parameters in the Test function into a single production. This entails that the LHSs and the fictitious terminals of merged productions need to be replaced by a single non-terminal and a single fictitious terminal for the resulting production. This renaming process needs to be propagated to all the productions referring to the renamed vsymbols.

The application of these mapping rules to an extended positional grammar XPG without empty productions produces a translation scheme SG in which the productions have two possible formats:

1. B → y1 A1 y2 A2 … An−1 yn, with n ≥ 1, where B is a non-terminal from XPG, each Ai is either a DR or a TR non-terminal vsymbol, and each yi is either a terminal or a non-terminal vsymbol from XPG or a unique fictitious terminal. Moreover, a TR vsymbol can only be followed by a fictitious terminal.
2. A → ε, where A is either a DR or a TR non-terminal vsymbol.

In the following, productions of type 1 will be referred to as ordinary productions, and productions of type 2 as DR or TR productions depending on whether A is of type DR or TR.
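The effect of Rules 1 and 2 on the right-hand side of a single production can be sketched in Python as follows. This is an illustrative sketch only: the function name `map_rhs`, the tuple encoding of a production, the generated names DR1, TR1, a1 and the textual form of the actions are assumptions made for the example, and the merging step of Rule 4 is not covered.

```python
from typing import List, Optional, Tuple

Step = Tuple[str, Optional[str], str]      # (driver, tester or None, vsymbol)

def map_rhs(first: str, steps: List[Step]) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Translate x1 (driver, tester) x2 ... into an ordinary production RHS.

    Returns the new right-hand side and the list of added empty productions,
    each given as (non-terminal, description of its action).
    """
    rhs: List[str] = [first]
    empty_productions: List[Tuple[str, str]] = []
    dr_count = tr_count = 0
    for driver, tester, symbol in steps:
        dr_count += 1
        dr = f"DR{dr_count}"                       # Rule 1: driver -> DR non-terminal
        rhs += [dr, symbol]
        empty_productions.append((dr, f"Fetch_Vsymbol({driver}, {symbol})"))
        if tester:                                 # Rule 2: non-empty tester only
            tr_count += 1
            tr, fict = f"TR{tr_count}", f"a{tr_count}"
            rhs += [tr, fict]                      # TR is followed by a fictitious terminal
            empty_productions.append((tr, f"Test({tester}, {symbol})"))
    return rhs, empty_productions
```

For a production of the shape Graph’ (1_1, ¬1_2) EDGE (2_1) Node, the call `map_rhs("Graph'", [("1_1", "not 1_2", "EDGE"), ("2_1", None, "Node")])` produces the right-hand side Graph’ DR1 EDGE TR1 a1 DR2 Node, which has the format of an ordinary production.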

Example 4.1 Given the XPG = ((T, N ∪ POS, P, S), PE) for state transition diagrams shown in example 2.3, the application of mapping rules 1-4 yields the translation scheme SG = (T’, N’, P’, S’), where T’ = T ∪ {A1, A2, A3}, N’ = N ∪ {S’, SP, r1_1, r2_1, r1_2, r1_1b, tn1_2, t1_2, tn1_1, rany} and P’ is the set of productions with actions described in the following.

S’ → SP StateTD  { ip = Fetch_Vsymbol(end);
                   if ip is not null then {emit “syntax error”; exit;}
                   else {emit “parsing ok”; exit;} }
SP → ε  { ip = Fetch_Vsymbol(start);
          if ip is not null then next_symbol = Dp[ip]
          else {emit “syntax error”; exit;} }
StateTD → Graph  { }
Graph → NODEI  { Graph_1 = NODEI_1; }
Graph → NODEIF  { Graph_1 = NODEIF_1; }
Graph → Graph’ r1_1 EDGE tn1_2 A1 r2_1 Node
        { Graph_1 = Graph’_1 − EDGE_1;
          for i = 1 to t do
            if |Node_1| > 1 then { insert(PLACEHOLD); PLACEHOLD_1 = Node_1 − EDGE_2; } }
r1_1 → ε  { ip = Fetch_Vsymbol(1_1, EDGE);
            if ip is not null then next_symbol = Dp[ip]
            else {emit “syntax error”; exit;} }
tn1_2 → ε  { if Test(¬1_2, EDGE) is true then next_symbol = A1
             else {emit “syntax error”; exit;} }
r2_1 → ε  { ip = Fetch_Vsymbol(2_1, Node);
            if ip is not null then next_symbol = Dp[ip]
            else {emit “syntax error”; exit;} }
Graph → Graph’ r1_1 EDGE t1_2 A2  { Graph_1 = (Graph’_1 − EDGE_1) − EDGE_2; }
t1_2 → ε  { if Test(1_2, EDGE) is true then next_symbol = A2
            else {emit “syntax error”; exit;} }
Graph → Graph’ r1_2 EDGE tn1_1 A3 r1_1b Node
        { Graph_1 = Graph’_1 − EDGE_2;
          for i = 1 to t do
            if |Node_1| > 1 then { insert(PLACEHOLD); PLACEHOLD_1 = Node_1 − EDGE_1; } }
r1_2 → ε  { ip = Fetch_Vsymbol(1_2, EDGE);
            if ip is not null then next_symbol = Dp[ip]
            else {emit “syntax error”; exit;} }
tn1_1 → ε  { if Test(¬1_1, EDGE) is true then next_symbol = A3
             else {emit “syntax error”; exit;} }
r1_1b → ε  { ip = Fetch_Vsymbol(1_1, Node);
             if ip is not null then next_symbol = Dp[ip]
             else {emit “syntax error”; exit;} }
Graph → Graph’ rany PLACEHOLD  { Graph_1 = PLACEHOLD_1; }
rany → ε  { ip = Fetch_Vsymbol(any, PLACEHOLD);
            if ip is not null then next_symbol = Dp[ip]
            else {emit “syntax error”; exit;} }
Node → NODEG  { Node_1 = NODEG_1; }
Node → NODEF  { Node_1 = NODEF_1; }
Node → PLACEHOLD  { Node_1 = PLACEHOLD_1; }

4.2 Comparing the recognized languages

In this section we compare L(SG) and L(XPG) and analyze the circumstances under which they are equivalent. In particular, we prove that the grammar G(SG) is LR(0) iff XPG is XpLR(0), and that if G(SG) is LR(0) then L(SG) is equivalent to L(XPG). Let XPG = ((N, T ∪ POS, S, P), PE) be an extended positional grammar, and let Rd and Rt be sets of sequences of relations from POS. Sequences in Rd represent driver relations, whereas those in Rt represent tester relations. Moreover, let SG = map(XPG) and G’ = G(SG) = (N’, T’, S’, P’). From the mapping rules 1-4 seen above we know that N ⊆ N’ and T ⊆ T’. In particular, N’ = N ∪ DR ∪ TR and T’ = T ∪ FICT ∪ {ε}, where DR is the set of non-terminal vsymbols introduced in G’ by Rule 1; TR and FICT are, in order, the sets of non-terminal and fictitious terminal vsymbols introduced in G’ by Rule 2. Furthermore, the following regular expressions will also be used throughout the proofs:

N is the regular expression denoting the set N of non-terminals in XPG;
T is the regular expression denoting the set T of terminals in XPG;
Rd is the regular expression denoting the set Rd of sequences of driver relations;
Rt is the regular expression denoting the set Rt of sequences of tester relations;
DR is the regular expression denoting the set DR of non-terminals resulting from Rules 1 and 4;
TR is the regular expression denoting the set TR of non-terminals resulting from Rules 2 and 4;
a is the regular expression denoting the set FICT of fictitious terminals resulting from Rules 2 and 4;
x = (N | T) denotes the set of grammar vsymbols from XPG;
PREF = x (Rd, Rt? x)* denotes the set of non-empty prefixes of the right hand side of a production in XPG;
SUFF = (Rd, Rt? x)* denotes the set of suffixes of the right hand side of a production in XPG;
PREF’ = x (DR x (TR a)?)* denotes the set of prefixes of the right hand side of a production in G’;
SUFF’ = (DR x (TR a)?)* denotes the set of suffixes of the right hand side of a production in G’.

If r is a regular expression we will use the standard notation L(r) to refer to the language defined by r. In the following we define a correspondence between the set of items constructed from an XPG by using the algorithms of section 3.2 and the set of items constructed from the grammar G’.

Definition 4.1 (map-equivalence) Let I be a set of XpLR(0) items derived from XPG and I’ a set of LR(0) items derived from G’. The sets I and I’ are map-equivalent iff
1. the number of kernel items is the same, and
2. for each kernel item A → α’ · β’ in I’ there exists a kernel item A → α · β in I such that:
   (a) the production A → α’ β’ in G’ is derived by the application of Rules 1-4 to the production A → α β in XPG, and
   (b) α’ is the result of the translation process applied to α, whereas β’ is the result of the translation process applied to β.

In the following, whenever we use a Greek letter α to denote a generic sequence on the RHS of some production in XPG, we will denote with α’ the sequence obtained as a result of the translation process applied to α. An interesting property of the mapping process is given in the following proposition.

Proposition 4.1 Let I be a set of XpLR(0) items derived from an XPG and I’ a set of LR(0) items derived from G’ and map-equivalent to I. For each shift/goto transition of P(XPG) from I to an adjacent set of items Ix there exists a set of items Ix’ map-equivalent to Ix that can be reached through 1, 2, or 4 consecutive transitions of P(G’) from I’.

Proof: The items derived from XPG that are subject to shift/goto transitions can be of three types:


1. Kernel items with a non-empty sequence of tester relations following the dot.
2. Kernel items with an empty sequence of tester relations following the dot.
3. Items with a dot at the beginning of the right hand side.

Since we assume that XPG generates no conflicts, only three cases need to be examined.

CASE 1. I contains one or more items of type 1 with the same driver sequence, tester sequence, and grammar vsymbol immediately following the dot. In this case there is a single transition from the set of items

I:  A → α · ⟨Rdriver, Rtester⟩ x β,
    B → δ · ⟨Rdriver, Rtester⟩ x γ,
    ...

to the set of items

Ix: A → α ⟨Rdriver, Rtester⟩ x · β,
    B → δ ⟨Rdriver, Rtester⟩ x · γ,
    ...

where α, δ ∈ L(PREF), Rdriver ∈ Rd, Rtester ∈ Rt, x ∈ N ∪ T, β, γ ∈ L(SUFF). Given the nature of the mapping rules 1-4 it is easy to prove that there exists a sequence of four consecutive transitions in P(G’) starting from the set of items I’ (map-equivalent to I)

I’: A → α’ · DRi x TRi a β’,
    B → δ’ · DRi x TRi a γ’,
    ...

and ending in the set of items

Ix’: A → α’ DRi x TRi a · β’,
     B → δ’ DRi x TRi a · γ’,
     ...

with α’, δ’ ∈ L(PREF’), DRi ∈ DR, TRi ∈ TR, a ∈ FICT, β’, γ’ ∈ L(SUFF’). The execution of the actions associated with the empty productions involving the non-terminals DRi and TRi, together with the recognition of the vsymbol a, reproduces the same effect as the XpLR parser invocations of the Fetch Vsymbol and Test algorithms on (Rdriver, x) and (Rtester, x), respectively.

CASE 2. As case 1, but here Rtester is empty. If x is the vsymbol following the dot, there might exist items of type 3 derived by closure that happen to have x as the first vsymbol on their RHS. Thus, in P(XPG) there is a shift/goto transition from the set of items

I:  A → α · ⟨Rdriver, ⟩ x β,
    B → δ · ⟨Rdriver, ⟩ x γ,
    X → σ · ⟨Rdriver, Rtester⟩ C τ,
    C → · x λ,
    ...

to the set of items

Ix: A → α ⟨Rdriver, ⟩ x · β,
    B → δ ⟨Rdriver, ⟩ x · γ,
    C → x · λ,
    ...

and there exists a sequence of two consecutive transitions in P(G’) starting from the set of items

I’: A → α’ · DRi x β’,
    B → δ’ · DRi x γ’,
    X → σ’ · DRi C TRi a τ’,
    ...

crossing the set of items

I”: A → α’ DRi · x β’,
    B → δ’ DRi · x γ’,
    X → σ’ DRi · C TRi a τ’,
    C → · x λ’,
    ...

and ending in an item set Ix’ map-equivalent to Ix

Ix’: A → α’ DRi x · β’,
     B → δ’ DRi x · γ’,
     C → x · λ’,
     ...

with α’, δ’, σ’ ∈ L(PREF’), DRi ∈ DR, TRi ∈ TR, a ∈ FICT, β’, γ’, τ’, λ’ ∈ L(SUFF’). The execution of the actions associated with the empty production involving the non-terminal DRi reproduces the same effect as the XpLR parser invocation of Fetch Vsymbol on (Rdriver, x).

CASE 3. In this case I contains a certain number of items of type 3 whose right hand sides start with the same grammar vsymbol x. This means that there is a shift/goto transition in P(XPG) from the set of items

I:  A → · x β,
    B → · x γ,
    ...

to the set of items

Ix: A → x · β,
    B → x · γ,
    ...

with x ∈ N ∪ T, β, γ ∈ L(SUFF), and there exists a transition in P(G’) from the set of items

I’: A → · x β’,
    B → · x γ’,
    ...

to the set of items

Ix’: A → x · β’,
     B → x · γ’,
     ...

with β’, γ’ ∈ L(SUFF’).

In the next proposition we prove that G(map(XPG)) can only have reduce/reduce conflicts, for any XPG.

Proposition 4.2 Let G’=G(map(XPG)). If G’ is non-LR(0) then the corresponding parsing table can never present a shift/reduce conflict.

Proof: Let us contradict the thesis by supposing that the parsing table derived from G’ has a shift/reduce conflict. In order for G’ to produce a shift/reduce conflict there must


exist a set of items K containing at least one complete item and one item with the dot preceding a terminal vsymbol. The latter can be of the following three possible types:

1. The dot is between a TR non-terminal and a fictitious terminal:

A → α’ DRki xi+1 TRki · aki β’

with α’ ∈ L(PREF’), DRki ∈ DR, TRki ∈ TR, aki ∈ FICT, β’ ∈ L(SUFF’). Since the set of items K must have been reached through a goto operation on TRki, and a TR non-terminal has to be followed by a fictitious terminal, there cannot exist any complete item in K. Hence, shift/reduce conflicts cannot involve an item with a dot between a TR non-terminal and a fictitious terminal.

2. The dot is between a DR non-terminal and a terminal from XPG:

A → α’ DRki · b ρ’ β’

with α’ ∈ L(PREF’), b ∈ T, DRki ∈ DR, ρ’ ∈ L((TR a)?), β’ ∈ L(SUFF’). The set of items K must have been reached by a goto operation on DRki. Moreover, a DR non-terminal has to be followed by a vsymbol b ∈ N ∪ T. Thus, if b ∈ T there cannot exist any complete item in K; if b ∈ N the closure on b cannot generate complete items in K because no empty ordinary productions are allowed. Hence, shift/reduce conflicts cannot involve an item with a dot between a DR non-terminal and a terminal.

3. The dot is at the beginning of the right hand side of a non-empty production:

A → · b α’

with b ∈ T, α’ ∈ L(SUFF’). Thus, the set of items K should contain the following item:

B → δ’ DRki · Y ρ’ γ’

with δ’ ∈ L(PREF’), Y ∈ N, Y ⇒* Aλ’, λ’, γ’ ∈ L(SUFF’), ρ’ ∈ L((TR a)?). This corresponds to case 2. Hence, shift/reduce conflicts cannot involve an item with a dot at the beginning of the right hand side of a non-empty production.

It is then proved that G’ can never produce an LR(0) parsing table with shift/reduce conflicts, independently of the XpLR(0) property of XPG.

Now we consider the case in which the translation scheme SG obtained from an XPG by applying the mapping rules 1-4 is LR(0). We prove that G(SG) is LR(0) if and only if XPG is XpLR(0).

Theorem 4.1 Let G’ = G(map(XPG)). G’ is LR(0) iff XPG is XpLR(0).

Proof: ⇒) If G’ is LR(0), we need to prove that XPG is XpLR(0). Let us contradict


the thesis by supposing that XPG is not XpLR(0); we then need to prove that the hypothesis is contradicted as well, i.e., that G’ is not LR(0). If XPG is not XpLR(0), its XpLR(0) parsing table has at least one of the following types of conflicts: shift/shift case, goto/goto case, shift/shift othercase, goto/goto othercase, and positional conflicts. In the following we analyze each XpLR(0) conflict, identify the XpLR(0) items raising it, and use Proposition 4.1 to prove that there must exist a corresponding set of LR(0) items generated from G’ yielding conflicts in the associated LR(0) parsing table.

Shift/Shift Case (Goto/Goto Case, respectively)

There is only one way for an XpLR(0) set of items K to present a shift/shift case conflict (goto/goto case conflict, resp.). The set of items K must contain at least two kernel items k1 and k2 with the dot preceding a terminal (non-terminal, respectively) vsymbol. The sequences of driver relations right after the dot must be equal, whereas the sequences of tester relations must not be mutually exclusive for a conflict to arise. Thus, K contains the following items:

K:  A → α · ⟨Rdriveri, Rtesteri⟩ xi+1 β,      (k1)
    B → γ · ⟨Rdriveri, R’testeri⟩ xi+1 δ,     (k2)
    ...

with α, γ ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri, R’testeri ∈ Rt, β, δ ∈ L(SUFF), xi+1 ∈ T (xi+1 ∈ N, respectively). Then, the map-equivalent set of items K’ derived from G’ will contain the following items:

K’: A → α’ · DRi xi+1 TRi ai β’,      (k1’)
    B → γ’ · DRi xi+1 TR’i a’i δ’,    (k2’)
    DRi → ·
    ...

with α’, γ’ ∈ L(PREF’), DRi ∈ DR, TRi, TR’i ∈ TR, ai, a’i ∈ FICT, β’, δ’ ∈ L(SUFF’). By executing the goto operation twice, first on DRi and then on xi+1, we reach a set of items I’ containing the following items:

I’: A → α’ DRi xi+1 · TRi ai β’,      (i1’)
    B → γ’ DRi xi+1 · TR’i a’i δ’,    (i2’)
    TRi → ·
    TR’i → ·
    ...

which presents a reduce/reduce conflict involving two different TR productions. As a consequence, if the parsing table on XPG contains a shift/shift case conflict or a goto/goto case conflict, then the parsing table built on G’ must contain a reduce/reduce conflict. This leads to a contradiction since the hypothesis states that G’ is LR(0).

Shift/Shift Othercase (Goto/Goto Othercase, respectively)

There are two ways for an XpLR(0) set of items K to present a shift/shift othercase conflict (goto/goto othercase conflict, resp.).

CASE 1. In the first case, the set of items K must contain two kernel items k1 and k2 with the dot preceding a terminal (non-terminal, respectively) vsymbol. The two items must have equal sequences of driver relations right after the dot, and exactly one of the two must have an empty sequence of tester relations right after the dot.

K:  A → α · ⟨Rdriveri, Rtesteri⟩ xi+1 β,      (k1)
    B → γ · ⟨Rdriveri, ⟩ xi+1 δ,              (k2)
    ...

where α, γ ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt, β, δ ∈ L(SUFF), xi+1 ∈ T (xi+1 ∈ N, respectively). Then, the map-equivalent set of items K’ derived from G’ will contain the following items:

K’: A → α’ · DRi xi+1 TRi ai β’,      (k1’)
    B → γ’ · DRi xi+1 δ’,             (k2’)
    DRi → ·
    ...

where α’, γ’ ∈ L(PREF’), DRi ∈ DR, TRi ∈ TR, ai ∈ FICT, β’, δ’ ∈ L(SUFF’). By executing the goto operation twice, on DRi and xi+1 successively, we reach a set of items I’ containing the following items:

I’: A → α’ DRi xi+1 · TRi ai β’,      (i1’)
    B → γ’ DRi xi+1 · δ’,             (i2’)
    TRi → ·
    ...

If δ’ is empty, then the set of items contains a reduce/reduce conflict on the TRi production and the item i2’, otherwise δ’ starts with the vsymbol DRi+1, so that also the


item “DRi+1 → ·” must have been added to I’ by closure. In this case there would be a reduce/reduce conflict involving the DR and the TR productions.

CASE 2. In the second case, the set of items K must contain a kernel item k1 and a nonkernel item k2, both with the dot preceding a terminal (non-terminal, respectively) vsymbol.

K:  A → α · ⟨Rdriveri, Rtesteri⟩ xi+1 β,      (k1)
    B → · xi+1 δ,                             (k2)
    ...

where α ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt, β, δ ∈ L(SUFF), xi+1 ∈ T (xi+1 ∈ N, respectively). In this case K must also contain at least one other kernel item k0, from which k2 is derived by closure. The two kernel items k0 and k1 must have equal sequences of driver relations right after the dot, otherwise there would be no conflict.

    X → σ · ⟨Rdriveri, ρ⟩ Y γ,                (k0)

where σ ∈ L(PREF), Y ∈ N, Y ⇒* Bλ, λ ∈ L(SUFF), Rdriveri ∈ Rd, ρ ∈ L(Rt?), γ ∈ L(SUFF). Then, the map-equivalent set of items K’ derived from G’ will contain the following items:

K’: X → σ’ · DRi Y ρ’ γ’,             (k0’)
    A → α’ · DRi xi+1 TRi ai β’,      (k1’)
    DRi → ·

where σ’, α’ ∈ L(PREF’), DRi ∈ DR, TRi ∈ TR, ρ’ ∈ L((TR a)?), ai ∈ FICT, β’, γ’ ∈ L(SUFF’). By executing the goto operation on DRi we reach a set of items J’ containing the following items:

J’: X → σ’ DRi · Y ρ’ γ’,             (j0’)
    A → α’ DRi · xi+1 TRi ai β’,      (j1’)
    B → · xi+1 δ’,                    (j2’)

Notice the presence of the item B → · xi+1 δ’. We prove that it is generated by closure from Y. In fact, we know from the hypothesis that Y ⇒* Bλ with λ ∈ L(SUFF). This means that there exists in XPG a (possibly empty) sequence of non-terminal vsymbols B1, B2, ..., Bn such that

Y ⇒ B1 λ1 ⇒ B2 λ2 ⇒ ... ⇒ Bn λn,

with λ1, λ2, ..., λn ∈ L(SUFF),


and there exist productions Bn → Bλn+1 and B → xi+1 δ, with λn+1, δ ∈ L(SUFF). According to the mapping rules this means that there must be a similar derivation generated by G’:

Y ⇒ B1 λ1’ ⇒ B2 λ2’ ⇒ ... ⇒ Bn λn’,

with λ1’, λ2’, ..., λn’ ∈ L(SUFF’), and productions Bn → B λn+1’ and B → xi+1 δ’, with λn+1’, δ’ ∈ L(SUFF’), so that when the dot precedes the string Y ρ’ γ’ we can derive the item B → · xi+1 δ’ by closure through the n+2 productions seen above. By executing the goto operation on xi+1 we reach a set of items I’ containing the following items:

I’: A → α’ DRi xi+1 · TRi ai β’,      (i1)
    B → xi+1 · δ’,                    (i2)
    TRi → ·

Again, if δ’ is empty then I’ contains a reduce/reduce conflict on the TRi production and the item i2, otherwise δ’ starts with the vsymbol DRi+1, so that also the item “DRi+1 → ·” must have been added to I’ by closure. Thus, also in this case there is a reduce/reduce conflict involving the DR and the TR productions.

Positional conflicts

There is only one way for an XpLR(0) set of items K to present a positional conflict. The set of items K must contain at least two incomplete kernel items k1 and k2 having equal sequences of driver relations and different vsymbols following the dot. No constraints are imposed on the sequences of tester relations. Thus, K must contain the following items:

K:  A → α · ⟨Rdriveri, ρ⟩ xi+1 β,      (k1)
    B → γ · ⟨Rdriveri, φ⟩ yi+1 δ,      (k2)

where α, γ ∈ L(PREF), Rdriveri ∈ Rd, ρ, φ ∈ L(Rt?), β, δ ∈ L(SUFF), xi+1, yi+1 ∈ T (xi+1, yi+1 ∈ N, respectively). The map-equivalent set of items K’ derived from G’ will then contain the following items:

K’: A → α’ · DRi xi+1 ρ’ β’,      (k1’)
    B → γ’ · DR’i yi+1 φ’ δ’,     (k2’)
    DRi → ·
    DR’i → ·

with α’, γ’ ∈ L(PREF’), DRi, DR’i ∈ DR, ρ’, φ’ ∈ L((TR a)?), β’, δ’ ∈ L(SUFF’), which presents a reduce/reduce conflict involving two different DR productions. Again, this contradicts the hypothesis.

Thus, we can conclude that if G’ is an LR(0) grammar obtained by applying the mapping rules 1-4 to an extended positional grammar XPG, then XPG is an XpLR(0) grammar.

⇐) Let XPG be an XpLR(0) grammar; we need to prove that the grammar G’=G(map(XPG)) is LR(0). We prove this by assuming that G’ is not LR(0) and showing that this hypothesis leads to a contradiction. In order for G’ to be non-LR(0), its parsing table must contain at least one shift/reduce or reduce/reduce conflict. From Proposition 4.2 the parsing table derived from G’ can never present a shift/reduce conflict; thus in the following we show that each type of reduce/reduce conflict leads to a particular conflict in the parsing table derived from XPG. We distinguish different types of reduce/reduce conflicts produced by G’ depending on the types of productions involved. Recall that G’ contains three types of productions, namely ordinary, TR and DR productions. Therefore, the possible types of reduce/reduce conflicts caused by G’ are given by the six pairwise combinations of them. Fig. 4.2 summarizes all the correspondences between conflicts caused by G’ and XPG that we intend to prove. As an example, the first row of the table can be read as follows: “a reduce/reduce conflict involving an ordinary production and a DR production in G’ implies a shift/reduce conflict caused by XPG”. The correctness of this table implies that if XPG is an XpLR(0) grammar, then the grammar G’ is an LR(0) grammar. In what follows we prove the six cases one by one.

Type of reduce/reduce conflict caused by G’     Type of conflict caused by XPG
(ordinary, DR)                                  shift/reduce
(ordinary, ordinary)                            reduce/reduce
(TR, TR)                                        shift/shift or goto/goto case
(TR, ordinary), (TR, DR)                        shift/shift or goto/goto othercase
(DR, DR)                                        positional

Figure 4.2: Correspondence between conflicts caused by G’ and XPG.
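
The correspondence of Fig. 4.2 can also be read as a lookup from an unordered pair of production kinds to the XPG conflict it implies. The following is only an illustrative sketch: the names CONFLICT_MAP and xpg_conflict are hypothetical and do not belong to the XpLR methodology or to VLCC.

```python
# Illustrative encoding of Fig. 4.2 (names are assumptions, not part of the thesis).
# Keys are unordered pairs of production kinds involved in a reduce/reduce
# conflict of G'; a pair of equal kinds collapses to a one-element frozenset.
CONFLICT_MAP = {
    frozenset({"ordinary", "DR"}): "shift/reduce",
    frozenset({"ordinary"}): "reduce/reduce",
    frozenset({"TR"}): "shift/shift or goto/goto case",
    frozenset({"TR", "ordinary"}): "shift/shift or goto/goto othercase",
    frozenset({"TR", "DR"}): "shift/shift or goto/goto othercase",
    frozenset({"DR"}): "positional",
}


def xpg_conflict(kind_a, kind_b):
    """Return the XPG conflict implied by a reduce/reduce conflict in G'."""
    return CONFLICT_MAP[frozenset({kind_a, kind_b})]


print(xpg_conflict("DR", "ordinary"))  # shift/reduce
print(xpg_conflict("DR", "DR"))        # positional
```

Using unordered pairs mirrors the fact that the table lists six pairwise combinations of the three production kinds, regardless of order.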


1. (ORDINARY, DR)

Since XPG does not contain empty productions, this case occurs when G’ generates a set of items I’ that contains at least one complete item like i1, and at least one item like i2 with the dot preceding a DR vsymbol on the RHS:

I’: A → α’ ·                       (i1)
    B → β’ · DRi xi+1 ρ’ γ’        (i2)
    DRi → ·
    ...

with xi+1 ∈ N ∪ T, α’, β’ ∈ L(PREF’), ρ’ ∈ L((TR a)?), γ’ ∈ L(SUFF’). There must exist a map-equivalent set of items I generated from XPG containing the following items:

I:  A → α ·
    B → β · ⟨Rdriveri, ρ⟩ xi+1 γ
    ...

with α, β ∈ L(PREF), Rdriveri ∈ Rd, ρ ∈ L(Rt?), γ ∈ L(SUFF). Hence, there should have been a shift/reduce conflict generated by XPG, which would contradict the hypothesis.

2. (ORDINARY, ORDINARY)

This case occurs when the grammar G’ generates a set of items I’ containing two or more complete items. As an example, let us suppose that I’ contains two complete items i1 and i2. They must end with the same vsymbol, because it is the last scanned vsymbol in both of them and, from the hypothesis, the two items are not empty.

I’: A → α’ xi ·      (i1)
    B → β’ xi ·      (i2)
    ...

with xi ∈ N ∪ T, α’, β’ ∈ L(PREF’(DR)) ∪ {ε}. Thus, there must exist a map-equivalent set of items I generated from XPG containing the following items:

I:  A → α xi ·
    B → β xi ·
    ...

with α, β ∈ L(PREF⟨Rd, ⟩) ∪ {ε}.


Hence, there should have been a reduce/reduce conflict generated by XPG, which would contradict the hypothesis.

3. (TR, TR)

This case occurs when I’ contains two or more items with the dot preceding different TR vsymbols. By following similar arguments as above, I’ will have the following structure:

I’: A → α’ DRi xi+1 · TRi ai δ’      (i1)
    B → β’ DRi xi+1 · TRj aj γ’      (i2)
    TRi → ·
    TRj → ·
    ...

with xi+1 ∈ N ∪ T, α’, β’ ∈ L(PREF’), DRi ∈ DR, TRi, TRj ∈ TR, ai, aj ∈ FICT, δ’, γ’ ∈ L(SUFF’). Thus, there must exist the following set of items J’ generated by the grammar G’, such that goto(J’, xi+1) = I’:

J’: A → α’ DRi · xi+1 TRi ai δ’      (j1)
    B → β’ DRi · xi+1 TRj aj γ’      (j2)
    ...

Moreover, there must exist the following set of items K’ generated by the grammar G’, such that goto(K’, DRi) = J’:

K’: A → α’ · DRi xi+1 TRi ai δ’      (k1)
    B → β’ · DRi xi+1 TRj aj γ’      (k2)
    DRi → ·
    ...

The map-equivalent set of items K generated from XPG contains the following conflicting items:

K:  A → α · ⟨Rdriveri, Rtesteri⟩ xi+1 δ
    B → β · ⟨Rdriveri, Rtesterj⟩ xi+1 γ
    ...

with α, β ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri, Rtesterj ∈ Rt, δ, γ ∈ L(SUFF). Thus, there should have been a shift/shift case conflict (or a goto/goto case conflict, if xi+1 is a non-terminal) generated by XPG, which would contradict the hypothesis.

4. (TR, ORDINARY)


This case occurs when G’ generates a set of items I’ containing one or more complete items like i1 and one or more items like i2 with the dot preceding a vsymbol TRi ∈ TR:

I’: A → α’ xi+1 ·                    (i1)
    B → β’ DRi xi+1 · TRi ai γ’      (i2)
    TRi → ·
    ...

with xi+1 ∈ N ∪ T, α’ ∈ L(PREF’(DRi)) ∪ {ε}, β’ ∈ L(PREF’), DRi ∈ DR, TRi ∈ TR, ai ∈ FICT, γ’ ∈ L(SUFF’). We must distinguish two cases, according to the two alternatives α’ = ε and α’ ≠ ε.

In the first case (α’ = ε), there must exist the following set of items J’ such that goto(J’, xi+1) = I’:

J’: A → · xi+1                       (j1)
    B → β’ DRi · xi+1 TRi ai γ’      (j2)
    ...

This means that there must exist the following item in J’ from which j1 is derived by closure:

    X → σ’ DRi · Y ρ’ τ’             (j0)

with σ’ ∈ L(PREF’), DRi ∈ DR, Y ∈ N, Y ⇒* Aλ’, λ’, τ’ ∈ L(SUFF’), ρ’ ∈ L((TRi ai)?), TRi ∈ TR, ai ∈ FICT. Notice that the same vsymbol DRi must precede the dot in j0 and j2. Moreover, Y cannot be the vsymbol S following SP, otherwise DRi = SP, and this is not possible since SP can only occur once in a set of items. Thus, there must exist the following set of items K’ such that goto(K’, DRi) = J’:

K’: X → σ’ · DRi Y ρ’ τ’             (k0)
    B → β’ · DRi xi+1 TRi ai γ’      (k2)
    ...

The map-equivalent set of items K generated from XPG contains the following items:

K:  X → σ · ⟨Rdriveri, ρ⟩ Y τ
    B → β · ⟨Rdriveri, Rtesteri⟩ xi+1 γ
    A → · xi+1
    ...

with σ, β ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt, ρ ∈ L(Rt?), τ, γ ∈ L(SUFF).


We can notice that the set of items K presents a shift/shift or goto/goto othercase conflict, depending on whether xi+1 is a terminal or a non-terminal vsymbol, which would contradict the hypothesis.

If α’ ≠ ε, write α’ = α” DRi. There must exist the following set of items J’ such that goto(J’, xi+1) = I’:

J’: A → α” DRi · xi+1                (j1)
    B → β’ DRi · xi+1 TRi ai γ’      (j2)
    ...

and the following set of items K’ such that goto(K’, DRi) = J’:

K’: A → α” · DRi xi+1                (k1)
    B → β’ · DRi xi+1 TRi ai γ’      (k2)
    ...

Thus, the map-equivalent set of items K generated from XPG contains the following items:

K:  A → α · ⟨Rdriveri, ⟩ xi+1
    B → β · ⟨Rdriveri, Rtesteri⟩ xi+1 γ
    ...

with α, β ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt, γ ∈ L(SUFF). Thus, also in this case there is a shift/shift or goto/goto othercase conflict.

5. (TR, DR)

In this case the set of items I’ generated from G’ must contain at least an item i1 with the dot preceding a vsymbol DRj ∈ DR and at least an item i2 with the dot preceding a vsymbol TRi ∈ TR. By the nature of the mapping rules, I’ must have the following structure:

I’: A → α’ xi · DRj xj ρ’ λ’         (i1)
    B → β’ DRi xi · TRi ai γ’        (i2)
    DRj → ·
    TRi → ·
    ...

with xi, xj ∈ N ∪ T, α’ ∈ L(PREF’(DRi)) ∪ {ε}, DRi, DRj ∈ DR, TRi ∈ TR, ai ∈ FICT, ρ’ ∈ L((TR a)?), β’ ∈ L(PREF’), λ’, γ’ ∈ L(SUFF’). Here too we must distinguish two cases, corresponding to the two alternatives α’ = ε and α’ ≠ ε.

CASE 1.

I’: A → xi · DRj xj ρ’ λ’            (i1)
    B → β’ DRi xi · TRi ai γ’        (i2)
    DRj → ·
    TRi → ·
    ...

There must exist the following set of items J’ such that goto(J’, xi) = I’:

J’: A → · xi DRj xj ρ’ λ’            (j1)
    B → β’ DRi · xi TRi ai γ’        (j2)
    ...

This means that there must exist the following item in J’, from which j1 is derived by closure:

    X → σ’ DRi · Y ρ’ τ’             (j0)

with σ’ ∈ L(PREF’), DRi ∈ DR, Y ∈ N, Y ⇒* Aλ’, λ’, τ’ ∈ L(SUFF’), ρ’ ∈ L((TRi ai)?), TRi ∈ TR, ai ∈ FICT. Again, Y can be the vsymbol S, but not the one following SP, otherwise DRi = SP, and this is not possible since SP can only occur once in an item set. Thus, there must exist the following set of items K’ such that goto(K’, DRi) = J’:

K’: X → σ’ · DRi Y ρ’ τ’             (k0)
    B → β’ · DRi xi TRi ai γ’        (k2)
    DRi → ·
    ...

Consequently, the map-equivalent set of items K generated from XPG contains the following items:

K:  X → σ · ⟨Rdriveri, ρ⟩ Y τ
    B → β · ⟨Rdriveri, Rtesteri⟩ xi γ
    A → · xi
    ...

with σ, β ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt, ρ ∈ L(Rt?), τ, γ ∈ L(SUFF). The derivation of the item A → · xi can be proven by arguments similar to those used in the (TR, ORDINARY) case. We can notice that the set of items K presents a shift/shift or goto/goto othercase conflict, which would contradict the hypothesis.

CASE 2.

I’: A → α’ xi · DRj xj ρ’ λ’         (i1)
    B → β’ DRi xi · TRi ai γ’        (i2)
    DRj → ·
    TRi → ·
    ...

with xi, xj ∈ N ∪ T, α’ = α” DRi, α”, β’ ∈ L(PREF’), DRi, DRj ∈ DR, TRi ∈ TR, ai ∈ FICT, ρ’ ∈ L((TR a)?), λ’, γ’ ∈ L(SUFF’). There must exist the following set of items J’ such that goto(J’, xi) = I’:

J’: A → α” DRi · xi DRj xj ρ’ λ’     (j1)
    B → β’ DRi · xi TRi ai γ’        (j2)
    ...

and the following set of items K’ such that goto(K’, DRi) = J’:

K’: A → α” · DRi xi DRj xj ρ’ λ’     (k1)
    B → β’ · DRi xi TRi ai γ’        (k2)
    DRi → ·
    ...

Consequently, the map-equivalent set of items K generated from XPG contains the following items:

K:  A → α · ⟨Rdriveri, ⟩ xi ⟨Rdriverj, ρ⟩ xj λ
    B → β · ⟨Rdriveri, Rtesteri⟩ xi γ
    ...

with α, β ∈ L(PREF), Rdriveri, Rdriverj ∈ Rd, Rtesteri ∈ Rt, ρ ∈ L(Rt?), λ, γ ∈ L(SUFF). Thus, also in this case K presents a shift/shift or goto/goto othercase conflict, which would contradict the hypothesis.

6. (DR, DR)

This case occurs when I’ contains two or more items with the dot preceding different DR vsymbols. As an example, let us consider the following two conflicting items i1 and i2:

I’: A → α’ xi · DRj xj ρ’ λ’         (i1)
    B → β’ xi · DRk xk θ’ γ’         (i2)
    DRj → ·
    DRk → ·
    ...

with xi, xj, xk ∈ N ∪ T, α’, β’ ∈ L(PREF’(DRi)) ∪ {ε}, DRj, DRk, DRi ∈ DR, ρ’, θ’ ∈ L((TR a)?), λ’, γ’ ∈ L(SUFF’). The map-equivalent set of items I generated from XPG contains the following items:

I:  A → α xi · ⟨Rdriverj, ρ⟩ xj λ
    B → β xi · ⟨Rdriverk, θ⟩ xk γ
    ...

with α, β ∈ L(PREF(⟨Rdriveri, ⟩)), Rdriveri, Rdriverj, Rdriverk ∈ Rd, ρ, θ ∈ L(Rt?), λ, γ ∈ L(SUFF). Thus, there would also be a positional conflict generated from XPG, which would contradict the hypothesis.

Now, we also prove that if the translation scheme SG=map(XPG) is LR(0), then P(SG) and P(XPG) are equivalent.

Theorem 4.2 If SG=map(XPG) is LR(0) then the parser built on SG recognizes the same set of visual sentences as the XpLR(0) parser built on XPG.

Proof: Let VS = {t1, t2, ..., tn} be the set of terminals forming an input visual sentence. By induction on the number of parsed terminals m (equal to n + nt, where nt is the number of vsymbols introduced during the parsing) we prove that the two parsers produce equivalent results after reading j ≤ m vsymbols. By equivalent results we mean that either they both successfully scan the sub-sentence ti1, ti2, ..., tij in the same order, or they both reject it after reading the same vsymbol tix, 1 ≤ x ≤ j. From Proposition 4.1, P(SG) might do this by performing more transitions than P(XPG), although these do not produce further effects, since they only simulate the effects of the driver and tester relations of XPG. Moreover, the hypothesis and Theorem 4.1 ensure that both XPG and G(SG) are conflict free.

Induction Base (n = 1)

Let us examine the steps executed by P(XPG) when scanning VS. We know that the starting set of items I0 derived from XPG must contain the item S’ → · S, which generates by closure in I0 a certain number of nonkernel items of type A → · xi β, β ∈ L(SUFF), which in turn might generate items of the same type within I0. The parser P(XPG) starts by reading ti1 from the input.
If I0 does not contain any nonkernel item of type A → · ti1 β, then the parser returns error and VS is rejected. Otherwise, there exist k > 0 nonkernel


items of type A → · ti1 β, so that P(XPG) performs a shift to a set of items Ix containing at least the k items of type A → ti1 · β. The parser P(SG) will have a similar behavior on VS. In particular, the starting set of items I0’ derived from SG must contain the following items:

I0’: S’ → · SP S,
     SP → ·

with SP ∈ DR. There are no more items in I0’ generated by closure, but there exists a set of items I1’ derived from SG which contains the item S’ → SP · S, and which generates by closure in I1’ the same number of nonkernel items generated from S’ → · S in I0. In particular, for each item in I0 of type A → · xi β, β ∈ L(SUFF), there exists in I1’ a corresponding item A → · xi β’, β’ ∈ L(SUFF’). Therefore, it is easy to verify that the terminal vsymbols starting the right hand sides (RHS) of items in I1’ are the same as those in I0. Moreover, I0’ contains no items starting with a terminal vsymbol, hence P(SG) will necessarily reduce with SP → · and will execute the associated positioning action to scan the vsymbol ti1. Then, it performs a transition from I0’ to I1’. If P(XPG) rejected VS, it means that ti1 does not start any RHS of items in I0 and, from what was said above, it cannot start any RHS of items in I1’. Thus, also P(SG) rejects VS and returns a parse error. Vice versa, if P(XPG) scanned ti1 successfully, it means that I1’ contains exactly k > 0 items of type A → · ti1 β’, as I0 does. Thus, also P(SG) performs a transition to a state Ix’ map-equivalent to Ix, containing k items of type A → ti1 · β’, and perhaps some empty productions of type “DR → ·”.

Induction Hypothesis/Step

If the two parsers P(XPG) and P(SG) produce equivalent results after reading j < m vsymbols, then they produce equivalent results after reading the (j+1)th vsymbol. Obviously, if both P(XPG) and P(SG) returned a parse error there would be no (j+1)th step. 
Vice versa, if they produced equivalent results reading j vsymbols, it means that they have reached map-equivalent sets of items Ij and Ij’. We distinguish two cases according to the different structures of Ij and Ij’.

Case 1. Ij contains one or more kernel items like

    A → α tij · ⟨Rdriveri, Rtesteri⟩ xi β,      (k1)

where α ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt ∪ {ε}, β ∈ L(SUFF), xi ∈ N ∪ T. From the induction hypothesis Ij’ contains similar kernel items like

    A → α’ tij · DRi xi TRi ai β’,              (k1’)
    DRi → ·

with α’ ∈ L(PREF’) ∪ {ε}, DRi ∈ DR, TRi ∈ TR ∪ {ε}, ai ∈ FICT ∪ {ε}, β’ ∈ L(SUFF’). For each xi ∈ N, xi will generate one or more items of type xi → · yi β, β ∈ L(SUFF), by closure in Ij. The application of the transitive closure to each xi ∈ N yields a certain number of terminal vsymbols b1, b2, ..., bn, each appearing as the first vsymbol of one or more RHSs of items in Ij. Thus, starting from the last scanned vsymbol tij, P(XPG) executes the driver relations in Rdriveri in order to scan the next vsymbol tij+1 from the input. If tij+1 does not coincide with any of the vsymbols xi ∈ T following Rdriveri in Ij or with any of the vsymbols b1, b2, ..., bn, then P(XPG) returns a parse error and VS is rejected. Otherwise, P(XPG) successfully scans tij+1 and performs a shift to a state Ix. The latter contains all the items of type X → σ tij+1 · δ, σ ∈ L(PREF) ∪ {ε}, δ ∈ L(SUFF), such that X → σ · tij+1 δ was in Ij, plus those they generate by closure.

The same situations will also have occurred in P(SG). In fact, from the assumption that there are no conflicts, P(SG) can only reduce with DRi → · in Ij’, and the execution of the associated action reproduces the same positioning effects of Rdriveri starting from the last scanned vsymbol tij. Thus, P(SG) will have moved to a set of items Ij” containing items of type A → α’ tij DRi · xi TRi ai β’. Thus, for each xi ∈ N, xi will generate by closure items similar to those generated from xi in Ij, with the same set of terminal vsymbols b1, b2, ..., bn starting their RHSs. Therefore, it is easy to see that if tij+1 was successfully scanned by P(XPG), then it will also be scanned by P(SG), which will move to a state Ix’ map-equivalent to Ix. Conversely, if P(XPG) returned a parse error then P(SG) also returns a parse error.

Case 2. Ij contains one complete kernel item like

    A → α tij · ,      (k2)

with α ∈ L(PREF) ∪ {ε}. From the induction hypothesis Ij’ contains a similar complete kernel item like

    A → α’ tij · ,     (k2’)

with α’ ∈ L(PREF’) ∪ {ε}. This means that P(XPG) performs a reduction and will return to a set of items Ih containing an item with the dot preceding the non-terminal A:

    X → σ · ρ A δ,     (k3)

with σ ∈ L(PREF) ∪ {ε}, ρ ∈ L((⟨Rd, Rt?⟩)?), δ ∈ L(SUFF), and σ = ε ⇔ ρ = ε. Analogously, P(SG) will reduce to A and will return to the following set of items Ih’:

    X → σ’ · A δ’,     (k3’)
    ...

with σ’ ∈ L(PREF’(DR)) ∪ {ε}, δ’ ∈ L(SUFF’), DR ∈ DR. It is easy to prove that both parsers perform a goto on A, moving to map-equivalent states Ih+1 and Ih+1’, respectively. If δ and δ’ are not empty we run into case 1, so we can apply the same arguments. If they are both empty and A is not S, then we are in case 2 again, so we apply the same reasoning until we run into case 1 or A becomes S. In the last case, both parsers check whether there are vsymbols in the input which have not been examined. Since from the inductive hypothesis both parsers have scanned the same vsymbols, in the same order, it means that P(XPG) accepts VS if and only if P(SG) accepts VS.

In the next subsection we consider the non-LR(0) translation scheme generated through mapping rules 1-4.
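
The item-set arguments used in the proofs above can be replayed mechanically on a small grammar. The sketch below is not the VLCC implementation: it is a plain LR(0) closure/goto computation over a hypothetical toy G’ (the productions, the names DR1 and DR2, and the helper functions are all assumptions made for illustration). It shows that the start item set contains only the empty SP production as a complete item, and that two different DR non-terminals after the same scanned vsymbol produce two complete items in one set, i.e. a reduce/reduce conflict of type (DR, DR).

```python
from collections import namedtuple

# An LR(0) item: production lhs -> rhs with the dot at position `dot`.
Item = namedtuple("Item", "lhs rhs dot")

# Hypothetical toy G' (an assumption, not taken from the thesis): the two S
# productions use different DR non-terminals after the same vsymbol 'a'.
GRAMMAR = {
    "S'": [("SP", "S")],
    "SP": [()],                       # empty production SP -> .
    "S": [("a", "DR1", "b"), ("a", "DR2", "c")],
    "DR1": [()],                      # empty production DR1 -> .
    "DR2": [()],                      # empty production DR2 -> .
}


def closure(items):
    """Standard LR(0) closure of a collection of items."""
    result = set(items)
    work = list(items)
    while work:
        item = work.pop()
        if item.dot < len(item.rhs):
            symbol = item.rhs[item.dot]
            for rhs in GRAMMAR.get(symbol, []):
                new_item = Item(symbol, rhs, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def goto(items, symbol):
    """Standard LR(0) goto: advance the dot over `symbol`, then close."""
    moved = [Item(i.lhs, i.rhs, i.dot + 1) for i in items
             if i.dot < len(i.rhs) and i.rhs[i.dot] == symbol]
    return closure(moved)


def complete_items(items):
    """Left-hand sides of the complete items (dot at the end) in a set."""
    return sorted(i.lhs for i in items if i.dot == len(i.rhs))


i0 = closure([Item("S'", ("SP", "S"), 0)])
i1 = goto(i0, "SP")
i2 = goto(i1, "a")
print(complete_items(i0))  # ['SP']
print(complete_items(i2))  # ['DR1', 'DR2']
```

A set with more than one complete item, such as i2 here, is exactly what an LR(0) table generator reports as a reduce/reduce conflict.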

4.3 Resolving conflicts in non-LR(0) translation schemes

A grammar G’=G(map(XPG)) may not be LR(0); hence P(SG) needs conflict-solving heuristics that preserve the equivalence between L(SG) and L(XPG). To this end, we have previously proved that conflicts in G’ are introduced by conflicts in XPG. In particular, we have proved that each conflict in XPG always yields one reduce/reduce conflict in G’. This is an important property because it enables us to develop conflict-solving heuristics in G’ that simulate the heuristics adopted on XPG (see section 3.3.1), so that L(XPG) is still equivalent to L(G’). In this way, we can use the parsing implementation technique presented in this paper even in those cases when the XPG grammar is not XpLR(0). As shown in Fig. 4.1, initially we ignore the non-LR problem and use the transformation algorithms of section 4.1 to generate what we call an intermediate translation scheme. Subsequently, we apply ad hoc transformation techniques to the intermediate grammar in order to eliminate the conflicts possibly caused by the original non-XpLR grammar XPG.
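Since every conflict of XPG surfaces in G’ as a reduce/reduce conflict, the states that need treatment can be found by looking for item sets in which more than one reduction is possible. The following Python fragment is only a minimal sketch of that check, not the implementation used in VLCC. The `Item` record, the `kind` helper, and the convention that DR and TR non-terminals have names starting with "DR" and "TR" are assumptions introduced here for illustration. The sketch reports, for one state, which kinds of productions (ordinary, DR, TR) clash, which is the distinction used in the case analysis that follows.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Item:
    """An LR(0) item lhs -> rhs with a dot position (hypothetical layout)."""
    lhs: str
    rhs: tuple
    dot: int

    def is_complete(self):
        # The dot is at the end, so the parser could reduce with this item.
        # Items of empty productions such as DRi -> ε are always complete.
        return self.dot == len(self.rhs)


def kind(item):
    """Classify a production by its left-hand side (naming is an assumption)."""
    if item.lhs.startswith("DR"):
        return "DR"
    if item.lhs.startswith("TR"):
        return "TR"
    return "ordinary"


def reduce_reduce_conflicts(state):
    """Return the sorted kind pairs of complete items that clash in a state."""
    complete = [it for it in state if it.is_complete()]
    return sorted({tuple(sorted((kind(a), kind(b))))
                   for a, b in combinations(complete, 2)})


# A state with one complete ordinary item and one item whose dot precedes
# DR1; closure adds the complete item DR1 -> · for the empty production.
state = [
    Item("A", ("x",), 1),
    Item("B", ("y", "DR1", "z"), 1),
    Item("DR1", (), 0),
]
print(reduce_reduce_conflicts(state))  # prints [('DR', 'ordinary')]
```

A state with at most one complete item yields an empty list, so only the states returned with a non-empty list would need one of the transformations described in this section.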

In order to devise conflict handling techniques for G’=map(XPG), we must identify the possible reduce/reduce conflicts on the grammar G’ and modify it according to resolution techniques preserving the property L(P(G’))=L(P(XPG)). As shown in Theorem 4.1, the possible reduce/reduce conflicts in a set of items are given by the possible combinations of ordinary, DR and TR productions. In the following we describe these conflicts and the transformation techniques that we use to eliminate them from the grammar. These techniques follow the heuristics defined in the XpLR methodology.

Case 1. Ordinary

This case occurs when the grammar G’ generates a set of items I’ containing two or more complete items. As an example, let us suppose that I’ contains two complete items i1 and i2:

I’:  A → α’ xi ·     (i1)
     B → β’ xi ·     (i2)
     ...

where xi ∈ N ∪ T and α’, β’ ∈ L(PREF’(DR)) ∪ {ε}. In the XpLR methodology this type of conflict is solved by choosing the conflicting production listed first in the grammar specification. This approach can be simulated by introducing the non-terminal ‘next1’, followed by new fictitious terminals, in the two conflicting productions:

     A → α’ xi next1 ai
     B → β’ xi next1 aj

Then we introduce the empty production

     next1 → ε { next_vsymbol = ak; }

where ak is ai if the production associated with (i1) precedes the production associated with (i2) in the XPG specification; otherwise ak is aj.

Case 2. DR

This case occurs when I’ contains two or more items with the dot preceding different DR vsymbols. As an example, let us consider the following two conflicting items i1 and i2:

I’:  A → α’ xi · DRj xj ρ λ’     (i1)
     B → β’ xi · DRk xk θ γ’     (i2)
     DRj → ·
     DRk → ·
     ...

where xi, xj, xk ∈ N ∪ T, α’, β’ ∈ L(PREF’(DRi)) ∪ {ε}, DRj, DRk, DRi ∈ DR, ρ, θ ∈ L((TR a)?), and λ’, γ’ ∈ L(SUFF’). In the XpLR methodology the set of items is partitioned according to the driver relations. This establishes an evaluation order of the driver relations. We can resolve this conflict by introducing the non-terminal ‘next1’ in the two conflicting productions:

     A → α’ xi next1 xj ρ λ’
     B → β’ xi next1 xk θ γ’

and this empty production:

     next1 → ε {
         let Rseq be the ordered sequence of parameters of Fetch_Vsymbol in the conflicting DR productions
         do {
             let R be the first element in Rseq;
             ip = Fetch_Vsymbol(R);
             if ip is not null then next_vsymbol = Dp[ip];
             else delete R from Rseq;
         } while (ip is null and Rseq is not empty);
         if (ip is null) { emit “syntax error”; exit; }
     }

Case 3. TR

This case occurs when I’ contains two or more items with the dot preceding different TR vsymbols. Thus, I’ will have the following structure:

I’:  A → α’ DRi xi+1 · TRi ai δ’     (i1)
     B → β’ DRi xi+1 · TRj aj γ’     (i2)
     TRi → ·
     TRj → ·
     ...

where xi+1 ∈ N ∪ T, α’, β’ ∈ L(PREF’), DRi ∈ DR, TRi, TRj ∈ TR, ai, aj ∈ FICT, and δ’, γ’ ∈ L(SUFF’). This conflict is generated from a shift/shift or goto/goto case conflict in the XPG grammar. The heuristic used by the XpLR methodology to eliminate these ambiguities is to order the tester relations and to execute the first shift or goto whose condition is true. This heuristic can be simulated by introducing a new non-terminal ‘next1’ and by defining it through the following empty production:

     next1 → ε {
         let Rseq be the ordered sequence of tester relations that are parameters of Test in the conflicting TR productions
         do {
             let R be the first element in Rseq;
             if Test(RELh, xi+1) is true for each RELh in R
             then { next_vsymbol = ak; exit; }    // where ak is the fictitious vsymbol following R
             else delete R from Rseq;
         } while (Rseq is not empty);
         emit “syntax error”; exit;               // reached only if no element of Rseq is satisfied
     }

Finally we introduce the non-terminal ‘next1’ in the two conflicting productions:

     A → α’ DRi xi+1 next1 ai δ’
     B → β’ DRi xi+1 next1 aj γ’

Case 4. (ordinary, DR)

This case occurs when G’ generates a set of items I’ that contains at least one complete item, and at least one item with the dot preceding a DR vsymbol on the RHS:

I’:  A → α’ ·                     (i1)
     B → β’ · DRi xi+1 ρ γ’       (i2)
     DRi → ·
     ...

where xi+1 ∈ N ∪ T, α’, β’ ∈ L(PREF’), ρ ∈ L((TR a)?), and γ’ ∈ L(SUFF’). This conflict is generated by a shift/reduce conflict in the XpLR parsing table. In this case, I is split by the function Partition into an ordered sequence of item sets on the basis of the driver relations, and the parser gives priority to the shift of xi+1. So we can tackle this conflict by introducing a new non-terminal ‘next1’ and by defining it through the following empty production:

     next1 → ε {
         ip = Fetch_Vsymbol(Rdriveri, xi+1);
         if ip is not null then next_vsymbol = Dp[ip];
         else next_vsymbol = ak;
     }

Then we introduce the non-terminal ‘next1’ in the two conflicting productions:

     A → α’ next1 ak
     B → β’ next1 xi+1 ρ γ’

Case 5. (Ordinary, TR)

This case occurs when G’ generates a set of items I’ containing one or more complete items and one or more items with the dot preceding a vsymbol TRi ∈ TR:

I’:  A → α xi+1 ·                    (i1)
     B → β’ DRi xi+1 · TRi ai γ’     (i2)
     TRi → ·
     ...

where xi+1 ∈ N ∪ T, α ∈ L(PREF’(DRi)) ∪ {ε}, β’ ∈ L(PREF’), DRi ∈ DR, ai ∈ FICT, and γ’ ∈ L(SUFF’). Moreover, let us suppose that

     TRi → ε {
         if Test(RELh, xi+1) is true for each RELh in Rtesteri
         then next_vsymbol = ai;
         else { emit “syntax error”; exit; }
     }

We must distinguish two cases, according to the following two alternatives: α = ε or α ≠ ε. In the first case, there is a shift/shift or goto/goto othercase conflict in XPG depending on whether xi+1 is a terminal or non-terminal vsymbol. We can tackle this conflict by introducing a new non-terminal ‘next1’ and by defining it through the following rule:

     next1 → ε {
         let j: X → σ’ DRi · Y ρ τ’ be the kernel item such that i1 is in Closure(j)
         if ρ = TRj a’ then
             if in the ordered sequence of conditioned actions, the condition verified by TRj precedes the condition verified by TRi then
                 if Test(RELh, xi+1) is true for each RELh in Rtesterj
                 then { next_vsymbol = aj; exit; }
         if Test(RELh, xi+1) is true for each RELh in Rtesteri
         then next_vsymbol = ai;
         else { emit “syntax error”; exit; }
     }

Then we introduce the non-terminal ‘next1’ and a fictitious vsymbol aj in the two conflicting productions:

     A → xi+1 next1 aj
     B → β’ DRi xi+1 next1 ai γ’

Also in the case α ≠ ε there is a shift/shift or goto/goto othercase conflict. By following similar arguments as above, we can tackle this conflict by introducing a new non-terminal ‘next1’ and by defining it through the following empty production:

     next1 → ε {
         if Test(RELh, xi+1) is true for each RELh in Rtesteri
         then next_vsymbol = ai;
         else next_vsymbol = aj;
     }

Then we introduce the non-terminal ‘next1’ and a fictitious vsymbol aj in the two conflicting productions:

     A → α xi+1 next1 aj
     B → β’ DRi xi+1 next1 ai γ’

Case 6. (TR, DR)

In this case the set of items I’ generated from G’ must contain at least one item i1 with the dot preceding a vsymbol DRj ∈ DR and at least one item i2 with the dot preceding a vsymbol TRi ∈ TR:

I’:  A → α xi · DRj xj ρ λ’          (i1)
     B → β’ DRi xi · TRi ai γ’       (i2)
     DRj → ·
     TRi → ·
     ...

where xi, xj ∈ N ∪ T, α ∈ L(PREF’(DRi)) ∪ {ε}, DRi, DRj ∈ DR, TRi ∈ TR, ai ∈ FICT, ρ ∈ L((TR a)?), β’ ∈ L(PREF’), and λ’, γ’ ∈ L(SUFF’). This conflict is generated by a shift/shift or goto/goto case conflict in XPG. We can tackle this conflict by introducing a new non-terminal ‘next1’ and defining it through the following empty production:

     next1 → ε {
         if Test(RELh, xi) is true for each RELh in Rtesteri
         then next_vsymbol = ai;
         else {
             ip = Fetch_Vsymbol(Rdriveri, xj);
             if ip is not null then next_vsymbol = Dp[ip];
             else { emit “syntax error”; exit; }
         }
     }

Then we introduce the non-terminal ‘next1’ in the two conflicting productions:

     A → α xi next1 xj ρ λ’
     B → β’ DRi xi next1 ai γ’

Conflict resolution algorithm

Fig. 4.3 shows the algorithm for the resolution of reduce/reduce conflicts, which uses the approach proposed in the previous classification of conflicts.

Input: A translation scheme and the corresponding LR(0) parsing table T with reduce/reduce conflicts.
Output: A translation scheme with no reduce/reduce conflicts.

1. Repeat the following steps until T has no reduce/reduce conflicts.

2. For each state s in T containing at least two complete items, create the sequence of the left-hand-side non-terminals appearing in the complete items, listing first the non-terminals in TR, then those in DR, and then those in N (the sequence has at least two elements).

3. For each state s in T create a graph node. For each pair of states s1 and s2 in T create an arc if the sequences of s1 and s2 contain at least a common complete item.

4. Consider the set of connected components of the graph and, for each component, let set_of_nt be the set of non-terminals appearing in the sequence of a state s that is a node of the component.

5. For each connected component, and for each non-terminal in its set_of_nt: [...]