OMCCp: A MetaModelica Based Parser Generator Applied to ... - DiVA

Institutionen för datavetenskap
Department of Computer and Information Science

Master’s thesis

OMCCp: A MetaModelica Based Parser Generator Applied to Modelica by

Edgar Alonso Lopez-Rojas

LIU-IDA/LITH-EX-A–11/019–SE 2011-05-31

Linköpings universitet SE-581 83 Linköping, Sweden


Supervisors: Martin Sjölund and Mohsen Torabzadeh-Tari, Dept. of Computer and Information Science

Examiner: Prof. Peter Fritzson, Dept. of Computer and Information Science


Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances. The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/

© Edgar Alonso Lopez-Rojas

To Isabella, my new project in life

Abstract

The OpenModelica Compiler-Compiler parser generator (OMCCp) is an LALR(1) parser generator implemented in the MetaModelica language, with parsing tables generated by the tools Flex and GNU Bison. The code generated for the parser is in MetaModelica 2.0, which is the implementation language of the OpenModelica compiler and an extension of the Modelica 3.2 language. OMCCp takes as input an LALR(1) grammar that specifies the Modelica language. The generated parser can be used inside the OpenModelica Compiler (OMC) as a replacement for the current parser, which is generated by the tool ANTLR from an LL(k) Modelica grammar. This report explains the design and implementation of this novel lexer and parser generator called OMCCp. Modelica and its extension MetaModelica are both languages used in the OpenModelica environment. Modelica is an object-oriented, equation-based language for modeling and simulation.


Acknowledgements

It is an honor for me to be able to culminate this work with the guidance of remarkable computer scientists. This thesis would not have been possible without the clear vision of my examiner, Professor Peter Fritzson. As the director of the Open Source Modelica Consortium (OSMC) he presented this great opportunity to me. Together with him, I have to thank my supervisors Martin Sjölund and Mohsen Torabzadeh-Tari. Martin has made his support and guidance available in more ways than I can count, and Mohsen has always kept track of my progress and helped me with the difficulties I found. I am pleased to be part of, learn from and contribute to this great open source project called OpenModelica. I also thank IDA (Department of Computer and Information Science) for offering its premises and resources for my daily work.

I cannot forget to thank my family: my parents Jesus and Soledad, for supporting me since the beginning of this project to become a Master in Computer Science, and my fiancée Helena, who has encouraged me all the time to give my best in every step of this journey. I am delighted to include my future daughter Isabella here; she has been my biggest motivation to complete this work before the day she steps into this world for the first time.

Last, but not least, I thank my financial sponsors from Colombia: Fundacion Colfuturo¹ and EAFIT University². They believed in my talent and provided the financial resources to achieve this goal.

¹ http://www.colfuturo.org/
² http://www.eafit.edu.co/


Contents

1 Introduction
  1.1 Background
  1.2 Project Goal
  1.3 Methodology
  1.4 Intended Readers
  1.5 Thesis Outline

2 Theoretical Background
  2.1 Compilers
    2.1.1 Fundamentals
    2.1.2 Lexical Analysis
    2.1.3 Syntax Analysis
    2.1.4 Parser LALR(1)
  2.2 Error Handling in Syntax Analysis
    2.2.1 Error Recovery
    2.2.2 Error Messages
  2.3 The OpenModelica Project
    2.3.1 The Modelica Language
    2.3.2 MetaModelica extension
    2.3.3 Abstract Syntax Tree - AST

3 Existing Technologies
  3.1 OpenModelica Compiler (OMC)
    3.1.1 Architecture and Components
    3.1.2 ANTLR
    3.1.3 Current state
  3.2 Flex
    3.2.1 Input file lexer.l
    3.2.2 Output file lexer.c
  3.3 GNU Bison
    3.3.1 Input file parser.y
    3.3.2 Output file parser.c

4 Implementation
  4.1 Proposed Solution
  4.2 OMCCp Design
    4.2.1 Lexical Analyser
    4.2.2 Syntax Analyser
  4.3 OpenModelica Compiler-Compiler Parser (OMCCp)
    4.3.1 Lexer Generator
    4.3.2 Parser Generator
  4.4 Error handling
    4.4.1 Error recovery
    4.4.2 Error messages
  4.5 Integration OMC

5 Discussion
  5.1 Analysis of Results
    5.1.1 Lexer and Parser
    5.1.2 OMCCp Construction
    5.1.3 Implementation of a subset of Modelica and MetaModelica grammar
  5.2 OpenModelica Compiler
  5.3 Limitations

6 Related Work
  6.1 OpenModelica Development
  6.2 Compiler-Compiler Construction

7 Conclusions
  7.1 Accomplishments
  7.2 Future Work

Bibliography

Appendices

A OMC Compiler Commands
  A.1 Parameters - MetaModelica Parser Generator
    A.1.1 Generate compilerName
    A.1.2 Run compilerName, fileName
  A.2 OMC Commands

B Lexer Generator
  B.1 Lexer.mo
  B.2 LexerGenerator.mo
  B.3 LexerCode.tmo
  B.4 Types.mo

C Parser Generator
  C.1 Parser.mo
  C.2 ParserGenerator.mo
  C.3 ParseCode.tmo

D Sample Input
  D.1 lexer10.l
  D.2 parser10.y

E Sample Output
  E.1 ParseTable10.mo
  E.2 ParseCode10.mo
  E.3 Token10.mo
  E.4 LexTable10.mo
  E.5 LexerCode10.mo

F Modelica Grammar
  F.1 lexerModelica.l
  F.2 parserModelica.y

G Additional Files
  G.1 SCRIPT.mos
  G.2 Main.mo

Glossary

Acronyms

List of Figures

2.1 Compiler Phases
2.2 Compiler Front-End
2.3 Parser components
2.4 OpenModelica Environment [Fritzson et al., 2009]
3.1 OMC simplified overall structure [Fritzson et al., 2009]
3.2 OMC Language Grammars
4.1 OMCCp (OpenModelica Compiler - Compiler) Lexer and Parser Generator
4.2 OMCCp Lexer and Parser Generator Architecture Design
4.3 OMC-Lexer design
4.4 OMC-Parser design
4.5 OMC-Parser LALR(1)
5.1 OMCCp - Time Parsing

List of Tables

2.1 LR(1) parsing table [Aho et al., 2006]
2.2 LR(1) parsing table rearranged [Aho et al., 2006]
2.3 LALR(1) parsing table [Aho et al., 2006]
5.1 OMCCp Files Implementation
5.2 Test Suite - Compiler
5.3 OMCCp - Time Parsing

Listings

2.1 MetaModelica uniontype
2.2 MetaModelica matchcontinue
2.3 MetaModelica list
3.1 ANTLR grammar file structure
3.2 Flex file structure
3.3 Bison file structure
4.1 Lexer.mo function scan
4.2 Parser.mo function parse
4.3 MultiTypedStack AstStack
4.4 ParseCode.mo case reduce action
4.5 ParseCode.mo function getAST
4.6 Modifications in the Bison Epilogue
4.7 Modifications in the Rules section in Bison
4.8 List of semantic values of tokens
4.9 Constants for error handling
4.10 Custom error messages in OMCCp
4.11 Error messages in OMCCp
4.12 program.mo with errors
4.13 Parser.mo original function
4.14 Parser.mo modified function
A.1 Compile Flex and Bison
A.2 OMCC.mos
A.3 OMCCP Command
A.4 SCRIPT.mos debug mode
A.5 OMCCP debug mode
B.1 Lexer.mo
B.2 LexerGenerator.mo
B.3 LexerCode.tmo
B.4 Types.mo
C.1 Parser.mo
C.2 ParserGenerator.mo
C.3 ParseCode.tmo
D.1 lexer10.l
D.2 parser10.y
E.1 ParseTable10.mo
E.2 ParseCode10.mo
E.3 Token10.mo
E.4 LexTable10.mo
E.5 LexerCode10.mo
F.1 lexerModelica.l
F.2 parserModelica.y
G.1 SCRIPT.mos
G.2 Main.mo

Chapter 1

Introduction

1.1 Background

The OpenModelica project develops a modeling and simulation environment based on the Modelica language [Fritzson, 2004]. The effort is supported by the Open Source Modelica Consortium (OSMC). It uses the OpenModelica Compiler (OMC) [Fritzson et al., 2009] to generate C, C++ or C# code that runs simulations written in the Modelica language.

OpenModelica currently uses the tool ANother Tool for Language Recognition (ANTLR) to generate the parser for the OMC. The work presented in this master’s thesis offers an alternative to the ANTLR parser. We present a novel Compiler-Compiler implemented completely in MetaModelica. MetaModelica is an extension of the Modelica language intended for modeling the semantics of languages. One large example is the modeling of the whole Modelica language together with its MetaModelica extensions in the bootstrapped version of the OpenModelica compiler [Sjölund et al., 2011].

The ANTLR parser generator [Parr and Quong, 1995], which has been used in the OpenModelica project for several years, has well-known disadvantages, including memory overhead, poor error handling, lack of type checking, and not generating MetaModelica code for building the Abstract Syntax Tree (AST). Since the AST nodes are initially generated in C (for later conversion into MetaModelica) without strong type checking, small errors in the semantic actions in the grammar are not detected at generation time and can give rise to hard-to-find bugs in the generated C code. When the semantic actions can be specified in MetaModelica and the AST builder is generated in MetaModelica, this source of errors can be completely eliminated.

Currently, ANTLR-generated parsers connect with OMC through an external C interface. ANTLR also builds an integrated Lexer and Parser that sit behind a considerable number of libraries. These libraries handle the complexity of the syntax analysis process in the compiler as a black box. ANTLR is only suitable for parsing LL grammars.

1.2 Project Goal

The goal of this master’s thesis is to write a parser generator in the MetaModelica language that can replace the current ANTLR-generated parser and that generates MetaModelica code instead of C code. The results expected from this thesis are:

• A Lexer and a Parser for the Modelica grammar, including its MetaModelica extension, that output the Abstract Syntax Tree (AST) for the language processed.
• A Lexer and Parser generator written in the MetaModelica language.
• Improvements in the error handling messages compared with ANTLR, specifically the messages that give error correction hints for malformed syntax.

1.3 Methodology

The methodology used for the construction of the OpenModelica Compiler-Compiler parser generator (OMCCp) is based on a literature study of compiler construction techniques. Various projects offer lexer and parser generators, but none exists for the Modelica language.

A literature review is the starting point for this project on compiler construction. Various publications from the OSMC are available; they contribute to a better understanding of the OpenModelica project. Besides the literature reviewed, we draw on the experience of the supervisor Martin Sjölund, who built the first bootstrapping compiler for the Modelica language [Sjölund et al., 2011]. Various papers and books by the examiner are also available. The examiner has a clear vision of the next steps in the development of the compiler, owing to his involvement in the project since it started several years ago.

There are exercises available for learning MetaModelica, including online courses. The exercises are important for familiarisation with the MetaModelica language. A MetaModelica guide is also provided that covers the most common built-in functions and the limitations of the language.

After the literature review, existing technologies that can support the project are addressed. The techniques they use and their benefits are reviewed. This guides the architectural decisions for the implementation of the parser and lexer generator.

Finally, a subset of the Modelica grammar is implemented for the parser generator. This completes the project and demonstrates the validity of the proposed solution.

1.4 Intended Readers

The intended reader of this document is someone who wants to understand more about compiler construction, and more specifically the syntax analysis phase of the OpenModelica Compiler. This document contains important information for the OpenModelica developer who wants to work on the design and construction of the OMC.

1.5 Thesis Outline

This thesis gives an overview of the OpenModelica project and the architecture of the OpenModelica compiler. Chapter 2, Theoretical Background, familiarises the reader with the topic of compiler construction, more specifically with Lexical Analysis, Syntax Analysis and basic concepts about grammars. Chapter 3 covers the existing technologies as a basis for understanding the implementation in Chapter 4. Finally, Chapter 5, Discussion, and Chapter 6, Related Work, explain different parts of the project and analyse the results of the implementation.

The conclusions review the achievement of the goals and analyse the implemented solution. The section on future work gives the reader who intends to continue this work more information about desired extensions and improvements to this project.

The appendices contain the source code of the entire project, including the sample files generated from exercise 10 of the MetaModelica exercises available in [Fritzson and Pop, 2011a,b]. A large subset of the Modelica 3.2 grammar [Modelica-Association, 2010] is also included. It was used to demonstrate the usability of the parser generator.

Chapter 2

Theoretical Background

“The world as we know it depends on programming languages” Aho et al. [2006]

Implementing the solution for this thesis required a strong knowledge of compiler construction theory. To understand this project, the reader must be familiar with some of the fundamental terms and basic algorithms used for the construction of the lexical analyser and the parser. This chapter covers the main topics of compiler construction that are used in the implementation of the solution presented in Chapter 4. The next part of this chapter addresses an important topic for this project: the improvement of error handling during the parsing phase of the compiler. The last part presents an overview of the current OpenModelica project, including the Modelica and MetaModelica languages and the OMC.

2.1 Compilers

Aho et al. [2006] is essential reading for anyone who intends to understand the concepts of compilers. Most of the compiler theory covered in this part is based on that book. Other sources, such as Kakde [2002] and Terry [2000], have also been reviewed and are cited in the relevant subsections. This section gives the reader an introduction to the compiler terms and techniques used during the design and development of this project.

2.1.1 Fundamentals

Programming languages rely strongly on compilers for their evolution and widespread use. These languages exist because it is impractical for developers to build complex systems in machine language, which consists only of sequences of binary instructions. In a more general view, however, a compiler is a software tool that translates one language into another.

Figure 2.1: Compiler Phases

If we see a compiler as a process, we can identify the source language as the input and the target language as the output of this process. For example, for a language such as C, the input language is C code and the output language is machine code for a specific architecture and operating system.

There are several types of compilers in use today, and the classification depends on their purpose. We distinguish between native compilers, cross-compilers, interpreters and source-to-source compilers (translators). Native compilers generate machine-specific code (binary code). Cross-compilers also generate machine-specific code, but for a different machine from the one they are running on. Interpreters for languages are similar to the Java virtual machine (JVM)¹. They receive two parameters as input: the source program and the input for that program. The interpreter simulates the execution of the compiled source program in machine-language code and outputs the result that the source program is expected to produce for the given input.

This report addresses source-to-source compilers, which are commonly used to translate one high-level language, such as Modelica, into another high-level language, such as C. This technique is common because of the difficulty of generating low-level code such as assembler or binary code directly.

The structure of a compiler is shown in Figure 2.1. Inside a compiler two main parts can be recognised: the Analysis (Front-End) and the Synthesis (Back-End). The Analysis phase is handled by the Front-End of the compiler. The Front-End is divided into three steps: Lexical Analysis, Syntax Analysis and Semantic Analysis. These steps are presented in Figure 2.2. The Lexical Analysis task is performed by a component called the Lexer. The main function of the Lexer is to take the source code as input and to group sequences of characters into units called tokens. The Front-End is the part of the compiler on which this implementation focuses.

During the analysis phase, the source code is processed by the Lexical Analyser, the Syntax Analyser and the Semantic Analyser to output an intermediate representation of the input code called the Abstract Syntax Tree (AST).

¹ http://www.java.com/

[Figure 2.2: Compiler Front-End. Labels: input program, Lexer, tokens, Parser, abstract syntax tree, Intermediate code generator, three-address code, Symbol Table]

2.1.2 Lexical Analysis

Lexical Analysis, also called scanning, receives the source code as a character stream. It identifies the tokens specified by a language, which simplifies the work of the next phase of the compiler. The tokens of a programming language are often specified by Regular Expressions. A Lexer is a program that runs a Finite Automaton which recognises a regular language. Regular languages, in turn, are described by regular expressions.

Lexical Analysis is the first part of the compiler. It reduces the complexity of recognising a complete grammar by transforming the source code into a list of tokens. In the next step of the compiler, the syntax analysis uses only the tokens to accept or reject the source code provided.

To explain how Lexical Analysis works, we first introduce the basic concepts of Finite Automata and Regular Languages. We then describe what a Lexer specifically does.

Finite Automata and Regular Languages

Sipser [2005] presents the use of Finite Automata, also known as Finite State Machines, to recognise regular languages. He defines a Finite Automaton as a collection of states (Q), an alphabet (Σ), a transition function (δ : Q × Σ → Q), a start state (q0) and a set of accept states.
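As a minimal sketch of this definition, the five components can be written down directly as data. The automaton below is a hypothetical example, not taken from the thesis: it accepts identifiers consisting of a letter followed by letters or digits. The names DFA, char_class, accepts and IDENT_DFA are illustrative assumptions, and missing entries in the transition table stand for an implicit rejecting state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DFA:
    states: frozenset      # Q
    alphabet: frozenset    # Sigma (here: character classes, an assumption)
    delta: dict            # transition function: (state, symbol) -> state
    start: str             # q0
    accepting: frozenset   # set of accept states


def char_class(ch):
    """Map a raw character to a symbol of the alphabet."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


def accepts(dfa, text):
    """Run the DFA over text; a missing transition rejects the input."""
    state = dfa.start
    for ch in text:
        key = (state, char_class(ch))
        if key not in dfa.delta:
            return False
        state = dfa.delta[key]
    return state in dfa.accepting


# Hypothetical automaton for identifiers: letter (letter | digit)*
IDENT_DFA = DFA(
    states=frozenset({"start", "ident"}),
    alphabet=frozenset({"letter", "digit", "other"}),
    delta={
        ("start", "letter"): "ident",
        ("ident", "letter"): "ident",
        ("ident", "digit"): "ident",
    },
    start="start",
    accepting=frozenset({"ident"}),
)
```

With these definitions, accepts(IDENT_DFA, "x1") holds, while "1x" and the empty string are rejected. Because each (state, symbol) pair has at most one successor, the automaton never has to choose between paths.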


State diagrams are widely used to describe briefly how a Finite Automaton works. There are two types of Finite Automata: the Deterministic Finite Automaton (DFA) and the Non-Deterministic Finite Automaton (NFA).

The Lexer

For the construction of the Lexer a DFA is preferred. However, an NFA can also be converted into a DFA, and a lexer can also simulate the non-deterministic behaviour of an NFA. The main reason for using a DFA is that we want a transition function δ that lets the Lexer follow exactly one path for each input character in the character stream of the source code.

All the regular expressions of the set of tokens are combined during the construction of a Lexer. It often happens that one sequence of characters can be recognised as two or more different tokens. Therefore, the lexer must have extra rules that prioritise longer strings over shorter ones. The rules can also be ordered by priority so that the first matching rule is accepted, which avoids ambiguity.

The Lexical Analysis phase filters out elements that are useful only to the programmer, such as comments and the different kinds of spacing and indentation in the code. This simplifies the input by converting all the characters into a list of tokens. If the Parser had to deal with this task, the number of terminal tokens would increase, making the rules more complex and decreasing the overall performance of the compiler.

The compiler gains performance when the Lexical Analysis is kept separate from the Syntax Analysis. This gain comes from applying specialised techniques to the handling of the character stream, such as buffering to read a certain number of characters at a time.

A structure called the Symbol Table is used to store all the identifiers with their names or values. This structure avoids duplication and improves efficiency through all the phases of the compiler, as represented in Figure 2.2. A token is usually represented by a tuple consisting of an identifier of the token and a reference to the Symbol Table, e.g. TOKEN<IDENT, x>, where IDENT is the identifier of the token and x is the value of the identifier found by the Lexer.
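The scanning rules described in this subsection (longest match first, rule order as tie-breaker, filtering of white space and comments, and identifiers stored once in a symbol table) can be sketched as follows. This is an illustration only: it uses Python's re module instead of a generated DFA, and the token names, the rule set and the function tokenize are assumptions made for the example, not the OMCCp lexer.

```python
import re

# Rules in priority order: on equal match length, the earlier rule wins.
RULES = [
    ("SKIP",   re.compile(r"[ \t\r\n]+|//[^\n]*")),  # white space, line comments
    ("IF",     re.compile(r"if")),
    ("IDENT",  re.compile(r"[A-Za-z_][A-Za-z_0-9]*")),
    ("NUMBER", re.compile(r"[0-9]+")),
    ("ASSIGN", re.compile(r":=")),
    ("COLON",  re.compile(r":")),
]


def tokenize(source):
    """Return (tokens, symbol_table) for source using longest-match scanning."""
    tokens, symbols, pos = [], [], 0
    while pos < len(source):
        best_name, best_text = None, ""
        for name, pattern in RULES:
            m = pattern.match(source, pos)
            # Only a strictly longer match replaces the current best,
            # so ties are resolved in favour of the earlier rule.
            if m and len(m.group()) > len(best_text):
                best_name, best_text = name, m.group()
        if best_name is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos += len(best_text)
        if best_name == "SKIP":
            continue  # filtered out: the parser never sees it
        if best_name == "IDENT":
            if best_text not in symbols:
                symbols.append(best_text)  # each identifier is stored once
            # The token refers to the symbol table entry by index.
            tokens.append((best_name, symbols.index(best_text)))
        else:
            tokens.append((best_name, best_text))
    return tokens, symbols
```

For the input "iffy := 10", the longest-match rule yields an IDENT token for "iffy" rather than the keyword IF followed by "fy", while a bare "if" is reported as IF because that rule is listed before IDENT.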


Flex, a Lexer Generator

Some programs automate the work of constructing the transition rules that identify the tokens for a Lexer. Flex is one example of such a Lexer Generator. It is based on the Lexer generator Lex. It takes as input a file that defines the rules for the recognition of the tokens. These rules are defined using regular expressions. Flex also allows the developer to specify the token returned for each matching pattern. Some tokens, such as white space and line feeds, are ignored as explained above. The next chapter covers the existing technologies used for this report and explains Flex in technical detail.

2.1.3 Syntax Analysis

The Syntax Analysis phase is performed by a program called the Parser. The Parser requires a more powerful language than regular expressions to specify the constructions of the programming language. The rules are commonly expressed using Context Free Grammars (CFG). A CFG can be recognised by a Push-Down Automaton.

Push-Down Automata and Context Free Languages

Sipser [2005] defines a Context Free Grammar (CFG) as a 4-tuple (V, Σ, R, S), where V is the set of variables, Σ is the set of terminals, R is the set of rules and S is the start variable.

A Push-Down Automaton (PDA) reads a sequence of input tokens. It uses the tokens and a stack to decide its next state and action; the action can be to reduce symbols from the stack or to push a state or token onto the stack. It keeps running until it reaches an accept state and then ends. Several problematic situations can occur, including an infinite loop of the machine, which is why the grammar should be constructed in a way that avoids these problems. As with finite automata, there are PDAs that are deterministic, and those are the ones we consider for building the Parser.
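To make the role of the stack concrete, the following sketch shows a deterministic push-down automaton for a small context-free language that no regular expression can describe: balanced parentheses, generated by the grammar S → ( S ) S | ε. The grammar, the bottom-of-stack marker and the function name pda_accepts are assumptions chosen for this illustration and do not come from the thesis.

```python
def pda_accepts(tokens):
    """Deterministic PDA for the grammar S -> ( S ) S | epsilon."""
    stack = ["$"]              # bottom-of-stack marker
    for tok in tokens:
        if tok == "(":
            stack.append("(")  # push: remember the open parenthesis
        elif tok == ")":
            if stack[-1] != "(":
                return False   # nothing to match: reject
            stack.pop()        # pop: the pair is complete
        else:
            return False       # symbol outside the alphabet
    return stack == ["$"]      # accept only if every "(" was matched
```

A finite automaton cannot count arbitrarily deep nesting, whereas here the stack grows with the nesting depth, which is the extra power that syntax analysis needs.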


The Parser

The Parser is in charge of determining whether the source code that has been tokenised by the Lexer is constructed according to the rules of the grammar. To do this, it executes a PDA that outputs “accept” if the input is a valid construction of the grammar; otherwise it outputs an error message identifying the token that does not fit the construction rules.

The work done by the Lexer in the first phase of the compiler allows the Parser to ignore elements such as white space and line feeds and to treat all the tokens as terminals of the grammar that describes the language. This simplifies the rules of the Parser and makes syntax analysis more efficient.

The Parser validates the list of tokens received from the Lexer against the rules of the grammar. At the same time, a Parser can execute an additional task: the construction of a structure called the Abstract Syntax Tree (AST). The AST is a tree structure that represents the entire source code. It is the input for the compiler Back-End, which uses the AST for optimisation and for the generation of machine-specific code.

A Parser is composed of a predictive table, a stack of states, a list of tokens as input and a parsing algorithm that runs over the list of tokens. Figure 2.3 shows these components and their interactions. A Parser uses the predictive tables, also called parsing tables, to determine the next action and the new state of the machine. The next state is looked up in the parsing tables using the lookahead token and the current state on top of the stack.

[Figure 2.3: Parser components. Labels: Tokens, lookahead, PARSER, AST, State Stack, Parsing Tables]

Parsers are commonly classified by the algorithm used for the parsing operation. There are three known types: Top-Down Parsers, Bottom-Up Parsers and Universal Parsers. For programming languages, however, only the first two are used because of the inefficiency of Universal Parsers. A Top-Down Parser builds the parse tree from the top to the bottom. A Bottom-Up Parser works in the opposite direction.

Top-Down Parsers only work for Left-to-right, Leftmost-derivation (LL) grammars. An LL(k) Parser is a top-down parser with k lookahead tokens. LL(k) Parsers use a predictive table to decide the next state.

Bottom-Up Parsers are based on Left-to-right, Rightmost-derivation (LR) grammars. Knuth [1965] first introduced the concept of LR parsing. The most common parsers are LR(k), Simple LR (SLR) and Look-Ahead LR (LALR) parsers. The LALR(1) parser uses a simplification of the parsing tables used by the LR(1) parser.

In general, a Bottom-Up Parser builds the AST by performing two types of task: Shift and Reduce. Shift stores the variable or terminal symbol found while the machine goes through the list of tokens. It uses a table called the ACTION table, which contains all the terminals and the rules for calculating the next state. The table called the GOTO table is also used for calculating the next state. When the result is calculated, the values are pushed onto the state stack. Reduce pops a certain number of values from the stack and then pushes a new value, also using the GOTO table. While reducing, an LALR parser can build up the AST and push the new value onto another stack called the Semantic Stack, which follows the same shift and reduce steps performed by the algorithm.

Blasband [2001] made an effort to parse grammars that do not fit perfectly into the classification of LL and LALR grammars. In this report we look only briefly at Top-Down Parsers. We are more interested in the LALR(1) Bottom-Up Parser, which is the type of parser used in this implementation. The LALR(1) parser is explained in more detail in the next section.

2.1.4 Parser LALR(1)

The LALR(k) parsers were first introduced by DeRemer [1969]. They are the parsers most commonly used for programming languages, due to their speed, the size of their parsing tables and their advantages over their predecessors, the LR(0) and SLR parsers. Kakde [2002] and Aho et al. [2006] explain very well how the bottom-up algorithm works. Here we are interested in understanding the basic principles of the LALR(1) algorithm.

Parsing tables

There are two tables in an LALR parser: the first one is the ACTION table, the second is the GOTO table. The theoretical construction of these tables can be found in almost any compiler literature, such as Aho et al. [2006], Kakde [2002] and Terry [2000]. There are two methods for constructing LALR(1) parsing tables from the LR(1) parsing tables. The first one, the easy but space-consuming method, is presented here. The other method differs from the former by merging the common rules at every step of the construction of the LR(1) table, which significantly reduces the number of states that have to be kept.

We explain the construction of the LALR(1) parsing tables and the content of the LR(1) tables through an example. Let us take this sample grammar from Aho et al. [2006].

Simple Grammar Sample

    (0) S' → S
    (1) S → C C
    (2) C → c C
    (3) C → d

From this grammar the LR(1) parsing table 2.1 is constructed according to the algorithm presented in Aho et al. [2006]. The underlying collection of states is called the canonical LR(1) collection. The symbol r in the table identifies a REDUCE operation, and the number that follows it is the number of the production to reduce by. The symbol s identifies a SHIFT operation, and the number that follows it is the next state. The keyword acc identifies the acceptance state.

Table 2.1: LR(1) parsing table [Aho et al., 2006]

         ACTION               GOTO
state    c      d      $      S     C
0        s3     s4            1     2
1                      acc
2        s6     s7                  5
3        s3     s4                  8
4        r3     r3
5                      r1
6        s6     s7                  9
7                      r3
8        r2     r2
9                      r2

From Table 2.1 we can observe that about half of the entries in the table are blank. The LR(1) parsing tables have the disadvantage of growing considerably large, even for small grammars, due to the redundancy of productions for similar states with different lookahead symbols. If we rearrange the rows as presented in Table 2.2, we can notice similarities between the entries of some states for different lookahead symbols (states 3 and 6, 4 and 7, and 8 and 9). These are states that share the same core productions for different lookahead symbols.

Table 2.2: LR(1) parsing table rearranged [Aho et al., 2006]

         ACTION               GOTO
state    c      d      $      S     C
0        s3     s4            1     2
1                      acc
2        s6     s7                  5
3        s3     s4                  8
6        s6     s7                  9
4        r3     r3
7                      r3
5                      r1
8        r2     r2
9                      r2

The LALR(1) parsing table 2.3 is constructed from the one above, first by identifying the sets that share a common core and then by replacing each group of such sets with their union. For a better understanding of this construction the reader can consult the literature [Aho et al., 2006, Section 4.7.4].

Table 2.3: LALR(1) parsing table [Aho et al., 2006]

         ACTION               GOTO
state    c      d      $      S     C
0        s36    s47           1     2
1                      acc
2        s36    s47                 5
36       s36    s47                 89
47       r3     r3     r3
5                      r1
89       r2     r2     r2
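The step from Table 2.1 to Table 2.3 can be sketched in a few lines of Python. This is only an illustration of the "easy but space consuming" method under a strong assumption: the groups of states that share a core (3 and 6, 4 and 7, 8 and 9) are given by hand, whereas a real generator would find them by comparing the item sets. The names `merge_same_core`, `SAME_CORE`, `LR1_ACTION` and `LR1_GOTO` are hypothetical.

```python
# LR(1) tables from Table 2.1; blank (error) entries are simply absent.
LR1_ACTION = {
    0: {"c": "s3", "d": "s4"},
    1: {"$": "acc"},
    2: {"c": "s6", "d": "s7"},
    3: {"c": "s3", "d": "s4"},
    4: {"c": "r3", "d": "r3"},
    5: {"$": "r1"},
    6: {"c": "s6", "d": "s7"},
    7: {"$": "r3"},
    8: {"c": "r2", "d": "r2"},
    9: {"$": "r2"},
}
LR1_GOTO = {0: {"S": 1, "C": 2}, 2: {"C": 5}, 3: {"C": 8}, 6: {"C": 9}}

# Assumption: the groups of states sharing a core are given, not computed.
SAME_CORE = [(3, 6), (4, 7), (8, 9)]


def merge_same_core(action, goto, groups):
    """Merge each group of LR(1) states into a single LALR(1) state."""
    # Merged states are named by concatenation, e.g. 3 and 6 become 36.
    rename = {state: state for state in action}
    for group in groups:
        merged = int("".join(str(state) for state in group))
        for state in group:
            rename[state] = merged

    lalr_action, lalr_goto = {}, {}
    for state, row in action.items():
        target = lalr_action.setdefault(rename[state], {})
        for symbol, entry in row.items():
            if entry.startswith("s"):
                # Shift targets refer to states, so they are renamed as well.
                entry = "s" + str(rename[int(entry[1:])])
            # Two different entries in the same cell would be a conflict.
            if target.setdefault(symbol, entry) != entry:
                raise ValueError(f"conflict in state {rename[state]} on {symbol}")
    for state, row in goto.items():
        target = lalr_goto.setdefault(rename[state], {})
        for symbol, next_state in row.items():
            target[symbol] = rename[next_state]
    return lalr_action, lalr_goto


LALR_ACTION, LALR_GOTO = merge_same_core(LR1_ACTION, LR1_GOTO, SAME_CORE)
```

With these inputs the ten LR(1) states collapse into the seven states of Table 2.3, and the rows of states 4 and 7 are united into a single row that reduces by production 3 on every lookahead.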

LALR(1) Algorithm

Both the LR(1) and the LALR(1) parsers perform the same algorithm; the only difference is that the parsing tables used by LALR(1) contain different states to be shifted onto or reduced from the stack. The parsing algorithm starts by finding the right action in the ACTION table for a given terminal symbol a and a current state i, denoted ACTION[i, a]. This entry can be a REDUCE (r), SHIFT (s), ACCEPT (acc) or error (blank) action. The GOTO table is used to find the next state I_j for a given state I_i and non-terminal A, denoted GOTO[I_i, A] = I_j. A REDUCE action takes a certain number of symbols from the parsing stack, applies a transformation and pushes the result and the next state back onto the stack. When an error is detected (blank entry in the parsing table), several correcting actions can be performed. This topic is covered in more detail in Section 2.2.
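As an illustration of the algorithm, the following Python sketch runs the LALR(1) table of the sample grammar (production 1 is S → C C, production 2 is C → c C, production 3 is C → d) over a list of tokens. It is a minimal sketch and not the code of any particular generator: the function name `parse`, the use of nested tuples as tree nodes and the second stack of values are choices made only for this example.

```python
# Productions as (left-hand side, length of right-hand side).
PRODUCTIONS = {1: ("S", 2), 2: ("C", 2), 3: ("C", 1)}

# LALR(1) tables of the sample grammar; missing entries are errors.
ACTION = {
    0: {"c": "s36", "d": "s47"},
    1: {"$": "acc"},
    2: {"c": "s36", "d": "s47"},
    36: {"c": "s36", "d": "s47"},
    47: {"c": "r3", "d": "r3", "$": "r3"},
    5: {"$": "r1"},
    89: {"c": "r2", "d": "r2", "$": "r2"},
}
GOTO = {0: {"S": 1, "C": 2}, 2: {"C": 5}, 36: {"C": 89}}


def parse(tokens):
    """Return a nested-tuple tree for the tokens, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]   # "$" marks the end of the input
    states = [0]                    # state stack
    values = []                     # semantic stack holding tree nodes
    pos = 0
    while True:
        lookahead = tokens[pos]
        entry = ACTION[states[-1]].get(lookahead)
        if entry is None:           # blank entry in the ACTION table
            raise SyntaxError(f"unexpected token {lookahead!r} at position {pos}")
        if entry == "acc":
            return values[-1]
        if entry[0] == "s":         # SHIFT: push the new state and the token
            states.append(int(entry[1:]))
            values.append(lookahead)
            pos += 1
        else:                       # REDUCE by production number entry[1:]
            lhs, length = PRODUCTIONS[int(entry[1:])]
            children = values[len(values) - length:]
            del states[len(states) - length:]
            del values[len(values) - length:]
            values.append((lhs, *children))
            states.append(GOTO[states[-1]][lhs])
```

For example, `parse(["c", "d", "d"])` shifts `c` and `d`, reduces twice to obtain the first C, shifts the last `d`, and finishes with a reduction by production 1. The input `["c", "d"]` reaches state 2 with lookahead `$`, finds a blank entry and raises `SyntaxError`.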

2.2 Error Handling in Syntax Analysis

Error handling techniques in the Front-End are more relevant during the Syntax Analysis phase and the Semantic Analysis phase than in the Lexical Analysis phase. Only a few errors can be detected by the Lexical Analysis, such as non-terminated comments, invalid characters or unrecognised tokens. One possible error-recovery strategy implemented in a Lexer is to ignore invalid characters in the input and continue the process.

Error handling techniques can be divided into two topics: error recovery techniques and error message display. Error recovery techniques are concerned with how the parser can keep parsing after an error token is found. Error message display is concerned with how to present useful hints that help the developer correct the source code. In this section we present these two topics for the Syntax Analysis phase.

2.2.1 Error Recovery

Several error recovery techniques have been developed for LALR parsers, as in [Burke and Fisher Jr, 1982, Bilos, 1983, Burke and Fisher, 1987, McKenzie et al., 1995, Degano and Priami, 1998, Corchuelo et al., 2002], and in more recent research such as [Kats et al., 2009, de Jonge et al., 2010]. Error recovery techniques try to improve the quality of the parser by means of techniques such as primary recovery or secondary recovery. The first condition to start the recovery is to access the configuration obtained when the token preceding the error token was shifted onto the stack. Techniques for deferring the reduce actions after a shift have been developed in Burke and Fisher Jr [1982].

Primary techniques concern the modification of a single token in the list of tokens. A single modification is only possible when the error is classified as simple. This modification can be an insertion, a deletion, a substitution or a merging. Every attempt to perform a repair is known as a trial. A common technique for searching the trials is to attempt to repair the error token by performing these operations in order: merging, insertion, substitution, scope recovery and finally deletion. In the case of insertion or substitution a set of possible candidates should be generated, and from this set a single candidate or none should be selected.

When the error requires more than a simple modification, the list of tokens needs to be reduced. This can be done by discarding tokens that precede, follow or surround the error token. This is known as secondary recovery.
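The idea of primary recovery trials can be sketched as follows. This is a deliberately simplified Python illustration and not one of the cited algorithms: it only tries insertion, substitution and deletion (merging and scope recovery are left out), and it assumes that a trial succeeds when the whole repaired input is accepted, checked here with a regular expression for the toy language c*dc*d instead of a real parser. The names `accepts` and `single_token_trials` are hypothetical.

```python
import re

TERMINALS = ["c", "d"]
# Toy language used only for this illustration: the strings c*dc*d.
VALID = re.compile(r"c*dc*d")


def accepts(tokens):
    """Stand-in for a parser run: True if the token list is a valid input."""
    return VALID.fullmatch("".join(tokens)) is not None


def single_token_trials(tokens, error_pos):
    """Yield (operation, token, repaired_tokens) for every successful trial."""
    has_error_token = error_pos < len(tokens)  # False when the error is at EOF
    # Insertion: try every candidate terminal in front of the error token.
    for candidate in TERMINALS:
        trial = tokens[:error_pos] + [candidate] + tokens[error_pos:]
        if accepts(trial):
            yield ("insert", candidate, trial)
    # Substitution: replace the error token by every other terminal.
    if has_error_token:
        for candidate in TERMINALS:
            if candidate != tokens[error_pos]:
                trial = tokens[:error_pos] + [candidate] + tokens[error_pos + 1:]
                if accepts(trial):
                    yield ("substitute", candidate, trial)
    # Deletion: drop the error token.
    if has_error_token:
        trial = tokens[:error_pos] + tokens[error_pos + 1:]
        if accepts(trial):
            yield ("delete", tokens[error_pos], trial)
```

For the input `["c", "d", "c"]` with the error detected at the end of the input (position 3), the only successful trial is the insertion of `d`. The successful trials form the candidate set from which a single repair, or none, is selected.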

2.2.2 Error Messages

In simple recovery the error messages are classified into five types: merging, misspelling, insertion, deletion and substitution. In secondary recovery the error messages are classified into two types. Type 1 error messages are displayed when the discarded tokens are all on a single line. Type 2 error messages are displayed when multiple lines need to be discarded. In addition there are three other types. The first refers to different candidates for a recovery. The second is displayed when the end of file is reached but not expected. The third is used when all error recovery routines fail; the parser then displays a generic unrecoverable syntax error message.

2.3 The OpenModelica Project

OpenModelica² is an open source project led by the Open Source Modelica Consortium (OSMC)³. At the time of writing this report OpenModelica is at version 1.7.0, released in April 2011. OpenModelica contains different tools that contribute to the design and construction of simulation projects in OpenModelica. These tools are classified into compiler tools, graphic interface tools and an Eclipse-based environment. The OpenModelica environment consists of several tools such as OMEditor, UML-Modelica, OMShell, OMNotebook, DrControl under OMNotebook and Modelica Development Tooling (MDT). There are some other resources, such as documentation, OMDev (tools for building the compiler) and auxiliary tools for the OpenModelica developer, that have been used during the development of this project. Figure 2.4 shows the architecture of the OpenModelica environment.

² OpenModelica: http://www.openmodelica.org
³ OSMC: http://www.openmodelica.org/index.php/home/consortium

Figure 2.4: OpenModelica Environment [Fritzson et al., 2009]. Components shown: Graphical Model Editor/Browser, Eclipse Plugin Editor/Browser, OMOptim Optimization Subsystem, DrModelica NoteBook Model Editor, Interactive session handler, Textual Model Editor, Modelica Compiler, Execution and Modelica Debugger.

2.3.1 The Modelica Language

The design of the Modelica Language was started in the fall of 1996. The first report of the language was made available on the web in September 1997. The first publication on Modelica, by Elmqvist [1997], was presented at the Symposium on Computer-Aided Control System Design. The language has been developed ever since as a language for multi-domain modelling and simulation, with contributions from several researchers, e.g. Fritzson and Bunus [2002], Pop and Fritzson [2005, 2006], Akesson et al. [2008, 2010], Sjölund [2009], Sjölund et al. [2011], Lundvall et al. [2009]. Modelica is an equation-based and object-oriented language designed with the aim of defining a de facto standard for simulation. There have been recent efforts in writing a new Modelica compiler. The compiler and other parts of the OpenModelica project are described in Fritzson et al. [2009].

2.3.2 MetaModelica extension

The main source of information for the MetaModelica language is the draft document “MetaModelica users guide” written by Fritzson and Pop [2011a]. This document has been improved recently by Fritzson and Pop [2011b] towards the implementation of the specifications of a new version of the Modelica Language. MetaModelica was created in the OpenModelica project with the intention of modelling the semantics of the Modelica language. MetaModelica is therefore the starting point for the construction of a Modelica Compiler. The MetaModelica language is part of the project to create a bootstrapping compiler, written in MetaModelica, for the MetaModelica and Modelica languages. MetaModelica adds new operators and types to the Modelica language. In this report we cover the constructs uniontype, record, matchcontinue and list.

uniontype

The uniontype is a construct that allows MetaModelica to declare types based on the union of two or more record types. It can be recursive and include other uniontypes. An example of uniontype is presented in Listing 2.1.

Listing 2.1: MetaModelica uniontype

    uniontype Exp
      record INT
        Integer integer;
      end INT;
      record IDENT
        String ident;
      end IDENT;
    end Exp;
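For readers who are not familiar with MetaModelica, a rough Python analogue of Listing 2.1 is sketched below. It is only an analogy under the assumption that a uniontype behaves like a tagged union of record types; the helper function `describe` is hypothetical and is not part of the listing.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class INT:
    """Counterpart of `record INT Integer integer; end INT;`."""
    integer: int


@dataclass(frozen=True)
class IDENT:
    """Counterpart of `record IDENT String ident; end IDENT;`."""
    ident: str


# An Exp value is either an INT record or an IDENT record.
Exp = Union[INT, IDENT]


def describe(exp: Exp) -> str:
    """Dispatch on the record that the value was built with."""
    if isinstance(exp, INT):
        return f"integer {exp.integer}"
    if isinstance(exp, IDENT):
        return f"identifier {exp.ident}"
    raise TypeError(f"not an Exp: {exp!r}")
```

A value such as `INT(42)` or `IDENT("x")` carries both its data and the information about which record it belongs to, which is what makes this kind of type convenient for representing syntax trees.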

matchcontinue

The matchcontinue instruction resembles the switch instruction in C with some additions. Unlike the switch instruction, matchcontinue can return a value. It can contain more than one conditional, and it can also return more than one value. A section for the definition of local variables is present right after the matchcontinue declaration. The wild card ‘_’ (underscore) can be used to match all cases; alternatively, an else case can be used instead of the wild card.

The matchcontinue instruction contains case blocks similar to those of the switch instruction in C. Each case can contain an equation block. The program flow tries to execute the instructions of a specific equation block one after the other. If any instruction cannot be executed or fails, the next case is tried. If that case fails as well, the following cases are tried until one case block reaches its end, and its return value is then assigned to the corresponding variables. It is also possible that no case block reaches its end, in which case no value is assigned to the variables. An example of the syntax for matchcontinue is presented in Listing 2.2.

Listing 2.2: MetaModelica matchcontinue

    (token, env2) := matchcontinue (act)
      local
        Types.Token tok;
      case (1)
        equation
          tok = Types.TOKEN(tokName[act], act, buffer, info);
        then (SOME(tok), env2);
      case (_)
        then (NONE(), env2);
      else
        then (NONE(), env2);
    end matchcontinue;
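The fall-through behaviour of matchcontinue can be mimicked in Python as sketched below. This is only an illustration of the control flow described above, assuming that a failing pattern or equation is modelled as an exception; the names `MatchFailure`, `matchcontinue`, `TOKEN_NAMES`, `named_token` and `no_token` are hypothetical.

```python
class MatchFailure(Exception):
    """Raised by a case when its pattern or one of its equations fails."""


def matchcontinue(value, cases):
    """Try the cases in order and return the result of the first success."""
    for case in cases:
        try:
            return case(value)
        except MatchFailure:
            continue  # this case failed, so the next case is tried
    raise MatchFailure(f"no case matched {value!r}")


TOKEN_NAMES = {1: "IDENT", 2: "INTEGER"}  # hypothetical token names


def named_token(act):
    # The "equation" of this case fails when act has no known name.
    if act not in TOKEN_NAMES:
        raise MatchFailure
    return ("SOME", TOKEN_NAMES[act])


def no_token(act):
    # Wildcard case: matches anything and returns NONE.
    return ("NONE",)


def lookup(act):
    return matchcontinue(act, [named_token, no_token])
```

Calling `lookup(1)` succeeds in the first case, while `lookup(7)` fails there and falls through to the wildcard case. If the list of cases contained no wildcard, the second call would end with a `MatchFailure`, which corresponds to the situation where no case block reaches its end.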

list

The list construct is used to create a linked list, similar to one that could be built in C. Braces are used to define the list elements. The operator ‘::’ is used to add items to a list or to retrieve items from it. To illustrate how a list works in MetaModelica, we have the sample code in Listing 2.3. In this code the instruction on line 1 creates a list called ‘a’. Line 2 retrieves the first element ‘1’ from the list ‘a’ and saves it in the variable ‘i’; it saves the rest of the list back in the variable ‘a’. Finally, line 3 is the inverse operation of line 2 and adds the item ‘i’ to the front of the list ‘a’.

Listing 2.3: MetaModelica list

    list<Integer> a = {1, 2, 3};
    i :: a = a;
    a = i :: a;
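The two uses of ‘::’ can be illustrated with a small Python sketch in which a list is a chain of (head, tail) pairs and `None` is the empty list. This is an analogy only, and the helper names `cons`, `uncons`, `from_items` and `to_items` are hypothetical.

```python
def cons(head, tail):
    """Counterpart of `head :: tail` used as an expression."""
    return (head, tail)


def uncons(lst):
    """Counterpart of `head :: tail = lst`; fails on the empty list."""
    if lst is None:
        raise ValueError("cannot split an empty list")
    head, tail = lst
    return head, tail


def from_items(items):
    """Build a linked list from a Python sequence, keeping the order."""
    lst = None
    for item in reversed(items):
        lst = cons(item, lst)
    return lst


def to_items(lst):
    """Convert a linked list back into a Python list."""
    items = []
    while lst is not None:
        head, lst = uncons(lst)
        items.append(head)
    return items


a = from_items([1, 2, 3])   # line 1 of the listing
i, a = uncons(a)            # line 2: i is the first element, a the rest
a = cons(i, a)              # line 3: the element is put back in front
```

Both operations work at the front of the list, so each of them takes constant time regardless of the length of the list.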

2.3.3 Abstract Syntax Tree - AST

The AST (Abstract Syntax Tree) is a structure that abstracts away part of the details present in the source code and represents unambiguously the constructs of the programming language. The declaration of the AST (the semantic constructions of the language) can be written in the MetaModelica language thanks to the construct ‘uniontype’ explained in the last section. The construction of this tree is based on primitive operations such as integer operations. These constructs represent the semantic constructs of the Modelica and MetaModelica languages. The file Absyn.mo contains the specification of the constructs.


Chapter 3

Existing Technologies

Several technologies formed the basis for the construction of this project. In this chapter we present an introduction to some of the features that the OpenModelica Compiler (OMC) has built in and to the ANTLR parser generator currently used in OMC. Fast Lexical Analyzer Generator (FLEX) and GNU Bison are the technologies on which the construction of OMCCp is based. Aaby [2003] is a good reference for those who want to learn more about compiler construction with Flex and Bison. We use his book and the GNU Bison manual (latest version today 2.4.3, by Donnelly and Stallman [2010]) to explain some technological aspects of FLEX and GNU Bison. This chapter is not intended to be a guide for these technologies, but it gives the reader the concepts required to understand the rest of this thesis.

3.1 OpenModelica Compiler (OMC)

The OpenModelica Compiler (OMC) is the core tool of the OpenModelica project. It has been developed since the beginning of the Modelica language and is described in Fritzson et al. [2009].

3.1.1 Architecture and Components

The architecture of OMC is presented in Figure 3.1. The main components of this diagram represent some of the phases of a compiler, where the lexical and syntax analysis is represented by the starting process “Parser”. This parser performs both the lexical and the syntax analysis, due to the design of ANTLR as an integrated lexer and parser generator.

Figure 3.1: OMC simplified overall structure [Fritzson et al., 2009]. Elements shown include the modules Main, Parse, SCode, Inst, Lookup, Static, Ceval, DAELow, SimCode and Dump, together with the data passed between them (Absyn, SCode, DAE, Flat Modelica and C code).

The OMC is used to compile both the MetaModelica grammar and the Modelica grammar from version 1 to 3 (see Figure 3.2). Its source code is available for download from the Subversion repository¹.

Figure 3.2: OMC Language Grammars. The OMC Compiler is shown with the parsers MetaModelicaParser, Modelica1Parser, Modelica2Parser and Modelica3Parser.

¹ OpenModelica Subversion: svn co https://openmodelica.org/svn/OpenModelica/trunk/

3.1.2 ANTLR

Another Tool for Language Recognition (ANTLR) is a parser generator tool that integrates the lexical analysis and the syntax analysis in one single tool. It generates parsers for LL(k) grammars. ANTLR was created by Parr and Quong [1995] and is today in version 3. The information presented here is extracted from the official website² and the tutorial website by Mills [2005]. The reference manual by Parr [2007] is a more complete and detailed source of information about ANTLR. This project is intended to be a substitute for this tool, so we consider it important to give an overview of its most significant features and characteristics.

The grammar used in ANTLR is of type LL(k), which means that the parsers generated by ANTLR are Top-Down parsers, as explained in Section 2.1.3. ANTLR uses Extended Backus-Naur Form (EBNF) notation for defining the grammar rules. EBNF is an extension of Backus-Naur Form (BNF) that adds new constructs, such as a ‘+’ placed after a grouped item to indicate one or more repetitions of it. The grammar file used by ANTLR contains several parts, as presented in Listing 3.1.

Listing 3.1: ANTLR grammar file structure

    header {
      // stuff that is placed at the top of generated files
    }

    options {
      options for entire grammar file
    }

    { optional class preamble - output to generated class file
      immediately before the definition of the class }
    class YourLexerClass extends Lexer;
    // definition extends from here to next class definition
    // (or EOF if no more class defs)
    options { YourOptions }
    tokens {
      EXPR;           // Imaginary token
      THIS="that";    // Literal definition
      INT="int";      // Literal definition
    }

    lexer rules...
    myrule[args] returns [retval]
      options { defaultErrorHandler=false; }
      :   // body of rule...
      ;

    { optional class preamble - output to generated class file
      immediately before the definition of the class }
    class YourParserClass extends Parser;
    options { YourOptions }
    tokens...
    parser rules...
    rulename[args] returns [retval]
      options { defaultErrorHandler=false; }
      { optional initiation code }
      : alternative_1
      | alternative_2
      ...
      | alternative_n
      ;

    { optional class preamble - output to generated class file
      immediately before the definition of the class }
    class YourTreeParserClass extends TreeParser;
    options { YourOptions }
    tokens...
    tree parser rules...

    // arbitrary lexers, parsers and tree parsers may be included

² ANTLR: http://www.antlr.org/

As we can see, it contains several sections including a header, lexer (tokens and rules), parser (tokens and rules), AST rules and options sections that are copied verbatim to the generated parser. The generated parser files are in the desired target-language that is specified when compiling the grammar file. ANTLR allows the OpenModelica developers to specify in a robust and flexible way the rules and the grammar for the combination of the lexer and parser. It then generates code in the target language that outputs the designed AST for both Modelica and MetaModelica grammar.

3.1.3 Current state

The OMC is today (May 2011) at version 1.7.0 (r8600). It is intended to be used by both industry and academia. Various research materials have been produced since 1997, including Master's³ and PhD⁴ theses, conference papers⁵, journal papers⁶ and books⁷. Others are under development today or have recently been finished, such as this master's thesis. This shows that the OMC is today an active research topic in the OpenModelica project.

3.2 Flex

FLEX is based on the tool called Lexical Analyzer Generator (LEX). Its input grammar uses regular expressions to define the tokens.

3.2.1 Input file lexer.l

The FLEX input file lexer.l contains three sections: definitions, rules and user code.

Listing 3.2: Flex file structure

    Definitions
    %%
    Rules
    %%
    User code

Definitions: Contains declarations of definitions and start conditions. It can also contain code that is copied verbatim to the top of the output as a declaration.

Rules: Contains the rules in the form of patterns written in an extended set of regular expressions. Each rule has an action in C code that can return a token, reject the match or change the start condition.

User code: It is copied verbatim to the output file.

3.2.2 Output file lexer.c

The output file lexer.c generated by FLEX contains three main sections: the declaration of variables and arrays, the algorithm that runs the DFA, and the return action section with the actions that have been specified for each rule. The arrays that are present in the lexer are:

yy_ec: maps each input character code to its equivalence class.
yy_accept: checks the states against the accept condition.
yy_acclist: once a state is accepted, the action for that state is found here.
yy_meta: control array for the transitions.
yy_base: control array for the transitions.
yy_def: default transition for the states.
yy_nxt: determines the next transition of the states.
yy_chk: control array that verifies errors.

FLEX is designed to handle a large number of rules and tokens. It reduces the number of rules and tokens that the parser has to handle in the next phase of the compiler. That is why it is common to find FLEX combined with parser generators such as the tool called Yet Another Compiler-Compiler (YACC) or its successor GNU Bison. The FLEX manual by Paxson [2002] is a good source of information for a complete reference of FLEX.

³ http://www.openmodelica.org/index.php/research/master-theses
⁴ http://www.openmodelica.org/index.php/research/phd-and-licentiate-theses
⁵ http://www.openmodelica.org/index.php/research/conference-papers
⁶ http://www.openmodelica.org/index.php/research/journal-papers
⁷ http://www.openmodelica.org/index.php/research/booksproceedings

3.3 GNU Bison

GNU Bison is a parser generator that generates an LALR(1) parser from a context-free grammar. The generated parser can be written in one of three languages: C, C++ or Java. GNU Bison is based on the tool called YACC. It receives as input a file with the grammar rules, specified using BNF. The output of the process is a parser written in C that communicates with a lexer, commonly written with LEX or FLEX. In this section we explain these input and output files in detail and cover some other details about GNU Bison that will help in understanding the project implementation presented in the next chapter.

3.3.1 Input file parser.y

There are four sections in a grammar file: Prologue, Bison declarations, Grammar rules and Epilogue, distributed as presented in Listing 3.3.

Listing 3.3: Bison file structure

    %{
    Prologue
    %}
    Bison declarations
    %%
    Grammar rules
    result: rule1-components...
          | rule2-components...
          ...
          ;
    %%
    Epilogue

Prologue: Macro definitions and declarations of functions and variables used in the grammar rules. It is copied verbatim to the beginning of the generated file.

Declarations: Define the terminal and non-terminal symbols and specify precedence.

Grammar rules: Rules expressed in Backus-Naur Form (BNF) notation as a result composed of other rules, with an action defined for each rule.

Epilogue: It is copied verbatim to the end of the generated file, in the same way as the prologue is copied to the beginning.

The sections are separated by specific tokens: the prologue is enclosed between ‘%{’ and ‘%}’, and the other sections are separated by ‘%%’.

3.3.2 Output file parser.c

GNU Bison generates a file that contains C source code. In the generated file we identify four main parts: the declarations of variables and transition arrays; the algorithm that runs the PDA; the response actions that build the AST while doing the reductions; and a section for custom code inserted by the developer.


Transition arrays

Aho et al. [2006] present the algorithm for LALR(1) based on the creation of two-dimensional parsing tables called the “ACTION table” and the “GOTO table”, as presented in Section 2.1.4. GNU Bison improves the efficiency of the storage by converting these two-dimensional tables into one-dimensional arrays using the method described by Tarjan and Yao [1979]. Popuri [2006] presents a detailed analysis of the role of each array in the generated file. Among the 15 arrays generated by GNU Bison are:

yytranslate: interface with the lexer, used to translate the tokens of the lexer into the internal representation of the parser.
yyrhs: list of symbol numbers of all rules. yyrhs[n] = first symbol on the RHS of the rule.
yyprhs: index in yyrhs of the first RHS symbol of each rule.
yyrline: line number in the grammar file where the rule is defined.
yytname: list of names of the defined symbols.
yytoknum: list of the values of the tokens in the lexer.
yyr1: specifies the symbol number of the left-hand side of each rule.
yyr2: number of symbols to be reduced for a certain rule.
yydefact: default reduction for each state.
yydefgoto: compressed GOTO table; each entry specifies the default state to transition to for each non-terminal.
yytable: state numbers in a pre-calculated order; works together with yycheck, yypgoto and yypact to indicate the next state and the rule to be used for a reduction.
yypact: indicates what to do next; works together with yytable.
yypgoto: index into yytable used to find the next state after a reduction to each non-terminal; works together with yydefgoto.
yycheck: a control table that verifies that an entry of yytable belongs to the current state or symbol, which leads to the discovery of anomalies in the parser.
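The idea of storing a sparse two-dimensional table in one-dimensional arrays can be sketched as follows. This Python sketch shows a simple first-fit row displacement scheme with a check array; it is not the code GNU Bison uses, whose tables additionally rely on default reductions and special values. The names `compress`, `lookup`, `base`, `table` and `check` are chosen for this illustration (they loosely correspond to the roles of yypact, yytable and yycheck), and the example rows are the ACTION part of Table 2.1 with the columns c, d and $ numbered 0, 1 and 2.

```python
def compress(rows, n_cols):
    """Pack sparse rows into shared `table`/`check` arrays (first fit)."""
    table, check, base = [], [], []
    for state, row in enumerate(rows):
        # Find the smallest offset at which no entry of this row collides
        # with an entry that was already placed.
        offset = 0
        while any(
            offset + col < len(table) and check[offset + col] is not None
            for col in row
        ):
            offset += 1
        needed = offset + n_cols
        if needed > len(table):
            extra = needed - len(table)
            table.extend([None] * extra)
            check.extend([None] * extra)
        for col, value in row.items():
            table[offset + col] = value
            check[offset + col] = state   # remember the owner of the slot
        base.append(offset)
    return base, table, check


def lookup(base, table, check, state, col, default=None):
    """Return the entry for (state, col), or `default` for a blank entry."""
    index = base[state] + col
    if index < len(check) and check[index] == state:
        return table[index]
    return default


# ACTION part of Table 2.1; columns: 0 = c, 1 = d, 2 = $.
ROWS = [
    {0: "s3", 1: "s4"}, {2: "acc"}, {0: "s6", 1: "s7"}, {0: "s3", 1: "s4"},
    {0: "r3", 1: "r3"}, {2: "r1"}, {0: "s6", 1: "s7"}, {2: "r3"},
    {0: "r2", 1: "r2"}, {2: "r2"},
]
BASE, TABLE, CHECK = compress(ROWS, 3)
```

The check array is what makes the overlap safe: a slot only answers for the state that owns it, so a lookup for a blank entry falls back to the default even when another row occupies that position.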


In summary, the array yytranslate is, as its name indicates, a translator between the lexer and the parser. The arrays yydefact, yydefgoto, yyr1, yyr2, yytable, yypgoto, yypact and yycheck are used to represent the LALR(1) parsing tables ACTION and GOTO. The other arrays are helpers for debugging and printing.

LALR algorithm

GNU Bison uses the function yyparse to start the parsing operation. It makes use of two stacks: one for the states and another, called the parser stack, for the reductions and the construction of the AST. The algorithm starts by reading a token and pushing its value onto the parser stack; this operation is called SHIFT. When the stack has accumulated enough elements to match a rule, the elements are popped from the parser stack and converted into a new symbol, which is the result of the action of the rule; this operation is called REDUCE. The result is pushed back onto the parser stack. In each step the parser computes the next state based on the arrays presented before, and pushes or pops states at the same time as the SHIFT and REDUCE actions are performed on the parser stack. GNU Bison adds two new symbols to its internal symbol configuration: the tokens accept and end identify when the parser stops and when it finishes in an acceptance state.

Construction of the AST

The AST is constructed by the REDUCE actions. The construction of the AST is specified in the grammar rules section of the grammar file, specifically where the actions of the rules are placed.

Error handling

GNU Bison uses a single primary recovery technique based on the activation of an error flag that tells the parser to suppress further syntax error diagnostics while recovering from an error. A developer can write specific rules for the special token error that help the parser to detect the presence of errors more easily. There is a chapter in the reference manual of GNU Bison that gives a more detailed overview of error handling.


Other considerations about BISON

The parser generated by GNU Bison manages memory efficiently in order to avoid memory exhaustion, especially in situations when too many tokens have been shifted without a reduction operation. This is done inside the parser and usually a developer does not need to configure anything, although the parameters called YYMAXDEPTH and YYINITDEPTH can be modified. The GNU Bison parser generator supports the generation of code in three programming languages: Java, C and C++. A table of symbols is provided in the GNU Bison manual [Donnelly and Stallman, 2010, Appendix A] that allows a better understanding of the code inside the generated parsers. However, the article written by Popuri [2006] explains the structures and arrays in detail and comments on specific parts of the code, which further clarifies the algorithm used by GNU Bison. Some examples of the interaction of FLEX and GNU Bison are presented in Aaby [2003].

Chapter 4

Implementation

4.1 Proposed Solution

This chapter presents the results of this project, which are the design and implementation of a MetaModelica Lexer and Parser Generator written in the MetaModelica language, named the OpenModelica Compiler-Compiler parser generator (OMCCp)¹. We present in this chapter a description of the design and architecture of the parser generator proposed in this report. At the end we cover the error handling techniques used in OMCCp and the integration into the OpenModelica Compiler (OMC). We start by outlining the main characteristics of this solution:

• It is a tool written entirely in the MetaModelica language.
• It includes a Lexer and an LALR(1) Parser Generator that generate MetaModelica code.
• The Lexer is based on the existing tool FLEX for the generation of the transition tables for the DFA.
• The Parser is based on the existing tool BISON for the generation of the transition tables for the Deterministic Push-down Automaton.
• The error handler is based on a primary recovery technique and on the generation of a candidate set that is properly displayed to the developer.

Section 4.2 presents the design of the solution, and Section 4.3 explains how the generation of files is performed by OMCCp. We continue by explaining the error handling techniques used in OMCCp in Section 4.4, and finally in Section 4.5 we explain the integration with the OMC. Appendix A covers the basic command line instructions for running OMCCp to generate the files, and the instructions to run the generated parser for the specified grammar.

¹ OMCCp was initially named OMCC. This explains why the design figures and some files in the project use the name OMCC instead of OMCCp. Whenever OMCC is present it may be interpreted as OMCCp.

4.2

OMCCp Design

Figure 4.1 presents an overview of OMCCp. The Parser and Lexer Generator are based on the C files generated by FLEX and BISON from the grammar files lexer.l and parser.y.

[Figure 4.1: OMCCp (OpenModelica Compiler-Compiler) Lexer and Parser Generator. The required input files are Parser.y, Lexer.l, Lexer.mo, Parser.mo, LexCode.tmo and ParseCode.tmo. FLEX and BISON produce Lexer[Suffix].c and Parser[Suffix].c. The LexerGenerator subsystem generates LexTable[Suffix].mo, Lexer[Suffix].mo and LexCode[Suffix].mo; the ParserGenerator subsystem generates Parser[Suffix].mo, ParseTable[Suffix].mo, Token[Suffix].mo and ParseCode[Suffix].mo.]


The generated C files contain three main parts that constitute the generated lexer and parser: a section with the transition arrays, a section with the machine that runs the lexer or parser algorithm, and an action resolution section. The last section contains the return-token actions for the lexer and the reduction and AST construction actions for the parser. Based on the identification of the main parts of each generated file, the design presented in figure 4.2 shows the different components of the solution. The two principal packages are LexerGenerator.mo and ParserGenerator.mo, which are responsible for the generation of the complete Compiler-Compiler in MetaModelica code.

4.2.1

Lexical Analyser

The design of the Lexer is presented in figure 4.3. The Lexer contains three main files that were designed to separate the tables, the DFA and the action code. The tables for the state transitions are loaded at the start of the DFA. Once the DFA identifies the rule, it passes control to the code file, which takes the action and decides which token to return to the DFA. The DFA pushes the returned result onto the list of tokens. The Lexical Analyser uses the built-in functions from the System.mo and Util.mo files that are part of the compiler.

A new file, Types.mo, was developed to support the uniontype Token and the records TOKEN and INFO, which are used by both the lexer and the parser.

[Figure 4.2: OMCCp Lexer and Parser Generator Architecture Design. UML diagram relating the generator packages LexerGenerator.mo and ParserGenerator.mo, the lexer files (Lexer.mo, LexTable.mo, LexerCode.mo), the parser files (Parser.mo, ParseTable.mo, ParseCode.mo), the shared types (Tokens, OMCCTypes.mo, Absyn.mo), the utilities System.mo, Util.mo and Error.mo, and the OMC. The entry point NewParser.mo calls the lexer with tokens = Lexer.scan(filename) and then the parser with ast = Parser.parse(tokens).]

[Figure 4.3: OMC-Lexer design. UML diagram of Lexer.mo (scan, loadSourceCode, scanString, lex, consume, evalState, findRule, getInfo), LexTable.mo (LexerTable), LexerCode.mo (action), the Token type (name, value, col, row), and the utilities System.mo and Util.mo.]

Lexer.mo

The file Lexer.mo implements the DFA for the lexical analysis as explained in Section 2.1.2. It is the main file of the Lexer and calls the functions in the other files that constitute the lexer, LexerCode.mo and LexTable.mo. Its main function is to load the source code file and recognise all the tokens described by the grammar. It returns to the parser either a list of tokens or an error, if no tokens were found or if some characters in the source code do not comply with the 8-bit encoding format called UTF-8. While loading the file, it converts the source code into an array of integers, each integer representing the UTF-8 code of a character present in the source code; e.g. the character ‘a’ becomes the code 97 (61 in hexadecimal). This speeds up the recognition of the tokens because of the direct mapping between the character code and the position in the transition array used by the lexer.

For recognising the tokens, Lexer.mo runs a DFA based on the transition arrays found in LexTable.mo. When it reaches an acceptance state it calls the function action in the LexerCode.mo file. Finally, it returns a list of tokens that are the input for the parser. The entry function of the Lexer is named scan, and it is defined in Lexer.mo:

Listing 4.1: Lexer.mo function scan

  function scan "Scan starts the lexical analysis, load the tables and consume the program to output the tokens"
    input String fileName "input source code file";
    input Boolean debug "flag to activate the debug mode";
    output list tokens "return list of tokens";

The complete file Lexer.mo is available in Appendix B.1
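The table-driven scan described above can be sketched as follows. This is an illustrative Python sketch and not the MetaModelica code of Lexer.mo: the small transition table, the token names and the helper names `char_class` and `scan_codes` are assumptions made up for the example. It converts the input into integer character codes, runs the DFA until no transition exists, and rolls back to the last acceptance state.

```python
# Illustrative sketch only: a hand-written DFA for identifiers, integers
# and blanks. The tables and names are hypothetical, not those of FLEX.

def char_class(code):
    """Map an integer character code to a small column index."""
    ch = chr(code)
    if ch.isalpha():
        return 0
    if ch.isdigit():
        return 1
    if ch in " \t\n":
        return 2
    return 3  # any other character

# TRANSITION[state][class] -> next state, or -1 when there is no transition
TRANSITION = [
    [1, 2, 3, -1],    # state 0: start
    [1, 1, -1, -1],   # state 1: inside an identifier
    [-1, 2, -1, -1],  # state 2: inside an integer
    [-1, -1, 3, -1],  # state 3: inside blanks
]
# acceptance states and the token they produce (None means "ignore")
ACCEPT = {1: "IDENT", 2: "INTEGER", 3: None}

def scan_codes(text):
    codes = [ord(c) for c in text]  # e.g. 'a' becomes 97
    tokens, pos = [], 0
    while pos < len(codes):
        state, i = 0, pos
        last_accept = None  # (end position, state) of the last acceptance
        while i < len(codes):
            nxt = TRANSITION[state][char_class(codes[i])]
            if nxt < 0:
                break
            state, i = nxt, i + 1
            if state in ACCEPT:
                last_accept = (i, state)
        if last_accept is None:
            raise ValueError("unexpected character at position %d" % pos)
        end, acc = last_accept  # roll back to the last acceptance state
        if ACCEPT[acc] is not None:
            tokens.append((ACCEPT[acc], text[pos:end]))
        pos = end
    return tokens
```

For instance, `scan_codes("x1 42")` yields an IDENT token for `x1` and an INTEGER token for `42`, while the blank between them is ignored.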


LexTable.mo

The LexTable.mo file provides the data that Lexer.mo uses to perform the transitions to new states and to find the tokens in the input stream. It contains two variables and 8 arrays that are extracted from the file generated by FLEX. The two variables, yy_limit and yy_finish, are used for control instructions in the Lexer.mo file; yy_finish in particular is used to detect the end of a token. The 8 arrays that perform the task of the DFA are yy_accept, yy_ec, yy_meta, yy_base, yy_def, yy_nxt, yy_chk and yy_acclist. All of them are used in the file Lexer.mo, but only the most significant ones are explained in this report: yy_ec, yy_accept and yy_acclist.

The first array used is yy_ec. It is indexed by the character code and gives, for each character, the equivalence class with which the DFA starts the table lookup when consuming the input stream. After performing an arithmetic check on the current character, the DFA uses the yy_accept array to find out whether the current state is a valid acceptance state, and continues until it determines that the lookahead character belongs to another token. It then rolls back to the last acceptance state in the stack. Following this, the DFA uses yy_acclist to determine the token to return, which is done by a call to a function in the LexerCode.mo file. The generation of this file is explained in Section 4.3.1. A complete sample of the generated file LexTable10.mo is available in Appendix E.4.

LexCode.mo

The file LexCode.mo contains all the specific actions that the lexer performs when a token has been recognised. There are three possible actions: ignore the token, return a specific token, or change to another DFA. The first action, ignoring the token, is performed by the lexer when a space, a line feed or a comment block has been found in the input stream. Ignoring these tokens simplifies the job of the parser: they are not used in any construction of the grammar, and therefore they will not be converted into executable code in the later phases of the compiler.


The second possible action when a token has been recognised is to return a specific token; the code of the action defines the token to be returned. Some information is collected together with the token in a record called TOKEN, which allows the parser to identify the line and the position of the token in the original source code file. The third and last action is to switch from one DFA to another. This is done in certain situations, e.g. when the DFA finds the start of a comment block ‘/*’ and all the subsequent tokens must be ignored, or when they must be categorised as a different token, as in the case of recognising strings. For this action a new start state is set up in the machine, and the following characters run in a different DFA from the original one. After the end token is found (e.g. ‘*/’), the start state returns to the original one. This file is generated from the lexer.c file produced by FLEX; its generation is explained in Section 4.3.1. A complete sample of the generated file LexerCode10.mo is available in Appendix E.5.
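The three kinds of action can be illustrated with a small dispatcher. The sketch below is written in Python for illustration and does not reproduce LexerCode.mo; the rule numbers, the start state constants and the functions `action` and `run` are hypothetical names chosen for the example.

```python
# Hypothetical rule numbers and start states, for illustration only.
INITIAL, COMMENT = 0, 1

def action(rule, lexeme, start_state):
    """Return (token or None, new start state) for a recognised rule."""
    if rule == 1:            # blanks: ignore the token
        return None, start_state
    if rule == 2:            # identifier: return a specific token
        return ("IDENT", lexeme), start_state
    if rule == 3:            # '/*': switch to the comment DFA
        return None, COMMENT
    if rule == 4:            # '*/': switch back to the original DFA
        return None, INITIAL
    if rule == 5:            # text inside a comment: ignore
        return None, start_state
    raise ValueError("unknown rule %d" % rule)

def run(matches):
    """Apply the actions to a sequence of (rule, lexeme) pairs."""
    state, tokens = INITIAL, []
    for rule, lexeme in matches:
        token, state = action(rule, lexeme, state)
        if token is not None:
            tokens.append(token)
    return tokens, state
```

Returning the new start state from the action, instead of mutating a global, keeps the sketch close to the functional style in which the lexer is written.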

4.2.2

Syntax Analyser

The Parser design for OMCCp is presented in figure 4.4. The Parser performs the syntax analysis of the compiler.

[Figure 4.4: OMC-Parser design. UML diagram of Parser.mo (Parse, processToken, reduce, translate, errorHandler, addSourceMessage, checkCandidates, checkToken, getTokenSemValue, printCandidateTokens), ParseTable.mo (ParseTable), ParseCode.mo (MultiTypedStack, actionReduce, push, getAST, initializeStack, reduceStringStack, getInfo), the shared types Tokens, OMCCTypes.mo and Absyn.mo (AST), and Error.mo (addError), which receives the error messages.]


As with the lexer, the design of the parser is split into different files to separate the logic of the PDA (Parser.mo) from the predictive tables (ParseTable.mo) and from the Shift-Reduce actions used in the construction of the AST (ParseCode.mo). Additionally, a file that connects the lexer with the parser (Token.mo) keeps the token codification consistent. The way these files interact can be seen in Figure 4.4. Each file is explained in this section.

[Figure 4.5: OMC-Parser LALR(1). Block diagram of the OMC LALR(1) parser: it consumes the tokens and uses the parsing tables, the state stack, the MultiTyped stack (Shift/Reduce) and the error handling to produce the AST.]

Token.mo

The file Token.mo contains the complete list of tokens used by the grammar with their respective codes. This file is the link between the lexer and the parser: both use it so that they identify the tokens in the same way. When the parser receives a token code from the lexer, it translates it into a local code that is used only by the parser to simplify the addressing in the predictive arrays: the new code assigned by the parser to the token is used as the index. Each token is defined as a constant of type Integer. A complete sample of the generated file Token10.mo is available in Appendix E.3.
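The translation from shared token codes to local parser codes can be pictured as a simple lookup. The following Python sketch is only an illustration: the numeric codes, the table `TRANSLATE` and the function `translate` are assumptions for the example and are not taken from Token.mo or ParseTable.mo.

```python
# Hypothetical token codes shared by the lexer and the parser.
IDENT, INTEGER, ASSIGN = 258, 259, 260

# Hypothetical translation table: shared code -> local parser code.
# The local codes are small and dense, so they can index the parse tables.
TRANSLATE = {IDENT: 3, INTEGER: 4, ASSIGN: 5}
UNDEFINED = 2  # local code assumed here for unknown shared codes

def translate(code):
    """Return the local parser code for a shared token code."""
    return TRANSLATE.get(code, UNDEFINED)
```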


Parser.mo

The file Parser.mo implements the PDA for the Parser as explained in Section 2.1.3. It contains the implementation of the LALR(1) algorithm explained in Section 2.1.4. The main function of Parser.mo is to efficiently convert the list of tokens received from the Lexer into the AST described in the Absyn.mo file. To perform this task, Parser.mo uses the predictive parse tables located in the file ParseTable.mo to call the Shift-Reduce actions located in the file ParseCode.mo. The external interface of the parser is a function called “parse”, described here:

Listing 4.2: Parser.mo function parse

  function parse "realize the syntax analysis over the list of tokens and generates the AST tree"
    input list tokens "list of tokens from the lexer";
    input String fileName "file name of the source code";
    input Boolean debug "flag to output debug messages that explain the states of the machine while parsing";
    output Boolean result "result of the parsing";
    output ParseCode.AstTree ast "AST tree that is returned when the result output is true";

The complete file Parser.mo is available in Appendix C.1.

ParseTable.mo

The ParseTable.mo file contains the arrays that allow Parser.mo to run the PDA and perform the Shift-Reduce actions. It contains 10 integer constants and 15 arrays extracted from the file parser.c, which is generated by BISON. In this section we cover only the most significant arrays used by the parser; a more detailed explanation can be found in Popuri [2006].

As explained before, when the parser reads the code of a token from the lexer, it must translate it into a local code. The array yytranslate contains the converted code for each original code of the lexer. The numeration of the tokens starts at index 3; the first 2 positions are used for the reserved tokens "error" and "$undefined". If the token is a UTF-8 character, the table yytranslate contains the corresponding UTF-8 code. Otherwise the numeration starts after index position 255.

Once a token has been read from the lexer, the parser starts the algorithm that queries the yypact array. This determines the required action over the input, based on the current state and the current token code, similar to the ACTION table in LALR parsing. The array yypact can contain positive or negative values. Positive values indicate that a Shift action must be done, and negative values indicate that a Reduce action should be done. If the required action is Shift, the PDA pushes the value obtained onto the state stack. If the required action is Reduce, the array yydefact is queried to catch possible errors in the input. The constants YYPACT_NINF and YYTABLE_NINF are used to identify the errors in the corresponding arrays yypact and yytable and to forward them to the error handler function. During the Reduce operations the array yyr2 is queried to find the number of tokens to reduce, and the arrays yyr1, yypgoto, yytable and yydefgoto are used similarly to the GOTO table in LALR parsing. These operations are sent to the function ‘actionRed’, located in the file ParseCode.mo, to construct the AST. The auxiliary arrays yytoknum and yytname are used to identify the token codes and names. A complete sample of the generated file ParseTable10.mo is available in Appendix E.1.

ParseCode.mo

The file ParseCode.mo contains the specific Reduce operations that each grammar performs when a certain rule matches the input tokens. The main function of this file is to handle the MultiTypedStack that the parser uses to construct the AST. One of the difficulties found during the development of this part was the strictly typed structure required for the AST. This difficulty was overcome with the use of a MultiTypedStack, which handles the reduce operations requested by the LALR parsing algorithm.

To explain how it works, we refer to Section 2.1.4, where the LALR algorithm is explained. Let us assume that we have a grammar that has 3 different types. The first and the second are the primitive types Integer and String, and the third is a new type called Absyn.Exp.


The MultiTypedStack contains one stack for each type found in the grammar specification. It is defined in the MetaModelica language as presented here:

Listing 4.3: MultiTypedStack AstStack

  uniontype AstStack
    record ASTSTACK
      list stackExp;
      list stackString;
      list stackInteger;
    end ASTSTACK;
  end AstStack;

When the parser finds a Shift operation it calls the function push in this file, which pushes a String value onto stackString. During a reduce operation the parser needs to know which stack to use for each of the constructions of the AST. The way the MultiTypedStack works is shown by the following example, which presents the reduce operation, the build of the AST node and finally the push of the constructed node back onto the stack. In listing 4.4 the reduction takes three items from three different stacks and constructs an Absyn.ALGORITHMITEM object. After this it pushes the result back onto the stack called skAlgorithmItem, which holds the values of the Absyn.Algorithm type. Another feature that can be useful for the reductions is the use of the info keyword. In listing 4.4, we can observe that the fourth line uses a stack called skToken to retrieve the token information. The call returns an info value of type Absyn.Info, which contains the combined information of the first and the last token in the stack used for this reduction. This allows the developer to insert location information that can be used later in the other phases of the compiler.

Listing 4.4: ParseCode.mo case reduce action

  case (57, _) // #line 344 "parserModelica.y"
    equation
      // reduce
      (info, skToken) = getInfo(skToken, mm_r2[act]);
      v3String :: skString = skString;
      v2Comment :: skComment = skComment;
      v1Algorithm :: skAlgorithm = skAlgorithm;
      // build
      vAlgorithmItem = Absyn.ALGORITHMITEM((v1Algorithm), SOME((v2Comment)), info);
      // push Result
      skAlgorithmItem = vAlgorithmItem :: skAlgorithmItem;
    then ();
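The reduce, build and push steps of a MultiTypedStack can also be illustrated outside MetaModelica. The following Python sketch is an assumption-laden illustration: the class `AstStack` mirrors only the three stacks of listing 4.3, and the rule handled by `reduce_binary` (an expression built from two integers and an operator string) is hypothetical and not part of the Modelica grammar.

```python
from dataclasses import dataclass, field

@dataclass
class AstStack:
    """One stack per type; index 0 of each list is the top of the stack."""
    stack_exp: list = field(default_factory=list)
    stack_string: list = field(default_factory=list)
    stack_integer: list = field(default_factory=list)

def push_string(stk, value):
    # A Shift pushes the token text onto the String stack.
    stk.stack_string.insert(0, value)

def reduce_binary(stk):
    """Hypothetical rule exp : INTEGER op INTEGER, built as a tuple."""
    # reduce: pop the right-hand side values (the last pushed comes first)
    right = stk.stack_integer.pop(0)
    left = stk.stack_integer.pop(0)
    op = stk.stack_string.pop(0)
    # build: construct the AST node
    node = ("BINARY", left, op, right)
    # push result: store the node on the stack of the result type
    stk.stack_exp.insert(0, node)
```

Because every value is popped from the stack of its own type, the node constructor always receives arguments of the expected types, which is the property the strictly typed AST requires.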

One final task found in this file is the function getAST, which returns the resulting AST to the parser. Listing 4.5 shows the interface of the function getAST implemented in MetaModelica code.

Listing 4.5: ParseCode.mo function getAST

  function getAST "returns the AST built by the parsing"
    input AstStack astStk "MultiTypedStack used by the parser";
    output AstTree ast "returns the AST in the final type of the tree";

A complete sample of the generated file ParseCode10.mo is available in Appendix E.2.

4.3

OpenModelica Compiler-Compiler Parser (OMCCp)

The OpenModelica Compiler-Compiler parser generator (OMCCp) is the tool implemented in this thesis. It generates a Lexer and a Parser in the MetaModelica language based on grammar files specified in BNF. OMCCp is composed of two packages: the Lexer Generator and the Parser Generator. Both are explained in this section.

4.3.1

Lexer Generator

The Lexer Generator for OMCCp is implemented in the file LexerGenerator.mo. It receives as input the file lexer.c generated by FLEX, which was introduced in Section 3.2. The FLEX file contains the information required by the file Lexer.mo explained in Section 4.2.1. For the generation of the files, a suffix for the names of the generated files is required. We recommend using the name of the language as the suffix for the generated files, although this is not enforced. The Lexer Generator generates three files: Lexer[Suffix].mo, LexTable[Suffix].mo and LexerCode[Suffix].mo. By design, all the files generated by OMCCp include the suffix after their name, so that the parsers of different languages do not share file names. For example, if the language grammar is Modelica the generated files are LexerModelica.mo, LexTableModelica.mo and LexerCodeModelica.mo. At the beginning of every generated file, the Lexer Generator adds a comment with the date and time at which the file was created. The same is done for all the files generated by the Parser Generator.

Generation of Lexer[Suffix].mo

The generation of the file Lexer[Suffix].mo does not require any search in the file generated by FLEX. The Lexer Generator makes a copy of the original Lexer.mo and changes the names that refer to the other generated files: for example, LexTable becomes LexTable[Suffix] and LexerCode becomes LexerCode[Suffix]. Additionally, it changes the name of the package to Lexer[Suffix].

Generation of LexTable[Suffix].mo

The generation of the file LexTable[Suffix].mo requires a search in the first section identified in the file lexer.c generated by FLEX. The regular expression function regex provided by the included file System.mo is used to match the declarations of the arrays and variables explained in Section 4.2.1.

Generation of LexerCode[Suffix].mo

The generation of the file LexerCode[Suffix].mo requires a search in the third and last section identified in the file lexer.c generated by FLEX. The regular expression function regex provided by the included file System.mo is used to match the return actions specified for each final state of the DFA explained in Section 4.2.1. The complete file LexerGenerator.mo is available in Appendix B.2, together with samples of the files explained here in the sections B.1, E.4 and E.5.
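The two generation steps described above, extracting array declarations with a regular expression and renaming the package references with the suffix, can be sketched as follows. This is an illustrative Python sketch using the standard `re` module, not the regex function of System.mo; the function names `extract_array` and `add_suffix` and the exact patterns are assumptions for the example.

```python
import re

def extract_array(c_source, name):
    """Return the integer values of the C array declaration named `name`."""
    pattern = r"\b" + re.escape(name) + r"\s*\[[^\]]*\]\s*=\s*\{(.*?)\}\s*;"
    match = re.search(pattern, c_source, re.DOTALL)
    if match is None:
        raise KeyError(name)
    return [int(v) for v in re.findall(r"-?\d+", match.group(1))]

def add_suffix(template, suffix):
    """Rename the package references in a copy of the template file."""
    for name in ("LexTable", "LexerCode", "Lexer"):
        # \b keeps "Lexer" from matching inside "LexerCode"
        template = re.sub(r"\b" + name + r"\b", name + suffix, template)
    return template
```

With the suffix Modelica, a template line such as `import LexTable;` becomes `import LexTableModelica;`, which matches the file names given in the text.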


Considerations in FLEX

The DFA algorithm found in FLEX was implemented in MetaModelica following the functional programming paradigm. This transition was not straightforward, since the file generated by FLEX contains goto instructions, which redirect the control flow of the program to any place in the code. This behavior was emulated in MetaModelica with recursive functions. FLEX generates C code, in which array indices start at position 0, whereas in MetaModelica array indices start at position 1. All the arrays extracted from the file lexer.c were reduced in the first position to keep the addressing of the arrays inside the DFA equivalent. For the extraction of the required information, three main sections were identified in the code generated by FLEX: the first includes the definitions of variables and arrays, the second is the DFA, and the last is the return of the specific token for each action.
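The emulation of goto with recursion can be illustrated with a small sketch. It is written in Python rather than MetaModelica, and the function names `match` and `find_action` as well as the dictionary-based transition table are assumptions for the example; each function plays the role of a label, and a jump to a label becomes a call.

```python
def match(codes, pos, state, table):
    """Label 'match': follow transitions while one exists."""
    if pos < len(codes):
        nxt = table.get((state, codes[pos]))
        if nxt is not None:
            return match(codes, pos + 1, nxt, table)  # jump back to 'match'
    return find_action(pos, state)                    # jump to 'find_action'

def find_action(pos, state):
    """Label 'find_action': report where the match stopped."""
    return pos, state
```

In a language with tail-call optimisation such recursion runs as a loop; Python does not optimise tail calls, so this sketch is only suitable for short inputs.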

4.3.2

Parser Generator

The Parser Generator for OMCCp is implemented in the file ParserGenerator.mo. Like the Lexer Generator, the Parser Generator receives a C file as input, in this case the file parser.c generated by GNU Bison, which was introduced in Section 3.3. The structure of the file parser.c is more complex than that of the file generated by FLEX, because it implements a PDA algorithm instead of the DFA of the Lexer. However, it keeps some similarity in the structure and the sections that we previously identified in the file lexer.c. The file parser.c contains the information required by the file Parser.mo explained in Section 4.2.2.

Modifications in the grammar file in Bison

The file generated by GNU Bison was insufficient to generate the files required to parse the grammar and output the AST. The grammar file was therefore modified to include MetaModelica code inside the actions of the grammar rules, which allows the generation of the AST. The first modification is in the epilogue of the grammar file, where the type definitions for the rules are given as presented in listing 4.6.

Listing 4.6: Modifications in the Bison Epilogue

  %{
  import Absyn;
  type Restriction = Absyn.Restriction;
  type ClassDef = Absyn.ClassDef;
  ...
  %}

The second modification is in the rule description section. A type inside brackets is written after each variable to specify its type. The parser generator reads these types and uses them to assign the Reduce action of each part to the corresponding typed stack of the MultiTypedStack explained above. An example is presented in listing 4.7, where the types Restriction and ClassDef are specified right after each variable by the tokens [Restriction] and [ClassDef]. The types must be specified for both the returning variable and the transformed input.

Listing 4.7: Modifications in the Rules section in Bison

  %%
  ..
  restriction : CLASS { $$[Restriction] = Absyn.R_CLASS(); }
              | MODEL { $$[Restriction] = Absyn.R_MODEL(); }
              | RECORD { $$[Restriction] = Absyn.R_RECORD(); }
              | T_PACKAGE { $$[Restriction] = Absyn.R_PACKAGE(); }
              | TYPE { $$[Restriction] = Absyn.R_TYPE(); }
              | FUNCTION { $$[Restriction] = Absyn.R_FUNCTION(); }
              | UNIONTYPE { $$[Restriction] = Absyn.R_UNIONTYPE(); }
              | BLOCK { $$[Restriction] = Absyn.R_BLOCK(); }
              | CONNECTOR { $$[Restriction] = Absyn.R_CONNECTOR(); }
              | EXPANDABLE CONNECTOR { $$[Restriction] = Absyn.R_EXP_CONNECTOR(); }
              | ENUMERATION { $$[Restriction] = Absyn.R_ENUMERATION(); }
              | OPERATOR FUNCTION { $$[Restriction] = Absyn.R_OPERATOR_FUNCTION(); }
              | OPERATOR RECORD { $$[Restriction] = Absyn.R_OPERATOR_RECORD(); }
              | OPERATOR { $$[Restriction] = Absyn.R_OPERATOR(); }

  classdef : classparts T_END IDENT { $$[ClassDef] = Absyn.PARTS({}, $1[ClassParts], NONE()); }
           | T_END IDENT { $$[ClassDef] = Absyn.PARTS({}, {}, NONE()); }
           | string classparts T_END IDENT { $$[ClassDef] = Absyn.PARTS({}, $2[ClassParts], SOME($1)); }
           | classdefenumeration { $$[ClassDef] = $1[ClassDef]; }
           | classdefderived { $$[ClassDef] = $1[ClassDef]; }

  ...
  %%
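To make the role of the bracketed types more concrete, the sketch below shows one way a typed action could be turned into the reduce, build and push lines seen in listing 4.4. It is a Python illustration under stated assumptions and not the algorithm of ParserGenerator.mo: the function `translate_action`, the naming scheme of the variables and stacks, and the rule that untyped positions default to the String stack are all guesses derived from the shape of listing 4.4.

```python
import re

def translate_action(action, rhs_len):
    """Turn a typed grammar action into reduce, build and push lines."""
    # types given as $i[Type]; untyped positions are assumed to be String
    types = {int(i): t for i, t in re.findall(r"\$(\d+)\[(\w+)\]", action)}
    result_type = re.search(r"\$\$\[(\w+)\]", action).group(1)
    lines = []
    # reduce: pop the right-hand side, last symbol first
    for i in range(rhs_len, 0, -1):
        t = types.get(i, "String")
        lines.append("v%d%s :: sk%s = sk%s;" % (i, t, t, t))
    # build: replace the $ references by the popped variables
    body = re.sub(r"\$\$\[(\w+)\]", r"v\1", action)
    body = re.sub(r"\$(\d+)\[(\w+)\]", r"v\1\2", body)
    body = re.sub(r"\$(\d+)", r"v\1String", body)
    lines.append(body)
    # push: store the result on the stack of the result type
    lines.append("sk%s = v%s :: sk%s;" % (result_type, result_type, result_type))
    return lines
```

Applied to the third alternative of classdef with three right-hand side symbols, the sketch pops from skString, skClassParts and skString, builds vClassDef, and pushes it onto skClassDef.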

The keyword (absyntree), visible in listing 4.7, is used to generate the final output of the AST. The specification of this keyword in the grammar rules is mandatory.

Considerations in Bison

GNU Bison does not know the semantic value of the tokens. The tokens that the parser receives in the list of tokens contain this semantic value, and it can be used to display the error token in the error handling procedure. For the error hints explained in Section 4.4, however, a new list structure was introduced to specify this semantic value. This new list is called lstSemValue and is shown in listing 4.8. The order of the list is the same as that of the list yytname generated by GNU Bison. The list should be placed in the epilogue of the grammar file; if it is omitted, the error handling mechanism displays the token names as they appear in the list yytname.

Listing 4.8: List of semantic values of tokens

  constant list lstSemValue = {
    "error", "undetermined", "read", "write", ":=", "if", "then", "endif",
    "else", "to", "do", "end", "while", "(", ")", "Identity", "Integer",
    "=", "=", "", "+", "-", "*", "/", ";", ""
  };

Some of the lists generated by GNU Bison make use of index position 0. Unlike the FLEX arrays, these arrays were copied verbatim to the ParseTable.mo file, and a displacement of 1 was introduced for them in the algorithm of the PDA implemented in Parser.mo.

MetaModelica in Bison file

The epilogue and prologue of the grammar file are the places for inserting custom code. This custom code can be used to implement advanced functionality in the parser, such as printing intermediate values or auxiliary functions for constructing the AST. The complete file ParserGenerator.mo is available in Appendix C.2, together with samples of the files explained here in the sections C.1, E.3, E.1 and E.2.

4.4

Error handling

The error handling implemented in OMCCp is an adaptation of the methods presented by Burke and Fisher Jr [1982] and Bilos [1983], explained in Section 2.2. To implement these techniques, a backup of the state of the configuration must be saved after each shifted token. Four structures were created to handle errors in OMCCp: two for handling the errors (the error stack and the error state variable) and two to keep the backup of the configuration (astStack and the state stack). Primary recovery and error message techniques were implemented and are explained in this section.

4.4.1

Error recovery

In the OMC it is not desirable to use any correction technique for malformed code, mainly because an erroneous automatic correction may lose the semantic intention of the developer in the source code. This happens, for example, when the error recovery merges two tokens instead of inserting the one intended by the developer, which results in a completely different behaviour in the flow of the program. Despite these considerations, the parsing tables of GNU Bison provide a simple error recovery technique, which consists in ignoring the error token. Following this action, an error flag is activated to detect whether the error can be recovered or not. The technique works as follows. It uses an integer error flag and assigns it the value of a constant (see maxErrShiftToken in listing 4.9) when an error is found. It then decreases the counter each time an element is shifted. If the flag gets back to zero, the error has been successfully recovered. If another error is found before the flag gets to zero, the program stops. The constant maxErrShiftToken can only be modified in the source code of the file Parser.mo.

Listing 4.9: Constants for error handling

  /* when the error is positive the parser runs in recovery mode,
     if the error is negative, the parser runs in testing candidate mode */
  constant Integer maxErrShiftToken = 3;
  constant Integer maxCandidateTokens = 4;
  constant Integer maxErrRecShift = -5;

OMCCp tries to parse as much as possible with this technique: it ignores the error token but keeps the errors in an error stack, which it later uses to display the errors and fail the parsing. The Parser uses an error flag to determine the error state, similar to BISON but with extra functionality. If the error state is zero, either no error is present or the error has been recovered. When the error state is positive, the parser runs in recovery mode and decreases the value after each shifted token; when it reaches zero, the error has been recovered by ignoring the token. If the error state is negative, the parser runs in testing candidate mode for a specific error token. This is an important feature for improving the error messages and is explained in the next part.
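The countdown behaviour of the error flag in recovery mode can be sketched as two small transition functions. The Python sketch below is an illustration only: the constant takes the value of maxErrShiftToken from listing 4.9, but the function names `on_error` and `on_shift`, and the use of None to signal that parsing stops, are assumptions. The testing candidate mode (negative values) is left untouched by this sketch.

```python
MAX_ERR_SHIFT_TOKEN = 3  # value of maxErrShiftToken in listing 4.9

def on_error(err_flag):
    """An error was found: start recovery, or give up if already recovering."""
    if err_flag > 0:
        return None              # second error before recovery: stop parsing
    return MAX_ERR_SHIFT_TOKEN

def on_shift(err_flag):
    """A token was shifted: count down towards the recovered state."""
    return err_flag - 1 if err_flag > 0 else err_flag
```

Starting from `on_error(0)`, three successful shifts bring the flag back to zero, which corresponds to a recovered error.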

4.4.2

Error messages

OMCCp uses a primary recovery technique to display all possible recovery candidates to the OpenModelica developer. When an error is found, the parser calls the errorHandler function with the environment variables that contain the current configuration of the parser and the configuration backed up when the last token was shifted. This backup is used to start an extensive search for the possible valid tokens.
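A search for valid tokens of this kind can be sketched as a loop that edits the token list around the error position and tests each edit. The Python sketch below is a simplification under stated assumptions: the function `candidates` and the predicate `accepts` are hypothetical, and `accepts` judges the whole edited input, whereas the parser described here resumes from the backed-up configuration and only shifts a limited number of tokens.

```python
def candidates(tokens, err_pos, vocabulary, accepts, max_candidates=4):
    """Collect repair suggestions for the error token at `err_pos`."""
    found = []
    # erase: drop the error token and continue with the next one
    if accepts(tokens[:err_pos] + tokens[err_pos + 1:]):
        found.append(("erase", tokens[err_pos]))
    for cand in vocabulary:
        # insert: place the candidate before the error token
        if accepts(tokens[:err_pos] + [cand] + tokens[err_pos:]):
            found.append(("insert", cand))
        # replace: put the candidate in place of the error token
        if accepts(tokens[:err_pos] + [cand] + tokens[err_pos + 1:]):
            found.append(("replace", cand))
    return found[:max_candidates]  # limit the suggestions that are displayed
```

The order of `vocabulary` determines the order of the suggestions, so placing the most common tokens first makes them appear first in the message.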


The input list of tokens is modified according to the type of error that is going to be tested. The Parser uses a negative constant (see maxErrRecShift in listing 4.9) that specifies the maxim number of tokens to shift after accepting a token between the possible candidates. This ensure consistence with the rest of the input. If a new error is found, then the token is discarded as a possible candidate, and the Parser continue with the next token in the list of tokens or the next type of error. A constant visible in listing 4.9 called maxCandidateTokens constrains the number of candidates the error message displays. The developer can sort the tokens in the grammar file to prioritise the most common suggested tokens and the order it will be displayed to the user. This allows the developer to understand better the messages and select the correct token more easy. As presented in section 2.2.2, we distinguish between 8 types of error messages for single recovery. In this implementation we use 7 different kind of error messages for the syntax analysis and one more for the lexical analysis. The error messages displayed in this implementation are: • Erase token • Insert token • Replace token • Insert token at the End • Merge tokens • Generic error or Undetermined • Custom error message For each error type a different test must be run in the Parser. Now we will explain the conditions required for displaying each kind of message. Erase token To test if a token can be erased, we run the Parser modifying the remaining list of tokens by placing the next token found after the error token as current token and ignoring the current token. If the test succeeds we display


the proper message to the OpenModelica developer, suggesting the erasure of the current token as a possible solution for the error. The error recovery technique explained above allows the parser to keep running until the next error is found.

Insert token

To test if a token can be inserted before the error token, we run the Parser on a modified list of the remaining tokens in which the candidate token is placed before the error token as the current token. If the test succeeds, the candidate token is added to the candidate list. All the available tokens are tested. If the candidate list is not empty, we display the proper message to the OpenModelica developer, suggesting the insertion of one of the tokens in the candidate list as a possible solution for the error.

Replace token

As in the case of Insert token, we run the Parser on a modified list of the remaining tokens, this time replacing the error token with the candidate token as the current token. If the test succeeds, the candidate token is added to the candidate list. All the available tokens are tested. If the candidate list is not empty, we display the proper message to the OpenModelica developer. The message suggests the replacement of the error token with one of the tokens in the candidate list as a possible solution for the error.

Insert token at the End

This case is used only at the end of the program, when no more tokens are available in the input and the Parser has not reached a valid acceptance state. All the tokens are tested to verify whether they allow the program to end in a valid acceptance state. If a token succeeds, we display the proper message to the OpenModelica developer, suggesting the insertion of that token as a possible solution for the error.

Merge tokens

Sometimes a space is inserted by mistake inside a keyword, which makes the keyword look to the Parser like two separate identifier tokens. In this case


the error token and the token that follows it are processed again by the Lexer with their semantic values concatenated. If the Lexer combines them into a valid token, this token is tested to see whether it leads to a valid configuration for the Parser. If the test succeeds we display the proper message to the OpenModelica developer, suggesting the merging of the error token with the next token as a possible solution.

Generic error or Undetermined

It is possible that no solution or candidate is found for the current error token. In this case a generic error message is displayed to the OpenModelica developer, with no description of the error other than the location of the token. The developer then needs to fix the error token without any hint.

Custom error message

While designing a grammar it is sometimes necessary to tell the developer, in clearer language than that of the error messages above, that a certain rule must not be used. For this reason a custom error message has been introduced. The error message is placed in the grammar as presented in listing 4.10. The keywords error and errorMsg are used to specify that an error has been found. Using this custom message activates the error flag in the parser and starts the simple error recovery technique explained above.

Listing 4.10: Custom error messages in OMCCp

    equation : exp EQUALS exp
                 { $$[Equation] = Absyn.EQ_EQUALS($1[Exp], $2[Exp]); }
             | exp ASSIGN exp
                 { error = true;
                   errorMsg = "Assignments cannot be used inside equations"; }

A sample of the different types of messages is presented in listing 4.11. The source code containing the errors is included in listing 4.12.

Listing 4.11: Error messages in OMCCp

    ***ERROR(S) FOUND***
    program.mo:3:18: Syntax ERROR near token: [T_DO 'do'], MERGE tokens [T_DO 'do'] and [T_IDENT 'ans'], ERASE token
    program.mo:4:13: Syntax ERROR near token: [T_ADD '+'], INSERT token {T_INTCONST or T_IDENT}, REPLACE token with {T_LPAREN}, ERASE token
    program.mo:9:3: Syntax ERROR near token: [T_ENDIF 'endif'], INSERT at the End token {T_END}
    FAILED PARSING

Listing 4.12: program.mo with errors

    /* PAM program with errors */
    read x y z w;
    while x <> 99 do do
      ans := (x++111) - (y/3);
      write ans;
      read x, y;
      if x = 10 then
        y := 234;
      endif
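The Merge tokens test described in this section can be sketched as follows. This is a minimal illustration written in Python rather than MetaModelica so that it can be run on its own. The keyword table, the token names and the helpers lex_word and merge_candidate are hypothetical and are not taken from the OMCCp sources.

```python
# Hypothetical keyword table; the real token names come from the grammar files.
KEYWORDS = {"while": "T_WHILE", "do": "T_DO", "endif": "T_ENDIF"}


def lex_word(text):
    """Return the token kind for one lexeme, or None if it is not a valid token."""
    if text in KEYWORDS:
        return KEYWORDS[text]
    if text.isidentifier():
        return "T_IDENT"
    return None


def merge_candidate(tokens, pos):
    """Try to merge the error token at `pos` with the token that follows it.

    Tokens are (kind, text) pairs. Returns the modified token list, or None
    when there is no following token or the concatenated text is not a
    single valid token.
    """
    if pos + 1 >= len(tokens):
        return None
    text = tokens[pos][1] + tokens[pos + 1][1]
    kind = lex_word(text)
    if kind is None:
        return None
    return tokens[:pos] + [(kind, text)] + tokens[pos + 2:]


# "wh ile x": a stray space split the keyword into two identifiers.
tokens = [("T_IDENT", "wh"), ("T_IDENT", "ile"), ("T_IDENT", "x")]
print(merge_candidate(tokens, 0))
```

In the technique described above, the merged list is only a candidate: it still has to be run through the Parser before the MERGE suggestion is displayed.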

In summary, OMCCp implements a non-corrective primary error recovery technique. It consists of ignoring the error token and parsing the rest of the source code if possible. If any error is found, the parsing fails and OMCCp displays a combination of 6 different kinds of error messages. These messages are displayed depending on whether it is possible to Erase, Insert, Replace, Insert at the End or Merge tokens. When none of these is suitable for recovering from the error token, a generic message for undetermined errors is displayed.
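The Erase, Insert and Replace tests summarised above can be sketched as a bounded trial search. The sketch is in Python rather than MetaModelica, and everything in it is an assumption made for illustration: first_error is a toy stand-in for the LALR(1) parser that only understands simple arithmetic expressions, the token names are invented, and max_shift and max_candidates are positive counterparts of the maxErrRecShift and maxCandidateTokens constants.

```python
# Candidates are reported in this order, mirroring how the order of tokens in
# the grammar file prioritises suggestions. Token names are hypothetical.
ALL_TOKENS = ["INT", "IDENT", "LPAREN", "RPAREN", "ADD", "SUB"]


def first_error(tokens):
    """Toy stand-in for the parser: index of the first offending token,
    len(tokens) if the input ends too early, or None if everything parses."""
    depth = 0
    expect_operand = True
    for i, tok in enumerate(tokens):
        if expect_operand:
            if tok in ("INT", "IDENT"):
                expect_operand = False
            elif tok == "LPAREN":
                depth += 1
            else:
                return i
        elif tok in ("ADD", "SUB"):
            expect_operand = True
        elif tok == "RPAREN" and depth > 0:
            depth -= 1
        else:
            return i
    return len(tokens) if expect_operand or depth > 0 else None


def passes(tokens, pos, max_shift):
    """A trial succeeds if no error occurs within max_shift tokens after pos."""
    err = first_error(tokens)
    return err is None or err - pos > max_shift


def recovery_candidates(tokens, pos, max_shift=1, max_candidates=4):
    """Run the erase, insert and replace tests for the error token at pos."""
    before, after = tokens[:pos], tokens[pos + 1:]
    insert = [t for t in ALL_TOKENS
              if passes(before + [t] + tokens[pos:], pos, max_shift)]
    replace = [t for t in ALL_TOKENS
               if t != tokens[pos] and passes(before + [t] + after, pos, max_shift)]
    return {
        "ERASE": passes(before + after, pos, max_shift),
        "INSERT": insert[:max_candidates],
        "REPLACE": replace[:max_candidates],
    }


# x + + 111: the second ADD (index 2) is the error token.
result = recovery_candidates(["IDENT", "ADD", "ADD", "INT"], 2)
print(result)
```

With these toy rules the search suggests erasing the token, inserting INT or IDENT, or replacing the token with LPAREN. The last suggestion survives only because the unbalanced parenthesis shows up after the shift limit, which illustrates why the candidates are hints rather than guaranteed corrections.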

4.5

Integration OMC

The integration of OMCCp with OMC is done by modifying the file Parser.mo, located at ‘/trunk/Compiler/FrontEnd/Parser.mo’. Listing 4.13 shows the original function parse. Line 5 contains an external call to the C function omparse, implemented with ANTLR.

Listing 4.13: Parser.mo original function

    1  public function parse
    2    "Parse a mo-file"
    3    input String filename;
    4    output Absyn.Program outProgram;
    5    external "C" outProgram=Parser_parse(filename)
    6      annotation(Library = {"omcruntime", "omparse", "antlr3"});
    7  end parse;

We modify this function by adding a call to the parser generated by OMCCp. Listing 4.14 shows the modified function parse, which includes the call to the new parser, called ParserModelica.

Listing 4.14: Parser.mo modified function

    public function parse
      "Parse a mo-file"
      input String filename;
      output Absyn.Program outProgram;
      list tokens;
      Boolean result;
    algorithm
      tokens = LexerModelica.scan(filename, false);
      // call the parser
      (result, outProgram) = ParserModelica.parse(tokens, filename, false);
    end parse;


Chapter 5

Discussion

In this chapter we analyse the results and explain the motivation for the design and the proposed implementation. We start by giving details about the construction of OMCCp. We then discuss how OMCCp relates to the OpenModelica Compiler (OMC). Finally we address some limitations found during the implementation.

5.1

Analysis of Results

The design presented in Section 4.2 has two main parts: the generation of the lexer and the parser, and the files that are generated for them.

5.1.1

Lexer and Parser

Let us start with the generated files. Three files are generated for the lexer (Lexer.mo, LexTable.mo and LexCode.mo), three for the parser (Parser.mo, ParseTable.mo and ParseCode.mo) and one more for the tokens (Token.mo). This design was chosen with the aim of achieving functional cohesion and low coupling between the modules and the functions. Functional cohesion is a software quality property that is achieved when the tasks performed by the parts of each module are well defined. In the lexer and the parser we can identify three main parts: the tables, the custom code and the automata.


The DFA and the PDA implemented in the files Lexer.mo and Parser.mo consist of specific functions that perform well-defined tasks. All the data (parsing tables and lexer tables) is separated from the automata in the files LexTable.mo and ParseTable.mo. Finally, the custom code and specific actions are located in a different set of files, LexCode.mo and ParseCode.mo.

Low coupling is a software quality property that is achieved when each module has a low dependency on the other modules and can easily be replaced by another one without a major impact on the overall communication. The two main modules, lexer and parser, are independent of each other, and a whole module can be replaced without affecting the other as long as the parameters (input and output) of the interface remain the same.

One possible negative impact of this solution is the overhead caused by function calls and message passing. However, we cannot measure this properly until the whole Modelica grammar is completed and the proposed parser can be benchmarked against the current solution. Further optimisations can be performed to reduce this possible overhead.
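The separation into tables, custom code and automaton can be illustrated with a very small table-driven lexer. This is a sketch in Python rather than MetaModelica. The three commented parts loosely play the roles of LexTable.mo, LexCode.mo and Lexer.mo, but the transition table, the character classes and the function names are invented for the example and do not reflect the tables that Flex produces.

```python
# --- "LexTable": pure data, no behaviour --------------------------------
TRANSITIONS = {
    (0, "digit"): 1, (1, "digit"): 1,
    (0, "letter"): 2, (2, "letter"): 2, (2, "digit"): 2,
    (0, "space"): 3, (3, "space"): 3,
}
ACCEPTING = {1: "INT", 2: "IDENT", 3: "SPACE"}


# --- "LexCode": the custom action run for each recognised lexeme --------
def action(kind, text):
    """Build a token, or return None to skip the lexeme (white space)."""
    if kind == "SPACE":
        return None
    return (kind, text)


# --- "Lexer": the automaton, which knows nothing about concrete tokens --
def char_class(ch):
    if ch.isdigit():
        return "digit"
    if ch.isalpha():
        return "letter"
    if ch.isspace():
        return "space"
    return "other"


def scan(text):
    tokens = []
    pos = 0
    while pos < len(text):
        state = 0
        last_accept = None  # (token kind, end index) of the longest match so far
        i = pos
        while i < len(text):
            nxt = TRANSITIONS.get((state, char_class(text[i])))
            if nxt is None:
                break
            state = nxt
            i += 1
            if state in ACCEPTING:
                last_accept = (ACCEPTING[state], i)
        if last_accept is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        kind, end = last_accept
        token = action(kind, text[pos:end])
        if token is not None:
            tokens.append(token)
        pos = end
    return tokens


print(scan("x1 42"))
```

Because scan depends only on the two tables and on action, either of them can be regenerated or swapped without touching the automaton, which is the kind of low coupling the design aims for.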

5.1.2

OMCCp Construction

The generation of the files is based on FLEX and GNU Bison. The main reason for this is the correctness and efficiency of the algorithms they implement. Using these tools to generate the tables allows OMCCp to reuse their algorithms for implementing the lexer and the LALR(1) parser. The generation of these tables is known to be a complex and error-prone task, as presented in Section 2.1.4.

We succeeded in implementing the algorithms of FLEX and GNU Bison in MetaModelica. MetaModelica differs from C in several respects. One of them is that MetaModelica is a functional programming language and makes use of recursion and loops instead of GOTO instructions. In addition, the values of the variables are kept in the program by passing them through input and output parameters, instead of through global variables as in C.

The steps we went through during the implementation of OMCCp are broadly sketched here:

1. Prototype of a Lexer based on Flex.


2. Prototype of a Parser based on Bison.
3. Integration of the Lexer and the Parser through a common token specification.
4. Automatic generation of the lexer and parser table files based on the code generated by Flex and Bison.
5. Definition of the changes required in the grammar definition files to integrate MetaModelica code in the generated Flex and Bison files.
6. Import of the customised code generated by Flex and Bison into the action resolution file for the lexer and the parser.
7. Implementation of the error handling mechanism and of the generation of the candidate correction set for improved error handling.
8. Testing of the parser with a large subset of the Modelica 3.2 + MetaModelica grammar to find and fix issues.
9. Running the test suite with the parser generated from the large subset of the Modelica 3.2 + MetaModelica grammar to prove correctness.

The files produced during the implementation of OMCCp are presented in Table 5.1. The files ending in Modelica.mo are generated from the grammar files. A total of 12340 lines of code were produced during this project, which gives an idea of the total effort spent on the construction of this parser.

The software tools used to support this development are listed below:

• Linux Ubuntu (10.10) operating system.
• Eclipse Galileo (3.5.2) with the OpenModelica MDT plug-in.
• OpenModelica Compiler (OMC), kept up to date with the latest source code (1.7.0 - r8600).
• Flex (2.5.35) and Bison (2.4.3) compilers.
• gcc compiler (4.4.5).
• Subversion (1.6.12 - r955767) for version control.


Table 5.1: OMCCp Files Implementation

FILENAME                  Lines of Code
LexerGenerator.mo                   441
LexerCode.tmo                        67
Lexer.mo                            467
NewParser.mo                         77
OMCC.mo                              95
OMCC.mos                             11
ParserGenerator.mo                  891
Parser.mo                           896
ParseCode.tmo                       122
Types.mo                            194
SCRIPT.mos                           38
lexerModelica.l                     207
LexerCodeModelica.mo                728
LexerModelica.mo                    469
LexTableModelica.mo                 425
parserModelica.y                    832
ParserModelica.mo                   898
ParseCodeModelica.mo               4515
ParseTableModelica.mo               843
TokenModelica.mo                    124
TOTAL                             12340

5.1.3

Implementation of a subset of Modelica and MetaModelica grammar

A large subset of the Modelica 3.2 and MetaModelica 1.0 grammar was implemented based on the grammar of the Modelica 3.2 specification [Modelica-Association, 2010] and the MetaModelica specification [Fritzson and Pop, 2011b]. The subset is large enough to parse all the source code of OMCCp. It includes all the class types of Modelica and around 90% of the Modelica 3.2 specification. In addition, the subset includes the MetaModelica extensions used by OMCCp. The lexer includes 100% of the tokens used by Modelica, but without support for identifiers inside single quotes ('QIdent') or for characters outside the common alphabet.

During the implementation of this grammar, some issues were found and corrected inside the parser. The list below identifies some of the changes made during the construction of the grammar.

• Recursion was replaced by loops for the input of both the lexer and the parser, because of stack overflow errors.
• The reduction of the MultiTyped stack popped the values in the order of the variables, instead of in reverse order, which is the correct way to pop the right values.
• Custom error messages that can be inserted in the parser were implemented.
• An additional bug, concerning the conversion of large files from characters into integers, was found in the OMC compiler.

The grammar files lexerModelica.l and parserModelica.y

The tokens ENDIF, ENDFOR, ENDWHILE, ENDWHEN and ENDCLASS were added to the lexer grammar to avoid ambiguity in the LALR(1) parser. Some shift-reduce conflicts were found during the construction of the grammar. When this happens the parser selects a shift over a reduction, which gives the longest rules priority over the shorter ones. We completely avoided reduce-reduce conflicts. These are a symptom of ambiguity in the grammar and should always be avoided. The way to avoid


reduce-reduce conflicts is to avoid different rules in which the same reductions can be performed. They often appear when empty transitions are implemented. The grammar files lexerModelica.l and parserModelica.y are available in Appendices F.1 and F.2.

Testing the Modelica grammar and performance

We ran a test over the files of the test suite library implemented for Modelica. 48 out of 571 tests failed, that is, about 92% of the tests passed, which suggests that around 92% of the Modelica grammar was implemented correctly. The remaining 8% includes parts of the grammar that were not implemented, such as annotations and some prefix tokens such as FINAL or REPLACEABLE in front of certain rules. The test cases that failed are easily identified in the log file, which shows the OMCCp error messages presented in Section 4.4. From there we can identify and test individual Modelica programs until we have added the instructions required to parse the file, or corrected the construction of the AST if the problem lies there.

We discovered that the parser did not scale well to large files, and we optimised first the Lexer and later the Parser. The optimisation consisted of minimising the load of the character list and the token list that are passed as parameters to the recursive functions inside the lexer and the parser. Table 5.2 shows the timing results for all the test cases, including the failed tests, for OMCCp and ANTLR. It is important to point out that when the parser fails it performs a search over the candidate tokens, which increases the parsing time.

Another comparison was performed file by file, as presented in Table 5.3. This test was run over the source code of OMCCp. Figure 5.1 shows that the non-optimised OMCCp took 57 seconds to parse an input of about 162,000 characters. After the optimisations the total time was reduced to about 5 seconds. However, we believe that the MultiTyped stack, which for the Modelica grammar consists of around 70 stacks, is causing overhead in the parser.

The computer used for the test has the following configuration:

• CPU: Intel Core DUO T9300


Table 5.2: Test Suite - Compiler

Compiler               Time (sec)   Result
ANTLR                      19.367   1 out of 571 tests failed
Initial grammar                48   447 out of 568 tests failed
Intermediate grammar           58   264 out of 568 tests failed
No Optimization           124.492   98 out of 568 tests failed
Lexer Optimized            43.294   98 out of 568 tests failed
Parser Optimized           48.657   48 out of 571 tests failed

Table 5.3: OMCCp - Time Parsing

File Name                Length (chars)   No Opt (sec)   Lexer Opt (sec)   Parser Opt (sec)
ParserGenerator.mo                30955     3.29317966       1.362809223        1.221133662
LexerGenerator.mo                 15371    1.098076435       0.603257462        0.573149139
ParserModelica.mo                 31964    3.239946711       1.228217513         1.12769119
LexerModelica.mo                  15726    1.070857021       0.520705675        0.491600411
LexTableModelica.mo               23894    3.163853208       1.940996074        1.708091171
LexerCodeModelica.mo              29346    2.904747315       1.305126972        1.177701876
ParserModelica.mo                 31964    3.136341788       1.220613415        1.127851826
ParseCodeModelica.mo             162160   57.240642819       7.805264418         5.45346943
ParseTableModelica.mo             57439   12.518531729       4.834436661        3.733743697
TokenModelica.mo                   4820    0.187905045       0.147879692        0.137828428
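The speed-up obtained from the two optimisations can be recomputed from the No Opt and Parser Opt columns of Table 5.3, and the pass rate from the last row of Table 5.2. The short Python script below is only a worked check of that arithmetic; the variable names are ours, and the timings are copied from the tables as printed (ParserModelica.mo is listed twice there, with slightly different timings).

```python
# (file, time without optimisation, time with both optimisations), in seconds
rows = [
    ("ParserGenerator.mo", 3.29317966, 1.221133662),
    ("LexerGenerator.mo", 1.098076435, 0.573149139),
    ("ParserModelica.mo", 3.239946711, 1.12769119),
    ("LexerModelica.mo", 1.070857021, 0.491600411),
    ("LexTableModelica.mo", 3.163853208, 1.708091171),
    ("LexerCodeModelica.mo", 2.904747315, 1.177701876),
    ("ParserModelica.mo", 3.136341788, 1.127851826),
    ("ParseCodeModelica.mo", 57.240642819, 5.45346943),
    ("ParseTableModelica.mo", 12.518531729, 3.733743697),
    ("TokenModelica.mo", 0.187905045, 0.137828428),
]

# Speed-up factor per file: how many times faster the optimised parser is.
ratios = [no_opt / parser_opt for _, no_opt, parser_opt in rows]
mean_ratio = sum(ratios) / len(ratios)
print(f"mean speed-up factor:    {mean_ratio:.4f}")
print(f"largest speed-up factor: {max(ratios):.2f}")

# Share of the test suite that passes with the optimised parser (Table 5.2).
pass_rate = (571 - 48) / 571
print(f"tests passed: {pass_rate:.1%}")
```

The mean of the per-file factors comes out at about 3.2, the largest factor (for the 162160-character file) at about 10.5, and the pass rate at about 91.6%.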

Figure 5.1: OMCCp - Time Parsing (time in seconds against input size in characters, for the No Optimization, Lexer Optimized and Parser Optimized versions)


• RAM: 4GB
• OS: Ubuntu 10.10
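The MultiTyped stack issue listed among the parser fixes in this section (values have to be popped in reverse order during a reduction) can be illustrated with a small sketch. It is written in Python rather than MetaModelica, and the dictionary of stacks, the type names and the function reduce_equation are hypothetical; they only mimic the idea of keeping one value stack per type.

```python
def reduce_equation(stacks):
    """Reduce `equation : exp EQUALS exp` using one value stack per type.

    The semantic values were pushed left to right while the rule was being
    recognised, so the right-hand operand is on top of the stack and has to
    be popped first.
    """
    rhs = stacks["Exp"].pop()  # pushed last, popped first
    lhs = stacks["Exp"].pop()
    stacks["Equation"].append(("EQ_EQUALS", lhs, rhs))


# Values of the two `exp` symbols of `x = y + 1`, pushed in source order.
stacks = {"Exp": ["x", "y + 1"], "Equation": []}
reduce_equation(stacks)
print(stacks["Equation"])
```

Assigning the popped values in the order of the variables instead (first pop to lhs, second to rhs) would silently swap the two sides of the equation, which is the kind of error described in the list of fixes.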

5.2

OpenModelica Compiler

OMCCp is expected to benefit the development of the OMC by increasing the maintainability of the grammar. It helps a developer to make modifications to the grammar easily, thanks to the familiarity of tools such as FLEX and GNU Bison and of the MetaModelica language. New modifications and adaptations can be integrated directly into the OMC Parser module by modifying the grammar files.

However, the main beneficiary of this project is expected to be the OpenModelica developer who uses the OMC, who will be helped by better hints when an error is detected by the compiler. The error messages built from the generated candidates help the developer to decide how the program should be corrected.

One additional contribution to the OpenModelica project made during the development of this project was the discovery of a few bugs in the OMC compiler. The reported bugs, which are related to array operations and sub-string functions, were corrected by the OMC developers. This will increase the stability of future versions of the OMC compiler.

The symbol table, commonly used in compiler construction, was not developed in this project because it is only needed by a semantic analyser. In the later phases of the OMC, a symbol table is built from the AST generated by this parser.

5.3

Limitations

The limitations encountered during the development of this project include the lack of knowledge of the MetaModelica language, the availability of the algorithm, and time constraints. They are addressed here.

MetaModelica is an extension of the Modelica language that is only used by the OMC developers. One of the first limitations found in this project was the lack of the knowledge needed to develop the parser in MetaModelica. A guide for MetaModelica exists today in a draft state, and some exercises are available. However, it took about a month of reading, prototype trial


and supervision during the implementation of this project to identify the best practices for using MetaModelica properly to build the parser. A first basic prototype of a handwritten lexer was developed to analyse the benefits and the limitations of this functional programming language. The knowledge and experience in MetaModelica development of the supervisor of this thesis were the key to overcoming the difficulties caused by this lack of expertise and of practical source code.

After this initial prototype, FLEX and GNU Bison were chosen as the basis of the parser generator for this project. We found that the algorithm that creates the transition arrays from the grammar files was not available in FLEX and GNU Bison. Therefore the proposed solution relies on these applications to calculate the arrays, and we then extract this information from the generated C code. A reverse engineering technique was used to understand the code of the generated lexer and parser. Only after we had finished this process did we find the article by Popuri [2006], which clarifies how the GNU Bison parser algorithm works. However, this is not sufficient to generate all the tables produced by GNU Bison.

There were also time limitations that kept the scope of this project too small to replace the current ANTLR parser with the one generated by OMCCp. The grammar for MetaModelica and Modelica is only available for the existing parser, in the form of an LL grammar. Converting the whole set of variables and terminals into an LR grammar would have taken too long to be included in this project. A subset large enough to parse most of the OMC test suite was implemented to prove the efficiency and correctness of the algorithm in parsing this grammar.

Chapter 6

Related Work

In this chapter we cover related work. Several research efforts are connected with our project, covering both the OpenModelica Compiler and the area of compiler construction. We discuss in particular the latest efforts on bootstrapping and related projects in parser generation.

6.1

OpenModelica Development

Modelica is fundamentally a modelling language, and Pop and Fritzson [2006] state that with some small extensions this language can be used for modelling the domain of programming languages. MetaModelica was created with the intention of accomplishing this job. Recent efforts in using MetaModelica to model the OMC were presented by Sjölund et al. [2011] at the Modelica’2011 International Conference. This project aims to contribute to this process by implementing a parser generator in MetaModelica that models the parsing behaviour of the OMC. OMCCp is thus a piece that can fit into the bootstrapping project.

There are also recent efforts in implementing a Modelica compiler, such as that of Akesson et al. [2010], which uses JastAdd (an aspect-oriented compiler construction system). However, these efforts differ from the bootstrapping project referenced above.

Most of the recent projects in OpenModelica are related to the OMC compiler. Ongoing projects such as Java portability, the parallel code generator and this project of a MetaModelica parser generator are proof of this (source: http://www.openmodelica.org/index.php/research/open-master-theses).

6.2

Compiler-Compiler Construction

Compiler construction, and especially parser generation, is still an active research area, even though it is based on established knowledge from research done several decades ago. Several projects can be found, such as “The FIKA parser generator” by Pise [2010], which also supports LALR(1) grammars and aims to implement the reuse of grammars through inheritance, unlike the parser generators we covered in this project (FLEX, Bison and ANTLR).

There are also efforts specifically in the testing of parser generators that can be an important source for further work in this project, such as the research done by Sampath et al. [2007]. This research guides the generation of tokens for testing lexers generated by FLEX.

Finally, in the area of error recovery and messaging there are recent efforts by de Jonge et al. [2010]. This project aims to implement novel techniques for improving the accuracy of current error recovery techniques. These techniques make use of other characteristics of the source code such as indentation, which is usually added to make the code more readable for humans.


Chapter 7

Conclusions

“People think that computer science is the art of geniuses but the actual reality is the opposite, just many people doing things that build on each other, like a wall of mini stones.”
Donald Knuth

In this last chapter we present the final conclusions of this thesis. We base the contribution on the implementation and discussion of OMCCp presented in Chapters 4 and 5. We end this report by addressing some future work desirable for the continuation of this thesis.

7.1

Accomplishments

We succeeded in accomplishing the three goals initially presented in section 1.2 for this master’s thesis:

• A Lexer and a Parser for the Modelica and MetaModelica grammar that output the Abstract Syntax Tree (AST) for the language processed.
• A Lexer and Parser generator written in the MetaModelica language.
• Improved error handling messages compared with ANTLR, specifically the messages concerning error correction hints for malformed syntax.

From the results presented in Chapter 4 and discussed in Chapter 5, we can identify the main contribution of this work, which is the implementation of the OpenModelica Compiler-Compiler parser generator (OMCCp)


that can be included in the bootstrapping project of the OpenModelica Compiler (OMC) as a replacement for the ANTLR tool. OMCCp is a parser generator with low coupling and functional cohesion that allows the generation of MetaModelica code from the Modelica and MetaModelica grammar specification.

OMCCp contributes to the OpenModelica developer's task by implementing error handling techniques that display understandable error correction candidates. This feature will directly benefit OpenModelica developers by giving them understandable hints for correcting the source code.

The implementation of more than 90% of the Modelica 3.2 and MetaModelica grammar is a remarkable advance towards the future replacement of the ANTLR parser.

From the tests performed we conclude that ANTLR is still faster than OMCCp. The two optimisations made the parser on average about 3.2 times faster over the source code of OMCCp (a mean speed-up ratio of 319.76%), and about 10 times faster for the largest file we used, with 160k characters. We believe there is still an overhead in handling the large MultiTyped stack used for the construction of the AST. Despite that, the parser scales linearly after the optimisation. However, the parser should be further optimised for large files in order to improve the performance. After this, OMCCp can be a good candidate to replace ANTLR and be included in the bootstrapping project.

An additional contribution of this work was the discovery of some bugs in the OMC compiler related to array operations and sub-string functions. The OpenModelica developers fixed these bugs. This contribution helps to achieve a better and more stable OMC compiler.

7.2

Future Work

Despite the accomplishments of this thesis, further work is required to achieve other desired goals for the OMC, including the ones enumerated here:

• Optimise the performance of the algorithm for both the lexer and the parser.
• Complete the full set of the grammar for Modelica 3.2 and MetaModelica and extend it for the coming Modelica 4.
• Implement the error recovery mechanisms explained in section 2.2.1. The first and the second methods can be implemented here.
• Integrate the parser generated by OMCCp into the OMC, replacing the current ANTLR parser, as part of the Bootstrapping project [Sjölund et al., 2011].
• Perform benchmark tests to measure the speed in comparison with ANTLR. This can be done more accurately after the whole MetaModelica and Modelica grammars are completed.
• Update the MetaModelica exercises to use OMCCp directly instead of external C calls to Flex and Bison.


Bibliography AA Aaby. Compiler Construction using Flex and Bison. Walla Walla College, 2003. URL http://citeseerx.ist.psu.edu/viewdoc/download? doi=10.1.1.108.114&rep=rep1&type=pdf. [Accessed May 2011]. 23, 32 A.V. Aho, R. Sethi, and J.D. Ullman. Compilers: principles, techniques, and tools. Addison-Wesley, second edition, 2006. xv, 5, 13, 14, 15, 30 J. Akesson, T Ekman, and G Hedin. Development of a Modelica Compiler Using JastAdd. Electronic Notes in Theoretical Computer Science, 203(2):117–131, April 2008. ISSN 15710661. doi: 10.1016/j.entcs. 2008.03.048. URL http://linkinghub.elsevier.com/retrieve/pii/ S1571066108001539. [Accessed May 2011]. 18 J. Akesson, Torbj¨ orn Ekman, and G¨ orel Hedin. Implementation of a Modelica compiler using JastAdd attribute grammars. Science of Computer Programming, 75(1-2):21–38, January 2010. ISSN 01676423. doi: 10.1016/j.scico.2009.07.003. URL http://linkinghub.elsevier.com/ retrieve/pii/S0167642309001087. [Accessed May 2011]. 18, 66 Rober Bilos. Syntactic error diagnosis and recovery. Link¨ oping University, Link¨ oping, 1983. 16, 49

Master’s thesis,

D. Blasband. Parsing in a hostile world. Proceedings Eighth Working Conference on Reverse Engineering, pages 291–300, 2001. doi: 10.1109/WCRE. 2001.957834. URL http://ieeexplore.ieee.org/lpdocs/epic03/ wrapper.htm?arnumber=957834. [Accessed May 2011]. 12 Michael Burke and G.A. Fisher Jr. A practical method for syntactic error diagnosis and recovery. In Proceedings of the 1982 SIGPLAN symposium on Compiler construction, pages 67–78. ACM, 1982. ISBN 0897910745. URL http://portal.acm.org/citation.cfm?id=800230.806981. [Accessed May 2011]. 16, 49 Michael G. Burke and Gerald a. Fisher. A practical method for LR and LL syntactic error diagnosis and recovery. ACM Transactions on Programming Languages and Systems, 9(2):164–197, March 1987. ISSN 01640925. 73

74

BIBLIOGRAPHY

doi: 10.1145/22719.22720. URL http://portal.acm.org/citation. cfm?doid=22719.22720. [Accessed May 2011]. 16 Rafael Corchuelo, Jos´e a. P´erez, Antonio Ruiz, and Miguel Toro. Repairing syntax errors in LR parsers. ACM Transactions on Programming Languages and Systems, 24(6):698–710, November 2002. ISSN 01640925. doi: 10.1145/586088.586092. URL http://portal.acm.org/citation.cfm? doid=586088.586092. [Accessed May 2011]. 16 M. de Jonge, E. Nilsson-Nyman, L. Kats, and Eelco Visser. Natural and flexible error recovery for generated parsers. Software Language Engineering, pages 204–223, 2010. URL http://www.springerlink.com/index/ b0p750768wum5157.pdf. [Accessed May 2011]. 16, 67 Pierpaolo Degano and Corrado Priami. LR techniques for handling syntax errors. Computer Languages, 24(2):73–98, 1998. ISSN 0096-0551. URL http://linkinghub.elsevier.com/retrieve/pii/ S0096055197000167. [Accessed May 2011]. 16 F.L DeRemer. Practical translators for LR (k) languages. Project Mac, Massachusetts Institute of Technology, 1969. URL http://publications. csail.mit.edu/lcs/pubs/ps/MIT-LCS-TR-65.ps. [Accessed May 2011]. 13 Charles Donnelly and Richard Stallman. GNU Bison Manual version 2.4.3, 2010. URL http://www.gnu.org/software/bison/manual/bison.pdf. [Accessed May 2011]. 23, 32, 210 Hilding Elmqvist. Modelica - A Unified Object-Oriented Language for Physical Systems Modeling. EUROSIM -Simulation News Europe, (20):p32, 1997. 18 Peter Fritzson. Principles of Object-oriented modeling and simulation with Modelica 2.1. IEEE Press, 2004. 1, 210 Peter Fritzson and Peter Bunus. Fritzon Modelica A general OO Language for continuous and discrete event system modeling and simulation. In Simulation Symposium, 2002. Proceedings. 35th Annual, pages 365 – 380, 2002. 18 Peter Fritzson and Adrian Pop. Meta-programming and language modeling with metamodelica 1.0. 
Technical Report 9, Link¨oping UniversityLink¨ oping University, PELAB - Programming Environment Laboratory, The Institute of Technology, 2011a. 4, 18, 152 Peter Fritzson and Adrian Pop. Towards Modelica 4 Meta-Programming and Language Modeling with MetaModelica 2.0. Number April. 2011b. unpublished work. 4, 18, 61

BIBLIOGRAPHY

75

Peter Fritzson, Adrian Pop, Peter Aronsson, David Akhvlediani, Bernhard Bachmann, Vasile Baluta, Simon Björklén, Mikael Blom, Willi Braun, David Broman, Stefan Brus, Francesco Casella, Filippo Donida, Henrik Eriksson, Anders Fernström, Pavel Grozman, Daniel Hedberg, Michael Hanke, Alf Isaksson, Daniel Kanth, Tommi Karhela, Joel Klinghed, Juha Kortelainen, Alexey Lebedev, Magnus Leksell, Oliver Lenord, Håkan Lundvall, Eric Meyers, Hannu Niemistö, Kristoffer Norling, Atanas Pavlov, Pavol Privitzer, Per Sahlin, Wladimir Schamai, Gerhard Schmitz, and Klas Sjöholm. OpenModelica System Documentation. Number November. Open Source Modelica Consortium, Linköping, 2009. URL http://www.ida.liu.se/labs/pelab/modelica/OpenModelica/releases/1.6.0/doc/OpenModelicaSystem.pdf. [Accessed May 2011]. xiii, 1, 18, 23, 24

Denis Howe. The Free On-line Dictionary of Computing, 2010. URL http://foldoc.org/. [Accessed May 2011]. 209, 210

OG Kakde. Algorithms for compiler design. CHARLES RIVER MEDIA, INC., 2002. ISBN 81-7008-100-6. 5, 13

L.C.L. Kats, M. de Jonge, E. Nilsson-Nyman, and E. Visser. Providing rapid feedback in generated modular language environments: adding error recovery to scannerless generalized-LR parsing. ACM SIGPLAN Notices, 44(10):445–464, 2009. ISSN 0362-1340. URL http://portal.acm.org/citation.cfm?id=1640089.1640122. [Accessed May 2011]. 16

Donald E Knuth. On the Translation of Languages from Left to Right. Information and Control, 8(6):607–639, 1965. ISSN 00199958. doi: 10.1016/S0019-9958(65)90426-2. URL http://linkinghub.elsevier.com/retrieve/pii/S0019995865904262. 12

Håkan Lundvall, Kristian Stavåker, Peter Fritzson, and Christoph Kessler. Automatic parallelization of simulation code for equation-based models with software pipelining and measurements on three platforms. ACM SIGARCH Computer Architecture News, 36(5):46–55, June 2009. ISSN 0163-5964. doi: 10.1145/1556444.1556451. URL http://portal.acm.org/citation.cfm?id=1556444.1556451. [Accessed May 2011]. 18

Bruce J. McKenzie, Corey Yeatman, and Lorraine de Vere. Error repair in shift-reduce parsers. ACM Transactions on Programming Languages and Systems, 17(4):672–689, July 1995. ISSN 01640925. doi: 10.1145/210184.210193. URL http://portal.acm.org/citation.cfm?doid=210184.210193. [Accessed May 2011]. 16

Ashley J.S Mills. ANTLR Tutorial, 2005. URL http://supportweb.cs.bham.ac.uk/docs/tutorials/docsystem/build/tutorials/antlr/antlr.html. [Accessed May 2011]. 25


Modelica-Association. Modelica® - A Unified Object-Oriented Language for Physical Systems Modeling Language Specification. Interface, 2010. 4, 61

Terence Parr. The Definitive ANTLR Reference: Building Domain-Specific Languages (Pragmatic Programmers). Pragmatic Bookshelf, 2007. ISBN 0978739256. 25

Terence Parr and R W Quong. ANTLR: a predicated-LL(k) parser generator. Software Practice Experience, 25(7):789, 1995. ISSN 00380644. URL http://portal.acm.org/citation.cfm?id=213593.213603. [Accessed May 2011]. 1, 25

Vern Paxson. Flex Manual, 2002. URL http://flex.sourceforge.net/manual/. [Accessed May 2011]. 28

Michal Pise. The Fika Parser Generator. 2010 10th IEEE Working Conference on Source Code Analysis and Manipulation, pages 99–100, September 2010. doi: 10.1109/SCAM.2010.27. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5601827. [Accessed May 2011]. 67

Adrian Pop and Peter Fritzson. Debugging natural semantics specifications. Proceedings of the sixth international symposium on Automated analysis-driven debugging - AADEBUG'05, pages 77–82, 2005. doi: 10.1145/1085130.1085140. URL http://portal.acm.org/citation.cfm?doid=1085130.1085140. [Accessed May 2011]. 18

Adrian Pop and Peter Fritzson. MetaModelica: A unified equation-based semantical and mathematical modeling language. Modular Programming Languages, pages 211–229, 2006. URL http://www.springerlink.com/index/a5112k4m34067180.pdf. [Accessed May 2011]. 18, 66, 210

Satya Kiran Popuri. Understanding C parsers generated by GNU Bison. Free Software Foundation, 2006. URL http://www.cs.uic.edu/~spopuri/cparser.html. [Accessed May 2011]. 30, 32, 41, 65

P. Sampath, AC Rajeev, KC Shashidhar, and S Ramesh. How to test program generators? A case study using flex. In Software Engineering and Formal Methods, 2007. SEFM 2007. Fifth IEEE International Conference on, pages 80–92. IEEE, 2007. doi: 10.1109/SEFM.2007.11. URL http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4343926. [Accessed May 2011]. 67

M Sipser. Introduction to the Theory of Computation. Thomson Course Technology, second edition, 2005. 8, 10

Martin Sjölund. Bidirectional External Function Interface Between Modelica/MetaModelica and Java. Master's thesis, Linköping University, 2009. 18


Martin Sjölund, Peter Fritzson, and Adrian Pop. Bootstrapping a Modelica Compiler aiming at Modelica 4. In 8th International Modelica Conference (Modelica'2011), Dresden, Germany, 2011. 1, 2, 18, 66, 71

Robert Endre Tarjan and Andrew Chi-Chih Yao. Storing a sparse table. Communications of the ACM, 22(11):606–611, November 1979. ISSN 00010782. doi: 10.1145/359168.359175. URL http://portal.acm.org/citation.cfm?doid=359168.359175. [Accessed May 2011]. 30

P.D. Terry. Compilers and Compiler Generators. Africa, 2000. URL http://scifac.ru.ac.za/compilers/. [Accessed May 2011]. 5, 13


Appendices

Appendix A

OMC Compiler Commands

This appendix contains the source code developed during this project. It includes instructions on how to run the OpenModelica Compiler, together with samples of both the input files for the parser generator and the output files generated for exercise 10 of the MetaModelica guide.

A.1 Parameters - MetaModelica Parser Generator

A.1.1 Generate compilerName

Used as a suffix for the names of the input lexer and parser files.

A.1.2 Run compilerName, fileName

Runs the generated compiler with the given fileName as input.

A.2 OMC Commands

Running the lexer and parser generator for the compiler [compilerName] requires the following files:
1. lexer[compilerName].c, generated with Flex from the grammar lexer[compilerName].l.
2. parser[compilerName].c, generated with Bison from the grammar parser[compilerName].y.


Optionally, you can test your grammar by generating the files with Flex and GNU Bison as described in listing A.1.

Listing A.1: Compile Flex and Bison

$ flex -t -l lexer[compilerName].l > lexer[compilerName].c
$ bison parser[compilerName].y --output=parser[compilerName].c
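The two commands of listing A.1 can also be scripted. The following is a minimal Python sketch, not part of OMCCp: the function names `grammar_commands` and `run_grammar_commands` are hypothetical, and it assumes that the Flex input is the grammar file lexer[compilerName].l named in section A.2 and that `flex` and `bison` are available on the PATH.

```python
import subprocess
from pathlib import Path


def grammar_commands(compiler_name):
    """Return (argv, stdout_file) pairs for the Flex and Bison runs.

    Hypothetical helper: the file names follow the lexer[compilerName] /
    parser[compilerName] convention described in section A.2.
    """
    lexer_src = f"lexer{compiler_name}.l"
    lexer_out = f"lexer{compiler_name}.c"
    parser_src = f"parser{compiler_name}.y"
    parser_out = f"parser{compiler_name}.c"
    return [
        # flex -t writes the scanner to stdout, so stdout goes to a file
        (["flex", "-t", "-l", lexer_src], lexer_out),
        # bison writes the parser file itself through --output
        (["bison", parser_src, f"--output={parser_out}"], None),
    ]


def run_grammar_commands(compiler_name, workdir="."):
    """Run both tools in workdir; raises CalledProcessError on failure."""
    for argv, stdout_file in grammar_commands(compiler_name):
        if stdout_file is None:
            subprocess.run(argv, cwd=workdir, check=True)
        else:
            with open(Path(workdir) / stdout_file, "w") as out:
                subprocess.run(argv, cwd=workdir, check=True, stdout=out)
```

Calling `run_grammar_commands("10")` in the directory that holds lexer10.l and parser10.y would then produce lexer10.c and parser10.c, provided both grammars are valid.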

Modify the file OMCC.mos, presented in listing A.2, to include the languages whose grammars are to be generated.

Listing A.2: OMCC.mos

getInstallationDirectoryPath();
loadFile("OMCC.mo");
loadFile("../../Compiler/Util/RTOpts.mo");
loadFile("../../Compiler/Util/Util.mo");
loadFile("../../Compiler/Util/System.mo");
loadFile("LexerGenerator.mo");
loadFile("ParserGenerator.mo");
loadFile("../../Compiler/FrontEnd/Absyn.mo");
OMCC.main({"Modelica"});
getErrorString();

To run OMCCp, use the command presented in listing A.3.

Listing A.3: OMCCP Command

$ omc +g=MetaModelica +d=rml OMCC.mos
true
true
true
true
true
true
Generating FLEX grammar file lexer10.c ...
Generating BISON grammar file parser10.c ...
Reading FLEX grammar file lexer10.c ...
Result: Lexer Built
Reading BISON grammar file parser10.c
Result: Parser Built
9 Files Generated for the language grammar: 10
OMCC v0.7 (OpenModelica Compiler-Compiler)
Copyright 2011 Open Souce Modelica Consorsium (OSMC)

This command generates the following 9 files:
• lexer[compilerName].c
• parser[compilerName].c
• Lexer[compilerName].mo
• LexTable[compilerName].mo
• LexCode[compilerName].mo
• Parser[compilerName].mo
• ParseTable[compilerName].mo
• ParseCode[compilerName].mo
• Token[compilerName].mo

Additionally, for debugging purposes, the debug flag can be activated in the script SCRIPT.mos presented in appendix G.1, in the line shown in listing A.4.

Listing A.4: SCRIPT.mos debug mode

Main.main({"Modelica", true}); // run grammar 10 with debug on

The output can be sent to another file, as presented in listing A.5. The file result.txt will contain the state stacks used by the lexer and the parser, along with some transformations and variables that can be traced back to the original source code.

Listing A.5: OMCCP debug mode

omc +g=MetaModelica +d=rml SCRIPT.mos > result.txt
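The names of the nine generated files in section A.2 follow a fixed pattern built from compilerName. As a hedged illustration, the Python sketch below derives those names and reports which of them are missing after a run; `generated_files` and `missing_files` are hypothetical helpers, not part of OMCCp, and the name stems are taken from the list in this appendix.

```python
import os


def generated_files(compiler_name):
    """List the nine file names expected for a given compilerName."""
    c_files = [f"{stem}{compiler_name}.c" for stem in ("lexer", "parser")]
    mo_stems = ("Lexer", "LexTable", "LexCode",
                "Parser", "ParseTable", "ParseCode", "Token")
    mo_files = [f"{stem}{compiler_name}.mo" for stem in mo_stems]
    return c_files + mo_files


def missing_files(compiler_name, directory="."):
    """Return the expected files that do not exist in directory."""
    return [name for name in generated_files(compiler_name)
            if not os.path.isfile(os.path.join(directory, name))]
```

An empty result from `missing_files("10")` after running the command of listing A.3 would indicate that all nine files were written.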

Appendix B

Lexer Generator

B.1 Lexer.mo

Listing B.1: Lexer.mo

package Lexer "Implements the DFA of OMCC"
import Types;
import LexTable;
import LexerCode;

uniontype LexerTable
  record LEXER_TABLE
    array<Integer> accept;
    array<Integer> ec;
    array<Integer> meta;
    array<Integer> base;
    array<Integer> def;
    array<Integer> nxt;
    array<Integer> chk;
    array<Integer> acclist;
  end LEXER_TABLE;
end LexerTable;

uniontype Env
  record ENV
    Integer startSt, currSt;
    Integer pos, sPos, ePos, linenr;
    list<Integer> buff;
    list<Integer> bkBuf;
    list<Integer> stateSk;
    Boolean isDebugging;
    String fileName;
  end ENV;
end Env;

function scan "Scan starts the lexical analysis, load the tables and consume the program to output the tokens"
  input String fileName "input source code file";
  input Boolean debug "flag to activate the debug mode";
  output list<OMCCTypes.Token> tokens "return list of tokens";
algorithm
  // load program
  (tokens) := match (fileName, debug)
    local
      list<OMCCTypes.Token> resTokens;
      list<Integer> streamInteger;
    case (_, _)
      equation
        streamInteger = loadSourceCode(fileName);
        resTokens = lex(fileName, streamInteger, debug);
      then (resTokens);
  end match;
end scan;

function scanString "Scan starts the lexical analysis, load the tables and consume the program to output the tokens"
  input String fileSource "input source code file";
  input Boolean debug "flag to activate the debug mode";
  output list<OMCCTypes.Token> tokens "return list of tokens";
algorithm
  // load program
  (tokens) := match (fileSource, debug)
    local
      list<OMCCTypes.Token> resTokens;
      list<Integer> streamInteger;
      list<String> chars;
    case (_, _)
      equation
        chars = stringListStringChar(fileSource);
        // streamInteger = Util.listMap(chars, stringCharInt);
        streamInteger = list(stringCharInt(c) for c in chars);
        resTokens = lex("", streamInteger, debug);
      then (resTokens);
  end match;
end scanString;

function loadSourceCode
  input String fileName "input source code file";
  output list<Integer> program;
algorithm
  (program) := match (fileName)
    local
      list<Integer> streamInteger;
      list<String> chars;
    case ("")
      equation
        print("Empty FileName");
      then ({});
    case (_)
      equation
        chars = stringListStringChar(System.readFile(fileName));
        // streamInteger = Util.listMap(chars, stringCharInt);
        streamInteger = list(stringCharInt(c) for c in chars);
      then (streamInteger);
  end match;
end loadSourceCode;

function lex "Scan starts the lexical analysis, load the tables and consume the program to output the tokens"
  input String fileName "input source code file";
  input list<Integer> program "source code as a stream of Integers";
  input Boolean debug "flag to activate the debug mode";
  output list<OMCCTypes.Token> tokens "return list of tokens";
  Integer r, cTok;
  list<Integer> cProg;
  list<String> chars;
  array<Integer> mm_accept, mm_ec, mm_meta, mm_base, mm_def, mm_nxt, mm_chk, mm_acclist;
  Env env;
  LexerTable lexTables;
algorithm
  // load arrays
  mm_accept := listArray(LexTable.yy_accept);
  mm_ec := listArray(LexTable.yy_ec);
  mm_meta := listArray(LexTable.yy_meta);
  mm_base := listArray(LexTable.yy_base);
  mm_def := listArray(LexTable.yy_def);
  mm_nxt := listArray(LexTable.yy_nxt);
  mm_chk := listArray(LexTable.yy_chk);
  mm_acclist := listArray(LexTable.yy_acclist);
  lexTables := LEXER_TABLE(mm_accept, mm_ec, mm_meta, mm_base, mm_def, mm_nxt, mm_chk, mm_acclist);

  // Initialize the Env Variables
  env := ENV(1, 1, 1, 0, 1, 1, {}, {}, {1}, debug, fileName);
  if (debug == true) then
    print("\nLexer analyzer LexerCode..." + fileName + "\n");
    // printAny("\nLexer analyzer LexerCode..." + fileName + "\n");
  end if;

  tokens := {};
  if (debug) then
    print("\n TOTAL Chars:");
    print(intString(listLength(program)));
  end if;
  while (Util.isListEmpty(program) == false) loop
    if (debug) then
      print("\nChars remaining:");
      print(intString(listLength(program)));
    end if;
    cTok::program := program;
    cProg := {cTok};
    (tokens, env, cProg) := consume(env, cProg, lexTables, tokens);
    if (Util.isListEmpty(cProg) == false) then
      cTok::cProg := cProg;
      program := cTok::program;
    end if;
  end while;
  tokens := listReverse(tokens);
end lex;

function consume
  input Env env;
  input list<Integer> program;
  input LexerTable lexTables;
  input list<OMCCTypes.Token> tokens;
  output list<OMCCTypes.Token> resToken;
  output Env env2;
  output list<Integer> program2;
  array<Integer> mm_accept, mm_ec, mm_meta, mm_base, mm_def, mm_nxt, mm_chk, mm_acclist;
  Integer mm_startSt, mm_currSt, mm_pos, mm_sPos, mm_ePos, mm_linenr;
  list<Integer> buffer, bkBuffer, states;
  String fileNm;
  Integer c, cp, mm_finish, baseCond;
  Boolean debug;
algorithm
  LEXER_TABLE(accept=mm_accept, ec=mm_ec, meta=mm_meta, base=mm_base, def=mm_def, nxt=mm_nxt, chk=mm_chk, acclist=mm_acclist) := lexTables;
  ENV(startSt=mm_startSt, currSt=mm_currSt, pos=mm_pos, sPos=mm_sPos, ePos=mm_ePos, linenr=mm_linenr, buff=buffer, bkBuf=bkBuffer, stateSk=states, isDebugging=debug, fileName=fileNm) := env;

  mm_finish := LexTable.yy_finish;
  baseCond := mm_base[mm_currSt];
  if (debug == true) then
    print("\nPROGRAM:{" + printBuffer(program, "") + "} ");
    print("\nBUFFER:{" + printBuffer(buffer, "") + "} ");
    print("\nBKBUFFER:{" + printBuffer(bkBuffer, "") + "} ");
    print("\nSTATE STACK:{" + printStack(states, "") + "} ");
    print("base:" + intString(baseCond) + " st:" + intString(mm_currSt) + " ");
  end if;
  (resToken, program2) := match (program, tokens)
    local
      Integer c, d, act, val, c2, curr2, fchar;
      list<Integer> rest;
      list<OMCCTypes.Token> lToken;
      String sToken;
      Boolean emptyToken;
      Option<OMCCTypes.Token> otok;
    case (_, _) // loop tokens
      equation
        cp::rest = program;
        buffer = cp::buffer;
        mm_pos = mm_pos + 1;
        if (cp == 10) then
          mm_linenr = mm_linenr + 1;
          mm_ePos = mm_sPos;
          mm_sPos = 0;
        else
          mm_sPos = mm_sPos + 1;
        end if;
        if (debug == true) then
          print("\n[Reading:'" + intStringChar(cp) + "' at p:" + intString(mm_pos - 1) + " line:" + intString(mm_linenr) + " rPos:" + intString(mm_sPos) + "]");
        end if;
        c = mm_ec[cp];
        c2 = c;
        curr2 = mm_currSt;
        if (debug == true) then
          print(" evalState Before[c" + intString(c2) + ",s" + intString(curr2) + "]");
        end if;
        (mm_currSt, c) = evalState(lexTables, curr2, c2);
        if (debug == true) then
          print(" After[c" + intString(c) + ",s" + intString(mm_currSt) + "]");
        end if;
        if (mm_currSt > 0) then
          curr2 = mm_base[mm_currSt];
          // print("BASE:" + intString(curr2) + "]");
          mm_currSt = mm_nxt[curr2 + c];
          // print("NEXT:" + intString(mm_currSt) + "]");
        else
          mm_currSt = mm_nxt[c];
        end if;
        states = mm_currSt::states;
        // printAny(states);
        // print("[c" + intString(c) + ",s" + intString(mm_currSt) + "]");
        // print("[B:" + intString(mm_base[mm_currSt]) + "]");
        env2 = ENV(mm_startSt, mm_currSt, mm_pos, mm_sPos, mm_ePos, mm_linenr, buffer, rest, states, debug, fileNm);
        lToken = tokens;
        baseCond = mm_base[mm_currSt];
        if (baseCond == mm_finish) then
          if (debug == true) then
            print("\n[RESTORE=" + intString(mm_accept[mm_currSt]) + "]");
          end if;
          (env2, act) = findRule(lexTables, env2);
          (otok, env2) = LexerCode.action(act, env2);
          // read the env
          ENV(startSt=mm_startSt, currSt=mm_currSt, pos=mm_pos, sPos=mm_sPos, ePos=mm_ePos, linenr=mm_linenr, buff=buffer, bkBuf=bkBuffer, stateSk=states, isDebugging=debug, fileName=fileNm) = env2;
          // restore the program
          program2 = bkBuffer;
          // restart current state
          env2 = ENV(mm_startSt, mm_startSt, mm_pos, mm_sPos, mm_pos, mm_linenr, buffer, {}, {mm_startSt}, debug, fileNm);
          lToken = Util.listConsOption(otok, tokens);
          if (debug) then
            print("\n CountTokens:" + intString(listLength(lToken)));
          end if;
        else
          program2 = rest; // consume the character
        end if;
      then (lToken, program2);
  end match;
end consume;

function findRule
  input LexerTable lexTables;
  input Env env;
  output Env env2;
  output Integer action;
  array<Integer> mm_accept, mm_ec, mm_meta, mm_base, mm_def, mm_nxt, mm_chk, mm_acclist;
  Integer mm_startSt, mm_currSt, mm_pos, mm_sPos, mm_ePos, mm_linenr;
  list<Integer> buffer, bkBuffer, states;
  String fileNm;
  Integer lp, lp1, stCmp;
  Boolean st, debug;
algorithm
  LEXER_TABLE(accept=mm_accept, ec=mm_ec, meta=mm_meta, base=mm_base, def=mm_def, nxt=mm_nxt, chk=mm_chk, acclist=mm_acclist) := lexTables;
  ENV(startSt=mm_startSt, currSt=mm_currSt, pos=mm_pos, sPos=mm_sPos, ePos=mm_ePos, linenr=mm_linenr, buff=buffer, bkBuf=bkBuffer, stateSk=states, isDebugging=debug, fileName=fileNm) := env;
  stCmp::_ := states;
  lp := mm_accept[stCmp];
  // stCmp::_ := states;
  lp1 := mm_accept[stCmp + 1];
  st := intGt(lp, 0) and intLt(lp, lp1);
  // print("STATE:[" + intString(mm_currSt) + " pos:" + intString(mm_pos) + "]");
  // printAny(st);
  (env2, action) := match (states, st)
    local
      Integer act, cp;
      list<Integer> restBuff, restStates;
    case ({}, _)
      equation
        act = mm_acclist[lp];
        print("\nERROR:EMPTY STATE STACK");
      then (env, act);
    case (_, true)
      equation
        stCmp::_ = states;
        lp = mm_accept[stCmp];
        act = mm_acclist[lp];
      then (env, act);
    case (_, false)
      equation
        cp::restBuff = buffer;
        bkBuffer = cp::bkBuffer;
        mm_pos = mm_pos - 1;
        mm_sPos = mm_sPos - 1;
        if (cp == 10) then
          mm_sPos = mm_ePos;
          mm_linenr = mm_linenr - 1;
        end if;
        // bkBuffer = cp::bkBuffer;
        mm_currSt::restStates = states;
        // printAny(restStates);
        // print("Restore STATE:[" + intString(mm_currSt) + " pos:" + intString(mm_pos) + "]");
        env2 = ENV(mm_startSt, mm_currSt, mm_pos, mm_sPos, mm_ePos, mm_linenr, restBuff, bkBuffer, restStates, debug, fileNm);
        (env2, act) = findRule(lexTables, env2);
      then (env2, act);
  end match;
end findRule;

function evalState
  input LexerTable lexTables;
  input Integer cState;
  input Integer c;
  output Integer new_state;
  output Integer new_c;
  array<Integer> mm_accept, mm_ec, mm_meta, mm_base, mm_def, mm_nxt, mm_chk, mm_acclist;
  Integer val, val2, chk;
algorithm
  LEXER_TABLE(accept=mm_accept, ec=mm_ec, meta=mm_meta, base=mm_base, def=mm_def, nxt=mm_nxt, chk=mm_chk, acclist=mm_acclist) := lexTables;
  chk := mm_base[cState];
  chk := chk + c;
  val := mm_chk[chk];
  val2 := mm_base[cState] + c;
  // print("{val2=" + intString(val2) + "}\n");
  (new_state, new_c) := match (cState == val)
    local
      Integer s, c2;
    case (true)
      then (cState, c);
    case (false)
      equation
        cState = mm_def[cState];
        // print("[newS:" + intString(cState) + "]");
        // c2 = c;
        if (cState >= LexTable.yy_limit) then
          c = mm_meta[c];
          // print("META[c:" + intString(c) + "]");
        end if;
        if (cState > 0) then
          (cState, c) = evalState(lexTables, cState, c);
        end if;
      then (cState, c);
  end match;
end evalState;

function getInfo
  input list<Integer> buff;
  input Integer frPos;
  input Integer flineNr;
  input String programName;
  output OMCCTypes.Info info;
  Integer mm_linenr, mm_sPos;
  Integer c;
algorithm
  mm_sPos := frPos;
  mm_linenr := flineNr;
  while (Util.isListEmpty(buff) == false) loop
    c::buff := buff;
    if (c == 10) then
      mm_linenr := mm_linenr - 1;
      mm_sPos := 0;
    else
      mm_sPos := mm_sPos - 1;
    end if;
  end while;
  info := OMCCTypes.INFO(programName, false, mm_linenr, mm_sPos + 1, flineNr, frPos + 1, OMCCTypes.getTimeStamp());
  /* if (true) then
    print("\nTOKEN file:" + programName + " p(" + intString(mm_sPos) + ":" + intString(mm_linenr) + ")-(" + intString(frPos) + ":" + intString(flineNr) + ")");
  end if; */
end getInfo;

function printBuffer
  input list<Integer> inList;
  input String cBuff;
  output String outList;
  list<Integer> inList2;
algorithm
  (outList) := match (inList, cBuff)
    local
      Integer c;
      String new, tout;
      list<Integer> rest;
    case ({}, _)
      then (cBuff);
    else
      equation
        c::rest = inList;
        new = cBuff + intStringChar(c);
        (tout) = printBuffer(rest, new);
      then (tout);
  end match;
end printBuffer;

function printStack
  input list<Integer> inList;
  input String cBuff;
  output String outList;
  list<Integer> inList2;
algorithm
  (outList) := match (inList, cBuff)
    local
      Integer c;
      String new, tout;
      list<Integer> rest;
    case ({}, _)
      then (cBuff);
    else
      equation
        c::rest = inList;
        new = cBuff + "|" + intString(c);
        (tout) = printStack(rest, new);
      then (tout);
  end match;
end printStack;

end Lexer;
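The core of Lexer.mo is the lookup in the compressed Flex tables performed by evalState and the subsequent indexing of the nxt table. The Python sketch below illustrates that lookup on a tiny hand-built automaton. It is only an illustration under stated assumptions: the tables `BASE`, `DEFAULT`, `NXT`, `CHK` and the token names are invented for this example (they are not Flex output), the meta table is left out, and the longest match is found by remembering the last accepting position rather than by unwinding a state stack as findRule does.

```python
# Equivalence classes (the role of yy_ec): 0 = other, 1 = digit, 2 = letter.
def char_class(ch):
    if ch.isdigit():
        return 1
    if ch.isalpha():
        return 2
    return 0


# Hypothetical states: 0 = dead, 1 = start, 2 = number, 3 = identifier.
# Rows of the transition table are overlapped in NXT; CHK records which
# state owns each slot, and DEFAULT gives the fallback state (the role
# of yy_def) when a slot belongs to another state.
BASE = [0, 2, 4, 5]
DEFAULT = [0, 0, 0, 0]
NXT = [0, 0, 0, 2, 3, 2, 3, 3]
CHK = [0, 0, 0, 1, 1, 2, 3, 3]
ACCEPT = {2: "NUMBER", 3: "IDENT"}
DEAD, START = 0, 1


def next_state(state, cls):
    """Compressed-table transition: follow defaults until the slot matches."""
    while CHK[BASE[state] + cls] != state:
        state = DEFAULT[state]
    return NXT[BASE[state] + cls]


def tokenize(text):
    """Longest-match scan; characters that start no token are skipped."""
    tokens = []
    pos = 0
    while pos < len(text):
        state, last = START, None
        i = pos
        while i < len(text):
            state = next_state(state, char_class(text[i]))
            if state == DEAD:
                break
            i += 1
            if state in ACCEPT:
                last = (i, ACCEPT[state])  # remember the longest match so far
        if last is None:
            pos += 1
        else:
            end, kind = last
            tokens.append((kind, text[pos:end]))
            pos = end
    return tokens
```

In this sketch the dead state owns a slot for every class, so the default chain in `next_state` always terminates. In Lexer.mo the corresponding recursion in evalState stops when the default state is no longer positive.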


B.2 LexerGenerator.mo

Listing B.2: LexerGenerator.mo

package LexerGenerator
import System;
/*

*/

constant Boolean debug = false;

constant String leyend = ParserGenerator.leyend;

function genLexer
  input String flexFile;
  input String grammarFile;
  input String outFileName;
  output String result;
  String flexCode, re, ar1, rest;
  Boolean resBol;
  list<String> resultRegex, resTable, chars;
algorithm
  // open flex file and validate
  if (debug == true) then
    print("\nGenerating Lexer from " + flexFile);
  end if;
  if (outFileName"" and stringLength(outFileName) =0) then // starts BEGIN switch start state
        // find token
        pos := System.stringFind(rest, "(");
        pos2 := System.stringFind(rest, ")");
        cp := substring2(rest, pos + 2, pos2);
        valBegin := findValue(flexCode, cp);
        valBegin := 1 + 2*valBegin;
        if (debug == true) then
          print("\n BEGIN at " + intString(valBegin));
        end if;
        cp := "\n equation\n mm_startSt = " + intString(valBegin) + ";";
        resTable := cp::resTable;
      end if;

      if (posKeepBuffer < posBreak and posKeepBuffer >= 0) then // starts keepbuffer switch start state
        // print keep buffer
        if (debug == true) then
          print("\n keepbuffer ");
        end if;
        if (posBreak < posBegin) then
          cp := "\n equation";
          resTable := cp::resTable;
        end if;
        cp := "\n bufferRet = listReverse(buffer);";
        resTable := cp::resTable;
      end if;

      if (posBreak > posReturn) then
        if (posBreak < posBegin and posBreak < posKeepBuffer) then
          cp := "\n equation";
          resTable := cp::resTable;
        end if;
        cp := "\n act2 = Token";
        resTable := cp::resTable;
        resTable := outFileName::resTable;
        cp := ".";
        resTable := cp::resTable;
        // find token
        pos2 := System.stringFind(rest, ";");
        cp := substring2(rest, posReturn + 8, pos2);
        if (debug == true) then
          print("\nFound token: " + cp);
        end if;
        resTable := cp::resTable;
        cp := ";\n tok = OMCCTypes.TOKEN(tokName[act2-nameSpan], act2, buffer, info);\n then (SOME(tok));\n";
        resTable := cp::resTable;
      else
        // print("NONE");
        cp := "\n then (NONE());\n";
        resTable := cp::resTable;
      end if;
  end for;
  resTable := listReverse(resTable);
  caseAction := stringCharListString(resTable);
  result := System.stringReplace(result, "%caseAction%", caseAction);
  System.writeFile("LexerCode" + outFileName + ".mo", result);
  buildResult := true;
end buildLexerCode;

function findValue
  input String flexCode;
  input String variable;
  output Integer value;
  Integer pos;
  String rest, val, re;
algorithm
  re := "define " + variable;
  rest := System.stringFindString(flexCode, re);
  pos := System.stringFind(rest, "\n");
  val := substring2(rest, stringLength(re) + 1, pos);
  // print("\nfound value:" + val);
  value := stringInt(val);
end findValue;

function buildLexer
  input String outFileName;
  output Boolean buildResult;
  String lexCode, result, stTime, cp;
algorithm
  lexCode := System.readFile("Lexer.mo");
  stTime := leyend + getCurrentTimeStr();
  result := System.stringReplace(lexCode, "LexTable", "LexTable" + outFileName);
  result := System.stringReplace(result, "LexerCode", "LexerCode" + outFileName);
  cp := "package Lexer" + outFileName + " // " + stTime;
  result := System.stringReplace(result, "package Lexer", cp);
  result := System.stringReplace(result, "end Lexer;", "end Lexer" + outFileName + ";");
  System.writeFile("Lexer" + outFileName + ".mo", result);
  buildResult := true;
end buildLexer;

238

function b u i l d L e x T a b l e input String f l e x C o d e ; input String outFileName ; output Boolean b u i l d R e s u l t ; String cp , re , re1 , ar1 , r e s t , r e s u l t , stTime ; Integer numMatches , pos1 , pos2 , l e n ; l i s t r e s u l t R e g e x , r e s T a b l e , c h a r s ;

240

algorithm

234 236

242

stTime := l e y e n d + g e t C u r r e n t T i m e S t r ( ) ;

B.2. LEXERGENERATOR.MO

244

cp := ” package ” + outFileName +” // ” + stTime + ” \n\ n c o n s t a n t I n t e g e r y y l i m i t := ” ;

246

r e s T a b l e := cp : : { } ;

248

// I n s e r t y y l i m i t r e := ” i f ( y y c u r r e n t s t a t e >= ” ; r e 1 := ” i f ( y y c u r r e n t s t a t e >=[ˆ) ] ∗ ) ” ; ( numMatches , r e s u l t R e g e x ) := System . r e g e x ( f l e x C o d e , re1 , 1 , false , false ) ;

250

97

252 254 256 258

260 262 264

ar1 : : := r e s u l t R e g e x ; i f ( debug==true ) then print ( ” \nFound r e g e x : ” + a r 1 ) ; end i f ; numMatches : = 0 ; ( numMatches , r e s u l t R e g e x ) := System . r e g e x ( ” i f ( y y c u r r e n t s t a t e >= 65 ) ” , ” [ 0 − 9 ] ∗ ” , 2 , f a l s e , f a l s e ) ; i f ( debug==true ) then print ( ” \nNumMatches : ” + intString ( numMatches ) ) ; end i f ; cp : : := r e s u l t R e g e x ; i f ( debug==true ) then print ( ” \nFound r e g e x 2 : ” + cp ) ; end i f ;

266 r e s t := System . s t r i n g F i n d S t r i n g ( f l e x C o d e , r e ) ; 268 270

pos2 := System . s t r i n g F i n d ( r e s t , ” ) ” ) ; a r 1 := s u b s t r i n g 2 ( r e s t , s t r i n g L e n g t h ( r e ) +1 , pos2 −1) ; r e s T a b l e := a r 1 : : r e s T a b l e ;

272 274 276 278

cp := ” ; \ n\ n c o n s t a n t I n t e g e r y y f i n i s h := ” ; r e s T a b l e := cp : : r e s T a b l e ; r e := ” w h i l e ( y y b a s e [ y y c u r r e n t s t a t e ] != ” ; r e s t := System . s t r i n g F i n d S t r i n g ( f l e x C o d e , r e ) ; pos2 := System . s t r i n g F i n d ( r e s t , ” ) ” ) ; a r 1 := s u b s t r i n g 2 ( r e s t , s t r i n g L e n g t h ( r e ) +1 , pos2 −1) ; r e s T a b l e := a r 1 : : r e s T a b l e ;

280 282 284 286

288 290 292 294 296

cp := ” ; \ n\ n c o n s t a n t l i s t y y a c c l i s t := { ” ; r e s T a b l e := cp : : r e s T a b l e ; // match a c c l i s t r e := ” y y a c c l i s t \ \ [ [ 0 − 9 ] ∗ \ \ ] = [ ˆ } ] ∗ } ” ; ( numMatches , r e s u l t R e g e x ) := System . r e g e x ( f l e x C o d e , re , 1 , f a l s e , false ) ; ar1 : : := r e s u l t R e g e x ; i f ( numMatches > 0 ) then pos1 := System . s t r i n g F i n d ( ar1 , ” , ” ) ; pos2 := System . s t r i n g F i n d ( ar1 , ” } ” ) ; a r 1 := s u b s t r i n g 2 ( ar1 , pos1 +2 , pos2 −1) ; else a r 1 := ” ” ; end i f ; r e s T a b l e := a r 1 : : r e s T a b l e ;

98

298 300

302 304 306 308 310 312 314

  cp := "};\n\nconstant list<Integer> yy_accept := {";
  resTable := cp::resTable;
  re := "yy_accept\\[[0-9]*\\] = [^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list<Integer> yy_ec := {";
  resTable := cp::resTable;
  // re := "static yyconst int yy_ec";
  re := "yy_ec\\[[0-9]*\\] = [^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list<Integer> yy_meta := {";
  resTable := cp::resTable;
  // re := "static yyconst int yy_meta";
  re := "yy_meta\\[[0-9]*\\] = [^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list<Integer> yy_base := {";
  resTable := cp::resTable;
  // re := "static yyconst short int yy_base";
  re := "yy_base\\[[0-9]*\\] = [^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list<Integer> yy_def := {";
  resTable := cp::resTable;
  // re := "static yyconst short int yy_def";
  re := "yy_def\\[[0-9]*\\] = [^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list<Integer> yy_nxt := {";
  resTable := cp::resTable;
  // re := "static yyconst short int yy_nxt";
  re := "yy_nxt\\[[0-9]*\\] = [^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (debug==true) then
    print("\nREST next" + rest);
  end if;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list<Integer> yy_chk := {";
  resTable := cp::resTable;
  re := "static yyconst short int yy_chk";
  re := "yy_chk\\[[0-9]*\\] = [^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nend " + outFileName + ";";
  resTable := cp::resTable;

  resTable := listReverse(resTable);
  result := stringCharListString(resTable);
  System.writeFile(outFileName + ".mo", result);

  buildResult := true;
end buildLexTable;

public function getCurrentTimeStr
  "returns current time in format Www Mmm dd hh:mm:ss yyyy
   using the asctime() function in time.h (libc)"
  output String timeStr;
  Integer sec, min, hour, mday, mon, year;
algorithm
  timeStr := System.getCurrentTimeStr();
  /* (sec, min, hour, mday, mon, year) := System.getCurrentDateTime();
  timeStr := intString(year) + "/" + intString(mon) + "/" + intString(mday) + " "
    + intString(hour) + ":" + intString(min) + ":" + intString(sec); */
end getCurrentTimeStr;

public function substring2
  input String inString;
  input Integer start;
  input Integer stop;
  output String outString;
  list<String> chars, result;
  String c;
  Integer i;
algorithm
  outString := System.substring(inString, start, stop);
  /* result := {};
  chars := stringListStringChar(inString);
  for i in 1:stop loop
    c::chars := chars;
    if (i>=start) then
      result := c::result;
    end if;
  end for;
  result := listReverse(result);
  outString := stringCharListString(result); */
end substring2;

end LexerGenerator;
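The table extraction performed by buildLexTable can be illustrated outside MetaModelica. The following Python sketch is not part of OMCCp: the function name extract_table and the sample flex fragment are hypothetical, and the whitespace around "=" in the pattern is an assumption. Like the listing above, it locates the array initializer with a regular expression and keeps only the entries after the first comma, so the leading entry of the flex array is dropped.

```python
import re


def extract_table(flex_code, name):
    """Return the entries of the C array `name`, skipping its first entry.

    Mirrors the pos1/pos2 logic of buildLexTable: the text between the
    first comma and the closing brace of the initializer is kept.
    """
    pattern = re.escape(name) + r"\[[0-9]*\]\s*=[^}]*\}"
    match = re.search(pattern, flex_code)
    if match is None:
        return []  # table not present in the generated C code
    text = match.group(0)
    start = text.index(",") + 1   # position after the first entry
    stop = text.rindex("}")       # closing brace of the initializer
    return [int(item) for item in text[start:stop].split(",") if item.strip()]


# Hypothetical fragment in the style of flex-generated C code.
sample = """
static yyconst short int yy_accept[6] =
    {   0,
        0,    0,    3,    1,    2
    } ;
"""

print(extract_table(sample, "yy_accept"))
```

A table with a single entry has no comma, so text.index would raise ValueError; the sketch leaves that case unhandled.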


B.3 LexerCode.tmo

Listing B.3: LexerCode.tmo

package %LexerCode% // Generated %time%
/* Template for Lexer Code
replace keywords:
%LexerCode
%time
%Token
*/
%Lexer
%ParseTable
%constant
%nameSpan
%functions
%caseAction
*/
import Types;
import %Token%;
import %Lexer%;
import %ParseTable%;

%prologue%

function action
  input Integer act;
  input %Lexer%.Env env;
  output Option<OMCCTypes.Token> token;
  output %Lexer%.Env env2;
  Integer mm_startSt, mm_currSt, mm_pos, mm_sPos, mm_ePos, mm_linenr, mm_flinenr;
  list<Integer> buffer, bkBuffer, tb, bufferRet;
  OMCCTypes.Info info;
  array<String> tokName;
  String sToken, fileNm;
  Integer nameSpan, act2;
  Boolean debug;
algorithm
  %Lexer%.ENV(startSt=mm_startSt, currSt=mm_currSt, pos=mm_pos, sPos=mm_sPos,
    ePos=mm_ePos, linenr=mm_linenr, buff=buffer, bkBuf=bkBuffer,
    isDebugging=debug, fileName=fileNm) := env;
  buffer := listReverse(buffer);
  tb := buffer;
  sToken := %Lexer%.printBuffer(tb, "");
  tokName := listArray(%ParseTable%.yytname);
  nameSpan := %nameSpan%;
  tb := buffer;
  // (tb, mm_flinenr) := %Lexer%.lineUpd(tb, mm_flinenr);
  info := %Lexer%.getInfo(tb, mm_sPos, mm_linenr, fileNm);
  // info := OMCCTypes.INFO(fileNm, false, mm_linenr, mm_sPos, mm_flinenr, mm_pos);
  // print("\n" + intString(act) + ":");
  act2 := act;
  bufferRet := {};
  (token) := matchcontinue (act)
    local
      OMCCTypes.Token tok;
    %caseAction%
    case (_)
      equation
        // print("[enter else]");
        print("ERROR TOKEN NOT FOUND: ['" + sToken + "' TK:" + intString(act) + "," + tokName[act2] + "]");
        tok = OMCCTypes.TOKEN(tokName[act2], act, buffer, info);
      then (NONE());
  end matchcontinue;
  env2 := %Lexer%.ENV(mm_startSt, mm_startSt, mm_pos, mm_sPos, mm_sPos, mm_linenr,
    bufferRet, bkBuffer, {mm_startSt}, debug, fileNm);
  if (debug==true) then
    print("\n[TOKEN:'" + sToken + "' (" + intString(mm_linenr) + ":" + intString(mm_sPos) + ") id:" + intString(act2) + "]");
  end if;
end action;

%epilogue%

end %LexerCode%;
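The template above is turned into a MetaModelica package by substituting its %keyword% placeholders. As a minimal sketch of that idea, assuming a plain textual replacement (the actual substitution code of OMCCp is not shown in this listing), the Python function below fills placeholders from a dictionary. The name fill_template, the package name and the timestamp value are hypothetical.

```python
def fill_template(template, values):
    """Replace every %key% placeholder in `template` with values[key]."""
    result = template
    for key, value in values.items():
        result = result.replace("%" + key + "%", value)
    return result


# Two lines in the style of LexerCode.tmo; the replacement values are
# illustrative only.
template = "package %LexerCode% // Generated %time%\nend %LexerCode%;"
filled = fill_template(
    template,
    {"LexerCode": "LexerCodeExample", "time": "EXAMPLE-TIME"},
)
print(filled)
```

Placeholders without a matching key are left untouched, which makes a missing substitution easy to spot in the generated file.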

B.4 Types.mo

Listing B.4: Types.mo

package OMCCTypes
import Absyn;
import INFO = Absyn.INFO;

uniontype Token
  record TOKEN
    String name;
    Integer id;
    list<Integer> value;
    Info loc;
  end TOKEN;
end Token;

type Info = Absyn.Info;

/*
uniontype Info
"@author adrpo added 2005-10-29, changed 2006-02-05
 The Info attribute provides location information for elements and classes."
  record INFO
    String fileName "fileName where the class is defined in";
    Boolean isReadOnly "isReadOnly: (true|false). Should be true for libraries";
    Integer lineNumberStart "lineNumberStart";
    Integer columnNumberStart "columnNumberStart";
    Integer lineNumberEnd "lineNumberEnd";
    Integer columnNumberEnd "columnNumberEnd";
    Absyn.TimeStamp buildTimes "Build and edit times";
  end INFO;
end Info;
*/

function getTimeStamp
  output Absyn.TimeStamp timeStamp;
algorithm
  timeStamp := Absyn.dummyTimeStamp;
end getTimeStamp;

function printToken
  input Token token;
  output String strTk;
  String tokName;
  Integer idtk, lns, cns, lne, cne;
  list<Integer> val;
  Info info;
algorithm
  TOKEN(name=tokName, id=idtk, value=val, loc=info) := token;
  INFO(lineNumberStart=lns, columnNumberStart=cns, lineNumberEnd=lne, columnNumberEnd=cne) := info;
  strTk := "[TOKEN:" + tokName + " '" + printBuffer(val, "") + "' (" + intString(lns) + ":" + intString(cns) + "-" + intString(lne) + ":" + intString(cne) + ")]";
end printToken;

function getMergeTokenValue
  input Token token1;
  input Token token2;
  output list<Integer> value;
  list<Integer> val1;
  list<Integer> val2;
algorithm
  TOKEN(value=val1) := token1;
  TOKEN(value=val2) := token2;
  value := listAppend(val1, val2);
end getMergeTokenValue;

function printErrorToken
  input Token token;
  output String strTk;
  String tokName, fileNm;
  Integer idtk, lns, cns, lne, cne;
  list<Integer> val;
  Info info;
algorithm
  TOKEN(name=tokName, id=idtk, value=val, loc=info) := token;
  INFO(fileName=fileNm, lineNumberStart=lns, columnNumberStart=cns, lineNumberEnd=lne, columnNumberEnd=cne) := info;
  // strTk := fileNm + ":" + intString(lns) + ":" + intString(cns) + ": Syntax ERROR near token: [" + tokName + " '" + printBuffer(val, "") + "']";
  // strTk := fileNm + ":" + intString(lns) + ":" + intString(cns) + ": Syntax ERROR near '" + printBuffer(val, "") + "'";
  strTk := "'" + printBuffer(val, "") + "'";
end printErrorToken;

function printInfoError
  input Info info;
  output String strTk;
  String tokName, fileNm;
  Integer idtk, lns, cns, lne, cne;
  list<Integer> val;
algorithm
  INFO(fileName=fileNm, lineNumberStart=lns, columnNumberStart=cns, lineNumberEnd=lne, columnNumberEnd=cne) := info;
  // strTk := fileNm + ":" + intString(lns) + ":" + intString(cns) + ": Syntax ERROR near token: [" + tokName + " '" + printBuffer(val, "") + "']";
  strTk := fileNm + ":" + intString(lns) + ":" + intString(cns);
end printInfoError;

function printShortToken
  input Token token;
  output String strTk;
  String tokName;
  Integer idtk, lns, cns, lne, cne;
  list<Integer> val;
  Info info;
algorithm
  TOKEN(name=tokName, id=idtk, value=val, loc=info) := token;
  INFO(lineNumberStart=lns, columnNumberStart=cns, lineNumberEnd=lne, columnNumberEnd=cne) := info;
  // strTk := "[" + tokName + " '" + printBuffer(val, "") + "']";
  strTk := "'" + printBuffer(val, "") + "'";
end printShortToken;

function printShortToken2
  input Token token;
  output String strTk;
  String tokName;
  Integer idtk, lns, cns, lne, cne;
  list<Integer> val;
  Info info;
algorithm
  TOKEN(name=tokName, id=idtk, value=val, loc=info) := token;
  INFO(lineNumberStart=lns, columnNumberStart=cns, lineNumberEnd=lne, columnNumberEnd=cne) := info;
  strTk := "[" + tokName + " '" + printBuffer(val, "") + "']";
  // strTk := "'" + printBuffer(val, "") + "'";
end printShortToken2;

function printTokens
  input list<Token> inList;
  input String cBuff;
  output String outList;
  list<Token> inList2;
algorithm
  (outList) := match (inList, cBuff)
    local
      Token c;
      String new, tout;
      list<Token> rest;
    case ({}, _)
      then (cBuff);
    else
      equation
        c::rest = inList;
        // new = cBuff + printShortToken2(c);
        new = cBuff + printToken(c);
        (tout) = printTokens(rest, new);
      then (tout);
  end match;
end printTokens;

function countTokens
  input list<Token> inList;
  input Integer sValue;
  output Integer outTotal;
  list<Token> inList2;
algorithm
  // printAny("\nhere1");
  (outTotal) := match (inList, sValue)
    local
      Token c;
      Integer new, tout;
      list<Token> rest;
    case ({}, _)
      then (sValue+1);
    else
      equation
        // printAny("\nhere2");
        c::rest = inList;
        // printAny("\nhere3");
        new = sValue + 1;
        (tout) = countTokens(rest, new);
      then (tout);
  end match;
  // printAny("\nhere4");
end countTokens;

function printBuffer
  input list<Integer> inList;
  input String cBuff;
  output String outList;
  list<Integer> inList2;
algorithm
  (outList) := match (inList, cBuff)
    local
      Integer c;
      String new, tout;
      list<Integer> rest;
    case ({}, _)
      then (cBuff);
    else
      equation
        c::rest = inList;
        new = cBuff + intStringChar(c);
        (tout) = printBuffer(rest, new);
      then (tout);
  end match;
end printBuffer;

end OMCCTypes;
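The token representation in Types.mo stores the matched text as a list of integer character codes, which printBuffer turns back into a string. The Python sketch below mirrors that structure for illustration only; the class and function names are hypothetical counterparts of TOKEN, INFO, printBuffer, getMergeTokenValue and printInfoError, not OMCCp code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Info:
    file_name: str
    line_start: int
    col_start: int
    line_end: int
    col_end: int


@dataclass
class Token:
    name: str
    id: int
    value: List[int]  # character codes of the matched text
    loc: Info


def print_buffer(codes):
    """Counterpart of printBuffer: join character codes into a string."""
    return "".join(chr(code) for code in codes)


def merge_token_value(first, second):
    """Counterpart of getMergeTokenValue: append the two code lists."""
    return first.value + second.value


def print_info_error(info):
    """Counterpart of printInfoError: file:line:column of the start."""
    return f"{info.file_name}:{info.line_start}:{info.col_start}"


# Hypothetical tokens for the text "mo" followed by "del".
left = Token("IDENT", 1, [109, 111], Info("test.mo", 1, 1, 1, 2))
right = Token("IDENT", 1, [100, 101, 108], Info("test.mo", 1, 3, 1, 5))
print(print_buffer(merge_token_value(left, right)))
print(print_info_error(left.loc))
```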


Appendix C

Parser Generator

C.1 Parser.mo

Listing C.1: Parser.mo

package Parser
import Types;
import ParseTable;
import ParseCode;
import Absyn;
import Error;

uniontype Env
  record ENV
    OMCCTypes.Token crTk, lookAhTk;
    list<Integer> state;
    list<String> errMessages;
    Integer errStatus, sState, cState;
    list<OMCCTypes.Token> program, progBk;
    ParseCode.AstStack astStack;
    Boolean isDebugging;
    list<Integer> stateBackup;
    ParseCode.AstStack astStackBackup;
  end ENV;
end Env;

uniontype ParseData
  record PARSE_TABLE
    array<Integer> translate;
    array<Integer> prhs;
    array<Integer> rhs;
    array<Integer> rline;
    array<String> tname;
    array<Integer> toknum;
    array<Integer> r1;
    array<Integer> r2;
    array<Integer> defact;
    array<Integer> defgoto;
    array<Integer> pact;
    array<Integer> pgoto;
    array<Integer> table;
    array<Integer> check;
    array<Integer> stos;
  end PARSE_TABLE;
end ParseData; // to be replaced

/* when the error is positive the parser runs in recovery mode,
   if the error is negative, the parser runs in testing candidate mode,
   if the error is zero, then no error is present or it has been recovered.
   The error value decreases with each shifted token */
constant Integer maxErrShiftToken = 3;
constant Integer maxCandidateTokens = 4;
constant Integer maxErrRecShift = -5;

constant Integer ERR_TYPE_DELETE = 1;
constant Integer ERR_TYPE_INSERT = 2;
constant Integer ERR_TYPE_REPLACE = 3;
constant Integer ERR_TYPE_INSEND = 4;
constant Integer ERR_TYPE_MERGE = 5;

type AstTree = ParseCode.AstTree;

function parse "performs the syntax analysis over the list of tokens and generates the AST tree"
  input list<OMCCTypes.Token> tokens "list of tokens from the lexer";
  input String fileName "file name of the source code";
  input Boolean debug "flag to output debug messages that explain the states of the machine while parsing";
  output Boolean result "result of the parsing";
  output ParseCode.AstTree ast "AST tree that is returned when the result output is true";
  array<String> mm_tname;
  array<Integer> mm_translate, mm_prhs, mm_rhs, mm_rline, mm_toknum, mm_r1, mm_r2,
    mm_defact, mm_defgoto, mm_pact, mm_pgoto, mm_table, mm_check, mm_stos;
  ParseData pt;
  Env env;
  OMCCTypes.Token emptyTok, cTok, cTok2;
  ParseCode.AstStack astStk;
  list<OMCCTypes.Token> rToks;
  list<Integer> stateStk;
  list<String> errStk;
  // Boolean result;
algorithm
  if (debug) then
    print("\nParsing tokens ParseCode ..." + fileName + "\n");
  end if;
  mm_translate := listArray(ParseTable.yytranslate);
  mm_prhs := listArray(ParseTable.yyprhs);
  mm_rhs := listArray(ParseTable.yyrhs);
  mm_rline := listArray(ParseTable.yyrline);
  mm_tname := listArray(ParseTable.yytname);
  mm_toknum := listArray(ParseTable.yytoknum);

  mm_r1 := listArray(ParseTable.yyr1);
  mm_r2 := listArray(ParseTable.yyr2);
  mm_defact := listArray(ParseTable.yydefact);
  mm_defgoto := listArray(ParseTable.yydefgoto);
  mm_pact := listArray(ParseTable.yypact);
  mm_pgoto := listArray(ParseTable.yypgoto);
  mm_table := listArray(ParseTable.yytable);
  mm_check := listArray(ParseTable.yycheck);
  mm_stos := listArray(ParseTable.yystos);

  pt := PARSE_TABLE(mm_translate, mm_prhs, mm_rhs, mm_rline, mm_tname, mm_toknum,
    mm_r1, mm_r2, mm_defact, mm_defgoto, mm_pact, mm_pgoto, mm_table, mm_check, mm_stos);
  stateStk := {0};
  errStk := {};
  astStk := ParseCode.initAstStack(astStk);
  env := ENV(emptyTok, emptyTok, stateStk, errStk, 0, 0, 0, tokens, {}, astStk, debug, stateStk, astStk);

  while (Util.isListEmpty(tokens)==false) loop
    if (debug) then
      print("\nTokens remaining:");
      print(intString(listLength(tokens)));
    end if;

    cTok::tokens := tokens;

    if (Util.isListEmpty(tokens)==false) then
      cTok2::_ := tokens;
      rToks := cTok2::{};
      rToks := cTok::rToks;
    else
      rToks := cTok::{};
    end if;
    (rToks, env, result, ast) := processToken(rToks, env, pt);
    if (result==false) then
      break;
    end if;
  end while;

  if (debug) then
    printAny(ast);
  end if;

  /* if (result==true) then
    print("\n SUCCEED - (AST)");
  else
    print("\n FAILED PARSING");
  end if; */
end parse;

function addSourceMessage
  input list<String> errStk;
  input OMCCTypes.Info info;
algorithm
  Error.addSourceMessage(1, errStk, info);
  // print(printSemStack(listReverse(errStk), ""));
end addSourceMessage;

function printErrorMessages
  input list<String> errStk;
algorithm
  // print("\n ***ERROR(S) FOUND*** ");
  // print(printSemStack(listReverse(errStk), ""));
end printErrorMessages;

function processToken
  input list<OMCCTypes.Token> tokens;
  input Env env;
  input ParseData pt;
  output list<OMCCTypes.Token> rTokens;
  output Env env2;
  output Boolean result;
  output ParseCode.AstTree ast;
  list<OMCCTypes.Token> tempTokens;
  // parse tables
  array<String> mm_tname;
  array<Integer> mm_translate, mm_prhs, mm_rhs, mm_rline, mm_toknum, mm_r1, mm_r2,
    mm_defact, mm_defgoto, mm_pact, mm_pgoto, mm_table, mm_check, mm_stos;
  // env variables
  OMCCTypes.Token cTok, nTk;
  ParseCode.AstStack astStk, astSkBk;
  Boolean debug;
  list<Integer> stateStk, stateSkBk;
  list<String> errStk;
  String astTmp;
  Integer sSt, cSt, lSt, errSt, cFinal, cPactNinf, cTableNinf;
  list<OMCCTypes.Token> prog, prgBk;
algorithm
  PARSE_TABLE(translate=mm_translate, prhs=mm_prhs, rhs=mm_rhs, rline=mm_rline,
    tname=mm_tname, toknum=mm_toknum, r1=mm_r1, r2=mm_r2, defact=mm_defact,
    defgoto=mm_defgoto, pact=mm_pact, pgoto=mm_pgoto, table=mm_table,
    check=mm_check, stos=mm_stos) := pt;

  ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt,
    sState=sSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
    isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) := env;
  if (debug) then
    print("\n[State:" + intString(cSt) + "]{" + printStack(stateStk, "") + "}\n");
  end if;
  env2 := env;
  // Start the LALR(1) Parsing
  cFinal := ParseTable.YYFINAL;
  cPactNinf := ParseTable.YYPACT_NINF;
  cTableNinf := ParseTable.YYTABLE_NINF;

  prog := tokens;
  // cFinal==cSt is a final state? then ACCEPT
  // mm_pact[cSt]==cPactNinf if this REDUCE or ERROR
  result := true;
  (rTokens, result) := matchcontinue (tokens, env, pt, cFinal==cSt, mm_pact[cSt+1]==cPactNinf)
    local
      list<OMCCTypes.Token> rest;
      list<Integer> vl;
      OMCCTypes.Token c, nt;
      Integer n, len, val, tok, tmTok, chkVal;
      String nm, semVal;
      Absyn.Ident idVal;
    case ({}, _, _, false, false)
      equation
        if (debug) then
          print("\nNow at end of input:\n");
        end if;
        n = mm_pact[cSt + 1];
        rest = {};
        if (debug) then
          print("[n:" + intString(n) + "]");
        end if;
        if (n < 0 or ParseTable.YYLAST < n or mm_check[n+1] <> 0) then
          // goto yydefault;
          n = mm_defact[cSt + 1];
          if (n==0) then
            // Error Handler
            if (debug) then
              print("\n Syntax Error found yyerrlab5:" + intString(errSt));
              // printAny("\n Syntax Error found yyerrlab5:" + intString(errSt));
            end if;
            if (errSt>=0) then
              (env2, semVal, result) = errorHandler(cTok, env, pt);
              ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt,
                sState=sSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
                isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
            else
              result = false;
            end if;
          end if;
          if (debug) then
            print(" REDUCE4");
          end if;
          env2 = reduce(n, env, pt);
          ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt,
            sState=sSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
            isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;

        else
          n = mm_table[n + 1];
          if (n [...] =0) then
              (env2, semVal, result) = errorHandler(cTok, env, pt);
            else
              result = false;
            end if;
            ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt,
              sState=sSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
              isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
          end if;
          n = -n;
          if (debug) then
            print(" REDUCE5");
          end if;
          env2 = reduce(n, env, pt);
          ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt,
            sState=sSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
            isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
        else
          if (debug) then
            print(" SHIFT");
          end if;
          if (errSt [...] maxErrRecShift) then // stops when it finds an error
            if (debug) then
              print("\nReprocesing at the END");
            end if;
            (rest, env2, result, ast) = processToken(rest, env2, pt);
          end if;

      then ({}, result);
    case (_, _, _, true, _)
      equation
        if (debug) then
          print("\n\n***************-ACCEPTED-***************\n");
        end if;
        result = true;
        if (Util.isListEmpty(errStk)==false) then
          printErrorMessages(errStk);
          result = false;
        end if;
        ast = ParseCode.getAST(astStk);
      then ({}, result);
    case (_, _, _, false, true)
      equation
        n = mm_defact[cSt + 1];
        if (n == 0) then
          // Error Handler
          if (debug) then
            print("\n Syntax Error found yyerrlab3:" + intString(n));
          end if;
          if (errSt>=0) then
            (env2, semVal, result) = errorHandler(cTok, env, pt);
            ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt,
              sState=sSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
              isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
          else
            result = false;
          end if;
        end if;
        // reduce;
        if (debug) then
          print("REDUCE3");
        end if;
        env2 = reduce(n, env, pt);

        if (result==true) then // stops when it finds an error
          (rest, env2, result, ast) = processToken(tokens, env2, pt);
        end if;
      then (rest, result);
    case (_, _, _, false, false)
      equation
        /* Do appropriate processing given the current state. Read a
           lookahead token if we need one and don't already have one. */
        c::rest = tokens;
        cTok = c;
        OMCCTypes.TOKEN(id=tmTok, name=nm, value=vl) = c;
        semVal = printBuffer(vl, "");
        if (debug) then
          print("[" + nm + ",'" + semVal + "']");
        end if;

        tok = translate(tmTok, pt);
        /* First try to decide what to do without reference to lookahead token. */
        n = mm_pact[cSt + 1];
        if (debug) then
          print("[n:" + intString(n) + "-");
        end if;
        n = n + tok;
        if (debug) then
          print("NT:" + intString(n) + "]");
        end if;
        chkVal = n+1;
        if (chkVal [...] =0) then
              (env2, semVal, result) = errorHandler(cTok, env, pt);
              ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt,
                sState=sSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
                isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
            else
              errSt = maxErrRecShift;
              result = false;
            end if;

          else
            if (debug) then
              print(" REDUCE2");
            end if;
            env2 = reduce(n, env, pt);
            ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt,
              sState=sSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
              isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
            rest = tokens;
            (rest, env2, result, ast) = processToken(rest, env2, pt);
          end if;
        else
          // try to get the value for the action in the table array
          n = mm_table[n + 1];
          if (n [...] =0) then
                (env2, semVal, result) = errorHandler(cTok, env, pt);
                ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt,
                  sState=sSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
                  isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
              else
                result = false;
                errSt = maxErrRecShift;
              end if;
            else
              n = -n;
              if (debug) then
                print(" REDUCE");
              end if;
              env2 = reduce(n, env, pt);

              ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt,
                sState=sSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
                isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
              rest = tokens;
              (rest, env2, result, ast) = processToken(rest, env2, pt);
            end if;
          else
            if (debug) then
              print(" SHIFT1");
            end if;
            cSt = n;
            stateStk = cSt::stateStk;
            idVal = semVal;
            (astStk) = ParseCode.push(astStk, idVal, cTok);
            astSkBk = astStk;
            stateSkBk = stateStk;
            if (errSt<>0) then // reduce the shift error lookup
              errSt = errSt - 1;
            end if;
            env2 = ENV(c, nt, stateStk, errStk, errSt, sSt, cSt, rest, rest, astStk, debug, stateSkBk, astSkBk);
          end if;
        end if;

        if (errSt<>0 or listLength(rest)==0) then
          if ((result==true) and (errSt>maxErrRecShift)) then // stops when it finds an error
            (rest, env2, result, ast) = processToken(rest, env2, pt);
          end if;
        end if;
      then (rest, result);
  end matchcontinue;
  // return the AST
end processToken;

function errorHandler
  input OMCCTypes.Token currTok;
  input Env env;
  input ParseData pt;
  output Env env2;
  output String errorMsg;
  output Boolean result;
  // env variables
  OMCCTypes.Token cTok, nTk;
  ParseCode.AstStack astStk, astSkBk;
  Boolean debug;
  Integer sSt, cSt, errSt;
  list prog, prgBk;
  list stateStk, stateSkBk;
  list errStk;
  // parse tables
  array mm_tname;
  array mm_translate, mm_prhs, mm_rhs, mm_rline, mm_toknum, mm_r1, mm_r2,
        mm_defact, mm_defgoto, mm_pact, mm_pgoto, mm_table, mm_check, mm_stos;
  list redStk;
  Integer numTokens;
  String msg, semVal;
algorithm
  PARSE_TABLE(translate=mm_translate, prhs=mm_prhs, rhs=mm_rhs, rline=mm_rline,
              tname=mm_tname, toknum=mm_toknum, r1=mm_r1, r2=mm_r2, defact=mm_defact,
              defgoto=mm_defgoto, pact=mm_pact, pgoto=mm_pgoto, table=mm_table,
              check=mm_check, stos=mm_stos) := pt;
  ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, sState=sSt, errMessages=errStk,
      errStatus=errSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
      isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) := env;
  if (debug) then
    print("\nERROR RECOVERY INITIATED:");
    print("\n[State:" + intString(cSt) + "]{" + printStack(stateStk, "") + "}\n");
    print("\n[StateStack Backup:{" + printStack(stateSkBk, "") + "}\n");
  end if;
  semVal := OMCCTypes.printToken(currTok);
  (errorMsg, result) := matchcontinue (errSt==0, prog)
    local
      String erMsg, name;
      list candidates;
      list rest;
      Integer i, idTok;
      OMCCTypes.Info info;
    case (true, {}) // start error catching
      equation
        erMsg = OMCCTypes.printErrorToken(currTok);
        // insert token
        if (debug) then
          print("\n Checking INSERT at the END token:");
          // printAny("\n Checking INSERT at the END token:");
        end if;
        candidates = {};
        candidates = checkCandidates(candidates, env, pt, 3);
        if (Util.isListEmpty(candidates)==false) then
          erMsg = erMsg + ", INSERT at the End token " + printCandidateTokens(candidates, "");
        end if;
        errStk = erMsg::errStk;

        OMCCTypes.TOKEN(loc=info) = currTok;
        addSourceMessage(errStk, info);
        printErrorMessages(errStk);
        errSt = maxErrShiftToken;
      then (erMsg, false); // end error catching
    case (true, _) // start error catching
      equation
        //OMCCTypes.TOKEN(id=idTok) = currTok;
        erMsg = OMCCTypes.printErrorToken(currTok);
        if (debug) then
          print("\n Check MERGE token until next token");
        end if;
        nTk::_ = prog;
        OMCCTypes.TOKEN(id=idTok) = nTk;
        if (checkToken(idTok, env, pt, 5)==true) then
          _::nTk::_ = prog;
          erMsg = erMsg + ", MERGE tokens " + OMCCTypes.printShortToken(currTok)
                  + " and " + OMCCTypes.printShortToken(nTk);
        end if;
        // insert token
        if (debug) then
          print("\n Checking INSERT token:");
        end if;
        candidates = {};
        candidates = checkCandidates(candidates, env, pt, 2);
        if (Util.isListEmpty(candidates)==false) then
          erMsg = erMsg + ", INSERT token " + printCandidateTokens(candidates, "");
          // errStk = erMsg::errStk;
        end if;
        errSt = maxErrShiftToken;
        // replace token
        // erMsg = "Syntax Error near " + semVal;
        if (debug) then
          print("\n Checking REPLACE token:");
        end if;
        candidates = {};
        candidates = checkCandidates(candidates, env, pt, 3);
        if (Util.isListEmpty(candidates)==false) then
          erMsg = erMsg + ", REPLACE token with " + printCandidateTokens(candidates, "");
          // errStk = erMsg::errStk;
        end if;
        errSt = maxErrShiftToken;
        // try to suppress the token
        // erMsg = "Syntax Error near " + semVal;
        if (debug) then
          print("\n Check ERASE token until next token");
        end if;
        nTk::_ = prog;
        OMCCTypes.TOKEN(id=idTok) = nTk;
        if (checkToken(idTok, env, pt, 1)==true) then
          erMsg = erMsg + ", ERASE token";
          // errStk = erMsg::errStk;
        end if;
        //printAny(errStk);
        if (Util.isListEmpty(errStk)==true) then
          errStk = erMsg::{};
        else
          errStk = erMsg::errStk;
        end if;
        OMCCTypes.TOKEN(loc=info) = currTok;
        addSourceMessage(errStk, info);
        errSt = maxErrShiftToken;
      then (erMsg, true); // end error catching
    case (false, _) // add one more error
      equation
        printErrorMessages(errStk);
        erMsg = OMCCTypes.printErrorToken(currTok);
      then (erMsg, false);
  end matchcontinue;
  if (debug==true) then
    print("\nERROR NUM:" + intString(errSt) + " DETECTED:\n" + errorMsg);
  end if;
  env2 := ENV(cTok, nTk, stateStk, errStk, errSt, sSt, cSt, prog, prgBk, astStk, debug, stateSkBk, astSkBk);
  // env2 := env;
end errorHandler;

function checkCandidates
  input list candidates;
  input Env env;
  input ParseData pt;
  input Integer action;
  output list resCandidates;
  Integer n;
  // env variables
  OMCCTypes.Token cTok, nTk;
  ParseCode.AstStack astStk, astSkBk;
  Boolean debug;
  Integer sSt, cSt, errSt;
  list prog, prgBk;
  list stateStk, stateSkBk;
  list errStk;
  // parse tables
  array mm_tname;
  array mm_translate, mm_prhs, mm_rhs, mm_rline, mm_toknum, mm_r1, mm_r2,
        mm_defact, mm_defgoto, mm_pact, mm_pgoto, mm_table, mm_check, mm_stos;
  Integer numTokens, i, j=1;
  String name, tokVal;
algorithm
  PARSE_TABLE(tname=mm_tname) := pt;
  resCandidates := candidates;
  numTokens := 255 + ParseTable.YYNTOKENS - 1;
  // exhaustive search over the tokens
  for i in 258:numTokens loop
    if (checkToken(i, env, pt, action)==true) then
      //name := mm_tname[i-255];
      if (j

  0 then
    len := mm_r2[rule];
    if (debug) then
      print("[Reducing(l:" + intString(len) + ",r:" + intString(rule) + ")]");
    end if;
    redStk := {};
    for i in 1:len loop
      val::stateStk := stateStk;
    end for;
    if (errSt>=0) then
      (astStk, error, errMsg) := ParseCode.actionRed(rule, astStk, mm_r2);
    end if;
    if (error) then
      errStk := errMsg::errStk;
      errSt := maxErrShiftToken;
    end if;
    cSt::_ := stateStk;
    n := mm_r1[rule];
    nSt := mm_pgoto[n - ParseTable.YYNTOKENS + 1];
    nSt := nSt + cSt;
    chkVal := nSt + 1;
    if (chkVal =0) and (nSt

    0) then
      for i in 1:numTk loop
        cp := replaceTokenVal(cp, i);
        if (debug==true) then
          print("\n" + cp);
        end if;
      end for;
    end if;
    // replace result type
    re := "(yyval)[" + tokRes + "]";
    cp := System.stringReplace(cp, re, "v" + tokRes);
    if (tokRes=="String") then // default token
      re := "(yyval)";
      cp := System.stringReplace(cp, re, "v" + tokRes);
    end if;
    cp := System.stringReplace(cp, ";}", "");
    cp := System.stringReplace(cp, "{", "");
    if (debug==true) then
      print("\n replaceTokenVal:" + cp);
    end if;

C.2. PARSERGENERATOR.MO

    resTable := cp::resTable;
    cp := "\n // push Result\n";
    resTable := cp::resTable;
    cp := " sk" + tokRes + "= v" + tokRes + "::sk" + tokRes + ";\n";
    resTable := cp::resTable;
  else // root node
    cp := rule;
    if (numTk>0) then
      for i in 1:numTk loop
        cp := replaceTokenVal(cp, i);
        if (debug==true) then
          print("\n" + cp);
        end if;
      end for;
    end if;
    // replace result type
    re := "{(absyntree)[" + tokRes + "]";
    cp := System.stringReplace(cp, re, " v" + tokRes);
    cp := System.stringReplace(cp, ";}", "");
    if (debug==true) then
      print("\n replaceTokenRoot:" + cp);
    end if;
    resTable := cp::resTable;
    cp := "\n // push Result\n";
    resTable := cp::resTable;
    cp := " sk" + tokRes + "= v" + tokRes + "::sk" + tokRes + ";\n";
    resTable := cp::resTable;
  end if;
  resTable := listReverse(resTable);
  processedRules := stringCharListString(resTable);
end processRule;

function replaceTokenVal
  input String rule;
  input Integer tok;
  output String result;
  Integer pos, pos2, numTok;
  String re, rest, typeTok, cp;
algorithm
  numTok := numTokens(rule);
  typeTok := findTypeToken(rule, tok);
  re := "(yyvsp[(" + intString(tok) + ") - (" + intString(numTok) + ")])[" + typeTok + "]";
  pos := System.stringFind(rule, re);
  if (pos

  =0) then
    rest := System.stringFindString(rule, re);
    pos2 := System.stringFind(rest, "]");
    typeTok := System.substring(rest, stringLength(re)+1, pos2);
  elseif (posAST>=0) then
    re := "(absyntree)[";
    rest := System.stringFindString(rule, re);
    pos2 := System.stringFind(rest, "]");
    typeTok := System.substring(rest, stringLength(re)+1, pos2);
  else
    if (pos2>=0) then
      typeTok := "String";
    end if;
  end if;
  if (debug==true) then
    print("\n TypeTok-" + typeTok + "-");
  end if;
end findTypeResult;

function findTypeToken
  input String rule;
  input Integer tok;
  output String typeTok;
  Integer pos, pos2, numTok;
  String re, rest;
algorithm
  numTok := numTokens(rule);
  re := "(yyvsp[(" + intString(tok) + ") - (" + intString(numTok) + ")])[";
  pos := System.stringFind(rule, re);
  if (pos

  0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list yyprhs = {\n";
  resTable := cp::resTable;
  // re := "static const yytype_uint8 yyprhs";
  re := "yyprhs\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list yyrhs = ";
  resTable := cp::resTable;
  // re := "static const yytype_int8 yyrhs";
  re := "yyrhs\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, "{");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+1, pos2+1);
    resTable := ar1::resTable;
  end if;

  cp := ";\n\nconstant list yyrline := {\n";
  resTable := cp::resTable;
  re := "static const yytype_uint8 yyrline";
  re := "yyrline\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list yytname = {\n";
  resTable := cp::resTable;
  // re := "static const char *const yytname";
  re := "yytname[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  // print("\nNumMatches:" + intString(numMatches) + "\n" + rest);
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, ", 0");
    ar1 := System.substring(rest, pos1+2, pos2);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list yytoknum = {\n";
  resTable := cp::resTable;

  // re := "static const yytype_uint16 yytoknum";
  re := "yytoknum\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list yyr1 = {\n";
  resTable := cp::resTable;
  // re := "static const yytype_uint8 yyr1";
  re := "yyr1\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list yyr2 = {\n";
  resTable := cp::resTable;
  // re := "static const yytype_uint8 yyr2";
  re := "yyr2\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list yydefact = ";
  resTable := cp::resTable;
  // re := "static const yytype_uint8 yydefact";
  re := "yydefact\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, "{");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+1, pos2+1);
    resTable := ar1::resTable;
  end if;

  cp := ";\n\nconstant list yydefgoto = ";
  resTable := cp::resTable;
  // re := "static const yytype_int8 yydefgoto";
  re := "yydefgoto\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, "{");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+1, pos2+1);
    resTable := ar1::resTable;
  end if;

  cp := ";\n\nconstant list yypact = ";
  resTable := cp::resTable;
  // re := "static const yytype_int8 yypact";
  re := "yypact\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, "{");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+1, pos2+1);
    resTable := ar1::resTable;
  end if;

  cp := ";\n\nconstant list yypgoto = ";
  resTable := cp::resTable;
  // re := "static const yytype_int8 yypgoto";
  re := "yypgoto\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, "{");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+1, pos2+1);
    resTable := ar1::resTable;
  end if;

  cp := ";\n\nconstant list yytable = ";
  resTable := cp::resTable;
  // re := "static const yytype_uint8 yytable";
  re := "yytable\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, "{");
    pos2 := System.stringFind(rest, "}");

    ar1 := System.substring(rest, pos1+1, pos2+1);
    resTable := ar1::resTable;
  end if;

  cp := ";\n\nconstant list yycheck =";
  resTable := cp::resTable;
  // re := "static const yytype_int8 yycheck";
  re := "yycheck\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, "{");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+1, pos2+1);
    resTable := ar1::resTable;
  end if;

  cp := ";\n\nconstant list yystos = ";
  resTable := cp::resTable;
  // re := "static const yytype_uint8 yystos";
  re := "yystos\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(bisonCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(bisonCode, re);
    pos1 := System.stringFind(rest, "{");
    pos2 := System.stringFind(rest, "}");
    ar1 := System.substring(rest, pos1+1, pos2+1);
    resTable := ar1::resTable;
  end if;

  cp := ";\n\nend " + outFileName + ";";
  resTable := cp::resTable;
  resTable := listReverse(resTable);
  result := stringCharListString(resTable);
  System.writeFile(outFileName + ".mo", result);
  buildResult := true;
end buildParseTable;

public function getCurrentTimeStr
  "returns current time in format Www Mmm dd hh:mm:ss yyyy
   using the asctime() function in time.h (libc)"
  output String timeStr;
  Integer sec, min, hour, mday, mon, year;
algorithm
  timeStr := System.getCurrentTimeStr();
  /* (sec, min, hour, mday, mon, year) := System.getCurrentDateTime();
     timeStr := intString(year) + "/" + intString(mon) + "/" + intString(mday)
                + " " + intString(hour) + ":" + intString(min) + ":" + intString(sec); */
end getCurrentTimeStr;

public function substring3
  input String inString;
  input Integer start;
  input Integer stop;
  output String outString;
  list chars, result;
  String c;
  Integer i;
algorithm
  result := {};
  chars := stringListStringChar(inString);
  for i in 1:stop loop
    c::chars := chars;
    if (i>=start) then
      result := c::result;
    end if;
  end for;
  result := listReverse(result);
  outString := stringCharListString(result);
end substring3;

end ParserGenerator;
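The table-extraction step that buildParseTable repeats for each Bison table (match `name[...] = { ... }` in the generated C text, then slice out the entries) can be sketched in Python. This is only an illustrative sketch, not the thesis implementation: `extract_table` and the sample text are hypothetical, and the function returns every entry instead of reproducing the position arithmetic used in the listing.

```python
import re

def extract_table(bison_c, name):
    """Return the integer entries of `name[...] = { ... }` in Bison-generated C."""
    # Hypothetical helper: table name, optional size, '=', then everything
    # up to the first closing brace (the same shape as the listing's patterns).
    pattern = re.escape(name) + r"\[[0-9]*\]\s*=\s*\{([^}]*)\}"
    match = re.search(pattern, bison_c)
    if match is None:
        return []  # table not present in the generated file
    return [int(entry) for entry in match.group(1).split(",") if entry.strip()]

# Made-up sample in the layout Bison uses for its static tables.
sample = """
static const yytype_uint8 yyr1[] =
{
       0,    28,    29,    30,    30
};
"""

yyr1 = extract_table(sample, "yyr1")
print(yyr1)
```

Stopping at the first `}` is sufficient here because the numeric tables contain no nested braces; a table of quoted strings would need a more careful pattern.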

C.3 ParseCode.tmo

Listing C.3: ParseCode.tmo

package %ParseCode% // %time%
import Types;

%prologue%

uniontype AstStack
  record ASTSTACK
    %astStack%
  end ASTSTACK;
  record EMPTY
  end EMPTY;
end AstStack;

function initAstStack
  input AstStack astStack;
  output AstStack astStack2;
algorithm
  astStack2 := ASTSTACK(%initStack%);
end initAstStack;

function getAST "returns the AST built by the parsing"
  input AstStack astStk "MultiTypedStack used by the parser";
  output AstTree ast "returns the AST in the final type of the tree";
  list retStk;
algorithm
  ASTSTACK(stack%astTree%=retStk) := astStk;
  ast::_ := retStk;
end getAST;

function actionRed
  input Integer act;
  input AstStack astStk;
  input array mm_r2;
  output AstStack astStk2;
  output Boolean error=false;
  output String errorMsg="";
  OMCCTypes.Info info;
  // env variables
  // AstStack
  %astStackVar%
algorithm
  // printContentStack(astStk);
  // printAny("rule:" + intString(act));
  %GETASTSTACK%
  () := matchcontinue (act, astStk)
    local
      // local variables
    %caseAction%
    case (_, _)
      equation
        //lAST = intString(act);
        errorMsg = "\n" + OMCCTypes.printInfoError(info) + ": Ilegal action case " + intString(act);
        error = true;
      then ();
  end matchcontinue;
  %PUTASTSTACK%
  /* astStk2 := ASTSTACK(idStk, inStk, boStk, roStk, exStk, ilStk, stStk); */
end actionRed;

function reduceStringStack
  input list skString;
  input Integer nTokens;
  output list skStringRes;
  String strReduce;
  Integer i;
algorithm
  for i in 1:nTokens loop
    strReduce::skString := skString;
  end for;
  skStringRes := skString;
end reduceStringStack;

function getInfo
  input list skToken;
  input Integer nTokens;
  output OMCCTypes.Info info;
  output list skTokenRes;
  Token token;
  OMCCTypes.Info tmpInfo;
  Integer lns, cns, lne, cne, i;
  String fn;
algorithm
  for i in 1:nTokens loop
    token::skToken := skToken;
    OMCCTypes.TOKEN(loc=info) := token;
    if (i==nTokens) then
      OMCCTypes.INFO(fileName=fn, lineNumberStart=lns, columnNumberStart=cns) := info;
    end if;
    if (i==1) then
      OMCCTypes.INFO(lineNumberEnd=lne, columnNumberEnd=cne) := info;
    end if;
  end for;
  if (nTokens==0) then
    token::_ := skToken;
    OMCCTypes.TOKEN(loc=info) := token;
    OMCCTypes.INFO(fileName=fn, lineNumberStart=lns, columnNumberStart=cns,
                   lineNumberEnd=lne, columnNumberEnd=cne) := info;
  end if;
  info := OMCCTypes.INFO(fn, false, lns, cns, lne, cne, OMCCTypes.getTimeStamp());
  token := OMCCTypes.TOKEN("grupped", 0, {}, info);
  skTokenRes := token::skToken;
end getInfo;

function push
  input AstStack astStk;
  input String inVal;
  input OMCCTypes.Token token;
  output AstStack astStk2;
  // AstStack
  %astStackVar%
algorithm
  %GETASTSTACK%
  skString := inVal::skString;
  skToken := token::skToken;
  %PUTASTSTACK%
end push;

%epilogue%

end %ParseCode%;
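The template above marks its variable parts with `%name%` placeholders (`%ParseCode%`, `%time%`, `%prologue%` and so on). A minimal Python sketch of how such markers could be filled in by plain string replacement follows. The helper `fill_template`, the package name `ParseCodeDemo` and the sample values are hypothetical and are not taken from the generator.

```python
def fill_template(template, values):
    """Replace every %key% marker in the template with its text."""
    result = template
    for key, text in values.items():
        result = result.replace("%" + key + "%", text)
    return result

# A shortened, made-up excerpt of a template in the same style.
template = "package %ParseCode% // %time%\n%prologue%\nend %ParseCode%;\n"

# Hypothetical values; a real generator would derive them from the grammar file.
values = {
    "ParseCode": "ParseCodeDemo",
    "time": "Tue May 31 12:00:00 2011",
    "prologue": "type AstTree = Absyn.Stmt;",
}

generated = fill_template(template, values)
print(generated)
```

Plain replacement keeps the template readable as ordinary MetaModelica, at the cost that a literal `%name%` sequence cannot appear in the fixed part of the file.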

Appendix D

Sample Input

D.1 lexer10.l

Listing D.1: lexer10.l

%{
#include <stdlib.h>
#define YYSTYPE void*
#include "parser.h"

#ifdef RML
#include "yacclib.h"
#include "Absyn.h"
#else
#include "meta_modelica.h"
extern struct record_description Absyn_Exp_INT__desc;
#define Absyn__INT(X1) (mmc_mk_box2(3,&Absyn_Exp_INT__desc,X1))
#endif

int absyn_integer(char *s);
int absyn_ident_or_keyword(char *s);
int yywrap();
%}

%option yylineno

%x c_comment

whitespace   [ \t\n]+
letter       [a-zA-Z]
ident        {letter}({letter}|{digit})*
digit        [0-9]
digits       {digit}+
icon         {digits}

/* Lex style lexical syntax of tokens in the PAM language */

%%

{whitespace}  ;
"while"       return T_WHILE;
"do"          return T_DO;
"else"        return T_ELSE;
"end"         return T_END;
"endif"       return T_ENDIF;
"if"          return T_IF;
"read"        return T_READ;
"then"        return T_THEN;
"write"       return T_WRITE;
{ident}       return T_IDENT;
{digits}      return T_INTCONST;
":="          return T_ASSIGN;
"+"           return T_ADD;
"-"           return T_SUB;
"*"           return T_MUL;
"/"           return T_DIV;
"("           return T_LPAREN;
")"           return T_RPAREN;
"<"           return T_LT;
"<="          return T_LE;
"="           return T_EQ;
"<>"          return T_NE;
">="          return T_GE;
">"           return T_GT;
";"           return T_SEMIC;

"/\*"         { BEGIN(c_comment); }

<c_comment>
{
  "\*/"     { BEGIN(INITIAL); }
  "/\*"     { yyerror("Suspicious comment"); }
  [^\n]     ;
  \n        ;
  <<EOF>>   { yyerror("Unterminated comment"); yyterminate(); }
}

%%

int yywrap()
{
  return 1;
}
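The matching behaviour of the lexer rules above (longest match, with keywords taking priority over `{ident}` when both match the same text) can be illustrated with a small Python sketch. It is not part of OMCCp: `tokenize`, `KEYWORDS` and `SYMBOLS` are hypothetical names, only the keyword, identifier, integer and arithmetic/punctuation rules are covered, and comments and relational operators are left out.

```python
import re

# Keywords from the rule section; an identifier that equals one of them
# is reported as the keyword token, as in the Flex rule order.
KEYWORDS = {
    "while": "T_WHILE", "do": "T_DO", "else": "T_ELSE", "end": "T_END",
    "endif": "T_ENDIF", "if": "T_IF", "read": "T_READ", "then": "T_THEN",
    "write": "T_WRITE",
}
SYMBOLS = {
    ":=": "T_ASSIGN", "+": "T_ADD", "-": "T_SUB", "*": "T_MUL",
    "/": "T_DIV", "(": "T_LPAREN", ")": "T_RPAREN", ";": "T_SEMIC",
}
TOKEN_RE = re.compile(
    r"[ \t\n]+"                          # whitespace: skipped
    r"|(?P<ident>[a-zA-Z][a-zA-Z0-9]*)"  # {letter}({letter}|{digit})*
    r"|(?P<int>[0-9]+)"                  # {digit}+
    r"|(?P<sym>:=|[-+*/();])"
)

def tokenize(text):
    """Return the list of token names for a PAM source fragment."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup == "ident":
            tokens.append(KEYWORDS.get(match.group(), "T_IDENT"))
        elif match.lastgroup == "int":
            tokens.append("T_INTCONST")
        elif match.lastgroup == "sym":
            tokens.append(SYMBOLS[match.group()])
        # whitespace matches no named group, so nothing is emitted
    return tokens

print(tokenize("read x; y := x + 12;"))
print(tokenize("endif ending"))
```

Matching the whole identifier first and then looking it up in the keyword table gives the same result as Flex for inputs such as `ending`, which is a single identifier and not `end` followed by `ing`.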

D.2 parser10.y

Listing D.2: parser10.y

%{
type AstTree = Absyn.Stmt;
type Stmt = Absyn.Stmt;
type IdentLst = Absyn.IdentLst;
type Ident = Absyn.Ident;
type Exp = Absyn.Exp;
type BinOp = Absyn.BinOp;
type RelOp = Absyn.RelOp;

constant list lstSemValue2 = {};

constant list lstSemValue = {
  "error", "undetermined", "read", "write", ":=", "if", "then", "endif",
  "else", "to", "do", "end", "while", "(", ")", "Identity", "Integer",
  "=", "=", "", "+", "-", "*", "/", ";", "" };
%}

%token T_READ
%token T_WRITE
%token T_ASSIGN
%token T_IF
%token T_THEN
%token T_ENDIF
%token T_ELSE
%token T_TO
%token T_DO
%token T_END
%token T_WHILE
%token T_LPAREN
%token T_RPAREN
%token T_IDENT
%token T_INTCONST
%token T_EQ
%token T_LE
%token T_LT
%token T_GT
%token T_GE
%token T_NE
%token T_ADD
%token T_SUB
%token T_MUL
%token T_DIV
%token T_SEMIC

%%

/* Yacc BNF grammar of the PAM language */

program /* Stmt */
  : series
      { (absyntree)[Stmt] = $1[Stmt]; }

series /* Stmt */
  : statement
      { $$[Stmt] = Absyn.SEQ($1[Stmt], Absyn.SKIP()); }
  | statement series
      { $$[Stmt] = Absyn.SEQ($1[Stmt], $2[Stmt]); }

statement /* Stmt */
  : input_statement T_SEMIC
      { $$[Stmt] = $1[Stmt]; }
  | output_statement T_SEMIC
      { $$[Stmt] = $1[Stmt]; }
  | assignment_statement T_SEMIC
      { $$[Stmt] = $1[Stmt]; }
  | conditional_statement
      { $$[Stmt] = $1[Stmt]; }
  | definite_loop
      { $$[Stmt] = $1[Stmt]; }
  | while_loop
      { $$[Stmt] = $1[Stmt]; }

i n p u t s t a t e m e n t /∗ Stmt ∗/ :

T READ v a r i a b l e l i s t { $$ [ Stmt ] = Absyn .READ( $2 [ IdentLst ] ) ; }

75 o u t p u t s t a t e m e n t /∗ Stmt ∗/

:

77

79

v a r i a b l e l i s t /∗ I d e n t L s t ∗/

|

81

T WRITE v a r i a b l e l i s t { $$ [ Stmt ] = Absyn .WRITE( $2 [ IdentLst ] ) ; } :

variable { $$ [ I d e n t L s t ] = $1 [ I d e n t ] : : { } ; } variable variable list { $$ [ I d e n t L s t ] = $1 [ I d e n t ] : : $2 [ IdentLst ] ; }

83 a s s i g n m e n t s t a t e m e n t /∗ Stmt ∗/

:

85

87

c o n d i t i o n a l s t a t e m e n t /∗ Stmt ∗/ : T ENDIF

|

89 91

93

d e f i n i t e l o o p /∗ Stmt ∗/ T END

v a r i a b l e T ASSIGN e x p r e s s i o n { $$ [ Stmt ] = Absyn . ASSIGN( $1 [ I d e n t ] , $3 [ Exp ] ) ; } T IF c o m p a r i s o n T THEN s e r i e s

{ $$ [ Stmt ] = Absyn . IF ( $2 [ Exp ] , $4 [ Stmt ] , Absyn . SKIP ( ) ) ; } T IF c o m p a r i s o n T THEN s e r i e s T ELSE s e r i e s T ENDIF { $$ [ Stmt ] = Absyn . IF ( $2 [ Exp ] , $4 [ Stmt ] , $6 [ Stmt ] ) ; } :

T TO e x p r e s s i o n T DO s e r i e s { $$ [ Stmt ] = Absyn .TODO( $2 [ Exp ] , $4 [ Stmt ] ) ; }

95 w h i l e l o o p /∗ Stmt ∗/ T END 97

:

T WHILE c o m p a r i s o n T DO s e r i e s { $$ [ Stmt ] = Absyn .WHILE( $2 [ Exp ] , $4 [ Stmt ] ) ; }

150

99

APPENDIX D. SAMPLE INPUT

e x p r e s s i o n /∗Exp∗/

: |

101

term

expression

{ $$ [ Exp ] = $1 [ Exp ] ; } w e a k o p e r a t o r term { $$ [ Exp ] = Absyn . BINARY( $1 [ Exp ] , $2 [ BinOp ] , $3 [ Exp ] ) ; }

103 term /∗Exp∗/

:

105 |

term

107

109

e l e m e n t /∗Exp∗/

:

|

111

|

113

element { $$ [ Exp ] = $1 [ Exp ] ; } strong operator element { $$ [ Exp ] = Absyn . BINARY( $1 [ Exp ] , $2 [ BinOp ] , $3 [ Exp ] ) ; }

constant { $$ [ Exp ] ]) ; } variable { $$ [ Exp ] ]) ; } T LPAREN e x p r e s s i o n { $$ [ Exp ]

= Absyn . INT ( $1 [ Integer

= Absyn . IDENT( $1 [ I d e n t T RPAREN = $2 [ Exp ] ; }

115 c o m p a r i s o n /∗Exp∗/

:

117

119

v a r i a b l e /∗ S t r i n g ∗/

121

c o n s t a n t /∗ I n t e g e r ∗/

expression relation expression { $$ [ Exp ] = Absyn . RELATION( $1 [ Exp ] , $2 [ RelOp ] , $3 [ Exp ] ) ; } :

T IDENT { $$ [ I d e n t ] = $1 ; } : T INTCONST { $$ [ Integer ] = $1 ; }

123 r e l a t i o n /∗ RelOp ∗/ | | | | |

125 127 129 131

T LE T LT T GT T GE T NE

{ { { { {

: T EQ { $$ [ RelOp ] = Absyn .EQ( ) ; } $$ [ RelOp ] = Absyn . LE ( ) ; } $$ [ RelOp ] = Absyn . LT( ) ; } $$ [ RelOp ] = Absyn .GT( ) ; } $$ [ RelOp ] = Absyn .GE( ) ; } $$ [ RelOp ] = Absyn .NE( ) ; }

w e a k o p e r a t o r /∗BinOp∗/ : T ADD { $$ [ BinOp ] = Absyn .ADD( ) ; } | T SUB { $$ [ BinOp ] = Absyn . SUB( ) ; }

133 135

s t r o n g o p e r a t o r /∗BinOp∗/ : T MUL { $$ [ BinOp ] = Absyn .MUL( ) ; } | T DIV { $$ [ BinOp ] = Absyn . DIV ( ) ; }

137

%%

139

function printAST ” p r i n t t h e AST b u i l t by t h e p a r s i n g ” input A s t S t a c k a s t S t k ” MultiTypedStack used by t h e p a r s e r ” ; output AstTree a s t ” r e t u r n s t h e AST i n t h e f i n a l t y p e o f t h e tree ” ; l i s t r e t S t k ; algorithm ASTSTACK( s t a c k S t m t=r e t S t k ) := a s t S t k ; printAny ( a s t ) ; ast : : := r e t S t k ; end printAST ;

141

143 145 147

D.2. PARSER10.Y

149 151

153 155

function getSemValue ” r e t r i e v e s semval from t o k e n s ” input Integer t o k e n I d ; output String tokenSemValue ” r e t u r n s s e m a n t i c v a l u e o f t h e token ” ; array v a l u e s ; algorithm v a l u e s := l i s t A r r a y ( l s t S e m V a l u e ) ; tokenSemValue := v a l u e s [ t o k e n I d ] ; end getSemValue ;

Appendix E

Sample Output

These output files correspond to Exercise 10 of the MetaModelica guide [Fritzson and Pop, 2011a].

E.1

ParseTable10.mo

Listing E.1: ParseTable10.mo

package ParseTable10 // generated by OMCC v0.7 Fri Apr 29 17:00:58 2011

constant Integer YYFINAL = 30;
constant Integer YYLAST = 71;
constant Integer YYNTOKENS = 29;
constant Integer YYNNTS = 20;
constant Integer YYNRULES = 39;
constant Integer YYNSTATES = 68;
constant Integer YYUNDEFTOK = 2;
constant Integer YYMAXUTOK = 283;
constant Integer YYPACT_NINF = -20;
constant Integer YYTABLE_NINF = -1;

constant list<Integer> yytranslate = { 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,


2,

2, 2,

2, 2,

2, 2,


2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

2,

1,

2,

6, 7, 13 , 14 , 15 , 16 , 17 , 23 , 24 , 25 , 26 , 27 ,

8,

9,

10 ,

11 ,

12 ,

18 ,

19 ,

20 ,

21 ,

22 ,

2, 28

2,

2, 2,

2, 2,

2, 2,

2, 30

2,

2, 2,

2, 2,

2, 2,

2, 32

2,

2, 2,

2, 2,

2, 2,

2, 34

2,

2, 2,

2, 2,

2, 2,

2, 36

2,

2, 2,

2, 2,

2, 2,

2, 38

2,

2, 2,

2, 2,

2, 2,

2, 40

2,

2, 2,

2, 2,

2, 2,

2, 42

2,

2, 2,

2, 2,

2, 2,

2, 44

2,

2, 2,

2, 2,

2, 2,

2, 46

2,

2, 2,

2, 2,

2, 2,

2, 48

2,

2, 2,

2, 2,

2, 2,

2, 50

2, 5,

2, 2,

3,

52


4,

28};

54 56

c o n s t a n t l i s t yyprhs = { 0, 3, 5, 7, 10 ,

13 ,

16 ,

19 ,

21 ,


23 ,

25 ,

28 ,

31 ,

33 ,

36 ,

40 ,

46 ,

72 ,

74 ,

78 ,

80 ,

82 ,

86 ,

96 , 98 , 110 , 112};

100 ,

102 ,

104 ,

106 ,

108 ,

−1,

32 ,

−1,

32 ,

28 ,

−1,

36 ,

28 ,

−1,

3,

35 ,

−1,

35 ,

−1,

44 ,

5,

8,

−1,

6,

43 ,

10 ,

40 ,

11 ,

31 ,

12 ,

−1,

41 ,

−1,

41 ,

48 ,

42 ,

−1,

15 ,

−1,

40 ,

46 ,

18 ,

−1,

19 ,

−1,

23 ,

−1,

24 ,

−1,

62 , 84 ,

64 , 87 ,

66 , 89 ,

111 ,

113 ,

116 ,

129 ,

131 ,

132 ,

54 , 58

66 ,

68 , 90 ,

94 ,

60 , 92 ,

60 62

64

66

68

70

72

74 76 78

80

82 84

86

88

90

c o n s t a n t l i s t y y r h s = { 30 , 0, −1, 31 , 31 , −1, 33 , 28 , −1, 34 , −1, 37 , −1, 38 , −1, 39 , 4, 35 , −1, 44 , −1, 44 , 40 , −1, 6, 43 , 7, 31 , 7, 31 , 9, 31 , 8, −1, 12 , −1, 13 , 43 , 11 , 31 , 40 , 47 , 41 , −1, 42 , −1, 45 , −1, 44 , −1, 14 , 40 , 40 , −1, 16 , −1, 17 , −1, 20 , −1, 21 , −1, 22 , −1, 25 , −1, 26 , −1, 27 , −1 };

c o n s t a n t l i s t y y r l i n e := { 53 , 53 , 55 , 57 , 60 , 70 , 73 , 76 , 79 , 81 , 93 , 96 , 99 , 101 , 104 , 106 , 109 , 119 , 121 , 124 , 125 , 126 , 127 , 128 , 134 , 135};

68 ,

constant list<String> yytname = {
  "error", "$undefined", "T_READ", "T_WRITE", "T_ASSIGN", "T_IF",
  "T_THEN", "T_ENDIF", "T_ELSE", "T_TO", "T_DO", "T_END", "T_WHILE",
  "T_LPAREN", "T_RPAREN", "T_IDENT", "T_INTCONST", "T_EQ", "T_LE",
  "T_LT", "T_GT", "T_GE", "T_NE", "T_ADD", "T_SUB", "T_MUL", "T_DIV",
  "T_SEMIC", "$accept", "program", "series", "statement",
  "input_statement", "output_statement", "variable_list",
  "assignment_statement", "conditional_statement", "definite_loop",
  "while_loop", "expression", "term", "element", "comparison",
  "variable", "constant", "relation", "weak_operator",
  "strong_operator" };

96

98 100

102

104 106

108

110

112

114

116

118 120

122

c o n s t a n t l i s t yytoknum = { 256 , 257 , 258 , 259 , 260 , 261 , 262 , 263 , 264 , 265 , 266 , 267 , 268 , 269 , 270 , 271 , 272 , 273 , 274 , 275 , 276 , 277 , 278 , 279 , 280 , 281 , 282 , 283}; c o n s t a n t l i s t yyr1 = { 29 , 30 , 31 , 31 , 32 , 32 , 33 , 34 , 35 , 35 , 38 , 39 , 40 , 40 , 41 , 41 , 42 , 44 , 45 , 46 , 46 , 46 , 46 , 46 , 48 , 48}; c o n s t a n t l i s t yyr2 = { 2, 1, 1, 2, 1, 2, 2, 1, 5, 5, 1, 3, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1};

2,

32 , 36 ,

32 , 37 ,

32 , 37 ,

42 ,

42 ,

43 ,

46 ,

47 ,

47 ,

2,

2,

1,

1,

2,

3,

5,

7,

1,

1,

3,

3,

1,

1,

1,

1,

0,

28 ,

0,

10 ,

0,

11 ,

0,

25 ,

24 ,

0,

14 ,

0,

37 ,

0,

0,

26 ,

27 ,

21 ,

19 ,

0,

11 ,

17 ,

12 ,

26 ,

27 ,

46 ,

13 ,

−20 ,

19 ,

−20 ,

26 ,

−20 ,

29 ,

−20 ,

−20 ,

c o n s t a n t l i s t y y d e f a c t = { 0, 0, 0, 0, 0, 2, 3, 0, 0, 0, 8, 9, 13 , 12 , 0, 29 , 0, 20 , 22 , 0, 0, 1, 4, 5, 6, 7, 30 , 31 , 32 , 33 , 34 , 35 , 36 , 38 , 39 , 0, 0, 0, 0, 15 , 23 , 0, 0, 0, 16 , 0, 18 , }; c o n s t a n t l i s t y y d e f g o t o = { −1, 7, 8, 9, 10 , 13 , 14 , 15 , 22 , 23 , 24 , 25 , 47 , 50 };

17

124 126

128

c o n s t a n t l i s t yy pa ct = { 22 , 1, 1, 13 , 13 , −20 , 22 , −8, 5, 6, −20 , −20 , 1, −20 , 13 , −20 , 46 , −19 , −20 , −1, 28 ,

32 ,


−20 ,

−20 , −20 , −20 , −20 , −20 , −20 , −20 , −20 , −20 , 13 , 22 , 22 , −20 , 7, 30 , 31 , −20 ,

130

132

−20 ,

−20 ,

13 ,

−20 ,

−11 ,

−20 ,

−20 ,

−20 ,

13 ,

13 ,

22 ,

−13 ,

−20 ,

−13 ,

−19 ,

22 ,

−20 ,

−20 ,

32 ,

−20 ,

3,

−20 ,

0,

−20 ,

−20 ,

19 ,

28 ,

48 ,

62 ,

63 ,

6,

1,

2,

20 ,

5,

51 ,

54 ,

59 ,

60 ,

61 ,

0,

0,

66 ,

39 ,

40 ,

41 ,

2,

4,

26 ,

8,

9,

16 ,

3,

4,

14 ,

13 ,

7,

35 ,

51 ,

52 ,

53 ,

−1,

−1,

63 ,

19 ,

20 ,

21 ,

13 ,

16 ,

30 ,

39 ,

44 ,

35 ,

−20

}; 134 136

138 140

142

144

146

148

c o n s t a n t l i s t yypgoto = { −20 , −20 , −6, −20 , −20 , −20 , −20 , −20 , 2, −3, −9, 44 , −20 , −20 }; c o n s t a n t l i s t y y t a b l e = { 16 , 18 , 18 , 31 , 55 , 49 , 16 , 52 , 44 , 45 , 44 , 45 , 18 , 30 , 32 , 36 , 37 , 44 , 45 , 3, 6, 21 , 35 , 4, 33 , 34 , 6, 53 , 67 , 58 , 64 , 65 , 57 , 56 , 29 , 0, 16 , 16 , 16 , 0, 0, 0, 0, 0, 0, 16 , 38 , 42 , 43 , 44 , 45 };

150 152

154

156

158

160 162

164

c o n s t a n t l i s t yycheck ={ 0, 1, 2, 9, 15 , 27 , 9, 11 , 24 , 25 , 24 , 25 , 18 , 0, 28 , 18 , 20 , 24 , 25 , 6, 16 , 17 , 5, 10 , 28 , 28 , 16 , 11 , 8, 50 , 12 , 12 , 47 , 46 , 5, −1, 51 , 52 , 53 , −1, −1, −1, −1, −1, −1, 63 , 18 , 22 , 23 , 24 , 25 }; c o n s t a n t l i s t y y s t o s = { 0, 3, 4, 6, 10 , 31 , 32 , 33 , 34 , 36 , 37 , 38 , 44 , 35 ,


14 ,

17 ,

40 ,

41 ,

42 ,

43 ,

44 ,

45 ,

31 , 28 , 18 , 19 , 20 , 21 , 22 , 26 , 27 , 48 , 7, 11 , 42 , 31 , 31 , 31 , 8,

28 ,

28 ,

5,

35 ,

40 ,

23 ,

24 ,

25 ,

46 ,

47 ,

11 ,

40 ,

15 ,

40 ,

41 ,

9,

12 ,

12 ,

31 ,

8

40 , 166

0,

168

170

};

end ParseTable10;
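The constants YYUNDEFTOK and YYMAXUTOK and the yytranslate table work together to map the token numbers produced by the lexer (258 to 283 in Token10.mo) to the parser's internal symbol numbers. The following Python sketch illustrates that lookup. It assumes the zero-based table layout that Bison conventionally emits; the generated MetaModelica list may be offset to suit 1-based indexing, and the function name `translate` is hypothetical.

```python
YYUNDEFTOK = 2    # internal number of the "$undefined" symbol
YYMAXUTOK = 283   # largest external token number (T_SEMIC)

# Assumed zero-based layout (Bison convention, not taken from the listing):
# index 0 is end of input, 1..255 are unused character codes,
# 256 is "error", 257 is "$undefined", 258..283 are T_READ..T_SEMIC.
yytranslate = [0] + [YYUNDEFTOK] * 255 + [1, 2] + list(range(3, 29))


def translate(external_token: int) -> int:
    """Map a lexer token number to the parser's internal symbol number."""
    if 0 <= external_token <= YYMAXUTOK:
        return yytranslate[external_token]
    return YYUNDEFTOK  # anything out of range is treated as undefined


T_READ, T_SEMIC = 258, 283
print(translate(T_READ), translate(T_SEMIC), translate(999))
```

Under these assumptions the internal numbers index directly into yytname, so T_READ lands on the third symbol after end of input and T_SEMIC on the last terminal.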

E.2


43 ,

ParseCode10.mo

Listing E.2: ParseCode10.mo

package ParseCode10 // Generated generated by OMCC v0.7 Fri Apr 29 17:00:58 2011

import AbsynPAM;
import Types;

type AstTree = AbsynPAM.Stmt;
type Stmt = AbsynPAM.Stmt;
type IdentLst = AbsynPAM.IdentLst;
type Ident = AbsynPAM.Ident;
type Exp = AbsynPAM.Exp;
type BinOp = AbsynPAM.BinOp;
type RelOp = AbsynPAM.RelOp;

constant list<String> lstSemValue2 = {};

constant list<String> lstSemValue = {
  "error", "undetermined", "read", "write", ":=", "if", "then", "endif",
  "else", "to", "do", "end", "while", "(", ")", "Identity", "Integer",
  "=", "=", "", "+", "-", "*", "/", ";", "" };

uniontype AstStack
  record ASTSTACK
    list<RelOp> stackRelOp;
    list<BinOp> stackBinOp;
    list<Exp> stackExp;
    list<Ident> stackIdent;
    list<IdentLst> stackIdentLst;
    list<Stmt> stackStmt;
    list<String> stackString;
    list<Integer> stackInteger;
  end ASTSTACK;
  record EMPTY
  end EMPTY;
end AstStack;

function initAstStack
  input AstStack astStack;
  output AstStack astStack2;
algorithm
  astStack2 := ASTSTACK({}, {}, {}, {}, {}, {}, {}, {});
end initAstStack;

function getAST "returns the AST built by the parsing"
  input AstStack astStk "MultiTypedStack used by the parser";
  output AstTree ast "returns the AST in the final type of the tree";
  list<Stmt> retStk;
algorithm
  ASTSTACK(stackStmt=retStk) := astStk;
  ast::_ := retStk;
end getAST;

function actionRed
  input Integer act;
  input AstStack astStk;
  output AstStack astStk2;
  // env variables

  // AstStack
  list<RelOp> skRelOp;
  list<BinOp> skBinOp;
  list<Exp> skExp;
  list<Ident> skIdent;
  list<IdentLst> skIdentLst;
  list<Stmt> skStmt;
  list<String> skString;
  list<Integer> skInteger;

algorithm
  /* ASTSTACK(identStack=idStk, intStack=inStk, binOpStack=boStk,
     relOpStack=roStk, expStack=exStk, identLstStack=ilStk,
     stmtStack=stStk) := astStk; */
  ASTSTACK(stackRelOp=skRelOp, stackBinOp=skBinOp, stackExp=skExp,
           stackIdent=skIdent, stackIdentLst=skIdentLst, stackStmt=skStmt,
           stackString=skString, stackInteger=skInteger) := astStk;
  () := matchcontinue (act, astStk)
    local
      // local variables
      RelOp vRelOp, v1RelOp, v2RelOp, v3RelOp, v4RelOp, v5RelOp, v6RelOp, v7RelOp;
      BinOp vBinOp, v1BinOp, v2BinOp, v3BinOp, v4BinOp, v5BinOp, v6BinOp, v7BinOp;
      Exp vExp, v1Exp, v2Exp, v3Exp, v4Exp, v5Exp, v6Exp, v7Exp;
      Ident vIdent, v1Ident, v2Ident, v3Ident, v4Ident, v5Ident, v6Ident, v7Ident;
      IdentLst vIdentLst, v1IdentLst, v2IdentLst, v3IdentLst, v4IdentLst, v5IdentLst, v6IdentLst, v7IdentLst;
      Stmt vStmt, v1Stmt, v2Stmt, v3Stmt, v4Stmt, v5Stmt, v6Stmt, v7Stmt;
      String vString, v1String, v2String, v3String, v4String, v5String, v6String, v7String;
      Integer vInteger, v1Integer, v2Integer, v3Integer, v4Integer, v5Integer, v6Integer, v7Integer;

    case (2,_) // #line 54 "parser10.y"
      equation
        // reduce
        v1Stmt::skStmt = skStmt;
        // build
        vStmt = (v1Stmt);
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (3,_) // #line 56 "parser10.y"
      equation
        // reduce
        v1Stmt::skStmt = skStmt;
        // build
        vStmt = AbsynPAM.SEQ((v1Stmt), AbsynPAM.SKIP());
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (4,_) // #line 58 "parser10.y"
      equation
        // reduce
        v1Stmt::skStmt = skStmt;
        v2Stmt::skStmt = skStmt;
        // build
        vStmt = AbsynPAM.SEQ((v1Stmt), (v2Stmt));
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (5,_) // #line 61 "parser10.y"
      equation
        // reduce
        v1Stmt::skStmt = skStmt;
        v2String::skString = skString;
        // build
        vStmt = (v1Stmt);
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (6,_) // #line 63 "parser10.y"
      equation
        // reduce
        v1Stmt::skStmt = skStmt;
        v2String::skString = skString;
        // build
        vStmt = (v1Stmt);
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (7,_) // #line 65 "parser10.y"
      equation
        // reduce
        v1Stmt::skStmt = skStmt;
        v2String::skString = skString;
        // build
        vStmt = (v1Stmt);
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (8,_) // #line 67 "parser10.y"
      equation
        // reduce
        v1Stmt::skStmt = skStmt;
        // build
        vStmt = (v1Stmt);
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (9,_) // #line 69 "parser10.y"
      equation
        // reduce
        v1Stmt::skStmt = skStmt;
        // build
        vStmt = (v1Stmt);
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (10,_) // #line 71 "parser10.y"
      equation
        // reduce
        v1Stmt::skStmt = skStmt;
        // build
        vStmt = (v1Stmt);
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (11,_) // #line 74 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        v2IdentLst::skIdentLst = skIdentLst;
        // build
        vStmt = AbsynPAM.READ((v2IdentLst));
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (12,_) // #line 77 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        v2IdentLst::skIdentLst = skIdentLst;
        // build
        vStmt = AbsynPAM.WRITE((v2IdentLst));
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (13,_) // #line 80 "parser10.y"
      equation
        // reduce
        v1Ident::skIdent = skIdent;
        // build
        vIdentLst = (v1Ident)::{};
        // push Result
        skIdentLst = vIdentLst::skIdentLst;
      then ();

    case (14,_) // #line 82 "parser10.y"
      equation
        // reduce
        v1Ident::skIdent = skIdent;
        v2IdentLst::skIdentLst = skIdentLst;
        // build
        vIdentLst = (v1Ident)::(v2IdentLst);
        // push Result
        skIdentLst = vIdentLst::skIdentLst;
      then ();

    case (15,_) // #line 85 "parser10.y"
      equation
        // reduce
        v1Ident::skIdent = skIdent;
        v2String::skString = skString;
        v3Exp::skExp = skExp;
        // build
        vStmt = AbsynPAM.ASSIGN((v1Ident), (v3Exp));
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (16,_) // #line 88 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        v2Exp::skExp = skExp;
        v3String::skString = skString;
        v4Stmt::skStmt = skStmt;
        v5String::skString = skString;
        // build
        vStmt = AbsynPAM.IF((v2Exp), (v4Stmt), AbsynPAM.SKIP());
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (17,_) // #line 91 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        v2Exp::skExp = skExp;
        v3String::skString = skString;
        v4Stmt::skStmt = skStmt;
        v5String::skString = skString;
        v6Stmt::skStmt = skStmt;
        v7String::skString = skString;
        // build
        vStmt = AbsynPAM.IF((v2Exp), (v4Stmt), (v6Stmt));
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (18,_) // #line 94 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        v2Exp::skExp = skExp;
        v3String::skString = skString;
        v4Stmt::skStmt = skStmt;
        v5String::skString = skString;
        // build
        vStmt = AbsynPAM.TODO((v2Exp), (v4Stmt));
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (19,_) // #line 97 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        v2Exp::skExp = skExp;
        v3String::skString = skString;
        v4Stmt::skStmt = skStmt;
        v5String::skString = skString;
        // build
        vStmt = AbsynPAM.WHILE((v2Exp), (v4Stmt));
        // push Result
        skStmt = vStmt::skStmt;
      then ();

    case (20,_) // #line 100 "parser10.y"
      equation
        // reduce
        v1Exp::skExp = skExp;
        // build
        vExp = (v1Exp);
        // push Result
        skExp = vExp::skExp;
      then ();

    case (21,_) // #line 102 "parser10.y"
      equation
        // reduce
        v1Exp::skExp = skExp;
        v2BinOp::skBinOp = skBinOp;
        v3Exp::skExp = skExp;
        // build
        vExp = AbsynPAM.BINARY((v1Exp), (v2BinOp), (v3Exp));
        // push Result
        skExp = vExp::skExp;
      then ();

    case (22,_) // #line 105 "parser10.y"
      equation
        // reduce
        v1Exp::skExp = skExp;
        // build
        vExp = (v1Exp);
        // push Result
        skExp = vExp::skExp;
      then ();

    case (23,_) // #line 107 "parser10.y"
      equation
        // reduce
        v1Exp::skExp = skExp;
        v2BinOp::skBinOp = skBinOp;
        v3Exp::skExp = skExp;
        // build
        vExp = AbsynPAM.BINARY((v1Exp), (v2BinOp), (v3Exp));
        // push Result
        skExp = vExp::skExp;
      then ();

    case (24,_) // #line 110 "parser10.y"
      equation
        // reduce
        v1Integer::skInteger = skInteger;
        // build
        vExp = AbsynPAM.INT((v1Integer));
        // push Result
        skExp = vExp::skExp;
      then ();

    case (25,_) // #line 112 "parser10.y"
      equation
        // reduce
        v1Ident::skIdent = skIdent;
        // build
        vExp = AbsynPAM.IDENT((v1Ident));
        // push Result
        skExp = vExp::skExp;
      then ();

    case (26,_) // #line 114 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        v2Exp::skExp = skExp;
        v3String::skString = skString;
        // build
        vExp = (v2Exp);
        // push Result
        skExp = vExp::skExp;
      then ();

    case (27,_) // #line 117 "parser10.y"
      equation
        // reduce
        v1Exp::skExp = skExp;
        v2RelOp::skRelOp = skRelOp;
        v3Exp::skExp = skExp;
        // build
        vExp = AbsynPAM.RELATION((v1Exp), (v2RelOp), (v3Exp));
        // push Result
        skExp = vExp::skExp;
      then ();

    case (28,_) // #line 120 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vIdent = (v1String);
        // push Result
        skIdent = vIdent::skIdent;
      then ();

    case (29,_) // #line 122 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vInteger = (stringInt(v1String));
        // push Result
        skInteger = vInteger::skInteger;
      then ();

    case (30,_) // #line 124 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vRelOp = AbsynPAM.EQ();
        // push Result
        skRelOp = vRelOp::skRelOp;
      then ();

    case (31,_) // #line 125 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vRelOp = AbsynPAM.LE();
        // push Result
        skRelOp = vRelOp::skRelOp;
      then ();

    case (32,_) // #line 126 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vRelOp = AbsynPAM.LT();
        // push Result
        skRelOp = vRelOp::skRelOp;
      then ();

    case (33,_) // #line 127 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vRelOp = AbsynPAM.GT();
        // push Result
        skRelOp = vRelOp::skRelOp;
      then ();

    case (34,_) // #line 128 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vRelOp = AbsynPAM.GE();
        // push Result
        skRelOp = vRelOp::skRelOp;
      then ();

    case (35,_) // #line 129 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vRelOp = AbsynPAM.NE();
        // push Result
        skRelOp = vRelOp::skRelOp;
      then ();

    case (36,_) // #line 131 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vBinOp = AbsynPAM.ADD();
        // push Result
        skBinOp = vBinOp::skBinOp;
      then ();

    case (37,_) // #line 132 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vBinOp = AbsynPAM.SUB();
        // push Result
        skBinOp = vBinOp::skBinOp;
      then ();

    case (38,_) // #line 134 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vBinOp = AbsynPAM.MUL();
        // push Result
        skBinOp = vBinOp::skBinOp;
      then ();

    case (39,_) // #line 135 "parser10.y"
      equation
        // reduce
        v1String::skString = skString;
        // build
        vBinOp = AbsynPAM.DIV();
        // push Result
        skBinOp = vBinOp::skBinOp;
      then ();

    case (_,_)
      equation
        print("FAIL: Ilegal action");
        //lAST = intString(act);
      then ();
  end matchcontinue;
  astStk2 := ASTSTACK(skRelOp, skBinOp, skExp, skIdent, skIdentLst, skStmt, skString, skInteger);
  /* astStk2 := ASTSTACK(idStk, inStk, boStk, roStk, exStk, ilStk, stStk); */
end actionRed;

function push
  input AstStack astStk;
  input String inVal;
  output AstStack astStk2;
  // AstStack
  list<RelOp> skRelOp;
  list<BinOp> skBinOp;
  list<Exp> skExp;
  list<Ident> skIdent;
  list<IdentLst> skIdentLst;
  list<Stmt> skStmt;
  list<String> skString;
  list<Integer> skInteger;
algorithm
  ASTSTACK(stackRelOp=skRelOp, stackBinOp=skBinOp, stackExp=skExp,
           stackIdent=skIdent, stackIdentLst=skIdentLst, stackStmt=skStmt,
           stackString=skString, stackInteger=skInteger) := astStk;
  skString := inVal::skString;
  astStk2 := ASTSTACK(skRelOp, skBinOp, skExp, skIdent, skIdentLst, skStmt, skString, skInteger);
end push;

function printAST "print the AST built by the parsing"
  input AstStack astStk "MultiTypedStack used by the parser";
  output AstTree ast "returns the AST in the final type of the tree";
  list<Stmt> retStk;
algorithm
  ASTSTACK(stackStmt=retStk) := astStk;
  printAny(ast);
  ast::_ := retStk;
end printAST;

function getSemValue "retrieves semval from tokens"
  input Integer tokenId;
  output String tokenSemValue "returns semantic value of the token";
  array<String> values;
algorithm
  values := listArray(lstSemValue);
  tokenSemValue := values[tokenId];
end getSemValue;

end ParseCode10;
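The generated code above keeps one stack per semantic type (the fields of ASTSTACK): shifted tokens go onto the String stack through push, and each reduction in actionRed pops its right-hand-side values from the stack named by their type before pushing the result. The Python sketch below mirrors only the reduction for `variable T_ASSIGN expression`. It is an illustration, not the generated implementation; the names `new_stacks` and `reduce_assignment` and the tuple encoding of AST nodes are hypothetical.

```python
from typing import Dict, List

Stacks = Dict[str, List[object]]


def new_stacks() -> Stacks:
    """One stack per semantic type, like the fields of ASTSTACK."""
    names = ("RelOp", "BinOp", "Exp", "Ident", "IdentLst",
             "Stmt", "String", "Integer")
    return {name: [] for name in names}


def push(stacks: Stacks, value: str) -> None:
    """Shifted tokens always arrive as strings."""
    stacks["String"].append(value)


def reduce_assignment(stacks: Stacks) -> None:
    """Reduction for: variable T_ASSIGN expression."""
    ident = stacks["Ident"].pop()    # value of the variable
    stacks["String"].pop()           # text of the ':=' token, unused
    exp = stacks["Exp"].pop()        # value of the expression
    stacks["Stmt"].append(("ASSIGN", ident, exp))


stacks = new_stacks()
stacks["Ident"].append("x")          # result of an earlier reduction
push(stacks, ":=")                   # shifted token
stacks["Exp"].append(("INT", 1))     # result of an earlier reduction
reduce_assignment(stacks)
print(stacks["Stmt"])
```

Because every value lives on a stack of its own type, no cast or tagged union is needed when an action reads it back, which is the point of annotating each `$n` with a type in the grammar file.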

E.3

Token10.mo

Listing E.3: Token10.mo

package Token10 // generated by OMCC v0.7 Fri Apr 29 17:00:58 2011

constant Integer T_READ = 258;
constant Integer T_WRITE = 259;
constant Integer T_ASSIGN = 260;
constant Integer T_IF = 261;
constant Integer T_THEN = 262;
constant Integer T_ENDIF = 263;
constant Integer T_ELSE = 264;
constant Integer T_TO = 265;
constant Integer T_DO = 266;
constant Integer T_END = 267;
constant Integer T_WHILE = 268;
constant Integer T_LPAREN = 269;
constant Integer T_RPAREN = 270;
constant Integer T_IDENT = 271;
constant Integer T_INTCONST = 272;
constant Integer T_EQ = 273;
constant Integer T_LE = 274;
constant Integer T_LT = 275;
constant Integer T_GT = 276;
constant Integer T_GE = 277;
constant Integer T_NE = 278;
constant Integer T_ADD = 279;
constant Integer T_SUB = 280;
constant Integer T_MUL = 281;
constant Integer T_DIV = 282;
constant Integer T_SEMIC = 283;

end Token10;

E.4

LexTable10.mo

Listing E.4: LexTable10.mo

package LexTable10 // generated by OMCC v0.7 Fri Apr 29 17:00:58 2011

constant Integer yy_limit := 65;
constant Integer yy_finish := 76;

c o n s t a n t l i s t y y a c c l i s t := 33 , 32 , 1, 32 , 18 , 14 , 32 , 15 , 32 , 17 , 32 , 20 , 32 , 22 , 32 , 32 , 11 , 32 , 11 , 32 , 32 , 30 , 32 , 31 , 32 , 27 , 12 , 13 , 21 , 23 , 11 , 7, 11 , 11 , 11 , 5, 11 , 11 , 11 , 11 , 11 , 9, 11 , 11 , 11 , 11

{ 32 , 32 , 25 , 11 , 30 , 24 , 11 , 11 , 6,

19 , 12 , 32 , 32 , 32 , 11 , 11 , 4, 11 ,

32 , 32 , 11 , 11 , 30 , 3, 28 , 11 , 2,

16 , 32 , 32 , 32 , 32 , 11 , 29 , 11 , 11 ,

32 , 26 , 11 , 11 , 1, 11 , 11 , 8, 10 ,

3, 22 , 42 , 56 , 68 , 80 ,

5, 24 , 44 , 57 , 69 , 82 ,

7, 26 , 46 , 58 , 70 , 84 ,

9, 28 , 48 , 60 , 71 , 85 ,

1, 1, 1, 1, 1, 10 , 16 , 16 , 16 , 1,

1, 1, 1, 1, 9, 10 , 16 , 16 , 16 , 17 ,

1, 1, 1, 1, 10 , 11 , 16 , 16 , 16 , 16 ,

2, 1, 1, 1, 10 , 12 , 16 , 16 , 16 , 16 ,

3, 1, 1, 4, 10 , 13 , 16 , 16 , 16 , 18 ,

16 , 28 , 1, 1, 1, 1, 1, 1, 1, 1,

16 , 16 , 1, 1, 1, 1, 1, 1, 1, 1,

23 , 16 , 1, 1, 1, 1, 1, 1, 1, 1,

16 , 29 , 1, 1, 1, 1, 1, 1, 1, 1,

24 , 16 , 1, 1, 1, 1, 1, 1, 1, 1,

19 }; 21 23 25 27 29

c o n s t a n t l i s t y y a c c e p t := { 1, 1, 1, 1, 1, 2, 11 , 13 , 15 , 17 , 19 , 20 , 30 , 32 , 34 , 36 , 38 , 40 , 50 , 51 , 52 , 53 , 54 , 55 , 61 , 62 , 64 , 65 , 66 , 67 , 73 , 74 , 75 , 76 , 77 , 79 , 86 , 88 , 90 , 92 , 92 };

31 33 35 37 39 41

c o n s t a n t l i s t y y e c := { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 5, 6, 7, 1, 8, 10 , 10 , 10 , 10 , 10 , 14 , 15 , 1, 1, 16 , 16 , 16 , 16 , 16 , 16 , 16 , 16 , 16 , 16 , 16 , 1, 1, 1, 1, 1,

43 45 47 49 51 53

19 , 25 , 16 , 1, 1, 1, 1, 1, 1, 1,

20 , 16 , 16 , 1, 1, 1, 1, 1, 1, 1,

16 , 16 , 1, 1, 1, 1, 1, 1, 1, 1,

21 , 26 , 1, 1, 1, 1, 1, 1, 1, 1,

22 , 27 , 1, 1, 1, 1, 1, 1, 1, 1,


55

1, 1, 1, 1, 1, 1,

57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95 97 99

1, 1, 1, 1, 1, 1,

1, 1, 1, 1, 1, 1,

1, 1, 1, 1, 1, 1,

1, 1, 1, 1, 1, 1

1, 1, 1, 1, 1,

1, 1, 1, 1, 1,

1, 1, 1, 1, 1,

1, 1, 1, 1, 1,

1, 1, 1, 1, 1,

1, 2, 2,

1, 2, 2,

1, 2, 2,

1, 2, 2

2, 2,

76 , 76 , 23 , 76 , 36 , 28 , 30

36 , 26 , 76 , 76 , 76 , 0,

76 , 76 , 76 , 0, 76 , 0,

76 , 57 , 57 , 0, 38 , 28 ,

76 , 0, 59 , 36 , 34 , 16 ,

64 , 64 , 66 , 64 , 66 , 66 , 64

64 , 64 , 64 , 64 , 64 , 66 ,

64 , 64 , 64 , 66 , 64 , 66 ,

64 , 64 , 64 , 66 , 66 , 66 ,

64 , 66 , 64 , 66 , 66 , 66 ,

10 , 20 , 24 , 30 , 31 , 56 , 47 , 5, 64 , 64 ,

11 , 20 , 20 , 30 , 62 , 55 , 44 , 64 , 64 , 64 ,

12 , 21 , 25 , 31 , 61 , 54 , 43 , 64 , 64 , 64 ,

13 , 22 , 26 , 31 , 46 , 53 , 42 , 64 , 64 , 64 ,

14 , 20 , 28 , 35 , 27 , 52 , 39 , 64 , 64 , 64 ,

1, 1, 1, 3, 31 , 50 , 29 ,

1, 1, 1, 4, 59 , 49 , 25 ,

1, 1, 1, 7, 56 , 46 , 24 ,

1, 1, 1, 7, 26 , 45 , 23 ,

1, 1, 3, 17 , 65 , 44 , 21 ,

}; c o n s t a n t l i s t yy meta := { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, }; c o n s t a n t l i s t y y b a s e 0, 0, 27 , 28 , 76 , 76 , 68 , 63 , 45 , 19 , 49 , 49 , 43 , 76 , 54 , 76 , 44 , 0, 44 , 41 , 37 , 30 , 30 , 24 , 0, 0, 0, 76 , };

:= { 75 , 58 , 46 , 76 , 37 , 0, 49 ,

c o n s t a n t l i s t y y d e f := { 64 , 1, 65 , 65 , 64 , 64 , 64 , 64 , 64 , 64 , 66 , 66 , 66 , 66 , 66 , 64 , 64 , 64 , 64 , 64 , 66 , 66 , 66 , 66 , 66 , 66 , 66 , 66 , 66 , 66 , 66 , 66 , 66 , 0, 64 , }; c o n s t a n t l i s t y y n x t := { 6, 7, 7, 8, 9, 15 , 16 , 17 , 18 , 19 , 20 , 23 , 20 , 20 , 20 , 28 , 38 , 29 , 29 , 63 , 36 , 40 , 41 , 45 , 31 , 27 , 60 , 59 , 58 , 57 , 51 , 50 , 49 , 33 , 48 , 37 , 34 , 33 , 32 , 64 , 64 , 64 , 64 , 64 , 64 , 64 , 64 , 64 , 64 , 64 ,


64 ,

64 ,

64 ,

64 ,

64

};

c o n s t a n t l i s t y y c hk := { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 66 , 3, 4, 60 , 17 , 22 , 22 , 26 , 31 , 65 , 54 , 53 , 52 , 51 , 43 , 41 , 40 , 33 , 30 ,


19 , 64 , 64 ,

15 , 64 , 64 ,

14 , 64 , 64 ,

13 , 64 , 64 ,

5, 64 , 64 ,

64 ,

64 ,

64 ,

64 ,

64

64 , 64 , 64 ,

64 , 64 , 64 ,

64 , 64 , 64 ,

64 , 64 , 64 ,

64 , 64 , 64 ,


};
end LexTable10;

E.5 LexerCode10.mo

Listing E.5: LexerCode10.mo

package LexerCode10 // Generated generated by OMCC v0.7 Fri Apr 29 17:00:58 2011

/* Template for Lexer Code
   replace keywords:
   %LexerCode %time %Token %Lexer %ParseTable %constant %nameSpan %functions %caseAction
*/
import Types;
import Token10;
import Lexer10;
import ParseTable10;

function action
  input Integer act;
  input Lexer10.Env env;
  output Option<Types.Token> token;
  output Lexer10.Env env2;
  Integer mm_startSt, mm_currSt, mm_pos, mm_sPos, mm_ePos, mm_linenr, mm_flinenr;
  list<Integer> buffer, bkBuffer, tb;
  Types.Info info;
  array<String> tokName;
  String sToken, fileNm;
  Integer nameSpan, act2;
  Boolean debug;
algorithm
  Lexer10.ENV(startSt=mm_startSt, currSt=mm_currSt, pos=mm_pos, sPos=mm_sPos, ePos=mm_ePos, linenr=mm_linenr, buff=buffer, bkBuf=bkBuffer, isDebugging=debug, fileName=fileNm) := env;
  buffer := listReverse(buffer);
  tb := buffer;
  sToken := Lexer10.printBuffer(tb, "");
  tokName := listArray(ParseTable10.yytname);
  nameSpan := 255;
  tb := buffer;
  // (tb, mm_flinenr) := Lexer10.lineUpd(tb, mm_flinenr);
  info := Lexer10.getInfo(tb, mm_sPos, mm_linenr, fileNm);
  // info := Types.INFO(fileNm, false, mm_linenr, mm_sPos, mm_flinenr, mm_pos);
  // print("\n" + intString(act) + ":");
  act2 := act;
  (token) := matchcontinue (act)
    local
      Types.Token tok;
    case (1) // #line 35 "lexer10.l"
      then (NONE());
    case (2) // #line 36 "lexer10.l"
      equation
        act2 = Token10.T_WHILE;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (3) // #line 37 "lexer10.l"
      equation
        act2 = Token10.T_DO;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (4) // #line 38 "lexer10.l"
      equation
        act2 = Token10.T_ELSE;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (5) // #line 39 "lexer10.l"
      equation
        act2 = Token10.T_END;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (6) // #line 40 "lexer10.l"
      equation
        act2 = Token10.T_ENDIF;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (7) // #line 41 "lexer10.l"
      equation
        act2 = Token10.T_IF;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (8) // #line 42 "lexer10.l"
      equation
        act2 = Token10.T_READ;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (9) // #line 43 "lexer10.l"
      equation
        act2 = Token10.T_THEN;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (10) // #line 44 "lexer10.l"
      equation
        act2 = Token10.T_WRITE;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (11) // #line 45 "lexer10.l"
      equation
        act2 = Token10.T_IDENT;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (12) // #line 46 "lexer10.l"
      equation
        act2 = Token10.T_INTCONST;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (13) // #line 47 "lexer10.l"
      equation
        act2 = Token10.T_ASSIGN;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (14) // #line 48 "lexer10.l"
      equation
        act2 = Token10.T_ADD;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (15) // #line 49 "lexer10.l"
      equation
        act2 = Token10.T_SUB;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (16) // #line 50 "lexer10.l"
      equation
        act2 = Token10.T_MUL;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (17) // #line 51 "lexer10.l"
      equation
        act2 = Token10.T_DIV;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (18) // #line 52 "lexer10.l"
      equation
        act2 = Token10.T_LPAREN;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (19) // #line 53 "lexer10.l"
      equation
        act2 = Token10.T_RPAREN;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (20) // #line 54 "lexer10.l"
      equation
        act2 = Token10.T_LT;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (21) // #line 55 "lexer10.l"
      equation
        act2 = Token10.T_LE;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (22) // #line 56 "lexer10.l"
      equation
        act2 = Token10.T_EQ;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (23) // #line 57 "lexer10.l"
      equation
        act2 = Token10.T_NE;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (24) // #line 58 "lexer10.l"
      equation
        act2 = Token10.T_GE;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (25) // #line 59 "lexer10.l"
      equation
        act2 = Token10.T_GT;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (26) // #line 60 "lexer10.l"
      equation
        act2 = Token10.T_SEMIC;
        tok = Types.TOKEN(tokName[act2-nameSpan], act2, buffer, info);
      then (SOME(tok));
    case (27) // #line 63 "lexer10.l"
      equation
        mm_startSt = 3;
      then (NONE());
    case (28) // #line 68 "lexer10.l"
      equation
        mm_startSt = 1;
      then (NONE());
    case (29) // #line 69 "lexer10.l"
      then (NONE());
    case (30) // #line 70 "lexer10.l"
      then (NONE());
    case (31) // #line 71 "lexer10.l"
      then (NONE());
    case (32) // #line 78 "lexer10.l"
      then (NONE());
    case (_)
      equation
        // print("[enter else]");
        print("ERROR TOKEN NOT FOUND: ['" + sToken + "' TK:" + intString(act) + "," + tokName[act2] + "]");
        tok = Types.TOKEN(tokName[act2], act, buffer, info);
      then (NONE());
  end matchcontinue;
  env2 := Lexer10.ENV(mm_startSt, mm_startSt, mm_pos, mm_sPos, mm_sPos, mm_linenr, {}, bkBuffer, {mm_startSt}, debug, fileNm);
  if (debug==true) then
    print("\n[TOKEN:'" + sToken + "' (" + intString(mm_sPos) + ":" + intString(mm_linenr) + ") id:" + intString(act2) + "]");
  end if;
end action;

end LexerCode10;

Appendix F

Modelica Grammar

F.1 lexerModelica.l

Listing F.1: lexerModelica.l

%{

%}

%x c_comment
%x c_linecomment
%x c_string

whitespace        [ \t\n]+
letter            [a-zA-Z]
wild              [_]
ident             ({letter}|{wild})({letter}|{digit}|{wild})*
digit             [0-9]
digits            {digit}+
exponent          ([e]|[E])([+]|[-])?{digits}
real              {digits}[\.]({digits})?({exponent})?
real2             {digits}{exponent}
endif             "end"{whitespace}"if"
endfor            "end"{whitespace}"for"
endwhile          "end"{whitespace}"while"
endwhen           "end"{whitespace}"when"
endmatch          "end"{whitespace}"match"
endmatchcontinue  "end"{whitespace}"matchcontinue"
endident          "end"{whitespace}{ident}

/* Lex style lexical syntax of tokens in the MODELICA language */

%%

{whitespace} ;
{real} return UNSIGNED_REAL;
{real2} return UNSIGNED_REAL;
{endif} return ENDIF;
{endfor} return ENDFOR;
{endwhile} return ENDWHILE;
{endwhen} return ENDWHEN;
{endmatchcontinue} return ENDMATCHCONTINUE;
{endmatch} return ENDMATCH;
{endident} return ENDCLASS;
"algorithm" return T_ALGORITHM;
"and" return T_AND;
"annotation" return T_ANNOTATION;
"block" return BLOCK;
"class" return CLASS;
"connect" return CONNECT;
"connector" return CONNECTOR;
"constant" return CONSTANT;
"discrete" return DISCRETE;
"der" return DER;
"defineunit" return DEFINEUNIT;
"each" return EACH;
"else" return ELSE;
"elseif" return ELSEIF;
"elsewhen" return ELSEWHEN;
"end" return T_END;
"enumeration" return ENUMERATION;
"equation" return EQUATION;
"encapsulated" return ENCAPSULATED;
"expandable" return EXPANDABLE;
"extends" return EXTENDS;
"constrainedby" return CONSTRAINEDBY;
"external" return EXTERNAL;
"false" return T_FALSE;
"final" return FINAL;
"flow" return FLOW;
"for" return FOR;
"function" return FUNCTION;
"if" return IF;
"import" return IMPORT;
"in" return T_IN;
"initial" return INITIAL;
"inner" return INNER;
"input" return T_INPUT;
"loop" return LOOP;
"model" return MODEL;
"not" return T_NOT;
"outer" return T_OUTER;
"operator" return OPERATOR;
"overload" return OVERLOAD;
"or" return T_OR;
"output" return T_OUTPUT;
"package" return T_PACKAGE;
"parameter" return PARAMETER;
"partial" return PARTIAL;
"protected" return PROTECTED;
"public" return PUBLIC;
"record" return RECORD;
"redeclare" return REDECLARE;
"replaceable" return REPLACEABLE;
"results" return RESULTS;
"then" return THEN;
"true" return T_TRUE;
"type" return TYPE;
"unsigned_real" return UNSIGNED_REAL;
"when" return WHEN;
"while" return WHILE;
"within" return WITHIN;
"return" return RETURN;
"break" return BREAK;

"(" return LPAR;
")" return RPAR;
"[" return LBRACK;
"]" return RBRACK;
"{" return LBRACE;
"}" return RBRACE;
"==" return EQEQ;
"=" return EQUALS;
"," return COMMA;
":=" return ASSIGN;
"::" return COLONCOLON;
":" return COLON;
";" return SEMICOLON;

"Code" return CODE;
"$Code" return CODE;
"$TypeName" return CODE_NAME;
"$Exp" return CODE_EXP;
"$Var" return CODE_VAR;

"pure" return PURE;
"impure" return IMPURE;

".+" return PLUS_EW;
".-" return MINUS_EW;
".*" return STAR_EW;
"./" return SLASH_EW;
".^" return POWER_EW;

"*" return STAR;
"-" return MINUS;
"+" return PLUS;
"<=" return LESSEQ;
"<>" return LESSGT;
"<" return LESS;
">" return GREATER;
">=" return GREATEREQ;

"^" return POWER;
"/" return SLASH;

"as" return AS;
"case" return CASE;
"equality" return EQUALITY;
"failure" return FAILURE;
"guard" return GUARD;
"local" return LOCAL;
"match" return MATCH;
"matchcontinue" return MATCHCONTINUE;
"uniontype" return UNIONTYPE;
"__" return ALLWILD;
"_" return WILD;
"subtypeof" return SUBTYPEOF;
"\%" return MOD;

"stream" return STREAM;

"\." return DOT;

%"[\"][^\"]*[\"]" return STRING;

{ident} return IDENT;
{digits} return UNSIGNED_INTEGER;

"\"" {
  BEGIN(c_string) keepBuffer;
}

<c_string>
{
  "\\\"" { keepBuffer; }
  "\"" { BEGIN(INITIAL) return STRING; }
  [^\n] { keepBuffer; } ;
  \n { keepBuffer; } ;
  <<EOF>> {
    yyerror("Unterminated string");
    yyterminate();
  }
}

"/\*" {
  BEGIN(c_comment);
}

<c_comment>
{
  "\*/" { BEGIN(INITIAL); }
  "/\*" { yyerror("Suspicious comment"); }
  [^\n] ;
  \n ;
  <<EOF>> {
    yyerror("Unterminated comment");
    yyterminate();
  }
}

"//" {
  BEGIN(c_linecomment) keepBuffer;
}

<c_linecomment>
{
  \n { BEGIN(INITIAL); }
  [^\n] ;
}

%%

F.2 parserModelica.y

Listing F.2: parserModelica.y

%{
import Absyn;

/* Type Declarations */
type AstTree = Absyn.Program;
type Token = OMCCTypes.Token;
type Program = Absyn.Program;
type Within = Absyn.Within;
type lstClass = list<Class>;
type Class = Absyn.Class;
type Ident = Absyn.Ident;
type Path = Absyn.Path;
type ClassDef = Absyn.ClassDef;
type ClassPart = Absyn.ClassPart;
type ClassParts = list<ClassPart>;
type Import = Absyn.Import;
type ElementItem = Absyn.ElementItem;
type ElementItems = list<ElementItem>;
type Element = Absyn.Element;
type ElementSpec = Absyn.ElementSpec;
type ElementAttributes = Absyn.ElementAttributes;
type Comment = Absyn.Comment;
type Direction = Absyn.Direction;
type Exp = Absyn.Exp;
type Exps = list<Exp>;
type Matrix = list<list<Exp>>;
type Subscript = Absyn.Subscript;
type ArrayDim = list<Subscript>;
type Operator = Absyn.Operator;
type Case = Absyn.Case;
type Cases = list<Case>;
type MatchType = Absyn.MatchType;
type Restriction = Absyn.Restriction;
type InnerOuter = Absyn.InnerOuter;
type ComponentRef = Absyn.ComponentRef;
type Variability = Absyn.Variability;
type RedeclareKeywords = Absyn.RedeclareKeywords;
type NamedArg=Absyn.NamedArg;
type TypeSpec=Absyn.TypeSpec;
type TypeSpecs=list<TypeSpec>;
type ComponentItem=Absyn.ComponentItem;
type ComponentItems=list<ComponentItem>;
type Component=Absyn.Component;
type EquationItem = Absyn.EquationItem;
type EquationItems = list<EquationItem>;
type Equation = Absyn.Equation;
type Elseif = tuple<Exp, list<EquationItem>>;
type Elseifs = list<Elseif>;
type ForIterator= Absyn.ForIterator;
type ForIterators = list<ForIterator>;
type Elsewhen = tuple<Exp, list<EquationItem>>;
type Elsewhens = list<Elsewhen>;
type FunctionArgs = Absyn.FunctionArgs;
type NamedArgs = list<NamedArg>;
type AlgorithmItem = Absyn.AlgorithmItem;
type AlgorithmItems = list<AlgorithmItem>;
type Algorithm = Absyn.Algorithm;
type AlgElseif = tuple<Exp, list<AlgorithmItem>>;
type AlgElseifs = list<AlgElseif>;
type AlgElsewhen = tuple<Exp, list<AlgorithmItem>>;
type AlgElsewhens = list<AlgElsewhen>;
type ExpElseif = tuple<Exp, Exp>;
type ExpElseifs = list<ExpElseif>;
type EnumDef = Absyn.EnumDef;
type EnumLiteral = Absyn.EnumLiteral;
type EnumLiterals = list<EnumLiteral>;
type Modification = Absyn.Modification;
type Boolean3 = tuple<Boolean, Boolean, Boolean>;
type Boolean2 = tuple<Boolean, Boolean>;
type ElementArg = Absyn.ElementArg;
type ElementArgs = list<ElementArg>;
type Each = Absyn.Each;
type EqMod=Absyn.EqMod;
type ComponentCondition = Absyn.ComponentCondition;
type ExternalDecl = Absyn.ExternalDecl;
type Annotation = Absyn.Annotation;

constant list<String> lstSemValue3 = {};

constant list<String> lstSemValue = {
  "error", "$undefined", "ALGORITHM", "AND", "ANNOTATION",
  "BLOCK", "CLASS", "CONNECT", "CONNECTOR", "CONSTANT", "DISCRETE", "DER",
  "DEFINEUNIT", "EACH", "ELSE", "ELSEIF", "ELSEWHEN", "END",
  "ENUMERATION", "EQUATION", "ENCAPSULATED", "EXPANDABLE", "EXTENDS",
  "CONSTRAINEDBY", "EXTERNAL", "FALSE", "FINAL", "FLOW", "FOR",
  "FUNCTION", "IF", "IMPORT", "IN", "INITIAL", "INNER", "INPUT",
  "LOOP", "MODEL", "NOT", "OUTER", "OPERATOR", "OVERLOAD", "OR",
  "OUTPUT", "PACKAGE", "PARAMETER", "PARTIAL", "PROTECTED", "PUBLIC",
  "RECORD", "REDECLARE", "REPLACEABLE", "RESULTS", "THEN", "TRUE",
  "TYPE", "REAL", "WHEN", "WHILE", "WITHIN", "RETURN", "BREAK",
  ".", "(", ")", "[", "]", "{", "}", "=", "ASSIGN", "COMMA", "COLON",
  "SEMICOLON", "CODE", "CODE_NAME", "CODE_EXP", "CODE_VAR", "PURE",
  "IMPURE", "Identity", "DIGIT", "INTEGER",
  "*", "-", "+", "<=", "<>", "<", ">", ">=", "==", "^", "SLASH", "STRING",
  ".+", ".-", ".*", "./", ".*", "STREAM", "AS", "CASE", "EQUALITY",
  "FAILURE", "GUARD", "LOCAL", "MATCH", "MATCHCONTINUE", "UNIONTYPE",
  "ALLWILD", "WILD", "SUBTYPEOF", "COLONCOLON", "MOD", "ENDIF", "ENDFOR",
  "ENDWHILE", "ENDWHEN", "ENDCLASS", "ENDMATCHCONTINUE", "ENDMATCH",
  "$accept", "program", "within", "classes_list", "class", "classprefix",
  "encapsulated", "partial", "restriction", "classdef",
  "classdefenumeration", "classdefderived", "enumeration", "enumlist",
  "enumliteral", "classparts", "classpart", "restClass",
  "algorithmsection", "algorithmitem", "algorithm", "if_algorithm",
  "algelseifs", "algelseif", "when_algorithm", "algelsewhens",
  "algelsewhen", "equationsection", "equationitem", "equation",
  "when_equation", "elsewhens", "elsewhen", "foriterators", "foriterator",
  "if_equation", "elseifs", "elseif", "elementItems", "elementItem",
  "element", "componentclause", "componentitems", "componentitem",
  "component", "modification", "redeclarekeywords", "innerouter",
  "importelementspec", "classelementspec", "import", "elementspec",
  "elementAttr", "variability", "direction", "typespec", "arrayComplex",
  "typespecs", "arraySubscripts", "arrayDim", "functioncall",
  "functionargs", "namedargs", "namedarg", "exp", "matchcont", "if_exp",
  "expelseifs", "expelseif", "matchlocal", "cases", "case", "casearg",
  "simpleExp", "headtail", "rangeExp", "logicexp", "logicterm",
  "logfactor", "relterm", "addterm", "term", "factor", "expElement",
  "tuple", "explist", "explist2", "cref", "woperator", "soperator",
  "power", "relOperator", "path", "ident", "string", "comment" };

%}

%token T_ALGORITHM
%token T_AND
%token T_ANNOTATION
%token BLOCK
%token CLASS
%token CONNECT
%token CONNECTOR
%token CONSTANT
%token DISCRETE
%token DER
%token DEFINEUNIT
%token EACH
%token ELSE
%token ELSEIF
%token ELSEWHEN
%token T_END
%token ENUMERATION
%token EQUATION
%token ENCAPSULATED
%token EXPANDABLE
%token EXTENDS
%token CONSTRAINEDBY
%token EXTERNAL
%token T_FALSE
%token FINAL
%token FLOW
%token FOR
%token FUNCTION
%token IF
%token IMPORT
%token T_IN
%token INITIAL
%token INNER
%token T_INPUT
%token LOOP
%token MODEL
%token T_NOT
%token T_OUTER
%token OPERATOR
%token OVERLOAD
%token T_OR
%token T_OUTPUT
%token T_PACKAGE
%token PARAMETER
%token PARTIAL
%token PROTECTED
%token PUBLIC
%token RECORD
%token REDECLARE
%token REPLACEABLE
%token RESULTS
%token THEN
%token T_TRUE
%token TYPE
%token UNSIGNED_REAL
%token WHEN
%token WHILE
%token WITHIN
%token RETURN
%token BREAK
%token DOT
%token LPAR
%token RPAR
%token LBRACK
%token RBRACK
%token LBRACE
%token RBRACE
%token EQUALS
%token ASSIGN
%token COMMA
%token COLON
%token SEMICOLON
%token CODE
%token CODE_NAME
%token CODE_EXP
%token CODE_VAR
%token PURE
%token IMPURE
%token IDENT
%token DIGIT
%token UNSIGNED_INTEGER

%token STAR
%token MINUS
%token PLUS
%token LESSEQ
%token LESSGT
%token LESS
%token GREATER
%token GREATEREQ
%token EQEQ
%token POWER
%token SLASH

%token STRING

%token PLUS_EW
%token MINUS_EW
%token STAR_EW
%token SLASH_EW
%token POWER_EW

%token STREAM

%token AS
%token CASE
%token EQUALITY
%token FAILURE
%token GUARD
%token LOCAL
%token MATCH
%token MATCHCONTINUE
%token UNIONTYPE
%token ALLWILD
%token WILD
%token SUBTYPEOF
%token COLONCOLON
%token MOD
%token ENDIF
%token ENDFOR
%token ENDWHILE
%token ENDWHEN
%token ENDCLASS
%token ENDMATCHCONTINUE
%token ENDMATCH

//%expect 42

F.2. PARSERMODELICA.Y

185

%% 253 /∗ Yacc BNF grammar o f t h e M o d e l i c a+MetaModelica l a n g u a g e ∗/ 255 program 257

259

:

classes list { ( a b s y n t r e e ) [ Program ] = Absyn . PROGRAM( $1 [ l s t C l a s s ] , Absyn . TOP( ) , Absyn .TIMESTAMP( System . getCurrentTime ( ) , System . getCurrentTime ( ) ) ) ; } | within c l a s s e s l i s t { ( a b s y n t r e e ) [ Program ] = Absyn . PROGRAM( $2 [ l s t C l a s s ] , $1 [ Within ] , Absyn .TIMESTAMP( System . getCurrentTime ( ) , System . getCurrentTime ( ) ) ) ; }

261 within : WITHIN path SEMICOLON { $$ [ Within ] = Absyn . WITHIN( $2 [ Path ] ) ; } 263 classes list Class ] : : { } ; }

: c l a s s SEMICOLON { $$ [ l s t C l a s s ] = $1 [ | c l a s s SEMICOLON c l a s s e s l i s t { $$ [ l s t C l a s s ] = $1 [ C l a s s ] : : $2 [ l s t C l a s s ] ; } /∗ r e s t r i c t i o n IDENT c l a s s d e f T END IDENT SEMICOLON { i f ( not s t r i n g E q u a l ( $2 , $5 ) ) then p r i n t ( Types . printInfoError ( info ) + ” E r r o r : The i d e n t i f i e r a t s t a r t and end a r e d i f f e r e n t ’ ” + $2 + ” ’ ” ) ; t r u e = ( $2 == $5 ) ; end i f ; $$ [ C l a s s ] = Absyn . CLASS( $2 , f a l s e , f a l s e , f a l s e , $1 [ R e s t r i c t i o n ] , $3 [ ClassDef ] , i n f o ) ; } ∗/

265

267

269

271 class 273

275

: r e s t r i c t i o n IDENT c l a s s d e f { $$ [ C l a s s ] = Absyn . CLASS( $2 , f a l s e , f a l s e , f a l s e , $1 [ R e s t r i c t i o n ] , $3 [ C l a s s D e f ] , info ) ; } | c l a s s p r e f i x r e s t r i c t i o n IDENT c l a s s d e f { ( v1Boolean , v2Boolean , v3Boolean ) = $1 [ Boolean3 ] ; $$ [ C l a s s ] = Absyn . CLASS( $3 , v3Boolean , v1Boolean , v2Boolean , $2 [ R e s t r i c t i o n ] , $4 [ C l a s s D e f ] , i n f o ) ; }

277 classdef 279

: s t r i n g ENDCLASS { $$ [ C l a s s D e f ] = Absyn .PARTS( { } , { } , SOME( $1 ) ) ; }

186

APPENDIX F. MODELICA GRAMMAR

| ENDCLASS { $$ [ C l a s s D e f ] = Absyn .PARTS( { } , { } , NONE( ) ) ; } | c l a s s p a r t s ENDCLASS { $$ [ C l a s s D e f ] = Absyn .PARTS( { } , $1 [ C l a s s P a r t s ] ,NONE( ) ) ; } | s t r i n g c l a s s p a r t s ENDCLASS { $$ [ C l a s s D e f ] = Absyn .PARTS( { } , $2 [ C l a s s P a r t s ] ,SOME( $1 ) ) ; } | classdefenumeration { $$ [ C l a s s D e f ] = $1 [ C l a s s D e f ] ; } ; | classdefderived { $$ [ C l a s s D e f ] = $1 [ C l a s s D e f ] ; } ;

281

283

285

287 289 291 classprefix

: FINAL e n c a p s u l a t e d p a r t i a l { $$ [ Boolean3 ] = ( true , $2 [ Boolean ] , $3 [ Boolean ] ) ; } | ENCAPSULATED p a r t i a l { $$ [ Boolean3 ] = ( f a l s e , true , $2 [ Boolean ]) ; } | PARTIAL { $$ [ Boolean3 ] = ( f a l s e , f a l s e , true ) ; }

encapsulated

: ENCAPSULATED { $$ [ Boolean ] = true ; } | /∗ empty ∗/ { $$ [ Boolean ] = f a l s e ; }

partial

: PARTIAL { $$ [ Boolean ] = true ; } | /∗ empty ∗/ { $$ [ Boolean ] = f a l s e ; }

293

295

297 299 301 303 305

final

: FINAL { $$ [ Boolean ] = true ; } | /∗ empty ∗/ { $$ [ Boolean ] = f a l s e ; }

restriction R CLASS ( ) ; }

: CLASS { $$ [ R e s t r i c t i o n ] = Absyn .

307

309

311

313

| MODEL { $$ [ Restriction ] = Absyn . R MODEL( ) ; } | RECORD { $$ [ Restriction ] = Absyn . R RECORD( ) ; } | T PACKAGE { $$ [ Restriction ] = Absyn . R PACKAGE( ) ; } | TYPE { $$ [ Restriction ] = Absyn . R TYPE ( ) ; } | FUNCTION { $$ [ Restriction ] = Absyn . R FUNCTION( )

F.2. PARSERMODELICA.Y

187

; } | UNIONTYPE { $$ [ Restriction ] = Absyn . R UNIONTYPE () ; } | BLOCK { $$ [ Restriction ] = Absyn . R BLOCK( ) ; } | CONNECTOR { $$ [ Restriction ] = Absyn . R CONNECTOR () ; } | EXPANDABLE CONNECTOR { $$ [ Restriction ] = Absyn . R EXP CONNECTOR () ; } | ENUMERATION { $$ [ Restriction ] = Absyn . R ENUMERATION () ; } | OPERATOR FUNCTION { $$ [ Restriction ] = Absyn . R OPERATOR FUNCTION () ; } | OPERATOR RECORD { $$ [ Restriction ] = Absyn . R OPERATOR RECORD () ; } | OPERATOR { $$ [ R e s t r i c t i o n ] = Absyn .R OPERATOR( ) ; }

315

317

319

321

323 325

classdefenumeration comment

:

EQUALS ENUMERATION LPAR e n u m e r a t i o n RPAR { $$ [ C l a s s D e f ] = Absyn .ENUMERATION( $4 [ EnumDef ] ,SOME( $6 [ Comment ] ) ) ; }

327 classdefderived 329

: EQUALS t y p e s p e c e l e m e n t a r g s 2 comment { $$ [ C l a s s D e f ] = Absyn . DERIVED( $2 [ TypeSpec ] , Absyn .ATTR( f a l s e , f a l s e , Absyn .VAR( ) , Absyn . BIDIR ( ) , { } ) , $3 [ ElementArgs ] ,SOME( $4 [ Comment ] ) ) ; }

188

| EQUALS e l e m e n t A t t r t y p e s p e c e l e m e n t a r g s 2 comment { $$ [ C l a s s D e f ] = Absyn . DERIVED( $3 [ TypeSpec ] , $2 [ E l e m e n t A t t r i b u t e s ] , $4 [ ElementArgs ] ,SOME( $5 [ Comment ] ) ) ; }

331

333

APPENDIX F. MODELICA GRAMMAR

enumeration : e n u m l i s t { $$ [ EnumDef ] = Absyn . ENUMLITERALS( $1 [ E n u m L i t e r a l s ] ) ; } | COLON { $$ [ EnumDef ] = Absyn .ENUM COLON( ) ; }

335

337

enumlist : e n u m l i t e r a l { $$ [ E n u m L i t e r a l s ] = $1 [ EnumLiteral ] : : { } ; } | e n u m l i t e r a l COMMA e n u m l i s t { $$ [ E n u m L i t e r a l s ] = $1 [ EnumLiteral ] : : $3 [ EnumLiterals ] ; }

339

enumliteral : i d e n t comment { $$ [ EnumLiteral ] = Absyn . ENUMLITERAL( $1 [ I d e n t ] ,SOME( $2 [ Comment ] ) ) ; }

341

classparts ]::{}; }

: c l a s s p a r t { $$ [ C l a s s P a r t s ] = $1 [ C l a s s P a r t | c l a s s p a r t c l a s s p a r t s { $$ [ C l a s s P a r t s ] = $1 [ C l a s s P a r t ] : : $2 [ C l a s s P a r t s ] ; }

343

345

classpart : e l e m e n t I t e m s { $$ [ C l a s s P a r t ] = Absyn . PUBLIC( $1 [ ElementItems ] ) ; } | r e s t C l a s s { $$ [ C l a s s P a r t ] = $1 [ C l a s s P a r t ]; }

347 349

351

353

355

357

restClass : PUBLIC e l e m e n t I t e m s { $$ [ C l a s s P a r t ] = Absyn . PUBLIC( $1 [ ElementItems ] ) ; } | PUBLIC { $$ [ C l a s s P a r t ] = Absyn . PUBLIC ( { } ) ; } // adds s h i f t / r e d u c e conflicts | PROTECTED e l e m e n t I t e m s { $$ [ C l a s s P a r t ] = Absyn .PROTECTED( $1 [ ElementItems ] ) ; } | PROTECTED { $$ [ C l a s s P a r t ] = Absyn . PROTECTED( { } ) ; } // adds s h i f t / reduce c o n f l i c t s | EQUATION { $$ [ C l a s s P a r t ] = Absyn . EQUATIONS( { } ) ; } | EQUATION e q u a t i o n s e c t i o n { $$ [ C l a s s P a r t ] = Absyn .EQUATIONS( $1 [ EquationItems ] ) ; } | INITIAL EQUATION e q u a t i o n s e c t i o n { $$ [ C l a s s P a r t ] = Absyn . INITIALEQUATIONS( $1 [ E q u a t i o n I t e m s ] ) ; } | T ALGORITHM { $$ [ C l a s s P a r t ] = Absyn . ALGORITHMS( { } ) ; } | T ALGORITHM a l g o r i t h m s e c t i o n { $$ [ C l a s s P a r t ] = Absyn .ALGORITHMS( $1 [ AlgorithmItems ] ) ; }

F.2. PARSERMODELICA.Y

| INITIAL T ALGORITHM a l g o r i t h m s e c t i o n { $$ [ C l a s s P a r t ] = Absyn . INITIALALGORITHMS( $1 [ A l g o r i t h m I t e m s ]) ; } | e x t e r n a l { $$ [ C l a s s P a r t ] = $1 [ ClassPart ] ; }

359

361

189

external : EXTERNAL e x t e r n a l D e c l SEMICOLON { $$ [ C l a s s P a r t ] = Absyn .EXTERNAL( $2 [ E x t e r n a l D e c l ] ,NONE( ) ) ; } | EXTERNAL e x t e r n a l D e c l SEMICOLON a n n o t a t i o n SEMICOLON { $$ [ C l a s s P a r t ] = Absyn .EXTERNAL( $2 [ E x t e r n a l D e c l ] , SOME( $3 [ An not ati on ] ) ) ; }

363

365

367

externalDecl : s t r i n g { $$ [ E x t e r n a l D e c l ] = Absyn . EXTERNALDECL(NONE( ) ,SOME( $1 ) ,NONE( ) , { } ,NONE( ) ) ; } | s t r i n g c r e f EQUALS i d e n t LPAR e x p l i s t 2 RPAR { $$ [ E x t e r n a l D e c l ] = Absyn . EXTERNALDECL(SOME( $4 [ I d e n t ] ) ,SOME( $1 ) ,SOME( $2 [ ComponentRef ] ) , $6 [ Exps ] ,NONE () ) ; } | s t r i n g c r e f EQUALS i d e n t LPAR e x p l i s t 2 RPAR a n n o t a t i o n { $$ [ E x t e r n a l D e c l ] = Absyn .EXTERNALDECL(SOME( $4 [ I d e n t ] ) , SOME( $1 ) ,SOME( $2 [ ComponentRef ] ) , $6 [ Exps ] ,SOME( $8 [ An no tat ion ] ) ) ; } | s t r i n g i d e n t LPAR e x p l i s t 2 RPAR a n n o t a t i o n { $$ [ E x t e r n a l D e c l ] = Absyn .EXTERNALDECL(SOME( $2 [ I d e n t ] ) ,SOME( $1 ) ,NONE( ) , $4 [ Exps ] ,SOME( $6 [ A nn ota tio n ]) ) ; } | s t r i n g i d e n t LPAR e x p l i s t 2 RPAR { $$ [ E x t e r n a l D e c l ] = Absyn .EXTERNALDECL( SOME( $2 [ I d e n t ] ) ,SOME( $1 ) ,NONE( ) , $4 [ Exps ] ,NONE( ) ) ; }

/* ALGORITHMS */

algorithmsection : algorithmitem SEMICOLON
      { $$[AlgorithmItems] = $1[AlgorithmItem]::{}; }
  | algorithmitem SEMICOLON algorithmsection
      { $$[AlgorithmItems] = $1[AlgorithmItem]::$2[AlgorithmItems]; }

algorithmitem : algorithm comment
      { $$[AlgorithmItem] = Absyn.ALGORITHMITEM($1[Algorithm],SOME($2[Comment]),info); }

algorithm : simpleExp ASSIGN exp // TOREV: cref or any exp? { $$[Algorithm] = Absyn.ALG_ASSIGN(Absyn.CREF($1[ComponentRef]),$2[Exp]); }
      { $$[Algorithm] = Absyn.ALG_ASSIGN($1[Exp],$2[Exp]); }
  | cref functioncall
      { $$[Algorithm] = Absyn.ALG_NORETCALL($1[ComponentRef],$2[FunctionArgs]); }
  | tuple ASSIGN exp
      { $$[Algorithm] = Absyn.ALG_ASSIGN($1[Exp],$3[Exp]); }
  | RETURN
      { $$[Algorithm] = Absyn.ALG_RETURN(); }
  | BREAK
      { $$[Algorithm] = Absyn.ALG_BREAK(); }
  | if_algorithm
      { $$[Algorithm] = $1[Algorithm]; }
  | when_algorithm
      { $$[Algorithm] = $1[Algorithm]; }
  | FOR foriterators LOOP algorithmsection ENDFOR
      { $$[Algorithm] = Absyn.ALG_FOR($3[ForIterators],$5[AlgorithmItems]); }
  | WHILE exp LOOP algorithmsection ENDWHILE
      { $$[Algorithm] = Absyn.ALG_WHILE($3[Exp],$5[AlgorithmItems]); }

if_algorithm : IF exp THEN ENDIF
      { $$[Algorithm] = Absyn.ALG_IF($2[Exp],{},{},{}); } // warning empty if
  | IF exp THEN algorithmsection ENDIF
      { $$[Algorithm] = Absyn.ALG_IF($2[Exp],$4[AlgorithmItems],{},{}); }
  | IF exp THEN algorithmsection ELSE algorithmsection ENDIF
      { $$[Algorithm] = Absyn.ALG_IF($2[Exp],$4[AlgorithmItems],{},$6[AlgorithmItems]); }
  | IF exp THEN algorithmsection algelseifs ENDIF
      { $$[Algorithm] = Absyn.ALG_IF($2[Exp],$4[AlgorithmItems],$5[AlgElseifs],{}); }
  | IF exp THEN algorithmsection algelseifs ELSE algorithmsection ENDIF
      { $$[Algorithm] = Absyn.ALG_IF($2[Exp],$4[AlgorithmItems],$5[AlgElseifs],$7[AlgorithmItems]); }

algelseifs : algelseif
      { $$[AlgElseifs] = $1[AlgElseif]::{}; }
  | algelseif algelseifs
      { $$[AlgElseifs] = $1[AlgElseif]::$2[AlgElseifs]; }

algelseif : ELSEIF exp THEN algorithmsection
      { $$[AlgElseif] = ($2[Exp],$4[AlgorithmItems]); }

when_algorithm : WHEN exp THEN algorithmsection ENDWHEN
      { $$[Algorithm] = Absyn.ALG_WHEN_A($2[Exp],$4[AlgorithmItems],{}); }
  | WHEN exp THEN algorithmsection algelsewhens ENDWHEN
      { $$[Algorithm] = Absyn.ALG_WHEN_A($2[Exp],$4[AlgorithmItems],$5[AlgElsewhens]); }

algelsewhens : algelsewhen
      { $$[AlgElsewhens] = $1[AlgElsewhen]::{}; }
  | algelsewhen algelsewhens
      { $$[AlgElsewhens] = $1[AlgElsewhen]::$2[AlgElsewhens]; }

algelsewhen : ELSEWHEN exp THEN algorithmsection
      { $$[AlgElsewhen] = ($2[Exp],$4[AlgorithmItems]); }

/* EQUATIONS */

equationsection : equationitem SEMICOLON
      { $$[EquationItems] = $1[EquationItem]::{}; }
  | equationitem SEMICOLON equationsection
      { $$[EquationItems] = $1[EquationItem]::$2[EquationItems]; }

equationitem : equation comment
      { $$[EquationItem] = Absyn.EQUATIONITEM($1[Equation],SOME($2[Comment]),info); }

equation : exp EQUALS exp
      { $$[Equation] = Absyn.EQ_EQUALS($1[Exp],$3[Exp]); }
  | if_equation
      { $$[Equation] = $1[Equation]; }
  | when_equation
      { $$[Equation] = $1[Equation]; }
  | CONNECT LPAR cref COMMA cref RPAR
      { $$[Equation] = Absyn.EQ_CONNECT($3[ComponentRef],$5[ComponentRef]); }
  | FOR foriterators LOOP equationsection ENDFOR
      { $$[Equation] = Absyn.EQ_FOR($3[ForIterators],$5[EquationItems]); }
  | cref functioncall
      { $$[Equation] = Absyn.EQ_NORETCALL($1[ComponentRef],$2[FunctionArgs]); }

when_equation : WHEN exp THEN equationsection ENDWHEN
      { $$[Equation] = Absyn.EQ_WHEN_E($2[Exp],$4[EquationItems],{}); }
  | WHEN exp THEN equationsection elsewhens ENDWHEN
      { $$[Equation] = Absyn.EQ_WHEN_E($2[Exp],$4[EquationItems],$5[Elsewhens]); }

elsewhens : elsewhen
      { $$[Elsewhens] = $1[Elsewhen]::{}; }
  | elsewhen elsewhens
      { $$[Elsewhens] = $1[Elsewhen]::$2[Elsewhens]; }

elsewhen : ELSEWHEN exp THEN equationsection
      { $$[Elsewhen] = ($2[Exp],$4[EquationItems]); }

foriterators : foriterator
      { $$[ForIterators] = $1[ForIterator]::{}; }
  | foriterator COMMA foriterators
      { $$[ForIterators] = $1[ForIterator]::$2[ForIterators]; }

foriterator : IDENT
      { $$[ForIterator] = Absyn.ITERATOR($1,NONE(),NONE()); }
  | IDENT T_IN exp
      { $$[ForIterator] = Absyn.ITERATOR($1,NONE(),SOME($3[Exp])); }

if_equation : IF exp THEN equationsection ENDIF
      { $$[Equation] = Absyn.EQ_IF($2[Exp],$4[EquationItems],{},{}); }
  | IF exp THEN equationsection ELSE equationsection ENDIF
      { $$[Equation] = Absyn.EQ_IF($2[Exp],$4[EquationItems],{},$6[EquationItems]); }
  | IF exp THEN equationsection elseifs ENDIF
      { $$[Equation] = Absyn.EQ_IF($2[Exp],$4[EquationItems],$5[Elseifs],{}); }
  | IF exp THEN equationsection elseifs ELSE equationsection ENDIF
      { $$[Equation] = Absyn.EQ_IF($2[Exp],$4[EquationItems],$5[Elseifs],$7[EquationItems]); }

elseifs : elseif
      { $$[Elseifs] = $1[Elseif]::{}; }
  | elseif elseifs
      { $$[Elseifs] = $1[Elseif]::$2[Elseifs]; }

elseif : ELSEIF exp THEN equationsection
      { $$[Elseif] = ($2[Exp],$4[EquationItems]); }

/* Expressions and Elements */

elementItems : elementItem
      { $$[ElementItems] = $1[ElementItem]::{}; }
  | elementItem elementItems
      { $$[ElementItems] = $1[ElementItem]::$2[ElementItems]; }

elementItem : element SEMICOLON
      { $$[ElementItem] = Absyn.ELEMENTITEM($1[Element]); }
  | annotation SEMICOLON
      { $$[ElementItem] = Absyn.ANNOTATIONITEM($1[Annotation]); }

element : componentclause
      { $$[Element] = $1[Element]; }
  | classElement2
      { $$[Element] = $1[Element]; }
  | importelementspec
      { $$[Element] = Absyn.ELEMENT(false,NONE(),Absyn.NOT_INNER_OUTER(),"IMPORT",$1[ElementSpec],info,NONE()); }
  | extends
      { $$[Element] = Absyn.ELEMENT(false,NONE(),Absyn.NOT_INNER_OUTER(),"EXTENDS",$1[ElementSpec],info,NONE()); }
  | FINAL finalElement
      { $$[Element] = $2[Element]; }

classElement2 : classelementspec
      { $$[Element] = Absyn.ELEMENT(false,NONE(),Absyn.NOT_INNER_OUTER(),"CLASS",$1[ElementSpec],info,NONE()); }
  | REDECLARE classelementspec
      { $$[Element] = Absyn.ELEMENT(false,SOME(Absyn.REDECLARE()),Absyn.NOT_INNER_OUTER(),"CLASS",$1[ElementSpec],info,NONE()); }

finalElement : innerouter ident elementspec
      { $$[Element] = Absyn.ELEMENT(true,NONE(),$1[InnerOuter],$2[Ident],$3[ElementSpec],info,NONE()); }
  | ident elementspec
      { $$[Element] = Absyn.ELEMENT(true,NONE(),Absyn.NOT_INNER_OUTER(),$1[Ident],$2[ElementSpec],info,NONE()); }
  // | elementspec // causes reduce/reduce conflicts
  // { $$[Element] = Absyn.ELEMENT(true,NONE(),Absyn.NOT_INNER_OUTER(),"ELEMENTSPEC",$1[ElementSpec],info,NONE()); }

componentclause : elementspec
      { $$[Element] = Absyn.ELEMENT(false,NONE(),Absyn.NOT_INNER_OUTER(),"ELEMENTSPEC",$1[ElementSpec],info,NONE()); }
  | innerouter elementspec
      { $$[Element] = Absyn.ELEMENT(false,NONE(),$1[InnerOuter],"ELEMENTSPEC",$2[ElementSpec],info,NONE()); }
  | redeclarekeywords final innerouter ident elementspec
      { $$[Element] = Absyn.ELEMENT($2[Boolean],SOME($1[RedeclareKeywords]),$3[InnerOuter],$4[Ident],$5[ElementSpec],info,NONE()); }
  | redeclarekeywords final ident elementspec
      { $$[Element] = Absyn.ELEMENT($2[Boolean],SOME($1[RedeclareKeywords]),Absyn.NOT_INNER_OUTER(),$3[Ident],$4[ElementSpec],info,NONE()); }

componentitems : componentitem
      { $$[ComponentItems] = $1[ComponentItem]::{}; }
  | componentitem COMMA componentitems
      { $$[ComponentItems] = $1[ComponentItem]::$2[ComponentItems]; }

componentitem : component comment
      { $$[ComponentItem] = Absyn.COMPONENTITEM($1[Component],NONE(),SOME($2[Comment])); }
  | component componentcondition comment
      { $$[ComponentItem] = Absyn.COMPONENTITEM($1[Component],SOME($2[ComponentCondition]),SOME($3[Comment])); }

componentcondition : IF exp
      { $$[ComponentCondition] = $1[Exp]; }

component : ident arraySubscripts modification
      { $$[Component] = Absyn.COMPONENT($1[Ident],$2[ArrayDim],SOME($3[Modification])); }
  | ident arraySubscripts
      { $$[Component] = Absyn.COMPONENT($1[Ident],$2[ArrayDim],NONE()); }

modification : EQUALS exp
      { $$[Modification] = Absyn.CLASSMOD({},Absyn.EQMOD($2[Exp],info)); }
  | ASSIGN exp
      { $$[Modification] = Absyn.CLASSMOD({},Absyn.EQMOD($2[Exp],info)); }
  | classmodification
      { $$[Modification] = $1[Modification]; }

classmodification : elementargs
      { $$[Modification] = Absyn.CLASSMOD($1[ElementArgs],Absyn.NOMOD()); }
  | elementargs EQUALS exp
      { $$[Modification] = Absyn.CLASSMOD($1[ElementArgs],Absyn.EQMOD($3[Exp],info)); }

annotation : T_ANNOTATION elementargs
      { $$[Annotation]= Absyn.ANNOTATION($1[ElementArgs]); }

elementargs : LPAR argumentlist RPAR
      { $$[ElementArgs] = $1[ElementArgs]; }

elementargs2 : LPAR argumentlist RPAR
      { $$[ElementArgs] = $1[ElementArgs]; }
  | /* empty */
      { $$[ElementArgs] = {}; }

argumentlist : elementarg
      { $$[ElementArgs] = {$1[ElementArg]}; }
  | elementarg COMMA argumentlist
      { $$[ElementArgs] = $1[ElementArg]::$2[ElementArgs]; }

elementarg : eachprefix final cref
      { $$[ElementArg] = Absyn.MODIFICATION($2[Boolean],$1[Each],$3[ComponentRef],NONE(),NONE()); }
  | eachprefix final cref modification
      { $$[ElementArg] = Absyn.MODIFICATION($2[Boolean],$1[Each],$3[ComponentRef],SOME($4[Modification]),NONE()); }

eachprefix : EACH
      { $$[Each]= Absyn.EACH(); }
  | /* empty */
      { $$[Each]= Absyn.NON_EACH(); }

redeclarekeywords : REDECLARE
      { $$[RedeclareKeywords] = Absyn.REDECLARE(); }
  | REPLACEABLE
      { $$[RedeclareKeywords] = Absyn.REPLACEABLE(); }
  | REDECLARE REPLACEABLE
      { $$[RedeclareKeywords] = Absyn.REDECLARE_REPLACEABLE(); }

innerouter : INNER
      { $$[InnerOuter] = Absyn.INNER(); }
  | T_OUTER
      { $$[InnerOuter] = Absyn.OUTER(); }
  | INNER T_OUTER
      { $$[InnerOuter] = Absyn.INNER_OUTER(); }
  // | /* empty */ { $$[InnerOuter] = Absyn.NOT_INNER_OUTER(); }

importelementspec : import comment
      { $$[ElementSpec] = Absyn.IMPORT($1[Import],SOME($2[Comment]),info); }

classelementspec : class
      { $$[ElementSpec] = Absyn.CLASSDEF(false,$1[Class]); }
  | REPLACEABLE class
      { $$[ElementSpec] = Absyn.CLASSDEF(true,$2[Class]); }

import : IMPORT path
      { $$[Import] = Absyn.QUAL_IMPORT($2[Path]); }
  | IMPORT path STAR_EW
      { $$[Import] = Absyn.UNQUAL_IMPORT($2[Path]); }
  | IMPORT ident EQUALS path
      { $$[Import] = Absyn.NAMED_IMPORT($2[Ident],$4[Path]); }

extends : EXTENDS path
      { $$[ElementSpec] = Absyn.EXTENDS($2[Path],{},NONE()); }
  | EXTENDS path annotation
      { $$[ElementSpec] = Absyn.EXTENDS($2[Path],{},SOME($3[Annotation])); }
  | EXTENDS path LPAR argumentlist RPAR
      { $$[ElementSpec] = Absyn.EXTENDS($2[Path],$4[ElementArgs],NONE()); }
  | EXTENDS path LPAR argumentlist RPAR annotation
      { $$[ElementSpec] = Absyn.EXTENDS($2[Path],$4[ElementArgs],SOME($3[Annotation])); }

elementspec : elementAttr typespec componentitems // arraydim from typespec should be in elementAttr arraydim
      { ($1[ElementAttributes],$2[TypeSpec]) = fixArray($1[ElementAttributes],$2[TypeSpec]);
        $$[ElementSpec] = Absyn.COMPONENTS($1[ElementAttributes],$2[TypeSpec],$3[ComponentItems]); }
  | typespec componentitems // arraydim from typespec should be in elementAttr arraydim
      { (v1ElementAttributes,$1[TypeSpec]) = fixArray(Absyn.ATTR(false,false,Absyn.VAR(),Absyn.BIDIR(),{}),$1[TypeSpec]);
        $$[ElementSpec] = Absyn.COMPONENTS(v1ElementAttributes,$1[TypeSpec],$2[ComponentItems]); }

elementAttr : direction
      { $$[ElementAttributes] = Absyn.ATTR(false,false,Absyn.VAR(),$1[Direction],{}); }
  | variability
      { $$[ElementAttributes] = Absyn.ATTR(false,false,$1[Variability],Absyn.BIDIR(),{}); }
  | variability direction
      { $$[ElementAttributes] = Absyn.ATTR(false,false,$1[Variability],$2[Direction],{}); }
  | STREAM variability direction
      { $$[ElementAttributes] = Absyn.ATTR(false,true,$2[Variability],$3[Direction],{}); }
  | FLOW variability direction
      { $$[ElementAttributes] = Absyn.ATTR(true,false,$2[Variability],$3[Direction],{}); }
  | FLOW
      { $$[ElementAttributes] = Absyn.ATTR(true,false,Absyn.VAR(),Absyn.BIDIR(),{}); }

variability : PARAMETER
      { $$[Variability] = Absyn.PARAM(); }
  | CONSTANT
      { $$[Variability] = Absyn.CONST(); }
  | DISCRETE
      { $$[Variability] = Absyn.DISCRETE(); }
  // | /* empty */ { $$[Variability] = Absyn.VAR(); }

direction : T_INPUT
      { $$[Direction] = Absyn.INPUT(); }
  | T_OUTPUT
      { $$[Direction] = Absyn.OUTPUT(); }
  // | /* empty */ { $$[Direction] = Absyn.BIDIR(); }

/* Type specification */

typespec : path arraySubscripts
      { $$[TypeSpec] = Absyn.TPATH($1[Path],SOME($2[ArrayDim])); }
  | path arrayComplex
      { $$[TypeSpec] = Absyn.TCOMPLEX($1[Path],$2[TypeSpecs],NONE()); }

arrayComplex : LESS typespecs GREATER
      { $$[TypeSpecs] = $1[TypeSpecs]; }

typespecs : typespec
      { $$[TypeSpecs] = $1[TypeSpec]::{}; }
  | typespec COMMA typespecs
      { $$[TypeSpecs] = $1[TypeSpec]::$2[TypeSpecs]; }

arraySubscripts : LBRACK arrayDim RBRACK
      { $$[ArrayDim] = $1[ArrayDim]; }
  | /* empty */
      { $$[ArrayDim] = {}; }

arrayDim : subscript
      { $$[ArrayDim] = $1[Subscript]::{}; }
  | subscript COMMA arrayDim
      { $$[ArrayDim] = $1[Subscript]::$2[ArrayDim]; }

subscript : exp
      { $$[Subscript] = Absyn.SUBSCRIPT($1[Exp]); }
  | COLON
      { $$[Subscript] = Absyn.NOSUB(); }

/* function calls */

functioncall : LPAR functionargs RPAR
      { $$[FunctionArgs] = $1[FunctionArgs]; }

functionargs : namedargs
      { $$[FunctionArgs] = Absyn.FUNCTIONARGS({},$1[NamedArgs]); }
  | functionargs2
      { $$[FunctionArgs]= $1[FunctionArgs]; }

functionargs2 : functionargs3
      { $$[FunctionArgs]= $1[FunctionArgs]; }
  | explist2
      { $$[FunctionArgs] = Absyn.FUNCTIONARGS($1[Exps],{}); }
  | explist2 COMMA namedargs // TODO: Test for LALR grammar, may not work shift/reduce conflict
      { $$[FunctionArgs] = Absyn.FUNCTIONARGS($1[Exps],$2[NamedArgs]); }

functionargs3 : exp FOR foriterators
      { $$[FunctionArgs] = Absyn.FOR_ITER_FARG($1[Exp],$3[ForIterators]); }

namedargs : namedarg
      { $$[NamedArgs] = $1[NamedArg]::{}; }
  | namedarg COMMA namedargs
      { $$[NamedArgs] = $1[NamedArg]::$2[NamedArgs]; }

namedarg : ident EQUALS exp
      { $$[NamedArg] = Absyn.NAMEDARG($1[Ident],$2[Exp]); }

/* expressions */

exp : simpleExp
      { $$[Exp] = $1[Exp]; }
  | if_exp
      { $$[Exp] = $1[Exp]; }
  | matchcont
      { $$[Exp] = $1[Exp]; }

matchcont : MATCH exp cases ENDMATCH
      { $$[Exp] = Absyn.MATCHEXP(Absyn.MATCH(),$2[Exp],{},$3[Cases],NONE()); }
  | MATCH exp matchlocal cases ENDMATCH
      { $$[Exp] = Absyn.MATCHEXP(Absyn.MATCH(),$2[Exp],$3[ElementItems],$4[Cases],NONE()); }
  | MATCHCONTINUE exp cases ENDMATCHCONTINUE
      { $$[Exp] = Absyn.MATCHEXP(Absyn.MATCHCONTINUE(),$2[Exp],{},$3[Cases],NONE()); }
  | MATCHCONTINUE exp matchlocal cases ENDMATCHCONTINUE
      { $$[Exp] = Absyn.MATCHEXP(Absyn.MATCHCONTINUE(),$2[Exp],$3[ElementItems],$4[Cases],NONE()); }

if_exp : IF exp THEN exp ELSE exp
      { $$[Exp] = Absyn.IFEXP($2[Exp],$4[Exp],$6[Exp],{}); }
  | IF exp THEN exp expelseifs ELSE exp
      { $$[Exp] = Absyn.IFEXP($2[Exp],$4[Exp],$7[Exp],$5[ExpElseifs]); }

expelseifs : expelseif
      { $$[ExpElseifs] = $1[ExpElseif]::{}; }
  | expelseif expelseifs
      { $$[ExpElseifs] = $1[ExpElseif]::$2[ExpElseifs]; }

expelseif : ELSEIF exp THEN exp
      { $$[ExpElseif] = ($2[Exp],$4[Exp]); }

matchlocal : LOCAL elementItems
      { $$[ElementItems] = $1[ElementItems]; }

cases : case
      { $$[Cases] = $1[Case]::{}; }
  | case cases
      { $$[Cases] = $1[Case]::$2[Cases]; }

case : CASE casearg THEN exp SEMICOLON
      { $$[Case] = Absyn.CASE($2[Exp],info,{},{},$4[Exp],info,NONE(),info); }
  | CASE casearg EQUATION THEN exp SEMICOLON
      { $$[Case] = Absyn.CASE($2[Exp],info,{},{},$4[Exp],info,NONE(),info); }
  | CASE casearg EQUATION equationsection THEN exp SEMICOLON
      { $$[Case] = Absyn.CASE($2[Exp],info,{},$4[EquationItems],$6[Exp],info,NONE(),info); }
  | ELSE THEN exp SEMICOLON
      { $$[Case] = Absyn.ELSE({},{},$3[Exp],info,NONE(),info); }
  | ELSE EQUATION equationsection THEN exp SEMICOLON
      { $$[Case] = Absyn.ELSE({},$3[EquationItems],$5[Exp],info,NONE(),info); }

casearg : exp
      { $$[Exp] = $1[Exp]; }

simpleExp : logicexp
      { $$[Exp] = $1[Exp]; }
  | rangeExp
      { $$[Exp] = $1[Exp]; }
  | headtail
      { $$[Exp] = $1[Exp]; }
  | ident AS simpleExp
      { $$[Exp] = Absyn.AS($1[Ident],$2[Exp]); }

headtail : logicexp COLONCOLON logicexp
      { $$[Exp] = Absyn.CONS($1[Exp],$3[Exp]); }
  | logicexp COLONCOLON headtail
      { $$[Exp] = Absyn.CONS($1[Exp],$3[Exp]); }

rangeExp : logicexp COLON logicexp
      { $$[Exp] = Absyn.RANGE($1[Exp],NONE(),$3[Exp]); }
  | logicexp COLON logicexp COLON logicexp
      { $$[Exp] = Absyn.RANGE($1[Exp],SOME($3[Exp]),$5[Exp]); }

logicexp : logicterm
      { $$[Exp] = $1[Exp]; }
  | logicexp T_OR logicterm
      { $$[Exp] = Absyn.LBINARY($1[Exp],Absyn.OR(),$2[Exp]); }

logicterm : logfactor
      { $$[Exp] = $1[Exp]; }
  | logicterm T_AND logfactor
      { $$[Exp] = Absyn.LBINARY($1[Exp],Absyn.AND(),$2[Exp]); }

logfactor : relterm
      { $$[Exp] = $1[Exp]; }
  | T_NOT relterm
      { $$[Exp] = Absyn.LUNARY(Absyn.NOT(),$1[Exp]); }

relterm : addterm
      { $$[Exp] = $1[Exp]; }
  | addterm relOperator addterm
      { $$[Exp] = Absyn.RELATION($1[Exp],$2[Operator],$3[Exp]); }

addterm : term
      { $$[Exp] = $1[Exp]; }
  | unoperator term
      { $$[Exp] = Absyn.UNARY($1[Operator],$2[Exp]); }
  | addterm woperator term
      { $$[Exp] = Absyn.BINARY($1[Exp],$2[Operator],$3[Exp]); }

term : factor
      { $$[Exp] = $1[Exp]; }
  | term soperator factor
      { $$[Exp] = Absyn.BINARY($1[Exp],$2[Operator],$3[Exp]); }

factor : expElement
      { $$[Exp] = $1[Exp]; }
  | expElement power factor
      { $$[Exp] = Absyn.BINARY($1[Exp],$2[Operator],$3[Exp]); }

expElement : number
      { $$[Exp] = $1[Exp]; }
  | cref
      { $$[Exp] = Absyn.CREF($1[ComponentRef]); }
  | T_FALSE
      { $$[Exp] = Absyn.BOOL(false); }
  | T_TRUE
      { $$[Exp] = Absyn.BOOL(true); }
  | string
      { $$[Exp] = Absyn.STRING($1); }
  | tuple
      { $$[Exp] = $1[Exp]; }
  | LBRACE explist2 RBRACE
      { $$[Exp] = Absyn.ARRAY($2[Exps]); }
  | LBRACK matrix RBRACK
      { $$[Exp] = Absyn.MATRIX($2[Matrix]); }
  | cref functioncall
      { $$[Exp] = Absyn.CALL($1[ComponentRef],$2[FunctionArgs]); }
  | DER functioncall
      { $$[Exp] = Absyn.CALL(Absyn.CREF_IDENT("der",{}),$2[FunctionArgs]); }
  | LPAR exp RPAR
      { $$[Exp] = $2[Exp]; }
  | T_END
      { $$[Exp] = Absyn.END(); }

number : UNSIGNED_INTEGER
      { $$[Exp] = Absyn.INTEGER(stringInt($1)); }
  | UNSIGNED_REAL
      { $$[Exp] = Absyn.REAL(stringReal($1)); }

matrix : explist2
      { $$[Matrix] = {$1[Exps]}; }
  | explist2 SEMICOLON matrix
      { $$[Matrix] = $1[Exps]::$3[Matrix]; }

tuple : LPAR explist RPAR
      { $$[Exp] = Absyn.TUPLE($2[Exps]); }

explist : exp COMMA exp
      { $$[Exps] = {$1[Exp],$3[Exp]}; }
  | exp COMMA explist
      { $$[Exps] = $1[Exp]::$3[Exps]; }
  | /* empty */
      { $$[Exps] = {}; }

explist2 : exp
      { $$[Exps] = {$1[Exp]}; }
  | exp COMMA explist2
      { $$[Exps] = $1[Exp]::$3[Exps]; }
  | /* empty */
      { $$[Exps] = {}; }

cref : ident arraySubscripts
      { $$[ComponentRef] = Absyn.CREF_IDENT($1[Ident],$2[ArrayDim]); }
  | ident DOT cref
      { $$[ComponentRef] = Absyn.CREF_QUAL($1[Ident],{},$3[ComponentRef]); }
  | DOT cref
      { $$[ComponentRef] = Absyn.CREF_FULLYQUALIFIED($2[ComponentRef]); }
  | WILD
      { $$[ComponentRef] = Absyn.WILD(); }
  | ALLWILD
      { $$[ComponentRef] = Absyn.ALLWILD(); }

unoperator : PLUS
      { $$[Operator] = Absyn.UPLUS(); }
  | MINUS
      { $$[Operator] = Absyn.UMINUS(); }
  | PLUS_EW
      { $$[Operator] = Absyn.UPLUS_EW(); }
  | MINUS_EW
      { $$[Operator] = Absyn.UMINUS_EW(); }

woperator : PLUS
      { $$[Operator] = Absyn.ADD(); }
  | MINUS
      { $$[Operator] = Absyn.SUB(); }
  | PLUS_EW
      { $$[Operator] = Absyn.ADD_EW(); }
  | MINUS_EW
      { $$[Operator] = Absyn.SUB_EW(); }

soperator : STAR
      { $$[Operator] = Absyn.MUL(); }
  | SLASH
      { $$[Operator] = Absyn.DIV(); }
  | STAR_EW
      { $$[Operator] = Absyn.MUL_EW(); }
  | SLASH_EW
      { $$[Operator] = Absyn.DIV_EW(); }

power : POWER
      { $$[Operator] = Absyn.POW(); }
  | POWER_EW
      { $$[Operator] = Absyn.POW_EW(); }

relOperator : LESS
      { $$[Operator] = Absyn.LESS(); }
  | LESSEQ
      { $$[Operator] = Absyn.LESSEQ(); }
  | GREATER
      { $$[Operator] = Absyn.GREATER(); }
  | GREATEREQ
      { $$[Operator] = Absyn.GREATEREQ(); }
  | EQEQ
      { $$[Operator] = Absyn.EQUAL(); }
  | LESSGT
      { $$[Operator] = Absyn.NEQUAL(); }

path : ident
      { $$[Path] = Absyn.IDENT($1[Ident]); }
  | ident DOT path
      { $$[Path] = Absyn.QUALIFIED($1[Ident],$2[Path]); }
  | DOT path
      { $$[Path] = Absyn.FULLYQUALIFIED($2[Path]); }

ident : IDENT
      { $$[Ident] = $1; }

string : STRING
      { $$ = trimquotes($1); } // trim the quote of the string

comment : string
      { $$[Comment] = Absyn.COMMENT(NONE(),SOME($1)); }
  | /* empty */
      { $$[Comment] = Absyn.COMMENT(NONE(),NONE()); }

%%

public function trimquotes
"removes chars in charsToRemove from inString"
  input String inString;
  output String outString;
algorithm
  if (stringLength(inString)>2) then
    outString := System.substring(inString,2,stringLength(inString)-1);
  else
    outString := "";
  end if;
end trimquotes;

function fixArray
  input ElementAttributes v1ElementAttributes;
  input TypeSpec v2TypeSpec;
  output ElementAttributes v1ElementAttributes2;
  output TypeSpec v2TypeSpec2;

  Boolean flowPrefix,b1,b2 "flow";
  Boolean streamPrefix "stream";
  // Boolean inner "inner";
  // Boolean outer "outer";
  Variability variability,v1 "variability ; parameter, constant etc.";
  Direction direction,d1 "direction";
  ArrayDim arrayDim,a1 "arrayDim";
  Path path,p1;
  Option oa1;
algorithm
  Absyn.ATTR(flowPrefix=b1,streamPrefix=b2,variability=v1,direction=d1) := v1ElementAttributes;

  Absyn.TPATH(path=p1,arrayDim=oa1) := v2TypeSpec;

  a1 := match oa1
    local ArrayDim l1;
    case SOME(l1) then (l1);
    case NONE() then ({});
  end match;

  v1ElementAttributes2 := Absyn.ATTR(b1,b2,v1,d1,a1);
  v2TypeSpec2 := Absyn.TPATH(p1,NONE());

end fixArray;

function printContentStack
  input AstStack astStk;
  list skToken;
  list skPath;
  list skClassDef;
  list skIdent;
  list skClass;
  list skProgram;
  list sklstClass;
  list skString;
  list skInteger;
algorithm
  ASTSTACK(stackToken=skToken,stackPath=skPath,stackClassDef=skClassDef,stackIdent=skIdent,stackClass=skClass,stackProgram=skProgram,stacklstClass=sklstClass,stackString=skString,stackInteger=skInteger) := astStk;

  print("\n Stack content:");
  print(" skToken:");
  print(intString(listLength(skToken)));
  print(" skPath:");
  print(intString(listLength(skPath)));
  print(" skClassDef:");
  print(intString(listLength(skClassDef)));
  print(" skIdent:");
  print(intString(listLength(skIdent)));
  print(" skClass:");
  print(intString(listLength(skClass)));
  print(" skProgram:");
  print(intString(listLength(skProgram)));
  print(" sklstClass:");
  print(intString(listLength(sklstClass)));
  print(" skString:");
  print(intString(listLength(skString)));
  print(" skInteger:");
  print(intString(listLength(skInteger)));
end printContentStack;
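The two helper functions that close the grammar file, trimquotes and fixArray, are small enough to restate outside MetaModelica. The following Python sketch is only an illustration of their apparent behaviour, not the thesis implementation. It assumes that System.substring takes 1-based, inclusive start and stop indices, and the names trim_quotes, fix_array, Attr and TPath are hypothetical stand-ins for the listing's functions and for the Absyn.ATTR and Absyn.TPATH records.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


def trim_quotes(in_string: str) -> str:
    """Drop the first and last character (the quotes) of a STRING token.

    Assumption: System.substring(s, 2, n-1) is 1-based and inclusive,
    so it corresponds to the Python slice s[1:-1].
    Inputs of length 2 or less yield the empty string, as in the listing.
    """
    if len(in_string) > 2:
        return in_string[1:-1]
    return ""


@dataclass
class Attr:
    """Hypothetical stand-in for Absyn.ATTR."""
    flow: bool
    stream: bool
    variability: str
    direction: str
    array_dim: List[str] = field(default_factory=list)


@dataclass
class TPath:
    """Hypothetical stand-in for Absyn.TPATH (array_dim is an Option)."""
    path: str
    array_dim: Optional[List[str]] = None


def fix_array(attr: Attr, spec: TPath) -> Tuple[Attr, TPath]:
    """Move the array dimensions from the type spec into the attributes.

    As in the listing, the dimensions previously stored in the attributes
    are not carried over, and the returned type spec has no dimensions.
    """
    dims = spec.array_dim if spec.array_dim is not None else []
    new_attr = Attr(attr.flow, attr.stream, attr.variability,
                    attr.direction, dims)
    return new_attr, TPath(spec.path, None)


if __name__ == "__main__":
    print(trim_quotes('"abc"'))
    moved_attr, moved_spec = fix_array(
        Attr(False, False, "VAR", "BIDIR"), TPath("Real", ["3"]))
    print(moved_attr.array_dim, moved_spec.array_dim)
```

In the grammar, fixArray is called from the elementspec actions, so every component declaration ends up with its array dimensions in the attributes and none in the type specification.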

Appendix G

Additional Files

G.1 SCRIPT.mos

Listing G.1: SCRIPT.mos

// setCommandLineOptions({"+g=MetaModelica","+d=rml,noevalfunc"});
getInstallationDirectoryPath();
// setCommandLineOptions("+g=MetaModelica","+d=rml,noevalfunc");
loadFile("Types.mo");
loadFile("../../Compiler/FrontEnd/Absyn.mo");
loadFile("../../Compiler/Util/Error.mo");
loadFile("../../Compiler/Util/ErrorExt.mo");
loadFile("../../Compiler/FrontEnd/Dump.mo");
loadFile("../../Compiler/Util/Print.mo");
loadFile("../../Compiler/Util/RTOpts.mo");
loadFile("../../Compiler/Util/Util.mo");
loadFile("../../Compiler/Util/System.mo");

loadFile("TokenModelica.mo");
loadFile("LexTableModelica.mo");
loadFile("LexerCodeModelica.mo");
loadFile("LexerModelica.mo");
loadFile("ParseCodeModelica.mo");
loadFile("ParserModelica.mo");
loadFile("ParseTableModelica.mo");

loadFile("Main.mo");

// loadFile("Program.mo");

// Main.main({"","10"});
// Main.main({"","4"});
// Main.main({"program.mo","10"});
Main.main({"TestGrammar.mo","Modelica"});
// Main.main({"Absyn.mo","Modelica"});
/* Main.main({"ParserGenerator.mo","Modelica"});
Main.main({"LexerGenerator.mo","Modelica"});
Main.main({"ParserModelica.mo","Modelica"});
Main.main({"LexerModelica.mo","Modelica"});
Main.main({"LexTableModelica.mo","Modelica"});
Main.main({"LexerCodeModelica.mo","Modelica"});
Main.main({"ParserModelica.mo","Modelica"});
Main.main({"ParseCodeModelica.mo","Modelica"});
Main.main({"ParseTableModelica.mo","Modelica"});
Main.main({"TokenModelica.mo","Modelica"});
*/
// Main.main({"program4.txt","4"});
getErrorString();

G.2 Main.mo

Listing G.2: Main.mo

class Main

import Lexer;
import Parser;
import LexerModelica;
import ParserModelica;
import Util;
import RTOpts;
import System;
import Types;

public function main
"function: main
 This is the main function that the MetaModelica Compiler (MMC) runtime system calls to
 start the translation."
  input list inStringLst;
  list tokens;
  ParserModelica.AstTree astTreeModelica;
  type Mcode MCodeLst = list;
algorithm
  := matchcontinue (inStringLst)
    local
      String verstr, errstr, filename, parser, ast;
      list args1, args, chars;
      String s, str, omhome, oldpath, newpath, unparsed;
      Boolean result;
      Real tl, tp, tt;
    case args as ::
      equation
        {filename, parser} = RTOpts.args(args);
        "Modelica" = parser;
        false = (0 == stringLength(filename));
        print("\nParsing Modelica with file " + filename + "\n");
        // call the lexer
        // tokens = LexerModelica.scanString("Hello", true);
        System.startTimer();
        tokens = LexerModelica.scan(filename, false);
        System.stopTimer();
        tl = System.getTimerIntervalTime();
        print("\n Time Lexer:" + realString(tl));
        // print(OMCCTypes.printTokens(tokens, ""));
        print("\n Tokens processed:");
        print(intString(listLength(tokens)));
        // call the parser
        System.startTimer();
        (result, astTreeModelica) = ParserModelica.parse(tokens, filename, false);
        System.stopTimer();
        // print(str :: args1);
        tp = System.getTimerIntervalTime();
        print("\n Time Parser:" + realString(tp));
        tt = tl + tp;
        print("\n TOTAL Time:" + realString(tt));
        print("\n");
        // print the AST
        if (result) then
          // unparsed = Dump.unparseStr(astTreeModelica, false);
          // print(unparsed);
          print("\nSUCCEED");
          System.writeFile("UnParsed" + filename, Dump.unparseStr(astTreeModelica, true));
          // printAny(unparsed);
        else
          // print(Error.getMessagesStr());
          print("\n" + Error.printMessagesStr());
        end if;
        // Run the machine for exercise 10

        // printAny(astTree);
        print("\nargs:" + filename);
        printUsage();
      then ();
    case {}
      equation
        print("no args");
        printUsage();
      then ();
    case
      equation
        print("\n********** Error *************");
        printUsage();
      then ();
  end matchcontinue;
end main;

public function printUsage
  Integer n;
  List strs;
algorithm
  print("\nOMCC v0.9.2 (OpenModelica Compiler-Compiler) Lexer and Parser Generator - 2011");
end printUsage;

protected function readSettings
"function: readSettings
 author: x02lucpo
 Checks if 'settings.mos' exist and uses handleCommand with runScript(...) to execute it.
 Checks if '-s <file>.mos' has been
 returns Interactive.InteractiveSymbolTable which is used in the rest of the loop"
  input list inStringLst;
  output String str;
algorithm
  str := matchcontinue (inStringLst)
    local
      list args;
    case (args)
      equation
        outSymbolTable = Interactive.emptySymboltable;
        "" = Util.flagValue("-s", args);
        // this is out-commented because automatically reading settings.mos
        // can make a system bad
        // outSymbolTable = readSettingsFile("settings.mos", Interactive.emptySymboltable);
      then
        outSymbolTable;
    case (args)
      equation
        str = Util.flagValue("-s", args);
        str = System.trim(str, "\"");
        outSymbolTable = readSettingsFile(str, Interactive.emptySymboltable);
      then
        outSymbolTable;
  end matchcontinue;
end readSettings;

end Main;

Glossary

Abstract Syntax Tree (AST)
A data structure representing something which has been parsed, often used as a compiler or interpreter's internal representation of a program while it is being optimised and from which code generation is performed. The range of all possible such structures is described by the abstract syntax. [Howe, 2010, http://foldoc.org/abstract+syntax+tree].

Backus-Naur Form
BNF is a formal metasyntax used to express context-free grammars. [Howe, 2010, http://foldoc.org/Backus-Naur+Form].

Compiler-Compiler
A utility to generate the source code of a parser, interpreter or compiler from an annotated language description (usually in BNF). Most so-called compiler-compilers are really just parser generators. [Howe, 2010, http://foldoc.org/compiler-compiler].

Extended Backus-Naur Form
EBNF is a variation on the basic BNF meta-syntax notation with (some of) the following additional constructs: square brackets surrounding optional items, suffix "*" for Kleene closure (a sequence of zero or more of an item), suffix "+" for one or more of an item, curly brackets enclosing a list of alternatives, and super/subscripts indicating between n and m occurrences. [Howe, 2010, http://foldoc.org/Extended+Backus-Naur+Form].

Functional Programming
A program in a functional language consists of a set of (possibly recursive) function definitions and an expression whose value is output as the program's result. Functional languages are one kind of declarative language. They are mostly based on the typed lambda-calculus with constants. There are no side-effects to expression evaluation so an expression, e.g. a function applied to certain arguments, will always evaluate to the same value (if its evaluation terminates). Furthermore, an expression can always be replaced by its value without changing the overall result (referential transparency). [Howe, 2010, http://foldoc.org/functional+programming].

GNU Bison
Bison is a general-purpose parser generator that converts an annotated context-free grammar into an LALR(1) or GLR parser for that grammar. [Donnelly and Stallman, 2010, http://www.gnu.org/software/bison/].

Lexer
A Lexer is a program that performs the Lexical Analysis in a Compiler.

MetaModelica
MetaModelica is an extension of the Modelica language created with the purpose of allowing people from the Modelica community to contribute to the development of the OpenModelica compiler (OMC) [Pop and Fritzson, 2006].

Modelica
Modelica is an object-oriented equation-based programming language that allows specification of mathematical models of complex natural or man-made systems [Fritzson, 2004].

Parser
A Parser is a program that performs the Syntax Analysis in a Compiler.

UTF-8 (UCS Transformation Format 8)
An 8-bit, ASCII-compatible multibyte Unicode and UCS encoding. [Howe, 2010, http://foldoc.org/utf-8].

Acronyms

ANTLR Another Tool for Language Recognition.
AST Abstract Syntax Tree.
BNF Backus-Naur Form.
CFG Context Free Grammar.
DFA Deterministic Finite Automata.
EBNF Extended Backus-Naur Form.
FLEX Fast Lexical Analyzer Generator.
LALR Look Ahead LR parser.
LEX Lexical Analyzer Generator.
LL Left to right, Left most derivation parser.
LR Left to right, Right most derivation parser.
NFA Non-Deterministic Finite Automata.
OMC OpenModelica Compiler.
OMCCp OpenModelica Compiler-Compiler parser generator.
OSMC Open Source Modelica Consortium.
PDA Push-Down Automata.
SLR Simple LR parser.
YACC Yet Another Compiler-Compiler.

Division, Department: IDA, Dept. of Computer and Information Science, 581 83 Linköping

Date: 2011-05-31

Language: English

Report category: Examensarbete (degree project)

ISBN: -

ISRN: LIU-IDA/LITH-EX-A–11/019–SE

Title of series, numbering (ISSN): -

URL for electronic version: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-68863

Linköping Studies in Science and Technology, Thesis No. LIU-IDA/LITH-EX-A–11/019–SE

Title: OMCCp: A MetaModelica Based Parser Generator Applied to Modelica

Author: Edgar Alonso Lopez-Rojas

Abstract

The OpenModelica Compiler-Compiler parser generator (OMCCp) is an LALR(1) parser generator implemented in the MetaModelica language, with parsing tables generated by the tools Flex and GNU Bison. The code generated for the parser is in the MetaModelica 2.0 language, which is the OpenModelica compiler implementation language and an extension of the Modelica 3.2 language. OMCCp takes as input an LALR(1) grammar that specifies the Modelica language. The generated parser can be used inside the OpenModelica Compiler (OMC) as a replacement for the current parser, which is generated by the tool ANTLR from an LL(k) Modelica grammar. This report explains the design and implementation of this novel lexer and parser generator called OMCCp. Modelica and its extension MetaModelica are both languages used in the OpenModelica environment. Modelica is an object-oriented, equation-based language for modeling and simulation.

Keywords

Compiler, Parser, Lexer, Modelica, MetaModelica, OpenModelica, Simulation, LALR, Flex, Bison, Parser Generator, OMCC, OMCCp