
MESSENGERS-C COMPILER MSGR-06
Bozhena Bidyuk
Department of Information and Computer Science
University of California, Irvine
e-mail: [email protected]
Summer 1995

1 Abstract

The purpose of this research is the object-oriented design and implementation of a compiler for the language used by Messengers, a new distributed computing framework developed at the Department of Information and Computer Science at the University of California, Irvine, through the joint efforts of the research groups of Prof. Bic and Prof. Dillencourt. The Messengers system uses a new autonomous-object programming approach to distributed computing and incorporates enhanced autonomy and network coordination features. Messengers programs are written in the high-level Messengers language. Once compiled and assembled, the machine-independent byte code becomes an autonomous object (a messenger) that travels from one node of the distributed system to another, carrying the data and code that define its behavior. At each physical node, a continuously running Messengers daemon process interprets the behavior of the messengers and their transmission to other nodes. The Messengers compiler is responsible for translating high-level Messengers code into the Messengers intermediate language, a three-address assembler code.

2 Introduction

The Messengers language is a subset of C with an enhanced set of instructions for network navigation and coordination and for operations on strings. The Messengers compiler translates Messengers programs into intermediate Messengers code, which is further processed by the Messengers assembler developed by another member of the research team, Alexander Thornton. The lexical analyzer for the compiler is written in flex, a GNU version of lex [6], a lexical analyzer generator. The grammar for the Messengers language is written using bison, a GNU version of yacc [6], a parser generator. The rest of the compiler is written in the GNU version of C++ [7] (gcc, version 2.7.2). The compiler has been implemented and tested in the SunOS environment, version 4.1.4, release 2.

In the current version of the Messengers compiler, the efforts were concentrated on the definition of a stable version of the Messengers language, simple and yet complete; the development of the Messengers grammar; the implementation of a functional Messengers compiler that makes it possible to test the integrated Messengers system and discover possible weaknesses in the original design; and the preparation of usable user documentation describing the Messengers language and the Messengers compiler design.

At present, the Messengers compiler is a one-pass compiler. It performs optimizations on expressions, such as constant folding and algebraic transformations. Future versions of the compiler are planned to incorporate a second pass, over the intermediate code, with data flow analysis for constant propagation, copy propagation, and dead-code elimination. The planned optimizations are expected to improve the performance of Messengers programs not only by reducing the number of instructions but also by reducing the amount of code and data that must be transferred between network nodes.
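The expression-level optimizations named in the introduction, constant folding and algebraic transformations, can be illustrated with a small sketch. The C code below is not taken from the Messengers compiler (which is written in C++). The Expr type and the functions mk_const, mk_var, mk_bin and fold are hypothetical names introduced here, and only integer addition and multiplication are handled. The sketch folds constant subexpressions and applies the identities x + 0 = x, x * 1 = x and x * 0 = 0.

```c
#include <stdlib.h>

/* Hypothetical expression node: a constant, a variable, or a binary operation. */
typedef enum { EXPR_CONST, EXPR_VAR, EXPR_ADD, EXPR_MUL } ExprKind;

typedef struct Expr {
    ExprKind kind;
    long value;              /* used when kind == EXPR_CONST */
    char name;               /* used when kind == EXPR_VAR */
    struct Expr *lhs, *rhs;  /* used for EXPR_ADD and EXPR_MUL */
} Expr;

static Expr *mk(ExprKind kind, long value, char name, Expr *lhs, Expr *rhs)
{
    Expr *e = malloc(sizeof *e);
    if (e == NULL)
        abort();
    e->kind = kind;
    e->value = value;
    e->name = name;
    e->lhs = lhs;
    e->rhs = rhs;
    return e;
}

Expr *mk_const(long v) { return mk(EXPR_CONST, v, 0, NULL, NULL); }
Expr *mk_var(char n) { return mk(EXPR_VAR, 0, n, NULL, NULL); }
Expr *mk_bin(ExprKind k, Expr *l, Expr *r) { return mk(k, 0, 0, l, r); }

static int is_const(const Expr *e, long v)
{
    return e->kind == EXPR_CONST && e->value == v;
}

/* Returns a simplified tree. Discarded nodes are not freed in this sketch. */
Expr *fold(Expr *e)
{
    if (e->kind == EXPR_CONST || e->kind == EXPR_VAR)
        return e;

    e->lhs = fold(e->lhs);
    e->rhs = fold(e->rhs);

    /* Constant folding: both operands are known at compile time. */
    if (e->lhs->kind == EXPR_CONST && e->rhs->kind == EXPR_CONST) {
        long v = (e->kind == EXPR_ADD) ? e->lhs->value + e->rhs->value
                                       : e->lhs->value * e->rhs->value;
        return mk_const(v);
    }

    /* Algebraic transformations on the remaining cases. */
    if (e->kind == EXPR_ADD) {
        if (is_const(e->lhs, 0)) return e->rhs;   /* 0 + x = x */
        if (is_const(e->rhs, 0)) return e->lhs;   /* x + 0 = x */
    } else {
        if (is_const(e->lhs, 1)) return e->rhs;   /* 1 * x = x */
        if (is_const(e->rhs, 1)) return e->lhs;   /* x * 1 = x */
        if (is_const(e->lhs, 0) || is_const(e->rhs, 0))
            return mk_const(0);                   /* x * 0 = 0 */
    }
    return e;
}
```

Because the tree contains only constants and variables, replacing x * 0 by 0 is safe here; a compiler handling expressions with side effects would need an additional check.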

3 Messengers Language Description

3.1 Format

The Messengers language inherits the format of variable declarations, statements, and expressions from C, with the following additional rules imposed:
• no preprocessor directives used in C are implemented;
• the node file declaration (see section 3.5.3), if present, must be the first instruction of the messenger's program;
• messenger variables must be declared at the beginning of the program, immediately following the node file declaration (if it is present) and preceding any expression or statement instructions [2].

3.2 Reserved Words

The following keywords are reserved for Messengers language constructs and cannot be used as variable names:

    break     file      name      weight2
    char      func      node      weight3
    create    hop       NODE      weight4
    delete    if        out       WEIGHT0
    do        int       physical  WEIGHT1
    double    in        short     WEIGHT2
    else      link      struct    WEIGHT3
    float     LINK      weight0   WEIGHT4
    exec      long      weight1   while
    exit

The above list contains a subset of the reserved words of the standard C language as well as the reserved words specific to the Messengers language, which are outlined in bold.

3.3 Data Types

Standard scalar types as well as struct and array types are implemented in the Messengers language similarly to C. However, the union type and enumerated types are eliminated, as are pointer types, for the sake of simplicity and proper functionality at the network level. Pointer types would be particularly dangerous to use because their values are only meaningful within the computer system on which they are defined.

3.3.1 Scalar Types

All standard scalar data types are implemented: char, int, short, long, float, double. Note, however, that the unsigned data types are not implemented. The sizes in bytes of the data types listed above are the same as in the C language, and they therefore depend on the hardware platform used. Currently, the Messengers system runs on SPARC sun4 stations. The table below presents the Messengers types and their sizes in the current implementation:

    Data Type   Size in Bytes   Size in Bits
    char        1               8
    short       2               16
    int         4               32
    long        4               32
    float       4               32
    double      8               64

3.3.2 Complex Data Types

Arrays are implemented in Messengers with syntax and semantics identical to the C language, with the exception of dynamic arrays. The number of elements in an array must be an integer, and it must be specified at the time of variable declaration. Dynamic specification of the number of array elements is not allowed. Structures are implemented in Messengers equivalently to their implementation in the C language, with the exception of bit-field declarations. In addition, the Messengers compiler guarantees that a structure's fields will be allocated in memory in the order in which they are declared. The union and enumerated types are not implemented.
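As an illustration of the declaration rules above, the following sketch declares a fixed-size array and a structure without bit-fields, the two forms the text says Messengers shares with C. It is ordinary C, not Messengers code: the names samples, struct reading, fields_in_declaration_order and sample_count are hypothetical, and the offsetof checks only show, in C terms, what allocation in declaration order means.

```c
#include <stddef.h>

/* Fixed-size array: the element count is an integer constant
   written in the declaration itself. */
int samples[16];

/* Structure without bit-fields; a hypothetical example type. */
struct reading {
    char   tag;
    int    count;
    double value;
};

/* Returns 1 when the fields are laid out in declaration order. */
int fields_in_declaration_order(void)
{
    return offsetof(struct reading, tag) < offsetof(struct reading, count)
        && offsetof(struct reading, count) < offsetof(struct reading, value);
}

/* Number of elements in the fixed-size array. */
size_t sample_count(void)
{
    return sizeof samples / sizeof samples[0];
}
```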

3.4 Constants

Character constants, string constants, and integer and long integer constants are implemented in the Messengers language equivalently to their C implementation. Because pointer types are not available, the representation of strings is limited to arrays of characters, which makes it the user's responsibility to reserve the proper number of characters in the array for their data. Integer numbers are assigned type int. Long constants must be explicitly specified with the letter l or L at the end. The representation of floating point constants is currently limited to the standard decimal point format. Exponential representation of floating point constants may be implemented in the future. All floating point constants are assigned type double.
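The typing rules for numeric constants can be summarized in a short sketch. The C function below is an illustration only, not the compiler's lexical analyzer: ConstType and classify_constant are hypothetical names, and several details are assumptions not stated in the text, namely that only decimal digits are accepted, that a sign is treated as a separate operator, that forms such as "5." and ".5" are allowed, and that the l or L suffix applies to integer constants only.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { CONST_INVALID, CONST_INT, CONST_LONG, CONST_DOUBLE } ConstType;

/* Classifies an unsigned decimal numeric literal:
   digits only           -> int
   digits plus l or L    -> long
   digits with one '.'   -> double
   anything else         -> invalid (including exponent forms) */
ConstType classify_constant(const char *s)
{
    size_t i = 0, digits = 0;
    int seen_point = 0;

    /* Scan the leading run of digits with at most one decimal point. */
    for (; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (isdigit(c)) {
            digits++;
        } else if (c == '.' && !seen_point) {
            seen_point = 1;
        } else {
            break;
        }
    }

    if (digits == 0)
        return CONST_INVALID;
    if (s[i] == '\0')
        return seen_point ? CONST_DOUBLE : CONST_INT;

    /* A single trailing l or L marks a long integer constant. */
    if ((s[i] == 'l' || s[i] == 'L') && s[i + 1] == '\0' && !seen_point)
        return CONST_LONG;

    return CONST_INVALID;
}
```

Under these assumptions, "42" is classified as int, "42L" as long, and "3.14" as double, while "1e5" is rejected because exponential notation is not yet supported.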

3.5 Variables

In the Messengers language all variables are global. Therefore, specifications of memory classes such as auto, register, and static are not allowed. A variable can be of any scalar type, array type, or structure type as defined above. The void type is not available. In the current version of the compiler, variable initialization at the time of declaration is not allowed. However, this standard C feature will be implemented in the next version. All variables in a Messengers program can be subdivided into four groups: messenger variables, network variables, node variables, and command line arguments. Below we provide a detailed description of how each group of variables is declared and used.

3.5.1 Messenger variables

Messenger variables include all variables declared within a Messengers program. All messenger variables are global and are carried from node to node with the messenger [2].

3.5.2 Command line arguments

A Messengers program deals with command line arguments slightly differently than C does. Specifically, the user must declare command line arguments along with messenger variables. All command line arguments must have a name of the form $argi, where i is a positive integer specifying the position of the argument on the command line [2]. All command line arguments are treated in the program just like messenger variables.
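The $argi naming rule can be checked mechanically. The following C sketch is an illustration only: arg_index is a hypothetical helper, not part of the Messengers compiler, and accepting leading zeros (so that "$arg01" is read as argument 1) is an assumption not addressed in the text.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns i for a name of the form $arg<i> with i a positive integer,
   and 0 for any other name. */
int arg_index(const char *name)
{
    static const char prefix[] = "$arg";
    const size_t n = sizeof prefix - 1;
    int index = 0;

    /* The name must start with "$arg" and have at least one more character. */
    if (strncmp(name, prefix, n) != 0 || name[n] == '\0')
        return 0;

    for (const char *p = name + n; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return 0;
        /* Reject very long numbers before the multiplication can overflow. */
        if (index > 100000)
            return 0;
        index = index * 10 + (*p - '0');
    }
    return index;   /* "$arg0" yields 0, which is not a positive index */
}
```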

3.5.3 Node variables and #node declaration

The Messengers language introduces a special category of variables called node variables. The major difference between node variables and messenger variables is that node variables are maintained only at one logical node and cannot be carried to another node by the messenger. Node variables must be declared in a separate file, which must be specified at the beginning of the Messengers program in the following format:

    #node file-location-and-name;

where file-location-and-name is a string constant specifying the file name, preceded by the path to the file if it is not in the current directory [2]. For example:

    #node "/usr/john/application/my-file.node"

3.5.4 Network variables

Network variables reside at the logical network node and provide information about the logical node and link. All network variables are read-only. They are predefined, and a user should not attempt to declare them. The following is a list of the predefined network variables with their types and purposes [2]:
• $node: the current logical node name, type char[64];
• $address: the current logical node address, type double;
• $link: the name of the last passed link, type char[64];
• $sign: the direction of the last passed link, type int;
• $weight0, $weight1, $weight2, $weight3, $weight4: the weights of the last passed link, type int.
The user should note that even though all network variables are read-only, the value of the $address variable can be modified by a hop statement.
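For readers who think in C terms, the predefined network variables and their types can be pictured as one record per logical node. The sketch below is only a mental model: struct network_vars and link_weight are hypothetical names, and nothing in the text says the Messengers daemon stores the variables this way. The const pointer in link_weight mirrors the read-only rule.

```c
/* Hypothetical C view of the predefined network variables. */
struct network_vars {
    char   node[64];    /* $node: current logical node name */
    double address;     /* $address: current logical node address */
    char   link[64];    /* $link: name of the last passed link */
    int    sign;        /* $sign: direction of the last passed link */
    int    weight[5];   /* $weight0 .. $weight4 of the last passed link */
};

/* Read-only access to $weight<i>; returns 0 for an index outside 0..4. */
int link_weight(const struct network_vars *nv, int i)
{
    return (i >= 0 && i < 5) ? nv->weight[i] : 0;
}
```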

3.6 Expressions

The following set of operators is implemented in the Messengers language:
• binary operators: + - * / % = += -= *= /= %= > < == >= <=; the comparison operators, such as < (less than) and <= (less than or equal), are defined as in C and also on strings, in which case they compare the lengths of the strings;

• operator