Object Oriented Systems 4, 1997, 53–81

SYMPAL: a software environment for implicit concurrent object-oriented programming

Yariv Aridor,1 Shimon Cohen and Amiram Yehudai

Computer Science Department, Tel-Aviv University, Tel-Aviv 69978, Israel
{yariva}, [email protected]; [email protected]

1 Current address: IBM Research, Tokyo Research Laboratory, 1623-14, Shimotsuruma, Yamato-shi, Kanagawa-ken 242, Japan

Large-scale parallel machines hold great potential for attaining high-performance computing. However, writing explicit parallel programs that correctly manage parallelism among thousands of processes, thus utilizing the power of parallel machines, is a highly complicated task. This paper presents a practical parallel programming environment, SYMPAL, designed to achieve a high level of parallel performance while simplifying the parallel programming task. SYMPAL incorporates the advantages of both object-oriented and functional programming paradigms, with the goal of supporting multiparadigm and implicit parallel programming. The SYMPAL environment consists of a programming language, an optimizing compiler and a run-time system. The overall complexity of the programming task is handled through a division of labour among these components. The language’s inherent parallelism facilitates the extraction of potential parallelism, while the optimizing compiler and run-time system efficiently manage the available parallelism. SYMPAL has been efficiently implemented on a MIMD machine with eight processors and on several uniprocessors. Performance analysis of several ‘real’ programs such as the SYMPAL compiler itself and Nbody simulations is included.

Keywords: Object-oriented programming, concurrency, functional programming, efficiency, implicit programming

Large-scale machines, capable of executing thousands of processes in parallel, have already become commercially available (Shimizu et al. 1992, TM Corporation 1991). Writing explicit parallel programs that efficiently exploit the power of these machines is a highly complicated task; programmers must extract large amounts of parallelism in their programs and correctly manage the communication and synchronization among a vast number of processes. This task is usually left to expert programmers, thus preventing the widespread use of parallel processing. Extensive research on functional programming and object-oriented programming has been carried out over the years, with the intention of simplifying parallel programming. Functional approaches to parallel programming offer the advantages of implicit parallelism, implicit synchronization, and determinacy of execution (Peyton Jones 1989). However, they are less practical for developing programs that are either best expressed or more efficiently expressed by using mutable data, or can benefit from the advanced concepts of inheritance and encapsulation used in object-oriented programming (OOP).

This work is part of the first author’s PhD thesis (Aridor 1995), supervised by the other two authors. It was supported in part by a grant from the German–Israeli Foundation for Scientific Research and Development.

0969–9767 © 1997 Chapman & Hall

Figure 1: Types of implicit programming: (a) traditional implicit programming (a sequential language is fed to a parallelizing, optimizing compiler, which emits parallel code); (b) expose-and-then-reduce implicit programming (an inherently parallel language is fed to an optimizing compiler, which reduces parallelism and emits parallel code).

OOP provides a natural approach to parallel programming in which objects are considered as concurrency units, to be executed in parallel (Yonezawa 1990). Consequently, many concurrent object-oriented programming (COOP) languages have been developed in recent years, most of them derived either from models of sequential languages, such as C++ and SMALLTALK, or from the Actor model (Agha 1986). All these COOP languages extend their underlying computational models with special language constructs for concurrency control, thus supporting explicit concurrent OOP.

This paper focuses on implicit concurrent object-oriented programming of massively parallel programs. It presents a practical programming environment, SYMPAL, that incorporates the advantages of both the above-mentioned programming paradigms, with the goal of supporting implicit programming and high efficiency in massively concurrent OOP. While implicit programming with conventional languages (mainly HPF) focuses on regular programs and data-parallel algorithms (Foster 1994), SYMPAL is based on a new approach of expose-and-then-reduce parallelism for implicit COOP, which is characterized by irregular programs and task parallelism. It is a hybrid approach (see Fig. 1) in which an inherently parallel language facilitates the extraction of maximum parallelism, while a compiler and a run-time system are responsible for optimization and efficient management of parallelism. As will be shown, this approach strikes a successful balance among extraction of massive parallelism, high performance, and ease of use for implicit concurrent object-oriented programs.

This paper is organized as follows. The rest of this section describes the paper’s original contributions. Section 1 describes the language. Section 2 describes the compilation process. Section 3 presents performance results obtained on a MIMD machine. Section 4 compares SYMPAL with other related systems. Section 5 includes a summary and discussion of future work. The following is a summary of the paper’s original contributions:

- While most COOP languages provide explicit programming constructs for communication, synchronization and object consistency, SYMPAL is an inherently parallel language that supports these features implicitly. It is based on the solid programming model of the Actor model (Agha 1986). SYMPAL presents a new approach for implicit programming of concurrent object-oriented programs, implementing irregular computations.


- SYMPAL’s compiler applies several optimizations (for efficient method invocations and optimized communication) which are applicable to other COOP languages.
- SYMPAL was implemented on a MIMD computer. The paper describes extensive experience in developing concurrent object-oriented applications in SYMPAL, including ‘real’ ones such as the SYMPAL compiler (about 5000 lines) and Nbody simulations (about 1000 lines), and presents a parallel performance evaluation of these applications. Such experience has rarely been reported to date.

1 The SYMPAL language

Figure 2: The SYMPAL language (extensions for concurrent object-oriented programming on top of a functional language).

SYMPAL is an untyped parallel language based on the synthesis of functional and object-oriented paradigms. As shown in Fig. 2, it is composed of a pure functional language, defined with inherently parallel semantics, and extensions to concurrent OOP. The language is based on the Actor computational model (Agha 1986). The composition of OOP on top of a parallel functional base language yields the following features.

- Multiparadigm programming. Parallel programs can be written in a purely functional style or an object-oriented style. Moreover, method invocations and function calls are treated (semantically) in the same way, providing better linguistic support for concurrency in Actor-based languages (explained later).
- Implicit concurrent OOP. Parallelism in SYMPAL is the default execution mode. Sequencing occurs only when it is necessitated by data-dependencies. Specifically, all communication between objects is asynchronous. In addition, the language is based on implicit synchronization (i.e. synchronization implemented by special code which is generated automatically by the compiler) for values of parallel subcomputations (e.g. the reply values of messages) and implicit intraobject parallelism (i.e. simultaneous method invocations on the same object).
- Simplicity and high expressive power. SYMPAL has a design based on a minimal set of programming constructs:
  - function call/send: message-passing, parallelism
  - if-then-else: synchronization
  - finally: atomic update of objects, intraobject parallelism
  - defclass/defmethod/defun: definition of classes, methods and functions.

Such a design maintains high expressive power, in comparison with SYMPAL’s explicit counterparts, and makes the language easy to use, much like a sequential language. These features are demonstrated in later sections.

1.1 Parallel functional programming

The parallel functional subset of SYMPAL is a pure language that inherits its data types, primitive functions and forms for the definition of functions and macros from COMMON-LISP. However, its main constructs are defined to have parallel semantics, so that every expression can be evaluated in parallel. These constructs are:

- Function call: (F a1 ... an) or (LET ((v1 a1) ... (vn an)) body). In both forms, the actual argument expressions a1, ..., an and the body of the function are evaluated in parallel.
- if-then-else: (IF cond then else). The cond expression is evaluated first. If its value is non-nil, then is evaluated; otherwise, else is evaluated. if-then-else is the synchronization construct (see the example below).
- POR: (POR c1 ... cn). All c1 ... cn expressions are evaluated in parallel. The return value is that of the first expression ci that terminates with a non-nil value, although the computation of the other expressions continues. (A way of aborting the computations of all the irrelevant expressions is described in Aridor (1995), Chapter 2.) If all these expressions evaluate to nil, the return value is nil.

As an example, the program in Fig. 3 implements a parallel monitor of streams. (The first and rest functions are equivalent to the corresponding car and cdr built-in functions of COMMON-LISP.) The streams-monitor function examines two streams, s1 and s2, in parallel and returns a list of output data items corresponding to the input data items in both streams. As soon as input is obtained from one of the streams, the monitor is immediately released to continue: after the condition part of the corresponding IF statement is evaluated, the evaluation of its then branch spawns a new recursive call to streams-monitor, in parallel, to immediately wait for new input. The do-work function splits each input data item, composite-data, into two parts (by the part1 and part2 functions), which are processed in parallel. The combine function combines the intermediate output results (out1 and out2) to generate the corresponding output data item.

In general, sequencing occurs either because of if-then-else (i.e. the condition part must terminate prior to the execution of either of the then/else branches) or because the values of parallel subcomputations are required by primitive functions. In these cases, implicit synchronization is applied to wait for the required values.
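For instance, in the following sketch (not taken from the paper; pfib is an illustrative name) the two recursive calls are spawned in parallel, and the + primitive implicitly waits for (touches) both reply values before adding them:

(DEFUN pfib (n)
  (IF (< n 2)
      n
      (+ (pfib (- n 1)) (pfib (- n 2)))))

The IF first synchronizes on the value of (< n 2); the two recursive calls then proceed in parallel, and only the final addition blocks.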

1.2 Concurrent object-oriented programming

The computational model of the object-oriented subset of SYMPAL is based on the Actor model (Agha 1986), which describes computation by autonomous actors (objects), of any granularity, that communicate via asynchronous message-passing. In SYMPAL, as in the Actor model:


(DEFUN streams-monitor (s1 s2)
  (POR (IF s1 (cons (do-work (first s1))
                    (streams-monitor (rest s1) s2)))
       (IF s2 (cons (do-work (first s2))
                    (streams-monitor s1 (rest s2))))))

(DEFUN do-work (composite-data)
  (LET ((out1 (process (part1 composite-data)))
        (out2 (process (part2 composite-data))))
    (combine out1 out2)))

Figure 3: Streams monitor.

- Objects are dormant when created and perform actions only in response to the arrival of messages.
- Messages are handled by an object one at a time, in the order of their arrival (incoming messages are enqueued in a private queue of the object, if necessary).
- As a result of processing a message, an object may update its instance variables, send new messages to other objects (or to itself), or create new objects.
- The behaviour of an object is defined by a finite set of methods. Every method determines the actions to be performed as a result of receiving a message.

The object-oriented subset includes extensions to the functional base language to support class and method definitions, dynamic object creation, message-passing and updating of objects. In addition, SYMPAL supports single inheritance. In general, a SYMPAL object-oriented program is composed of a collection of class definitions, with separate definitions of methods, and possibly functions and macros.
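As a minimal sketch of these extensions (Point and Point3D are illustrative names; the DEFCLASS, DEFMETHOD and new forms follow the syntax used in the figures below):

(DEFCLASS Point ((x 0) (y 0)))
(DEFCLASS Point3D :Point ((z 0)))       ; single inheritance from Point

(DEFMETHOD (Point :get-x) nil x)        ; a method whose reply is the value of instance variable x

(LET ((p (new :Point3D)))               ; dynamic object creation
  (SEND p :get-x))                      ; asynchronous message to the new object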

1.2.1 Communication and synchronization

Messages between objects are sent via SEND expressions of the form:

(SEND O msg e1 ... en)

where O is an expression whose value is the target object, msg is the name of the message and e1 ... en are the parameters of the message. A parameter can be any valid SYMPAL expression. All the parameters are evaluated in parallel. In SYMPAL, SEND expressions are treated, semantically, as regular function calls:

- all message arguments are evaluated in parallel;
- the message is sent without waiting for the completion of the evaluation of all its parameters;
- the value of a SEND expression is the return value of the corresponding method that is invoked.

Consequently,

- Owing to the parallel semantics of the functional subset (i.e. all expressions, including SEND expressions, are evaluated in parallel), all communication is asynchronous.

- SEND expressions can be nested in expressions of any type. In addition, variables can be bound to the values of SEND expressions via a LET construct, enabling the reply values of messages to be used directly at different places in the program. This is a very useful linguistic extension to the basic Actor model (Agha 1986), simplifying the development and understanding of programs. An example is shown in section 2.2.3.
- Synchronization (via special code generated by the compiler) is applied to suspend the execution of any subcomputation that requires a not-yet-computed reply value of an asynchronous message.

As an example, consider a program for evaluating a numeric expression represented as a tree of objects. Each internal object represents a binary operation, and each leaf represents a value. A subtraction operation is represented by an object of the class difference. Figure 4 includes three versions of this class definition: in SYMPAL and in two ancestor Actor-based languages, SAL and ABCL.

The SAL language (Agha 1986) is a minimal Actor-based language without any extensions. The eval method is completed immediately after two asynchronous eval messages have been sent. The object is unlocked to receive the values of the operands, which are sent back via get-value messages. The get-value method is a continuation of the eval method in which the subtraction operation is applied. Since the final value is computed and sent back from the get-value method and not from the eval method, two additional instance variables must be used. The father variable saves the address of the sender object to which the reply value should be sent. The value variable is used to accumulate the reply values.

The ABCL (Yonezawa 1990) version is based on ‘future type’ messages, an extension to the basic Actor model. Every asynchronous message is associated with a special future object that is created explicitly to save its reply value. The execution of the current eval method is suspended, by a special built-in next-value primitive, to wait for the future reply values. The variables L and R, which are associated with the future values, allow the reply values to be subtracted in the correct order. The explicit ‘!’ construct is used to return the final value.

The SYMPAL version is, in effect, an implicit version of the corresponding ABCL program. Implicit synchronization is applied in order to wait for the values of the SEND expressions before they are used within the subtraction. The result of the subtraction is automatically sent back as the reply value. The code looks exactly like sequential code. To summarize, SYMPAL further simplifies programming with asynchronous message-passing and implicit synchronization, in comparison with its ancestor languages ABCL and SAL.
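As an illustration of what such a SYMPAL eval method looks like (a sketch based on the description above, not necessarily the figure’s exact code):

(DEFMETHOD (difference :eval) nil
  (- (SEND left :eval) (SEND right :eval)))

Both eval messages are sent asynchronously and evaluated in parallel; the subtraction implicitly waits for the two reply values, and its result is automatically sent back as the reply to the original eval message.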

1.2.2 The FINALLY concept

SYMPAL uses a powerful construct named FINALLY to support assignments to instance variables in methods. The FINALLY expressions are used as tail expressions of a method, so at most one FINALLY expression is executed during an invocation of a method. Execution of a method is divided into two stages: actions and finally. The actions stage includes all actions, such as message-passing and function calls, excluding assignments to instance variables. During this stage the object is locked. The second stage is an execution of a FINALLY expression that updates the instance variables and unlocks the object. A FINALLY expression has the form

(FINALLY E ((v1 E1) ... (vn En)))
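For instance, a minimal sketch of an updating method (Counter, :inc and delta are illustrative names):

(DEFCLASS Counter ((count 0)))

(DEFMETHOD (Counter :inc) (delta)
  (FINALLY T ((count (+ count delta)))))

The object stays locked while the actions stage runs; the FINALLY expression then replies with T, atomically commits the new value of count and unlocks the object.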


def difference (value,right,left,father)
  [case operation of
     eval : (customer)
     get-value : (v,sender)
   end case]
  if operation = eval then
    send eval request to right
    send eval request to left
    become difference(0,right,left,customer)
  fi
  if operation = get-value then
    let new-value = if sender = right then value - v else value + v fi
    { if value =/ 0 then
        send to father
        become difference(0,right,left,nil)
      else
        become difference(new-value,right,left,father)
      fi }
  fi
end def

(a)

[object difference
  (state left,right)
  (script
    (=> [:eval]
        (temporary [future1:=(make-future)] [future2:=(make-future)] L R)
        [left ...

Figure 4: The difference class in (a) SAL, (b) ABCL and (c) SYMPAL.

(DEFMETHOD (AlarmClock :ticks) nil
  (IF (> count 0)
      (FINALLY (SEQUENCE (sleep 1) (SEND self :ticks))
               ((count (- count 1))))
      (PARALLEL (SEND person :time-is-up) T)))

(DEFMETHOD (AlarmClock :wake-me) (person time)
  (FINALLY (SEND self :ticks)
           ((count time) (person person))))

(DEFMETHOD (AlarmClock :time-is-left) nil count)

Figure 7: An alarm clock.

Implicit intraobject parallelism allows the program to send ticks self messages, recursively, to the same object, and to handle an interleaved sequence of time-is-left and ticks messages.

1.3.3 Project team

The program in Fig. 9 creates a set of objects which collaborate to solve a given task. The program has complex communication and synchronization patterns, and thus conveys the flavour of SYMPAL and a general impression of its expressive power. This program can be compared with an equivalent ABCL program in Yonezawa (1990). Figure 8 shows a diagram of possible interaction among the objects (thick lines denote busy objects and arrows denote message flow).

A client object activates a project team to solve a problem by a given deadline (by sending a start message). A project team consists of a project leader object and multiple worker objects. The project leader activates all the workers to solve the problem (by sending solve messages). Each worker uses a different strategy and works independently. The project leader also tries to solve the problem by itself. When the deadline is reached, it requests the client to extend the deadline (by sending an extend-deadline message). If the deadline is extended, it resets the alarm clock object with the new deadline (by sending a wake-me message). As soon as a solution is found (indicated by the horizontal dashed line in Fig. 8), the project leader sends it to the client (by sending a finish message) and instructs all the workers to stop (by sending stop messages).


Figure 8: Interaction diagram for the project team program.

In the source code:

- Waiting for one-of-many replies from the workers, in solve-by-team, is expressed by means of the POR construct.
- Intraobject parallelism enables the project leader object and the worker objects to handle messages such as time-is-up and stop while they are busy trying to solve the problem.
- Once a solution has been found, all the workers should stop work. Therefore, each worker periodically examines a local stop variable that is set by the leader object when a solution is found.

1.3.4 Moving to parallel programs

As can be seen in the above examples, minimal effort is required to move from sequential object-oriented programs to concurrent object-oriented programs in SYMPAL. This effort includes:

- Making decisions regarding sequential blocks (defined by SEQUENCE); by default, blocks are parallel (defined by PARALLEL).
- Writing extra code for high-level synchronization between different phases of computations, e.g. successive Nbody simulation steps; a sketch of such code is given below. (This was not required for any of the programming examples given in this paper.) This type of code is unavoidable with any parallel language.
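The following sketch is not taken from the paper’s Nbody programs; Coordinator, :step, :done and send-step-to-all are hypothetical names. It shows the flavour of such phase-synchronization code: a coordinator starts the next simulation step only after every body object has reported, via a :done message, that it finished the current one.

(DEFCLASS Coordinator (bodies (pending 0)))

(DEFUN send-step-to-all (bodies)
  (IF bodies
      (PARALLEL (SEND (first bodies) :step)
                (send-step-to-all (rest bodies)))))

(DEFMETHOD (Coordinator :start-step) nil
  (FINALLY (send-step-to-all bodies)        ; send :step to every body after committing pending
           ((pending (length bodies)))))

(DEFMETHOD (Coordinator :done) nil          ; each body is assumed to send :done when its step ends
  (IF (= pending 1)
      (FINALLY (SEND self :start-step) ((pending 0)))
      (FINALLY T ((pending (- pending 1))))))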

The relatively easy transformation to parallel programs in SYMPAL is due to the inherent parallelism and implicit mechanisms of the language.


(DEFCLASS Worker ((stop nil)))

(DEFCLASS ProjectLeader :Worker
  ((solution nil) team alarm-clock client stop spec))

(DEFMETHOD (ProjectLeader :ProjectLeader) (report-to program-spec)
  (FINALLY T ((spec program-spec)
              (team (list self))
              (alarm-clock (new :AlarmClock))
              (client report-to))))

(DEFMETHOD (ProjectLeader :add-team-member) (worker)
  (FINALLY T ((team (cons worker team)))))

(DEFMETHOD (ProjectLeader :start) (spec deadline)
  (SEQUENCE (SEND alarm-clock :wake-me self deadline)
            (LET ((sol (solve-by-team spec team)))
              (FINALLY (SEQUENCE sol (SEND self :set-solution sol)) nil))))

(DEFMETHOD (ProjectLeader :set-solution) (sol)
  (PARALLEL (SEND client :finish sol)
            (stop-all-workers team)
            (FINALLY T ((solution sol)))))

(DEFUN solve-by-team (spec team)
  (IF team
      (POR (SEND (first team) :solve spec)
           (solve-by-team spec (rest team)))))

(DEFUN stop-all-workers (team)
  (IF team
      (PARALLEL (SEND (first team) :stop)
                (stop-all-workers (rest team)))))

(DEFMETHOD (ProjectLeader :time-is-up) nil
  (UNLESS solution              ;; (unless c e) == (if (not c) e)
    (...                        ;;; ask for a new deadline; otherwise,
     (PARALLEL (SEND client :finish ’failure)
               (stop-all-workers team)))))

(DEFMETHOD (Worker :stop) nil
  (FINALLY T ((stop t))))

(DEFMETHOD (Worker :solve) (spec)
  (UNLESS stop
    (SEQUENCE ...                       ; start/continue solving the problem
              (SEND self :solve spec)))) ; resend in order to examine the stop variable

Figure 9: A project team.


(DEFMETHOD (c1 m1) ()
  ...
  (IF (PARALLEL (SEND ...) .. (SEND ...)
                (FINALLY (...)))   ; update instance variables
      ...))
(a)

(DEFMETHOD (c1 m1) ()
  ...
  (LET ((v (SEND ...)))
    ...
    (IF v                          ; use the reply of the message
        ...)))
(b)

(DEFMETHOD (c1 m1) ()
  ...
  (PARALLEL ...
            (SEND self ...)        ; continuation
            (FINALLY (...))))      ; update instance variables
(c)

(DEFMETHOD (c1 m1) ()
  ...
  (FINALLY (SEND ...)              ; reply
           (...)))                 ; update instance variables
(d)

Figure 10: Idioms in SYMPAL concurrent object-oriented programs: (a) full asynchronous message-passing; (b) future-based communication; (c) early reply; (d) message chaining.


Figure 11: Asynchronous message-passing between objects A and B.

1.4 Method idioms

As a summary of the language description, we present the following method idioms based on our experience of programming several applications of over 10 000 lines of source code. These idioms demonstrate the style of programming in SYMPAL. In addition, they indicate potential places for the optimizations discussed in the next section. Figure 10a includes asynchronous message-passing to activate multiple objects in parallel. The replies to these messages are never used by the current method. Figure 10b includes sending an asynchronous message and later synchronizing, awaiting its reply. Figure 10c shows a case of an early reply. The reply to the current message is sent immediately while a new message is being sent to the object itself as a continuation of the current method. Figure 10d shows a case of message chaining in which the reply to the current message is the reply to a new message sent by the object.
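As one concrete (and purely illustrative) instance of the early-reply idiom of Fig. 10c, with hypothetical names Server, :request and :process:

(DEFMETHOD (Server :request) (job)
  (PARALLEL (SEND self :process job)   ; continuation: the real work, handled as a separate invocation
            (FINALLY T nil)))          ; immediate reply T; no instance variables are updated

The sender of :request obtains its reply as soon as the FINALLY executes, while the processing continues as a separate self-invocation.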

2 The optimizing compiler

SYMPAL belongs to a class of parallel languages in which parallelism is clearly defined by the language. Parallelism in SYMPAL is inherent, so no compile-time analysis is needed in order to extract it. However, naive implementation of SYMPAL often creates orders of magnitude more parallel tasks than can be exploited by a parallel machine. A program is broken into an enormous number of fine-grained subcomputations, which create very frequent parallel events of synchronization, communication, context switching, and heap memory allocation. Figure 11 shows a scenario of asynchronous message-passing between objects A and B. The required operations are:


Figure 12: Compilation phases of SYMPAL.

- decoding, to obtain the target object location;
- message creation, to pack all the message arguments and to create a placeholder for the future reply value;
- message transfer, to deliver the message to the target object.

The operations at the processor of the receiver object are:

- method scheduling, to spawn a new task for method invocation;
- method dispatching (dynamic binding);
- return of a reply to the message via message creation and message transfer.

If the reply to the message is needed by the current method in object A, synchronization (blocking, resuming) is also performed. Unless the execution grain-size of the remote method invocation is large enough, the overhead of these operations can overwhelm any speedup gained by activating objects A and B in parallel. Thus, the main goal of the compiler is to minimize the overhead and the frequency of these events.

The compiler is written in SYMPAL itself. It produces object code in C with calls to run-time services for task management and communication. Figure 12 shows the compilation phases:

1. Source code transformations are performed to guarantee intraobject parallelism.
2. Type analysis is performed for objects, providing type information for object variables.
3. Strictness analysis is carried out to determine which expressions should be executed in parallel.
4. Interobject communication is optimized.
5. The object code is generated.

Throughout the rest of this section, the terms ‘evaluation’ and ‘execution’ are used interchangeably.


2.1 Type inference

Inferring the types of object variables at compile time (Dean et al. 1996, Plevyak and Chien 1995) is essential for carrying out static method dispatching and interprocedural strictness analysis (see below). A complete type inference has not yet been implemented. Currently, we exploit only the information about method invocations for which there is a single candidate for the actual method that is dispatched.
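For example (an illustrative sketch, not taken from the paper), if :area is defined by exactly one class in the program, every (SEND c :area) has a single candidate method, so the invocation can be dispatched statically and analysed interprocedurally:

(DEFCLASS Circle ((radius 1)))

(DEFMETHOD (Circle :area) nil
  (* 3.14159 radius radius))           ; assumed to be the only :area method in the program

(DEFUN total-area (circles)
  (IF circles
      (+ (SEND (first circles) :area)  ; single candidate method: static dispatch possible
         (total-area (rest circles)))
      0))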

2.2 Strictness analysis

This analysis determines for every expression whether it should be executed in parallel, eliminating parallelism wherever it is guaranteed that no potential speedup can be gained. It was originally used in the context of side-effect-free functional languages (Gray 1986). The analysis is both intraprocedural and interprocedural (wherever the invoked methods can be determined statically), and is based on a strictness property of expressions: given a variable v, an expression E is strict with respect to v if v is always touched during the evaluation of E. (Touching is a run-time operation applied to a variable in order to check whether it has been assigned a real value; if not, the execution of the current subcomputation is suspended until the value has been computed.) In that case, v is said to be strictly used in E. The use of this property eliminates parallel overhead in several cases.
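A small illustration (not taken from the paper; the function names are ours):

(DEFUN strict-in-v (v)
  (+ v 1))                     ; the + primitive touches v on every evaluation, so the function is strict in v

(DEFUN not-strict-in-v (flag v)
  (IF flag (+ v 1) 0))         ; v is touched only when flag is non-nil, so the function is not strict in v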

2.2.1 Already touched variables

If a variable v is strictly used in an expression E, no special code for touching v is needed wherever v is strictly used in other expressions that are control-dependent on the value of E. Consider the following if-then-else expression:

(IF (> n 0) (SEND ... (- n 1)) T)

Once n has been touched before the comparison, it is redundant to touch it again before the subtraction (- n 1). To fully exploit this optimization, the compiler traverses all the methods of a program and eliminates touching of instance variables that are determined to be always touched.

2.2.2 Unblocked expressions

If all the strictly used variables of an expression are touched, its evaluation time depends only on its granularity (i.e. the evaluation will not be suspended for an unpredictable amount of time in order to wait for future values of these variables). Thus, it is more efficient to evaluate such an expression in sequence, unless its granularity justifies spawning a new task to speed up the program execution. Two kinds of expression fall into this category:

1. Side-effect-free expressions in which all the variables are touched. The subexpression (- n 1) of the if-then-else expression in section 2.2.1 is one such example.
2. Method invocations that are never suspended while awaiting future values. The corresponding methods are termed unblocked methods. For local asynchronous messages, invocation of unblocked methods can be done sequentially, without affecting the semantics of the program. (In the general case, a sequential invocation of a non-unblocked method M by an object O may cause deadlock due to suspension of the invocation of M in order to wait for replies to messages sent to object O while it is still locked by the invocation of M.)

The following TerminateStep method, taken from the Nbody TREE program (see section 3), is an example of an unblocked method (x is determined by the compiler to be always touched).


(DEFMETHOD (Cell TerminateStep) nil
  (IF (= x 0)
      (FINALLY T ((x 1)))
      (PARALLEL (SEND father :TerminateStep)
                (FINALLY T ((x 0))))))

A special boolean value in every entry of the method tables is used to indicate unblocked methods.

2.2.3 Immediate strictness

Given an expression E1, it is immediately strict (a term borrowed from Gray (1986)) with respect to the value of another expression E2 if the following conditions are satisfied:

- The granularity of the beginning of E1, up to the point where the value of E2 is used, is smaller than the granularity of a task creation.
- Sequential evaluation of E2 is guaranteed not to cause any deadlocks (e.g. E2 is side-effect-free).

In this case, E2 is evaluated in sequence. As an example, consider the following code segment (also the method pattern in Fig. 10b):

(LET ((v (SEND ...)))    ; body starts here
  ...                    ; do something
  (IF v
      ...)
  ...)                   ; body ends here

Let E1 be the body of the LET and E2 be the SEND expression. If the above conditions are satisfied, the message is sent synchronously. If it is a local one, the corresponding method is invoked in sequence prior to the execution of the LET body.

Let E1 be the body of LET and E2 be the SEND expression. If the above conditions are satisfied, the message is sent synchronously. If it is a local one, the corresponding method is invoked in sequence prior to the execution of the LET body. It is not always possible to exploit immediate strictness without some restructuring of the object code. As an example, consider the following method: (DEFMETHOD (...) nil (LET ((a (send o1 :m1 ..)) (b (send o2 :m2 ..))) (+ a b)))

Owing to immediate strictness, one of the SEND expressions can be evaluated in sequence. In practice, the execution time is improved only if one of these expressions results in a local method invocation. Such an invocation is guaranteed by the following structure of the object code:

if ... {
    b <- reply value of asynchronous message m2(...);
    a <- m1(...);
} else {
    ...
}

Figure 13: Tail-recursion elimination.

l1 = o1->m1(); o2->m2(); l3 = o3->m3(l1);
finally ((v l3)) { return o4->m4(); }

The m1 and m2 member-function invocations are executed in parallel, since no data-dependencies exist between them. However, the m3 member-function is invoked asynchronously after the execution of the m1 member-function invocation is completed, since the former is data-dependent on the return value of the latter (assigned to the l1 local variable). Once the value of l3 has been computed and bound to the instance variable v, the object is unlocked and execution proceeds with the invocation of the m4 member-function. Its return value becomes that of the method.

5.2 Future work

In the near future we will focus on extending the language and further improving performance; specifically:

- Extending SYMPAL with new features such as synchronization constraints (to allow delays in the handling of specific messages). In the current version, SYMPAL does not support the notion of delayed messages; if messages cannot be handled, an error code is returned. We plan to investigate available schemes (Matsuoka et al. 1993) for synchronization constraints that avoid cases of inheritance anomalies.
- Studying the composition of programs with policies for efficient placement of objects. Given a SYMPAL program, efficient placement of its objects among the processors of the target parallel machine can improve load balancing and locality among objects, resulting in better parallel performance. Our idea is to organize objects into hierarchical containers for which placement policies are specified. In practice, programs will include definitions of container objects that control the placement of their internal objects, based on specific placement policies. Consequently, the specification of policies for object placement and the program code are kept separate, maintaining the modularity of both parts. Our initial work on containers is described in Aridor (1995).


- Exploring run-time optimizations based on profiling the execution of SYMPAL programs, to further improve performance (Telem 1996).
- Exploring ways (e.g. language annotations) to control the granularity of computations to achieve better performance (and portability) across different parallel platforms. While the compiler can manage synchronization and communication efficiently, it cannot automatically control and tune the granularity of computations.

Acknowledgment We wish to thank Mike MacDonald from IBM Japan for checking the wording of this paper.

References

Agha, G. (1986) ACTORS: A Model for Concurrent Computation in Distributed Systems. MIT Press, Boston, MA.
Aridor, Y. (1995) An efficient software environment for implicit parallel programming with a multiparadigm language. PhD thesis, Tel-Aviv University.
Athas, W.C. (1987) Fine grain concurrent computations. Technical Report TR-87-5242, Computer Science Dept, California Institute of Technology.
Bacon, D., Graham, S. and Sharp, O. (1994) Compiler transformations for high-performance computing. ACM Computing Surveys, 26(4), 345–420.
Barak, A. and Ami, L. (1985) MOS: a multicomputer distributed operating system. Software – Practice and Experience, 15(8).
Bershad, B., Lazowska, E. and Levy, H. (1988) PRESTO: a system for object-oriented parallel programming. Software – Practice and Experience, 18(8), 713–32.
Blume, W., Eigenmann, R., Hoeflinger, J., Padua, D., Peterson, P., Rauchwerger, L. and Tu, P. (1994) Automatic detection of parallelism: a grand challenge for high-performance computing. IEEE Parallel and Distributed Technology: Systems and Applications, 2(3), 37–47.
Chien, A., Karamcheti, V., Plevyak, J. and Feng, W. (1992) Techniques for efficient execution of fine-grained concurrent programs. In Proceedings of the Workshop on Languages and Compilers for Parallel Machines.


Chien, A., Karamcheti, V. and Plevyak, J. (1993) The CONCERT system – compiler and runtime support for efficient fine-grained concurrent object-oriented programs. Technical Report 93-1815, Computer Science Dept, University of Illinois.
Chien, A., Reddy, U.S., Plevyak, J. and Dolby, J. (1996) ICC++ – a C++ dialect for high performance parallel computing. In Proceedings of the Second International Symposium on Object Technologies for Advanced Software.
Corradi, A. and Leonardi, L. (1990) Parallelism in OOP languages. In Proceedings of the IEEE International Conference on Computer Languages, March.
Dally, W.J. (1990) The J-machine system. In Expanding Frontiers, Volume 1 (eds P.H. Winston and S.A. Shellard), MIT Press, Boston, MA.
Dean, J., DeFouw, G., Grove, D., Litvinov, V. and Chambers, C. (1996) Vortex: an optimizing compiler for object-oriented languages. In Proceedings of the 11th Conference on Object-Oriented Programming Systems, Languages and Applications, pp. 83–100.
Foster, I. (1994) Task parallelism and high-performance languages. IEEE Parallel and Distributed Technology, Fall, 27–36.
Gannon, D. and Lee, J.K. (1991) Object oriented parallelism: PC++ ideas and experiments. In Proceedings of the Japan Society for Parallel Processing, pp. 13–23.
Gray, S. (1986) Using futures to exploit parallelism in Lisp. Master’s thesis, Computer Science Dept, MIT.
Grimshaw, A. (1993) Easy-to-use object-oriented parallel processing with MENTAT. IEEE Computer, 26(5).
Halstead, R.H. Jr and Fujita, T. (1988) MASA: a multithreaded processor architecture for parallel symbolic computing. In Proceedings of the 15th Annual Symposium on Computer Architecture. IEEE Computer Society, New York.
Kale, L.V. and Krishnan, S. (1993) Medium grained execution in concurrent object-oriented systems. In Proceedings of the OOPSLA Workshop on Efficient Implementation of Concurrent Object-Oriented Languages.
Kim, W.Y. and Agha, G. (1992) Compilation of a highly parallel Actor-based language. In Languages and Compilers for Parallel Computing. Springer-Verlag, Berlin.
Larus, J., Richards, B. and Viswanathan, G. (1992) C**: a large-grain, object-oriented, data-parallel programming language. Technical Report 1126, Computer Science Dept, University of Wisconsin-Madison.
Lehman, P.L. (1981) Efficient locking for concurrent operations on B-trees. ACM Transactions on Database Systems, 6(4).


Matsuoka, S., Taura, K. and Yonezawa, A. (1993) Highly efficient and encapsulated re-use of synchronization code in concurrent object-oriented languages. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages and Applications, pp. 109–26.
Moon, D. (1986) Object-oriented programming with FLAVORS. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages and Applications, pp. 1–8.
Peyton Jones, S.L. (1989) Parallel implementation of functional programming languages. Computer Journal, 32(4).
Plevyak, J. and Chien, A. (1994) Precise concrete type inference for object-oriented languages. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages and Applications.
Sargeant, J. (1993) Uniting functional and object-oriented programming. In Proceedings of the First JSSST International Symposium on Object Technologies for Advanced Software (eds S. Nishio and A. Yonezawa).
Segall, A. (1983) Distributed network protocols. IEEE Transactions on Information Theory, 29(1), 23–35.
Seitz, C.L. (1985) The cosmic cube. Communications of the ACM, 28(1), 22–33.
Shimizu, T., Horie, T. and Ishihata, H. (1992) Low-latency message communication support for the AP1000. In Proceedings of the International Symposium on Computer Architecture, pp. 288–97.
Taura, K. and Yonezawa, A. (1996) SCHEMATIC: a concurrent object-oriented extension to Scheme. In Proceedings of Object Based Parallel and Distributed Computation. To appear.
Taura, K., Matsuoka, S. and Yonezawa, A. (1993) An efficient implementation scheme of concurrent object-oriented languages on stock multicomputers. In Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.
Telem, Y. (1996) Profile-based object assignment in fine-grained parallel object-oriented environments. Master’s thesis, Computer Science Dept, Tel-Aviv University.
TM Corporation (1991) The Connection Machine CM-5 technical summary. Technical report, Thinking Machines Corporation, Cambridge, MA.
Tripathi, A. and Berge, E. (1988) An implementation of the object-oriented concurrent programming language SINA. Software – Practice and Experience, 19(3), 235–56.
Wyatt, B.B., Kavi, K. and Hufnagel, S. (1992) Parallelism in object-oriented languages: a survey. IEEE Software, November, 56–66.
Yasugi, M. and Yonezawa, A. (1992) An object-oriented parallel algorithm for the Newtonian N-body problem. Technical report, Dept of Information Science, University of Tokyo.
Yokote, Y. and Tokoro, M. (1986) The design and implementation of CONCURRENT SMALLTALK. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages and Applications, pp. 311–40.
Yonezawa, A. (1990) ABCL: An Object-Oriented Concurrent System. MIT Press, Boston, MA.