
An Extensible Program Representation for Object-Oriented Software¹

Brian A. Malloy, John D. McGregor, Anand Krishnaswamy, Murali Medikonda
Dept. of Computer Science, Clemson University, Clemson, SC 29634-1906

Abstract

An extensible representation for object-oriented programs is presented. It is based on the concept of a program dependency graph and elaborated to include both control flow and data flow information. The representation takes advantage of the basic incremental philosophy of the object-oriented approach to develop a more compact representation that is useful with practical programs. The basic approach reported here provides a static view of an object-oriented program. The approach can be expanded to provide dynamic information for tools such as interactive debuggers and other runtime tools. The outline of this extension is also presented.

1 Introduction

A number of software support tools and program analysis tools have been constructed using a program dependency graph (PDG) as the underlying representation for the program being analyzed. In this work we adapt the basic concept of a PDG to a representation for object-oriented software.

This paper presents several contributions. Most important is a representation of dynamically bound messages. Inclusion polymorphism is widely used in object-oriented design, and dynamic binding of messages to methods is an important aspect of the implementation of that type of polymorphism. No realistic program representation can be developed for object-oriented software without support for dynamic binding.

A second contribution is the incremental approach used in building the representation. A class can be defined in terms of other classes using the inheritance feature of an object-oriented language. Our representation follows this same incremental strategy. The representation for a class is built from the representation of its parent classes. This allows a more compact representation that in turn permits the representation of real programs of significant size.

A third contribution is that the program representation is built in layers that differentiate between the static, compile-time semantics of the program and the dynamic, execution-time behavior of the program. This allows tools that perform static analysis to use a simple representation and be compactly written. The representation easily expands to support the needs of runtime tools.

The representation described in this paper has been developed with several objectives in mind. The representation should be:

- sufficiently expressive to support a range of program analysis activities;
- layered, so that only the portion of the representation needed for a particular task is built;
- extensible, so that additional types of nodes and edges may be added to adapt to the special characteristics of a specific language.
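The incremental strategy described above can be illustrated with a minimal sketch. It is not the representation defined in this paper: the names `ClassRep`, `lookup`, and the string placeholders standing in for method graphs are all hypothetical. The sketch only shows the general idea that a class stores entries for the methods it defines or overrides and reaches everything else through its parent's representation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClassRep:
    """Hypothetical per-class representation (illustrative names only)."""
    name: str
    parent: Optional["ClassRep"] = None
    # Only methods defined or overridden in this class are stored here;
    # the string values are placeholders for per-method graphs.
    methods: dict = field(default_factory=dict)

    def lookup(self, selector):
        """Walk up the parent chain until a class stores the selector."""
        rep = self
        while rep is not None:
            if selector in rep.methods:
                return rep.methods[selector]
            rep = rep.parent
        return None


base = ClassRep("Base", methods={"draw": "graph:Base.draw",
                                 "move": "graph:Base.move"})
derived = ClassRep("Derived", parent=base,
                   methods={"draw": "graph:Derived.draw"})

# The derived class stores one entry but can still resolve both selectors.
print(len(derived.methods), derived.lookup("draw"), derived.lookup("move"))
```

Under these assumptions the inherited `move` entry is shared rather than copied, which is the source of the compactness claimed for an incremental representation.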
In the next section we provide the background needed to understand the representation. This includes a brief description of the basic concepts of object-oriented systems and the definitions and terminology of program dependency graphs. In section 3 we provide a framework for a complete, static and dynamic, representation of object-oriented programs. We present the basic syntax in the representation for the object-oriented concepts including polymorphism. Section 4 provides an example that illustrates the static portion of the representation. (In this paper we will limit ourselves to a subset of the full representation.) In section 5 we consider existing work on representations of object-oriented systems and compare these techniques with ours. We also present work with other forms of program dependency graphs. Finally, in section 6, we summarize the use of the representation and present an outline of further work exploring and applying the representation.

¹ This research was partially supported by a grant from COMSOFT, IBM and BNR.

2 Background

2.1 Object-oriented Concepts

We assume that the reader is familiar with the basic concepts of objects and will not present a comprehensive introduction here. Korson and McGregor [17] provide a survey of basic terms and concepts. We will discuss the special characteristics of object technology that influenced our development of the representation.

Classes and objects are central to the development of object-oriented software. A class is a definition of a concept with all of the data items and the operations on that data. A class generates objects that contain the data and operations for a specific instance of the concept. An object provides an enclosure for a specific set of data and the functionality that is a directly addressable syntactic unit in a program. For example, a Ford Aerostar class would define all of the characteristics of a certain brand of minivan. An object created from that class might be one specific minivan owned by one of the authors.

Object-oriented design techniques use an incremental approach to software development, which in turn supports an incremental approach to validation [13]. This incremental approach results in pieces of code being reused in several class definitions, and this in turn results in less source code for a given project. Families of programs can be grown by incrementally adding to the functionality of one system to build another.

New classes can be defined based on existing classes via inheritance and composition mechanisms. Inheritance is a relationship based on the specialization of an existing class to define a new concept that is a special case of the existing class. For example, instead of simply writing a definition of a Ford Aerostar, we can begin with a definition of a generic minivan and add specific information to describe the Ford Aerostar. This is useful if we must also define classes for other specific variants of the generic minivan. An operation on data defined within a class is referred to as a method.
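The class, object, and inheritance relationships described above can be sketched in a few lines. The minivan example comes from the text; the attribute and method names (`owner`, `seats`, `describe`) are assumptions introduced only for illustration.

```python
class Minivan:
    """Generic minivan: data and operations shared by every variant."""

    def __init__(self, owner, seats=7):
        self.owner = owner
        self.seats = seats

    def describe(self):
        return f"{self.owner}'s minivan with {self.seats} seats"


class FordAerostar(Minivan):
    """Specialization that adds only what is specific to the Aerostar."""

    make = "Ford"
    model = "Aerostar"

    def describe(self):
        # Reuse the inherited method and append the specific details.
        return f"{super().describe()} ({self.make} {self.model})"


# An object is one specific instance generated from the class.
van = FordAerostar(owner="an author")
print(van.describe())
```

Other variants of the generic minivan would be defined the same way, each reusing the code in `Minivan` rather than repeating it.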
A method in an object is invoked by another object sending a message to that object. Many of the relationships among classes are represented by messaging patterns between the objects defined by those classes. A useful result of object-oriented design practices is that fewer parameters are passed via the messages between objects. This reduced data flow results in a program that is simpler to understand.

Object-oriented design techniques use inclusion polymorphism to decouple many of the relationships between definitions. This approach essentially allows many different implementations of the same specification. Further, object-oriented languages allow the decision about which implementation to use to be delayed until the execution of the program. This dynamic binding of messages to methods supports the development of systems that are highly interactive, but the runtime binding does not allow static, compile-time resolution of messages between objects.

Objects retain their state across messages. This is an essential part of their description, since an object behaves differently over time in response to the messages that it receives. This characteristic leads most object-oriented analysis and design methods to include some model of the possible dynamic behaviors of an object. The concept of object state combined with the dynamic binding of messages implies that the complete behavior of an object-oriented system cannot be determined until runtime. Any technique proposing to provide a complete representation of such a system must account for this time-sensitive aspect of the system.
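The following sketch illustrates the two properties discussed above: a message whose target method is chosen only at runtime, and an object whose response depends on state retained from earlier messages. All class and function names here (`Vehicle`, `Truck`, `candidate_targets`, and so on) are hypothetical. The helper `candidate_targets` shows what a purely static view can offer under these assumptions, namely the set of implementations a message might bind to rather than the single one that will run.

```python
class Vehicle:
    def __init__(self):
        self.trips = 0  # state retained across messages

    def honk(self):
        raise NotImplementedError

    def drive(self):
        self.trips += 1
        return self.trips


class Minivan(Vehicle):
    def honk(self):
        return "beep"


class Truck(Vehicle):
    def honk(self):
        return "HONK"


def all_subclasses(cls):
    """Collect direct and indirect subclasses of cls."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


def candidate_targets(base, selector):
    """Static view: classes that supply their own method for selector."""
    classes = [base] + all_subclasses(base)
    return sorted(c.__name__ for c in classes if selector in vars(c))


def send_honk(vehicle):
    # The method that runs depends on the receiver's class at runtime.
    return vehicle.honk()


print(candidate_targets(Vehicle, "honk"))
print(send_honk(Minivan()), send_honk(Truck()))
```

Sending `drive` twice to the same object returns different values, which is the time-sensitive behavior a complete representation would have to account for.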

2.2 Program Dependency Graphs

A program dependency graph (PDG) [16] is a graphical representation of a program that encodes both control dependencies and data dependencies into a single structure. The graph contains nodes that repre-

S1:

R1

n = 2;

P2,C2: while(n