Parallel and Distributed Computing Using the Java Language Paradigm

Stella C.S. Porto

Departamento de Engenharia de Telecomunicações, Pós-graduação em Computação Aplicada e Automação, Universidade Federal Fluminense, Rua Passos da Pátria 156, 5o andar, 24210-240 Niterói, RJ, Brasil. [email protected], (021)620-7070 x.352 (Voice), (021)620-7070 x.328 (Fax)

João Paulo Kitajima

Departamento de Ciência da Computação, Universidade Federal de Minas Gerais, Caixa Postal 702, 30161-970 Belo Horizonte, MG, Brasil. [email protected], (031)499 5871 (Voice), (031)499 5858 (Fax)

Katia Obraczka

University of Southern California, Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA 90292, USA. [email protected], (310)822-1511 (Voice), (310)823-6714 (Fax)

Abstract

Parallel processing has emerged as a key enabling technology in modern computers, driven by the ever-increasing demand for higher performance, lower costs, and sustained productivity in real-life applications. At the same time, the Internet, whose exponential growth in the last 4 years can be greatly attributed to the ever-increasing popularity of the World-Wide Web, has become a virtually infinite source of information and processing resources. More recently, the Java programming language has been proposed to turn the Web into an active stream of resources by enabling code to be transferred over the network and executed remotely. This report is a research project proposal, whose main question is whether parallel and distributed application developers can make efficient use of the Internet and its resources through an active Web augmented with the Java programming paradigm. We plan to investigate how the Java programming paradigm can improve the processes of specification, development, execution, and performance analysis of parallel and distributed programs.

This document is organized as follows. Section 1 introduces the context of our work and presents some problems associated with parallel and distributed computing. Our research goals are described in Section 2. Section 3 presents a detailed description of background issues, in which we briefly review the state of the art in the fields of parallel/distributed computing and computer communication. Section 4 highlights research issues that hint at the complexity of this work. Finally, an up-to-date list of references closes this document.

1 Introduction and Motivation

Parallel processing technology is the outgrowth of five decades of research and technology advances in microelectronics, printed circuits, high-density packaging, advanced processors, memory systems, peripheral devices, communication channels, language evolution, compiler technology, operating systems, programming environments, and application challenges. The rapid progress made in hardware technology has significantly increased the economic feasibility of building a new generation of computers adopting a parallel processing paradigm. However, the major barrier preventing parallel processing from entering the production mainstream lies on the software and application side. Currently, it is still very difficult and painful to program parallel and vector computers. We need to strive for major progress in the software arena in order to create a user-friendly environment for high-performance computers.

For the past decade, the field of computer communications has been the subject of intense research and technology development. One of the most significant results of these years of research and development is the Internet, an interconnection of networks connecting computers all over the world, which has made world-wide distributed computing a reality. For instance, a researcher in Rio (Brazil) can launch a set of simulations using clusters of high-performance workstations located around the US, let them run overnight, and in the morning of the next day look at the resulting graphs displayed on his/her local PC. Projects such as ISI's Prospero Resource Manager [23] already try to take advantage of processing resources available through the Internet by enabling users to run sequential and parallel applications on processors connected to a local or wide-area network.

In the last 4 years, Internet resource discovery tools, in particular the World-Wide Web [2], have been responsible for the ubiquitous Internet phenomenon. The popularity of the Web can be greatly attributed to Mosaic, the Web's first browser, which provided a very user-friendly, "point-and-click" interface. With the Web, the Internet has seen an exponential growth of users, services, available resources, and traffic [24]. Commercial services such as America-On-Line and CompuServe may soon connect every home in the US to the Internet. To take advantage of this market, service providers are rapidly making their services available and advertising them through the Web. Traditional Web browsers, such as NCSA's Mosaic [25] and Netscape's Navigator [27], allow users to surf the Web by browsing and searching through the Web's information space. More recently, the Java programming language [26] has been proposed to augment the Web with remote execution capabilities. A Java-capable browser, such as Sun's HotJava or Netscape's Navigator 2.0, can transfer Java applets over the network to be executed on the user's local machine. One of Java's main goals is to transition the Web from a passive information discovery and retrieval tool to an active software environment.

Proponents and many users of the object-oriented computing paradigm claim that object-oriented languages provide more support for information hiding and software reuse than conventional languages. To explore whether object orientation can help reduce the difficulty of programming parallel machines, sequential languages have been extended to support parallel objects.
The Java language is similar to C and C++, optimized for object-oriented, distributed, multithreaded computing. However, Java was not designed to be a parallel language. While the Java language itself is fairly fixed, the class and other libraries may change as the needs of the Java programming and execution environment change. To support cooperative parallel objects, Java must be extended to incorporate concurrent object communication and management features.

The main goal of this research is to solve some of the challenges in parallel and distributed computing using a Java-featured active Web environment. We believe that the Java programming environment can help bridge the gap between the traditional software development process and parallel/distributed machines. In this sense, we propose to design and implement the following tools:

- Tools for developing parallel and distributed applications under a suitable programming paradigm;
- Tools for exploiting the parallelism of the underlying execution environment (distributed operating system and execution environment of a given compiler/interpreter);
- Tools for evaluating the performance of parallel/distributed applications on the corresponding platform.

We envision making contributions in both the scientific and technological arenas. We intend not only to build a software development environment, but also to analyze and propose new approaches to bridge the parallel/distributed software gap. These approaches should be independent of specific languages, browsers, architectures, and operating systems. Developing a tool or an environment is essentially a proof-of-concept exercise. Below, we place our research in the context of the state of the art in the fields of parallel and distributed computing and computer communication.
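Java's multithreading is the hook these extensions would build on: even without parallel language features, independent work can already be spread across threads (and, on a symmetric multiprocessor, across processors). The minimal sketch below is our own illustration, written in current Java syntax rather than the Java of this proposal's era; it splits an array sum between two threads.

```java
// Minimal sketch (our example): summing an array with two Java threads.
public class ParallelSum {
    static long partialSum(int[] data, int from, int to) {
        long s = 0;
        for (int i = from; i < to; i++) s += data[i];
        return s;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = 1;

        long[] results = new long[2];
        int mid = data.length / 2;
        // Each thread works on its own half of the array.
        Thread t1 = new Thread(() -> results[0] = partialSum(data, 0, mid));
        Thread t2 = new Thread(() -> results[1] = partialSum(data, mid, data.length));
        t1.start(); t2.start();
        t1.join(); t2.join();    // wait for both workers to finish
        System.out.println("sum = " + (results[0] + results[1]));
    }
}
```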

Parallel/distributed program developing issues. Nowadays, it is impossible to talk about high-performance computers without considering parallel systems. A countless number of problems can be solved faster if independent and cooperative steps (conducted toward the solution) are executed simultaneously. For computer-solved problems in particular, things are no different. Effective simultaneous task execution is only achieved by having more than one processor (considering the processor as the processing element inside a computer). Therefore, since the early days of electronic computers, several parallel machine projects have been proposed in which parallelism was exploited at different levels: mainframes with a main processor connected to different I/O processors, vector supercomputers, synchronous array processors and, finally, multiprocessors and multicomputers composed of interconnected independent processors, sharing or not a global memory. More recently, LANs (Local Area Networks) have also been used as parallel virtual machines.

Parallel computer architectures and distributed systems are evidently converging. Except for some architectures, parallel computers are considered clusters of workstations and vice-versa (e.g., IBM Scalable Power systems). Each processor has an Internet address, and control of this processor can be given to the final programmer. On the other hand, symmetric multiprocessor workstations have a single address: their processors share a common memory, and it is up to the operating system to manage these processors.

The exploitation of parallelism has created a new dimension in computer science. In order to move parallel processing into the mainstream of computing, it is necessary to make significant progress in three key areas: computation models for parallel computing, interprocessor communication in parallel architectures, and system integration for incorporating parallel systems into general computing environments [9]. Software support is needed for the development of efficient programs in high-level languages. The difficulty in parallel programming is due to the fact that existing languages were originally developed for sequential computers. Programmers are often forced to program hardware-dependent features instead of programming parallelism in a generic and portable way. Ideally, we need to develop a parallel programming environment with architecture-independent languages, compilers, and software tools. In developing a parallel language, we aim for efficiency in its implementation, portability across different machines, compatibility with existing sequential languages, expressiveness of parallelism, and ease of programming. Most systems until now have chosen the language extension approach due to compatibility issues [9].

Besides parallel languages and compilers, the operating system must also be extended to support parallel activities. An effective operating system manages the allocation and deallocation of resources during the execution of user programs. Mapping is a bidirectional process matching algorithmic structure with hardware architecture, and vice-versa. Efficient mapping will benefit the programmer and produce better source codes. The mapping of algorithmic and data structures onto machine architecture includes processor scheduling [13, 16, 17], memory maps, interprocessor communication, etc. These activities are usually architecture dependent. Optimal mappings are sought for various computer architectures, and the implementation of these mappings relies on efficient compiler and operating system support. Parallelism can be exploited at algorithm design time, at programming time, at compile time, and at run time. Techniques for exploiting parallelism at these levels form the core of parallel processing technology [9]. Load balancing, fault tolerance, and homogeneous interfaces, on the other hand, form the core of distributed systems technology [20].

The object programming approach has many benefits for large-grained parallelism, and thus a number of object programming languages were developed during the 1980s. In many of these languages, for example, "objects" are implemented as processing agents which communicate by sending messages. However, an effective model for the object-oriented parallel processing paradigm and its associated development tools has not yet settled in step with the development rate of distributed-memory MIMD systems.

Performance evaluation of parallel/distributed applications. The ideal performance of a computer system demands a perfect match between machine capability and program behavior. Machine capability can be enhanced with better hardware technology, innovative architectural features, and efficient resource management. However, program behavior is difficult to predict due to its heavy dependence on application and run-time conditions. There are also many factors affecting program behavior, including algorithm design, data structures, language efficiency, programmer skill, and compiler technology. It is impossible to achieve a perfect match between hardware and software by merely improving only a few factors without touching the others [9]. In this context, performance may be defined as a composition of two main factors: application execution time and resource utilization. Notice that performance evaluation is necessary during the entire life of a distributed application project, especially in the beginning, when performance problems should be detected as soon as possible so that less costly changes can be made. In order to perform this initial evaluation, friendly modeling and experimental tools should be available.
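As a concrete illustration of the execution-time component of this definition, the hedged sketch below (our own example, not part of the proposal; the workload, sizes, and thread counts are placeholders) times the same dummy workload under different thread counts; the ratio of the measured times gives a rough speedup figure.

```java
// Hedged sketch (our example): wall-clock timing of a dummy workload.
public class TimingHarness {
    static final double[] sink = new double[1];   // keeps results observable

    // Dummy CPU-bound workload split evenly across nThreads threads.
    static void runWorkload(int nThreads) throws InterruptedException {
        final int perThread = 40_000_000 / nThreads;
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                double x = 0;
                for (int j = 0; j < perThread; j++) x += Math.sqrt(j);
                synchronized (sink) { sink[0] += x; }
            });
            workers[i].start();
        }
        for (Thread w : workers) w.join();        // wait for completion
    }

    public static void main(String[] args) throws InterruptedException {
        for (int threads = 1; threads <= 4; threads *= 2) {
            long start = System.nanoTime();
            runWorkload(threads);
            long ms = (System.nanoTime() - start) / 1_000_000;
            // Speedup = time(1 thread) / time(n threads).
            System.out.println(threads + " thread(s): " + ms + " ms");
        }
    }
}
```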

Making use of distributed resources across the Internet. Meanwhile, in the telecommunications industry, another technology has taken hold that will have a profound impact on computers. Fiber optics and high-bandwidth electronics together have enabled telecommunications to shift from a technology designed for 10Hz signals to one designed for 1GHz signals, and to achieve the new bandwidth at low cost. The cost of delivering conventional voice channels using the new technology may be 1000 times less than with the old technology. Moreover, the new technology is available virtually overnight, so that we should be able to reap these benefits in a few years instead of a few decades. Just as computing technology has achieved an astounding advance over a short period of time, the rate of advance in communications technology is even greater [19].

What will be the impact on computers when we have access to multi-GHz fiber optic networks? If we do business as usual, we will feel the impact mainly through a lower cost to operate computers in a network and a higher bandwidth within the network. But why do business as usual? Digital communications networks are much more likely to create new product opportunities. Systems that formerly interchanged characters and documents can be supplanted by systems that interchange individual images and video sequences. Starting with this premise, the role of the computer evolves from a processing center to an information server [19].

Finally, Internet technology gives us the possibility of achieving high performance, putting together both the parallel and distributed worlds, using standard interfaces (such as browsers and network languages like Java). The Internet is today available to the common user, who eventually will not be a passive entity "surfing" the Web, but will be able to develop distinct sequential, parallel, and/or distributed applications with different computational demands.

2 Goals

In this work, we focus on parallel application development and performance evaluation. We believe that the gap between software and hardware is so large that hardware improvement by itself will not yield substantial and sufficient improvements. In this direction, we want to investigate how parallel/distributed application developers, under a networked object-oriented programming paradigm (which characterizes the Java language), can make efficient use of Internet technology and its available resources through a user-friendly environment. Thus, our goals may be stated as follows.

- Goal 1: Propose a development environment to support the design and implementation of parallel/distributed programs that run on and profit from the resources available in network environments, eventually under the object-oriented paradigm of an interpreted language such as Java. This development environment requires:
  - An abstract parallel paradigm involving distributed memory for explicit parallelism, and objects for implicit parallelism (abstraction, encapsulation, program structure support, and reusability);
  - Textual and/or graphical languages to express the parallel algorithm and its (computer-aided) translation to a parallel program in a target language for a target machine (if it is not interpreted). This feature involves the development of a user interface (textual or graphical), possibly executing inside a browser.
- Goal 2: Develop or adapt a run-time environment to support the execution of the parallel programs generated by the development environment described above. The run-time environment requires:
  - Task management allowing local and remote execution;
  - Memory management supporting local and remote memory accesses;
  - Network management supporting message exchange;
  - Intra- and inter-program load balancing, following static or dynamic approaches [11].
- Goal 3: Develop a performance measurement tool for the proposed environment. The performance evaluation tool requires:
  - A tool for estimating the performance of different parallelization strategies for the same basic algorithm;
  - A tool for monitoring the performance of a program during its execution;
  - A visualization tool to help in the entire program development process.
- Other goals:
  - National and international cooperation;
  - Contribution to the research programs of the participating institutions;
  - Elaboration of M.Sc. dissertations and Ph.D. theses;
  - Keeping the local scientific community up to date with state-of-the-art research and technology in the areas relevant to this work.

3 Background and Related Work

This section presents the background and related work of our research. The goal is to detail the trends within each research topic of this project.

3.1 Parallel and distributed computing

Parallel Computers. Sequential computing has benefited from the fact that there has been a single model of computation, widely known as the von Neumann model, on which architects and software and algorithm designers have based their work. Parallel computing has not been so fortunate, in that there has not been a single model of computation; as a result, a variety of different architectures and programming paradigms have been proposed [15]. The main issue affecting the architectural model for parallel computers is how to organize multiple processors to execute in parallel. This has given rise to a wide range of models, chief of which is the multiple instruction stream, multiple data stream (MIMD) model [7]. MIMD systems employ multiple processors that execute multiple independent instruction streams (MI) while accessing data autonomously (MD). The design of such systems requires careful consideration of the number of processors and their interconnection network topology [15]. Other models exist: SIMD (Single Instruction stream, Multiple Data stream: synchronous operations working on different data sets), synchronous systolic arrays, pipelining (available on vector machines and recent microprocessors), and dataflow machines (whose control flow is driven by the data flow). However, MIMD has been considered the most flexible of these models, itself supporting a plethora of distinct implementations [6].

The shared-memory MIMD model (multiprocessors), where the processors are connected to a number of memory modules to form a common global shared memory, was the basis of the earliest MIMD machines. One of the major problems with this model is that severe memory contention can occur when processors try to access data residing in the same memory module. This contention problem can be solved by using the distributed-memory model (multicomputers), where the memory is distributed among processors; if data needs to be accessed from the local memory of another processor, it is transferred using the interconnection network. Such a configuration is scalable to higher orders of parallelism than the shared-memory model. However, initial programming experience indicates that shared-memory systems are easier to use and to program than distributed-memory systems, which require more intricate programming [15].

One trend that merges these two approaches is the development of distributed shared memory, or NUMA (Non-Uniform Memory Access) machines, where multicomputers have specialized software/hardware to manage distributed memories as if they were a single shared memory. Another possibility is the development of clusters of symmetric multiprocessors (SMPs). Symmetric multiprocessors share a common memory, and the processors are mutually equivalent: there is no master-slave relation. Although not scalable, these machines perform well for high-granularity parallel applications, and currently many workstations and PCs are symmetric multiprocessors. There are studies aiming to overcome the scalability problem by substituting a crossbar switch for the internal bus. A cluster is then a network of symmetric multiprocessors: inside an SMP, you have a shared-memory model; among SMPs, you have a distributed-memory model. Indeed, workstation clusters offer many benefits when compared to other computing solutions [21].
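The memory contention mentioned above is easy to provoke on any shared-memory machine. In the sketch below (our own illustration, not from the proposal), eight threads update one shared counter; correctness requires every update to serialize on a single lock, which is exactly the kind of hot spot a shared memory module becomes.

```java
// Sketch of shared-memory contention: many threads, one shared location.
public class Contention {
    private static long counter = 0;                 // shared state
    private static final Object lock = new Object(); // protects counter

    public static void main(String[] args) throws InterruptedException {
        Thread[] ts = new Thread[8];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) {
                    synchronized (lock) {            // every update contends here
                        counter++;
                    }
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        System.out.println("counter = " + counter);  // always 800000
    }
}
```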

Parallel Programming Paradigms. As noted above, there are different kinds of parallel machines. This is due to the fact that there are different ways, different approaches, to solve a certain problem in parallel. It is important to observe that this is true not only in a parallel context, but also in a sequential one. Naturally, this multiplicity of paradigms is reflected in parallel software, where we have many different programming paradigms, such as: shared memory, data parallel, SPMD (Single Program, Multiple Data: replication of an object file on distinct processors), message passing, functional, object-oriented, and pipeline. All these paradigms have different structural and behavioral features working at different abstraction levels.

Among these paradigms, object-oriented programming has grown from a radical concept of the 1960s to routine practice among serial programmers in the 1990s. It is defined as any programming technique in which the primary components of the application program are objects. Objects are defined as instances of a class, and a class is defined as a collection of procedures called methods, and the data types they operate on. The concept of object programming is strongly related to the concepts of data types, restricted operations on types, and information hiding in the form of encapsulation of both data and the implementation of the operations that access the data. Encapsulation is done in a variety of ways in different programming languages. In all cases, the concept of information hiding is used to prevent access to an object's state by other objects, except through the methods defined on the object. The best way to achieve multiple copies of an object is to use the concept of inheritance: an object inherits the methods of its class when it is instantiated, so two or more objects of the same type share one copy of the code for their methods. Inheritance can be cumulative if we organize the classes in an inheritance hierarchy [12].

There are several approaches to parallel and distributed object orientation. For example:

Concurrent OOP. In this model, objects are dynamically created and manipulated. Processing is performed by sending and receiving messages among objects. Concurrent programming models are built up from low-level objects such as processes, queues, and semaphores into high-level objects like monitors and program modules. The popularity of concurrent object-oriented programming (COOP) is attributed to three application demands [9]:

1. The increased use of interacting processes by individual users, such as the use of multiple X windows.
2. Workstation networks have become a cost-effective mechanism for resource sharing and distributed problem solving.
3. Multiprocessor technology has advanced to the point of providing supercomputing power at a fraction of the traditional cost.

As a matter of fact, program abstraction leads to program modularity and software reusability, as is often found in OOP. Objects are program entities which encapsulate data and operations into single computational units. It turns out that concurrency is a natural consequence of the concept of objects; in fact, the concurrent use of coroutines in conventional programming is very similar to the concurrent manipulation of objects in OOP. The development of concurrent object-oriented programming provides an alternative model for concurrent computing on multiprocessors or on multicomputers. The various object models differ in the internal behavior of objects and in how they interact with each other.

The Actor Model. COOP must support patterns of reuse and classification, for example through the use of inheritance, which allows all instances of a particular class to share the same property. The actor model developed at MIT is another framework for COOP. Actors are self-contained, interactive, independent components of a computing system that communicate by asynchronous message passing. In the actor model, message passing carries the semantics. The basic actor primitives are [9]:

1. Create: creating an actor from a behavior description and a set of parameters.
2. Send-to: sending a message to another actor.
3. Become: an actor replacing its own behavior by a new behavior.

State changes are specified by behavior replacement. The replacement mechanism allows one to aggregate changes and to avoid unnecessary control-flow dependences. Concurrent computations are visualized in terms of concurrent actor creations, simultaneous communication events, and behavior replacements. Each message may cause an object (actor) to modify its state, create new objects, and send new messages. Concurrency control structures represent particular patterns of message passing. The actor primitives provide a low-level description of concurrent systems; high-level constructs are also needed for raising the granularity of descriptions and for encapsulating faults. The actor model is particularly suitable for multicomputer implementations.

The Server Paradigm. A server is a task, or cluster of tasks, which implements the methods of an object as cooperating subtasks. Servers execute the methods of an object in parallel. They provide a natural way to implement object programs with a minimum of synchronization and locking. This means that servers are relatively self-contained programs that implement one or more tasks of a parallel program. A server is an object in motion. In the server model, a parallel program is a collection of miniature programs running independently, but occasionally interacting with one another by message passing.

Perhaps the most natural extension of object programming is that of the server paradigm. In this view of parallel programming, every object is a miniature program that communicates with other objects by message passing. When objects are activated in this manner, we call them servers [12]. Other approaches exist. It is also important to remark that, as in imperative parallel programming, different strategies can be employed to solve problems using concurrent object orientation. According to Hwang [9], three common patterns of parallelism have been found in the practice of COOP:

1. Pipeline concurrency involves the overlapped enumeration of successive solutions and the concurrent testing of those solutions as they emerge from an evaluation pipeline.
2. Divide-and-conquer concurrency involves the concurrent elaboration of different subprograms and the combining of their solutions to produce a solution to the overall problem. In this case, there is no interaction between the procedures solving the subproblems.
3. Cooperative problem solving, where all objects interact with each other; intermediate results are stored in objects and shared by passing messages between them.

Object orientation means more abstraction: ideally, solutions should be problem oriented and not architecture oriented.
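A hedged Java sketch of the three actor primitives follows. It is our own illustration, built on concurrency utilities from later Java versions rather than on anything in the proposal, and the names (Actor, send, become) are ours: each actor owns a mailbox drained by its own thread, and become swaps the behavior applied to subsequently processed messages.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Sketch of create / send-to / become with a mailbox per actor.
public class Actor {
    private final BlockingQueue<Object> mailbox = new LinkedBlockingQueue<>();
    private volatile Consumer<Object> behavior;      // replaced by become()

    Actor(Consumer<Object> initial) {                // "create"
        this.behavior = initial;
        Thread t = new Thread(() -> {
            try {
                while (true) behavior.accept(mailbox.take());
            } catch (InterruptedException e) { /* actor terminated */ }
        });
        t.setDaemon(true);
        t.start();
    }

    void send(Object msg) { mailbox.add(msg); }                 // "send-to"
    void become(Consumer<Object> next) { behavior = next; }     // "become"

    public static void main(String[] args) throws InterruptedException {
        Actor echo = new Actor(m -> System.out.println("got: " + m));
        echo.send("hello");                          // asynchronous send
        // Behavior replacement applies to messages processed afterwards.
        echo.become(m -> System.out.println("GOT: " + m));
        echo.send("world");
        Thread.sleep(100);                           // let the mailbox drain
    }
}
```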

Parallel Software. The construction of distributed-memory MIMD systems is now commercially viable, and it is feasible to build systems involving large numbers of processors. However, consistent with previous trends, effective software methods for programming such systems lag behind the hardware developments, resulting in their inefficient use in many instances [15]. Four main approaches have been suggested as a basis for designing programming languages that would promote the wider use of parallel systems:

1. invent a completely new language (e.g., Occam2);
2. use a coordination language (e.g., Linda);
3. enhance an existing sequential language compiler to detect parallelism (e.g., Cray Fortran); and
4. introduce features into an existing language that deal explicitly with parallelism (e.g., PVM).

The last approach constitutes the essence of the underlying software platform we focus on while using the Java paradigm (see Section 3.2). Introducing features into an existing language that deal explicitly with parallelism should enable existing software to be adapted and transferred to parallel machines, where appropriate, by existing programmers. This category is the one receiving the most attention at present, a consequence of the fact that the majority of users of parallel machines are scientists and engineers whose main working language has been Fortran. Many of the extensions have been developed by different groups using the same language base, leading to the definition of nonstandard variants; this makes the production of a standard for such languages difficult. Although the variants are nonstandard, the underlying language is a standard: for example, the PVM library can be used by Fortran77 and C programs. Moreover, the production of a standard for such languages is not impossible (e.g., MPI, the Message Passing Interface, a standard for message passing libraries). Considering that Java is an object-oriented language, the extended features eventually introduced into it must follow the trend of the concurrent object paradigm. In this sense, the main characteristics of an OO approach, such as encapsulation, information hiding, inheritance, and the potential for software reuse, should be reinforced. Nevertheless, however transparent it may be made, object communication in distributed systems is ultimately implemented at its lowest level through message passing schemes.
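To make the fourth approach concrete in Java terms, the sketch below layers a PVM-style send/receive pair on top of sockets and object serialization. It is our own illustration under stated assumptions: the API names are invented, and both "tasks" run inside one JVM for brevity (in practice, the sender and receiver would be separate processes on separate hosts).

```java
import java.io.*;
import java.net.*;

// Hedged sketch: point-to-point message passing in the spirit of
// PVM-style libraries, expressed with the standard Java socket API.
public class MessagePassingDemo {
    static void send(String host, int port, Serializable msg) throws IOException {
        try (Socket s = new Socket(host, port);
             ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream())) {
            out.writeObject(msg);    // marshal and ship the message
        }
    }

    static Object receive(ServerSocket server) throws IOException, ClassNotFoundException {
        try (Socket s = server.accept();
             ObjectInputStream in = new ObjectInputStream(s.getInputStream())) {
            return in.readObject();  // block until a message arrives
        }
    }

    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0);   // "receiver" task
        int port = server.getLocalPort();
        new Thread(() -> {                           // "sender" task
            try { send("localhost", port, "work unit 42"); }
            catch (IOException e) { e.printStackTrace(); }
        }).start();
        System.out.println("received: " + receive(server));
        server.close();
    }
}
```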

Distributed Operating Systems. According to Tanenbaum [20], the development of cheap, powerful microprocessors and of high-speed computer networks has made it feasible and easy to put together computing systems composed of large numbers of processors connected by a high-speed network. These systems are usually called distributed systems, in contrast to the earlier centralized systems based on a single CPU, its memory, peripherals, and some terminals. As remarked above, these systems are being employed as shared parallel computers, and operating systems for these environments can be called distributed operating systems.

Distributed systems involve two essential features: (i) a collection of independent computers, and (ii) the appearance to end users of a single computer. Therefore, as the work environment for this project, we consider independent computers, implying a distributed-memory model. The "single computer" idea is less restrictive in our context: distributed tasks will be executed remotely, but the processor each one will execute on may not be known in advance, relieving the user of concerns about the underlying computer system. There will always be an available processor to execute a given task; it is the role of the distributed operating system to support this transparency.

Distributed operating systems are software components with special functions. They can be programmed using the approaches presented in the previous paragraphs on parallel paradigms and parallel software. Above all, they should implement correctly and efficiently the interface between the raw hardware and the parallel application software. Communication, resource allocation, task scheduling, lightweight processes: all of these are handled by the operating system. Current commercial operating systems support distributed applications: Solaris, Windows NT, Unix BSD4.4, and Linux all support networking and service distribution. Solaris and Windows NT also support multiprocessing, allowing threads (lightweight processes) to execute on different processors of a symmetric multiprocessor (SMP).

3.2 Networking and Communications Infrastructure

Due to the rapid evolution of communications and networking technology, parallel computer architectures and distributed systems are converging. Not long ago, parallel applications typically ran on multiprocessor machines; more recently, it has become common for a cluster of workstations connected through a LAN to cooperate in executing a parallel task. As discussed in Section 1, fiber optics and high-bandwidth electronics have shifted telecommunications toward multi-GHz bandwidths at low cost, and the rate of advance in communications technology now exceeds even that of computing technology [19]. These advances in communication infrastructure allowed the development of the Internet.

The Internet has made wide-area distributed applications a reality. Take, for example, the Domain Name System (DNS) [5], the Internet's name service: DNS servers scattered throughout the world maintain portions of the DNS naming database and cooperate in resolving Internet host names and addresses. Below, we overview the evolution of the Internet: from electronic mail and file transfer to the Web and Java.
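From a Java program, the DNS resolution just described is a single library call. A minimal sketch (our example; the host name is arbitrary):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Resolve a host name to an Internet address via the local resolver/DNS.
public class Resolve {
    public static void main(String[] args) throws UnknownHostException {
        InetAddress addr = InetAddress.getByName("www.example.org");
        System.out.println(addr.getHostName() + " -> " + addr.getHostAddress());
    }
}
```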

Internet Evolution. In the early 1970s, data communication had already become a fundamental part of computing. At the time, most existing networks were independent entities dedicated to the needs of isolated groups of people. The Defense Advanced Research Projects Agency (DARPA) of the US Department of Defense, which was the primary funding agency for packet-switched network research and had pioneered many ideas in packet switching with its ARPANET, was one of the first to recognize the need for a universal network. The big challenge was to deal with the hardware and protocol heterogeneity of the existing networking infrastructure, since replacing it was economically impractical. The only other way was to accommodate heterogeneity by interconnecting the existing networks, or internetworking. By the late 1970s, the DARPA-sponsored TCP/IP Internet protocol suite had become the de facto standard for interconnecting networks and routing traffic.

The connected Internet started around 1980, when DARPA began converting its research networks to use the new TCP/IP protocols. The ARPANET then became the backbone of the Internet and was used for many of the early TCP/IP experiments. A couple of years later, the ARPANET was split into two separate networks: one for research use, which kept the name ARPANET, and one for military communication, known as the MILNET.

To encourage universities and research institutions to adopt the TCP/IP protocol suite, DARPA made an implementation available at low cost. At that time, most university computer science departments were running a version of the Berkeley UNIX operating system, or BSD UNIX. DARPA was able to reach more than 90% of the research community by funding a TCP/IP implementation for BSD UNIX. Besides the TCP/IP protocols, this BSD UNIX release also included a set of network services similar to the standard UNIX commands. It also provided a new operating system abstraction known as the socket, which allows application programs to access the kernel-level communication protocols. The socket abstraction, implemented as a generalization of the I/O mechanism, gave programmers easy access to the TCP/IP protocols.

The success of the TCP/IP technology among the computer science research community motivated other groups to use it. The US National Science Foundation (NSF) took an active role in expanding the TCP/IP Internet to reach other scientific communities. In 1986, it funded a new long-haul backbone, the NSFNET, that eventually connected all its supercomputer centers to the Internet. It also provided funds for many regional networks in the US, each of which interconnected major local scientific institutions among themselves and to the Internet. Within seven years of its inception, the Internet already spanned hundreds of individual networks located throughout the US and Europe, connecting nearly 20,000 computers at universities, laboratories, and government agencies.

The size and use of the Internet continued to grow much faster than expected. By 1990, the connected Internet included 3,000 active networks and over 200,000 hosts. The rapid growth resulted in scalability problems not anticipated in the original design. For instance, the names and addresses of all computers connected to the Internet were kept in a single file, which was manually updated and then distributed to every Internet site. This file was consulted every time a host name had to be resolved, that is, translated to its corresponding Internet address. By the mid 1980s, it was clear that a centralized database would not scale. The Domain Name System (DNS) was proposed as a solution to the Internet naming problem: the Internet name space was partitioned, distributed, and replicated across several DNS servers, which are responsible for mapping host names to Internet addresses.

In terms of its network and transport level services, the TCP/IP Internet provides two basic types of service:

- The connectionless packet delivery service provides best-effort delivery of packets between a source and a destination. It guarantees neither reliability nor in-order delivery. This service is useful to applications that do not need to pay the overhead incurred by the connection-oriented service described below.
- The reliable stream transport protocol provides the "virtual circuit" abstraction in a datagram network like the Internet. At the transport level, the communication end points establish a connection to send a data stream reliably.

These services were quite adequate for the Internet of the 1980s, in which most of the available services were electronic mail, remote file transfer, and remote login. More recent distributed real-time applications, such as multimedia conferencing, have more demanding service requirements. Moreover, many of these applications perform multipoint communication with several senders and several receivers. This trend in modern, distributed applications has motivated extensive research into a new Internet architecture and service models whose goal is to accommodate multipoint communication and different qualities of service.
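Both services are directly visible in Java's standard library. The hedged sketch below (our example; it assumes servers are actually listening on the loopback ports shown) sends one best-effort datagram and then writes the same bytes over a reliable TCP connection.

```java
import java.net.*;

// The two transport services seen from Java: UDP datagram and TCP stream.
public class TwoServices {
    public static void main(String[] args) throws Exception {
        byte[] payload = "ping".getBytes();

        // Connectionless, best-effort delivery: one self-contained packet,
        // with no guarantee of arrival or ordering.
        try (DatagramSocket udp = new DatagramSocket()) {
            udp.send(new DatagramPacket(payload, payload.length,
                     InetAddress.getByName("localhost"), 9876));
        }

        // Reliable stream: establish a connection, then write bytes that
        // arrive in order or not at all.
        try (Socket tcp = new Socket("localhost", 9877)) {
            tcp.getOutputStream().write(payload);
        } catch (ConnectException e) {
            System.out.println("no TCP server on port 9877: " + e.getMessage());
        }
    }
}
```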

Internet Resource Discovery Services and the Web. The Internet has become a virtually infinite source of information and computational resources. Resource discovery services have proliferated to help users locate and retrieve resources available through the Internet. According to [14], discovery tools can be classified as browsing or indexing tools. Browsing tools organize their information space as a directed graph, and users find relevant information while navigating that space; Gopher and the WWW are probably the best known browsing tools. Indexing tools, on the other hand, organize searchable information into indexing databases and respond to user queries by searching their databases for relevant information. The Wide Area Information Servers (WAIS) tool is an example of an indexing tool.

Currently, the Web is undoubtedly one of the most popular Internet services. The World Wide Web (WWW) [2], or the Web, originally developed at CERN (the European Laboratory for Particle Physics) in Switzerland, merges the techniques of information discovery and hypertext. The WWW organizes data into a distributed hypertext, where nodes are either full-text objects, directory objects called Web pages, or indices. The WWW also supports full-text searches over documents stored at a particular WWW server.

The WWW architecture is based on the client-server model. The WWW client provides users with a hypertext-like browsing interface. Besides its native HyperText Transfer Protocol (HTTP), WWW clients understand FTP and the Network News Transfer Protocol (NNTP). FTP is used for accessing file archives on the Internet, where file directories are browsed as hypertext objects. NNTP allows access to Internet news groups and news articles; news articles may contain references to other articles or news groups, which are represented as hypertext links. HTTP allows document retrieval and full-text search operations. HTTP runs on top of TCP and maps each request to a TCP connection. HTTP objects are identified by a URL (Uniform Resource Locator), which includes the protocol type, the corresponding server's name, and the path name of the file where the object's contents reside. Parts of documents can also be specified. If a search operation is requested, the object's URL carries the set of specified keywords instead of a path name.

Current implementations of the HTTP protocol understand plain text, simple hypertext (the HyperText Markup Language, HTML), and the Common Gateway Interface (CGI) [29]. A plain HTML document that the Web server daemon retrieves is static: it exists in a constant state, a text file that does not change. A CGI program, on the other hand, is executed in real time on the server, so it can output dynamic information. Each time a client requests a URL corresponding to a CGI program, the server executes it and sends its output to the client. CGI programs made the Web more dynamic, since they allow servers to generate the data required by the client on demand. The next step towards a dynamic Web is to allow the execution to happen on the client side. That is exactly the gap that the Java programming environment was designed to fill.
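Retrieving an HTTP object from Java takes a few lines with the standard java.net.URL class. A minimal sketch (our example; the address is arbitrary):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

// Fetch an HTML page over HTTP and print it line by line.
public class Fetch {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.example.org/index.html");
        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(url.openStream()))) {
            String line;
            while ((line = in.readLine()) != null) System.out.println(line);
        }
    }
}
```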

What is Java? Java provides a completely new way to think about distributed computing. Just as desktop computers freed individuals from depending upon a single mainframe for everyday work, Java frees client computers on the Internet from depending upon host computers for the execution of dynamic content. Up to now, everything presented in a Web page was completely static: the electronic version of a printed document. Java adds functionality by allowing the execution of code that can be distributed across the Internet in a portable, robust, secure, high-performance environment [18].

The Java architecture describes how Java works and the basics of the underlying design. The Java environment refers to the programs used to run Java programs. Java is an interpreted language, and therefore needs a run-time system on every computer on which its applications are to run. This run-time system, or interpreter, can exist either inside other programs, such as HotJava or a WWW browser, or stand-alone. By setting up and using these programs, individuals are able to execute Java programs that exist on Web pages or are downloaded from an FTP site [18]. Java is also a programming language similar to C and C++, but completely new, and optimized for object-oriented, distributed, multithreaded computing. While the Java language is fairly fixed, the class libraries and other features change as the Java environment evolves [18].
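The downloadable-code mechanism described here is what the applet API exposed: a browser fetches the compiled class, referenced from an HTML page, and drives it through its lifecycle methods on the user's machine. A sketch of the classic form follows; note that the applet API was deprecated and later removed from current JDKs, so this is a historical illustration rather than current practice.

```java
import java.applet.Applet;
import java.awt.Graphics;

// Classic applet of the era: the browser downloads HelloApplet.class
// (via an <applet> tag) and calls paint() on the user's local machine.
public class HelloApplet extends Applet {
    @Override
    public void paint(Graphics g) {
        g.drawString("Hello from code fetched over the Web!", 20, 20);
    }
}
```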

Object-oriented computing is the philosophy that software should act as individual agents, each encapsulating a specific function that can be loaded dynamically when needed and otherwise taking care of itself, either providing dynamic interaction or working in the background to support the application that called it. There are in fact several different protocols on the market that attempt to provide this kind of service: IBM's System Object Model (SOM), CI Labs' OpenDoc, Novell's AppWare Data Bus (ADB), Taligent's CommonPoint, and NeXT's Portable Distributed Objects (PDO). However, each of these models suffers from one or more deficiencies that make it difficult to implement in a heterogeneous networked environment, where an object can exist on a Sun SPARCstation in California and need to be run on a Pentium workstation in Maine [18].

The Java language is an attempt to provide a truly object-oriented, portable, robust, secure, high-performance development environment for distributing dynamic content over the Internet. The Java language changes the passive nature of the Internet and the WWW by allowing architecturally neutral code to be dynamically loaded and run on a heterogeneous network of machines such as the Internet. Java provides this functionality by incorporating the following features into its architecture, which make Java a promising candidate for becoming a major protocol for the Internet in the near future.

Portable. Java can run on any machine to which the Java interpreter has been ported.

Robust. The features of the language and run-time environment ensure that the code is well behaved. This comes primarily as a result of the push for portability, and of the need for solid applications that will not bring down a system when a user stumbles across a home page with a small animation.

Secure. In addition to protecting the client against unintentional attacks, the Java environment must protect it against intentional ones as well.

Object-oriented. The language is object-oriented at its foundations, and allows the inheritance and reuse of code both in a static and a dynamic fashion.

Dynamic. The dynamic nature of Java, an extension of its object-oriented design, allows for run-time extensibility.

High-performance. The Java language supports several high-performance features, such as multithreading, just-in-time compilation, and native code usage.

Easy. The language itself can be considered a derivative of C and C++, so it is familiar. At the same time, the environment takes over many error-prone tasks from the programmer, such as pointers and memory management.

Java is aimed at being the universal standard for the transfer of dynamic, executable content over the Web. This has benefits for the content developer, the provider, and the end user, and its potential applications are quite diverse:

- Java could be used to provide stand-alone applications on an as-needed basis, or for upgrading existing applications.
- Java can be used as the principal engine for behaviors and interaction in the next version of the Virtual Reality Modeling Language (VRML) [30].
- Java could be used to program intelligent agents that comb the Internet for interesting or necessary information.

This last application is probably the one of most interest to our project. Perhaps one of the biggest problems with the Internet is the difficulty of finding information. Intelligent agents could cruise through the Internet, gathering data that you either sent them out specifically to get, or that they collect as a routine service. The intelligent agents would need to leave the confines of one's computer and venture out. They would actually run on diverse systems, moving from database to database, attempting to gather the information you need. Java provides the perfect language for implementing such a system. Because it is dynamically extensible and portable, a Java agent could move from system to system, no matter what underlying hardware platform is being used, link in with a running database, collect data, and move on. Java is able to provide a secure environment within which such intelligent agents can run, without risking the security of the host system itself.

3.3 The need for a programming support tool

Tools in Parallel/Distributed Environments. Today, parallel machines and programming models are evolving rapidly, and there are no clear winners. Therefore, applications frequently must be ported to take advantage of improved hardware and software. To reduce the porting effort, many tools have been developed to support portable parallel programming. However, many users in the high-performance community still use machine-specific programming interfaces.

The meaning of portability seems to have been implicitly understood until real-world parallel applications began to be developed. Portability often comes at the price of reduced performance and of tool-introduced bugs that affect program correctness. This results in difficult trade-offs for the developers of parallel programs. The implicit understanding of portability is no longer sufficient, and we need criteria for evaluating the tools that support portable, parallel application development. The criteria for measuring portability can vary from one application domain to another. For example, in high-performance computing, accuracy and performance often take precedence over source code preservation, whereas developers of distributed client-server applications may want to minimize source code modification even at the expense of performance [3].

While standardization and careful implementation can help tools achieve machine-independent semantics, difficult research issues must be addressed for the tools to deliver reasonable performance, especially for tools that support implicit parallelism. But as stated previously, achieving scalable high performance on massively parallel systems is difficult, and users will probably have to take an active role in parallel programming for the foreseeable future. Because of the limitations in current tools, real-world application developers frequently use low-level tools such as libraries, and many of them have reported good performance. While developers and users of language-centered tools have encouraging performance results, more work is needed for these tools to deliver scalable high performance for real-world applications on massively parallel machines. According to Cheng [3], algorithm-level manipulations, currently beyond the state of the art, will be required for these tools to achieve scalable, high, parallel performance.

Three criteria for evaluating tool support for portability have been defined from a survey of high-performance application developers [3]:

1. Machine-independent semantics. The tool ensures that application programs have the same logical and numerical behavior on all platforms.
2. Reasonable performance. The tool allows the user to develop programs with an acceptable level of performance.
3. Machine-independent syntax. The tool enables application programs to be compiled on different platforms with an acceptably low level of source code modification.

Distributed-memory parallel systems usually provide libraries that support the message passing programming model. These machine-specific libraries usually have a unique combination of syntax and semantics, making application programs non-portable. To reduce programming effort, portable libraries have been developed that hide vendor specifics behind a unified message passing interface, in the hope that modifications to application source code can be significantly reduced when the library is ported. Portable libraries are normally implemented in terms of machine-specific primitives. Enforcing exact semantic equivalence is difficult, owing to the vastly diversified semantics supported by vendor-defined message passing libraries.
The semantics of many libraries were determined by early vendor-defined interfaces such as Intel NX. Library developers have found it difficult to enforce chosen semantics on machines that support very different semantics, especially when trade-offs between high performance and semantic consistency must be made. Many researchers believe that providing libraries is not a viable long-term solution for machine-independent parallel programming, because library-based approaches require the users to operate at too low a level and are therefore error prone and tedious.

By using modified languages, on the other hand, compilers can automatically manage concurrency. One advantage of the language-based approach is that languages can define their semantics to preclude some hazards caused by parallel execution. Another advantage is that languages can be designed to capture user-specified information necessary for better performance. Ideally, programs should be written expressing only application algorithms, not implementation issues dictated by hardware and system software. The parallelism inherent in an application should be available to the tools so that it can be exploited to take advantage of system resources. Functional languages, logic programming languages, and dataflow languages have more of these desirable characteristics than conventional programming languages. These languages usually impose limitations such as requiring that program elements have no side effects. While these restrictions make compiler analysis and transformations much easier, they often make it difficult to express commonly needed functions such as I/O, and they require users to change their programming habits. In addition, these restrictions cause programs to use significantly more memory than conventional languages. Since memory is still a precious resource for many applications, techniques must be developed for the compilers to optimize memory usage in addition to optimizing performance. As a result, developing compilers for these languages can be a long and costly process. For example, a group of users at NASA Ames Research Center has indicated that for them to adopt a new language, the performance should be at least 5 to 10 times that of the current languages, or the development time should be half of what is required by current practice.

One evolutionary approach is to extend existing sequential languages such as Fortran, C, and C++. The first set of extensions was added by computer vendors for expressing task creation, termination, communication, and synchronization. While these early activities provided valuable experience in developing compilers, the proliferation of these extensions has made portable programs difficult and kept many users away from them. Many tools that support extended languages have been developed for many different architectures. Proponents of object-oriented programming claim that object-oriented languages provide more support for information hiding and software reuse than conventional languages. To explore whether object orientation can help reduce the difficulty of programming parallel machines, sequential languages have been extended to support parallel objects. Tools that support object-oriented parallel programming are relatively straightforward to create. As a result, there are quite a few of them, including several that have generated code for multiple architectures, and some have demonstrated reasonable to good performance. Recently, the Java phenomenon has tried to group many of these features in an interpreted, object-oriented language with Internet run-time compatibility.
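As a flavor of what "parallel objects" can mean in Java without any language change, the sketch below (our own illustration; the class and method names are invented) extends a sequential method with an asynchronous variant that returns a handle, letting the caller overlap the computation with other work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of a "parallel object": the sequential method is unchanged,
// and a parallel variant submits it to the object's own worker thread.
public class ParallelObject {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // The sequential computation, untouched.
    long fib(int n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }

    // The parallel extension: the caller gets a handle immediately and
    // claims the result later.
    Future<Long> fibAsync(int n) { return worker.submit(() -> fib(n)); }

    public static void main(String[] args) throws Exception {
        ParallelObject obj = new ParallelObject();
        Future<Long> f = obj.fibAsync(35);
        System.out.println("caller keeps working...");
        System.out.println("fib(35) = " + f.get());   // synchronize on demand
        obj.worker.shutdown();
    }
}
```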

Visualization Tools and Performance Tools are not only essential for program cod-

ing. Achieving optimal performance requires that all facets of a system { architecture, operating system, programming language, compiler, application program, etc. { work eciently together. Debugging and performance tuning in conventional serial systems are often based on execution trace or pro le information, supplied either automatically by the system or through diagnostic printing inserted by the user. The volume of data produced by execution tracing can be overwhelming, particularly with parallel systems. Moreover, when trace data are produced by multiple processors, they can be extremely dicult to interpret. One proven method for dealing with large volumes of complex data is through visualization, graphically depicting the data for easier human comprehension [8]. Any type of visualization requires data describing the phenomenon to be depicted. In the case of performance visualization, it is the computation itself that is to be displayed graphically rather than the results of the computation. Such data are usually obtained by an execution tracing facility, which may implemented at various levels and take various forms. Some systems have performance monitoring facilities built into the hardware, while others must rely on software instrumentation. The latter may be in the form of macros inserted at compile time or may be embedded in system libraries and invoked at run time. In addition to these more or less portable monitoring facilities, some vendors of parallel systems supply system-speci c performance monitoring facilities as part of their system software. There are also some higher level, application-oriented programming interfaces that o er tracing capability. Execution trace data are usually stored in a le for postmortem analysis but may also be 14

processed on the fly in a real-time system. Different monitoring systems typically use different trace formats. These are usually not directly interchangeable, but they can often be converted for compatibility with the various tools that consume the data. There appears to be some potential for convergence on a standard trace file format, although there may never be universal agreement on this issue. An alternative approach is a self-describing format, in which the structure and semantics of the data in a trace file are determined by a header that can be interpreted by a simple parser. This approach permits much greater flexibility in targeting a given trace file at various performance tools and in filtering out data that are not needed in a given context.

The type of data collected depends on the type of architecture and the level of tracing; some systems also support high-level tracing of application-specific functions and data structures. There are numerous additional important issues in performance monitoring, including the intrusiveness of performance probes, the synchronization of clocks across multiple processors, and limiting the potentially large volume of data that may be generated.

Although visualization has been shown to be a generally effective tool for improving the performance of parallel programs, many challenges remain. Perhaps the greatest challenge in performance visualization is scalability: the ability to deal with very large data sets, arising from large numbers of processors, long runs, or both. Many of the most effective graphical displays for small numbers of processors do not scale to larger numbers of processors because there simply are not enough pixels on the screen to support the level of detail required. One alternative is to analyze and present the data by statistical methods, using means, maxima, minima, medians, and so on, instead of the underlying raw data. Another is to display cross sections or other subsets of the data or, more generally, to use hierarchical displays that provide a high-level view that can be zoomed in to reveal finer detail in a specific area of interest.

Another challenge in performance visualization is the wide diversity of parallel architectures and programming paradigms currently prevalent. There may eventually be convergence on a "standard" architecture, but for the near term one must continue to deal with a variety of memory organizations, control structures, and network topologies. Some standards are emerging in parallel software that will permit some degree of focus in performance visualization efforts, but diversity in parallel programming paradigms is likely to persist for some time. It is doubtful that any single performance visualization tool can (or should) embrace the full range of hardware and software options, but it would still be helpful to users if some commonality could be achieved in the way performance information is presented and in the way such tools are used.
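To make the self-describing idea concrete, the sketch below writes a trace file whose header declares the record fields, so that any tool with a simple parser can consume the records without built-in knowledge of the format. Both the class and the file format are invented here for illustration; they do not correspond to any existing trace standard.

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

// A minimal self-describing trace log: a header names and types the
// record fields, followed by one line per event. The format is our
// own invention for illustration, not an existing standard.
public class TraceLog {
    private final PrintWriter out;

    public TraceLog(String fileName) throws IOException {
        out = new PrintWriter(new FileWriter(fileName));
        out.println("field timestamp long");   // header: field declarations
        out.println("field processor int");
        out.println("field event string");
        out.println("%%");                     // end of header
    }

    // Instrumentation probe called from the traced program; synchronized
    // because records may arrive from several threads at once.
    public synchronized void record(int processor, String event) {
        out.println(System.currentTimeMillis() + " " + processor + " " + event);
    }

    public void close() { out.close(); }
}

Each call such as record(3, "send begin") costs one formatted output line, which is exactly the kind of probe intrusiveness raised as an issue above.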

3.4 How do we put it together?

At this point, the basic question is: "Can we merge these technologies?" Or, put differently, "Can we use the Internet for parallel and distributed high-performance computing, through suitable environments built on a high-level programming paradigm?" Newly available tools, such as Java and Internet browsers, are suggesting new ideas about this merging process. The technologies involved are:

- distributed (and parallel) architectures based on networks of workstations (NOWs);
- distributed operating systems;
- high-level programming languages (e.g., object-oriented languages);
- programming environments based on graphical interfaces, supporting correct and systematic coding as well as performance evaluation;
- Internet environments: typically browsers and Java-like languages that allow local execution of remote code.

Clearly, several of these technologies are compatible and currently available. What is not completely available, or not easy to use, is a high-level, abstract programming language coupled with a development/execution environment that runs on a distributed platform (architecture + operating system).

High-level programming languages and distributed environments. There are two types of programming languages for distributed environments: those with explicit mechanisms for message passing and those in which communication is hidden from the programmer. Typical examples of the first type are the PVM and MPI libraries. DOME [1], on the other hand, is an environment in which message-passing primitives are encapsulated inside methods: data partitioning is homogeneous, and the result is a set of SPMD tasks executing on parts of the data, with the primitives never seen by the programmer. As a matter of fact, adding message-passing features to a programming language is not the most difficult task; the real problem is portability, due to the different communication protocols available. Java itself is an interpreted, non-parallel object-oriented language, to which classes may be added in order to provide inter- and intra-object parallelism; a sketch of this encapsulation style follows.
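To illustrate how communication can be hidden behind ordinary method calls, in the spirit of DOME's encapsulated primitives, consider the following sketch. The class, its one-line protocol, and the server it assumes are all hypothetical, invented for illustration; they are not part of DOME, PVM, MPI, or the Java library.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

// A hypothetical remote accumulator: callers see only add(), while the
// message passing (one request line, one reply line) is encapsulated
// inside the method. Host, port, and protocol are invented here.
public class RemoteAccumulator {
    private final String host;
    private final int port;

    public RemoteAccumulator(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Looks like a local call; is actually a network round trip.
    public long add(long value) throws IOException {
        Socket s = new Socket(host, port);
        try {
            PrintWriter out = new PrintWriter(s.getOutputStream(), true);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(s.getInputStream()));
            out.println("ADD " + value);            // request
            return Long.parseLong(in.readLine());   // reply: new total
        } finally {
            s.close();
        }
    }
}

A caller simply writes total = acc.add(5); portability then reduces to the portability of the underlying protocol, which is exactly the difficulty noted above.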

High-level programming languages and graphic programming environments. There is currently a plethora of programming environments for sequential programming. Perhaps the most typical are Borland's "Turbo"-style environments (Turbo Pascal, Turbo C) and, more recently, Microsoft's "Visual"-style environments (Visual C++, Visual Basic). These environments basically support the coding, debugging, and execution of sequential programs. Several programming environments have also been developed for parallel programs, but owing to the diversity of programming paradigms it is difficult to settle on a definitive one. The same holds in the sequential context: Turbo and Visual tools suit imperative coding, although similar environments exist for functional and logic programming. The main difficulties in the parallel arena are controlling non-determinism, communication, and synchronization, and displaying these phenomena on the screen during parallel programming. Debugging and formal proving of programs are more demanding as well, since the presence of monitoring probes may affect the behavior of the program. Moreover, in a parallel programming scenario the goal is almost always speedup relative to sequential execution (except when the algorithm is inherently parallel and has no sequential counterpart), so performance evaluation should be supported during programming and debugging; normally, sequential programming tools do not do this.

With the increasing popularity of parallel and distributed programming, proposals for parallel programming environments are common in the research community. In general, however, these environments are used by a very restricted and specific community, if they are used at all. Parallel programmers seem to use text editors to code their programs and trial-and-error (intuitive) techniques to debug and performance-tune their applications; in general, software engineering principles are not respected. Unfortunately, this is a cultural rather than an implementation issue. A Java-based platform may open paths in two directions: (i) toward a programming tool, based on Java, that is available over the Internet without installation costs; and (ii) toward the easier development of a suitable distributed programming tool, given the compatibility of Java programs with network environments. In this sense, Java may be a force pushing toward object orientation and less imperative programming. Imperative program structures are rather different from object-oriented ones, and tools should definitely take this fact into account.

Distributed environments and graphic programming environments. Since communication mechanisms are available in the language used to develop the proposed tool, the tool itself may be distributed over the network. Integrating this environment with the target operating system and architecture is desirable, considering that the tool should support performance evaluation: during execution, for example, the workload distribution should be known in advance so that parallel tasks can be allocated appropriately. The conclusion is evident: the intended graphical distributed tool must itself be written in a distributed language. This "distributed" language should be platform independent, and Java is therefore a current candidate. The tool should support not only coding and eventual debugging, but also performance evaluation, alternative implementations, and different configuration choices made according, for example, to the overall system load.


4 Research Issues

4.1 Issues concerning the programming paradigms

- Java was not designed to be a parallel language; it therefore offers no task management or communication primitives. Our research in object-oriented parallel computing requires a language that allows objects to be created remotely and to communicate with other existing objects. What is necessary for Java to become an object-oriented parallel language?
- If it is feasible to turn Java into an object-oriented parallel language, how does it compare to existing object-oriented parallel languages?
- Following the trend of extending an existing sequential language with libraries of parallel constructs, there is currently an effort to develop a Java+PVM environment. Since MPI is effectively becoming a standard in the message-passing parallel computing scenario, it would be worthwhile to investigate whether a Java+MPI environment is a viable approach to parallel programming over the Internet, how it differs from the Java+PVM proposal, and how it could be used to design parallel object-oriented algorithms (a purely hypothetical sketch of such a binding follows this list).
- Which would be the best concurrent object-oriented parallel model, or models, for parallelization under the Java environment?
- Are there well-defined parallelization strategies under an object-oriented parallel paradigm that differ from those under other parallel programming paradigms, such as the message-passing or functional paradigms?
- Is there a precisely defined methodology for object-oriented parallel algorithm design? How necessary is such a methodology for a well-structured design of these algorithms? Is it possible to use existing parallelization strategy models (or templates) in this design process?
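To make the Java+MPI question concrete, here is a deliberately hypothetical sketch of what such a binding might look like. The class MPI and every method and signature below are invented for illustration only; they do not correspond to the Java+PVM effort, to the MPI standard's C bindings, or to any released Java binding, and the code would compile only against such an invented wrapper.

// Hypothetical Java+MPI ping-pong between two processes. The MPI
// class and its methods are invented for illustration; a real binding
// would wrap a native MPI library.
public class PingPong {
    public static void main(String[] args) {
        MPI.init(args);                        // hypothetical: start the MPI runtime
        int rank = MPI.commRank();             // hypothetical: this process's id
        double[] msg = new double[1];
        if (rank == 0) {
            msg[0] = 3.14;
            MPI.send(msg, 1, 1, 0);            // hypothetical: dest 1, tag 0
            MPI.recv(msg, 1, 1, 1);            // hypothetical: source 1, tag 1
            System.out.println("rank 0 got back " + msg[0]);
        } else if (rank == 1) {
            MPI.recv(msg, 1, 0, 0);            // hypothetical: source 0, tag 0
            msg[0] *= 2.0;
            MPI.send(msg, 1, 0, 1);            // hypothetical: dest 0, tag 1
        }
        MPI.finish();                          // hypothetical: shut down
    }
}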

4.2 Issues concerning the run-time environment

- Is it possible to develop a comparative performance analysis based on code, parallelization strategies, and other data estimated from the execution environment, before actually executing the different object-oriented parallel programs?
- Given an object-oriented parallel algorithm using a certain parallelization strategy, is it possible to map it automatically to a new parallel/distributed program using a different strategy?
- How do we deal with scheduling (static and dynamic) and load balancing when executing interpreted object-oriented parallel programs over a wide-area network? Are these issues different from those associated with parallel machine environments and their regular communication schemes? (A sketch of one dynamic approach follows this list.)
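As one minimal illustration of the dynamic side of the scheduling question, the sketch below balances load by letting workers pull tasks from a shared queue, so that faster, or less loaded, workers simply take more tasks. The class is our own illustration; on a wide-area network the workers would be remote processes rather than the local threads used here.

// A minimal sketch of dynamic load balancing: worker threads repeatedly
// pull the next task index from a shared queue, so faster workers end
// up taking more work. Workers are local threads for illustration.
public class WorkQueue implements Runnable {
    private int next = 0;
    private final int nTasks;

    public WorkQueue(int nTasks) { this.nTasks = nTasks; }

    private synchronized int take() {   // -1 signals "no work left"
        return (next < nTasks) ? next++ : -1;
    }

    public void run() {
        for (int task = take(); task != -1; task = take()) {
            // Stand-in for real work of varying cost.
            System.out.println(Thread.currentThread().getName()
                    + " ran task " + task);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        WorkQueue q = new WorkQueue(20);            // 20 tasks, 3 workers
        Thread[] workers = new Thread[3];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(q);
            workers[i].start();
        }
        for (int i = 0; i < workers.length; i++) workers[i].join();
    }
}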

4.3 Issues concerning the performance evaluation tool

- What is the impact on the performance of a parallel algorithm, compared with its sequential counterpart, of using an interpreted language such as Java rather than a compiled language? Is there any overhead incurred by the use of an object-oriented programming paradigm? (A minimal measurement harness is sketched after this list.)
- Is there any benefit in using a network-oriented language such as Java to manage data input/output in parallel programs? What is the benefit of using different Internet sites as potentially heterogeneous processing elements for parallel/distributed processing?
- Is it reasonable and feasible to evaluate the performance of parallel algorithms using on-line and postmortem tools when dealing with interpreted programs? Is it possible to monitor Java programs? Is it necessary to monitor the interpreter as well?
- Is a tool to collect and display relevant information about the performance of parallel/distributed programs running under the proposed paradigm feasible?
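The first question above can at least be approached empirically. A minimal sketch, assuming only the standard library and the ParallelSum class sketched earlier in this document: time the same kernel sequentially and with threads under an interpreter, then compare the wall-clock numbers against a compiled implementation of the same kernel.

// A minimal timing harness: wall-clock time for a sequential and a
// threaded run of the same kernel. Assumes the ParallelSum class
// sketched earlier in this document.
public class TimeIt {
    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[1000000];
        for (int i = 0; i < data.length; i++) data[i] = 1;

        long t0 = System.currentTimeMillis();
        long seq = 0;
        for (int i = 0; i < data.length; i++) seq += data[i];
        long t1 = System.currentTimeMillis();

        long par = ParallelSum.sum(data, 4);   // threaded version
        long t2 = System.currentTimeMillis();

        System.out.println("sequential: " + seq + " in " + (t1 - t0) + " ms");
        System.out.println("threaded:   " + par + " in " + (t2 - t1) + " ms");
    }
}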

References

[1] J.N.C. Arabe et al., "DOME: Parallel programming in a distributed computing environment", Proceedings of the 10th International Parallel Processing Symposium, Hawaii, 1996.
[2] T. Berners-Lee, R. Cailliau, A. Luotonen and A. Secret, "The World-Wide Web", Communications of the ACM, Vol. 37, No. 8 (1994).
[3] D.Y. Cheng, Chapter 30 in Parallel & Distributed Computing Handbook (ed. Albert Y.H. Zomaya), McGraw-Hill, 1996.
[4] D.E. Comer, Chapter 1 in Internetworking with TCP/IP - Volume 1, Prentice Hall, 1991.
[5] P. Danzig, K. Obraczka and A. Kumar, "An Analysis of Wide-Area Name Server Traffic: A Study of the Domain Name System", Proceedings of ACM SIGCOMM '92, Baltimore, Maryland, August 1992.
[6] R. Duncan, Chapter 23 in Parallel & Distributed Computing Handbook (ed. Albert Y.H. Zomaya), McGraw-Hill, 1996.
[7] M.J. Flynn, "Very high-speed computing systems", Proceedings of the IEEE, Vol. 54, No. 12 (1966).
[8] M.T. Heath, Chapter 31 in Parallel & Distributed Computing Handbook (ed. Albert Y.H. Zomaya), McGraw-Hill, 1996.
[9] K. Hwang, Advanced Computer Architecture: Parallelism, Scalability, Programmability, McGraw-Hill, 1993.
[10] J.P. Kitajima and B. Plateau, "Modelling parallel program behaviour in ALPES", Information and Software Technology, Vol. 36, No. 7 (1994).
[11] J.P. Kitajima, B. Plateau, P. Bouvry and D. Trystram, "A method and a tool for performance evaluation. A case study: Evaluating mapping strategies", Proceedings of the 1994 CUG (Cray Users Group) Meeting, Tours, France, October 1994.
[12] T.G. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice-Hall International, 1992.
[13] D.A. Menasce, D. Saha, S.C.S. Porto, V.A.F. Almeida and S.K. Tripathi, "Static and Dynamic Processor Scheduling Disciplines in Heterogeneous Parallel Architectures", Journal of Parallel and Distributed Computing, Vol. 28, No. 1 (1995).
[14] K. Obraczka, P. Danzig and S.H. Li, "Internet Resource Discovery Services", IEEE Computer, Vol. 26, No. 9 (1993).
[15] R.H. Perrott, Chapter 29 in Parallel & Distributed Computing Handbook (ed. Albert Y.H. Zomaya), McGraw-Hill, 1996.
[16] S.C.S. Porto and C.C. Ribeiro, "A Tabu Search Approach to Task Scheduling on Heterogeneous Processors under Precedence Constraints", International Journal of High-Speed Computing, Vol. 7, No. 2 (1995).
[17] S.C.S. Porto and C.C. Ribeiro, "Parallel Tabu Search Message-Passing Synchronous Strategies for Task Scheduling under Precedence Constraints", Journal of Heuristics, Vol. 1 (1995).
[18] T. Ritchey, Java!, New Riders, 1995.
[19] H.S. Stone, High-Performance Computer Architecture, Addison-Wesley, 3rd edition, 1993.
[20] A. Tanenbaum, Distributed Operating Systems, Prentice-Hall, 1995.
[21] L. Turcotte, Chapter 26 in Parallel & Distributed Computing Handbook (ed. Albert Y.H. Zomaya), McGraw-Hill, 1996.
[22] A.Y.H. Zomaya (ed.), Chapter 1 in Parallel & Distributed Computing Handbook, McGraw-Hill, 1996.
[23] http://prospero.isi.edu/info/prospero/
[24] http://cs-www.bu.edu/groups/oceans/papers/Home.html
[25] http://www.ncsa.uiuc.edu/Indices/WebTech/Software.html
[26] http://www.javasoft.com
[27] http://www.netscape.com/
[28] http://www.boutell.com/faq/index.ht
[29] http://hoohoo.ncsa.uiuc.edu/cgi/
[30] http://www.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Virtual_Reality_Modeling_Language_VRML/
