Large-Scale Network Visualization


James Abello, Emden R. Gansner, Stephen C. North
AT&T Labs - Research
Florham Park, NJ 07932
fabello,erg,[email protected]
June 21, 1999

1 Introduction

Visualization is increasingly important for understanding the structure of communication networks and services on them, but displaying the data associated with very large networks is difficult. There are abundant research problems in visualization metaphors, methods, algorithms and the engineering of scalable interactive systems. This area is being addressed in the AT&T Infolab, a multi-disciplinary project investigating visualization and analysis for AT&T's network and service businesses. The sheer magnitude of the data involved makes the challenge interesting: the voice network carries about 300 million calls per day, plus diagnostic information that is generated by the network itself; packet data networks such as frame relay, Asynchronous Transfer Mode (ATM) and Internet Protocol (IP) give rise to upward of millions of records daily, describing network configuration, dynamic behavior and events. The long-range goal for the Infolab work is to be able to provide a high-level, integrated view of all the AT&T networks, handling all the underlying data streams in real time.

There are a variety of standard techniques for managing the display of large data sets. These include sampling the data, eliding or coalescing semantically chosen subsets, or relying on distorted views such as fisheye[8] or hyperbolic[7] projections. Although these methods are useful, they clearly do not provide all the answers, especially when analyzing network-based data. In this report, we describe two uncustomary approaches: a project in large-scale graphics and network displays, and an experiment with a novel graphical representation of networks.

2 Large Displays

One approach to the scale problem is to use a physically large display. Figure 1 shows our display wall, inspired by projects such as the CAVE[3] and Powerwall[5].

Figure 1: The Infolab Wall

Our main display wall (6 × 15) is driven by 8 LCD projectors. These are connected through a software-controlled video switch, usually to display the output of two graphics pipes of an SGI Onyx. The same Onyx can drive a smaller (7 × 9) 4-projector wall elsewhere in the same building, using its third pipe. Other compute and disk servers for network data analysis projects are connected on an 800-megabit High Performance Parallel Interface (HIPPI) network, providing 10 terabytes of on-line storage and another 20 terabytes of tape under hierarchical storage management.

One important application that runs on the display wall is Swift-3d, an interactive, large-scale network viewer [6]. Swift-3d has modules for data collection, storage, analysis and display of at least a full day's worth of voice call-detail records. Swift-3d itself is network and data-independent and is adapted to applications by scripting and loading geometry files and data-to-geometry maps. It provides interactive selection and aggregation to control the level of information presented in its views. Selection can take the form of a stream database query, graph search or geometric pan-and-zoom. The forms of aggregation supported include computing counts, sums and set unions.

Figure 2 shows a view of Swift-3d. In this figure, usage of two network services is compared (encoded by color). Activity is aggregated up to the level of local telephone exchanges. When an animated time series is displayed, one can observe how the balance between the two services shifts through the day. The user interface supports interactive queries combining time, geography, customer and type of service to create selective views for data exploration.

Our experiences with this work have shown that a large, dense display qualitatively changes interactive network visualization, beyond the mere presence of many more pixels. The large display favors group collaboration and investigation.
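The selection and aggregation steps described above for Swift-3d can be illustrated with a minimal Python sketch. This is not Swift-3d's interface: the record fields (`exchange`, `service`, `time`, `seconds`, `customer`), the function name and the sample data are all assumptions made for illustration. The sketch selects records by service and time window, then computes a count, a sum and a set union per local exchange.

```python
from collections import defaultdict

def aggregate_by_exchange(records, service=None, start=None, end=None):
    """Select call records, then aggregate them per originating exchange.

    Every field name here is hypothetical; `time` is an hour of the day.
    """
    totals = defaultdict(lambda: {"calls": 0, "seconds": 0, "customers": set()})
    for rec in records:
        # Selection: keep only records matching the service and time window.
        if service is not None and rec["service"] != service:
            continue
        if start is not None and rec["time"] < start:
            continue
        if end is not None and rec["time"] >= end:
            continue
        cell = totals[rec["exchange"]]
        cell["calls"] += 1                      # count
        cell["seconds"] += rec["seconds"]       # sum
        cell["customers"].add(rec["customer"])  # set union
    return dict(totals)

# Made-up sample records, two services ("A" and "B") on two exchanges.
records = [
    {"exchange": "201-555", "service": "A", "time": 9, "seconds": 120, "customer": "c1"},
    {"exchange": "201-555", "service": "A", "time": 10, "seconds": 30, "customer": "c2"},
    {"exchange": "201-555", "service": "B", "time": 10, "seconds": 45, "customer": "c1"},
    {"exchange": "212-555", "service": "A", "time": 15, "seconds": 60, "customer": "c3"},
]
morning_a = aggregate_by_exchange(records, service="A", start=8, end=12)
```

Running the same function once per service and per time step would give the per-exchange values needed to compare two services through the day, as in the view described for Figure 2.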
The large display also makes it possible to apply an assortment of linked views simultaneously. The underlying scalable analysis and visualization tools enable one to query the entire network activity database graphically.

Figure 2: A view of Swift-3d

A next step in this work is to learn how to display integrated views of multiple, layered networks. For example, although we can display a voice or wireless network, or the structure of a virtual private data network configured by large business customers, displaying an informative view of several networks at once so that the relationships between them are obvious is still difficult.

Another problem we are investigating is how to interconnect multiple display walls and their applications. The goal is to support collaboration between multiple network operation centers, and other forms of distributed visualization over wide-area networks.

The views in Swift-3d are, at present, based on the underlying geography. This is a significant limitation for network visualization. Often, many endpoints are located in a few dense metropolitan areas, with large `deserts' in between, so maps do not use the available pixels efficiently. Also, maps work well for displays of endpoints (vertices) but not for edges (vertex pairs), let alone groups of more complex structures such as routes, flows, or subgraphs representing virtual networks. With IP networks, geographic coordinates are not always even available.

One can move to more abstract topological representations, but the standard visualization techniques rapidly become unusable as the data grows, even though a large display delays this somewhat. This points to the need for additional work in display metaphors and interaction techniques for large graphs, one of which is described next.

3 Graph Surfaces

Graph surfaces[1] provide a metaphor that unifies visualization and computation on weighted multi-digraphs. They are endowed with a collection of "natural" operations to provide hierarchical browsing. Mapping a graph to a hierarchy of surfaces gives flexibility in the handling of the I/O and screen bottlenecks. Graph surfaces can be updated incrementally. They are suitable for the maintenance and navigation of external memory graphs (cf. [2]) whose vertex sets are hierarchically labeled.

Figure 3: A graph surface
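To make "hierarchically labeled" concrete, here is a minimal Python sketch under stated assumptions: vertex labels are dotted strings whose prefixes name their ancestors in the hierarchy, and edges are held in an in-memory dictionary. Neither assumption comes from the paper, and the function names and sample data are hypothetical. Collapsing edges onto ancestors at a chosen depth anticipates the i-slices defined in Section 3.2.

```python
from collections import Counter

def ancestor(label, depth, sep="."):
    """Return the ancestor of a leaf label at the given depth (0 is the root)."""
    parts = label.split(sep)
    return sep.join(parts[:depth])  # depth 0 yields "" for the root

def aggregate(edges, depth, sep="."):
    """Collapse leaf-level edges onto their ancestors at the given depth.

    `edges` maps (source, target) leaf labels to a multiplicity. The result
    maps ancestor pairs to the summed multiplicity of the edges they cover.
    """
    coarse = Counter()
    for (u, v), m in edges.items():
        coarse[(ancestor(u, depth, sep), ancestor(v, depth, sep))] += m
    return dict(coarse)

# Made-up edge multiplicities between three-level labels.
calls = {
    ("201.555.0001", "201.555.0002"): 2,
    ("201.555.0001", "212.555.0100"): 1,
    ("201.777.0003", "212.555.0101"): 4,
}
top_level = aggregate(calls, 1)
mid_level = aggregate(calls, 2)
```

The sketch assumes every leaf sits at the same depth and that the whole edge set fits in memory; the external memory case is a separate matter, discussed in Section 3.4.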

3.1 What is a hierarchical graph surface?

The main idea is to view a weighted multi-digraph as a discretization of a two dimensional surface in 3-space. Under this view and for a fixed ordering of the vertex set, the corresponding rectangular domain is triangulated and each point is lifted to its correct height. This provides a piecewise linear continuous function forming a polyhedral terrain. The terrain is used as an approximation to a surface representing a multi-digraph. An example is shown in Figure 3. In order to handle very large graphs, a hierarchy of surfaces is constructed. Each of them represents a multi-digraph obtained from an equivalence relation defined on the edge set of the input graph. Operations are provided that allow the user to navigate the surfaces.

3.2 Construction of a hierarchical graph surface

To simplify the exposition, we concentrate on multi-digraphs. (The adaptation to weighted multi-digraphs is straightforward.) A multi-digraph is a triplet G = (V, E, m), where V = V(G) is the set of vertices, E = E(G) is the set of edges, and m is a function from E to the non-negative integers giving the edge multiplicity. For a rooted tree T, we let height(T) be the maximum distance from a vertex to the root of T and let T(i) be the set of vertices of T at distance i from the root of T.

Given a multi-digraph G = (V, E, m) and a rooted tree T such that the set of leaves of T equals V(G), the i-slice of G is the multi-digraph with vertex set T(i) and with a multi-edge (p, q) defined to "represent" the collection of edges in G running from the sub-tree rooted at p to the sub-tree rooted at q. The multiplicity of the edge (p, q) is the sum of the multiplicities of the edges (x, y) that it represents.

Similar to the approach of Duncan et al.[4], we construct the hierarchical view of G given by T, H(G, T), as T plus the collection of i-slices. The novelty now is that each slice is represented as a surface and T is used as a road map to move from surface to surface. We describe next the main navigation operations.

3.3 Navigating the surface hierarchy

Given a multi-edge e = (u, v, m(u, v)) in the i-slice of G, the operation ZoomIn(e) deletes e, returns the set of edges in the (i + 1)-slice that are represented by e, displays the corresponding surface and registers in a navigation log file information about this operation. An example is exhibited in Figure 4.

Figure 4: Graph surface navigation

The operation ZoomOut takes as input a subgraph of the (i + 1)-slice that is registered in the navigation log file, deletes the corresponding surface and inserts back the corresponding representative edge e.

3.4 Handling the I/O and screen bottlenecks

When G is an external memory graph residing on disk there are three cases to consider:

• T fits in main memory
• T does not fit but V(G) does
• V(G) does not fit.

In the first case, the edges of G are read in blocks and each one is filtered up, through the levels of T, until it lands in its final slice. This can be achieved with one pass over the data. The second and third cases are handled by setting up a multilevel external memory index to represent T. Filtering the edges now requires several passes over the data. The I/O performance depends strictly on the I/O efficiency of the access structure.

The visualization of a data set that is several orders of magnitude more than the available screen resolution calls for some form of hierarchical graph decomposition. Graph surfaces by definition provide a geometric view of a very large graph in a manner that is amenable to hierarchical browsing via the ZoomIn and ZoomOut operations. This makes it feasible to explore deeper hierarchy elements at a finer level of detail. Also, a graph surface provides both a uniform global and local view.

3.5 Implementation

The graphical engine generates polyhedral terrains that correspond to individual edges in H(G, T). Initially, a suitable slice in H(G, T) is chosen as the root of the visual hierarchy, depending on the available screen resolution. The polyhedral terrains are generated with a triangulation algorithm and are displayed using several visual cues and dynamically generated labels. The graphical engine is implemented in C++ and uses the OpenGL standard library for the rendering portion. Currently, a mouse/keyboard interface is used. The use of joysticks and gestures to navigate the environment is currently under consideration.

3.6 Applications

Currently, graph surfaces are being used experimentally for the analysis of several large multi-digraphs arising from the AT&T network. These graphs are collected incrementally. For example, in the call-detail multi-graph, vertices correspond to phone numbers and edges represent a call from one number to another. The graph grows by daily increments of about 300 million edges, defined on a vertex set on the order of 300 million vertices. The aim is to process and visualize this type of multi-digraph at a rate of at least a million "edges" per second.

Internet data is another prime example of a hierarchically labeled multi-digraph that fits quite naturally the graph surfaces metaphor. Each i-slice represents traffic among the aggregate elements that lie at the ith level of the hierarchy. The navigation operations can be enhanced to perform a variety of statistical computations in an incremental manner. These in turn can be used to animate the traffic behavior through time.

When the vertices of the multi-digraph have an underlying geographical location, they can be mapped into a linear order using, for example, Peano-Hilbert curves, in order to maintain some degree of geographical proximity. In this way, the constructed surface maintains a certain degree of correlation with the underlying geography.

References

[1] J. Abello and S. Krishnan. Graph Surfaces. In Int. Conf. Industrial and Applied Math. (ICIAM), July 1999.
[2] J. Abello and J. Vitter, editors. External Memory Algorithms, volume 50. AMS-DIMACS series, 1999.
[3] C. Cruz-Neira, D. Sandin, and T. DeFanti. Surround-Screen Projection-Based Virtual Reality: The Design and Implementation of the CAVE. In Siggraph 1993 Conference Proceedings, pages 135-142, 1993.
[4] C. Duncan, M. Goodrich, and S. Kobourov. Balanced Aspect Ratio Trees and Their Use for Drawing Very Large Graphs. In S. Whitesides, editor, Symposium on Graph Drawing GD'98, volume 1547 of Lecture Notes in Computer Science, pages 111-124, 1998.
[5] Paul Woodward et al. University of Minnesota PowerWall, 1998. http://www.lcse.umn.edu/research/powerwall/powerwall.html.
[6] E. Koutsofios, S. North, and D. Keim. Visualizing Large Telecommunication Data Sets. IEEE Computer Graphics and Applications, 19(3):16-19, 1999.
[7] T. Munzner. Drawing Large Graphs with H3Viewer and Site Manager. In S. Whitesides, editor, Symposium on Graph Drawing GD'98, volume 1547 of Lecture Notes in Computer Science, pages 384-393, 1998.
[8] M.-A. Storey and H. Muller. Graph Layout Adjustment Strategies. In F.J. Brandenburg, editor, Symposium on Graph Drawing GD'95, volume 1027 of Lecture Notes in Computer Science, pages 487-499, 1996.