Technical Report TUBS-CG-2002-01

Digital Libraries

Dieter W. Fellner, Sven Havemann {d.fellner,s.havemann}@tu-bs.de

Institute of Computer Graphics, Braunschweig University of Technology, Mühlenpfordtstr. 23, D-38106 Braunschweig, http://graphics.tu-bs.de

© Computer Graphics, TU Braunschweig, 2002


Abstract: Large collections of purely textual documents still pose a rich set of research challenges (e.g., robust and reliable algorithms for structuring, content extraction, and information filtering) for generations of researchers. Nevertheless, this paper advocates a change in the interpretation of the term 'document': rather than seeing a document in the classical sense of a 'paper' predominantly compiled of text with a few figures interspersed, we recommend adopting a more general view that considers a 'document' to be an entity consisting of any media type appropriate to store or exchange information in a given context. Only this shift in the document paradigm will open new application fields to Digital Library (DL) technology, for the mutual benefit of DLs and application domains: DLs offering an unprecedented level of functionality, and (new) application domains (e.g., digital mock-up in engineering) benefiting from a more powerful DL technology.

1 Introduction

According to a study by Lyman et al. [15], the world produces between 1 and 2 exabytes (i.e., 10^18 bytes, or a billion gigabytes) of unique information per year. Of that vast amount of data, printed documents of all kinds comprise only 0.003%; the major share is taken by images, animations, sound, 3D models, and other numeric data. Of course, a large and increasing proportion of the produced material is created, stored, and exchanged in digital form, currently about 90% of the total. Yet, little of this information is accessible through Digital Library collections. This presentation gives a motivation for a 'generalized view' on the term document and raises several issues stimulating research work in the field of Computer Graphics to make the Digital Libraries of the future more accessible.

2 Digital Libraries and 'Generalized Documents'

Digital Libraries have gained much attention over the past years [9, 18, 4, 19]. The interest, not only in the area of Computer Science, is caused by the enormous growth of all kinds of electronic publications as well as by the widespread availability of advanced desktop computing technology and network connectivity. According to the DELOS brainstorming report [3], Digital Libraries should enable any citizen to access all human knowledge anytime and anywhere, in a friendly, multi-modal, efficient, and effective way, by overcoming barriers of distance, language, and culture and by using multiple Internet-connected devices.

Following this vision, electronic documents, particularly the multi-modal ones consisting of many different media types like text, diagrams, images, 3D scenes, animations, and audio, are beginning to change the entire publication process in all scientific fields. With the new technologies at hand, authors and educators can now utilize animations and simulations together with a rich blend of multimedia data to explain complicated phenomena and to distribute them in unprecedented ways. Of course, in the context of Digital Libraries the term publication process also needs to be seen with a wider focus, ranging from the classical production of a scientific paper to, for example, the modeling of a (virtual) 3D environment explaining the effects of different BRDFs (bi-directional reflection distribution functions) on global illumination.

Speaking of geometric models, it is worth mentioning that the engineering disciplines adopted this 'generalized publishing paradigm' many centuries ago, with technical diagrams (though typically 2D) always being the main media of communication and documentation. Most other disciplines, of course, still focus on plain text, occasionally augmented with some figures.

3 Digital Libraries and Computer Graphics

To illustrate the tight coupling between Computer Graphics and Digital Libraries as well as the contributions of Computer Graphics to DL development, we can just look at three different approaches to represent 3D objects: discrete elements (i.e., triangle/polygon-based surface approximation with all flavors of mesh decimation), the functional approach (i.e., algebraic surfaces, plant modeling based on procedural Lindenmayer systems, CSG, generative modeling, ...), and image-based modeling (i.e., light fields, impostors, ...). All three techniques, considered core challenges in Computer Graphics, are also key technologies to make DLs usable [5, 8, 7]. In return, Digital Libraries provide a new framework that challenges Computer Graphics with a new level of complexity and user interface quality. This is emphasized by recalling the (non-exhaustive) list of relevant research topics for multi-modal, i.e. 'generalized', documents:

• content classification (of non-text documents)
• retrieval and search by similarity (in audio, image, video, and 3D databases): retrieval takes place in domain-specific databases where objects do not hold textual content descriptions (see the sketch after this list)
• summarization: e.g., of video sequences, complex 3D models, ...
• navigation: e.g., in distributed 3D models
• dissemination (over channels with limited bandwidth): e.g., level-of-detail, progressive transmission, ...
• linking in non-text documents: where and how to attach links, ...
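To make the 'retrieval by similarity' item more concrete, here is a minimal sketch (plain numpy, not from the paper; the helper names and the choice of the classical D2 shape distribution as descriptor are our own illustration): each 3D model is reduced to a histogram of distances between random surface points, and candidate models are ranked by the distance of their histograms to that of the query.

```python
import numpy as np

def d2_descriptor(vertices, n_pairs=10_000, n_bins=64, seed=None):
    """D2 shape distribution: histogram of distances between random pairs of
    surface points (here approximated by mesh vertices), normalized for scale."""
    rng = np.random.default_rng(seed)
    V = np.asarray(vertices, dtype=float)
    i = rng.integers(0, len(V), n_pairs)
    j = rng.integers(0, len(V), n_pairs)
    d = np.linalg.norm(V[i] - V[j], axis=1)
    hist, _ = np.histogram(d / (d.max() + 1e-12), bins=n_bins, range=(0.0, 1.0))
    return hist / hist.sum()

def rank_by_similarity(query_vertices, database):
    """Return the keys of `database` (name -> vertex array) sorted by
    L1 distance between D2 descriptors, most similar model first."""
    q = d2_descriptor(query_vertices)
    return sorted(database, key=lambda k: np.abs(q - d2_descriptor(database[k])).sum())
```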

4 Compact Representation of 3D Documents

Both issues, the economic creation and the efficient dissemination and rendering of 3D documents, directly depend on the model representation chosen. The internal representation is in fact a crucial issue, as the tools used for model creation, for storage and dissemination, and for display can only reflect the descriptive power of the model format. Moreover, the 3D representation will determine the effectiveness of the (re)construction assembly line, as all stages will augment the model database with additional information. Consequently, we propose a model representation capturing the full semantic structure, from which necessary details can be derived on demand.

Considering the body of computer graphics literature, the possibilities for creating and representing 3D models are manifold, such as triangle meshes for polygonal models, NURBS or B-spline patches for free-form geometry, or even unstructured point clouds and image-based methods. Of course, the choice of the model representation depends on the application domain. For example, documents in the cultural heritage context are typically the output of reconstruction applications, where the cost of model acquisition is a major issue. There are basically two ways to create models: either by scanning or by interactively creating 3D models from scratch (with a dedicated program, called a modeler).

4.1 Automatic Reconstruction

There is a vast body of literature on the acquisition of real-world shapes through automatic or semi-automatic methods. The two major approaches pursued are 3D scanning and photogrammetry. 3D scanners directly produce massive amounts of points in space, using for instance laser range scanning devices. The point clouds have to be post-processed to obtain triangulated surfaces, as demonstrated, e.g., by the Digital Michelangelo project [13].

With photogrammetric approaches as in [17], photos or image sequences are used as input data. The method attempts to identify feature points in the images in order to reconstruct the positions from where the images have been taken (camera calibration). Actual point clouds are then obtained by comparing the perspective distortion of the feature point positions in the different images. Although impressive results are possible with this method, it is still prone to error due to noise and to occlusion in regions where no image information is available. This method can also be combined with purely image-based approaches, i.e., approaches where no reconstruction of the underlying geometry is attempted at all. Instead, the acquired images are used for display, using a technique like impostors, as for instance in the Façade system of Debevec et al. [2].

An important drawback common to all automatic reconstruction methods is that object information is synthesized at a low, unstructured level. There is hardly a way of distinguishing a window from a door in such a polygon soup. While this is not of great importance in applications where acquired datasets only need to be rendered, it is a great drawback for all applications where knowledge about the inner structure of objects is essential. For example, in order to build a Digital Library holding 'general' cultural heritage documents, the reconstruction process has to collect as much structural and stylistic information as possible from the remains of ancient cities, in order to derive the shape and appearance of destroyed buildings from them. To our knowledge, there are no ways to accomplish this task automatically.
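As an illustration of the triangulation step at the core of such photogrammetric pipelines, the following sketch (plain numpy; not the method of [17], and assuming the two camera projection matrices are already known from calibration) recovers a single 3D point from its observed positions in two images with the standard linear (DLT) method.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point.

    P1, P2 : 3x4 camera projection matrices (assumed known from calibration)
    x1, x2 : (u, v) pixel coordinates of the same feature in both images
    Returns the 3D point in non-homogeneous coordinates.
    """
    u1, v1 = x1
    u2, v2 = x2
    # Each image observation contributes two linear constraints on the point X.
    A = np.array([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```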

Figure 1: Window front from Braunschweig City Hall

4.2 3D Modeling

Most commercial modelers such as 3D Studio Max or Maya give the user full control over model creation. The downside of this approach is that 3D modeling is still a tedious task requiring too much manual intervention: every bit of geometry has to be specified by the user. The disadvantages are apparent: every 3D model is basically an individual item and, once it is finished, can only be used to create identical replications. This leads to limited changeability, i.e., it is hard to update a 3D model in order to meet changed specifications, and to limited re-usability, i.e., if a similar but slightly different model is required, much work has to be re-done. The underlying problem here is that the modeler typically gives too many degrees of freedom (DOFs) to the user: the user can change a model in all possible ways, but only at a very low level.

The problem of creating models from re-usable, parameterized components has been thoroughly elaborated in the field of parametric modeling. There are basically two approaches to automatic model creation: the constraint-based approach and the feature-based approach. Focusing on the latter technique, the generative modeling approach can provide a general modeler with the ability to let the user specify sequences of modeling operations (features) which can then be applied automatically to the model. If, for example, a sequence of operations creating a window can be designed interactively to take as parameters not only height and width, but also the number of consecutive windows, the task of modeling a window front like the one from Braunschweig City Hall in Figure 1 can easily be accomplished. Admittedly, the creation of well-defined sequences of parameterized modeling operations might turn out to be a non-trivial task which can only be mastered by trained users.
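A minimal sketch of this idea (hypothetical function name and parameters; not the actual CHARISMATIC modeler): a parameterized operation that emits the quads of a whole window front, so that changing a single parameter regenerates the facade instead of requiring manual re-modeling.

```python
def window_front(width, height, count, spacing=0.5, sill=1.0):
    """Generate axis-aligned quads for a row of identical windows.

    Each window is a rectangle of size width x height; `count` windows are
    placed side by side, separated by `spacing`, starting at height `sill`.
    Returns a list of quads, each quad a list of four (x, y, z) corners.
    """
    quads = []
    for i in range(count):
        x0 = i * (width + spacing)
        quads.append([
            (x0,         sill,          0.0),
            (x0 + width, sill,          0.0),
            (x0 + width, sill + height, 0.0),
            (x0,         sill + height, 0.0),
        ])
    return quads

# Re-using the same operation with different parameters yields a different
# facade without any manual re-modeling:
facade = window_front(width=1.2, height=2.0, count=8)
```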

Figure 2: Two levels of subdivision of a cube and the limit surface

4.3 The Combined BRep

The key element of the generative modeling approach applied in the cultural heritage project CHARISMATIC (http://www.charismatic-project.com), but also of a highly compact representation of 3D documents, is a particular mesh representation for three-dimensional models, called the Combined BRep. It combines polygonal mesh modeling with subdivision surfaces in a simple, yet very powerful way. The subdivision scheme used in the Combined BRep was invented by Catmull and Clark as early as 1978 [1]. It operates on any given input mesh by recursively subdividing its faces, thereby generating new vertices, edges, and faces (see Fig. 2). The subdivision process quickly converges to a smooth limit surface which is C2 continuous almost everywhere.
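For reference, here is a compact sketch of one Catmull-Clark refinement step on a closed manifold polygon mesh (plain Python/numpy; a textbook version, not the Combined BRep implementation, which additionally handles crease edges and open meshes):

```python
import numpy as np

def catmull_clark(vertices, faces):
    """One Catmull-Clark subdivision step for a closed manifold polygon mesh.

    vertices : (n, 3) array-like of vertex positions
    faces    : list of faces, each a list of vertex indices
    Returns (new_vertices, new_faces); every output face is a quad.
    """
    V = np.asarray(vertices, dtype=float)

    # 1. Face points: centroid of each face.
    face_pts = np.array([V[f].mean(axis=0) for f in faces])

    # Adjacency: edge -> incident faces, vertex -> incident faces / edges.
    edge_faces = {}
    vert_faces = {v: [] for v in range(len(V))}
    vert_edges = {v: set() for v in range(len(V))}
    for fi, f in enumerate(faces):
        for k in range(len(f)):
            a, b = f[k], f[(k + 1) % len(f)]
            e = frozenset((a, b))
            edge_faces.setdefault(e, []).append(fi)
            vert_edges[a].add(e)
            vert_edges[b].add(e)
        for v in f:
            vert_faces[v].append(fi)

    # 2. Edge points: average of the edge's endpoints and its two face points.
    edge_pts = {}
    for e, fs in edge_faces.items():
        a, b = tuple(e)
        edge_pts[e] = (V[a] + V[b] + face_pts[fs].sum(axis=0)) / (2.0 + len(fs))

    # 3. Updated original vertices: (F + 2R + (n - 3) P) / n, where F is the
    #    average of adjacent face points, R the average of adjacent edge
    #    midpoints, n the vertex valence and P the old position.
    new_orig = np.empty_like(V)
    for v in range(len(V)):
        n = len(vert_edges[v])
        F = face_pts[vert_faces[v]].mean(axis=0)
        R = np.mean([(V[a] + V[b]) / 2.0
                     for a, b in (tuple(e) for e in vert_edges[v])], axis=0)
        new_orig[v] = (F + 2.0 * R + (n - 3.0) * V[v]) / n

    # 4. Assemble the refined mesh: one quad per face corner.
    new_verts = list(new_orig)
    face_idx = {fi: len(new_verts) + fi for fi in range(len(faces))}
    new_verts += list(face_pts)
    edge_idx = {}
    for e, p in edge_pts.items():
        edge_idx[e] = len(new_verts)
        new_verts.append(p)

    new_faces = []
    for fi, f in enumerate(faces):
        for k in range(len(f)):
            a, b, c = f[k - 1], f[k], f[(k + 1) % len(f)]
            new_faces.append([b, edge_idx[frozenset((b, c))],
                              face_idx[fi], edge_idx[frozenset((a, b))]])
    return np.array(new_verts), new_faces

# Two refinement steps on a cube already approach the smooth limit surface
# illustrated in Figure 2:
cube_v = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
cube_f = [[0, 1, 3, 2], [4, 6, 7, 5], [0, 4, 5, 1],
          [2, 3, 7, 6], [0, 2, 6, 4], [1, 5, 7, 3]]
v, f = catmull_clark(cube_v, cube_f)
v, f = catmull_clark(v, f)
```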

Figure 3: Street lamp model rendered as polygonal model (left) and as (Catmull-Clark) subdivision surface (right) (courtesy Viewpoint Data Labs)

Figure 3 shows a street lamp rendered as a subdivision surface and, for comparison, as a polygonal model. It can be seen how faithfully the free-form representation captures the intended shape, actually synthesizing model detail beyond the resolution of the polygonal model.

Figure 4: Combined usage of polygonal and of free-form geometry. Profiles are swept along a complicated curve, not just along a straight line. Furthermore, mirror symmetry is involved.

This representation also has the power to represent both polygonal and free-form geometry within the same mesh. Figure 4 illustrates this potential with an ornamental model in which profiles are swept along a smooth curve, not just along a straight line. Despite its consistent framework, the mesh structure makes a clear distinction between 'normal' edges resulting from tessellation steps (e.g., on free-form surfaces) and 'feature' edges which essentially control the object's shape.
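A minimal sketch of that distinction (hypothetical field names; the actual Combined BRep data structure is considerably richer): each edge record carries a flag stating whether it is a shape-defining feature edge or merely the result of tessellating a smooth region, so that subdivision and tessellation can keep the former sharp.

```python
from dataclasses import dataclass

@dataclass
class Edge:
    """Edge record distinguishing shape-defining edges from tessellation edges."""
    v0: int            # index of first endpoint
    v1: int            # index of second endpoint
    is_feature: bool   # True: crease/feature edge controlling the shape;
                       # False: 'normal' edge introduced by tessellation

def crease_edges(edges):
    """Edges that a subdivision or simplification step must keep sharp."""
    return [e for e in edges if e.is_feature]
```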

4.4 Dealing with digitized objects

So far, the presented approach has not addressed 3D models encoded as triangle meshes, which typically result from digitization and 3D reconstruction steps like laser scanning or photogrammetric algorithms. But, of course, particularly the preservation aspects in a cultural heritage context demand proper support for scanned and over-sampled objects, which result in large triangle meshes.

Figure 5: left: Crocodile, original model (courtesy Viewpoint Data Labs); right: Mesh at 10% of original vertex count after smoothing and conversion to progressive mesh

Our current solution is based on polygon simplification, progressive meshes, and Loop subdivision surfaces and is illustrated by Fig. 5 and Fig. 6. The triangles in the raw model are significantly reduced by the polygon simplification algorithm of Garland and Heckbert [10] before being converted into a progressive mesh [12]. From the progressive mesh an appropriate level of detail, typically around 10%, is chosen as the input mesh for Loop subdivision [14].
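The core of the simplification step [10] can be sketched in a few lines (plain numpy; only the quadric error idea, not the full Garland-Heckbert algorithm with its edge-collapse priority queue): every vertex accumulates a 4x4 quadric built from the planes of its incident triangles, and the cost of placing a vertex at position v is v^T Q v. Collapsing an edge is then charged the error of the summed quadrics of its endpoints, and the cheapest collapses are performed first.

```python
import numpy as np

def plane_quadric(p0, p1, p2):
    """Quadric (4x4) of the plane through triangle (p0, p1, p2)."""
    n = np.cross(p1 - p0, p2 - p0)
    n = n / (np.linalg.norm(n) + 1e-12)   # unit normal
    d = -np.dot(n, p0)                    # plane equation: n . x + d = 0
    p = np.append(n, d)                   # homogeneous plane vector
    return np.outer(p, p)                 # K_p = p p^T

def vertex_quadrics(vertices, triangles):
    """Sum the plane quadrics of all triangles incident to each vertex."""
    V = np.asarray(vertices, dtype=float)
    Q = np.zeros((len(V), 4, 4))
    for i, j, k in triangles:
        K = plane_quadric(V[i], V[j], V[k])
        Q[i] += K
        Q[j] += K
        Q[k] += K
    return Q

def quadric_error(Q, v):
    """Squared-distance error of placing a vertex at position v: v^T Q v."""
    vh = np.append(np.asarray(v, dtype=float), 1.0)
    return float(vh @ Q @ vh)
```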

4.5 Web-based dissemination and interaction

The final issue to be discussed is the degree to which the presented approach lends itself to web-based usage, i.e., the suitability of the encoding structure for web-based storage and transmission and for efficient display on the receiving client's side. Despite the increasing bandwidth capacities of the web, the net throughput has to compete with an even faster growing number of users, which compensates for the gain in bandwidth. Also, the chosen representation has to warrant interactive rendering speeds on current client hardware.

For the generative modeling approach as well as for the combined meshes we can proudly report that the memory requirements are one to two orders of magnitude below those of alternative encoding formats. For standard (triangle) meshes, our encoding in tagged OBJ files is as compact as standard OBJ encodings.

Figure 6: left: Resulting model rendered as Loop subdivision surface; right: Detail of mesh structure and resulting tessellation around the crocodile's eye

With regard to rendering speed, Figure 7 shows the screen dump of a browser plugin supporting 3D elements based on generative modeling as well as on combined meshes. Although the screen dump only shows two 3D elements which can be inspected and manipulated at interactive rates, the page contains 52 models, some of them highly complex. Interactive rendering rates have been made possible by recent results on high-speed rendering of Loop and Catmull-Clark subdivision surfaces [16, 11].

Conclusion

The treatment of 3D models as native document types in today's Digital Libraries requires a careful look at many aspects of 3D graphics, which until very recently has been considered just one of many application domains in computer science (as well as in many other disciplines where computer graphics plays the role of an enabling discipline). In particular, the generation (i.e., the modeling), the (lossless) storage, the transmission over limited-bandwidth channels, and the effective interaction with 3D documents deserve special attention and, in some areas, new approaches.

The framework presented in this paper focuses on the semantic encoding of manually modeled 3D objects (i.e., it strives to preserve the semantic detail put into the model by the human operating the modeling tools) as well as on the highly compact representation of objects encoded as polygonal meshes. However, this is only the first step towards fully integrating 3D graphical objects into Digital Libraries, and many more steps will be necessary. Based on the excellent results from approximately 30 research groups in the first two funding periods of the research initiative V3D2 [6] and in many other DL initiatives worldwide, we encourage both the Digital Library and the Computer Graphics communities to contribute to the challenging problems in the field of '3D Digital Libraries'.

Figure 7: Screen-dump of a web page holding 3D elements which can be inspected and manipulated at interactive rates. The wireframe in the left image indicates the low number of polygons needed to define the complete model.

Acknowledgement

The support from the German Research Foundation (DFG) under the Strategic Research Initiative Distributed Processing and Delivery of Generalized Digital Documents (V3D2) [6] to address basic research challenges in the field of Digital Libraries, and from the European Commission under the CHARISMATIC project, is greatly appreciated.

References

[1] Ed Catmull and J. H. Clark. Recursively generated B-spline surfaces on arbitrary topological meshes. Computer-Aided Design, 10:350–355, September 1978.

[2] P. E. Debevec, C. J. Taylor, and J. Malik. Modeling and rendering architecture from photographs: A hybrid geometry- and image-based approach. In Proc. SIGGRAPH '96, pages 11–20. ACM, 1996.
[3] DELOS. Digital libraries: Future directions for a European research programme. Technical Report ERCIM-02W02, 2002. San Cassiano, Alta Badia, Italy, 13-15 June 2001.
[4] Albert Endres and Dieter W. Fellner. Digitale Bibliotheken – Informatik-Lösungen für globale Wissensmärkte. dpunkt.verlag, Heidelberg, 2000.
[5] Dieter W. Fellner, editor. Graphics and Digital Libraries, volume 22(6) of Computers & Graphics. Elsevier, 1998.
[6] Dieter W. Fellner. Strategic Initiative V3D2 – Distributed Processing and Delivery of Digital Documents. German Research Foundation (DFG), http://graphics.tu-bs.de/dfgspp/V3D2, 1998-2003.
[7] Dieter W. Fellner. Graphics content in digital libraries: Old problems, recent solutions, future demands. Journal of Universal Computer Science, 7(5):400–409, 2001.
[8] Dieter W. Fellner, Jörg Haber, Sven Havemann, Leif Kobbelt, Hendrik P. A. Lensch, Gordon Müller, Ingmar Peter, Robert Schneider, Hans-Peter Seidel, and Wolfgang Straßer. Beiträge der Computergraphik zur Realisierung eines verallgemeinerten Dokumentenbegriffs. it+ti Informationstechnik und Technische Informatik, 42(6):8–16, 2000.
[9] Edward A. Fox and Gary Marchionini. Toward a worldwide digital library. Communications of the ACM, 41(4):29 pp., 1998.
[10] Michael Garland and Paul S. Heckbert. Surface simplification using quadric error metrics. In Proc. SIGGRAPH '97, pages 209–216. ACM, 1997.
[11] Sven Havemann. Interactive rendering of Catmull/Clark surfaces with crease edges. Visual Computer, 18(5-6):286–298, 2002.
[12] H. Hoppe. Progressive meshes. In Proc. SIGGRAPH '96, pages 99–108. ACM, 1996.
[13] Marc Levoy, Kari Pulli, Brian Curless, Szymon Rusinkiewicz, David Koller, Lucas Pereira, Matt Ginzton, Sean Anderson, James Davis, Jeremy Ginsberg, Jonathan Shade, and Duane Fulk. The Digital Michelangelo Project: 3D scanning of large statues. In Proc. SIGGRAPH 2000, pages 131–144, 2000.
[14] Charles T. Loop. Smooth Subdivision Surfaces Based on Triangles. Master's thesis, Dept. of Mathematics, University of Utah, 1987.
[15] Peter Lyman, Hal R. Varian, James Dunn, Aleksey Strygin, and Kirsten Swearingen. How much information? http://www.sims.berkeley.edu/research/projects/how-much-info/index.html, 2000.
[16] Kerstin Müller and Sven Havemann. Subdivision surface tesselation on the fly using a versatile mesh data structure. Computer Graphics Forum, 19(3):C151–C159, August 2000. Proc. Eurographics 2000 Conf.
[17] Marc Pollefeys, Luc Van Gool, Maarten Vergauwen, Kurt Cornelis, Frank Verbiest, and Jan Tops. Image-based 3D acquisition of archaeological heritage and applications. In Proc. VAST 2001 Intl. Symp. ACM, 2002.
[18] B. Schatz and Hsinchun Chen. Digital libraries: Technological advances and social impacts. IEEE Computer, 32(2):45–50, February 1999.
[19] Ian H. Witten and David Bainbridge. How to Build a Digital Library. Morgan Kaufmann, July 2002.