
Multimedia Applications and Their Implications on Database Architectures

Wolfgang Klas, Karl Aberer

GMD-IPSI Integrated Publication and Information Systems Institute
Dolivostr. 15, D-64293 Darmstadt, GERMANY
email: {klas, aberer}@darmstadt.gmd.de

In: Advanced Course on Multimedia Databases in Perspective, University of Twente, The Netherlands, 1995.

Contents

1 INTRODUCTION
2 SAMPLE APPLICATIONS
2.1 A Multimedia Publication Environment
2.2 Multimedia and Database System Support for Systems Engineering
2.3 A Multimedia Calendar of Event Teleservice
2.4 Multimedia Document Archives
2.5 Other Emerging Applications
3 CHARACTERISTICS
3.1 Types of Multimedia Data
3.2 Temporal Aspects
3.3 Media Representation
3.4 Data Volume
3.5 Data Modelling
3.6 Resources
3.7 User Interaction
3.8 Querying Multimedia Information
3.9 Typical Database Management Functionality
4 BUILDING BLOCKS FOR MULTIMEDIA DATABASE SYSTEMS
4.1 The Notion of Multimedia Database Management Systems
4.2 Multimedia Data Models
4.2.1 Multimedia Data Abstraction
4.2.2 Time-dependent data
4.2.3 Query Processing and Retrieval
4.2.4 Object-Oriented Paradigm
4.3 Exploiting Traditional Database System Technology
4.4 A Reference Architecture For Multimedia Database Systems
4.4.1 Playout Management
4.4.2 Continuous Data Management
4.4.3 Multimedia Storage Management
4.4.4 System support
5 Conclusion

1 INTRODUCTION

Over the last five to six years everybody has been talking about multimedia computing and how it would change our ways of doing business and of running our everyday lives. The discussion was mostly based on technology becoming available, less on a need of society itself to go "multimedia". The continuing search for a so-called "killer" application is sufficient proof of that fact. But how did this discussion come about? The rapid growth in compute power available in every PC or workstation, the development of high-speed digital communication networks, the appearance of digital input and output devices for all kinds of unorthodox data types, and not least the new user interface paradigms more closely reflecting our human experience and habits have fired the imagination of inventive thinkers and entrepreneurs.

In all the excitement around multimedia and the rapid creation of early products, a more systematic investigation and development has been neglected. Add-on extensions in operating systems, new types of communication protocols, ad hoc additions to relational database systems, and all kinds of representation and exchange formats for multimedia systems have been developed. More recently, however, companies, developers, and researchers have come to realize that a more systematic approach to multimedia will be beneficial to everybody in the long run. Open, extensible, standardized systems and system components will enable the reliable, safe, and secure handling of the multimedia data procured, created, and used by eventually everybody. It is, of course, impossible to cover all the necessary aspects of an extensible, multi-user, multimedia system in a single paper, and we will therefore concentrate on the database aspects of such systems.

Using some typical application scenarios we will investigate the functionality required in multimedia databases and illustrate that "simple" extensions of relational databases are not the answer but that more advanced systems, incorporating concepts of object-oriented or active databases as well as handling functionality for (new) multimedia data types, are needed. The benefits of multimedia database management systems are especially pronounced in areas where groups of (multimedia) information producers create complexly structured multimedia information that has to be kept and manipulated/updated over longer periods of time and that will be accessed by a multitude of information consumers, each looking for "individualized" information that satisfies the information needs of the moment. In such environments multiple concurrent producers and consumers have to handle the system. They have to be isolated from each other in such a way that they do not interfere with each other's work but can, on the other hand, cooperate as a group when their activities require it. Database systems, of course, were originally developed to provide for such concurrent utilization in the commercial field. Over the years more and more non-commercial applications have realized these benefits of database management systems, leading to extensions like object-oriented, active, real-time, or deductive database management systems. As we will show in this paper, many of these features are needed and need to be integrated and extended further in order to satisfy the needs of multimedia applications in the form of multimedia database management systems.

In section 2 of this paper we describe a few sample multimedia applications involving databases. Section 3 discusses the characteristics of multimedia data and observations with respect to requirements for multimedia database systems. In section 4 we draw general conclusions from the characteristics and observations with respect to the functional building blocks of a multimedia database system. Based on a reference architecture we discuss individual extensions of database technology required for multimedia database systems. Section 5 concludes the paper.

2 SAMPLE APPLICATIONS

To give an idea of typical application environments for multimedia database management systems, we describe some applications in the following. The first example is a publication scenario for a multimedia information service; the second example shows how multimedia concepts and database management system concepts can be used to support the design and evaluation of complex technical systems. Finally, some other applications with characteristics different from the previous examples are summarized.

2.1 A Multimedia Publication Environment

At GMD-IPSI a prototypical implementation of an electronic multimedia magazine, called MultiMedia Forum [S+94], has been developed. It consists of an editing environment, where information providers can produce and combine multimedia information, and of a reading environment, where information consumers can access the information. The MultiMedia Forum therefore serves as an example of the entire multimedia publication process, which goes far beyond the traditional desktop publishing paradigm. The publishing processes supported by the MultiMedia Forum make use of an underlying database system. The general impact of using advanced database concepts in such an environment is discussed in [ABH94]. Figure 1 shows the multimedia publication process. The functionality supported by this prototype can be grouped into the following three main functions, which are described in more detail below.

• Information import: this process covers the creation and the acquisition of information from the authors or editors, and the transformation of import formats into formats used internally for further processing or information exchange with other systems.
• Information processing: this process consists of storing, indexing, retrieving, layout preprocessing, and manipulating documents, which are modeled by means of a rich semantic document model (e.g., a hypermedia document model).
• Information export: this process deals with the export of information to other environments and the distribution of multimedia documents to users. This includes important administrative functions such as access control mechanisms, workflow management, and accounting.

Figure 1: The Multimedia Publication Process Model at GMD-IPSI (stages: import, processing, export)

Information Import: Various types of input information are imaginable. Paper or electronic documents, analog or digital media like graphics, video clips or audio, numerical data, and hypertext documents are accessible via various information bases. Such sources can be, e.g., paper archives, the user's mailbox, private archives, heterogeneous globally accessible or online databases [YGM94], video/audio file servers, or public archives. The first step in a publication process is the digitization of analog input data. All digital data then need to be prepared: (i) to be converted into an appropriate document format (e.g., SGML - Standard Generalized Markup Language [ISO86]) by proper tools (e.g., DREAM [GF92]), (ii) to be compressed according to some standard compression techniques (e.g., JPEG [ISO92], MPEG [Gal91]), or (iii) to be preprocessed, e.g., by identifying and extracting the abstract of a multimedia document, categorizing documents, creating metainformation (information describing the document content), and actually inserting the document into the multimedia document pool. The input to the document pool of the MultiMedia Forum currently consists of hypermedia documents structured by means of SGML and containing linked multimedia content data; this will be extended to HyTime [NKN91] in the future [BA94].
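The import steps (i) to (iii) above can be pictured as a small pipeline. The following Python sketch is illustrative only: the names `Document`, `convert`, `compress`, `preprocess`, and `import_document` are assumptions and not part of the MultiMedia Forum. zlib stands in for the media-specific codecs named in the text, and the abstract is naively taken to be the first sentence.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical pool entry: content plus descriptive metainformation."""
    doc_id: str
    markup: str                                   # converted document format
    media: dict = field(default_factory=dict)     # name -> compressed bytes
    meta: dict = field(default_factory=dict)      # metainformation


def convert(raw_text: str) -> str:
    # Step (i): wrap plain text in minimal markup (stand-in for a real converter).
    paragraphs = [p.strip() for p in raw_text.split("\n\n") if p.strip()]
    return "".join(f"<p>{p}</p>" for p in paragraphs)


def compress(media: dict) -> dict:
    # Step (ii): zlib is a generic stand-in for JPEG/MPEG-style codecs.
    return {name: zlib.compress(data) for name, data in media.items()}


def preprocess(raw_text: str, category: str) -> dict:
    # Step (iii): derive metainformation; the "abstract" is just the first sentence.
    first_sentence = raw_text.strip().split(".")[0].strip() + "."
    return {"abstract": first_sentence, "category": category}


def import_document(pool: dict, doc_id: str, raw_text: str,
                    media: dict, category: str) -> Document:
    doc = Document(doc_id, convert(raw_text), compress(media),
                   preprocess(raw_text, category))
    pool[doc_id] = doc        # insertion into the document pool
    return doc
```

A real import component would replace each placeholder with a dedicated tool; the point of the sketch is only the ordering of the steps and the fact that metainformation is stored alongside the content.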

Information Processing: Documents are stored in a multimedia document pool based on an object-oriented database management system with multimedia extensions. Tools used by the authors and editors retrieve and load documents stored in the database by using an integrated retrieval interface providing access to the entire document pool.

Information Export: Selection and retrieval facilities are provided by a specific export component, called the readers' environment, as well as by the application programming interface of the MultiMedia Forum. Retrieval facilities are based on formal query languages and/or navigational interfaces. The query language can be used in an ad hoc interactive fashion or embedded in a host programming language (e.g., C++). Application-dependent interfaces, e.g., a particular document pool navigator, which are based on their own query engine, can better support user needs in a specific application context. The disadvantage of application-specific interfaces is a lack of reusability and flexibility. Another important function is releasing information, which is applied before distributing the final approved electronic product. Releasing information triggers, among other things, the (active) distribution of information, e.g., a new issue of the magazine. From a technical point of view, distribution mainly consists of opening a specific communication channel and transmitting data to it. Such channels could be networks, an external storage device (e.g., disk or CD-ROM), external or online databases [YGM94], or an electronic archive. Typical new distribution channels are services such as electronic bookstores or electronic libraries.
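The coupling between releasing and distribution resembles a trigger: once a product is approved and released, delivery to every registered channel follows automatically. The Python sketch below illustrates that idea only. `Publisher`, `register_channel`, `release`, and the in-memory channels are hypothetical names, not the MultiMedia Forum interface.

```python
class Publisher:
    """Hypothetical release component with registered distribution channels."""

    def __init__(self):
        self._channels = {}      # channel name -> callable taking (issue, content)
        self._released = set()

    def register_channel(self, name, deliver):
        self._channels[name] = deliver

    def release(self, issue: str, content: bytes, approved: bool) -> list:
        # Only approved products are released; release triggers distribution.
        if not approved:
            raise ValueError(f"issue {issue!r} has not been approved")
        if issue in self._released:
            return []            # already distributed; do not send twice
        self._released.add(issue)
        for deliver in self._channels.values():
            deliver(issue, content)
        return sorted(self._channels)


# Two toy channels: a network log and an electronic archive.
network_log, archive = [], {}
publisher = Publisher()
publisher.register_channel("network", lambda issue, data: network_log.append(issue))
publisher.register_channel("archive", lambda issue, data: archive.update({issue: data}))
```

Calling `publisher.release("1995-01", b"magazine", approved=True)` delivers the issue to both channels once; a second release of the same issue is ignored.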

2.2 Multimedia and Database System Support for Systems Engineering

The design process of technical systems has to be supported by various different software tools. In traditional systems engineering environments the integration of these tools resolved the heterogeneity of the underlying operating and file systems. Integration was achieved by sharing common data (e.g., documents) by means of files and standard data exchange formats. Under these circumstances it was difficult to manage the dependencies between different documents and to ensure the consistency of documents in a multiuser environment.

The MuSE¹ project [DGJ+94] aims at integrated system support for the systems engineering process such that dependencies between different specification and documentation documents can be managed automatically and global consistency can be ensured. MuSE covers various phases of the systems engineering process including design, verification, animation, and simulation, and follows the concurrent engineering paradigm. In the MuSE environment all information resulting from the design process, verification of specifications, system simulation and animation, and testing is stored in its underlying object-oriented database management system. This includes alphanumeric data as well as graphics, images, audio and video annotations which may originate from simulation and animation results. MuSE allows for the storage, retrieval and manipulation of highly structured information like 3D data, part structures, and multimedia and hypertext documents in a multiuser environment, all of which needs to be supported by the underlying database management system.

¹ MuSE is the acronym for a project entitled Multimedia Systems Engineering. The project is a joint effort of groups at the Technical University of Darmstadt and GMD-IPSI, Darmstadt. It is sponsored by the Deutsche Forschungsgemeinschaft (DFG), grant numbers He 1170/5-1 and He 1170/5-2.

Figure 2: Architecture of the MuSE Environment

The MuSE prototype uses hypermedia concepts to organize the documents of the system development process. The complete system model is represented as a hypernetwork containing the different specifications. The hyperstructure is visualized via a hypermedia authoring environment which provides the desktop for the whole MuSE environment. Figure 2 shows the architecture of the system. Components for describing technical systems using several problem-specific specification languages, and a visualization and interaction component, are embedded into the hypermedia desktop and database environment. Storage and manipulation of hyperstructures and the documents contained in the hypernodes are modeled at the database level [KAN93]. In addition to the management of conventional data, the underlying database management system VODAK [GMD95, RL94, AK94, KNS90, BA94] offers mechanisms for the storage, manipulation and presentation of multimedia data. Therefore, the system allows an extended notion of documents including text, images, audio, video and data produced by the tools of the MuSE environment. The object-oriented model of the documents contains methods providing access to all the tools that are necessary to display and edit their content. This means that even complete simulations may be executed by calling a method of an object that contains the system model.
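The idea that documents carry methods giving access to their own tools can be illustrated with a short Python sketch. The class names, the `display` and `simulate` methods, and the toy simulation are hypothetical; they are not VODAK's data model or API.

```python
from abc import ABC, abstractmethod


class Document(ABC):
    """Hypothetical base class: every document knows how to present itself."""

    def __init__(self, title: str):
        self.title = title

    @abstractmethod
    def display(self) -> str:
        """Return a textual rendering; a real system would launch a viewer tool."""


class TextDocument(Document):
    def __init__(self, title: str, body: str):
        super().__init__(title)
        self.body = body

    def display(self) -> str:
        return f"{self.title}: {self.body}"


class SystemModel(Document):
    """A specification whose presentation consists of running a simulation."""

    def __init__(self, title: str, initial: float, rate: float):
        super().__init__(title)
        self.initial = initial
        self.rate = rate

    def simulate(self, steps: int) -> list:
        # Toy simulation: exponential growth of a single state variable.
        state, trace = self.initial, []
        for _ in range(steps):
            state *= 1 + self.rate
            trace.append(round(state, 3))
        return trace

    def display(self) -> str:
        return f"{self.title}: simulated states {self.simulate(3)}"


class HyperNode:
    """Node of the hypernetwork; holds one document and links to other nodes."""

    def __init__(self, document: Document):
        self.document = document
        self.links = []

    def open(self) -> str:
        # The caller does not need to know which tool belongs to the document.
        return self.document.display()
```

Opening a node that holds a `SystemModel` runs the simulation, while opening a node with a `TextDocument` merely renders text; the hypermedia desktop treats both uniformly.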

2.3 A Multimedia Calendar of Event Teleservice

In the field of teleservices there is a high potential for multimedia applications. Progress in integrating network technology and multimedia information systems makes it possible to provide specific multimedia teleservices. One specific example from this application domain is an archiving teleservice, supported by a multimedia database system, which incorporates electronic mail [TR94]. This teleservice supports both interchange mechanisms of the multimedia mail system used, which is based on the principles of the CCITT Recommendation X.400 Message Handling System. It allows for store-and-forward operations to interchange complete multimedia documents, and for access by reference in order to refer to large multimedia documents available in a global store instead of copying the documents for each e-mail message. The distinguishing features of the teleservice are the integration of multimedia mail with simple standalone archive clients for heterogeneous platforms and an archive server which is based on a multimedia database system. The archive contains structured multimedia documents which can be retrieved via e-mail requests. Answers to a request are returned to the user via multimedia electronic mail by means of composite multimedia documents.

Figure 3: Multimedia Archiving Teleservice Architecture (Source: [TR94]). Abbreviations: MTA: Message Transfer Agent; ROA: Referenced Object Access; AAM: Archive Access Management System; MMM-UA: Multimedia Mail User Agent.

Figure 3 shows the overall client-server based architecture of the teleservice prototyped in the context of the GAMMA project at GMD-IPSI. The multimedia document archive is based on the open object-oriented database system VODAK [GMD95, RL94, AK94, KNS90, BA94]. The archive and the clients are connected to the Multimedia Mail User Agent (MMM-UA) providing the X.400 Message Handling System functionality. The archive contains complete structured multimedia documents and keeps track of received and sent multimedia mails, (parts of) documents, and access requests. Archived data is accessed by the VODAK Data Manipulation Language [GMD95] using application-specific interactive tools at the clients. The global store contains only parts of documents for some limited period of time determined by the user access. In this sample teleservice the database system has to manage multimedia documents, but does not need to deliver multimedia data according to temporal quality of service parameters. This is because of the asynchronous access mechanism, which calls for decoupling the presentation environment from the database management system. Multimedia data is delivered via multimedia e-mail and subsequently processed by presentation tools at the client site. Obviously, this limits the applicability of this approach because huge amounts of multimedia data are copied within the network.
However, a broad range of public multimedia applications, e.g., product catalogs for teleshopping, subscription services for multimedia products (titles, publications, etc.), virtual travel agencies, cooperative authoring of multimedia documents, etc., can cope with the e-mail delay and, hence, may be based on this approach to a teleservice.

Figure 4: Multimedia Document Archiving System (Source: [R+94]).
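The two interchange mechanisms of the calendar-of-event teleservice, store-and-forward and access by reference, can be contrasted in a few lines of Python. This is a sketch under stated assumptions: `GlobalStore`, `Message`, `send`, `read`, and the size threshold are invented for illustration and do not reflect X.400 data structures or the GAMMA prototype.

```python
from dataclasses import dataclass
from typing import Optional


class GlobalStore:
    """Hypothetical shared store holding large document parts by key."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> str:
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


@dataclass
class Message:
    recipient: str
    body: Optional[bytes] = None       # store-and-forward: content travels inline
    reference: Optional[str] = None    # referenced-object access: only a key travels

    def transferred_bytes(self) -> int:
        if self.body is not None:
            return len(self.body)
        return len(self.reference.encode())


def send(recipient: str, key: str, data: bytes, store: GlobalStore,
         inline_limit: int = 1024) -> Message:
    # Small documents are copied into the message; large ones are referenced.
    if len(data) <= inline_limit:
        return Message(recipient, body=data)
    return Message(recipient, reference=store.put(key, data))


def read(message: Message, store: GlobalStore) -> bytes:
    # The receiver resolves a reference only when it actually needs the content.
    if message.body is not None:
        return message.body
    return store.get(message.reference)
```

With a referenced message only the key crosses the mail system, which is the saving the text attributes to access by reference; the content itself is fetched from the global store on demand.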

2.4 Multimedia Document Archives

Within the BERKOM II initiative² and the POLIKOM research programme³, concepts and prototypes addressing the general problem of document archiving are being developed. One sample application for a multimedia document archive was also developed in the GAMMA project (see also section 2.3). The documents are event descriptions composed of text, images, graphics, audio recordings, and video clips. Examples of such document parts are a snapshot of a theatre performance, a video clip about the announced top actors, digitized newspaper critiques, and, e.g., an audio sample presenting a prominent song of the performing artist. The functionality of the archive includes storage and presentation of multimedia documents, navigational access within a hyperlinked structure, document retrieval, and concurrent access of several users to the same multimedia documents. In contrast to the Multimedia Calendar of Event Teleservice, this application is based on synchronous access to the multimedia document archive. That is, the archive becomes responsible for the timely delivery of multimedia data according to quality of service parameters relevant for the presentation to the end user. The archive structure is based on extensions of the ISO/IEC standard "Document Filing and Retrieval" (DFR) towards multimedia data handling [R+94]. Figure 4 shows the architecture of the multimedia archive system. The prototype is based on an extended VODAK DBMS [GMD95] which allows for the handling of multimedia documents. The extensions include a multimedia storage component, DBMS interface components including appropriate multimedia transport protocols, and appropriate components at the client site which receive multimedia data via the network and provide for their presentation to the user.

² See DeTeBerkom GmbH, Berlin.
³ POLIKOM is a national research and development program for providing telecooperation and telepresence for the German government distributed between Bonn and Berlin. See [HBS92].

2.5 Other Emerging Applications

There are many other application scenarios which involve the storage, processing, and retrieval of multimedia data. We briefly review a few additional application domains to give a more complete picture of the potential usage of multimedia database management system technology.

• Multimedia document management and processing is a quite natural and general application domain for multimedia databases. Multimedia document management will be needed in various more specific application domains like CAM, technical documentation of product maintenance, education, geographic information systems, teleservices, etc. All these emerging applications share the need for management of multimedia documents. [BA94] discusses the technical issues involved in multimedia document management based on HyTime. [TR94] presents an approach to multimedia document management offered as part of a teleservice.
• Multimedia mailing systems are an advanced form of electronic mailing systems which integrate various applications like multimedia editing and voice mail. Related projects are, for example, the voice-mailing system Etherphone developed at Xerox PARC [Vin91], the MICE (Modular Integrated Communications Environment) project of Bellcore, and the multimedia mail and archive projects of BERKOM (e.g., [TRR94]). This type of communication environment may significantly benefit from the availability of a multimedia storage system which serves as a repository of multimedia messages in the network.
• Teleconferencing involves multimedia data in several ways. The participants in a teleconference can communicate with each other via audio and video channels at the same time. In practice this has been achieved by using dedicated equipment and lines, and specially designed conference rooms. The techniques for communicating the material used by the participants within a teleconference session may be integrated with the technology providing for the communication between the participants. That is, one can imagine that teleconferencing may be based on multimedia workstations which provide an integrated workbench for computer-based teleconferencing. [SGHH94] discusses a prototypical implementation of such a teleconferencing scenario for integrated meeting support across electronic whiteboards and local and remote workstations. In [HBS92] technical challenges for computer-based telecooperation and telepresence in a highly distributed government administration are described.
• Kiosk information systems are an example of a highly distributed multimedia information system which consists of servers managing and maintaining huge amounts of multimedia data which are made available to customers at clients' sites. The document archive containing a Calendar of Events [R+94] as described in section 2.4 is an example of such a multimedia information system, which allows customers to inquire about events like concerts, festivals, and exhibitions in particular metropolitan regions, and to order or buy tickets for such events. A special type of kiosk information system could be set up for the purpose of home shopping. Customers would dial the retailer of interest and connect to the retailer's server in order to conduct a sale or at least to prepare a final sale.
• Video on demand is currently a hot topic in the entertainment business, cable television and telephone industry. The consumer gets more flexibility in what programming is shown when, provided the video servers offer adequate retrieval and selection mechanisms which include content-based selection (e.g., selection on actors, movie titles, music). It is quite obvious that the functionality of the video servers is crucial for the commercial success of this application domain.

3 CHARACTERISTICS

This section reports several observations and requirements with respect to multimedia database systems. The observations can be drawn from multimedia applications involving database system functionality. The main characterizing property of multimedia systems in general is the incorporation of continuous media like video, audio, and animation together with conventional types of data. The specific properties of these media types lead to a number of requirements that a database management system has to fulfill in order to support these media types adequately by means of services offered to applications and users. Important issues to be discussed in this context are time-dependency of data, interactivity of multimedia applications, high data volume, data modelling primitives, and device management. For a general introduction to the basics of multimedia systems and their technical demands the reader is referred to, e.g., [Fur94, Gro94, Buf94, oF94].

3.1 Types of Multimedia Data

In the following we briefly characterize the different types of data that are characteristic for multimedia systems.

Text: The representation type Text is quite often reduced to strings of characters. But a useful representation of textual information, e.g., in multimedia document archives, should include structural information like title, authors, authors' affiliation, abstract, sections, subsections, and paragraphs. An example of a standard which allows the logical structure of documents to be expressed is SGML [ISO86]. In addition to the logical structure of text, a comprehensive representation of textual information has to represent layout information as well [ISO94]. The complexity of all these structures, which arises from concepts like nesting and repetition, requires powerful modeling capabilities from an underlying database management system [BAH94]. Processing (implicitly or explicitly) structured textual information requires appropriate tools. For example, importing textual information may require particular tools for enriching poorly or only implicitly structured text with structural information. A good example of such a tool is DREAM [GF92], which generates SGML-tagged documents from ASCII text according to a given grammar which is derived from a few text samples using the interactive grammar learning tool MarkItUp! [FX94].
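To make the difference between a flat character string and structured text concrete, the following Python sketch models a document as nested elements and contrasts a plain full-text search with a search restricted to one kind of structural element. The `Element` class, the element names, and the functions `full_text` and `find_in` are illustrative assumptions, not SGML tooling or a query language from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One node of the logical structure, e.g. title, section, paragraph."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)

    def walk(self):
        # Depth-first traversal over the nested structure.
        yield self
        for child in self.children:
            yield from child.walk()


def full_text(root: Element, word: str) -> list:
    # Plain full-text search: reports every element kind containing the word.
    return [e.kind for e in root.walk() if word.lower() in e.text.lower()]


def find_in(root: Element, kind: str, word: str) -> list:
    # Structure-aware search: only elements of the requested kind qualify.
    return [e.text for e in root.walk()
            if e.kind == kind and word.lower() in e.text.lower()]


doc = Element("document", children=[
    Element("title", "Multimedia Archives"),
    Element("section", children=[
        Element("heading", "Storage"),
        Element("paragraph", "Archives store multimedia documents."),
    ]),
])
```

Here `full_text(doc, "archives")` matches both the title and a paragraph, whereas `find_in(doc, "title", "archives")` returns only the title; nesting and repetition are handled by the recursive traversal.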


As soon as structural as well as layout information is associated with textual information this could be utilized in the retrieval process. For example, retrieval operators could be de ned on the logical structure and on the layout information in addition to full text retrieval operators. This calls for an extension of conventional query languages o ered by database systems and an integration of concepts known from the information retrieval eld. Graphics: The representation type Graphics stands for all concepts that allow to generate drawings and other images based on formal descriptions, programs or data structures. Several standards [ISO84] have been established and serve as the basis for many industrial and scienti c applications. The proper integration of graphics with other media types and existing systems as well as the ability to handle the complex structured that graphical objects represent are central requirements for a multimedia database management system in this context. Pictures/Images: The representation types Picture or Image stand for the digital equivalents of drawings, paintings, photographs, or prints. A multimedia system, e.g., an electronic dictionary of art [HRRS95] or encyclopedia, has to provide the functionality to import and manipulate these media. Basic manipulation operations are clipping, scaling, chromatic correction and the composition of several picture sources. Handling picture data by a database system means managing large amounts of simple structured data eciently. For example, the size of a detailed color image (still photo) is in the order of 7.5 MB. Adequate abstractions should hide the various formats for internal representation of pictures and images from end users and applications. Audio: In contrast to the previous representation types which share the property of being timeindependent Audio has to consider time-continuous characteristics. 
A meaningful interpretation of audio data is based on its relationship to a constantly progressing time scale. The time scale associates audio data, or more precisely the atomic constituents of an audio stream, its correct interpretation at each point in time. Some manipulation operations like cut, copy and paste can still be handled statically. Some operations like playback and recording will always be associated to a time scale. In the case of audio the time scale is an absolute one which corresponds to the real world time. Advanced retrieval operations like a best-match word retrieval operator for speech documents [SG94] may be de ned on audio data which need additional abstractions which play a similar role as traditional indexing techniques in information retrieval. Furthermore, audio data usually amounts in a signi cant mass of data which requires compression techniques for its storage and exchange between system components. For example, voice quality audio results in 64 Kb/s, CD DA quality audio (44.1 kHz, 16 bit) results in 1.4 Mb/s. A CD-DA has a capacity of 74 minutes audio playback which corresponds to about 747 MB stored data. All these characteristics, i.e., time-dependency, temporal relationships, compression techniques, need to be re ected by an implementation of an audio data type and its associated operations in a multimedia database management system if the database system wants to provide some basic understanding of the semantics of audio data. Video: The representation type Video integrates the properties of the representation types Audio and Picture/Image. In addition to the time-dependency of audio data Video has to re ect the time-dependency of video data, i.e., the time-dependent sequencing of pictures-images. The time scale of a video is an absolute one which associates each video frame its correct interpretation at any point in time. 
The manipulation operations like cut, copy, paste, playback, and recording are similar to those defined for audio data. The atomic constituents of video data are video frames, which are closely related to picture/image data. For the purpose of content-based retrieval, advanced retrieval operators may be defined on the content of a video, e.g., retrieving particular portions of a video which start with specific scene cuts that are close to a given picture. Representing video data requires effective compression techniques as it leads to huge amounts of data. For example, regular motion video requires 30 frames/s, NTSC quality video (512 x 480, 8 bpp) results in 1.92 Mb/frame, and HDTV quality video (1024 x 2000, 24 bpp) results in 48 Mb/frame. In other words, assuming a compression ratio of 200:1, one hour of digital video requires about 1 GB of storage; at a compression ratio of 30:1, 0.5 GB of capacity is needed to store 10 minutes of digitized video. All these properties of video data have a significant impact on the implementation of video data types and the handling of the data by a database management system.

Generated Media: This general representation type stands for particular computer generated presentations like animation and music. Both can be seen as a special kind of continuous media type if they are generated in real time during presentation. Animations have an associated relative time scale, i.e., animations do not have a canonical mapping to the real world and, hence, the time scale may be deformed by speeding up or slowing down the animation without affecting the meaning of the animation. Music is similar, although the freedom of changing the time scale may be restricted. An important advantage of generating media like an animation or a piece of music on the fly during presentation is the possibility to increase interaction. Examples are changes in the visual angle on the scene presented or the manipulation of simulation parameters.

Speech: The representation type Speech covers spoken language and is often not recognized prominently in the context of multimedia systems. But it will become more important with respect to the interaction features of multimedia systems as the field of spoken natural language processing progresses. Speech data could serve as input for the retrieval of stored audio and speech data, or speech data could be generated as a result of queries. Recent improvements in speech recognition allow the recognition of characteristic keywords [RJ93] and the identification of specific speakers [WB92]. Although speech shares the characteristics of audio, it shows some unique properties of spoken natural language.
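
The audio figures quoted above can be checked with a few lines of arithmetic. The following Python sketch assumes uncompressed PCM audio. The parameters for voice quality (8 kHz, 8 bit, mono) and the two channels for CD-DA are not stated in the text and are assumptions based on the usual telephony and CD parameters; the function name pcm_bit_rate is purely illustrative.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Voice quality: assumed 8 kHz sampling, 8 bit per sample, mono.
voice_bps = pcm_bit_rate(8_000, 8, 1)

# CD-DA quality: 44.1 kHz and 16 bit are taken from the text, stereo is assumed.
cd_bps = pcm_bit_rate(44_100, 16, 2)

# Storage needed for 74 minutes of CD-DA audio, in bytes and in units of 2**20 bytes.
cd_bytes = cd_bps // 8 * 74 * 60
cd_mb = cd_bytes / 2**20

print(voice_bps, cd_bps, round(cd_mb))
```

Under these assumptions the sketch yields 64,000 bit/s for voice, about 1.4 million bit/s for CD-DA, and roughly 747 MB for a full disc, in line with the numbers given in the text.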

3.2 Temporal Aspects

Very often multimedia systems are just defined by an enumeration of the data types they are able to deal with, in the style given in the previous section. To work out the characteristics of multimedia systems one has to look more closely at the different kinds of multimedia data, differentiate between them, and classify them according to specific criteria. The most significant features of multimedia data come from the observation that its representation can be much closer to the physical or a virtual physical reality than the usual alphanumeric data, which in general is used to represent symbolic information. Due to the unique role of time in physics, the most striking classification of multimedia data is into time-dependent data like audio, video, and animation, and time-independent data, which includes data types like text, still images, and alphanumeric data types.

A time scale is needed to associate with time-dependent data its correct interpretation at each point of time, expressed by the atomic constituents of the data. For example, video recorded with a camera has a canonical mapping to real world time. The atomic constituents are called frames, which correspond to intervals of equal length on the time scale of the video. The length of the interval is determined by the recording speed, e.g., 30 frames per second. An animation also consists of frames, but since there is no canonical mapping to the real world the time scale may be deformed, for example to speed up or slow down the animation without affecting the natural meaning of the animation.

When processing dynamic data, i.e., data of dynamic types like audio or video, parallel tasks typically occur. This is due to the nature of dynamic data since, in contrast to processing static data, operations take non-negligible periods of time. When looking at the dynamics of multimedia documents one can distinguish two aspects: First, there are the inherent dynamics of the media types, like audio and video, that a multimedia document is composed of. Second, the dynamics of a multimedia document are also based on the temporal relationships between the constituent media types. For example, an application plays back a video on a screen. Simultaneously it gets the audio from a different device and allows for user interaction to control the presentation, e.g., to associate annotations with certain points of the video without interrupting the video presentation. This includes parallel tasks for playing back the video and audio, for the user interaction, and for processing user input.

In addition to concepts for expressing the temporal relationships between constituent dynamic media and concepts for parallel execution, there is a need to provide support for media-specific synchronization. For example, playing back the video frames and the sound track of a movie requires fine-grained synchronization, e.g., lip synchronization. Presenting a multimedia advertisement composed of several multimedia components at a trade fair may require a precise, timely scheduling of the individual (possibly simultaneous) media presentations even if there is no need for fine-grained synchronization at that level.
Therefore, there is a need to model, store, and process temporal relationships between the media components of a multimedia document or presentation. This includes synchronization mechanisms which should allow for the handling of data streams according to the temporal relationships defined between the streams. Although parallel execution of applications is supported by database management systems, it is considered to be transparent to users. Hence, database management systems do not explicitly and efficiently provide concepts for the control of parallel tasks by the user (footnote 4). Tools therefore have to be provided which allow the user to control the parallel execution of different tasks explicitly. There are basically the following alternatives to control the parallelism of tasks: establishing relationships between tasks (relative scheduling), e.g., two tasks have to be executed simultaneously or a task can trigger another one; placing events on a time scale (absolute scheduling); or combining both, e.g., starting a task at the next full hour after another task has finished. All of these alternatives require appropriate concepts which allow a programmer to express such schedules. We can summarize the requirements with respect to temporal aspects as follows:

• Incorporation of time-related concepts into the data model.
• Non-transparent parallelism for explicit control of parallel tasks is needed.
• Scheduling and (media-specific) synchronization mechanisms are needed to provide for the description of temporal relationships and their observation during execution.
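
The distinction between relative and absolute scheduling can be made concrete with a small sketch. The Python code below is a minimal illustration under stated assumptions, not a mechanism proposed in the text: the class Task, its fields start_at, with_task, and after_task, and the function resolve_schedule are hypothetical names. A task may carry an absolute start time, a relative constraint (start together with, or after, another task), or both; when several constraints are given, the sketch simply takes the latest resulting start time, which is a simplification of the "next full hour" example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    duration: float                    # seconds
    start_at: Optional[float] = None   # absolute scheduling: point on the time scale
    with_task: Optional[str] = None    # relative scheduling: start together with this task
    after_task: Optional[str] = None   # relative scheduling: start when this task has finished


def resolve_schedule(tasks):
    """Map every task name to an absolute start time (seconds)."""
    by_name = {t.name: t for t in tasks}
    starts = {}

    def start_of(name, pending=()):
        if name in starts:
            return starts[name]
        if name in pending:
            raise ValueError(f"cyclic scheduling constraint involving {name!r}")
        task = by_name[name]
        pending = pending + (name,)
        candidates = []
        if task.start_at is not None:
            candidates.append(task.start_at)
        if task.with_task is not None:
            candidates.append(start_of(task.with_task, pending))
        if task.after_task is not None:
            other = by_name[task.after_task]
            candidates.append(start_of(other.name, pending) + other.duration)
        # Unconstrained tasks start at time 0; otherwise the latest constraint wins.
        starts[name] = max(candidates, default=0.0)
        return starts[name]

    for task in tasks:
        start_of(task.name)
    return starts


schedule = resolve_schedule([
    Task("video", 120.0, start_at=10.0),                      # absolute
    Task("audio", 120.0, with_task="video"),                  # relative: simultaneous
    Task("credits", 15.0, start_at=100.0, after_task="video"),  # combination of both
])
print(schedule)
```

In this example the audio track inherits the start time of the video, and the credits start only once the video has finished, even though their absolute start time would have allowed an earlier start.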

Footnote 4: It is of course possible to coordinate tasks by using the database as a coordinating medium. But this will often not be an efficient way to coordinate parallel tasks which have to fulfill fine-grained temporal constraints.

3.3 Media Representation

Let us now look at the general data representation issues which are relevant for the representation of the different media. The representation of alphanumerical data is straightforward, and formatting problems have already mostly been settled for this kind of representation. Moreover, the operating systems mostly guide the way by providing some standard set of datatypes. This is, at least today and in the near future, not the case for multimedia data. The basic datatypes like the alphanumeric ones are not appropriate to reflect the structure of multimedia data. New built-in datatypes like bitmap or audiosample have to be provided. Furthermore, type constructors taking into account the temporal nature of multimedia data will be needed in some form. Additionally, appropriate support for processing these data types has to be provided. Similar to the standard operations associated with alphanumeric data (e.g., add integers, concatenate strings) one needs operations like interactive editing of videos, and playing back and synchronizing videos and audio.

But there is another aspect which has not been considered yet: While it does not make sense to use many different formats for the same alphanumeric datatypes like integer or float (on the contrary, it is confusing and dangerous, as one can experience from programming languages like C), this is crucial for multimedia data for the following reasons:

1. Different compression techniques may be appropriate for different applications. Each format can in principle be converted to another. Different resources may need different formats. Due to the high degree of hardware dependency of multimedia data, proprietary standards are more likely to emerge. One has to take these into account as de-facto standards and has to provide for the proper openness of a system. The system must provide for a modular and efficient representation of these different formats and standards, but it must also be able to make this transparent to the user. For example, the encoding of still and of moving images is different in nature. For videos with little dynamics, differential compression may be appropriate.

2. The internal representation may not be appropriate to be presented to the user (in contrast, the representation of alphanumeric data to the user is close to the internal one). So special representations for different users, providing different views of the same data, may be needed. These may be generated on the fly or be stored persistently (e.g., as the result of a query).

To summarize, we can derive the following requirements:

• New built-in data types and operations for multimedia data are needed.
• Modular and efficient representation of different formats should be supported.
• Data representation should be transparent to the application/user.
• Different views on the same data should be possible.
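
To illustrate the last three requirements, the following Python sketch shows one possible shape of a built-in media type whose internal format stays hidden from applications. It is only an illustration under stated assumptions: the class name Bitmap, its methods, and the format names "raw" and "deflate" are hypothetical, and zlib's general-purpose compression merely stands in for a real image codec.

```python
import zlib


class Bitmap:
    """Hypothetical media type: pixel data behind a format-neutral interface."""

    # Modular format registry; further formats could be added without touching callers.
    _encoders = {"raw": lambda data: data, "deflate": zlib.compress}
    _decoders = {"raw": lambda data: data, "deflate": zlib.decompress}

    def __init__(self, pixels: bytes, storage_format: str = "deflate"):
        self._format = storage_format
        self._stored = self._encoders[storage_format](pixels)
        self._views = {}  # derived representations, kept once computed

    def pixels(self) -> bytes:
        """Return the pixel data regardless of the internal storage format."""
        return self._decoders[self._format](self._stored)

    def view(self, fmt: str) -> bytes:
        """Return the data in another format, generated on the fly and then cached."""
        if fmt not in self._views:
            self._views[fmt] = self._encoders[fmt](self.pixels())
        return self._views[fmt]


image = Bitmap(bytes(1000))           # 1000 zero bytes as stand-in pixel data
print(len(image.pixels()), len(image.view("deflate")))
```

An application calling pixels() or view() never needs to know which format is used for storage; whether a derived view is recomputed or kept persistently is a policy decision that the sketch reduces to a simple in-memory cache.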

3.4 Data Volume

From the characteristics of media types as previously described one can already see that the amounts of data to be processed can be huge. As long as data is static in time and size, like symbols, pictures, or images, no serious problems in terms of processing speed are imposed on the networks, storage devices, and main memories of current computer technology. Data dynamic in size can also be handled efficiently by using the abstraction of files as provided by operating systems. Serious problems with such data occur only in connection with applications where extremely high numbers of data elements are involved, e.g., processing satellite images for weather forecasts [25]. In contrast, data dynamic in time inherently leads to a high data volume for single data elements. Figure 5 illustrates the huge amount of data for some media types.

Media Type          Sample Format       Data Volume               Transfer Rate
Text                ASCII               1 MB / 500 pages          2 KB/page
B/W Image           G3/4-Fax            32 MB / 500 images        64 KB/page
Color Image         GIF, TIFF           1.6 GB / 500 images       3.2 MB/image
                    JPEG                0.2 GB / 500 images       0.4 MB/image
CD-music            CD-DA               52.8 MB / 5 minutes       176 KB/sec.
Consumer video      PAL                 6.6 GB / 5 minutes        22 MB/sec.
High quality video  HDTV                33 GB / 5 minutes         110 MB/sec.
Speech              μ-law, linear       2.4 MB / 5 minutes        8 KB/sec.
                    ADPCM, MPEG audio   0.6 MB, 0.2 MB / 5 min.

Figure 5: Sample media types, formats, and related data volumes and transfer rates (Source: [RNL95])

In addition to the analogous problems mentioned above we now have to deal with this huge amount of data under real time constraints. On the one hand this has serious consequences for the design of hardware, operating systems, and networks; on the other hand it must be taken into account when designing a multimedia database system. When dealing with this data it may be convenient or even necessary to perform the processing not on the data values themselves but on references to the values. A good example of this is video script editing. Certain applications of dynamic data may need operations which cannot be performed over references, e.g., copying, but which also cannot be executed in the standard way as for alphanumeric data because dynamic data exceeds the physical resources. In this case some form of dynamic data management has to be provided to spread the process over time such that at each distinct moment only a limited amount of physical resources is needed. Since dynamic operations of this kind heavily affect the behavior of a system, they must not be transparent. For example, they last for a considerable amount of time or block certain resources. The characteristics determined in this subsection can be summarized as follows:

• Appropriate referencing mechanisms to refer to multimedia data units should be provided.
• Dynamic data management for very large objects is needed.
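
Both requirements can be sketched in a few lines of Python. The sketch is an illustration only: ClipRef, cut, and copy_in_chunks are hypothetical names, and in-memory byte streams stand in for real media storage. Script editing manipulates a list of references to frame ranges, so no frame data is moved; an operation such as copying, which cannot work on references, is spread over time by processing one bounded chunk at a time.

```python
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipRef:
    """Reference to a frame range of a stored video; holds no frame data."""
    video_id: str
    first_frame: int
    last_frame: int  # inclusive

    def frame_count(self) -> int:
        return self.last_frame - self.first_frame + 1


def cut(script, index):
    """Remove one clip from an edit script without touching any video data."""
    return script[:index] + script[index + 1:]


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy a large object while holding at most chunk_size bytes in memory."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total


script = [ClipRef("v1", 0, 299), ClipRef("v2", 100, 159), ClipRef("v1", 300, 449)]
edited = cut(script, 1)
frames_left = sum(clip.frame_count() for clip in edited)

source = io.BytesIO(bytes(200_000))   # stand-in for a large media object
target = io.BytesIO()
copied = copy_in_chunks(source, target)
print(frames_left, copied)
```

The chunked copy is deliberately visible to the caller: it returns only when the whole object has been processed, which reflects the observation that such operations take considerable time and should not be hidden.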

3.5 Data Modelling

Representation of multimedia data encodes the physical reality (as explained in section 3.3) and hence is a very low level representation formalism. This leads to the problem of huge amounts of data as discussed in section 3.4. As mentioned there, references to the data, which can also be understood as abstractions of the data, are needed to process it efficiently by avoiding copies. More complex abstractions, like references enriched with more information than is needed for identification, can be used to index data to provide for fast access. Another reason for introducing such abstractions is to allow the user to refer to the data in terms of abstractions which make up his model of the application domain. These abstractions may be provided by the user or by the system based on the contents of the multimedia data (see, e.g., [JLS95]). It can be very reasonable to store these derived abstractions since their computation may be very expensive. For the retrieval and organization of the multimedia data it should be possible to provide several layers of abstractions.

Assume for example that we have a database of videos. A first possible layer of abstractions would be to identify single scenes in videos such that several abstractions may be provided for a single video. Another layer could be used to identify geometric objects in these scenes, and in a further layer the geometric objects could be related to real world entities in the scenes. A user may search for such entities based on attribute values stored elsewhere in the database, and so may access a video in which this entity occurs using indices at each layer. In order to have these features available in a multimedia database system, appropriate solutions have to be found for

• indexing mechanisms, and
• semantic and consistent modelling of abstractions.
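
The layered abstractions of the video example can be sketched as a chain of simple mappings. The Python code below is a toy illustration: all identifiers, the sample data, and the function build_entity_index are assumptions made for the example, and real systems would derive the layers by content analysis rather than list them by hand.

```python
from collections import defaultdict

# Layer 1: video -> scenes, each with its frame range.
scenes = {"video1": [("scene1", 0, 450), ("scene2", 451, 900)]}
# Layer 2: scene -> geometric objects identified in it.
objects_in_scene = {"scene1": ["obj1"], "scene2": ["obj2", "obj3"]}
# Layer 3: geometric object -> real world entity.
entity_of_object = {"obj1": "Eiffel Tower", "obj2": "Eiffel Tower", "obj3": "Seine"}


def build_entity_index(scenes, objects_in_scene, entity_of_object):
    """Derive an index from real world entities to the video fragments showing them."""
    index = defaultdict(list)
    for video_id, scene_list in scenes.items():
        for scene_id, first_frame, last_frame in scene_list:
            for obj in objects_in_scene.get(scene_id, []):
                entity = entity_of_object.get(obj)
                if entity is not None:
                    index[entity].append((video_id, scene_id, first_frame, last_frame))
    return dict(index)


entity_index = build_entity_index(scenes, objects_in_scene, entity_of_object)
print(entity_index["Seine"])
```

Because the index is derived data, it has to be rebuilt or updated whenever one of the underlying layers changes, which is one reason why consistent modelling of the abstractions matters.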

3.6 Resources

Many different physical devices are involved in processing multimedia data because one standard device cannot handle all kinds of multimedia data. There are several reasons for this: special purpose hardware (e.g., compression chips, equipment for analog or digital video/audio, presentation devices like loudspeakers, monitors, and windows), efficiency (although it is possible to store digital videos on hard discs, it may be much more efficient for retrieval to store them on laser discs), and space requirements (a few minutes of digital video easily fill up any standard hard disc). These devices can range from physical devices with their corresponding device drivers, over devices which come with all kinds of special software, to devices hidden by other database systems. Although these different types of devices and their behavior should be made transparent to the application developer as far as possible, some of their characteristics should be made visible as far as necessary. Therefore an abstraction mechanism is needed for device transparency. Some of these devices can often be used only by a limited number of applications at the same time. Therefore, appropriate mechanisms to share these resources among applications have to be provided. By classifying the devices into appropriate hierarchies and groups one can reduce redundancy and make modelling more efficient. Since the available devices evolve continuously, it must be possible to integrate them in a simple and efficient way without affecting existing systems. This can be achieved by employing the well-known modularization principle. Since so many devices may be involved in a multimedia application, the same data can reside on different devices. This should be made transparent to the application programmer; therefore a mechanism for data distribution transparency is needed.

Multimedia applications have to interact with these devices simultaneously, maybe over long-lasting periods of time, or on the basis of interrupts. We will cover the requirements derived from this fact in section 3.7 because these aspects are closely related to user interaction. We summarize the requirements of this subsection as follows:

• Device transparency should be offered to applications and users.
• Resources need to be shared among applications.
• Device classification and modelling to provide for modular application design.
• Distribution of data should be transparent to applications and users.

3.7 User Interaction

User interaction is much more complicated when multimedia data is involved. For example, input devices like microphones and cameras may be used in addition to keyboard and mouse for speech and gesture recognition, and output devices like windows, monitors, loudspeakers, and VCRs could be involved. Thus the interaction takes place simultaneously over different media, which requires (1) simultaneous control of different devices and (2) handling of interrupts from users, and (3) leads to long-lasting interactions. In the presentation as well as the retrieval of multimedia data, different modes can be used to control the quality of output and input, e.g., different resolutions and speeds, image stabilization, and browsing. While not all of these techniques and their support will be integral parts and central goals of multimedia database systems, such systems have to support any developments which take place in these directions. The major requirements with respect to user interaction can be named as follows:

• Appropriate simultaneous device interaction (see also subsection 3.6)
• Efficient (real time) handling of user interaction
• Appropriate model for long-lasting interactions
• Support for advanced user interfaces

3.8 Querying Multimedia Information

One of the major issues in querying multimedia data is content-based querying. Examples are retrieving videos according to a set of given objects which are shown in the video, or searching for specific scene cuts in a video. In the case of audio or speech one can think about searching for data which contains specific audio samples or spoken parts, e.g., searching for all broadcast news about some specific issue specified by some spoken terms. Current methods for content-based search in (encoded) audio, video, or image data are very limited because of their computational complexity and the lack of appropriate abstractions. Techniques known from the field of object recognition can help in getting better solutions for retrieving information from images (including video frames), e.g., indexing of images [G+94, WMG+94]. However, many open problems have to be addressed in order to come up with suitable, better retrieval techniques.

In addition to single media-based search one would like to access information on the basis of multiple media. Little experience exists in providing retrieval techniques which allow for the combination and comparison of multiple media. For example: Given a comprehensive recording of a computer supported meeting, including all the material presented by the individual participants, one might search for the point in the meeting at which a specific person P was talking using the words "proliferation of nuclear material". Such a query can only be processed if there exists an appropriate time-indexed textual representation (or index) of the spoken text as well as techniques to identify speakers [CHK+94]. A multimedia database system should provide

• support for content-based search, and
• support for spatial and temporal queries.
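
The meeting example can be sketched as a query over a time-indexed transcript. The Python code below assumes that speech recognition and speaker identification have already produced such a transcript; the class Utterance, the function find_utterances, and the sample data are hypothetical, and the search only matches phrases contained within a single utterance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    start: float   # seconds from the beginning of the recording
    end: float
    speaker: str
    text: str


def find_utterances(transcript, speaker, phrase):
    """Return the time intervals in which the given speaker used the phrase."""
    wanted = phrase.lower()
    return [(u.start, u.end) for u in transcript
            if u.speaker == speaker and wanted in u.text.lower()]


transcript = [
    Utterance(0.0, 12.5, "Q", "Welcome to the meeting."),
    Utterance(12.5, 40.0, "P", "We must discuss the proliferation of nuclear material today."),
    Utterance(40.0, 55.0, "Q", "The proliferation of nuclear material is on the agenda."),
]
hits = find_utterances(transcript, "P", "proliferation of nuclear material")
print(hits)
```

The returned intervals can then serve as references into the audio or video recording of the meeting, so that playback starts at the matching point.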


3.9 Typical Database Management Functionality

Many of the needs discussed so far apply to multimedia programming languages as well as to multimedia database manipulation languages. The need for database management functionality for multimedia data arises, apart from the usual reasons, from the nature of multimedia data. Multimedia applications deal with persistently stored data because even single objects involve huge amounts of data, and the processing of multimedia data very likely requires secondary storage. A few exceptions can be found in real-time applications like video-phones or video-conferences. Very often, abstractions as discussed in subsection 3.5 are based on derived data and, therefore, database management functionality is needed to maintain consistency between original data and derived data. Besides the usual reason for multi-user support, namely consistently sharing data among several users, there is the aspect of efficiency: sharing is significant for storing multimedia data since it makes no sense to maintain copies of the same multimedia data. Thus there is a need for the typical functionalities like transaction management, query languages, data dictionaries, etc., but they have to be adapted, or new concepts have to be developed, due to the characteristics of multimedia data discussed so far. In summary, there is a strong need to provide

• persistent secondary storage management,
• consistent management of derived data, and
• efficient sharing of multimedia data among applications.

4 BUILDING BLOCKS FOR MULTIMEDIA DATABASE SYSTEMS

4.1 The Notion of Multimedia Database Management Systems

In order to define the notion of a multimedia database management system, to analyze the requirements such a system has to satisfy, and to propose system components, it is worthwhile to first make some general considerations on database management systems. A database management system can be characterized by certain properties the system exhibits when dealing with data. The kind of data that is dealt with and the operations that can be performed on the data are determined by the data model the DBMS is based on. The properties and the data model together constitute the logical model of what a DBMS is.

Let us discuss the notion of data model first. The data model determines roughly three aspects with regard to the data the DBMS manages: the data structures, the operations defined for the data structures, and the constraints that are to be satisfied by the data structures and operations. To illustrate this let us consider the table given in Figure 6. The table contains characteristic features of the data models of different kinds of database systems. The more advanced database systems typically include some of the features listed for the previous systems, e.g., a multimedia database system often supports an object-oriented data model, and an object-oriented database system should support a declarative query language. The table clearly shows that a fundamental difference between conventional DBMSs and multimedia DBMSs is the support of multimedia data types, including their operations and constraints. We will later give a more detailed discussion of multimedia data models and the kind of features they require.

Data Model Feature   Relational Model   Object-oriented Model             Multimedia Model
Structural model     Relations          Objects, Attributes, References   continuous data streams (e.g. audio, video)
Behavioral model     SQL                Methods                           time-dependent operations
Constraints          primary keys       referential integrity             quality of service parameters

Figure 6: Different constituents of data models for different types of database systems

The DDL/DML allows the user of the database system to define application-specific data structures, operations, and constraints on the basis of the data model by means of database schemas. Now we want to discuss the properties a DBMS has to support. These properties are basically necessary to guarantee the consistency of the data and operations that are supported in the DBMS under different modes of operation of the system.

• Persistency: The primary purpose of a DBMS is to store data exceeding the life cycle of an application program. This property of the system is generally called persistency.
• Decoupling of applications: One of the central ideas of a DBMS is to modularize functionality that is generally required in the context of persistent data storage and to provide it in an application-independent way. That is, different applications that use the data need not reimplement these basic services.
• Database system interface: In order to be usable, the DBMS provides appropriate interfaces to applications (or users). These interfaces need to support the operations that are provided with the data model. The use of the interfaces must guarantee the consistency constraints that are given by the data model and the database schemas.
• Multi-User Access: Multiple users must be able to access the database through its interface simultaneously, while maintaining the consistency constraints given by the data model and the database schemas.
• Recovery: In case of failure of the data manipulation operations of an application or of the database system, a consistent state of the database must be recoverable.

Further features are usually supported by a DBMS, for example distribution, authorization, or interoperability. We omit a detailed discussion of these as they are not central at this point to substantiate the notion of a multimedia DBMS. All of the above properties of a DBMS are reflected more or less explicitly in the DDL/DML. With the DDL/DML the user can thus determine not only the kind of data that is stored in a database system but also the ways to operate on it, e.g., multi-user access and recovery by means of the definition of transactions. When considering the above list, none of the DBMS properties is specific to multimedia data. The important observation is that multimedia data has no impact on the properties that characterize a system as a database system; rather, the impact is on the data model that is supported by the system. This of course does not exclude that the importance or the quality of certain properties might shift in a multimedia database system, e.g., interoperability might play a much more important role or long-lasting transactions might become an issue.

As a consequence of extending the existing data models to multimedia data models, deep reaching changes will typically be needed in the internal architecture of the database system. A DBMS supports the DBMS properties by different functional units that map the logical model of the DBMS to a physical level, i.e., the database implementation language, the operating system, the network system, etc. Typical functional building blocks of a conventional DBMS are storage management, buffer management, log management, transaction management, lock management, query processing and optimization, index management, and the DBMS API or DDL/DML interpreter/compiler. The principle of data independence states that the DBMS maps the data model and the DBMS properties to these functional units such that the user is relieved from dealing with the physical details of data storage and manipulation. The functional units must support the logical DBMS model in such a way that the operations are performed consistently with the model. The important issue, however, is that the operations are performed efficiently, despite the (complex) mapping from the logical level to the physical level. This can be achieved in particular by exploiting as much of the semantics of the given data model as possible. Because the data model changes drastically for MMDBMSs, the implementation of the functional units is the point where the shift from a conventional to a multimedia DBMS has its greatest impact.

Extendible database systems, which allow the definition of arbitrary abstract datatypes, in principle offer the possibility to realize a multimedia data model in a DBMS. As any countable structure can be encoded in any other countable structure, and any computationally complete programming language can be used to simulate any other language, it is of course in principle possible to represent any data model. However, the important question is in general not how expressive the data model is, but which constructs and operations of the data model are directly and efficiently supported by the DBMS. Direct support means that the implementation of the corresponding construct or operation can exploit the semantics (e.g., consistency constraints) to obtain an efficient realization on the physical level of the underlying database system. Therefore, a real multimedia DBMS offers dedicated services for efficiently (and consistently) managing multimedia data. Typical functional units of this kind, which will be discussed in the following, are multimedia data presentation management, resource management, continuous object management, and interaction management. Conventional services, like storage management, buffer management, or query processing, in general need to be extended or adapted.

4.2 Multimedia Data Models

As discussed in the previous section, on a logical level the difference between a conventional DBMS and a multimedia DBMS is the different data model. Therefore we first discuss the modelling concepts that are required to represent the semantics of multimedia data.

4.2.1 Multimedia Data Abstraction

Reference mechanism: Often the manipulations on multimedia data do not take place on the data itself, but on some logical structure that is imposed on it. For example, when a user gets back a complexly composed multimedia document, such as a product advertisement containing video clips, audio, images, etc., it does not make sense to transfer the complete data to the user regardless of what he plans to do with the result. Maybe he just wants to refer to that advertisement on a logical basis in a subsequent query. Or the user may choose to jump to a particular video frame in order to play back some fragment of the advertisement video in, e.g., slow motion. In the first case no media data content has to be transferred; it suffices to use logical references or identifiers. In the second case, too, not the complete document needs to be transferred but just a part of it, which is first identified by logical references. This shows that any multimedia data model requires a powerful reference mechanism, as is offered for example by object-oriented data models.

[Figure 7: Different media bound to a common time line. Tracks shown along the time axis t: Movie (Video), Music (Audio), German Text (Audio), English Text (Audio), Subtitle 1 (Text), Subtitle 2 (Text).]

Metadata: Processing multimedia data heavily depends on the availability of appropriate metadata on the multimedia data. If no such metadata describing the content, structure, semantics, etc. of multimedia data is available in a database, it is very hard to come up with efficient and powerful solutions. Hence, it is very important to provide for the (semi-)automatic generation of metadata by analyzing the original multimedia data. For example, in order to allow for querying particular fragments of a video clip one needs to know about, e.g., scene cuts and objects contained in the video. In order to test whether two (fragments of) audio streams are related to the same (piece of) music, powerful comparison operators on audio need to be defined which operate on appropriate metadata. As the metadata is typically derived from the multimedia data content, the data model needs some concept to support derived data. A good overview of the problems related to metadata on multimedia data and of techniques for using metadata for multimedia retrieval purposes is given in [KS94].

4.2.2 Time-dependent data

Solutions that reflect the time-related characteristics and requirements of continuous data have to address the description of temporal relationships, their processing, and synchronization constraints.

Time-dependent data structures. The main ingredients for modeling time-dependent data structures are composition principles like sequential composition and parallel composition. One approach to the modeling of multimedia data composition is the imitation of conventional devices like movie projectors or video recorders on a digital basis. The most time-critical medium is used as the reference for the other media. For example, in the case of a video that is synchronized with audio, the audio will be used for reference. This modelling concept may be useful for special-purpose systems, e.g., for playback of fixed media compositions as in the case of a video-on-demand system.

A more general approach is the introduction of an abstract temporal dimension. Referencing a common time line (see Figure 7) provides independence among the single media components and therefore allows components to be changed, added, or removed without affecting the other components involved in a complexly composed multimedia object. Nevertheless, it is possible to handle the whole media composition like a single entity during cut, copy, paste, scaling, and playback operations. An advantage of this approach is the analogy between manipulating time-dependent media and the well-understood, traditional method of specifying media compositions from movie and audio tapes. The time axis may also be integrated with an (existing) spatial coordinate system. Several systems use the model of an integrated time axis (Athena Muse [HSA89], DVI [Gre92], QuickTime [App91], AV Databases [GBT93]).

The concept of a common time line is sufficient to describe multimedia data in their final form for presentation. For the process of multimedia authoring, additional means of abstraction may be necessary. For example, it is more convenient for the editor of a movie to specify the sequence of the desired shots instead of specifying points in time for every start of a shot. Changing the length of a single shot does not change the relationship between the parts of the sequence, but would require the adjustment of all subsequent shots on the time line. Graph-based and language-based approaches have been proposed in this context. Object Composition Petri Nets (OCPN) [LG90b] are an example of the graph-based approach. OCPN is a formal model which provides for the description of any temporal relationship and for the structured modeling of synchronization conditions using hierarchically nested structures. The approach is especially appropriate for the modeling of event-driven systems. HyTime [NKN91] is an example of a language-based approach. It is directed towards the modeling of multimedia documents.
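The relationship between authoring-level composition and the common time line can be sketched in Python as follows. The class names `Clip`, `Seq`, and `Par` and the function `to_timeline` are hypothetical; this is an illustration of sequential and parallel composition, not the model of any of the cited systems.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Clip:
    name: str
    duration: float  # seconds


@dataclass
class Seq:
    parts: List["Node"]  # parts are presented one after another


@dataclass
class Par:
    parts: List["Node"]  # parts start together


Node = Union[Clip, Seq, Par]


def duration(node: Node) -> float:
    """Total duration of a composition."""
    if isinstance(node, Clip):
        return node.duration
    if isinstance(node, Seq):
        return sum(duration(p) for p in node.parts)
    return max((duration(p) for p in node.parts), default=0.0)


def to_timeline(node: Node, start: float = 0.0) -> List[Tuple[str, float, float]]:
    """Resolve a composition into (name, start, end) entries on one time line."""
    if isinstance(node, Clip):
        return [(node.name, start, start + node.duration)]
    entries: List[Tuple[str, float, float]] = []
    offset = start
    for part in node.parts:
        entries.extend(to_timeline(part, offset))
        if isinstance(node, Seq):
            offset += duration(part)  # only sequences advance the start time
    return entries


movie = Par([Seq([Clip("shot1", 4.0), Clip("shot2", 6.0)]), Clip("music", 10.0)])
timeline = to_timeline(movie)
```

Because start times are derived from the sequence, lengthening `shot1` shifts `shot2` automatically when the time line is recomputed, which is the convenience the authoring-level abstraction is meant to provide.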

Time-dependent operations. Operations are needed to model the access to the database, e.g., for further processing or presentation of data. As long as only static data are involved, the conventional concept of data delivery, namely by function return values, is sufficient. For example, in the case of image data, the complete compressed images are passed to an (external) viewer, which is provided by the database system as the standard default viewing mechanism. The general form of such operations is

$$op : P \times DB \rightarrow R \times DB.$$

An operation $op(p, db)$ takes parameter(s) $p \in P$, is applied to a database state $db \in DB$, returns a result value of a domain $R$, and potentially results in a new database state.

As soon as time-dependency is involved, the situation becomes more complicated. In general it will not be possible to deliver the data to the application at once, due to the potentially huge size of the result. Rather, the database system should support a mechanism to deliver the data at the appropriate time, e.g., when presenting a video on a video viewer (or delivering it to any other application). Modeling this requires operations that consider the temporal dimension. In the simplest case such an operation additionally has a duration attached. In this case the operation has a signature of the type

$$op : P \times DB \times T \rightarrow R \times DB \times T.$$

$T$ is the time axis (in whatever coordinate system); the operation $op(p, db, t)$ depends additionally on the time $t \in T$ at which it is started, and it terminates at a later point in time in $T$, as indicated in the signature. A typical example of such an operation is the playback operation for video or audio. Implicitly this operation has the additional semantics of a certain duration, which should be supported by the DBMS. The data delivery occurs in this case as a side effect under the control of the DBMS.
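The difference between the two kinds of operations can be illustrated with a small Python sketch. The dictionary-based database, the function names, and the fixed delivery rate are assumptions made for illustration only; a real system would deliver units under DBMS control rather than merely compute due times.

```python
from typing import Dict, Iterator, List, Tuple


def get_image(db: Dict[str, bytes], key: str) -> bytes:
    """Static case: op(p, db) returns the complete value at once."""
    return db[key]


def play(
    db: Dict[str, List[bytes]], key: str, t_start: float, rate: float
) -> Iterator[Tuple[float, bytes]]:
    """Time-dependent case: op(p, db, t) delivers data units over time.

    Yields (due_time, unit) pairs; `rate` is the number of units per second,
    so unit i is due at t_start + i / rate.
    """
    for i, unit in enumerate(db[key]):
        yield (t_start + i / rate, unit)


video_db = {"clip": [b"f0", b"f1", b"f2"]}
schedule = list(play(video_db, "clip", t_start=10.0, rate=2.0))
```

The generator makes the duration of the operation explicit: the caller consumes units as they become due instead of receiving one large return value.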
If the data model has to allow, for example, explicit control over data transport in time, more complex concepts are needed for the synchronization of different temporal operations, like schedules or scripts [AK94, BRR94], which capture the characteristics of time-dependent operations and can be used to describe the high-level scheduling of the individual processing steps. These concepts are needed on a very high level, such as the scheduling of an entire presentation, but also at lower levels, such as the synchronization of individual continuous data streams by means of data units (video frames, audio samples).

Known approaches for synchronization at the level of presentation scheduling can be classified in terms of different ways of generating events [BDE+93]: an approach based on action-driven event generation can be found in, e.g., [LG90b]; approaches based on reference-point-driven event generation are described in [BHL91, AK94]; and approaches based on time-system-driven event generation are proposed in [Gib91, BHL91]. HyTime [NKN91] and MHEG [Pri93] also provide synchronization schemes based on time-system-driven event generation. All of these concepts support the specification of static presentation schemes but have their deficiencies. Approaches based on time-system-driven event generation, except HyTime and MHEG, are limited as they do not support the specification of presentation schemes which include user-driven actions, i.e., user interaction. Little experience is available showing whether the expressive power of these schemes meets the requirements of advanced multimedia applications. Low-level synchronization of continuous data streams is also a service which will be offered to some extent by advanced networks and communication systems [AH91, And90].

In addition to the synchronization of the processing of the media data, concepts are needed to model the interaction with the user of the data, e.g., a user viewing a multimedia presentation. Interaction with the user may include standard functions like stop, start, continue, and pause, but also functions which allow the user to change the presentation speed, to randomly access specific points in a presentation, etc. That means the data model has to include concepts of interruptability, signaling, and referring to events as they are provided, e.g., in active database models (for an object-oriented model including this see, e.g., [AK94]). Another useful concept for modelling interaction with users is to distinguish the notions of world time and object time [GDT91].

Synchronization constraints. Synchronization constraints are needed in order to specify to what extent operations are executed consistently with their temporal constraints, relative to an ideal execution, which in general is technically not feasible. To describe the requirements of a multimedia application with respect to the functionality of the system components used, in particular with regard to synchronization constraints, on the one side, and to characterize the performance of multimedia system components on the other side, the notion of quality of service (QOS) has been introduced [LG90a]. These QOS parameters constitute a comprehensive set of constraints that ensure the quality of the execution of time-dependent multimedia operations. In the following we give a list of the important quality of service parameters:

• Average delay describes the time between the triggering event (e.g., user interaction) and the observable reaction of the system by means of executing an operation. For example, the average delay between submitting a query to a database system and getting back the (multimedia) result is a critical parameter for user acceptance.
• Speed ratio is defined as the ratio between the originally intended and the actually achieved presentation rate. This parameter relates the actual presentation speed to real time and therefore allows the specification of increased or decreased playback speed.
• Utilization describes the ratio between the amount of media data used for the actual presentation rate and the total amount of data available for this presentation. For example, using only 8 bits out of 16 bits of audio information corresponds to a utilization of 1/2.

• Jitter is a measure of the temporal deviation of two simultaneous presentations at a certain point in time.
• Skew is a measure of the accumulated temporal deviation of two simultaneous presentations during a certain interval of time. Figure 8 shows the results of a test on the maximum permissible skew for several media compositions.
• Reliability describes the average frequency of errors during a given time interval of media presentation or recording. Reliability may be measured at multiple levels such as bits, packets, or whole frames.

A table of typical synchronization requirements is given in Figure 8.

Media               Mode, Application                                          QoS
video   animation   correlated                                                 ±120 ms
video   audio       lip synch.                                                 ±80 ms
video   image       overlay                                                    ±240 ms
video   image       non overlay                                                ±500 ms
video   text        overlay                                                    ±240 ms
video   text        non overlay                                                ±500 ms
audio   animation   event correlation (e.g. dancing)                           ±80 ms
audio   audio       tightly coupled (stereo)                                   ±11 µs
audio   audio       loosely coupled (dialog mode with various participants)    ±120 ms
audio   audio       loosely coupled (e.g. background music)                    ±500 ms
audio   image       tightly coupled (e.g. music with notes)                    ±5 ms
audio   image       loosely coupled (e.g. slide show)                          ±500 ms
audio   text        text annotation                                            ±240 ms
audio   pointer     audio relates to shown item                                -500 ms (2), +750 ms (3)

Figure 8: Synchronization Requirements for Different Media Compositions and Applications (for detailed results see [SE93])
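Some of the parameters listed above can be computed from presentation timestamps, as in the following Python sketch. The function names and the millisecond timestamp lists are hypothetical, and reading "accumulated" deviation as a sum of pointwise deviations is an assumption, not a definition taken from [LG90a].

```python
from typing import Sequence


def utilization(used: float, available: float) -> float:
    """Share of the available media data actually used for presentation."""
    return used / available


def jitter(times_a: Sequence[int], times_b: Sequence[int], i: int) -> int:
    """Deviation (ms) between two presentations at synchronization point i."""
    return abs(times_a[i] - times_b[i])


def skew(times_a: Sequence[int], times_b: Sequence[int]) -> int:
    """Deviation (ms) accumulated over an interval (assumed here: a plain sum)."""
    return sum(abs(a - b) for a, b in zip(times_a, times_b))


def within_tolerance(offset_ms: int, lower_ms: int, upper_ms: int) -> bool:
    """Check a measured offset against a permissible range such as Figure 8's."""
    return lower_ms <= offset_ms <= upper_ms


# Presentation times (ms) of corresponding video frames and audio blocks.
video = [0, 40, 80, 120]
audio = [0, 45, 85, 130]
worst_offset = max(jitter(video, audio, i) for i in range(len(video)))
```

With these sample values the largest pointwise offset is 10 ms, which `within_tolerance(worst_offset, -80, 80)` accepts under the lip synchronization value of Figure 8.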

Notes to Figure 8: (2) pointer ahead of audio; (3) pointer behind audio.

4.2.3 Query Processing and Retrieval

Providing powerful querying facilities on multimedia data is a crucial issue. The conventional query paradigm of traditional database systems only deals with exact queries on conventional types of data. This might be sufficient to deal with queries posed against metadata and multimedia data abstractions, which are definitely of great importance. Nevertheless, querying multimedia databases requires additional concepts. Several approaches addressing these issues are known. Approaches incorporating specific domain knowledge (e.g., [YKHI94]) often allow special-purpose querying at the price of quite high costs for the acquisition and maintenance of the domain knowledge. Other solutions are related to information retrieval approaches, for example, speech retrieval [SG94] and content-based video retrieval [WDG94]. Extensions of conventional query language concepts will be required that take account of the particular characteristics of multimedia data. Existing approaches that cover important aspects are temporal and spatial query languages, which can deal with the temporal and spatial semantics of multimedia data, and query languages that incorporate concepts of vagueness or fuzziness. The latter are particularly important in combination with content-based access, where the results are in general of an imprecise nature.
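The contrast between exact and vague queries can be sketched in Python as a ranked similarity search over feature vectors. The feature vectors, the cosine measure, and the threshold are illustrative assumptions and are not taken from any of the cited retrieval approaches.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def vague_query(
    features: Dict[str, Sequence[float]], probe: Sequence[float], threshold: float
) -> List[Tuple[str, float]]:
    """Return (object id, score) pairs at or above the threshold, best first."""
    scored = [(oid, cosine(vec, probe)) for oid, vec in features.items()]
    hits = [(oid, score) for oid, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical content features of three stored media objects.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
result = vague_query(features, probe=[1.0, 0.0], threshold=0.5)
```

Unlike an exact query, the result is a ranking with scores, so the application has to decide how many imprecise matches to present.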

4.2.4 Object-Oriented Paradigm

The object-oriented paradigm provides several useful concepts which we can exploit to meet particular requirements. This is also the reason why most approaches for multimedia database systems follow object-oriented principles (e.g., [DG92, Kla92, WKL86]).

• Object identity leads to a referencing mechanism, provided that object identifiers are made explicitly available in the data model.
• Encapsulation, message passing: Encapsulation is one fundamental building block for (static) transparency of definition because it allows details of implementation to be hidden from the application programmer and uniform method interfaces to be provided. Message passing is one fundamental building block for (dynamic) transparency of execution because it allows for a separation between method interfaces and method implementations at runtime. Both mechanisms contribute to, e.g., the requirements regarding the modelling of device transparency, modular device classification, data distribution transparency, modular and efficient representation of different media formats, and transparency of data representation.
• Class taxonomy and inheritance: They allow for reusability of definitions and implementations by sharing common parts and are the tool to provide an ontological order of the application domain.
• Views: One has to distinguish two problems: first, the hierarchical organization of interfaces and code in order to ensure certain access to objects, and second, the integration of inhomogeneous interfaces for users who are not interested in the details. An important mechanism to do so (which is still not well understood in the framework of object-oriented modelling) is the concept of views. Views are important to provide the right level of abstraction of the resources, data distribution, and data representation. Views are also important to provide different views of the same data, e.g., to realize the often mentioned distinction between internal and external representation of data.
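A minimal Python sketch of how these concepts combine: object identifiers serve as logical references, and a uniform method interface hides media-specific implementations. The class names and the in-memory `store` are hypothetical and do not reflect the VODAK data model.

```python
from abc import ABC, abstractmethod
from typing import Dict


class MediaObject(ABC):
    def __init__(self, oid: str) -> None:
        self.oid = oid  # object identity doubles as a logical reference

    @abstractmethod
    def present(self) -> str:
        """Uniform interface; each subclass hides its own representation."""


class Image(MediaObject):
    def present(self) -> str:
        return f"display image {self.oid}"


class Video(MediaObject):
    def present(self) -> str:
        return f"play video {self.oid}"


# Hypothetical object store keyed by object identifier.
store: Dict[str, MediaObject] = {m.oid: m for m in (Image("img1"), Video("vid1"))}


def present_by_reference(oid: str) -> str:
    """Resolve a logical reference and send the same message to any media type."""
    return store[oid].present()
```

A caller only needs the identifier and the `present` message; which implementation runs is decided at runtime, which is the separation of interface and implementation described above.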

The object-oriented paradigm is the basis for many systems which provide the appropriate and powerful mechanisms for semantic modelling. In VODAK [GMD95] this mechanism is provided by the concept of semantic relationships. It has already proven its powerful modelling capabilities in many applications, e.g., [KNS90, KN90, Kla92, KAN93, ABH94, BA94, BAH94, AK94].

4.3 Exploiting Traditional Database System Technology

Traditional state-of-the-art database systems offer, to a certain degree, the possibility to realize certain aspects of multimedia data models. In this section we look at possible solutions and their limitations. Of particular interest, due to their general availability, are database systems based on the relational model and the object-oriented model. It can also be expected that new developments in the areas of active and real-time database systems will provide additional features which can contribute to the design of a multimedia database system.

As already identified in section 3.5, appropriate data types are needed to represent multimedia data in addition to the traditional standard data types. Some conventional data models offer long fields [HL82] and BLOBs (binary large objects) as special types to support multimedia data. However, these types reduce the view of multimedia data to single large, uninterpreted data values, which is not adequate for the rich semantics of multimedia data. In particular, the time-dependent operations cannot be modelled adequately. Consequently, a database system based on this model of multimedia data provides not much more than storage capacity. All the processing of the multimedia data has to be performed within the application. Size restrictions on data units often limit the usefulness for continuous media with high data volume.

Relational database systems together with the query language SQL are a well-understood technology. Therefore, it seems attractive to use this technology for the storage and retrieval of multimedia information. In the relational data model, however, the storage of multimedia documents encounters additional difficulties due to the complex hierarchical and sequential structures that typically occur and that are difficult to map to relational structures. This leads to many tables and complex retrieval expressions [MW91]. To overcome the limitations of relational database systems and to support the whole range of media types and possible multimedia document structures, it was recognized that substantial modifications and extensions are necessary, even for the management of static multimedia data. The STARBURST system [Haa88] is an example of an extensible relational database system that allows types and query language extensions for media data to be added, as well as capabilities for the representation of multimedia documents. Another example is POSTGRES, which could be extended with multimedia abstract data types. Extensibility can be exploited to incorporate external functionality into the system, for example in order to include retrieval functionality that is already available in specialized systems.
Similar observations as for extended relational database systems hold in principle for object-oriented database systems. The object-oriented approach offers mechanisms to define new data types (or classes) together with their operations. It allows the extension of existing data types using subtyping mechanisms, and it allows the modeling of complex relationships between the stored entities. This results in better support for the modeling of complex structured multimedia objects and the definition of abstract media types and operations on media data units. These capabilities allow the usage of object-oriented database systems for multimedia applications that deal with media like graphics, text, and pictures. Static multimedia documents with complex structure can be modeled without restrictions. For time-dependent media and dynamic documents the problems of stream-oriented access, real-time access, and appropriate storage techniques still remain. Temporal modeling capabilities equivalent to their structural counterparts are also missing. If these limitations can be overcome by extensions that fit into the object-oriented framework, the resulting system should be able to fulfill all requirements for a complete multimedia database system.

Active and real-time database technology provides much better support for handling time-dependent data and efficient interaction, but lacks dedicated support for the management of continuous data.

The techniques for modelling multimedia data using traditional database system technology can be summarized as follows:

• External References: A database contains only references to files which contain the original multimedia data. In addition, the database may contain derived data modeled as additional attributes in the database. For example, it may contain information about the length of a video stream, about the output devices and decompression techniques needed, and about the content of the video. Obviously, the database system can only provide services that make use of the metadata, not services on the original multimedia data.
• Uninterpreted Storage of Multimedia Data: A database stores multimedia data in attributes of type long field or BLOB (binary large object). In this case the database system provides persistency for the data, buffering when accessing the data, and multi-user support, recovery, and authorization concepts at the granularity of long fields. The data stored is still uninterpreted and the functions offered to operate on the data are generic.
• Using External Functions: Some database systems allow external functions to be called to process data stored in the database. The reason for this is the limitations of data manipulation languages such as SQL, which are not universal programming languages. This technique is orthogonal to the other approaches and is only useful in combination with some other approach. However, it is often very useful to reuse external algorithms, programs, and tools in the context of multimedia presentations.
• Object-Oriented System: The object-oriented approach allows application-specific data types and classes, including associated operations, to be modeled. This approach offers the most suitable support but still lacks some features, such as support for time-dependent data, user interaction, and content-based query and retrieval techniques.

The main issues that cannot be resolved by conventional approaches are summarized in the following list.

• Consistency of persistent data: With approaches employing external references, consistency, or persistent storage at all, cannot be guaranteed by the DBMS. The DBMS cannot exert any control over the referenced files.
• Application decoupling for multimedia data types: Approaches using uninterpreted storage for multimedia data, but not providing mechanisms for defining abstract data types, cannot provide a decoupling of the multimedia data semantics from the applications. Thus each application has to reimplement this semantics on its own.
• Support of synchronization constraints: None of the approaches is able to support the synchronization constraints as given by the composition of multimedia data and operations, or by QOS parameters. This leads to a deficiency in the database interface, as it is not able to support the operations provided by the model appropriately. The situation becomes even more serious under multi-user access, as in this case the resources need to be shared in such a way that the data can be delivered consistently to the user, i.e., while maintaining the synchronization constraints.
• Efficiency of DBMS support for multimedia operations: Although in many cases the semantics of multimedia data can be supported by ADT definitions, this support may lack efficiency, as the operations are not directly implemented using the DBMS resources (DBMS implementation language, operating system, etc.) but in the DML. This corresponds to an interpretative approach to processing operations on multimedia data.
• Efficient support of continuous data by long fields: For efficiently processing continuous data, particular storage and buffering concepts are needed. An implementation based on general-purpose long fields does not support such mechanisms in general.

Figure 9: Reference Architecture for a Multimedia Database System. Components shown: MM Clients (each with Application, MM Playout Manager, STI-Script Interpreter, SM Presenters, and Continuous Object Manager (COM)); an MM-capable LAN/MAN carrying multimedia data and a traditional LAN/MAN carrying conventional data; and the MM DBMS Server (DBMS Interface/API, Query Processor, Script Generator, Retrieval Engine, Transaction Manager, Object Manager, Continuous Object Manager, and External Media Server).

4.4 A Reference Architecture For Multimedia Database Systems

In this section we discuss the structure of a multimedia database management system in terms of basic functional units. The new dedicated components are intended to support the DBMS properties for a multimedia data model better than can be achieved (by simulation) in a conventional system. Figure 9 shows the reference architecture as it is also used to develop a multimedia database system based on VODAK extensions in the context of the AMOS project at GMD-IPSI. The architecture follows a client/server model. The main building blocks at the client side are

• multimedia playout services for controlling and managing the interactive presentation of multimedia information,
• media-specific presentation devices which can be combined into virtual presentation devices for the presentation of composite multimedia information,
• a presentation script interpreter, which executes the script encoding the presentation to be realized, and
• continuous object management services for retrieving and handling continuous data streams delivered from the database server.

The building blocks at the server side are

• conventional services including object management for conventional data, transaction management, query processing, etc.,
• continuous object management services for time-dependent data, which has to be delivered according to quality of service parameters via a high-speed network to the database system clients,
• support for external media storage devices such as CD-ROMs, and
• content-based retrieval services.

The following simple scenario illustrates the functionality of the system. Let us assume that a user sets up a request and gets back a composite multimedia entity which he wishes to play back. The system behaves as follows: The Multimedia Playout Manager forwards the user request to the server, which returns a result to the client. The result can be, for example, a simple object identifier of a composite multimedia object, or it can be a more complexly structured object which contains references to multimedia data. Let us assume in our example that the result is an object identifier of a composite multimedia object. Once the result is indicated to the user at the client, he decides to have the data presented. Hence, he submits a request to, e.g., play back a video with a sound-track in French. The request is handled by the Multimedia Playout Manager and forwarded to the server in order to receive the presentation plan for the composite object. As soon as the client gets back the presentation plan generated by the Script Generator at the server, the Script Interpreter starts executing the plan. That leads to the initialization of the Continuous Object Manager and of specific Single-Media Presenters for the individual media types involved in the presentation. Furthermore, a data request is sent from the client to the server in order to receive the continuous data from the Continuous Object Manager residing at the server site.
The Script Interpreter executes the individual presentation steps using the Single-Media Presenters, which are controlled and synchronized by the Multimedia Playout Manager according to the quality of service parameters and the synchronization constraints. The continuous data streams for the sound-track and the video are sent directly to the Continuous Object Manager at the client site over a high-speed network providing an appropriate transport protocol for continuous data. In the case of more complicated presentations the presentation plan may change according to user interactions. In that case a new plan is requested from the server (either a recomputed one or a precomputed alternative plan). Optionally, only relevant fragments of a presentation plan may be requested by the client.
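The message flow of this scenario can be traced with a small Python simulation. The classes `Server` and `Client`, their methods, and the catalog layout are hypothetical stand-ins for the components of Figure 9; networking, synchronization, and QOS handling are deliberately left out.

```python
from typing import Dict, Iterator, List, Tuple


class Server:
    def __init__(self, catalog: Dict[str, dict]) -> None:
        # oid -> {"title": str, "tracks": {track name: list of data units}}
        self.catalog = catalog

    def query(self, title: str) -> str:
        """Query processing: the result is only an object identifier."""
        return next(oid for oid, obj in self.catalog.items() if obj["title"] == title)

    def presentation_plan(self, oid: str, audio_language: str) -> List[str]:
        """Script Generator stand-in: names the tracks to be presented."""
        return ["video", f"audio-{audio_language}"]

    def stream(self, oid: str, track: str) -> Iterator[bytes]:
        """Server-side continuous object manager stand-in: delivers data units."""
        yield from self.catalog[oid]["tracks"][track]


class Client:
    def __init__(self, server: Server) -> None:
        self.server = server

    def play(self, title: str, audio_language: str) -> List[Tuple[str, int]]:
        oid = self.server.query(title)                           # 1. logical result
        plan = self.server.presentation_plan(oid, audio_language)  # 2. get plan
        presented = []
        for track in plan:                                       # 3. interpret plan
            units = list(self.server.stream(oid, track))         # 4. receive stream
            presented.append((track, len(units)))
        return presented


catalog = {
    "obj1": {
        "title": "advertisement",
        "tracks": {
            "video": [b"v0", b"v1"],
            "audio-fr": [b"a0", b"a1"],
            "audio-en": [b"e0"],
        },
    }
}
log = Client(Server(catalog)).play("advertisement", "fr")
```

Note that the English sound-track is never transferred: only the tracks named in the plan are requested from the server.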

4.4.1 Playout Management

The Playout Manager primarily aims at efficient support for presenting and manipulating continuous data and for controlling the presentation interactively. Assuming a client/server environment, users are allowed to specify individual quality of service parameters in order to match the available bandwidth and application needs. The relevant quality of service (QOS) parameters from the perspective of the database are the average delay at the beginning of a presentation, the speed ratio between desired and actual speed, and the utilization of stored and presented data. The parameters are not independent of each other; e.g., when presenting a video it might be appropriate to fix the speed ratio and to change the utilization in order to overcome "bottlenecks".
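The trade-off between speed ratio and utilization can be made concrete with a short Python sketch. It assumes that the speed ratio is kept fixed and that utilization is lowered by dropping evenly spaced frames; the function names and the data rates are hypothetical.

```python
from typing import List


def utilization_for_bandwidth(required_rate: float, available_rate: float) -> float:
    """Fraction of the stored data that fits the channel at the intended speed."""
    if required_rate <= 0:
        raise ValueError("required_rate must be positive")
    return min(1.0, available_rate / required_rate)


def frames_to_send(frame_count: int, utilization: float) -> List[int]:
    """Pick evenly spaced frames so that roughly `utilization` of them are sent."""
    if utilization <= 0:
        return []
    step = max(1, round(1 / utilization))
    return list(range(0, frame_count, step))


# A stream needing 8 Mbit/s over a channel offering 4 Mbit/s.
u = utilization_for_bandwidth(required_rate=8.0, available_rate=4.0)
selected = frames_to_send(frame_count=8, utilization=u)
```

In this example half of the frames are delivered, so the video still runs at its intended speed but with reduced temporal resolution.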

[Figure 10 panels: buffer contents for Object->play(...) (frames 1 to 8) and for Object->fastplay(...) (frames 5, 6, 7, 8, 10, 12, 14)]

Figure 10: Continuous Object Management

The single units of continuous data, for instance the frames of a video clip, must not be presented at arbitrary points in time. Instead, the presentation has to be performed according to a certain speed rate. The Playout Manager is aware of the corresponding intramedia synchronization requirements according to the QOS. Some of the presentation control commands, like the stop command, are addressed to currently running presentations. Hence, interruptability, for example to stop all media streams at the same point in time, is a central requirement met by the Playout Manager. Additionally, the management of specific multimedia devices (e.g., audio boards or decompression chips) and presentation devices, so-called Single-Media Presenters, is performed by the Playout Manager. A detailed description of an approach to multimedia playout management is given in [TK95a, TK95b].

4.4.2 Continuous Data Management

The Continuous Object Manager frees the applications from considering time-dependency during media capture and presentation. Continuous object management functionality is categorized into object handling, direct access, and buffer resource management. Additionally, it has been found that traditional communication protocols, e.g., TCP/IP or OSI-like protocols, are not sufficient for the real-time requirements of multimedia applications [LG90a, Nic90]. Therefore, a multimedia transport protocol between the Continuous Object Managers at the client and server sites is provided.

The normal client/server distribution of VODAK is constructed in such a way that a central server component performs all method calls coming from the different clients. In the case of a multimedia application this concept was no longer feasible, mainly because interactions on a media stream could not be supported on the client side without retransmission of the data, which is unnecessary in many cases. This required a distributed database buffer which is maintained by the continuous object managers on the server and the client.

The support of interactions for continuous data leads to a new understanding of buffer management strategies. The well-known static buffer preloading and replacement strategies (e.g., most recently used, etc.) are substituted by more elaborate algorithms which consider the actual structure and behavior of continuous data streams, as discussed in [R+94]. The primary idea is described by an example of the presentation of an S-JPEG (Sequence JPEG) video clip (Figure 10). At the beginning of the presentation the method call play() is sent to the respective object Object. The continuous object manager initializes its buffer by continuously preloading the JPEG frames which are needed to best support the presentation state play. In our example in Figure 10, frame 4 is being presented, frames 1 to 3 were already displayed, and frames 5 to 8 are preloaded.
While consuming frame 4, the user changes the presentation status from play to fastplay. The continuous object manager reorganizes its internal structure on behalf of the new message call fastplay(). The fastplay operation may technically be realized by dropping every second frame. Hence, the newly preloaded frames are 10, 12, and 14. Frame 7 is no longer needed. When changing the presentation direction, the continuous object manager can use the same strategies by preloading "on the left". A state transition from fastplay to play is realized by frame stuffing of any missing frames. A very simple but sufficient replacement strategy is as follows: replace the frames which are farthest away from the actual presentation point. In [MKK95] more elaborate concepts for continuous object management are described. Other relevant questions of continuous object management are the intra-media synchronization of different media streams, how the buffer resource is distributed over several multimedia presentations, and which scaling [D+93] or adaptation strategies on the client and the server side can be considered.
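The preloading example and the simple replacement strategy can be sketched in Python as follows. The function names and the window sizes are assumptions chosen to reproduce the frame numbers of the example; the more elaborate concepts of [MKK95] are not modeled.

```python
from typing import Iterable, List


def preload_window(current: int, step: int, count: int) -> List[int]:
    """Frames needed next when every `step`-th frame after `current` is shown."""
    return [current + step * i for i in range(1, count + 1)]


def frames_to_fetch(buffered: Iterable[int], wanted: Iterable[int]) -> List[int]:
    """Wanted frames that are not in the buffer yet and must be preloaded."""
    present = set(buffered)
    return [f for f in wanted if f not in present]


def evict_farthest(buffered: Iterable[int], current: int, capacity: int) -> List[int]:
    """Keep the `capacity` frames closest to the presentation point."""
    closest = sorted(buffered, key=lambda f: (abs(f - current), f))[:capacity]
    return sorted(closest)


play_ahead = preload_window(current=4, step=1, count=4)   # state "play"
fast_ahead = preload_window(current=4, step=2, count=5)   # state "fastplay"
to_fetch = frames_to_fetch(play_ahead, fast_ahead)
kept = evict_farthest(range(1, 9), current=4, capacity=5)
```

With frame 4 being presented, `play_ahead` covers frames 5 to 8, and switching to fastplay requires fetching only frames 10, 12, and 14, since frames 6 and 8 are already buffered.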

4.4.3 Multimedia Storage Management

To achieve better performance, a multimedia storage manager component is integrated, which stores only the raw data of continuous objects and is used explicitly by the continuous object management. Multimedia retrieval and storage systems as well as special storage devices (e.g., magneto-optical storage devices) capable of storing large volumes of data are supported. Hence, small and large objects are managed together, but stored differently. The placement of multimedia streams and the consideration of admission control algorithms which handle new multimedia presentation requests are other functions of this component. For the latter, approaches similar to those suggested in [Bil92, RV91] were chosen.
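The idea of admission control can be illustrated with a deliberately simplified Python sketch that admits a new presentation request only if the sum of all data rates stays within a fixed capacity. This bandwidth-sum test and the class `AdmissionController` are assumptions for illustration; they are not the algorithms of [Bil92, RV91].

```python
from typing import Dict


class AdmissionController:
    """Admit streams while their total data rate fits the storage bandwidth."""

    def __init__(self, capacity: float) -> None:
        self.capacity = capacity             # e.g. Mbit/s the storage can sustain
        self.active: Dict[str, float] = {}   # request id -> reserved rate

    def admit(self, request_id: str, rate: float) -> bool:
        """Reserve `rate` for the request, or reject it if capacity is exceeded."""
        if sum(self.active.values()) + rate > self.capacity:
            return False
        self.active[request_id] = rate
        return True

    def release(self, request_id: str) -> None:
        """Free the reservation when a presentation ends."""
        self.active.pop(request_id, None)


controller = AdmissionController(capacity=10.0)
first = controller.admit("a", 6.0)    # fits
second = controller.admit("b", 5.0)   # would exceed the capacity
controller.release("a")
retry = controller.admit("b", 5.0)    # fits after the release
```

Rejecting a request up front protects the streams already admitted, whose synchronization constraints could otherwise no longer be met.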

4.4.4 System support

A multimedia database system obviously cannot and should not try to reimplement services which are (or should be) provided by the underlying platforms. For example, it does not make sense to reimplement high-speed network protocols which offer properties like guaranteed delivery, reserved channel capacity, or quality-of-service parameters in order to realize a client-server architecture for a multimedia database system. Such services need to be provided by an external communication system. This also means that the existence of such components is crucial for the design and implementation of a multimedia database system. Appropriate storage support has to be realized partially by the multimedia database system, but it should be based on proper concepts for continuous media provided by, e.g., the file system or specific devices like magneto-optical disks which are integrated into the operating system. The operating system as well as the programming language should provide concepts for handling signals, events, and parallel computation on the basis of threads and/or lightweight processes in order to support the processing of time-dependent data. In addition, some real-time requirements may need to be met by the operating system.

5 Conclusion

The three example applications clearly illustrated that database management functionality supporting persistency and multi-user access is urgently needed when managing multimedia data. Current solutions, based on application-specific implementations of general DBMS services on top of file systems, are not flexible enough, cannot fully cover the application requirements and, last but not least, are too expensive to implement.

Multimedia data comes with complex data structures, as in hypermedia documents, and in many cases additionally needs dedicated support for managing continuous data. Many external services will be used by a multimedia DBMS, such as network support, handling of data in different formats, editors for complex structured multimedia documents, or retrieval algorithms. The set of these services is continuously changing as technology develops. Thus a multimedia DBMS will be a much less self-contained system than a conventional DBMS, and extensibility plays a central role. Among the existing database systems, object-oriented DBMSs come closest to the requirements of multimedia DBMSs, but they still lack, in particular, support for managing continuous data.

Managing continuous data leads to a new consistency notion for DBMSs, namely synchronization consistency. This has an impact on the data model level, where extensions to represent the temporal nature of continuous data as well as concepts for adequately retrieving multimedia data are needed. All services of the DBMS have to be measured against the notion of synchronization consistency. The basic services of a DBMS, like persistency, multi-user support or application decoupling, remain untouched, but their implementation is heavily affected by the additional modelling concepts. In particular, the storage and transport of time-dependent data require novel solutions. This affects all operations of a DBMS where real-time capturing or presentation of multimedia data is involved.

At the interface of the DBMS it will be very hard to draw the borderline between general DBMS services and application-specific services. Should the DBMS provide its own editors/viewers for multimedia data, or should it just provide the data in an adequate form to external components? Should certain retrieval mechanisms be part of the DBMS functionality, or are many of them targeted too much at certain application domains? In many cases the solution will be the provision of a customized DBMS, which extends some core functionality of an underlying multimedia DBMS towards the needs of whole application areas.

The paper showed in which direction further development of multimedia DBMSs could go; however, we are still far from a well-defined, stable and generally accepted multimedia DBMS comparable to the conventional DBMSs currently in use for managing standard business data. Nevertheless, first concepts and systems indicate that the approach of a multimedia DBMS is feasible and that such systems will eventually enter daily practice as relational systems have today.


References

[ABH94] K. Aberer, K. Böhm, and C. Hüser. The Prospects of Publishing Using Advanced Database Concepts. In Proc. of the International Conference on Electronic Publishing, Document Manipulation, and Typography, EP94, Darmstadt, Germany, pages 469-480. John Wiley & Sons, Ltd., 1994.
[AH91] D. P. Anderson and G. Homsy. A Continuous Media I/O Server and Its Synchronization Mechanism. Computer, 24(10):51-57, October 1991.
[AK94] K. Aberer and W. Klas. Supporting Temporal Multimedia Operations in Object-oriented Database Systems. In Proceedings of the IEEE International Conference on Multimedia Computing and Systems, Boston, USA, May 1994.
[And90] D. P. Anderson. Meta-scheduling for Distributed Continuous Media. Technical Report No. UCB/CSD 90/599. EECS, University of California at Berkeley, Berkeley, CA, USA, 1990.
[App91] Apple Corp. QuickTime Developer's Guide (preliminary version), 1991.
[BA94] K. Böhm and K. Aberer. An object-oriented database application for HyTime document storage. In Proceedings of the Conference on Information and Knowledge Management (CIKM94). Gaithersburg, MD, December 1994.
[BAH94] K. Böhm, K. Aberer, and C. Hüser. Introducing D-STREAT - The Impact of Advanced Database Technology on SGML Document Storage. <TAG>, 7(2):1-4, February 1994.
[BDE+93] I. Barth, G. Dermler, R. Erfle, F. Fabian, K. Rothermel, J. Rückert, and F. Sembach. Multimedia Document Handling - A Survey of Concepts and Methods. IBM European Networking Center, 1993.
[BHL91] G. Blakowski, J. Hübel, and U. Langrehr. Tools for Specifying and Executing Synchronized Multimedia Presentations. In Network and Operating System Support for Digital Audio and Video, Second International Workshop, Proceedings, November 1991.
[Bil92] A. Billiris. The Performance of Three Database Storage Structures for Managing Large Objects. In Proc. ACM SIGMOD Conf., pages 276-285, 1992.
[BRR94] J.F. Buford, L. Rutledge, and J.L. Rutledge. Integrating Object-Oriented Scripting Languages with HyTime. In Proceedings of the IEEE International Conference on Multimedia Computing and Systems, Boston, USA, May 1994.
[Buf94] J. F. Koegel Buford. Multimedia Systems. Addison-Wesley, 1994.
[CHK+94] F. Chen, M. Hearst, J. Kupiec, et al. Meta-Data for Mixed-Media Access. ACM Sigmod Record, Special Issue on Meta-data for Digital Media, 23(4), December 1994.
[D+93] L. Delgrossi et al. Media Scaling for Audiovisual Communication for the Heidelberg Transport System. In Proc. ACM Multimedia Conf., 1993.
[DG92] N. Dimitrova and F. Golshani. EVA: A Query Language for Multimedia Information Systems. In Proceedings of the Int. Workshop on Multimedia Information Systems, Tempe, AZ, USA, 1992. Intelligent Information Systems Laboratory, Arizona State University.
[DGJ+94] M. Deegener, G. Große, W. John, B. Kühnapfel, M. Löhr, and H. Wirth. Rapid Prototyping with MuSE. International Symposium on Automotive Technology and Automation, Dedicated Conference on Mechatronics, 1994.
[Fur94] B. Furht. Multimedia Systems: An Overview. IEEE MultiMedia, 1(1):47-59, 1994.
[FX94] P. Fankhauser and Yi Xu. MarkItUp! An incremental approach to document structure recognition. In Proc. of the International Conference on Electronic Publishing, Document Manipulation, and Typography, EP94, Darmstadt, Germany. John Wiley & Sons, Ltd., 1994.


[G+94] Y. Gong et al. An Image Database System with Content Capturing and Fast Image Indexing Abilities. In Proc. of IEEE International Conference on Multimedia Computing and Systems, 1994.
[Gal91] D. Le Gall. MPEG: A Video Compression Standard for Multimedia Applications. CACM, 34(4):46-58, April 1991.
[GBT93] S. Gibbs, C. Breiteneder, and D. Tsichritzis. Audio/Video Databases: An Object-Oriented Approach. In Proc. of IEEE Ninth International Conference on Data Engineering, 1993.
[GDT91] S. Gibbs, L. Dami, and D. Tsichritzis. An Object-Oriented Framework for Multimedia Composition and Synchronization. In Proc. of the Eurographics Multimedia Workshop, Stockholm, 1991.
[GF92] T. Gottke and P. Fankhauser. DREAM 2.0 User Manual. Technical Report No. 660, GMD, Sankt Augustin, 1992.
[Gib91] S. Gibbs. Composite Multimedia and Active Objects. In Proc. of the Conference on Object-Oriented Programming: Systems, Languages, and Applications (OOPSLA'91), 1991.
[GMD95] GMD. VODAK V4.0 User Manual, April 1995. GMD Technical Report No. 910, Sankt Augustin.
[Gre92] J. Green. The Evolution of DVI System Software. CACM, 35(1):53-67, January 1992.
[Gro94] W.I. Grosky. Multimedia Information Systems. IEEE MultiMedia, 1(1):47-59, 1994.
[Haa88] L. M. Haas. Supporting Multi-Media Object Management in a Relational Database Management System. Technical report. IBM Almaden Research Center, 1988.
[HBS92] P. Hoschka, B. Butscher, and N. Streitz. Telecooperation and telepresence: Technical challenges of a government distributed between Bonn and Berlin. Informatization and the Public Sector, 2(4):269-299, 1992.
[HL82] R. Haskin and R. Lorie. Using a Relational Database System for Circuit Design. IEEE Database Engineering, 5(2):10-14, June 1982.
[HRRS95] Ch. Hüser, K. Reichenberger, L. Rostek, and N. Streitz. Knowledge-based Editing and Visualization for Hypermedia Encyclopedias. Communications of the ACM, 38(4):49-51, 1995.
[HSA89] M. E. Hodges, R. E. Sasnett, and M. S. Ackermann. A Construction Set for Multimedia Applications. IEEE Software, pages 37-43, January 1989.
[ISO84] ISO. PHIGS - Programmers Hierarchical Interface to Graphics Systems, 1984. ISO/TC97/SC5/WG2/N305.
[ISO86] ISO. Information processing - Text and Office Systems - Standard Generalized Markup Language (SGML), 1986. ISO-IS 8879.
[ISO92] ISO/IEC. Draft International Standard DIS 10918: Information Technology - coded representation of digital continuous-tone still pictures, January 1992. ISO/IEC/JTC1/SC29/WG10.
[ISO94] ISO/IEC. Draft International Standard DSSSL: Information technology - Text and office system - Document Style Semantics and Specification Language (DSSSL), October 1994. ISO/IEC DIS 10179.2.
[JLS95] H. Jiang, C.Y. Low, and S.W. Smoliar. Video Parsing and Browsing Using Compressed Data. Multimedia Tools and Applications, 1(1):89-111, March 1995.
[KAN93] W. Klas, K. Aberer, and E. Neuhold. Object-Oriented Modeling for Hypermedia Systems using the VODAK Modelling Language (VML). In Object-Oriented Database Management Systems, NATO ASI Series. Springer Verlag Berlin/Heidelberg, August 1993.
[Kla92] W. Klas. Tailoring an Object-Oriented Database System to Integrate External Multimedia Devices. In Proceedings of 1992 Workshop on Heterogeneous Databases & Semantic Interoperability, February 1992.


[KN90] W. Klas and E.J. Neuhold. Designing Intelligent Hypertext Systems using an Open Object-Oriented Database Model. Technical Report No. 489, GMD. GMD, Sankt Augustin, 1990.
[KNS90] W. Klas, E. J. Neuhold, and M. Schrefl. Using an Object-Oriented Approach to Model Multimedia Data. Computer Communications, Special Issue on Multimedia Systems, 13(4):204-216, May 1990.
[KS94] W. Klas and A. Sheth, editors. Special Issue on Meta-data for Digital Media. Number 4 in SIGMOD Record. ACM, December 1994.
[LG90a] T. D. C. Little and A. Ghafoor. Network Considerations for Distributed Multimedia Object Composition and Communication. IEEE Network, 4(6):32-49, November 1990.
[LG90b] T. D. C. Little and A. Ghafoor. Synchronization and Storage Models for Multimedia Objects. IEEE Journal of Selected Areas in Communication, 8(3), 1990.
[MKK95] F. Moser, A. Kraiss, and W. Klas. L/MRP: A Buffer Management Strategy for Interactive Continuous Data Flows in a Multimedia DBMS. In Proceedings VLDB 1995, USA, 1995. Morgan Kaufmann.
[MW91] K. Meyer-Wegener. Multimedia Datenbanken. Leitfaden der angewandten Informatik. Teubner Stuttgart, 1991.
[Nic90] C. Nicolaou. An Architecture for Real-Time Multimedia Communication Systems. IEEE J. Select. Areas Commun., 8(3):391-400, 1990.
[NKN91] S.R. Newcomb, N.A. Kipp, and V.T. Newcomb. The HyTime Hypermedia/Time-based Document Structuring Language. CACM, 34(11), November 1991.
[oF94] J.L. Encarnação and J.D. Foley, editors. Multimedia. Springer Berlin, 1994.
[Pri93] R. Price. MHEG: An Introduction to the Future International Standard for Hypermedia Object Interchange. In ACM Multimedia 93, pages 121-128, 1993.
[R+94] T. Rakow et al. Development of a Multimedia Archiving Teleservice using the DFR Standard. In Proceedings of the 2nd International Workshop on Advanced Teleservices and High Speed Communication Architectures, LNCS. Springer Verlag, 1994.
[RJ93] L. Rabiner and B. H. Juang. Fundamentals of Speech Recognition. Prentice-Hall, 1993.
[RL94] T. Rakow and M. Löhr. Das Ende der Sprachlosigkeit - Auf dem Weg zum multimedialen Datenbanksystem. In GMD-Jahresbericht 1993/94. GMD, Sankt Augustin, 1994.
[RNL95] T. C. Rakow, E. J. Neuhold, and M. Loehr. Multimedia Database Systems - The Notions and the Issues. In Georg Lausen, editor, Datenbanksysteme in Büro, Technik und Wissenschaft (BTW), pages 1-29, Dresden, Germany, March 1995. Springer.
[RV91] P. Venkat Rangan and Harrick M. Vin. Designing File Systems for Digital Video and Audio. In Proc. Eurographics '91, pages 269-281, 1991.
[S+94] K. Süllow et al. MultiMedia Forum: an Interactive Online Journal. In Proc. of the International Conference on Electronic Publishing, Document Manipulation, and Typography, EP94, Darmstadt, Germany. John Wiley & Sons, Ltd., 1994.
[SE93] R. Steinmetz and C. Engler. Human Perception of Media Synchronization. IBM European Networking Center, 1993.
[SG94] P. Schäuble and U. Glavitsch. Assessing the Retrieval Effectiveness of a Speech Retrieval System by Simulating Recognition Errors. In Proceedings of the ARPA Workshop on Human Language Technology (HLT'94), 1994.


[SGHH94] N. Streitz, J. Geissler, J. Haake, and J. Hol. DOLPHIN - Integrated meeting support across LiveBoards, local and remote desktop environments. In Proceedings of the ACM Conference on Computer-Supported Cooperative Work (CSCW'94), Chapel Hill, N.C., October 22-26, 1994.
[TK95a] H. Thimm and W. Klas. Playout Management - An Integrated Service of a Multimedia Database Management System. In Proceedings of the First International Workshop on Multi-Media Database Management Systems, Blue Mountain Lake, NY, August 28-30, 1995. IEEE Computer Society Press, 1995.
[TK95b] H. Thimm and W. Klas. Reactive Playout Management - Adapting Multimedia Presentations to Contradictory Constraints. Technical Report No. 916, GMD-IPSI. GMD, Sankt Augustin, 1995.
[TR94] H. Thimm and T.C. Rakow. A DBMS-Based Multimedia Archiving Teleservice Incorporating Mail. In W. Litwin and T. Risch, editors, Proceedings of the First International Conference on Applications of Databases (ADB), pages 281-298, Vadstena, Sweden, 1994. Lecture Notes in Computer Science 819, Springer.
[TRR94] H. Thimm, K. Rohr, and T.C. Rakow. A Mail-based Teleservice Architecture for Archiving and Retrieving Dynamically Composable Multimedia Documents. In Proceedings of the Conference on Multimedia Transport and Teleservices, MMTT94, 1994.
[Vin91] H. M. Vin. Multimedia Conferencing in the Etherphone Environment. IEEE Computer, 24(10):69-79, October 1991.
[WB92] Lynn D. Wilcox and Marcia A. Bush. Training and search algorithms for an interactive wordspotting system. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing, March 1992.
[WDG94] R. Weiss, A. Duda, and D.K. Gifford. Content-Based Access to Algebraic Video. In Proc. of IEEE International Conference on Multimedia Computing and Systems, 1994.
[WKL86] D. L. Woelk, W. Kim, and W. Luther. An Object-Oriented Approach to Multimedia Databases. ACM SIGMOD Record 1986, pages 311-325, 1986.
[WMG+94] J.K. Wu, B.M. Mehtre, Y.J. Gao, et al. STAR - A Multimedia Database System For Trademark Registration. In W. Litwin and T. Risch, editors, Proceedings of the First International Conference on Applications of Databases (ADB), pages 109-122, Vadstena, Sweden, 1994. Lecture Notes in Computer Science 819, Springer.
[YGM94] Tak W. Yan and Hector Garcia-Molina. The Electronic Library of the Future: Accessing Worldwide Information. GMD-Jubiläum, Springer-Verlag, 1994.
[YKHI94] A. Yoshitaka, S. Kishida, M. Hirakawa, and T. Ichikawa. Knowledge-Assisted Content-Based Retrieval for Multimedia Databases. In Proc. of IEEE International Conference on Multimedia Computing and Systems, 1994.
