IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 1, JANUARY 2003


Introduction to the Special Issue on Conceptual and Dynamical Aspects of Multimedia Content Description

IN 1999, Tim Berners-Lee proclaimed, “We are on the verge of a metadata revolution. Get your data models clean and prepare for an interesting ride.” But, at this point in time, have we witnessed this revolution? And are we ready to take this “interesting ride”? The answer is a rather mixed one. It is true that metadata models—data about data—exist or are being developed to describe every portion of this “interesting ride.” On the standards side, for example, MPEG-7, TV-Anytime, W3C, and SMPTE have been standardizing various models for applications related to multimedia content broadcast, streaming, and the Semantic Web—efficient ways of representing data on the World Wide Web. Some obstacles, however, still preclude us from having a “smooth ride,” namely: 1) automatic (or near-automatic) generation of metadata that adequately describes multimedia content semantics and context dynamics (by “context dynamics,” we refer to the changes in content description that occur in response to changes in event dynamics, e.g., time, location, etc.) and 2) interoperability between the different metadata models describing various portions of the ride. Given the technical challenges encountered in obtaining context-sensitive semantics of multimedia, e.g., event-driven changes in multimedia semantics, we have dedicated this Special Issue to the conceptual, as well as dynamical, aspects of content description.

In the paper “Automatic Scene Extraction in Motion Pictures,” Truong et al. address the problem of locating scene boundaries in a movie based on real-time extraction of low-level visual features, such as color. Since scenes are composed of many shots, the detection of scene boundaries requires, in turn, inference of scene semantics from the extracted visual content.

Responding to the growing demand for finding multimedia content, MPEG-7 has standardized an interface for multimedia content description.
In “Utility of MPEG-7 Systems in Audio–Visual Applications with Multiple Streams,” Lopez et al. use personalized TV services and video-based surveillance as illustrative applications in which MPEG-7 descriptions are processed in real time for filtering, control, and media aggregation. User interaction in personalized TV services, and indexing based on evidence accumulation in surveillance applications, provide the dynamic context that must be handled by their proposed extended MPEG-7 system.

In the paper “CBSA: Content-Based Soft Annotation for Multimodal Image Retrieval Using Bayes Point Machines,” Chang et al. propose a content-based soft annotation method for providing images with semantic labels. In their proposed scheme, a user initiates a query with a set of keywords; a combination of perceptual features and annotations of the returned images is then used to refine multimodal query accuracy.

Addressing the concern that current content-based retrieval methods lack the capability to capture content semantics, He et al., in “Learning and Inferring a Semantic Space from User’s Relevance Feedback for Image Retrieval,” describe a framework that uses spectral methods to infer a semantic space from a user’s relevance feedback and interaction. They also describe a method for updating the semantic space and choosing its optimal dimensionality.

The paper “Reconciling MPEG-7 and MPEG-21 Semantics through a Common Event-Aware Model” by Hunter addresses the semantic interoperability of multimedia metadata throughout the lifecycle of multimedia content. She defines an event-aware model (the ABC model), which is combined with descriptions of the distinct MPEG-7 and MPEG-21 vocabularies in the Resource Description Framework (RDF) and enhanced by mapping the MPEG-7 and MPEG-21 classes and properties to a common event model. The result is a single machine-understandable ontology, expressed in DAML+OIL, that facilitates semantic interoperability by representing the semantics of metadata and of the events associated with the lifecycle of multimedia content.

Similarity detection is an indispensable tool in Web data management, searching, and navigation. In “Efficient Video Similarity Measurement with Video Signature,” Cheung and Zakhor present the use of video signatures as a measure of similarity. They propose several algorithms to efficiently measure video similarity and demonstrate superior retrieval performance on a large dataset of Web video and on MPEG-7 test sequences.

Digital Object Identifier 10.1109/TCSVT.2002.808092
This Special Issue covers many recent advances in multimedia semantic-extraction techniques, pointing to new insights and offering promising directions for research. These advances by no means present a full solution, as we are still far from understanding and modeling, adequately and in real time, the mechanics of human brain function. We, as Guest Editors, would like to take this opportunity to thank all the authors for their excellent contributions, which make this Special Issue a valuable reference for future work. We would also like to thank the many reviewers who, despite their heavy workloads, provided timely and expert reviews of the submitted manuscripts. Finally, we extend our sincere thanks to Dr. W. Li, past Editor-in-Chief, for his support and encouragement, without which the publication of this Special Issue would not have been possible.

1051-8215/03$17.00 © 2003 IEEE


ALI TABATABAI, Guest Editor
Sony Network and Software Technology Center
San Jose, CA 95134 USA

SETHURAMAN PANCHANATHAN, Guest Editor
Arizona State University
Department of Computer Science and Engineering
Tempe, AZ USA
(e-mail: [email protected])

JOHN SMITH, Guest Editor
IBM
Hawthorne, NY 10532 USA

HIROSHI YASUDA, Guest Editor
University of Tokyo
Tokyo, Japan

SIEGFRIED HANDSCHUH, Guest Editor
University of Karlsruhe
Karlsruhe, Germany

Ali Tabatabai (F’01) received the Ph.D. degree from Purdue University, West Lafayette, IN, in 1981. From 1981 to 1984, he was with AT&T Bell Laboratories, where he worked on algorithmic research and development for the transmission of still images using videotex terminals. In 1984, he moved to Bell Communications Research, where he conducted pioneering research on the application of subband techniques to the coding of still images. In addition, as a core member of the ITU-T Study Group XV, he was instrumental in conducting the first flexible hardware trial of an H.261/H.321 terminal between Japan and the U.S. In October 1992, he joined Tektronix as the Manager of the Digital Video Research Group, where his responsibilities included the algorithmic research and development of video-compression techniques appropriate for studio- and production-quality applications. He was the Chair of the MPEG-2 ad hoc group whose work resulted in the standardization of the highly successful MPEG-2 4:2:2 Profile at Main Level. In September 1999, he joined the Sony Network and Software Technology Center of America (formerly Sony U.S. Research Laboratories), San Jose, CA. He is currently a Department Manager in the Media Processing Division, where he is responsible for standards activities and for research and development in next-generation video compression and streaming technologies. He has authored/co-authored over 45 publications in the fields of image/video compression, processing, content description, and networking. Dr. Tabatabai is an Associate Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (TCSVT). He was Guest Editor for two IEEE TCSVT Special Issues, on Advanced Picture Coding and on Packet Video. He also served as the General Chair for the International Packet Video Workshop in 1994 and the International Picture Coding Symposium in 1999, and has been on a number of technical program committees for IEEE and VCIP conferences.
He received the IEEE Darlington Best Paper Award in 1988 for his work on the application of subband techniques to the coding of still images.


Sethuraman (Panch) Panchanathan (F’01) received the B.Sc. degree in physics from the University of Madras, India, in 1981, the B.E. degree in electronics and communication engineering from the Indian Institute of Science, Bangalore, in 1984, the M.Tech. degree in electrical engineering from the Indian Institute of Technology, Madras, in 1986, and the Ph.D. degree in electrical engineering from the University of Ottawa, Ottawa, ON, Canada, in 1989. He is currently a Professor and Interim Chair of the Computer Science and Engineering Department, as well as the Director of the Research Center on Ubiquitous Computing (CUbiC), at Arizona State University (ASU), Tempe. He leads a team of researchers and graduate students working in the areas of compression, indexing, storage, retrieval, and browsing of images and video, VLSI architectures for video processing, multimedia hardware architectures, parallel processing, and multimedia communications. He is also an Affiliate Professor in the Department of Electrical Engineering at ASU and an Adjunct Professor in the School of Information Technology and Engineering at the University of Ottawa. He was an Honorary Visiting Professor at the University of New South Wales, Sydney, Australia, is the Chief Scientific Researcher of Obvious Technology, Paris, France, and was a Scientific Advisor for Luxxon Corporation, San Jose, CA. He was a Principal Investigator of projects funded by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Provincial Center of Excellence on Telecommunications Research (TRIO), Corel Corporation, Canadian Marconi, Telesat Canada, and Callisto Corporation. He was also a co-investigator of projects funded by the Federal Centers of Excellence on Telecommunications Research (CITR) and Microelectronics Network (MICRONET). He is currently a PI/Co-PI of projects funded by the National Science Foundation, Motorola, ARM, and Sun. He has published over 200 papers in refereed journals and conferences.
He has written a book chapter on compressed/progressive search in the book Image Databases: Search and Retrieval of Digital Imagery (New York: Wiley, 2001). Dr. Panchanathan is an Associate Editor of the IEEE TRANSACTIONS ON MULTIMEDIA and the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (TCSVT), an Area Editor of the Journal of Visual Communication and Image Representation, and an Associate Editor of the Journal of Electronic Imaging. He was Guest Editor of a three-part Special Issue on Image and Video Processing for Emerging Interactive Multimedia in the IEEE TCSVT (September 1998, November 1998, and February 1999), Guest Editor of a two-part Special Issue on Indexing, Storage, Browsing and Retrieval of Images and Video in the Journal of Visual Communication and Image Representation (December 1996 and June 1997), and Guest Editor of the Special Issue on Visual Computing and Communications in the Canadian Journal of Electrical and Computer Engineering. He was Co-Chair of the IS&T/SPIE Digital Video Compression: Algorithms and Technologies ’96 and Multimedia Hardware Architectures ’97 Conferences, Tutorials Chair of the IEEE International Conference on Multimedia Systems ’97, Symposium Chair of the Electronic Imaging ’98 Symposium, and Chair of the Multimedia Hardware Architectures ’98 Conference. He was Co-Chair of the Multimedia Storage and Archiving Systems III and IV Conferences in Photonics East, the Media Processors 1999–2002 Conferences, the Workshop on Parallel and Distributed Computing in Image Processing, Video Processing, and Multimedia in 2000 and 2001, the Internet Multimedia Management Systems 2000 and 2001 Conferences, and the Internet Multimedia Management Systems 2002 and Media Processor 2003 Conferences. He was Co-General Chair of the IEEE International Symposium on Circuits and Systems (ISCAS 2002).
In addition, he is a program committee member of numerous conferences, organizer of special sessions in several conferences, an invited panel member of special sessions, and has presented several invited talks in conferences, universities, and industry. He is a Fellow of SPIE and a member of the European Association for Signal Processing (EURASIP).

John R. Smith (M’97) received the M.Phil. and Ph.D. degrees in electrical engineering from Columbia University, New York, in 1994 and 1997, respectively. He is a Manager of the Pervasive Media Management Group at IBM T.J. Watson Research Center, Hawthorne, NY, where he leads a research team exploring techniques for multimedia content management. He is also Chair of the MPEG Multimedia Description Schemes (MDS) group, and serves as co-Project Editor for MPEG-7 Multimedia Description Schemes. His research interests include multimedia databases, multimedia content analysis, compression, indexing, and retrieval. Additionally, he is an Adjunct Professor at Columbia University.


Hiroshi Yasuda (F’98) received the B.E., M.E., and Dr.E. degrees from the University of Tokyo, Tokyo, Japan, in 1967, 1969, and 1972, respectively. He was with the Electrical Communication Laboratories of NTT, Japan, from 1972 to 1997, most recently as Vice President and Director of the NTT Information and Communication Systems Laboratories. During his 25 years with NTT, he was involved in video coding, image processing, tele-presence, B-ISDN networks and services, and Internet and computer communication applications. He joined the University of Tokyo in 1997 as a Professor at the Center for Collaborative Research (CCR). He served as the Chairman of ISO/IEC JTC1/SC29 (JPEG/MPEG standardization) from 1991 to 1999 and as the President of the Digital Audio Visual Council (DAVIC) from September 1996 to September 1998. He is the author of the books International Standardization of Multimedia Coding (1991), MPEG/International Standardization of Multimedia Coding (1994), The Base for the Digital Image Coding (1995), and The Text for Internet (1996). Dr. Yasuda received the Takayanagi Award (1987), the Achievement Award of the EICEJ (1995), an Emmy Award from the National Academy of Television Arts and Sciences (1995–1996), and the Charles Proteus Steinmetz Award from the IEEE (2000). He is a Fellow of the EICEJ and the IPSJ and a member of the Television Institute.

Siegfried Handschuh received the Information Science degree from the University of Constance, Konstanz, Germany, in 1997. He is a Researcher at the Institute of Applied Computer Science and Formal Description Methods, University of Karlsruhe, Karlsruhe, Germany. He is currently involved in the OntoAgents project of the DARPA DAML Program. His research interests include annotation in Semantic Web and ontology-based applications. He has chaired several workshops on semantic annotation.