Automatic Metadata Extraction from Art Images

 

Krassimira Ivanova, Milena Dobreva, Peter Stanchev, George Totkov (editors)

Access to Digital Cultural Heritage:

Innovative Applications of Automated Metadata Generation

Plovdiv University Publishing House "Paisii Hilendarski" 2012, Plovdiv, Bulgaria

Access to Digital Cultural Heritage: Innovative Applications of Automated Metadata Generation

Edited by: Krassimira Ivanova, Milena Dobreva, Peter Stanchev, George Totkov

Authors (in order of appearance): Krassimira Ivanova, Peter Stanchev, George Totkov, Kalina Sotirova, Juliana Peneva, Stanislav Ivanov, Rositza Doneva, Emil Hadjikolev, George Vragov, Elena Somova, Evgenia Velikova, Iliya Mitov, Koen Vanhoof, Benoit Depaire, Dimitar Blagoev

Reviewer: Prof., Dr. Avram Eskenazi

Published by: Plovdiv University Publishing House "Paisii Hilendarski", 2012, Plovdiv, Bulgaria

First Edition

The main purpose of this book is to provide an overview of current trends in the field of digitization of cultural heritage, as well as to present recent research done within the framework of the project D002-308, funded by the Bulgarian National Science Fund. The main contributions of the work presented are in organizing digital content, metadata generation, and methods for enhancing resource discovery.

Printed in Bulgaria by Plovdiv University, 24 Tsar Assen Str., Plovdiv 4000, Bulgaria

All Rights Reserved
© This compilation: K. Ivanova, M. Dobreva, P. Stanchev, G. Totkov 2012
© The chapters: the contributors 2012
© The cover: K. Sotirova 2012

ISBN: 978-954-423-722-6 Plovdiv, 2012

Acknowledgements

This book summarises the outcomes of several recent research projects. Producing it was quite a complex undertaking in terms of scope and number of contributors. We would particularly like to thank all authors for their enthusiasm and commitment. We would also like to thank all our colleagues who were supportive of our work and thus helped us develop our competences on the topic. The projects which helped us develop our ideas and test them in real life were supported by:

 the Bulgarian National Science Fund, namely the Project D002-308 "MetaSpeed: Automated Metadata Generating for e-Documents Specifications and Standards";

 the Framework Programme 7 (FP7) of the European Commission, namely the project RI-246686 "OpenAIRE: Open Access Infrastructure for Research in Europe";

 the Hasselt University in Belgium, namely the Projects R-1875 "Search in Art Image Collections Based on Colour Semantics" and R-1876 "Intelligent systems' memory structuring using multidimensional numbered information spaces".

The involvement of Bulgarian researchers in these projects in particular was made possible through the cooperation of several institutions: we would like to express our gratitude to the Institute of Mathematics and Informatics – Bulgarian Academy of Sciences, Plovdiv University "Paisii Hilendarski", New Bulgarian University, and Hasselt University in Belgium for providing excellent conditions for collaboration.


A number of international as well as national events gave us opportunities to present our vision and discuss it with colleagues from different professional communities; these discussions also contributed greatly to our work. Of particular help were the international events organised by the Member of the European Parliament Emil Stoyanov; he also welcomed the idea of creating this collection. We also want to thank our reviewer, Professor Avram Eskenazi, for his helpful remarks during the preparation of the content of this publication. Last but not least, we would like to thank Emilia Todorova from Glasgow Caledonian University for her help with the language revision, and Viktoria Naoumova from the Institute of Mathematics and Informatics for technical assistance.

Table of Contents

Acknowledgements .......... 3
Table of Contents .......... 5
List of Abbreviations .......... 9
Introduction .......... 13

Chapter 1: Digitization of Cultural Heritage – Standards, Institutions, Initiatives .......... 23
1 Cultural Heritage .......... 23
2 The Three Building Blocks of Digital Heritage .......... 26
   2.1 Digitization .......... 26
   2.2 Access .......... 28
   2.3 Preservation .......... 29
3 The Importance of Metadata .......... 31
4 Metadata Schemas and Standards Used in Cultural Heritage .......... 33
   4.1 Common Standards .......... 34
   4.2 Standards for Resource Discovery .......... 36
   4.3 Specific Standards .......... 37
   4.4 Other Standards Relevant to Cultural Heritage .......... 40
5 Digital Library .......... 41
   5.1 Basic Definitions .......... 42
   5.2 The Contemporary Models of Digital Libraries .......... 43
   5.3 Repository Software .......... 50
6 Initiatives on World and European Level .......... 52
   6.1 Library and Scientific Open-access Initiatives .......... 53
   6.2 Examples of Initiatives that Change the Digital World .......... 56
   6.3 Initiatives, Connected with Data Content Standards .......... 59
7 The User and the New Digital World .......... 60
   7.1 Users: between Policies and Real Involvement .......... 61
   7.2 User Involvement in Digital Libraries Development .......... 62
   7.3 User Studies .......... 63
8 Conclusion .......... 64
Bibliography .......... 65

Chapter 2: REGATTA – Regional Aggregator of Heterogeneous Cultural Artefacts .......... 69
1 Introduction .......... 69
2 Aggregators of Digital Content for Cultural Artefacts in EU .......... 71
3 The Prototype REGATTA–Plovdiv .......... 72
   3.1 The Functional Scheme of REGATTA .......... 74
   3.2 Data Model in REGATTA .......... 75
   3.3 Technological Aspects .......... 78
4 Virtual Tours in REGATTA .......... 80
   4.1 Panoramic Virtual Tours .......... 81
   4.2 3D-Virtual Tours .......... 82
5 Presentation of Plovdiv Ethnographic Museum in REGATTA .......... 83
   5.1 Movable Artefacts .......... 83
   5.2 Virtual Tours of the Plovdiv Ethnographic Museum .......... 87
6 The Next Step – Enforcing the Data Management with Data Mining Tools .......... 92
7 Conclusion .......... 94
Bibliography .......... 94

Chapter 3: Automated Metadata Extraction from Art Images .......... 97
1 Introduction .......... 97
2 Semantic Web .......... 99
3 The Process of Image Retrieval .......... 101
   3.1 Text-Based Retrieval .......... 101
   3.2 Content-Based Image Retrieval (CBIR) .......... 104
4 The Gaps .......... 106
   4.1 Sensory Gap .......... 107
   4.2 Semantic Gap .......... 108
   4.3 Abstraction Gap .......... 109
   4.4 Subjective Gap .......... 110
5 User Interaction .......... 111
   5.1 Complexity of the Queries .......... 111
   5.2 Relevance Feedback .......... 112
   5.3 Multimodal Fusion .......... 113
6 Feature Design .......... 114
   6.1 Taxonomy of Art Image Content .......... 115
   6.2 Visual Features .......... 117
   6.3 MPEG-7 Standard .......... 123
7 Data Reduction .......... 127
   7.1 Dimensionality Reduction .......... 127
   7.2 Numerosity Reduction .......... 134
8 Indexing .......... 137
9 Retrieval Process .......... 140
   9.1 Similarity .......... 140
   9.2 Techniques for Improving Image Retrieval .......... 146
10 Conclusion .......... 147
Bibliography .......... 148

Chapter 4: APICAS – Content-Based Image Retrieval in Art Image Collections Utilizing Colour Semantics .......... 153
1 Colour – Physiology and Psychology .......... 153
   1.1 Physiological Ground of the Colour Perceiving .......... 155
   1.2 Image Harmonies and Contrasts .......... 157
   1.3 Psychological Colour Aspects .......... 159
2 Art Image Analyzing Systems .......... 160
3 Proposed Features .......... 163
   3.1 Colour Distribution Features .......... 164
   3.2 Harmonies/Contrasts Features .......... 166
   3.3 Formal Description of Harmonies/Contrasts Features Using HSL-artist Colour Model .......... 170
   3.4 Local Features, based on Vector Quantization of MPEG-7 Descriptors over Tiles .......... 176
   3.5 Other Attributes .......... 178
4 APICAS: The System Description .......... 179
   4.1 Functional Requirements .......... 180
   4.2 APICAS Architecture .......... 181
   4.3 APICAS Ground .......... 183
   4.4 APICAS Functionality .......... 183
5 Experiments .......... 192
   5.1 Analysis of the Visual Features .......... 192
   5.2 Analysis of the Harmonies/Contrast Descriptors .......... 194
   5.3 Analysis of the Local Features .......... 197
6 Conclusion .......... 200
Bibliography .......... 201

Chapter 5: Automatic Metadata Generation and Digital Cultural Heritage .......... 203
1 Automatic Generation of Metadata .......... 203
   1.1 Regular Expressions .......... 204
   1.2 Rule-based Parsers .......... 204
   1.3 Machine Learning Algorithms .......... 205
2 Data Mining .......... 205
3 Data Extraction from Web Documents Using Regular Expressions .......... 209
   3.1 Data Extraction by Learning Restricted Finite State Automata .......... 210
   3.2 Program Realization .......... 213
   3.3 Experiments .......... 214
4 ArmSquare: an Association Rule Miner Based on Multidimensional Numbered Information Spaces .......... 218
   4.1 A Brief Overview of Previous ARM Algorithms .......... 219
   4.2 Association Rule Miner ArmSquare .......... 221
   4.3 Multidimensional Numbered Information Spaces .......... 222
   4.4 Algorithm Description of ArmSquare .......... 223
   4.5 Program Realization .......... 227
   4.6 Advanced Specifics of ArmSquare .......... 229
   4.7 Implementation .......... 229
5 PGN: Classification with High Confidence Rules .......... 232
   5.1 The Structure of CAR-algorithms .......... 233
   5.2 Algorithm Description of PGN Classifier .......... 235
   5.3 PGN and Predictive Analysis in Art Collections .......... 241
6 Metric Categorization Relations Based on Support System Analysis .......... 246
   6.1 The Semantic Complexity .......... 246
   6.2 Meta-PGN: Algorithm Description .......... 247
   6.3 Program Realization .......... 248
   6.4 The Next Step: Application in the Field .......... 249
7 Conclusion .......... 249
Bibliography .......... 251

List of Abbreviations

5M – Multicultural, Multilingual, Multimodal, Multivariate, Modelling
5S – Streams, Structures, Spaces, Scenarios, and Societies
AAT – Art and Architecture Thesaurus
ACRI – Associative Classifier with Reoccurring Items
ACTA – Anti-Counterfeiting Trade Agreement
AIP – Archival Information Package
APICAS – Art Painting Image Colour Aesthetics and Semantics
ARC-AC – Association Rule-based Categorizer for All Categories
ARC-BC – Association Rule-based Categorizer By Category
ARM – Association Rule Mining
ArM – Archive Manager
ARUBAS – Association RUle BAsed Similarity framework
BIDL – Bulgarian Iconographical Digital Library
CAD – Computer-aided Design
CAR – Class-Association Rules
CATCH – Continuous Access to Cultural Heritage
CBA – Classification Based on Associations
CBIR – Content-Based Image Retrieval
CCA – Curvilinear Component Analysis
CCSDS – Consultative Committee for Space Data Systems
CDWA – Categories for the Description of Works of Art
CH – Cultural Heritage
CHO – Cultural and Historical Objects
CIDOC CRM – International Committee for Documentation – Conceptual Reference Model
CL – Colour Layout
CMAR – Classification based on Multiple Association Rules
CMY – Cyan-Magenta-Yellow
CONA – Cultural Objects Name Authority
CorClass – Correlated Association Rule Mining for Classification
CORDIS – Community Research & Development Information Service
CPAR – Classification based on Predictive Association Rules
CS – Colour Structure
CSDGM – Content Standard for Digital Geospatial Metadata
DACS – Describing Archives: a Content Standard
DC – Dominant Colour
DC – Dublin Core
DCP – Data Coverage Pruning
DELOS – Network of Excellence on Digital Libraries
DHO – Digital Humanities Observatory
DIP – Dissemination Information Package
DL – Digital Library
DLRM – Digital Libraries Reference Model
DOI – Digital Object Identifier
DWT – Discrete Wavelet Transform
EAD – Encoded Archival Description
EC – European Commission
ECDL – European Conference on Digital Libraries
EDL – European Digital Library
EDM – Europeana Data Model
EH – Edge Histogram
EMD – Earth Mover's Distance
EOF – Empirical Orthogonal Function
Fedora – Flexible Extensible Digital Object Repository Architecture
FOIL – First Order Inductive Learner
FP7 – Seventh Framework Programme
FRBR – Functional Requirements for Bibliographic Records
FRBROO – FRBR – Object Oriented
GIS – Geographic Information System
GLAM – Galleries, Libraries, Archives, Museums
GLOH – Gradient Location and Orientation Histogram
GPS – Global Positioning System
HARMONY – Highest confidence clAssification Rule Mining fOr iNstance-centric classifYing
HSIS – Humanities Serving Irish Society
HSL – Hue-Saturation-Luminance
HSV – Hue-Saturation-Value
HT – Homogeneous Texture
HTML – Hyper-Text Markup Language
ICCROM – International Centre for the Study of the Preservation and Restoration of Cultural Property
ICOM – International Council of Museums
ICT-CIP – Information and Communication Technologies – Competitiveness and Innovation Framework Programme
IDABC – Interoperable Delivery of European eGovernment Services to public Administrations, Businesses and Citizens
IFLA – International Federation of Library Associations
IMI-BAS – Institute of Mathematics and Informatics – Bulgarian Academy of Sciences
IRI – International Resource Identifier
ISAAR(CPF) – International Standard Archival Authority Record for Corporate Bodies, Persons and Families
ISAD(G) – General International Standard Archival Description
ISOC – Internet Society
IT – Information Technology
JISC – Joint Information Systems Committee
LESH – Local Energy based Shape Histogram
LIDAR – Light Detection And Ranging
LIDO – Lightweight Information Describing Objects
LLE – Locally Linear Embedding
MARC – MAchine-Readable Cataloging
MBR – Minimum Bounding Rectangle
MDS – Multi Dimensional Scaling
MET – Metropolitan Museum of Art
METS – Metadata Encoding & Transmission Standard
MINERVA – MInisterial NEtwoRk for Valorising Activities in digitisation
MODS – Metadata Object Description Schema
MPEG – Moving Picture Experts Group
NSDL – National Science Digital Library
NURBS – Non-Uniform Rational B-Spline
OAI-PMH – Open Archives Initiative Protocol for Metadata Harvesting
OAIS – Open Archival Information System
OpenAIRE – Open Access Infrastructure for Research in Europe
ORE – Ontology Rule Editor
OWL – Web Ontology Language
PCA – Principal Component Analysis
PGN – Pyramidal Growing Network
PP – Projection Pursuit
R&D – Research and Development
RDF – Resource Description Framework
RDFS – Resource Description Framework Schema
REGATTA – REGional Aggregator of heTerogeneous culTural Artefacts
RGB – Red-Green-Blue
RYB – Red-Yellow-Blue
SAIL – Semi-Automated Interactive Learning systems
SC – Scalable Colour
SDA – Symbolic Data Analysis
SGML – Standard Generalized Markup Language
SIFT – Scale-Invariant Feature Transform
SIP – Submission Information Package
SRES – Self-supervised web Relation Extraction System
SRSWOR – Simple Random Sample WithOut Replacement
SRSWR – Simple Random Sample With Replacement
SURF – Speeded Up Robust Feature
SVD – Singular Value Decomposition
SVM – Support Vector Machines
TEL – The European Library
TFPC – Total From Partial Classification
TGN – Thesaurus of Geographic Names
TPDL – Theory and Practice of Digital Libraries
ULAN – Union List of Artist Names
UNESCO – United Nations Educational, Scientific and Cultural Organization
URI – Uniform Resource Identifier
URL – Uniform Resource Locator
URN – Uniform Resource Name
VQ – Vector Quantization
VRA – Visual Resources Association
W3C – World Wide Web Consortium
WDL – World Digital Library
WIPO – World Intellectual Property Organization
XML – eXtensible Markup Language

Introduction

Krassimira Ivanova, Milena Dobreva, Peter Stanchev, George Totkov

Access to cultural heritage artefacts in the digital space has reached maturity. The romantic time of pilot attempts to digitise a few selected objects is long gone, as is the debate over whether digitisation is done for preservation or for access. If the collection of statistical data on a pan-European scale can be taken as a sign of the significance and scale of the effort, it started with the NUMERIC study in 2007-2008 and currently continues with the ENUMERATE project. Europeana, the showcase digital library of Europe, is growing quickly in the number of objects which can be discovered and accessed through it – currently about 25 million.

The European Commission Recommendation of 27 October 2011 on the digitization and online accessibility of cultural material and digital preservation (2011/711/EU) is very straightforward in stating that: "The Digital Agenda for Europe seeks to optimise the benefits of information technologies for economic growth, job creation and the quality of life of European citizens, as part of the Europe 2020 strategy. The digitisation and preservation of Europe's cultural memory which includes print (books, journals and newspapers), photographs, museum objects, archival documents, sound and audiovisual material, monuments and archaeological sites (hereinafter 'cultural material') is one of the key areas tackled by the Digital Agenda." It further emphasises that the member states should develop their planning and monitoring of the digitisation of books, journals, newspapers, photographs, museum objects, archival documents, sound and audiovisual material, monuments and archaeological sites, and should contribute to the further development of Europeana. According to this Recommendation, Bulgaria

1 http://www.numeric.ws/
2 http://www.enumerate.eu/


is currently present with 38,263 objects in Europeana and is expected to contribute a further 267,000 objects by 2015. This is indeed a change of scale which cannot be achieved through the efforts of a single institution; it will require not only intensifying digitisation efforts, but also integrating intelligent tools into digitisation as much as possible.

However, the advancement in the area of digital cultural heritage is not only a matter of the number of digitised objects. In recent decades, the growing involvement of heritage institutions in digitisation activities has motivated a broad range of research. Some of it addressed the changing landscape of the humanities and arts and inspired the growth of the area of Digital Humanities. But practical digitisation needs also required specific support from the information and computer sciences. Areas such as dealing with "big data", new metadata models, and new methods for content-based analysis and retrieval are only a few examples of information research which not only could be applied to a new domain, but also had to cater for the specific needs of digital heritage.

While in many European countries discussions of the value and impact of digital collections are currently high on the agenda, Bulgaria finds itself in a different situation, where the key issues are to achieve a critical mass of digital content and to integrate this critical mass better within the European efforts to provide access to digital cultural heritage. In recent years, four national events were organised by Emil Stoyanov, MEP, member of the Committee on Culture and Education in the European Parliament – three in Plovdiv and one in Brussels. These events gave experts from memory institutions, representatives of the government, and scholars the opportunity to discuss the achievements of digitisation in Bulgaria and the best way forward.
These events had an essential role in building a professional community of those already experienced or just starting out in digitisation. One of the outcomes of this intensified cooperation between institutions is this book. This edited collection brings together the outcomes of national and international research projects aiming to develop specialised technologies and tools which would be of help in digitisation efforts.

The Projects that Supported the Creation of this Book

This book presents the outcomes of several projects which created and delivered specialised information research methods in the digital heritage domain. Below, we give a brief explanation of the projects.

Project D002-308 "MetaSpeed: Automated Metadata Generating for e-Documents Specifications and Standards" funded by the Bulgarian National Science Fund

The main objective of the project was the creation and testing of technologies, methods and tools for the automated specification of documents of different electronic formats (text, graphics, etc.), different content (cultural and historical artefacts, educational materials, scientific publications, etc.), and different locations (local multimedia repositories, web pages, etc.). To achieve this goal, several research tasks were addressed:

 Standards for e-documents and tools for their automatic generation: to make a critical analysis of existing standards in the areas of cultural heritage, scientific publications, e-learning, and geospatial data.

 Automatic generation of metadata from text documents: to create and study special methods, algorithms and tools for extracting structured data from electronic text documents, especially those in the Bulgarian language.

 Automatic generation of metadata from multimedia documents: to develop and apply methods and tools for the automated extraction of metadata from images, covering both the context and the content of the objects.

 Creation and testing of digital repositories in different areas: cultural heritage, scientific publications, e-learning, and geographic information systems.
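To make the second task concrete, here is a minimal sketch of regex-based metadata extraction from an HTML page. It is our illustration, not the project's implementation: both the patterns and the sample document (`SAMPLE_HTML`) are invented for the example.

```python
import re

# A sample page with a few Dublin Core-like fields (illustrative only).
SAMPLE_HTML = """
<html><head>
<title>Icon of St. George, 18th century</title>
<meta name="author" content="Unknown master, Tryavna school">
<meta name="date" content="1780">
</head><body>...</body></html>
"""

def extract_metadata(html: str) -> dict:
    """Pull simple metadata fields out of an HTML document using regexes."""
    metadata = {}
    # The <title> element becomes the "title" field.
    title = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    if title:
        metadata["title"] = title.group(1).strip()
    # Each <meta name="..." content="..."> pair becomes a key/value entry.
    for name, content in re.findall(
            r'<meta\s+name="([^"]+)"\s+content="([^"]*)"', html, re.IGNORECASE):
        metadata[name.lower()] = content.strip()
    return metadata

print(extract_metadata(SAMPLE_HTML))
```

Real extraction rules must of course cope with far messier markup; the sketch only shows the general shape of the approach.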

The project executives are the teams from Plovdiv University "Paisii Hilendarski" (coordinated by Prof. George Totkov), Institute of Mathematics and Informatics – Bulgarian Academy of Sciences (coordinated by Prof. Peter Stanchev), New Bulgarian University (coordinated by Assoc. Prof. Juliana Peneva), and Sofia Technical University (coordinated by Prof. Elena Shoykova). 

Project RI-246686 "OpenAIRE: Open Access Infrastructure for Research in Europe" within the FP7 Framework programme

Creating and maintaining a strong network for European research cooperation is high on the EU agenda. Helping drive this effort is OpenAIRE, a project encouraging and supporting free online access to knowledge generated by researchers with grants from the Seventh Framework Programme (FP7) and the European Research Council. One of the most critical components of the OpenAIRE project is to give researchers, businesses and citizens free and open access to EU-funded research papers. The OpenAIRE infrastructure is also helping devise new methods of indexing, annotating, ordering and linking research results, as


well as automating processes. With these activities in mind, OpenAIRE will contribute to the development of fresh services on top of the information infrastructure which it offers. The project coordinator for Bulgaria is Prof. Peter Stanchev from the Institute of Mathematics and Informatics – Bulgarian Academy of Sciences (IMI-BAS). 

Project R-1875 "Search in Art Image Collections Based on Colour Semantics" funded by the Hasselt University

This is the doctoral research project of Krassimira Ivanova from IMI-BAS, with advisor Prof. Koen Vanhoof from Hasselt University. The aim of the project was to make a comprehensive analysis of the successful colour combinations examined and used by artists over the centuries, which send different impression, expression and construction messages to the viewer. Colour contrasts are typical such features, because one of the goals of a painting is to produce specific psychological effects in the observer, achieved through different arrangements of colours. The main goals of this work were:

 to provide a detailed analysis of colour theories, especially of the interconnections existing in successful colour combinations;

 to formalise them in order to implement automated extraction from digitised artworks.

The extracted features were successfully used for similarity search with a selected image by one or more of the extracted features; for searching images that satisfy user queries featuring contrast characteristics; as well as for investigating the possibilities of integrating such characteristics within specialised resource discovery (searching for distinctive features of movements, artists, or artists' periods).
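As a hint of what "formalising" a colour-theory notion can mean, the following toy sketch (ours, not the project's feature set) computes a 12-bin hue histogram over RGB pixels and a crude complementary-contrast score from opposite hue sectors.

```python
import colorsys

def hue_histogram(pixels, bins=12):
    """Count pixels per hue sector; r, g, b are expected in 0..255."""
    hist = [0] * bins
    for r, g, b in pixels:
        h, _, _ = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
        hist[min(int(h * bins), bins - 1)] += 1
    return hist

def complementary_contrast(hist):
    """Score how strongly opposite hue sectors co-occur (0 = no contrast)."""
    half = len(hist) // 2
    total = sum(hist) or 1
    return sum(min(hist[i], hist[i + half]) for i in range(half)) / total

# Red vs cyan are complements: the score reflects their co-occurrence.
pixels = [(255, 0, 0)] * 60 + [(0, 255, 255)] * 40
print(complementary_contrast(hue_histogram(pixels)))
```

A real descriptor would work in a perceptually grounded colour model and handle saturation and lightness as well; the sketch only shows how a contrast notion becomes a computable number.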

Project R-1876 "Intelligent systems' memory structuring using multidimensional numbered information spaces" funded by the Hasselt University

This is the doctoral research project of Iliya Mitov from IMI-BAS, with advisor Prof. Koen Vanhoof from Hasselt University. The goals of this thesis were two-fold:

 to introduce a parameter-free class association rule algorithm which focuses primarily on the confidence of the association rules and only at a later stage on their support. This approach is expected to deliver high-quality recognition, especially on unbalanced and multi-class datasets; the nature of such a classifier is more oriented towards characteristic rules;

 to show the advantages of using multidimensional numbered information spaces for memory structuring in data mining processes, exemplified by the implementation of the proposed class association rule algorithms.
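The confidence-first ordering idea behind such classifiers can be sketched in a few lines. This toy example is not the PGN algorithm itself; it only illustrates rules being ranked by confidence first, with support used as a tie-breaker, and the rule names and data are invented.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    antecedent: frozenset   # attribute-value conditions the instance must contain
    label: str              # predicted class
    confidence: float
    support: float

def classify(instance, rules):
    """Fire the best matching rule: highest confidence, then highest support."""
    matching = [r for r in rules if r.antecedent <= instance]
    if not matching:
        return None
    best = max(matching, key=lambda r: (r.confidence, r.support))
    return best.label

rules = [
    Rule(frozenset({"colour=gold"}), "icon", confidence=0.95, support=0.02),
    Rule(frozenset({"colour=gold"}), "portrait", confidence=0.60, support=0.30),
]
print(classify({"colour=gold", "period=18c"}, rules))  # -> icon
```

Note that the low-support, high-confidence rule wins, which is exactly the "characteristic rule" bias described above; a support-first classifier would have predicted "portrait" here.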

The main results of this project underlie one of the basic algorithms in the data mining environment PaGaNe. The system incorporates different types of statistical analysis methods, discretisation algorithms, an association rule miner, as well as classification algorithms.

The Book Content

The book consists of five chapters; here we give a brief overview of them.

The first chapter, "Digitization of Cultural Heritage – Standards, Institutions, Initiatives", provides an introduction to the area of digitisation. The main pillars of the process of creating, preserving and accessing cultural heritage in the digital space are examined, the importance of metadata for accessing information is outlined, and the metadata schemas and standards used in cultural heritage are discussed. To be reachable in the virtual space, digital objects are organised in digital libraries. Contemporary digital libraries try to deliver richer and better functionality, which is usually user-oriented and depends on current IT trends. Additionally, the chapter focuses on some initiatives at world and European level that have, over the years, reinforced the process of digitising and organising digital objects in the cultural heritage domain. In recent years, the main focus in the creation of digital resources has shifted from "system-centred" to "user-centred", since most of the issues around this content concern making it accessible and usable for real users. User studies, and involving users at the early stages of designing and planning the functionality of the product being developed, therefore take a leading position.

Chapter 2, "REGATTA – Regional Aggregator of Heterogeneous Cultural Artefacts", describes the prototype of REGATTA (REGional Aggregator of heTerogeneous culTural Artefacts), aimed at presenting different types of collections, including museum collections, archaeological sites, and immovable heritage from the Ancient, Mediaeval and National Enlightenment periods in Bulgaria.
The chosen approach supports the idea of preserving the valuable national monuments in the European area of culture while keeping their identity and specificity. The aggregator was designed following the standards of Europeana and the characteristics specified in the Bulgarian regulation for creating and managing museum funds. The functional scheme, data model, and technological aspects of REGATTA are discussed. Currently REGATTA is first applied at the Plovdiv Ethnographic Museum, where artefacts from the "Crafts" fund are presented, and possibilities for making virtual tours in the museum are discussed.

Chapter 3, "Automated Metadata Extraction from Art Images", focuses on content-based image retrieval. Over the past decade, considerable progress has been made in teaching computers to understand, index and annotate pictures representing a wide range of concepts. The field of image retrieval has to overcome a major challenge: it needs to accommodate the obvious difference between the human vision system, which has evolved genetically over millennia, and digital technologies, which are limited to pixel capture and analysis. The challenges are even bigger when the focus is on the analysis of the aesthetic and semantic content of art images. Content-based image retrieval is a technology that helps to organize digital images based on their content; in this way, a variety of features at different levels of conceptualization can be extracted. Typically, a content-based image retrieval system consists of three components: feature design, indexing, and retrieval. The feature design component extracts the visual feature information from the images in the image database. The indexing component organizes the visual feature information to speed up query processing. The retrieval engine processes the user query and provides a user interface. Throughout this process the central issue is to define a proper feature representation and similarity metrics.

Chapter 4, "APICAS – Content-Based Image Retrieval in Art Image Collections Utilizing Colour Semantics", offers a succinct review of colour theory from different points of view. The rationale for this is the strong connection of any work on art paintings with the complex area of colour perception. The physiological grounds of this phenomenon are taken as a starting point for focusing the search within art painting images.
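Returning to the content-based retrieval pipeline outlined above for Chapter 3 (feature design, indexing, retrieval), a minimal sketch may help fix the ideas. The coarse colour-histogram feature, the brute-force in-memory index, and all names below are illustrative assumptions, not the systems described in this book:

```python
import math

def extract_features(pixels, bins=8):
    """Feature design: quantize each RGB channel into `bins` buckets
    and count pixel frequencies (a coarse colour histogram)."""
    hist = [0.0] * (bins * 3)
    for r, g, b in pixels:
        for channel, value in enumerate((r, g, b)):
            hist[channel * bins + value * bins // 256] += 1
    total = len(pixels) or 1
    return [h / total for h in hist]  # normalize so image size does not matter

def cosine_similarity(a, b):
    """Similarity metric: cosine of the angle between feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

class ImageIndex:
    """Indexing + retrieval: a brute-force index; real systems replace
    the linear scan with tree- or hash-based structures."""
    def __init__(self):
        self.entries = []  # list of (image_id, feature_vector)

    def add(self, image_id, pixels):
        self.entries.append((image_id, extract_features(pixels)))

    def query(self, pixels, top_k=3):
        q = extract_features(pixels)
        ranked = sorted(self.entries,
                        key=lambda e: cosine_similarity(q, e[1]),
                        reverse=True)
        return [image_id for image_id, _ in ranked[:top_k]]

# Toy usage: three single-colour "images", queried with a reddish image
index = ImageIndex()
index.add("red", [(250, 10, 10)] * 100)
index.add("green", [(10, 250, 10)] * 100)
index.add("blue", [(10, 10, 250)] * 100)
print(index.query([(240, 20, 20)] * 100, top_k=1))  # most similar: ['red']
```

Production systems use far richer descriptors (e.g. MPEG-7) and dedicated index structures, but the division of labour between the three components stays the same.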
A brief historical overview of attempts to define colour interconnections and mutual colour influences is given. Visual low-level features, which represent colour distribution in art images, were chosen as a ground for constructing higher-level concepts. Harmonies and contrasts were classified in accordance with Itten's theory, from the point of view of the three main characteristics of colour – hue, saturation and luminance – and a formal description of the defined harmonies and contrasts was established. A method for extracting local features that capture local colour and texture information, based on tiling the image and applying vector quantization to MPEG-7 descriptors calculated for the tiles of the image, has been described and implemented. The program system APICAS ("Art Painting Image Colour Aesthetics and Semantics") was developed to supply an appropriate environment for realizing the proposed algorithms and for conducting experiments. A variety of experiments on the use of the system for different tasks (similarity search, user queries, predictive analysis) shows the usefulness of the proposed features as a step in the transition from Web 2.0 to Web 3.0.

The final chapter, "Automatic Metadata Generation and Digital Cultural Heritage", examines how automatic metadata generation methods can be used in the field of cultural heritage. An approach for indirect spatial data extraction by learning restricted finite state automata is proposed. The realized system InDES was tested on the extraction of spatial metadata from websites and shows promising results. This gives assurance that the approach can be used for metadata extraction from object descriptions, and thus in the process of migration from older representations of objects when the descriptions are in unstructured form. An approach for association rule mining, which uses the possibilities of multidimensional numbered information spaces as storage structures, is built. The algorithm ArmSquare is realized in the data mining environment PaGaNe. The extraction of frequent item-sets can be used for reinforcing connections between metadata elements within a created ontology. Based on similar techniques, but in the field of categorization, are class-association rule algorithms. The algorithm PGN, which belongs to this family, is also realized in PaGaNe; it was applied to the analysis of semantic attributes extracted from art images using content-based image retrieval. Within the framework of data management and aggregator access, the classifier PGN can be used to enhance information discovery.

Contributors of the Book

This book brings together the efforts of multiple authors and their experience in a number of domains.
Krassimira Ivanova is a Senior Assistant in the Information System Department at the Institute of Mathematics and Informatics – Bulgarian Academy of Sciences. Since 2011 she has been head of the laboratory "Digitisation of Scientific and Cultural Heritage". Her main topics of interest are Data Mining, Image Retrieval, Multimedia Semantics, Databases, and Information Systems applied in the areas of Digitization of Cultural Heritage, Disaster Risk Management, and Analysis of Economic Processes. She has published about 70 journal articles and conference peer-reviewed papers.


Milena Dobreva is a Senior Lecturer in library, archival and information studies at the University of Malta. She has been principal investigator of EC-, JISC- and UNESCO-funded projects in the areas of user experiences, digital cultural heritage and digital preservation. She has worked at IMI-BAS since 1990, where she earned her PhD degree in Informatics and Applied Mathematics and was the founding head of the first Digitization Centre in Bulgaria (2004). She also served as chair of the Bulgarian national committee of the Memory of the World programme of UNESCO. Currently, she is a regular project evaluator for the EC and a number of national research bodies. She contributed to 5 books, was the lead author of 5 textbooks on informatics for the secondary school, and published over 40 journal articles, technical reports and conference papers. Milena was awarded an Academic Award for young researchers (Bulgarian Academy of Sciences, 1998) and an honorary medal for contribution to the development of the relationships between Bulgaria and UNESCO (2006).

Peter Stanchev is a Professor at Kettering University, Flint, Michigan, USA and Professor and Head of the Information System Department at the Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Sofia, Bulgaria. He has published 2 books, more than 200 chapters in monographs, journal and conference peer-reviewed papers, more than 200 conference papers and seminars, and has more than 600 citations. His research interests are in the field of Multimedia Systems, Database Systems, Multimedia Semantics, and Medical Systems. Serving on many database and multimedia conference program committees, he is currently editor-in-chief and on the editorial boards of several journals.

George Totkov is a Professor and Head of the Computer Science Department at Plovdiv University. Currently he is Vice-Rector for information infrastructure, quality systems and accreditation, and distance learning at Plovdiv University.
His main topics of interest are computational linguistics, e-Learning, conceptual modelling, information systems, applied mathematics, etc. Prof. Totkov has been a principal investigator of over 40 national and international projects in computer science, e-learning and applications of IT in education. He has authored over 200 scientific publications and 10 textbooks for secondary and higher education.

Kalina Sotirova is a Senior Assistant and PhD student in the Information System Department at the Institute of Mathematics and Informatics – Bulgarian Academy of Sciences. Her research interests are digitization of and online access to cultural heritage, computer graphics, EduTainment, digital storytelling, and initiatives for communicating science. She has published over 10 scientific papers and is contributing to fostering the creation of a national digitization strategy.

Juliana Peneva is an Associate Professor at New Bulgarian University, Computer Science Department, and at the Institute of Mathematics and Informatics of the Bulgarian Academy of Sciences. Her research interests include information modelling, database systems, software engineering, and e-Learning. Juliana Peneva has participated in over 15 research projects and authored over 60 publications.

Stanislav Ivanov is an Associate Professor and Head of the Computer Science Department at New Bulgarian University. His research interests include object-oriented programming, computer-aided design, computer graphics and e-Learning. Stanislav Ivanov took part in over 20 research projects and has more than 80 publications.

Rositza Doneva is an Associate Professor at the Department of Computer Sciences, Plovdiv University (PU) and Manager of the Regional Distance Educational Centre at PU. She has been a principal investigator or member of over 20 national and international projects in computer science, electronic and distance learning and applications of IT in education. Her main scientific interests are in the fields of electronic and distance learning, object-oriented programming, information modelling, etc. Rositza Doneva is the author of over 70 scientific publications and 10 handbooks on computer science and information technologies for secondary and higher education.

Emil Hadjikolev is a Senior Assistant at the University of Plovdiv. His main topics of interest are information systems, digitization of cultural heritage, business process modelling, databases, programming, etc. Emil Hadjikolev has authored over 10 scientific publications.

George Vragov is a Senior Assistant in the Information Systems Department, Institute of Mathematics and Informatics – Bulgarian Academy of Sciences. He works at the branch of the Institute in Plovdiv.
He has extensive experience in the field of information modelling and research and their applications to different domains. His professional and scientific research interests include web-based systems for content management, digital libraries and digitization of cultural heritage.

Elena Somova is an Associate Professor in the Computer Science Department at the University of Plovdiv. She earned her PhD degree in the field of e-learning in 2003. Her main topics of interest are distance education, e-Learning management systems, process modelling, and metadata extraction and generation in different fields, including cultural heritage. Her scientific results are published in 35 papers and 6 books. Elena Somova has participated in more than 20 international, national and university projects.

Evgenia Velikova is an Associate Professor, currently Vice-Dean for the Bachelor's programs at the Faculty of Mathematics and Informatics of Sofia University "Saint Kliment Ohridsky". Her main research interests are in the field of algebra and its applications, such as coding theory and cryptography.

Iliya Mitov is a Senior Assistant in the Information System Department at the Institute of Mathematics and Informatics – Bulgarian Academy of Sciences. His main topics of interest are data mining, databases, and information systems applied in the areas of digitization of cultural heritage, disaster risk management, and analysis of economic processes. He has published about 70 journal articles and conference peer-reviewed papers.

Koen Vanhoof is a Full Professor in Business Informatics at the University of Hasselt, Belgium. His major research interests lie in the areas of Data Mining, Statistics, Knowledge Engineering and Modelling, Computational Intelligence Methods, Decision Support Systems and Soft Computing Applications to Information Management, Marketing and Finance, Mobility and Traffic Safety. He has authored and/or co-authored over 60 peer-reviewed journal articles, about 7 book chapters and 60 conference papers. Currently he is Vice-Dean for Research at the Faculty of Applied Economics and project leader of the Business Informatics research group at the University of Hasselt.

Benoit Depaire is an Assistant Professor of Business Informatics at the University of Hasselt, Belgium. His research interests focus on Data Mining, Data Analytics, Data Modelling and Statistics within the field of Business Studies. Recently, he expanded his research to the domain of Business Process Modelling and Business Process Mining. He has authored and/or co-authored over 15 peer-reviewed journal and conference papers.
Dimitar Blagoev is an Assistant Lecturer at the University of Plovdiv and a Senior Assistant in the Information System Department at the Institute of Mathematics and Informatics (Bulgarian Academy of Sciences). His main topics of interest are Computational Linguistics, Business Process Modelling, Data Mining and Image Retrieval.

Chapter 1: Digitization of Cultural Heritage – Standards, Institutions, Initiatives

Kalina Sotirova, Juliana Peneva, Stanislav Ivanov, Rositza Doneva, Milena Dobreva

1 Cultural Heritage

The term Cultural Heritage (CH) designates a monument, group of buildings or site of historical, aesthetic, archaeological, scientific, ethnological or anthropological value. CH can be seen as being of world, regional, national, or local importance. For example, the UNESCO World Heritage Convention [UNESCO, 1972] defines cultural heritage of world value as "architectural works, works of monumental sculpture and painting, elements or structures of an archaeological nature, inscriptions, cave dwellings and combinations of features, which are of outstanding universal value from the point of view of history, art or science; …works of man or the combined works of nature and man, and areas including archaeological sites which are of outstanding universal value from the historical, aesthetic, ethnological or anthropological point of view". It is worth noting that CH is closely related to the concept of value, which is considered in two dimensions – scope of interest (from local to global) and particular area of contribution (historical, aesthetic, ethnological, anthropological). Another descriptive definition is provided by the ICCROM 3 Working Group "Heritage and Society" [ICCROM, 2005], which states that CH is "the entire corpus of material signs – either artistic or symbolic – handed on by the past to each culture and, therefore, to the whole of humankind... include both the human and the natural environment, both architectural complexes and archaeological sites, not only the rural heritage and the countryside but also the urban, technical or industrial heritage, industrial design and street furniture… The preservation of the cultural heritage now covers the non-physical cultural heritage, which includes the signs and symbols passed on by oral transmission, artistic and literary forms of expression, languages, ways of life, myths, beliefs and rituals, value systems and traditional knowledge and know-how".

3 http://www.iccrom.org/

Cultural heritage institutions – Galleries, Libraries, Archives, and Museums (or GLAM) – vary in types and sizes across the globe, but in the last decade almost all of them have come to use digital resources. The digital world is the fastest growing and changing world. The euphoria of the early days of converting analogue information into digital format is gone; using the digitized content to deliver new products and services in the creative and information industries is what now justifies the efforts of the many experts from various domains involved. When talking about CH e-display today there are two main actors who define the requirements – his majesty the User and current technology standards. These requirements must be known and met when starting digitization, not at the following steps. This means that a digital object from a specific collection in a GLAM institution is to be digitized, stored and presented for someone, in comparison and in hierarchy with objects of the same type. Correct metadata is a must for any search engine, especially when rich functionality is the goal. Re-use and contextualisation are crucial for cultural content and always have been. There is no change in the principle of curation between the institutional environment and its digital alternative; the means, richness and value are different, in favour of the web. The digital world makes contextualizing richer and easier, adding a new layer to it – the layer of the user.
If metadata standards and interoperability rules are followed, the user can create his own virtual collections in minutes, learn the stories behind the objects of his interest, organize and re-use his personal collections, share them with others, print them, etc. Usually all these options are available for free. A new type of rights has evolved as a result of digitization, partly because digitization is a cross-institutional and interdisciplinary process. It brings together memory institutions that hold CH, technology labs, and cultural policy makers/managers, whose legal concerns are very different. Rights management for digital heritage lies beyond the scope of this study. However, as it concerns online cultural content, we would like to point out that intellectual property rights have not been harmonized yet.


In proof of this statement several initiatives of WIPO 4, ICOM 5, UNESCO and others can be mentioned. The matter is complex and contradictory and has to take into consideration a wide spectrum of issues of commercial, cultural, ethical, historical, moral, religious, or spiritual nature. Digital rights management for cultural content is a stumbling block for all creative industries, including GLAM institutions; the recent discussion about the adoption of ACTA 6 shows this clearly. UNESCO puts an accent on the identification of the legal frameworks that would facilitate long-term digital preservation of CH. Building such a framework is one of the five goals of the upcoming UNESCO conference "The Memory of the World in the Digital Age: Digitization and Preservation" (26-28.09.2012, Vancouver, Canada). In 2011 WIPO and ICOM signed a Memorandum of Understanding, trying to build a common discussion framework for access to and dissemination of digitized cultural artefacts [WIPO and ICOM, 2011]; more specifically, they are trying to manage intellectual property and copyright issues. UNESCO and WIPO pay special attention to intangible heritage, where legal issues are even more complicated. Within the Creative Heritage Project 7 WIPO defines guidelines for managing IP issues when recording, digitizing and giving access to this intangible heritage. The ICOM-WIPO Art and Cultural Heritage Mediation Process may be used by public or private parties, including but not limited to states, museums, indigenous communities, and individuals. The scope of ICOM-WIPO Mediation covers disputes relating to ICOM's areas of activities – digitization, deposit, acquisition, intellectual property, etc. In 2011 the General Assembly of the Europeana Council of Content Providers and Aggregators presented its Licensing Framework [Keller, 2011] and tried to solve the orphan works issue within the framework of the Europeana project. Other possible approaches to orphan works are examined in [Hansen, 2012].
The intellectual property questions are also within the scope of the work of the MinervaEC working group [MINERVA IPR]. The Google Book Search Library Project provoked some key questions about infringing reproduction and fair use under copyright law [Manuel, 2009].

4 http://www.wipo.int/
5 http://icom.museum/
6 http://www.europarl.europa.eu/news/en/headlines/content/20120220FCS38611/html/Everything-you-need-to-know-about-ACTA
7 http://www.wipo.int/freepublications/en/tk/934/wipo_pub_l934tch.pdf

Copyright for cultural content in Bulgaria is regulated in two main legal documents – the Bulgarian Cultural Heritage Act 8 and the Bulgarian Copyright and Neighbouring Rights Act 9. All the initiatives mentioned show that free distribution, sharing and open access to both tangible and intangible heritage currently depend mostly on legislation, not on technology.

2 The Three Building Blocks of Digital Heritage

There are three basic activities which are vital for creating, using and sustaining digital heritage: digitisation, access and preservation.

The first is digitization, the process of converting analogue objects into digital form. For objects that have no analogue original but are born digital, this step is replaced by the creation of the object itself.

The second element is providing access to the digital heritage. This not only means that users can "see" an object – first of all they should have efficient and intuitive resource discovery tools.

The third part is assuring long-term preservation of digital objects, which guarantees that digital objects created in the past are available now and in the future. This means not only that the objects are physically intact, but also that they can be rendered and actually used.

2.1 Digitization

According to the Merriam-Webster Dictionary, the first known use of the verb digitize dates from 1953. Nowadays digitization means "conversion of analogue information in any form (text, photographs, voice, etc.) to digital form with electronic devices (scanners, cameras, etc.) so that the information can be processed, stored, and transmitted through digital circuits, equipment, and networks". Another meaning is: "integration of digital technologies into everyday life by the digitization of everything that can be digitized" 10. The second definition is wider and applies fully to CH. New terms appeared as a result of mass digitization at the beginning of the 21st century – "digital heritage", "digital humanities" and "digital curation".

8 http://mc.government.bg/files/635_ZKN.doc (in Bulgarian)
9 http://solicitorbulgaria.com/index.php/bulgarian-copyright-and-neighbouring-rights-act
10 http://www.businessdictionary.com/


Copyright, authorship and intellectual property rights concern all three pillars of DH, but for digitization there are four points which need clarification at the start of each digitization project – who holds, and under what conditions, the rights for e-storage, e-presentation, e-access and e-distribution (incl. e-commerce) of the digitized artefacts. Digitization techniques depend on the type of object – text, photograph, architecture, audio, video, etc. Digitization technology consists of specialized hardware, software, and networks; the technical infrastructure includes protocols and standards, and presupposes policies and procedures (for workflow, maintenance, security, upgrades, etc.). For example, in digitizing art collections, interesting results have been achieved by using not only photography and video, but also X-ray, 3D and laser scans, infrared, and UV [Chen et al, 2005]. A comprehensive survey in this direction is given by David Stork [Stork, 2008]. In the field of digitizing 3D objects, reality-based surveying techniques (e.g. photogrammetry, laser scanning, LIDAR technology, etc.) employ hardware and software to metrically survey reality as it is, documenting in 3D the actual visible situation of a site by means of images, range data, CAD drawings and maps, classical surveying (GPS, total station, etc.), or an integration of the aforementioned techniques [Manferdini and Remondino, 2010]. It is worth mentioning here the contribution to digitization in Bulgaria of the Institute of Mathematics and Informatics (IMI-BAS). In 2002 the EC-funded KT-DigiCULT-BG project implemented in IMI-BAS led to the opening of a digitization centre within the institute. Currently the digitization infrastructure at IMI features 2 professional Zeutschel scanners for scanning manuscripts, books, newspapers, graphics, maps and large-format documents.
More than 100 000 documents have been scanned and used for reconverting a variety of artefacts, such as state archives or personal archives of prominent Bulgarian scientists; old printed Bulgarian books from the 17th to the 19th century (from the funds of the National Library "Ivan Vazov", Plovdiv); periodicals from the beginning of the 20th century; architectural photographic collections, etc. The centre also digitised the archive volumes of the Serdica Mathematical Journal and PLISKA Studia Mathematica Bulgarica, so this Bulgarian mathematical heritage can now be included as an integral part of the World Digital Mathematical Library.

2.2 Access

Access to digital cultural heritage means first of all efficient tools for resource discovery. The efforts for developing metadata schemas basically serve this domain, because without high-quality metadata the discovery of digital objects is impossible. One particularly interesting recent trend is the use of content-based information retrieval methods for cultural heritage. For example, the project AXES 11 works on methods for generating metadata on video and audio objects, using image analysis, speech analysis and OCR of subtitles in videos. This is an example of an integrated project which brings together several different methods for content-based retrieval. In IMI-BAS the team of Radoslav Pavlov works on digital content management. It developed IMI-MDL 12, which offers a rich environment for creating different types of collections featuring folklore, Bulgarian traditions and Bulgarian culture artefacts. This environment caters for interoperability among many different applications. Currently, using the established IMI-DLMS, several digital libraries are being created – the Virtual Encyclopaedia of Bulgarian Icons 13, Folklore DL 14, Encyclopaedia Slavica Sanctorum 15, and the Bulgarian Folklore Artery, which allows virtual presentation of Bulgarian folk cultural heritage using advanced knowledge-based technologies. The creation of knowledge-based technologies is not merely based on the development of appropriate models and implementations, but requires new business models which address the issues of value for the users, and appropriate cost-revenue models for the creation and delivery of digital objects in the cultural heritage domain. For example, the Metropolitan Museum of Art (MET) is an exemplary case of how a museum can work in the digital environment and use web technologies for its own benefit.
The WEB 2.0 business model, used widely by MET and other American museums, is based on open access to free content (text, video, photo, music) generated by museum visitors in social networks. User-generated content is highly regarded by people because it is free of institutional and other types of control and policy. MET is one of only a handful of museums that have created comprehensive online access to all catalogued works. As part of a broader effort to support its commitment to online visitors and to build and encourage its relationship with them, MET also operates email marketing and social media programs that provide content and interactive experiences. More than one million fans, followers, and subscribers interact with the MET daily on Facebook, Twitter, Flickr, Tumblr, ArtBabble, iTunes U, and YouTube. MET is also a founding member of the Google Art Project, which draws Google's broader Internet audience straight to its galleries and collections [Campbell and Rafferty, 2011]. Unfortunately, many Bulgarian GLAM institutions still do not use the full potential of WEB 2.0, and in some cases even of routine Internet applications. Out of 200 museums and galleries in Bulgaria, 81 have websites, but in more than 90% of the cases those websites are not linked to WEB 2.0 [Sotirova, 2011].

11 http://www.axes-project.eu/
12 http://mdl.cc.bas.bg/
13 http://bidl.cc.bas.bg/
14 http://folknow.cc.bas.bg/
15 http://www.eslavsanct.net/

2.3

Preservation

Digital preservation (DP) is defined by the DigitalPreservationEurope project as "a set of activities required to make sure digital objects can be located, rendered, used and understood in the future". 16 The term "digital curation" is often used in parallel with the term digital preservation but it addresses "maintaining, preserving and adding value to digital research data throughout its lifecycle". 17 As [Lavoie and Dempsey, 2004] argued: "The long-term future of digital resources must be assured, in order to protect investments in digital collections and to ensure that the scholarly and cultural record is maintained in both its historical continuity and media diversity… The digital preservation is not just a mechanism for ensuring bit sequences created today to be renderable tomorrow, but also is a process operating in concert with the full range of services supporting digital information environments, as well as the overarching economic, legal, and social contexts". The strategic role of DP in the knowledge economy and eInfrastructures is explicitly stated in high level policy documents of the European Commission. In 2009, the DPimpact report emphasized that "From a strategic point of view, the most relevant strength of DP is its potential multiplier effect on a key resource (born-digital content) for the knowledge economy" [DPimpact, 2009]. This is further elaborated to "integration of organisational policies in technological implementations" as well as "interesting technological developments, such as more automated

16 17

http://www.digitalpreservationeurope.eu/what-is-digital-preservation/ http://www.dcc.ac.uk/digital-curation/what-digital-curation (the emphasis is on research data, and in addition to preservation, it also addresses enhancement of the research data)

30

Access to Digital Cultural Heritage ...

and scalable DP tools, increased capacity of support infrastructures, tools and procedures for addressing high volume, dynamic, volatile and shortlived content, as well as for re-using preserved content" [DPimpact, 2009]. DP has to address two major problems: (1) the physical deterioration (the digital media is very vulnerable to deterioration and catastrophic loss); and (2) the digital obsolescence (the advantages of introducing new hardware and software technologies are coupled with the disadvantages of older ones becoming obsolete, i.e. unusable on the new platforms). The rather limited funding dedicated to preservation in the cultural heritage sector currently coexists with a significant investment into production of digital resources. The NUMERIC project gathered data on digitisation across Europe and summarised that "European institutions reported investment of €80 million annually in the digitisation of their collections, inferring a significant level of expenditure within the whole of the European cultural arena" [NUMERIC, 2009]. This survey-based estimate is not presenting the total real expenditure on digitisation across the EU countries, but is indicative on the scale of annual investment across cultural and scientific heritage institutions. With regard to preservation "of the 262 survey responders who had formulated digitisation plans, 150 (57%) confirmed that these included considerations for the long term preservation of their digitised assets" [NUMERIC, 2009]. Considerations for long term preservation do not yet mean active implementation and an alarming proportion (nearly half) of the institutions are in fact not prepared for DP. In 2006, the Online Computer Library Center developed a four-point strategy for the long-term preservation of digital objects that consisted of [OCLC, 2006]: 

• Assessing the risks for loss of content posed by technology variables such as commonly used proprietary file formats and software applications;
• Evaluating the digital content objects to determine what type and degree of format conversion or other preservation actions should be applied;
• Determining the appropriate metadata needed for each object type and how it is associated with the objects;
• Providing access to the content.
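The first point, assessing format risk, is often implemented as a simple rule-based check over a format register. The sketch below is illustrative only; the format-to-risk table is invented example data, not part of the OCLC strategy:

```python
# Illustrative sketch of format risk assessment (the first OCLC step).
# The format-to-risk mapping is hypothetical example data, not an
# official risk register.
FORMAT_RISK = {
    "tiff": "low",          # open, widely supported master format
    "pdf/a": "low",         # ISO-standardised archival profile
    "jpeg": "medium",       # lossy, but ubiquitous
    "psd": "high",          # proprietary application format
    "wordperfect": "high",  # obsolete proprietary format
}

def assess(formats):
    """Return the formats that should be queued for preservation action."""
    return [f for f in formats if FORMAT_RISK.get(f.lower(), "unknown") == "high"]

at_risk = assess(["TIFF", "PSD", "WordPerfect", "JPEG"])
print(at_risk)  # ['PSD', 'WordPerfect']
```

In a real repository the register would also record format versions and the software needed to render each format.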

Several different complementary strategies are applied in order to assure long-term preservation of digital objects, such as: refreshing (the transfer of data between two types of the same storage medium assuring

Chapter 1: Digitization of Cultural Heritage – Standards, Institutions, Initiatives


prevention from physical deterioration); migration (transferring data to newer system environments – changing file formats, programming languages, operating systems, etc. – which tries to prevent digital obsolescence); replication (creating duplicate copies of data in one or more locations, which increases the chance that the data will survive, but introduces difficulties in refreshing, migration, versioning, and access control); and emulation (replicating the functionality of an obsolete system – applications, operating systems, or hardware platforms). A number of models have been proposed that describe the life-cycle of digital preservation tasks. The pivotal standard in the domain – ISO 14721, widely known as the OAIS reference model – presents a functional framework with the main components and basic data flows within a digital archive system [OAIS, 2002]. It defines six functional entities which synthesise the most essential activities within a digital archive: ingest, preservation planning, archival storage, data management, administration and access. The DCC Digital Curation Life-Cycle Model18 presents these core digital preservation activities in a wider context that also includes appraisal and disposal.
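Refreshing, migration and replication all depend on fixity information to detect physical deterioration before copies are lost. A minimal sketch of fixity checking; the sample bytes and the choice of SHA-256 are illustrative:

```python
import hashlib

def checksum(data: bytes) -> str:
    """Compute a fixity value for a stored bitstream."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, recorded: str) -> bool:
    """True if the bitstream still matches the checksum recorded at ingest."""
    return checksum(data) == recorded

# At ingest: record the checksum alongside the object.
original = b"scanned page, 300 dpi"
recorded = checksum(original)

# During a later refreshing cycle: re-read the copy and compare.
print(verify(original, recorded))                       # True
print(verify(b"scanned page, 300 dpi\x00", recorded))   # False - deterioration detected
```

When verification fails, the damaged copy is replaced from one of the replicas, which is why replication and fixity checking are usually applied together.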

3 The Importance of Metadata

In order to be easily retrieved, shared and used by different users and for different purposes, the various types of e-documents have to be described following common schemas and rules, i.e. specifications/standards and metadata. The term metadata, i.e. data about data, is used in different senses, ranging from machine-understandable information to records that describe electronic resources. In a library, "metadata" applies to any kind of resource description. Metadata describe how, when and by whom a particular set of data was collected, and how the data is formatted. Metadata is essential for understanding information stored in data warehouses and has become increasingly important in XML-based Web applications19. In addition, metadata ensure the accessibility, identification and retrieval of resources. Descriptive metadata facilitate the resources' organization, interoperability and integration, provide digital identification and support archiving. Poor quality or non-existent metadata mean that resources remain invisible within a repository or archive, and thus undiscovered and inaccessible. In the case of digital assets, metadata

18. http://www.dcc.ac.uk/resources/curation-lifecycle-model
19. http://www.webopedia.com/TERM/M/metadata.html/


usually is structured textual information that describes something about the creation, content, or context of an image20. There are several types of metadata:

• descriptive – title, author, extent, subject, keywords;
• structural – unique identifiers, page numbers, special features (table of contents, indexes);
• technical – file formats, scanning dates, file compression format, image resolution;
• preservation – archival information;
• legislative – digital rights management (ownership, copyright, licensing information).

Metadata can be stored in three different ways:



• separately, as an HTML, XML or MARC21 (a format for library catalogues) document linked to the resource;
• in a database linked to the resource;
• as an integral part of the record in a database, or embedded in the Web pages themselves.
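The third option is commonly realised by embedding Dublin Core elements as `<meta>` tags in the HTML head, following the DC.element naming convention. A sketch of generating such tags; the record values are invented:

```python
from html import escape

def dc_meta_tags(record: dict) -> str:
    """Render a metadata record as Dublin Core <meta> tags for an HTML head."""
    # The schema.DC link declares which element set the DC.* names refer to.
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in record.items():
        lines.append(f'<meta name="DC.{element}" content="{escape(value, quote=True)}">')
    return "\n".join(lines)

# Invented sample record.
record = {"title": "Icon of St. George", "creator": "Unknown", "date": "1704"}
print(dc_meta_tags(record))
```

Because the tags travel with the page itself, this approach avoids the link-rot risk of a separately stored record, at the cost of making bulk updates harder.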

Although the importance of metadata has been recognized, means for its efficient implementation are still lacking. Metadata implementation is complex due to the rapid growth in digital object repositories and the development of many different metadata standards. On the other hand, quality metadata can be produced only by experts in the subject domain. So far, most resource discovery metadata are still created and corrected manually by authors, depositors and/or repository administrators. It therefore appears attractive to auto-generate metadata with no human intervention; recent research findings are reported in [Polfreman and Rajbhandaji, 2008] and [Greenberg et al, 2005]. For metadata to be processed by computer, proper encoding has to be applied. This is done by adding markup to a document to store and transmit information about its structure, content or appearance. Schemas comprise metadata elements designed to describe particular information. We can mention the following encoding schemas concerning how metadata is presented:

• HTML (Hyper-Text Markup Language);
• XML (eXtensible Markup Language);
• RDF (Resource Description Framework);
• MARC (Machine Readable Cataloguing);
• SGML (Standard Generalized Markup Language).

20. http://www.jiscdigitalmedia.ac.uk/crossmedia/advice/metadata-overview/


Metadata schemas can be viewed as standards describing the categories of information to be recorded. They ensure consistency in metadata application and support interoperability of applications and resource sharing. Schemas are built from individual components, i.e. metadata elements; each element, as defined by the schema, holds a particular category of information. Certainly not all schemas contain the same elements, as the needs of users differ [Peneva et al, 2009].
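In practice a repository enforces a schema by validating each incoming record against the schema's element list. A minimal sketch with an invented three-element schema:

```python
# Hypothetical miniature schema: element name -> whether it is mandatory.
# Real schemas (Dublin Core, VRA Core, ...) define far richer constraints.
SCHEMA = {"title": True, "creator": True, "subject": False}

def validate(record: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = [f"unknown element: {e}" for e in record if e not in SCHEMA]
    problems += [f"missing mandatory element: {e}"
                 for e, required in SCHEMA.items()
                 if required and e not in record]
    return problems

print(validate({"title": "Thracian rhyton", "creator": "Unknown"}))  # []
print(validate({"title": "Thracian rhyton", "colour": "gold"}))
# ['unknown element: colour', 'missing mandatory element: creator']
```

Rejecting or flagging non-conforming records at ingest is what makes the consistency promised by a schema actually hold across a collection.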

4 Metadata Schemas and Standards Used in Cultural Heritage

According to the VRA-web community21, data standards promote the consistent recording of information and are fundamental to the efficient exchange of information. They provide the rules for structuring information, so that the data entered into a system can be reliably read, sorted, indexed, retrieved, communicated between systems, and shared. They help protect the long-term value of data. In practice, metadata is data about data and can therefore be considered a subset of data content. The identification and management of metadata are important for facilitating access to wide ranges of materials over networks; this is particularly important because of the rapid development of resources on the WWW. According to their content, there are four types of standards, concerning: data structure, data content, data value, and data communication.

• Data structure standards deal with the definition of a record and the relationships of the fields within it;
• Data content standards describe the metadata associated with digital copies of material culture. Examples of such standards are Dublin Core and VRA Core;
• Data value standards contain a description of concepts and the relations between them in the field of cultural heritage. Typical examples are the thesauri built by the Getty Research Institute – AAT, ULAN, TGN;
• Data communication/record interchange standards and protocols define the technical framework for exchanging information between systems. The MARC standard, for example, is a hybrid of a data structure and an information exchange standard.

In addition, standards can be divided according to the application area they serve. They fall into several groups: common standards; standards for resource discovery; specific standards for libraries, archives and museums; and other standards relevant to cultural heritage. Of course, such a division is rather arbitrary, since in the course of their development some standards have expanded beyond the settings they originally served to cover a wider spectrum of application areas.

21. http://www.vraweb.org/resources/datastandards/faqs.html

4.1 Common Standards

As [Doerr and Stead, 2011] put it, "there is a set of rich conceptual models or core ontologies of relationships for the digital world that are completely integrated and cover, in a complementary way, a vast spectrum of key conceptualizations for memory institutions and the management of digital content. Such core ontologies of relationships are fundamental to schema integration and play a vital role in practical knowledge management completely different to the role played by specialist terminologies. The vision is not merely to aggregate content with finding aids, as current DLs do, but to integrate digital information into large scale, trans-disciplinary networks of knowledge. These networks support not only accessing source documents, but also using and reusing the integrated knowledge embedded in the data and metadata themselves while managing the increasingly complex digital data aggregates and their derivatives". The complexity of CH objects requires constant extension of the common metadata standards and domain ontologies; the common standards listed below are therefore under permanent development.

• CIDOC-CRM
CIDOC-CRM22 (Conceptual Reference Model) provides an object-oriented model with hierarchical classes – more precisely, a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation. CIDOC-CRM is intended to be a common language for domain experts and implementers to formulate requirements for information systems. The CIDOC-CRM is the result of the efforts of the CIDOC Documentation Standards Working Group and the CIDOC-CRM SIG, which are working groups of CIDOC. Since 2006 it has been an official standard, ISO 21127:2006, last updated in 2010. One of the goals of CIDOC-CRM was to create a common and extensible semantic framework that any heritage information source can use and develop further.
The CIDOC-CRM was developed by a Working Group of the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM). It concentrates on the definition of relationships, rather than terminology, in order to allow homogeneous access to heterogeneous database schemata and metadata structures, migration between such sources, and merging of the information they contain. The meanings of its concepts and relationships were derived from the analysis of hundreds of relevant data structures used by memory institutions, initially museums. This led to a compact model of 86 classes and 134 relationships, easy to comprehend and suitable to serve as a basis for the mediation of cultural and library information. The model has recently enjoyed rapid uptake in large-scale information aggregation projects.

• FRBROO
The standard Functional Requirements for Bibliographic Records (FRBR)23 of the International Federation of Library Associations (IFLA), published in 1998, is an "object-related model" of metadata for bibliographic descriptions. Between 2003 and 2008 the working groups of CIDOC-CRM and FRBR jointly developed a conceptual model capturing the concepts of FRBR as a core ontology (FRBROO) and integrated it with the CRM in a modular way. The model captures in an ontologically rigorous manner the aggregation of intellectual content by origin and derivation, as intended by FRBR, and formalizes the documentation of the performing arts. The model was jointly approved by IFLA and ICOM in 2009.

• Europeana Data Model (EDM)
Europeana is a very large-scale metadata repository and aggregation service for all kinds of cultural heritage information from Europe. EDM reuses elements from Dublin Core, CIDOC-CRM, FRBROO and Object Reuse and Exchange (ORE)24. It provides powerful abstractions even over Dublin Core and CIDOC-CRM concepts that ensure sufficient recall when accessing this vast collection.

• VRA Core
VRA Core25 has been a standard of the Visual Resources Association's Data Standards Committee since 1982, aimed at describing visual objects of cultural heritage. It contains 13 categories with 119 metadata elements. It consists of a metadata element set (units of information such as title, location, date, etc.), as well as an initial blueprint for how those elements can be hierarchically structured. The element set provides a categorical organization for the description of works of visual culture as well as the images that document them. The standard is used in museums, visual resources collections, archives and libraries for art and architecture, archaeological sites and more.

22. http://www.cidoc-crm.org/
23. http://www.ifla.org/en/publications/functional-requirements-for-bibliographic-records/
24. http://www.openarchives.org/ore/
25. http://www.vraweb.org/projects/vracore4/index.html

4.2 Standards for Resource Discovery

Metadata is an essential part of any digital resource, and its main purpose is to support resource discovery. If resources are to be retrieved and understood in the distributed environment of the WWW, they must be described in a consistent, structured manner suitable for processing by computer software26.

• Dublin Core
Dublin Core (DC)27 is definitely the most popular standard, developed by the Dublin Core Metadata Initiative in 1995. Its basic part (the dc namespace) contains only 15 elements: contributor, coverage, creator, date, description, format, identifier, language, publisher, relation, rights, source, subject, title and type. Each is optional and repeatable, and may appear in any order the creator of the metadata wishes. This simple generic element set is applicable to a variety of digital object types and is used for the description of simple textual or image resources. For richer descriptions enabling more refined resource discovery, Qualified Dublin Core has been developed. It employs additional qualifiers that further refine the meaning of the basic 15 elements and so increase the precision of the metadata; it adds 7 groups with 126 metadata elements.

• OAI-PMH
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)28 was established in 2002 and is a protocol for metadata harvesting. It is directly connected with Dublin Core and XML. The Open Archives Initiative works for the effective dissemination of interoperability standards and promotes open access and improvements to institutional repositories. The aim of OAI-PMH is to facilitate broad access to digital resources for eScholarship, eLearning, and eScience.

• DOI
DOI (Digital Object Identifier)29 is an ISO International Standard which provides a framework for the identification and management of digital content over networks, providing persistence and semantic interoperability. The system is managed by the International DOI Foundation, an open membership consortium including both commercial and non-commercial partners. Over 50 million DOI names have been assigned by DOI System Registration Agencies in the US, Australasia, and Europe. Using DOI names as identifiers makes managing intellectual property in a networked environment much easier and more convenient, and allows the construction of automated services and transactions.

26. http://www.ukoln.ac.uk/qa-focus/documents/briefings/print-all/metadata/
27. http://dublincore.org/
28. http://www.openarchives.org/
29. http://www.doi.org/
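A DOI name has a fixed syntax: the directory indicator "10.", a registrant prefix, a slash, and a registrant-chosen suffix (for example 10.1000/182, the DOI of the DOI Handbook). The sketch below is a rough, purely syntactic check; it cannot tell whether a name is actually registered in the DOI system:

```python
import re

# Rough pattern: "10.", a numeric registrant code, "/", a non-empty suffix.
# Real suffixes are case-insensitive and may contain most printable characters.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def looks_like_doi(name: str) -> bool:
    """Syntactic plausibility check only; resolution requires the DOI system."""
    return DOI_PATTERN.fullmatch(name) is not None

print(looks_like_doi("10.1000/182"))      # True
print(looks_like_doi("doi:10.1000/182"))  # False - strip the scheme first
print(looks_like_doi("11.1000/182"))      # False - wrong directory indicator
```

Such a check is useful when harvesting metadata in bulk, to catch identifiers that were never DOI names before attempting to resolve them.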

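The OAI-PMH protocol described above is carried over plain HTTP GET requests whose verb argument names one of six operations (Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, GetRecord). A sketch of building a harvesting request; the repository endpoint URL and set name are placeholders:

```python
from urllib.parse import urlencode

def oai_request(base_url: str, verb: str, **args) -> str:
    """Build an OAI-PMH request URL (the protocol uses plain HTTP GET)."""
    return base_url + "?" + urlencode({"verb": verb, **args})

# Hypothetical repository endpoint; metadataPrefix=oai_dc asks for
# records in the mandatory Dublin Core format.
url = oai_request("http://example.org/oai", "ListRecords",
                  metadataPrefix="oai_dc", set="paintings")
print(url)
# http://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc&set=paintings
```

The repository answers with an XML document containing the requested records, which an aggregator then parses and stores.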
4.3 Specific Standards

The core ontologies are generic across a set of domains; domain ontologies express conceptualizations that are tuned for a specific area.

4.3.1 Standards for Libraries

Such standards enable the maintenance of standardized bibliographic descriptions.

• MARC 21
MARC 21 (MAchine-Readable Cataloguing)30 is probably the most popular and widespread standard, consisting of multiple sub-standards developed by the Library of Congress in 1999. MARC contains 5 sub-standards: for Bibliographic Data, Authority Data, Holdings Data, Classification Data, and Community Information. MARC is a standard for the representation and communication of bibliographic and related information in machine-readable form.

• METS
METS (Metadata Encoding & Transmission Standard)31 was created by the Digital Library Federation in 2007. It is maintained by the Network Development and MARC Standards Office of the Library of Congress, USA. The METS schema is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language. It contains 33 XML elements arranged in a tree-like structure with 158 attributes.

• MAB2
MAB232 has been the standard of the German National Library since 2001 and contains many sub-standards: for bibliographic data, personal names, corporate names, titles, local data, addresses and library data, and classification and notation data.

30. http://www.loc.gov/marc/
31. http://www.loc.gov/standards/mets/
32. http://www.d-nb.de/eng/standardisierung/formate/mab.htm


• MODS
MODS (Metadata Object Description Schema)33 has been a standard of the Library of Congress since 2008. It is an XML schema for descriptive metadata compatible with the MARC 21 bibliographic format. It includes a subset of MARC fields and uses language-based tags rather than the numeric ones used in MARC 21 records.

• MIDAS
MIDAS34 can be defined as a specific standard for the description of historical heritage. MIDAS was developed by English Heritage in 2008 to document buildings, archaeological sites, shipwrecks, artefacts and so on.

4.3.2 Standards for Archives

These standards provide common metadata records for archival descriptions regardless of the physical media on which the documents are located.

• ISAD(G)
ISAD(G) (General International Standard Archival Description)35 is a 1994 standard of the International Council on Archives and contains 26 metadata elements in 7 categories.

• ISAAR (CPF)
ISAAR (CPF) (International Standard Archival Authority Record for Corporate Bodies, Persons and Families)36 is the analogous standard for archival authority records, developed by the Committee on Descriptive Standards in 2003.

• DACS
DACS (Describing Archives: a Content Standard)37, adopted by the Society of American Archivists in 2004, is the American analogue of ISAD(G) and ISAAR (CPF). DACS contains 31 metadata elements in 10 categories and consists of a set of rules for describing archives, personal documents and collections of manuscripts.

33. http://www.loc.gov/standards/mods/
34. http://www.english-heritage.org.uk/server/show/nav.8331
35. http://www.ica.org/en/node/30000
36. http://www.icacds.org.uk/eng/ISAAR(CPF)2ed.pdf
37. http://www.archivists.org/governance/standards/dacs.asp


• EAD
EAD (Encoded Archival Description)38, of the Society of American Archivists and the MARC Standards Office of the Library of Congress, has been since 2002 the standard for describing archives and collections and for encoding such documents. EAD was developed as a way of marking up the data contained in finding aids so that they can be searched and displayed online. In archives and special collections, resources are described via a finding aid. Finding aids differ from catalogue records by being much longer, more narrative and explanatory, and highly structured in a hierarchical fashion. They generally start with a description of the collection as a whole, indicating what types of materials it contains and why they are important. The finding aid then describes the series into which the collection is organized and ends with an itemization of the contents of the physical boxes and folders comprising the collection.

4.3.3 Standards for Museums

Standards for museums provide adequate systems for the metadata description of museum objects.

• CDWA
CDWA (Categories for the Description of Works of Art)39 was established by the College Art Association in 1990. It consists of 31 categories with 505 metadata elements for the description of artworks (objects and images). The standard has a lightweight version, CDWA Lite. The FDA Guide40 of the Foundation for Documents of Architecture, from 1994, is an expansion of CDWA intended to describe architectural documents; it contains 92 metadata elements split into 5 categories. The standard Object ID41 of the John Paul Getty Trust, from 1999, is a small subset of CDWA. The standard Museumdat42 was created by the Institut für Museumsforschung in 2006 for the extraction and automatic publication of basic metadata in museum portals. It is a summary of CDWA Lite and consists of 5 categories with 114 metadata elements.

38. http://www.archivists.org/saagroups/ead/aboutEAD.html
39. http://www.gettytrust.us/research/conducting_research/standards/cdwa/
40. http://www.getty.edu/research/conducting_research/standards/fda/
41. http://icom.museum/objectid/
42. http://museum.zib.de/museumdat/museumdat-v1.0-en.pdf


• SPECTRUM
The standard SPECTRUM43 was developed by museums in Britain in 2007. Because of the bulky character of the standard (it contains 481 metadata elements), a lighter version, SPECTRUM Essentials, was developed for small museums. Besides metadata, SPECTRUM contains a description of 21 museum procedures, accompanied by the necessary supporting data.

• LIDO
LIDO (Light Information Describing Objects)44 is a new standard from 2009, established on the basis of CDWA Lite, CIDOC CRM, Museumdat and SPECTRUM, and consists of 12 categories with 75 metadata elements. The standard is used by the Athena Project.

4.4 Other Standards Relevant to Cultural Heritage

Certain standards are specialized for other purposes but indirectly concern the CH area. Thus the MPEG family of standards, which describe multimedia objects, falls into the CH scope. Another example is the standards specialized in geospatial data, which are used for contextualising CH objects in geographic space.

• MPEG Family
The ISO/IEC Moving Picture Experts Group (MPEG)45 has developed a suite of standards for the coded representation of digital audio and video. Two of the standards address metadata: MPEG-7, Multimedia Content Description Interface (ISO/IEC 15938), and MPEG-21, Multimedia Framework (ISO/IEC 21000). MPEG-7 defines the metadata elements, structure, and relationships that are used to describe audiovisual objects including still pictures, graphics, 3D models, music, audio, speech, video, and multimedia collections. MPEG-7 does not prescribe how descriptors are encoded and stored. Depending on the degree of abstraction, descriptors are extracted in different ways: most low-level features are extracted by automatic means, while high-level ones require more user interaction. The vision for MPEG-21 is to define a multimedia framework to enable transparent and augmented use of multimedia resources across a wide range of networks and devices used by different communities. MPEG-21 defines a standard for expressing digital rights, permissions and restrictions on digital content, from the creator of the content to its users. MPEG-21, as an XML-based standard, aims to collect information on rights of access to digital information. One purpose of the introduction of this standard is the hope that the industry will end illegal file sharing and that it will instead represent "a normative open framework for multimedia delivery and consumption to be used by all participants in the chain. This open framework will provide content creators, producers, distributors and service providers with equal opportunities in the MPEG-21 enabled open market" [MPEG 21, 2005].

• CSDGM
CSDGM (Content Standard for Digital Geospatial Metadata)46 is a metadata schema for geospatial datasets comprising topographic and demographic data, geographic information systems (GIS), and computer-aided cartography base files. An international standard, ISO 19115 Geographic Information – Metadata, was issued in 2003. The objective of the standard is to provide a common set of terminology and definitions for the documentation of digital geospatial data. The standard establishes the names of data elements and compound elements (groups of data elements) to be used for these purposes, the definitions of these compound elements and data elements, and information about the values that are to be provided for the data elements [CSDGM, 1998].

43. http://www.collectionstrust.org.uk/spectrum
44. http://www.athenaeurope.org/getFile.php?id=535
45. http://www.mpeg.org/

5 Digital Library

According to the definition given by ECDL 200547, "A digital library is a library in which collections are stored in digital formats (as opposed to print, microform, or other media) and accessible by computers. The digital content may be stored locally, or accessed remotely via computer networks". As was discussed in a workshop of the international conference TPDL 2011: "Digital Libraries are information systems and their technology can be researched as such. They are also organizations and they can be researched also in that respect. They are arenas for information seeking behaviour and for social processes such as learning and knowledge sharing, which can be another dimension of research. They are collections of content that need curation. They are social institutions with a social mandate, and as such they are affected by social, demographic and legal issues".

46. http://www.fgdc.gov/metadata/csdgm/
47. http://www.ecdl2005.org/


5.1 Basic Definitions

Usually any collection of digital objects is called a repository. During the last five years different types of repositories have been built, ranging from subject-based digital collections through e-journals up to collaborative learning environments. But how does a repository differ from other datasets such as directories, operational databases, catalogues, and portals? Currently there is no clear definition of the repository concept. For the purposes of this book, and in the context of cultural heritage, we link the terms repository, library and aggregator in the following way:

• A repository consists of digital objects, organized in collections, which are stored and managed in computer networks. Both digital libraries and aggregators are repositories;
• A library is a fully equipped repository, with an appropriate user interface and services. A digital library is domain- and institution-specific;
• An aggregator is a depository which ingests and manages digital content from GLAM sources into a repository. It does not necessarily have a user-oriented interface, does not necessarily provide services, and is not necessarily a heritage holder itself. An aggregator may act only as a technical mediator between the holder institution and its digital library. The process of data ingestion and management follows the technical and technological requirements of a specific project.

The basic elements in these structures are digital objects. In [Kahn and Wilensky, 1995] digital objects are defined as "a data structure whose principal components are digital material, or data, plus a unique identifier for this material, called a handle (and, perhaps, other material)". This definition further evolved to capture access rules for using the object and metadata describing the content [Lagoze, 1995]. Following these definitions, digital objects can be referred to as entities together with their metadata and the services they offer to clients. Generally speaking, a digital repository can be considered a means of handling digital content; repositories may therefore include a wide range of content for a variety of purposes and users. What goes into a repository depends on decisions made by each institution or administrator. The peculiarities that distinguish digital repositories from other digital collections are summarized in [Heery and Anderson, 2005], where an attempt to develop a classification of repositories is also made. According to Heery and Anderson, repositories can be typified by content (corporate records, e-theses, learning objects, research data), by coverage (personal, institutional, national, journal), by users (learners,


researchers, teachers, etc.) and by function (access, preservation, dissemination, reuse). JISC48 takes two more features of repositories into account, namely policy (persistence, deposit, access) and infrastructure (centralized versus distributed). It is very important to determine the content and scope of any repository, because this is the basis for defining its managerial policies.
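The Kahn and Wilensky definition cited earlier maps naturally onto a small data structure: a unique handle, the digital material itself, and its metadata, later extended with access rules. A sketch; the field names and sample values are ours, not from the original paper:

```python
from dataclasses import dataclass, field

@dataclass
class DigitalObject:
    """Data plus a unique identifier (handle), plus descriptive metadata."""
    handle: str                                        # unique identifier for the material
    data: bytes                                        # the digital material itself
    metadata: dict = field(default_factory=dict)       # e.g. Dublin Core elements
    access_rules: list = field(default_factory=list)   # later refinement [Lagoze, 1995]

# Invented example object.
obj = DigitalObject(
    handle="cdl/10.0001",
    data=b"...image bytes...",
    metadata={"title": "Fresco fragment", "format": "image/tiff"},
)
print(obj.handle, obj.metadata["title"])  # cdl/10.0001 Fresco fragment
```

Keeping the handle, bitstream and metadata together in one record is what lets a repository move, replicate or migrate an object without the parts drifting out of sync.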

5.2 The Contemporary Models of Digital Libraries

Contemporary digital libraries (DL) are trying to deliver richer and better functionality, which is usually user-oriented and depends on current IT trends. What distinguishes DLs nowadays is not only the technological side, which is under constant development, but the content. In the CH domain the content is very complex and, as a rule, interactive, which explains the more complex technological requirements for building DLs in this domain. The technical requirements for the presentation of digitized cultural content are:

• a well-structured digital library and personalized access to it;
• rich functionality and easy management, incl. metadata and knowledge management;
• Web 2.0 tools: creation of user-oriented groupings of objects ("personal collections") and complex objects;
• Web 3.0 services: advanced search (standard, semantic, contextual).

There are several reference models in use which satisfy the requirements above. We will put the accent on three of them, chosen to represent the main trends in the construction of DLs in the last decade.

• OAIS
Flexibility among collections is a key feature; accordingly, a GLAM repository has to offer a proper infrastructure with a well-defined range of services, and a high-level archival model to act as a framework is necessary. In 2002 the Consultative Committee for Space Data Systems (CCSDS) prepared a Blue Book with technical recommendations establishing a common framework of terms and concepts which comprise an Open Archival Information System (OAIS) [OAIS, 2002] [OAIS, 2009]. Later OAIS was adopted as the international standard ISO 14721:200349, Space data and information transfer systems – Open archival information system – Reference model. This model can be successfully implemented as a common framework for application areas such as CH and GLAM institutions.

48. http://www.ukoln.ac.uk/repositories/digirep/index/Typology
49. http://www.iso.org/iso/rss.xml?csnumber=24683&rss=detail

Figure 1. OAIS Functional Entities [OAIS, 2009]

The functional schema of OAIS (Figure 1) contains six entities and related interfaces.

Ingest functions include receiving Submission Information Packages (SIPs), performing quality assurance on SIPs, generating Archival Information Packages (AIPs), extracting Descriptive Information from the AIPs for inclusion in the archive database, and coordinating updates to Archival Storage and Data Management.

Archival Storage provides the services and functions for the storage, maintenance and retrieval of AIPs.

Data Management provides the services and functions for populating, maintaining, and accessing both the Descriptive Information, which identifies and documents archive holdings, and the administrative data used to manage the archive. Data Management functions include administering the archive database functions (maintaining schema and view definitions, and referential integrity), performing database updates (loading new descriptive information or archive administrative data), performing queries on the data management data to generate result sets, and producing reports from these result sets.

Administration provides the services and functions for the overall operation of the system.

Preservation Planning provides the services and functions for monitoring the environment of the OAIS and providing recommendations to ensure that the information stored in the OAIS remains accessible to the Designated User Community over the long term, even if the original computing environment becomes obsolete.

Access provides the services and functions that support Consumers in determining the existence, description, location and availability of information stored in the OAIS, and allows Consumers to request and receive information products. Access functions include communicating with Consumers to receive requests, applying controls to limit access to specially protected information, coordinating the execution of requests to successful completion, generating responses (Dissemination Information Packages, result sets, reports) and delivering them to Consumers [Ivanova, 2011]. Evaluations concerning the usability of OAIS for building different kinds of digital repositories are given in [Allinson, 2006].

 DELOS DLRM

The DELOS Digital Library Reference Model (DELOS DLRM) is the result of many meetings of cross-domain experts within the framework of the EC-funded project DELOS [DELOS DLRM, 2007]. The aim of the project was to achieve expert consensus on fundamental concepts, definitions and structures in the field of digital libraries (DL). The model was created by European research groups with experience in the field of DLs, which are part of the DELOS Network of Excellence 50. The model is to be considered a common frame, followed by institutions which create, develop and maintain DLs, so that interoperability requirements are met. Because of the complex character of DLs and the diversity of the digital world, DELOS DLRM is undergoing continuous development. At the foundation of the model lie three concepts: Digital Library (an organization, which might be virtual, that comprehensively collects, manages, and preserves for the long term rich digital content, and offers to its user communities specialized functionality on that content, of measurable quality and according to codified policies), Digital Library System (a software system that is based on a defined architecture and provides all functionality required by a particular Digital Library.
Users interact with a Digital Library through the corresponding Digital Library System), and Digital Library Management System (a generic software system that provides the appropriate software infrastructure both to produce and administer a Digital Library System incorporating the suite of functionality considered foundational for Digital Libraries and to integrate additional software offering more refined, specialized, or advanced functionality). These correspond to three different levels of conceptualization [DELOS DLRM, 2007]. According to DELOS DLRM, there are six domains involved in a DL – Content, User, Architecture, Policy, Quality, and Functionality
(Figure 2). DELOS DLRM defines more than 100 concepts for the links between the six elements.

50 http://www.delos.info/

Figure 2. DELOS Elements [DELOS DLRM, 2007]

Content: the data and information that the Digital Library handles and makes available to its users. It is composed of a set of information objects organized in collections. It encompasses a diverse range of information objects, including such resources as objects, annotations, and metadata, which are a precondition for the syntactic, semantic, and contextual interpretation of information objects.

User: covers the various actors (human or machine) which interact with Digital Libraries. Digital Libraries connect actors with information and support them in their ability to consume it and make creative use of it to generate new information. Included here are such elements as the rights that actors have within the system and the profiles of the actors, with characteristics that personalize the system's behaviour or represent these actors in collaborations. This element is very important for keeping in touch with other environments, such as social networks, and provides quick feedback on the accuracy and quality of the information in the system.

Functionality: the services that a Digital Library offers to its different users. The minimum set of functions includes new information object registration, search, and browse. Beyond that, each DL manages a different set of functions in order to serve the particular needs of its community of users relating to the content it contains.

Quality: represents the parameters that can be used to characterize and evaluate the content and behaviour of a Digital Library. Quality can be associated not only with each class of content or functionality but also with specific information objects or services. Some of these parameters are objective in nature and can be automatically measured, whereas others are subjective in nature and can only be measured through user evaluations.

Policy: represents the sets of conditions, rules, terms and regulations governing interaction between the Digital Library and users, whether virtual or real. Examples of policies include acceptable user behaviour, digital rights management, privacy and confidentiality, charges to users, and collection delivery.

Architecture: refers to the Digital Library System entity and represents a mapping of the functionality and content offered by a Digital Library onto hardware and software components.

The six core concepts (Content, User, Functionality, Quality, Policy and Architecture) that lie at the heart of the Digital Library universe need to be considered in conjunction with the four main roles in which actors interact with digital library systems – End-Users, Designers, System Administrators, and Application Developers [DELOS DLRM, 2007].

 Model 5S

The 5S model, just like DELOS DLRM, tries to develop a common view of what a digital repository in an international context should be built upon.

Figure 3. High-level concepts in the 5S Model [Goncalves et al, 2004]

The 5S model is built upon Streams, Structures, Spaces, Scenarios, and Societies [Goncalves et al, 2004], which are the core elements of a framework providing theoretical and practical unification of digital libraries. This model is more computer-science oriented and helps in deeply understanding the mathematical methods and algorithms that are useful in the process of constructing, building and using digital libraries. Below, the main concepts of the model are described as presented in [Goncalves et al, 2004].

A stream is a sequence of elements of an arbitrary type (e.g., bits, characters, images, etc.). In this sense, streams can model both static (e.g. text) and dynamic (e.g. video) content. In the static interpretation, the temporal nature is ignored or irrelevant, and a stream corresponds to some information content that is interpreted as a sequence of basic elements, often of the same type. The type of the stream defines its semantics and area of application. A dynamic stream can represent an information flow. Typically, a dynamic stream is understood through its temporal nature: it can be interpreted as a finite sequence of clock times and associated values, which can be used to define a stream algebra. The synchronization of streams can be specified with Petri Nets or other approaches.

A structure specifies the way in which parts of a whole are arranged or organized. In digital libraries, structures can represent hypertexts, taxonomies, system connections, user relationships, etc. Markup languages (e.g., SGML, XML, HTML) have been the primary form of exposing the internal structure of digital documents for retrieval and/or presentation purposes. Relational and object-oriented databases usually impose strict structures on data, as tables or graphs. The increasing complexity and heterogeneity of content calls for more contemporary ways of describing interconnections, such as semantic nets. In general, humans and natural language processing systems can expend considerable effort to unlock the interwoven structures found in texts at the syntactic, semantic, pragmatic, and discourse levels.

A space is a set of objects together with operations on those objects that obey certain constraints. The combination of operations on objects in the set is what distinguishes spaces from streams and structures. Spaces are extremely important mathematical constructs; the operations and constraints associated with a space define its properties. Spaces can also be defined by a regular language applied to a collection of documents. Document spaces are a key concept in many digital libraries. Human understanding can be described using conceptual spaces. Multimedia systems must represent real as well as synthetic spaces in one or several dimensions, limited by some metric or presentational space (windows, views, projections) and transformed to other spaces to facilitate processing. Many of the synthetic spaces represented in virtual reality systems try to emulate physical spaces. Digital libraries can use many types of spaces (measure spaces, probability spaces, vector spaces, topological spaces, etc.) for indexing, visualizing, and other services they perform.

A scenario is useful as part of the process of designing information systems. It can be used to describe external system behaviour from the user's point of view; to provide guidelines for building a cost-effective prototype; or to help validate, infer, and support requirements specifications and provide acceptance criteria for testing. Scenarios tell what happens to the streams, in the spaces, and through the structures. Taken together, the scenarios describe services, activities, and tasks, and thus ultimately specify the functionalities of a digital library.

A society is a set of entities and the relationships between them. The entities include humans as well as hardware and software components, which either use or support digital library services. Societal relationships make connections between and among the entities and activities. Members of societies have activities and relationships. During their activities, society members often create information artefacts (art, history, images, data) that can be managed by the library. Electronic members of digital library societies, i.e., hardware and software components, are normally engaged in supporting and managing services used by humans. A society is the highest-level component of a digital library, which exists to serve the information needs of its societies and to describe the context of its use. Digital libraries are used for collecting, preserving, and sharing information artefacts between society members.
Several societal issues arise when we consider them in the digital library context. These include policies for information use, reuse, privacy, ownership, intellectual property rights, access management, security, etc. Language barriers are also an essential concern in information systems, and the internationalization of online materials is an important issue in digital libraries, given their globally distributed nature.

 5M Layer to 5S Model

Digital libraries for international development need a combination of converging technologies which enable librarians and end users to manage, access and utilize collections of increasing size and complexity. The authors of the 5M model [Darányi et al, 2010] foresee this happening through a mix of social networking and automatic document indexing and categorization. The 5M model describes a digital library with "Multicultural, Multilingual, Multimodal documents, plus their content processed by Multivariate statistical algorithms, adding the Modelling of user behaviour and content evolution". This can be made to match the respective 5S formal model of DLs. The proposed extension to the 5S model is to add the possibility of using an infinite-dimensional Hilbert space in order to allow the visualization of evolving semantic content in sentences, documents or databases.
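Returning to the 5S abstractions, the toy sketch below (with invented names, not taken from [Goncalves et al, 2004]) models a stream as a typed sequence, a structure as labelled links over named parts, and a trivial scenario as a service that traverses that structure:

```python
from dataclasses import dataclass

@dataclass
class Stream:
    """A 5S stream: a sequence of elements of one arbitrary type."""
    element_type: str
    elements: list

@dataclass
class Structure:
    """A 5S structure: an arrangement of parts of a whole,
    here labelled (label, parent, child) links over named segments."""
    links: list

# A static stream of characters carrying a tiny "document" ...
text = Stream("char", list("Access to Digital Cultural Heritage"))

# ... and a structure exposing its internal organisation, much as a
# markup language such as XML would do for a real digital document.
toc = Structure([
    ("contains", "book", "chapter-1"),
    ("contains", "chapter-1", "section-5"),
    ("contains", "section-5", "section-5.3"),
])

# A trivial "scenario": a service telling what happens to the stream
# through the structure (here, listing the parts of a given node).
def parts_of(structure: Structure, node: str) -> list:
    return [child for label, parent, child in structure.links
            if label == "contains" and parent == node]
```

A full 5S formalization would also define spaces (with operations and constraints) and societies; this fragment only illustrates how the first two "S"es compose.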

5.3 Repository Software

Digital repository solutions consist of hardware, software and open standards. A wide variety of available software with different features and strengths exists. A functional comparison of repository software products is presented in [JISC/RSS, 2010]. To set up a repository, three approaches can be followed [JISC/RSP, 2009]:

 do-it-yourself;

 use standard packages;

 outsourcing – external hosting.

With limited staff resources for long-term maintenance and support, the most popular approach appears to be using a standard package, although external hosting has recently become more popular. The more commonly adopted software solutions fall into two broad groups: open source and commercial software.

Open source software is exemplified by DSpace 51, Fedora 52, EPrints 53, and Digital Commons 54. DSpace is the software of choice for academic, non-profit, and commercial organizations building open digital repositories. DSpace preserves and enables easy and open access to all types of digital content including text, images, moving images, and data sets. It is applied for accessing, managing and preserving scholarly works.

Fedora (Flexible Extensible Digital Object Repository Architecture) was originally developed by researchers at Cornell University as an architecture for storing, managing, and accessing digital content in the form of digital objects [Kahn and Wilensky, 1995]. Nowadays the Fedora Repository Project and the Fedora Commons community, together with the DSpace project, are under the supervision of the non-profit organization DuraSpace 55. The Fedora Repository Project (simply Fedora) implements the Fedora abstractions and provides basic repository services. This permits expressing digital objects, asserting relationships among digital objects, and linking services to digital objects. Fedora ensures the durability of digital content by providing tools for digital preservation. The Fedora Commons community deals with producing additional tools and applications that enlarge the functionality of the Fedora repository. The latter is extremely flexible and can be used to support any type of digital content. There are numerous examples of Fedora being used for digital collections, e-research, digital libraries, archives, digital preservation, institutional repositories, open access publishing, document management, digital asset management, and more. Fedora Commons provides sustainable technologies to create, manage, publish, share and preserve digital content.

EPrints is an open source platform for building repositories of documents like research literature, scientific data, and student theses. Digital Commons offers external hosting for institutional repositories. It can include pre-prints and/or final copies of working papers, journal articles, dissertations, master's theses, conference proceedings, and a wide variety of other content types.

Commercial software could be based on an open source repository engine coupled with a proprietary application software layer, such as VITAL 56. VITAL is an institutional repository solution built on Fedora. It is designed to simplify the development of digital object repositories and to provide online search and retrieval of information for administrative staff, contributing faculty and end-users. VITAL provides all the functions required for handling large text and rich content collections, such as storing, indexing, cataloguing, searching and retrieving.

51 http://www.dspace.org/
52 http://www.fedora-commons.org/
53 http://www.eprints.org/
54 http://digitalcommons.bepress.com/
Another possibility is openly accessible APIs with XML interfaces, an example being DigiTool 57. Because of the increased demand to manage digital assets, libraries need standard methods and tools to facilitate cataloguing, sharing, searching, and retrieval of digital collections. Through highly customizable user interfaces, DigiTool enables academic libraries and library consortia to manage and provide access to the growing volume of digital collections. Support for library standards and built-in integration with other Ex Libris products, e.g., Aleph, Voyager, MetaLib, SFX, and Primo, makes DigiTool an integral part of the library infrastructure and facilitates the incorporation of digital resources into library services. A functional comparison of repository software products is presented in the JISC Repository Infokit 58. Consulting services are available through Sun [Grant, 2007].

55 http://duraspace.org/
56 http://www.vtls.com/products/vital
57 http://www.exlibrisgroup.com/digitool.htm
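Most of the packages above (DSpace, EPrints, Fedora) expose their metadata for harvesting through the OAI-PMH protocol. The sketch below parses a hand-written fragment shaped like an OAI-PMH ListRecords response in the oai_dc format; the endpoint URL in the comment is a made-up example:

```python
import xml.etree.ElementTree as ET

# A hand-written fragment in the shape of an OAI-PMH ListRecords response.
# Real responses come from a request such as
#   http://repository.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
SAMPLE = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Access to Digital Cultural Heritage</dc:title>
          <dc:creator>Example Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def harvest_titles(xml_text: str) -> list:
    """Collect the dc:title of every record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.findall(".//dc:title", NS)]

titles = harvest_titles(SAMPLE)
```

A real harvester would also follow the protocol's resumption tokens to page through large result sets; the sketch only shows how the Dublin Core payload is extracted from a single response.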

6 Initiatives on World and European Level

Numerous successful projects covering the digitization process have been funded by a number of research programmes over the last decades, including but not limited to the Esprit, Impact, Raphael, and IST programmes [Maitre et al, 2001]. The European Union has funded numerous digital culture research and development projects. The EU's CORDIS (Community Research & Development Information Service) 59 is the primary resource for learning about past and current R&D projects in this domain.

For instance, in the field of Fine Art, some projects, such as Vasari (1989-1992) and Marc (1995-1996), focused on digital acquisition, storage and handling of colorimetric high-definition images of paintings (up to 2 GB per image) for a range of galleries and museums in the European Union. The Crisatel project (2001-2004) developed equipment for the direct fast capture of paintings, with a new ultra-high definition multi-spectral scanner, in order to make spectrometric analyses of varnish layers and allow the effect of an aged varnish to be subtracted from an image of a painting. The FingArtPrint project (2005-2008) aimed to combine 3D surface scanning and multispectral imaging in order to create a unique data record of an object which can be compared to check its authenticity [Ivanova, 2011].

Other projects and initiatives are aimed at establishing repositories. One of the first projects in this domain was NARCISSE (1990-1992), which created a very high-quality digitized image bank, supervised by a multilingual text database (in German, French, Italian and Portuguese). The objective of the project Artiste (2000-2002) was to develop and prove the value of an integrated art analysis and navigation environment aimed at supporting the work of professional users in the fine arts. The environment exploited advanced image content analysis techniques, distributed hyperlink-based navigation methods, and object-oriented relational database technologies. Artiste integrated art collections virtually while allowing the owners of each collection to maintain ownership and control of their data, using the concept of distributed linking [Ivanova, 2011].

In more recent years several projects and initiatives have focused on harmonizing activities in the digitization of cultural and scientific content in order to create a common platform for cultural heritage. One such project is MINERVA+ (MInisterial NEtwoRk for Valorising Activities in digitisation) 60, sponsored by FP6 of the EC, which enlarged the existing thematic network of European Ministries of Culture working in this direction. Since 2005 the Netherlands Organisation for Scientific Research has supported the research programme CATCH (Continuous Access to Cultural Heritage) 61, which finances teams focusing on improving the cross-fertilization between scientific research and cultural heritage. In the light of transferability and interoperability, the research teams carry out their research at the heritage institutions [Ivanova, 2011]. Below, we turn our attention to some large projects and initiatives that have made a remarkable impact in their areas.

58 http://www.jiscinfonet.ac.uk/infokits/repositories
59 http://www.cordis.europe.eu/

6.1 Library and Scientific Open-access Initiatives

Below we turn our attention to some initiatives for creating digital libraries that expand the possibilities of reaching cultural and scientific heritage in digitized form.

 TEL

The project TEL (The European Library: Gateway to Europe's Knowledge) 62, which ran from 2001 to 2004, launched an initiative to establish a European Digital Library (EDL) 63. In 2005 a virtual library portal began to operate, which now offers access to the resources of 47 European national libraries in 35 languages. EDL offers search and retrieval of metadata and digital objects (free or for a fee) of books, magazines, newspapers, audio recordings and other materials. TEL uses the DC standard with some extensions and is compatible with Z39.50, MARC 21, UNIMARC and ISO 2709. Subsequent projects expanded EDL: TEL-MEMOR (The European Library: Modular Extensions for Mediating Online Resources) in the period 2005-2007; EDLproject 64 in the period 2006-2008; TEL+ 65 in the period 2007-2009; and FUMAGABA 66 in the period 2008-2009. The National Library "St. St. Cyril and Methodius" participates in the TEL+ project.

60 http://www.minervaeurope.org/
61 http://www.nwo.nl/catch/
62 http://www.theeuropeanlibrary.org
63 http://www.theeuropeanlibrary.org/portal/organisation/cooperation/archive/edlproject
64 http://www.edlproject.eu
65 http://www.theeuropeanlibrary.org/telplus/

 World Digital Library

As is written in the mission of the World Digital Library (WDL) 67, it makes available on the Internet, free of charge and in multilingual format, significant primary materials from countries and cultures around the world. The principal objectives of the WDL are to promote international and intercultural understanding; to expand the volume and variety of cultural content on the Internet; to provide resources for educators, scholars, and general audiences; and to build capacity in partner institutions to narrow the digital divide within and between countries. The idea arose in 2005, when the US Librarian of Congress, James Billington, proposed the establishment of the WDL in a speech to the US National Commission for UNESCO in June 2005; it soon took shape as a joint project between the Library of Congress, UNESCO and five other partner institutions which are leaders in the domain of cultural heritage in different parts of the world – the Bibliotheca Alexandrina, the National Library of Brazil, the National Library and Archives of Egypt, the National Library of Russia, and the Russian State Library. Input into the design of the prototype was solicited through a consultative process that involved UNESCO, the International Federation of Library Associations and Institutions (IFLA), and individuals and institutions in more than forty countries. The successful unveiling of the prototype was followed by a decision by several libraries to develop a public, freely accessible version of the WDL, for launch at UNESCO in April 2009. More than two dozen institutions contributed content to the launch version of the site. The public version of the site features high-quality digital items reflecting the cultural heritage of all UNESCO member countries.
The WDL continues to add content to the site, and enlists new partners from the widest possible range of UNESCO members in the project.

66 http://www.theeuropeanlibrary.org/portal/organisation/cooperation/fumagaba/
67 http://www.wdl.org/en/
68 http://www.openaire.eu/

 OpenAIRE and EuDML

The FP7 project OpenAIRE 68 aims to establish an infrastructure for researchers, supporting them by providing an extensive European Helpdesk System, based on a distributed network of national and regional liaison offices in 27 countries, to ensure localized help to researchers within their own context. It also provides a repository facility for researchers who do not have access to an institutional or discipline-specific repository. The electronic infrastructure built by the project is based on the software services of the D-NET package developed within the DRIVER and DRIVER-II projects and the Invenio digital repository software developed at CERN. All deposited articles and data are freely accessible worldwide through the OpenAIRE portal. Thematically, the project focuses on peer-reviewed publications (primarily journal articles in final or pre-print form, but also conference articles when considered important) in at least the seven disciplines highlighted in the Open Access pilot (energy, environment, health, cognitive systems-interaction-robotics, electronic infrastructures, science in society, and socio-economic sciences-humanities).

OpenAIREplus, which started in November 2011, is the next step in the development of a 2nd-generation Open Access infrastructure. It will "develop an open access, participatory infrastructure for scientific information" and will expand its base of harvested publications to include all open access publications indexed by the DRIVER infrastructure (more than 270 validated institutional repositories) and any other repository containing "peer-reviewed literature" that complies with certain standards. It will offer both user-level services to experts and non-scientists alike, as well as programming interfaces for providers of value-added services.

EuDML 69 is an ICT-CIP project to build the European Digital Mathematics Library. The ambition of the project is to deliver a truly open, sustainable and innovative framework for access to and exploitation of Europe's rich heritage of mathematics. The Institute of Mathematics and Informatics at the BAS (IMI-BAS) coordinates these projects for Bulgaria. Currently, the Bulgarian open access educational repositories registered in OpenDOAR are [Simeonov and Stanchev, 2011]:

1. The repository at IMI-BAS 70, based on DSpace, has 1182 items, containing Journal Archives, Papers, Book Series, and Proceedings.

2. The repository at Sofia University "St. Kliment Ohridski" 71, based on DSpace, has 375 items, containing Papers, MSc Theses, PhD Theses, and Events.

3. The Scholar Electronic Repository 72 of New Bulgarian University, based on EPrints, has 336 items, containing Papers, MSc Theses, PhD Theses, and Lecture Notes.

Under construction are two new repositories: the Repository of the Central Medical Library at the Medical University of Sofia (MUS) 73 and the Repository of the University of Rousse 74. They are based on DSpace and will contain Journal Articles, Books, Lectures, MSc Theses, and PhD Theses.

69 http://www.eudml.eu/
70 http://sci-gems.math.bas.bg
71 http://research.uni-sofia.bg

6.2 Examples of Initiatives that Change the Digital World

Europeana, Wikipedia and the Google projects are examples of very large scale initiatives which represent three different successful approaches to building working and user-attractive repositories. Europeana focuses on European institutions and an EU perspective, thus showing a politically oriented approach. Wikipedia, as a Web 2.0 service, takes a socially oriented approach with user-generated content. The Google projects represent a creative technology company approach, resulting in new, attractive and useful web services, like Google Books, Google Earth, the Google Art Project, etc.

 Europeana

The idea of Europeana 75 was born in 2005, when the European Commission announced its strategy to promote and support the creation of a European digital library, as a strategic goal within the European Information Society i2010 Initiative, which aims to foster growth and jobs in the information society and media industries. The European Commission's goal for Europeana is to make European information resources easier to use in an online environment. It builds on Europe's rich heritage, combining multicultural and multilingual environments with technological advances and new business models. Europeana.eu went live on 20 November 2008. To date more than 19 million digital items are available (Images: paintings, drawings, maps, photos and pictures of museum objects; Texts: books, newspapers, letters, diaries and archival papers; Sounds: music and spoken word from cylinders, tapes, discs and radio broadcasts; Videos: films, newsreels and TV broadcasts). Europeana uses the DC standard for the description of the objects, supplemented by several specific metadata elements – 49 elements in total (7 highly recommended, 10 recommended, 20 additional and 12 specific).
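The flavour of such DC-based descriptions can be shown with a short sketch that serializes a record from a few core Dublin Core elements; the described object and field values are invented for illustration:

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

def dc_record(fields: dict) -> str:
    """Serialize a flat field dict as a simple Dublin Core XML record."""
    record = ET.Element("record")
    for name, value in fields.items():
        el = ET.SubElement(record, f"{{{DC}}}{name}")
        el.text = value
    return ET.tostring(record, encoding="unicode")

xml = dc_record({
    "title": "Icon of St. John of Rila",  # invented example object
    "type": "Image",
    "language": "bg",
})
```

A real Europeana submission would use the full application profile rather than this flat form, but the sketch shows how far a handful of DC elements already goes towards a harvestable description.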

72 http://eprints.nbu.bg
73 http://nt-cmb.medun.acad.bg:8080/jspui/
74 http://dspace.ru.acad.bg/
75 http://www.europeana.eu/

Currently there are 108 partners from 23 countries in Europeana, and its growth continues with new projects related to the creation of regional and local aggregators of digital artefacts. Thus, for example, the projects MICHAEL (Multilingual Inventory of Cultural Heritage in Europe) 76 (2004-2008) and MICHAEL+ (2006-2009) are associated Europeana aggregators, providing multilingual descriptions of digital resources. To record the relevant metadata, DC with some extensions is used – the MICHAEL-EU Dublin Core Application Profile, which contains 147 metadata elements. At the time of writing this chapter, sixteen Bulgarian institutions participate in MICHAEL.

Several projects connected with Europeana address different aspects of presenting European Cultural Heritage. ATHENA 77 (2008-2011), for example, aims to bring together relevant stakeholders and content owners from all over Europe, and to evaluate and integrate standards and tools for facilitating the inclusion of new digital content into Europeana. The LIDO standard is used for object description. The project involved 120 institutions from 24 countries, incl. Bulgaria. The project EuropeanaLocal 78 (2008-2011) supports the inclusion of local and regional libraries, museums, archives and audio-visual archives into Europeana. The project has a large partner network of regional and local institutions in 27 countries and describes objects using the Europeana standards. It aims to improve the interoperability of the digital content held by regional and local institutions and to make it accessible through Europeana and other services. The project Judaica Europeana 79 aims to provide access to European Jewish culture. APEnet 80 tries to provide EU citizens, public authorities and companies with a common portal for accessing the archives of Europe. The project CARARE 81 is focused on making the digital content for the archaeology and architectural heritage that its partners hold available through Europeana, aggregating content and delivering services, and enabling access to 3D and Virtual Reality content through Europeana.

None of the aggregated collections, however, are actually held by Europeana. Ironically, this prestigious library, with a recognizable brand, does not act as the custodian of these collections, hosting within the portal only a thumbnail preview and the metadata: the textual explanations that describe the objects or works of art. Through browsing and searching on Europeana, and after discovering the collections, the

user is taken out of Europeana to the content provider, where the digital object resides [Hazan, 2011].

76 http://www.michael-culture.org/en/home
77 http://www.athenaeurope.org/
78 http://www.europeanalocal.eu/
79 http://www.judaica-europeana.eu/
80 http://www.apenet.eu/
81 http://www.carare.eu/

 Wikipedia

The motto of the Wikimedia Foundation is "Imagine a world in which every single human being can freely share in the sum of all knowledge". The Wikimedia Foundation is a non-profit and non-governmental organization. The basic idea underlying the creation of content in the Wikimedia projects is a flagship of Web 2.0. Wikipedia is one of the most popular projects of the Wikimedia Foundation. As Wikipedia 82 says of itself, it is a "multilingual, web-based, free-content encyclopaedia project based on an openly editable model". Currently, there are more than 82 000 active contributors working on more than 19 000 000 articles in more than 270 languages. With its 365 million readers, 18 million articles (over 3.6 million in English) and 281 editions in different languages, Wikipedia is the largest and most popular general reference work on the Internet, ranking around seventh among all websites on Alexa. It is a good example of a Web 2.0 service, together with YouTube, MySpace, and Facebook. Some have noted the importance of Wikipedia not only as an encyclopaedic reference but also as a frequently updated news resource. An investigation in the journal Nature in 2005 [Giles, 2005] found that the science articles it compared came close to the level of accuracy of Encyclopaedia Britannica and had a similar rate of serious errors. Fully automated translation of articles is disallowed. Many CH institutions are using Wikipedia to promote their collections. As for Bulgarian GLAM institutions, the English version of Wikipedia covers 22 Bulgarian museums. The Bulgarian version of Wikipedia has 259 937 articles and 93 410 registered users.

 Google's Projects

The mission of Google is manifested in another direction: "to organize the world's information and make it universally accessible and useful – requires exceptional thinking and technical expertise" 83.
So, the approach they use to generate and realize new ideas is to offer 20% of their engineers' time to work on what they are really passionate about. Some of the offspring of this approach, discussed below, are already part of our everyday practice. On 1 February 2011 Google presented the Google Art Project 84. Seventeen galleries and museums were included in the launch of the

82 http://en.wikipedia.org/
83 http://www.google.com/jobs/lifeatgoogle/englife/index.html
84 http://www.googleartproject.com/

Chapter 1: Digitization of Cultural Heritage – Standards, Institutions, Initiatives


project. The 1061 high-resolution images (by 486 different artists) are shown in 385 virtual gallery rooms, with 6000 Street View-style panoramas. Each institution contributed one gigapixel artwork for free access. Concerning the presentation of cultural heritage in connection with time and place, the latest version, Google Earth 6 85, makes it possible to use and create so-called "historical imagery" and to travel back in time through various tours. The Showcase list has 12 elements, among which those related to heritage are Historical Imagery, Ancient Rome, UNESCO, Favourite Places, and 3D Buildings. One can add 3D buildings to Google Earth quickly and easily with Google's geo-modelling and 3D modelling tools. Historical Imagery in Google Earth makes it possible literally to look at your neighbourhood, home town, and other familiar places and – more importantly for heritage issues – to see how they have changed over time.

6.3 Initiatives Connected with Data Content Standards

Several big projects have addressed the description of the high-level concepts in the cultural heritage domain [Ivanova et al, 2010].

 Getty Vocabularies

The Getty vocabularies capture the richness of the terminology used when searching for heritage and domain-specific terms. More precisely, they offer a structure of terms compliant with international standards in the following areas: art, architecture, decorative arts, archival documents, visual surrogates, bibliographic materials, etc. Thus they serve as an authoritative source of information for enhancing various databases and websites. Let us only mention the richness of the information gathered and structured in the Getty vocabularies 86. The vocabularies in this program are: 

The Art and Architecture Thesaurus – AAT (containing around 34 000 concepts including 131 000 terms, descriptions, bibliographic citations, and other information relating to fine art, architecture, decorative arts, archival materials and material culture),

The Union List of Artist Names – ULAN (containing around 127 000 records including 375,000 names and biographical and bibliographic information about artists and architects, including a wealth of variant names, pseudonyms and language variants),

85 http://www.google.com/earth/index.html
86 http://www.getty.edu/research/conducting_research/vocabularies/


The Thesaurus of Geographic Names – TGN (containing around 895 000 records including around 1 115 000 names, place types, coordinates and descriptive notes focusing on places important for the study of art and architecture), and

The Cultural Objects Name Authority – CONA (forthcoming in early 2012; it will include authority records for cultural works, featuring architecture and movable works such as paintings, sculpture, prints, drawings, manuscripts, photographs, ceramics, textiles, furniture, and other visual media such as frescoes and architectural sculpture, performance art, archaeological artefacts, and various functional objects that are from the realm of material culture and of the type collected by museums).

 Iconclass

Iconclass 87 is a hierarchical system, developed by the Netherlands Institute for Art History. It includes the following main divisions: Abstract, Non-representational Art; Religion and Magic; Nature; Human being, Man in general; Society, Civilization, Culture; Abstract Ideas and Concepts; History; Bible; Literature; Classical Mythology and Ancient History.

 WordNet

WordNet 88 is a large lexical database of English, developed under the direction of George A. Miller. It is freely and publicly available for download. Although it is not domain-specific, it is a useful tool for computational linguistics and natural language processing, especially for English-language texts.
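To illustrate how such hierarchical vocabularies support retrieval, the following sketch expands a search term with its narrower terms, in the spirit of thesauri such as the AAT or Iconclass. The mini-thesaurus and its terms are invented for the example and do not reproduce any real vocabulary.

```python
# Hand-made, illustrative thesaurus fragment (invented for this sketch):
# each term records one broader term and a list of narrower terms.
THESAURUS = {
    "paintings": {"broader": "visual works", "narrower": ["watercolors", "frescoes"]},
    "watercolors": {"broader": "paintings", "narrower": []},
    "frescoes": {"broader": "paintings", "narrower": []},
    "visual works": {"broader": None, "narrower": ["paintings"]},
}

def expand_query(term):
    """Return the term plus all its narrower terms, for recall-oriented search."""
    entry = THESAURUS.get(term)
    if entry is None:
        return [term]  # unknown terms are passed through unchanged
    result = [term]
    for narrower in entry["narrower"]:
        result.extend(expand_query(narrower))
    return result

print(expand_query("paintings"))  # ['paintings', 'watercolors', 'frescoes']
```

A search front-end could use such expansion so that a query for "paintings" also retrieves objects catalogued only under the narrower terms.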

7 The User and the New Digital World

As we have already argued, catering for users has driven the quicker development and wide dissemination of digital libraries in all domains, not only CH. The impact and value of digitised collections are concepts which are both brought to life through users. Any metrics and criteria attempting to capture impact and value have to factor in, first of all, how individual users (or user communities) 89 benefit from the digitised resources in question.

87 http://www.iconclass.nl/index.html
88 http://wordnet.princeton.edu/
89 Real people could be named differently in order to convey subtle differences on their level of engagement and role – users (in the computer environment), consumers


Thus, one specific difficulty in measuring impact and value is the subjective and quickly changing user-related component of the valorisation process. How exactly could we establish whether a digital resource has had an impact on its users? What value proposition have resource creators intended to convey to their target audiences? How well did these target audiences understand the message? Is the value they see in the resource and the surrounding services identical to what its producers had in mind? This chapter presents a description of user evaluation methodologies, and provides a case study from the area of digital resources for historians.

7.1 Users: between Policies and Real Involvement

As the volume of digitised resources grows, so does the number of studies and publications on user studies within the digital library domain. These have nevertheless been limited in scope, as noted recently by Michael Khoo: "In the case of digital library researchers, the focus of research is often on technical issues (e.g., information retrieval methods, software architecture, etc.) rather than on user-centred issues" [Khoo et al, 2009]. In fact, we are currently witnessing a paradox: major institutions from the cultural heritage sector clearly emphasize the place of user evaluation and feedback in digitisation-related policies, but in reality decisions about aspects of digitization that impact users are frequently taken without direct user involvement. For example, the "National Library of Australia Collection Digitisation Policy" states that: "The Library's digitisation activities take account of user evaluation and feedback. Users are encouraged to provide feedback and make suggestions through the Digital Collections user feedback form or other ways" [NLA, 2008]. Similarly, the "National Library of Wales: Digitisation Policy and Strategy" says that selection will be made according to "an appreciation of user requirements which will drive the selection and delivery of digitised material... the Library will seek user feedback, including that of current and potential users, by means of online surveys, structured evaluation, web metrics (collecting and interpreting data) [which] will include quantitative and qualitative data" [NLW, 2005]. The National Library of Scotland states in its 2008 – 2010 strategy document that "We will maintain awareness of the needs of our various user (and potential user) communities through market research,

(when we take a business perspective), visitors (when we speak about a particular type of resources, e.g. internet websites). In this chapter we will use the term users.


consultation and involvement, in order to develop our services in the most appropriate way" [NLS, 2008]. JISC, in its Digitisation Strategy, seeks to define its selection criteria in relation to users before the actual digitisation, wishing to "continue to fund the digitisation of high quality collections of core relevance to learning, teaching and research in the UK" while also "understand[ing] both more about the condition and potential of new collections to be digitised (particularly those held within the JISC community) and also to understand where areas of the highest demand for new collections may exist" [JISC, 2008]. Paola Marchionni has presented a range of user involvement mechanisms as a synthesis of experiences from the JISC Digitisation programme, including gathering users' feedback, establishing relationships with the users, and determining impact [Marchionni, 2009]. The examples illustrate a multi-scale view of users: including not only current but also future users; inviting their participation at different stages of the digitisation process – at the planning stage or during the use of the digitised product; and identifying methods that could be used to engage users – e.g. online surveys, structured evaluation, web metrics. However, meta-analysis shows evidence of insufficient involvement, indicating that users need to be engaged more actively in digitisation projects. Within the context of digital resources for archives, Anneli Sundqvist noted that "the general knowledge of user behaviour is a mixture of common sense, presumptions and prejudices" [Sundqvist, 2007]. The Institute of Museum and Library Services reported in 2003 that "The most frequently-used needs assessment methods do not directly involve the users" [IMLS, 2003].

7.2 User Involvement in Digital Libraries Development

This involvement serves very different purposes, which are summarised in Table 1.

Table 1. Types of user involvement in digital libraries development

Front-end involvement – Users can take part in assessment on a variety of issues related to digital libraries (technical requirements, e.g. resolution, dimensions of digital objects, preferred formats for use). At this stage users can also take part in exploratory research, e.g. identifying needs for new resources and defining requirements, as well as the rationale for selection, appraisal and prioritisation of material to be digitised.

Normative evaluation – This type of evaluation usually takes the form of iterative circles of process-and-evaluation when implementing digitisation of collections. Most typically such evaluation will focus on usability, e.g. interfaces and presentation of digitised resources, and coverage of identified needs for specific audiences.

Summative evaluation – Here the focus is the final output and its accordance with the expectations and requirements of target communities, organisational structures and the wider disciplinary domain.

Direct engagement in the digital resource creation – Direct user engagement can utilise social media tools which allow users to contribute their own digital objects or to take part in the enrichment of digitised resources – e.g. supplying full texts, or metadata. Typical examples are crowdsourcing, e.g. users contributing to create full-text versions from images, and the use of Flickr to share digitised resources more widely and invite users to contribute metadata.

7.3 User Studies

A variety of methods are used in user studies. We cannot present all of them in detail but provide a brief introduction to the various types of methods [Dobreva et al, 2011]. A large group of user study methods is based on direct user involvement. They include: 

quantitative methods, such as questionnaires and experiments involving users (most typically studying user behaviour aspects – e.g. search within an existing resource, or eye tracking – studying the gaze fixation during the use of a resource in order to analyse the quality of its interface);

qualitative methods, such as focus groups, interviews, expert evaluations and user panels (groups of users who discuss regularly the digital resource which is being studied);


mixed methods can also be used, blending quantitative and qualitative elements, e.g. longer-term experiments where users have to keep a diary of their use of a resource;

ethnographic studies are another method employing direct user involvement; in this case researchers make observations directly in the environment of creation or use of the digital resource. This method helps to see the larger picture and the dependencies of digitisation work on other processes in the organisation.

A rapidly developing group of methods for user studies is based on indirect observation. A typical method in this category is deep log analysis, which studies the traces of user activities in the use of web resources – e.g. duration of visit, search terms used, websites visited after the use of the resource studied. If the users involved in the study have to generate documents (e.g. produce a poster or a presentation), these documents can also be analysed to discover typical patterns of behaviour. In real-life practice, most current studies are based on hybrid methodologies: for example, focus groups (a qualitative method) could be combined with deep log analysis (a quantitative method) in order to see how user behaviour evidence from the logs supports statements made by real users during focus groups. The knowledge gathered by different methods can be used to build a synthesised profile of a typical user (such unified user descriptions are called personae). It can also be used to summarise typical user scenarios which show how the digital resources are used in real life.
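As a minimal illustration of deep log analysis, the sketch below computes the duration of a visit and extracts the search terms from a few log lines. The log format and its entries are invented for the example; real studies work on full web-server or proxy logs with far richer fields.

```python
import re
from datetime import datetime

# Hypothetical, simplified log format invented for this sketch:
# "<ISO timestamp> <user-id> <url>".
LOG = [
    "2011-05-10T10:00:00 u1 /search?q=icons",
    "2011-05-10T10:02:30 u1 /object/1423",
    "2011-05-10T10:05:00 u1 /search?q=frescoes",
]

def session_stats(lines):
    """Derive visit duration and the sequence of search terms from log lines."""
    times, terms = [], []
    for line in lines:
        stamp, _user, url = line.split()
        times.append(datetime.fromisoformat(stamp))
        match = re.search(r"[?&]q=([^&]+)", url)  # pull the query parameter, if any
        if match:
            terms.append(match.group(1))
    duration = (max(times) - min(times)).total_seconds()
    return {"duration_s": duration, "search_terms": terms}

print(session_stats(LOG))  # {'duration_s': 300.0, 'search_terms': ['icons', 'frescoes']}
```

Aggregating such per-session figures over thousands of visits is what allows log evidence to be set against statements gathered in focus groups.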

8 Conclusion

Over the years, the ability to process information and the ways of exchanging data have expanded in parallel. The development of computing and communication capacities allows the user to be placed at the centre of the process of information exchange and to use the full power of intellectualized tools for satisfying his/her needs and expectations. In recent years, as a result of this growth, virtual museums have changed towards more compact and systematic presentation of information, with capabilities for common interoperable search across different collections [Ivanova et al, 2010]. All these areas need much technical work on digitization and organization, done in parallel with the application of a more holistic view of the field. It is time for these three processes – digitization, access and preservation – to be examined as one complete life cycle of information objects.


Bibliography

[Allinson, 2006] Allinson, J.: OAIS as a Reference Model for Repositories. JISC Report, UKOLN, University of Bath, 2006.

[Campbell and Rafferty, 2011] Campbell, T., Rafferty, E.: Metropolitan Museum of Art: Report from the Director and the President for 2011. http://www.metmuseum.org/en/about-the-museum/annualreports/~/media/Files/About/Annual%20Reports/2010_2011/Director%20and%20President.ashx

[Chen et al, 2005] Chen, C.-C., Wactlar, H., Wang, J., Kiernan, K.: Digital imagery for significant cultural and historical materials – an emerging research field bridging people, culture, and technologies. Int. J. Digital Libraries, 5(4), 2005, pp. 275-286.

[CSDGM, 1998] Content Standard for Digital Geospatial Metadata. Federal Geographic Data Committee, Washington, D.C., USA, 1998.

[Darányi et al, 2010] Darányi, S., Wittek, P., Dobreva, M.: Position paper: adding a 5M layer to the 5S model of digital libraries. In: Proc. of Int. Conf. "Digital Libraries for International Development", Australia, 2010.

[DCMI, 2009] Interoperability Levels for Dublin Core Metadata, http://dublincore.org/documents/interoperability-levels/

[DELOS DLRM, 2007] The DELOS Digital Library Reference Model. Version 0.96, Nov. 2007, http://www.delos.info/files/pdf/ReferenceModel/DELOS_DLReferenceModel_096.pdf

[Dobreva et al, 2011] Dobreva, M., Feliciati, P., O'Dwyer, A. (eds): User Studies for Digital Library Development. Facet Publishing, London, 2011.

[Doerr and Stead, 2011] Doerr, M., Stead, S.: Harmonized models for the Digital World – CIDOC CRM, FRBROO, CRMDig and Europeana EDM. Tutorial. 15th Int. Conf. on Theory and Practice of Digital Libraries, TPDL, Berlin, Germany, 2011.

[DPimpact, 2009] DPimpact: Socio-Economic Drivers and Impact of Longer Term Digital Preservation. D.5 Final Report on Contract: 30-CE-0159970/00-04, June 2009.

[Eurobarometer, 2007] Eurobarometer Survey on Cultural Values within Europe. European Commission, Belgium, 2007.

[Giles, 2005] Giles, J.: Internet encyclopaedias go head to head. Int. Weekly Journal of Science "Nature", 14.12.2005, pp. 900-901, http://www.nature.com/nature/journal/v438/n7070/full/438900a.html

[Goncalves et al, 2004] Goncalves, M., Fox, E., Watson, L., Kipp, N.: Streams, structures, spaces, scenarios, societies (5s): a formal model for digital libraries. ACM TOIS, 22(2), 2004, pp. 270-312.

[Grant, 2007] Grant, C.: Delivering digital repositories with open solutions. Sun white paper, Ver. 8.0, Nov. 2007.

[Greenberg et al, 2005] Greenberg, J., Spurgin, K., Crystal, A.: Final Report of the AMeGA (Automatic Metadata Generation Applications) Project. UNC School of Information and Library Science, 2005.

[Hansen, 2012] Hansen, D.: Orphan Works: Mapping the Possible Solution Spaces (March 9, 2012). Berkeley Digital Library Copyright Project White Paper No. 2/2011. http://ssrn.com/abstract=2019121


[Hazan, 2011] Hazan, S.: Holding the museum in the palm of your hand. In: Dobreva, M., Feliciati, P., O'Dwyer, A. (eds): User Studies for Digital Library Development. Facet Publishing, London, 2011.

[Heery and Anderson, 2005] Heery, R., Anderson, S.: Digital Repositories Review. AHDS, 2005.

[ICCROM, 2005] ICCROM Working Group Heritage and Society: Definition of Cultural Heritage: References to Documents in History. ICCROM, 1990, revised 2005, http://cif.icomos.org/pdf_docs/Documents%20on%20line/Heritage%20definitions.pdf

[IMLS, 2003] Institute of Museum and Library Services: Assessment of End-User Needs in IMLS-Funded Digitization Projects. Oct. 2003, www.imls.gov

[Ivanova et al, 2010] Ivanova, K., Dobreva, M., Stanchev, P., Vanhoof, K.: Discovery and use of art images on the web: an overview. Third Int. Euro-Mediterranean Conf. EuroMed, Lemesos, Cyprus, Archaeolingua Publ., 2010, pp. 205-211.

[Ivanova, 2011] Ivanova, K.: A Novel Method for Content-Based Image Retrieval in Art Image Collections Utilizing Color Semantics. PhD Thesis, Hasselt University, Belgium, 2011.

[JISC, 2008] JISC Digitisation Strategy. February 2008, http://www.jisc.ac.uk/media/documents/programmes/digitisation/jisc_digitisation_strategy_2008.doc

[JISC/RSS, 2010] JISC: Repository Software Survey, Nov. 2010, http://www.rsp.ac.uk/start/software-survey/results-2010/

[JISC/RSP, 2009] JISC: Repositories Support Project – Technical Approaches, 2009, http://www.rsp.ac.uk/start/setting-up-a-repository/technical-approaches/

[Jörgensen, 2001] Jörgensen, C.: Introduction and overview. Journal of the American Society for Information Science and Technology, 52(11), 2001, pp. 906-910.

[Kahn and Wilensky, 1995] Kahn, R., Wilensky, R.: A framework for distributed digital object services, 1995, http://www.cnri.reston.va.us/home/cstr/arch/k-w.html

[Keller, 2011] Keller, P.: The Europeana licensing framework. Presentation at the 2011 General Assembly of the Europeana Council of Content Providers and Aggregators. http://www.slideshare.net/paulkeller

[Khoo et al, 2009] Khoo, M., Buchanan, G., Cunningham, S.: Lightweight user-friendly evaluation knowledge for digital libraries. D-Lib Magazine, July/August 2009, http://www.dlib.org/dlib/july09/khoo/07khoo.html

[Lagoze, 1995] Lagoze, C.: A secure repository design for digital libraries. D-Lib Magazine, 1995, http://www.dlib.org/dlib/december95/12lagoze.html

[Lavoie and Dempsey, 2004] Lavoie, B., Dempsey, L.: Thirteen ways of looking at ... digital preservation. D-Lib Magazine, 10(7/8), 2004.

[Lynch, 2003] Lynch, C.: Institutional repositories: essential infrastructure for scholarship in the digital age. ARL, 226, 2003, pp. 1-7, http://www.arl.org/resources/pubs/br/br226/br226ir.shtml

[Maitre et al, 2001] Maitre, H., Schmitt, F., Lahanier, C.: 15 years of image processing and the fine arts. Proc. of Int. Conf. on Image Processing, vol. 1, 2001, pp. 557-561.

[Manferdini and Remondino, 2010] Manferdini, A.-M., Remondino, F.: Reality-based 3D modeling, segmentation and web-based visualization. In: M. Ioannides (ed.): EuroMed 2010, LNCS 6436, 2010, pp. 110-124.


[Manuel, 2009] Manuel, K.: The Google library project: is digitization for purposes of online indexing fair use under copyright law? CRS Report, 27.11.2009, http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA511070

[Marchionni, 2009] Marchionni, P.: Why are users so useful?: User engagement and the experience of the JISC digitisation programme. J. Ariadne, Oct. 2009, http://www.ariadne.ac.uk/issue61/marchionni/

[MINERVA IPR] Intellectual Property Guidelines. Ver. 1.0, Sept. 2008, edited by the MinervaEC Working Group, http://www.minervaeurope.org/publications/MINERVAeC%20IPR%20Guide_final1.pdf

[MPEG 21, 2005] ISO/IEC 21000-2:2005 Information technology – Multimedia framework (MPEG-21), http://www.iso.org/iso/catalogue_detail.htm?csnumber=41112

[NLA, 2008] National Library of Australia: National Library of Australia Collection Digitisation Policy. 4th edition, 2008, http://www.nla.gov.au/policy/digitisation.html

[NLS, 2008] National Library of Scotland: Expanding Our Horizons. National Library of Scotland 2008-2011 Strategy, 2008, http://www.nls.uk/about/policy/docs/2008-strategy.pdf

[NLW, 2005] National Library of Wales: Digitisation Policy and Strategy, 2005, http://www.llgc.org.uk/fileadmin/documents/pdf/digitisationpolicyandstrategy2005_S.pdf

[NUMERIC, 2009] NUMERIC: Developing a statistical framework for measuring the progress made in the digitisation of cultural materials and content. Study Report: Study findings and proposals for sustaining the framework. CIPFA, UK, May 2009.

[OAIS, 2002] Reference Model for an Open Archival Information System (OAIS): Blue book. Consultative Committee for Space Data Systems, January 2002, 148 p.

[OAIS, 2009] Reference Model for an Open Archival Information System (OAIS): Pink book. Consultative Committee for Space Data Systems, August 2009.

[OCLC, 2006] Online Computer Library Center, Inc.: OCLC Digital Archive Preservation Policy and Supporting Documentation. Dublin, Ohio, USA, 2006.

[Peneva et al, 2009] Peneva, J., Ivanov, S., Andonov, F., Dokev, N.: Digital objects – storage, delivery and reuse. Proc. of the 7th Int. Conf. "Information Research and Applications", i.Tech, Madrid, Spain, 2009, pp. 61-69.

[Polfreman and Rajbhandaji, 2008] Polfreman, M., Rajbhandaji, S.: Metatools – Investigating Metadata Generation Tools. JISC Final report, Oct. 2008.

[Simeonov and Stanchev, 2011] Simeonov, G., Stanchev, P.: Open access and institutional repositories in Bulgaria. Proc. of the 1st Int. Conf. DiPP, V.Tarnovo, Bulgaria, 2011, pp.165-170. [Somova et al, 2010] Somova, E., Vragov, G., Totkov, G.: Toward regional aggregator of digitalized cultural-historical objects. Proc. of National Conference Education in Information Society, EIS 2010, 27-28.05.2010, Plovdiv, Bulgaria, pp.154-161 (in Bulgarian) [Sotirova, 2011] Sotirova, K.: Digitization of the museum collections in Bulgaria: Standards and Practices. Presentation in Round Table "The United Europe and its Cultural Heritage", Sofia, 17.06.2011. http://www.math.bas.bg/~kalina/SotirovaDigitizationBGjune2011.pdf


[Stork, 2008] Stork, D.: Computer image analysis of paintings and drawings: An introduction to the literature. Proc. of the Image Processing for Artist Identification Workshop, van Gogh Museum, Amsterdam, The Netherlands, 2008.

[Sundqvist, 2007] Sundqvist, A.: The use of records – literature overview. Archives and Social Studies: A Journal of Interdisciplinary Research, 1(1), 2007, pp. 623-653.

[UNESCO, 1972] United Nations Educational, Scientific and Cultural Organisation: Convention Concerning the Protection of the World Cultural and Natural Heritage. Adopted by the General Conference at its 17th session, Paris, 16.11.1972, http://whc.unesco.org/archive/convention-en.pdf

[Vullo et al, 2010] Vullo, G., Innocenti, P., Ross, S.: Interoperability for digital repositories: towards a policy and quality framework. Fifth Int. Conf. on Open Repositories (OR2010), Madrid, Spain, 2010.

[WIPO and ICOM, 2011] WIPO and ICOM to Collaborate in Cultural Heritage and Museum Areas. Geneva, May 3, 2011, PR/2011/689, http://www.wipo.int/pressroom/en/articles/2011/article_0015.html

Chapter 2: REGATTA – Regional Aggregator of Heterogeneous Cultural Artefacts

Emil Hadjikolev, George Vragov, George Totkov, Elena Somova

1 Introduction

Digital libraries offer a modern technological solution for presenting cultural heritage artefacts and providing semantic access to them. The main prerequisite for their effectiveness is the structuring of content through standardized collections of metadata. The European digital library Europeana plays a major role in bringing together cultural heritage content from various countries. 90 One of the issues it faces is the uneven distribution of the materials it currently presents across countries and subjects. While Europeana has already developed a strategy to include new digital objects through a network of aggregators dealing with objects of specific types, the relatively low presence of objects from some countries can be explained by the lack of digitization strategies and, respectively, of a critical mass of digitized resources. As already emphasized in Chapter 1, the European Commission currently drives digitisation by setting quantitative goals, which will possibly also address existing gaps. Current technology allows creating "digitized images" of cultural artefacts and placing them into our cultural space through the Web. As suggested in [Chen et al, 2005], "research on significant cultural and

90 At the end of 2010 Europeana had 15 million objects from over 2,000 museums, libraries, archives and audiovisual collections across the 27 countries of the European Union (see [Cousins, 2011], p.69).


historical materials is important not only for preserving them but for preserving an interest in and respect for them". The geographical location, relief and climate of the Bulgarian lands, especially the Upper Thracian Valley, have made them an attractive place to live over the centuries. From the Chalcolithic age until now, many cultural layers have piled up. The creation of a common space presenting different periods and different types of cultural marks gives the opportunity to form a comprehensive picture. This chapter presents the results of work done in 2010-11 on the development of a repository to be used within an aggregator of digitized collections of cultural objects. The specific task is to create an environment for the maintenance of various collections through the aggregator, in order to integrate them into the European digital library Europeana and to establish a shared environment for representing the rich regional cultural heritage in this part of Bulgaria. The chosen approach supports the idea of preserving the valuable national monuments in the European cultural environment while highlighting their identity and specificity. Information technologies are widely applied in the collection (including recovery), storage, processing and dissemination of information concerning cultural and historical objects (CHO). Information on cultural and historical objects is stored in digital documents in the form of data located on various media (with local or global access), organized in static Web pages, databases or digital storages. The collection of data on cultural objects includes their digitalisation and the input of relevant metadata. After the process of digitalisation is completed, the artefact is presented in a digital format, while the metadata (structured information about the digitalised object) is used to enable resource discovery, access, management, etc.
Nowadays, the extraction of relevant metadata and the so-called interoperability between different descriptions (metadata schemas) of digital artefacts are amongst the major challenges in the digital cultural heritage domain. Interoperability between metadata systems allows procedures and items associated with the processing of digital objects to be managed in the same manner, providing opportunities for the exchange of data between different systems. During the design and creation of software systems for digital object processing it is necessary to ensure not only technical compatibility (related to the use of common technical standards – file types, metadata, etc.) but also semantic interoperability, supported by common thesauri (glossaries of the terminology used). The access to digital artefacts can be supported by various means: promotional sites, virtual catalogues, virtual tours, virtual museums,


libraries and archives, cultural-historical portals, business applications (in the fields of tourism, auctions, genealogy, history of art, criminology), etc. The work presented in this chapter aims to create a digital repository for digitalised objects and a technological environment to be used for the maintenance of a range of distributed museum collections in an aggregator; the long-term goal is to prepare the ingested materials for integration into the European digital library. The content of that library is expected to be populated with selected collections from the museum and gallery funds of the Plovdiv region. This task is aligned with the idea of storing the valuable historical monuments in the European cultural environment while completely preserving their identity and particulars. Certain experience has already been accumulated with the application of appropriate metadata: when the structure of the museum collections is specified, positive results are observed in retrieving data from the object descriptions available in them.
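The schema interoperability discussed above can be sketched as a simple crosswalk: a hypothetical local museum record is mapped onto Dublin Core-style fields before ingestion. The local field names and the mapping table below are assumptions made purely for illustration, not the actual REGATTA schema.

```python
# Invented crosswalk from hypothetical local field names to Dublin Core
# element names; a real aggregator maintains such tables per provider.
LOCAL_TO_DC = {
    "object_name": "dc:title",
    "author": "dc:creator",
    "dating": "dc:date",
    "technique": "dc:format",
    "museum": "dc:publisher",
}

def to_dublin_core(record):
    """Map known local fields onto Dublin Core; keep unknown fields aside."""
    mapped, unmapped = {}, {}
    for field, value in record.items():
        target = LOCAL_TO_DC.get(field)
        (mapped if target else unmapped)[target or field] = value
    return mapped, unmapped

record = {"object_name": "Votive tablet", "dating": "II c. AD", "inventory_no": "A-102"}
mapped, unmapped = to_dublin_core(record)
print(mapped)    # {'dc:title': 'Votive tablet', 'dc:date': 'II c. AD'}
print(unmapped)  # {'inventory_no': 'A-102'}
```

Fields left unmapped signal where the crosswalk must be extended, which is exactly the kind of per-collection structure specification mentioned above.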

2 Aggregators of Digital Content for Cultural Artefacts in the EU

Some of the conclusions of a study carried out in early 2010 across the EU countries [Piccininno, 2009], concerning aggregators of metadata for cultural objects and the technologies used to deliver content to Europeana, are: 

All aggregators share the crucial goal of providing integrated access to digital cultural resources via the Internet;

Initiatives to create aggregators have arisen both from projects related to Europeana and from the involvement of cultural institutions that hold digital content. In both cases similar approaches and technical solutions are applied. The period of their creation and development is 2002-2010, and the appearance of new aggregators has accelerated in the last two years.

Sixty percent of the aggregators are related to national portals; EU-funded initiatives support about twenty percent of them, and the smallest share (about seven percent) are regional aggregators. There is a trend of growing interest among developers and institutions in developing regional aggregators.

According to their application area there are two basic types of aggregators: 

Aggregators with a global purpose – they are the result of a larger initiative to improve the online accessibility and usability of digital resources of libraries, archives and museums, to promote research for

72

Access to Digital Cultural Heritage ...

development search functions and the retrieval of integrated information to accelerate the digitization and to improve the training process for application of modern technologies; 

Aggregators focused on specific areas – they provide the technological tools for documenting and searching specific subjects and topics (special cultural objects, musical instruments, educational problems, biodiversity, etc.).

The main characteristic of both types of aggregators is their search capability. Both types provide services with the same core features: they are portals for semantic search and navigation over various types of digital objects (text, images, video and audio files) and have options for storing and sharing content. Most of the aggregators (over sixty percent) are designed as a repository for data digitized by cultural institutions that lack the capacity to develop and maintain their own digital repository. Only one third of the aggregators that provide access to metadata and digital heritage content hold all four types of digital objects (audio, video, text and image). The main conclusions that can be outlined concern the importance of the contents of the collections as a substantial criterion for assessment, the development of new features and services, and the need for continuous enrichment of the content.

All aggregators intend to deliver content to Europeana. At the time of the survey, in early 2010, only 20% of the aggregators were ready to deliver content, and 60% planned to do so in the period 2010-2011. 15% of the content has reached, or will reach, Europeana through projects (TEL, EuropeanaLocal, ATHENA).

An aggregator, in the context of Europeana, is an organization that collects metadata from a group of content providers and transmits them to Europeana. Aggregators collect data from individual organizations and standardize file formats and the corresponding metadata in accordance with the Europeana procedures. The administrators of the aggregator are committed to supporting the efforts of the content providers through technological assistance, consulting and training.
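The workflow just described – collect records from providers, standardize the metadata, pass them on – can be sketched in a few lines. The record layout and source field names below are illustrative assumptions, not Europeana's actual submission format:

```python
def normalize_record(provider_record, provider_name):
    """Map one provider's raw record to a common metadata dict.

    The source field names ("title", "author") are illustrative
    assumptions about a provider's export, not a real schema.
    """
    return {
        "dc:title": provider_record.get("title", ""),
        "dc:creator": provider_record.get("author", ""),
        "europeana:provider": provider_name,
    }

def aggregate(providers):
    """Collect and standardize records from all content providers."""
    harvested = []
    for name, records in providers.items():
        harvested.extend(normalize_record(r, name) for r in records)
    return harvested
```

The point of the sketch is the division of labour: providers keep their own formats, and the aggregator owns the single normalization step before onward transmission.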

3 The Prototype REGATTA–Plovdiv

The task was to create a regional aggregator of digital artefacts based on the standards used by Europeana. The aggregator should be accessible both to regional cultural and historical institutions (for storing digitised artefacts) and to end users (for resource discovery). The creation of a regional aggregator is the first step towards presenting and promoting the rich heritage of Plovdiv and its region in the European digital space. The open structure of the aggregator enables the creation of data models for various types of digitized cultural objects. This allows different types of collections to be presented, including museum collections, archaeological sites, and immovable heritage from the Ancient, Mediaeval and National Enlightenment periods in Bulgaria. The chosen approach supports the idea of preserving the valuable national monuments in the European area of culture, keeping their identity and uniqueness.

The experimental "REGional Aggregator of heTerogeneous culTural Artefacts" (REGATTA) is the basic practical outcome. The first application of REGATTA covers the Plovdiv region and can be accessed at http://www.plovdiv-eu.com. It was designed following the standards of Europeana [EAH, 2010] and the characteristics specified in the so-called "passport of cultural values" [Reg.6, 2009]. Its purpose is to bring together objects from the collections of museums and other cultural and historical institutions.

Figure 4. Aggregation of metadata

The foundation of the digital library model is the structure of the metadata collections for specific items. Historically, the process of creating such structures builds on efforts and resources well known from library systems for cataloguing, search and retrieval of information. The structure of the present project [Hadjikolev et al, 2010/UBS] includes modules for aggregating metadata from the corresponding resources, storing them in a repository and providing services through their processing. An important component for the functionality and performance of this architecture is the metadata aggregator. Figure 4 shows the technology of metadata aggregation.

3.1 The Functional Scheme of REGATTA

The selection and preparation of structured metadata is the basis for designing the digital library. Metadata is a well-established tool for library information services, particularly for searching and finding information. Modern technologies for its use are applied, for example, by the National Science Digital Library (NSDL) 91. Modern museums tend to keep and serve two kinds of collections: physical and digital. The textual description of the subject of a digital image is regarded as metadata associated with the image. As previously stated, prerequisites for the project were the effective structuring of object collections and the maintenance of metadata according to relevant standards – in this case the standards used by Europeana. The functional structure of REGATTA follows the framework suggested in the Open Archival Information System (OAIS) [OAIS, 2002], which has also been adopted as the international standard ISO 14721:2003. This model has found successful implementations as a common framework, with concretizations in the application areas of the so-called GLAM institutions (Galleries, Libraries, Archives, Museums). Figure 5 shows the functional schema of REGATTA conforming to OAIS.

Figure 5. REGATTA Functional Entities

The aggregator of digital collections is a web-based technology enabling many different users not only to publish objects on the Internet but also to create their own data models related to these objects [EAH, 2010]. All objects with the same data model are combined in one collection.

91 http://nsdl.org


The main features of the technology are: 

Multi-user – users with different roles are organized in a hierarchy with defined access rights to services. Each user can log in and change the basic details of his/her account;



Multilingual – objects can be introduced in one or more languages. For the realization of this mechanism, localized files with basic terms are used, as well as a localized database;



Catalogue of objects – the standard options for entering, editing and deleting objects and for different types of searches (keywords, categories, dates) are provided;



Categories of objects – the standard options for input, edit and delete categories are provided;



Collections of objects – modelling of object data is enabled.

Treating collections as part of the model is very natural for GLAM institutions, and in the case of REGATTA it differs from Europeana's approach. Browsing within a specific collection is natural to visitors of cultural institutions and has a number of benefits in the digital space, because it contextualizes the objects and allows easy discovery of multiple objects which are thematically or chronologically related. A good example of this is Ireland's gateway to Irish digital collections and resources, DHO:Discovery 92. It supports the interdisciplinary and inter-institutional sharing of knowledge throughout the Humanities Serving Irish Society (HSIS) consortium and across digital research collections of Irish interest [Gourley and Viterbo, 2010]. A key requirement of the DHO website development was the need to support both thematic research collection project websites, led by the partners and focusing on a single collection of resources, and a generic cross-collection interface to discover and re-use resources across the whole repository. The goals behind REGATTA's creation were similar – to keep the specificity of each collection, in order to ensure the richest representation of each object in its natural environment, and to build a common frame which allows different kinds of objects to be semantically related on different levels.

92 http://discovery.dho.ie

3.2 Data Model in REGATTA

In the case of uniform collections, the different kinds of producers usually apply the same metadata model. Here, the collections present different kinds of objects, for instance: texts and images for movable artefacts; 3D representations for immovable sites; music or video for representing folklore and customs, etc. Each object of these different types is supplied with a corresponding metadata description. Because of this, REGATTA allows producers to add their own object characteristics. The incorporated technology that reflects the specifics of data input includes [Hadjikolev et al, 2010/UBS]:

Creation of a basic scheme for object description;



Hiding some of the fields from the forms during data input;



Adding additional specific fields/characteristics by the producers;



Defining names of the models.

The technology accommodates the objective lack of correspondence between the basic characteristics described in the standards and the concrete data available in the institutions. For example, the passport [Reg.6, 2009] defines 26 standard, well-defined fields. In the passports of the Ethnographic Museum in Plovdiv, the objects are described in 37 fields. They also contain fields that are not relevant to the main features. Moreover, not all objects cover the full set of required features. The creation of data models ensures additional classification of the objects and facilitates data input. A collection, in the sense of the created technology, is a set of objects with the same data model. Using the data modelling mechanism, the different institutions can create a collection of object descriptions based on already existing metadata schemas or can create their own. The main requirements of the system are:

Compatibility with the Europeana standards, i.e. the objects provided in the portal can be easily exported to the Europeana portal;



Compatibility with the characteristics required for the movable cultural property's passport in Bulgaria. Comparing the national standard with the Europeana standard reveals many common features. The correlation between the names of the characteristics of the two standards is described in the help information provided for each entry field. For example, "Name of the object" in the movable cultural property's passport has the corresponding label "dc:title" in Europeana. Characteristics that differ between the two standards are divided into two subpages of the object description form – "Metadata" and "Passport". "Metadata" holds the specific features of the Europeana standards, which are rarely used. "Passport" contains the specific characteristics of Bulgarian museum exhibits. The possibility to introduce specific passport data enables the institutions to use the system as a data repository as well.
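The field correlation described above can be kept as a simple lookup table. The "Name of the object" → dc:title pair comes from the text, and the other two pairs follow the passport correspondence shown later in the chapter; the "passport:" prefix for unmapped fields is our illustrative convention for keeping the system usable as a plain data repository:

```python
# Passport field -> Europeana element.
PASSPORT_TO_EUROPEANA = {
    "Name of the object": "dc:title",
    "Inventory number": "dc:identifier",
    "Material": "dcterms:medium",
}

def passport_to_metadata(passport):
    """Translate a passport record into Europeana-labelled metadata;
    fields without a mapping are preserved under a 'passport:' prefix."""
    meta = {}
    for field, value in passport.items():
        key = PASSPORT_TO_EUROPEANA.get(field, "passport:" + field)
        meta[key] = value
    return meta
```

Keeping the mapping as data rather than code is what lets each institution extend the basic scheme with its own characteristics.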

3.2.1 Functional Elements

Ingest: The process of incorporating digital objects in REGATTA takes place in three phases: a Preliminary phase, a Transfer phase, and a Validation phase (Figure 6).

The Preliminary phase includes identification of information about the objects that will be presented in the aggregator. The content provider creates a model of the collection or selects one of the models already defined in the REGATTA collections. Here, the assessment of resources (time, people, finance) is done as well.

In the Transfer phase, the data input is carried out. Each assistant can manage only the objects they have stored. The content providers can monitor the work of their assistants. Where well-structured digital information is available, the provider may prepare a tool for automatic transfer by defining the appropriate mappings.

The Validation phase includes verification of the entered data and correction of errors and omissions. Only after this phase do digital objects become part of REGATTA's public record and accessible to users. Most of these functions are already implemented; some are still being planned and developed.
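The Validation phase can be reduced to a required-field check that gates publication; the field names and the required set below are illustrative assumptions, not REGATTA's actual rules:

```python
REQUIRED_FIELDS = {"dc:title", "dc:identifier"}  # illustrative choice

def validate(obj):
    """Return a list of problems; an object may become public only
    when this list is empty (the Validation phase)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not obj.get(field):
            problems.append("missing or empty: " + field)
    return problems

def ingest(repository, obj):
    """Transfer + Validation: store the object, publish only if valid."""
    errors = validate(obj)
    obj["public"] = not errors
    repository.append(obj)
    return errors
```

Objects that fail validation stay in the repository but outside the public record, matching the phase ordering described above.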

Figure 6. REGATTA Ingest

Archival Storage: This module provides services and functions for the storage, maintenance and restoration of digital objects. This includes receiving digital objects from Ingest and adding them to the backup repository, performing routine and special checks for faults, and periodically backing up (duplicating) data for recovery after a system failure.

Data management: The module provides services and functions for the implementation, maintenance of and access to both the descriptive information, which identifies the owner of the archive, and the administrative data used to manage the archive. This includes functions for managing database records, updating the archive, performing efficient data retrieval operations, generating relevant reports, and more.


Administration: The administration module provides services and functions used for the overall operation of the aggregator: registration and maintenance of accounts, defining collections, data entry into the aggregator (digital objects and descriptive information), management of the providers' standard sites, and the search, retrieval and formatting of data resulting from user requests, and more.

Access: This module is concerned with access to the aggregator by the two main types of users that can be discerned (Figure 7): humans and robots. Humans use the REGATTA content through the base portal. Each content provider has a standard website containing their objects and additional information about them. The second type of users are web applications, which extract data from the aggregator and subject it to additional processing. The simpler ones are various kinds of search engines that use data directly from the basic model. The more sophisticated web applications use data and links supplied by the aggregator for incorporation into static or dynamic websites, 3D-tours, virtual excursions, etc. For these purposes REGATTA provides standard services for generating content. There is an option to define user styles, which REGATTA uses to return organized content that can be incorporated directly into the external web application without additional processing.

Figure 7. REGATTA Access

3.3 Technological Aspects

The fundamental technological aspect of the current aggregators is conformance with the OAI-PMH protocol (Open Archives Initiative Protocol for Metadata Harvesting), established and developed within the Open Archives Initiative community 93. It is used for the extraction and collection of metadata from the descriptions held by the information providers. Developments based on this protocol predominantly support the Dublin Core metadata standard 94.

93 http://www.openarchives.org
94 http://dublincore.org
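In practice, an OAI-PMH harvester issues plain HTTP requests with a verb parameter and parses the Dublin Core records in the XML response. A minimal sketch, where the endpoint URL is hypothetical and the response is an inline sample rather than a live request:

```python
import urllib.parse
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

def list_records_url(base_url):
    """Build an OAI-PMH ListRecords request for Dublin Core metadata."""
    query = urllib.parse.urlencode(
        {"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    return base_url + "?" + query

def harvest_titles(response_xml):
    """Extract dc:title values from an OAI-PMH response document."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter("{%s}title" % DC)]

# Abridged sample of a ListRecords response (structure simplified).
SAMPLE = """<OAI-PMH xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record><metadata><dc:title>Pitcher</dc:title></metadata></record>
</OAI-PMH>"""
```

A real harvester would also handle resumption tokens for paging through large repositories.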


The design of the systems associated with the Europeana initiative complies with the established rules for preparation of digital libraries, described in [DL, 2007]. The tasks performed by these projects are: 

Preparatory work with digital objects – the expert services of museum professionals are widely used;



Introduction and management of digital objects collections – images, text, video, audio and more;



Retrieval and management of metadata;



Analysis of objects and their descriptions, aimed at supporting their structuring and clustering;



Development of modules for access and (integral, semantic, context-based) search of digital objects;



User requirements: a graphical interface, review services and interactive presentation of digital content and objects, multilingual support;



Administration, supporting roles for the various categories.

3.3.1 User Aspects

The system is maintained in two languages – English and Bulgarian. Every object can be introduced simultaneously in both languages and is displayed in the corresponding language version of the portal. Unfortunately, most descriptions of museum exhibits in Bulgaria do not have an English translation.

Roles – the completed system supports several types of users. Each of them has a role, usually chosen at registration, that specifies what rights they have to use the services:

Unregistered user – only deals with content – static pages, pages of sites and institutions, carries out different types of searches;



Registered user/Institution – there are two types of registered users: generic users and institutions. They can create, edit and delete categories, collections, sites and websites (as a sub-domain). The difference is that institutions can activate their objects directly, while generic users can activate their objects only when authorized by the system administrator. This restriction is necessary for content security reasons;



Assistants to institutions – each institution may set up assistants for entering objects. After reviewing the additions, the institution activates the objects entered by the assistants. After this operation, the objects become inaccessible to the assistants who entered them. In some cases, such assistants could be students or employees temporarily engaged by the institution.

Administrator – controls major categories and collections of the portal as well as objects created by ordinary users.

The work on the development of the online cultural and historical websites continues with the creation of additional opportunities for both groups of users – museum workers and other visitors with diverse interests.
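The activation rules spelled out in the role descriptions above can be condensed into a single check. The role names are shortened from the text, and the function is an illustrative sketch rather than REGATTA's actual code:

```python
def can_activate(role, is_owner, admin_authorized=False):
    """Decide whether a user may activate (publish) an object.

    Institutions activate their own objects directly; generic
    registered users additionally need administrator authorization;
    assistants never activate -- their institution does it for them.
    """
    if not is_owner and role != "administrator":
        return False
    if role in ("administrator", "institution"):
        return True
    if role == "registered":
        return admin_authorized
    return False  # unregistered users and assistants
```

Centralizing the rule in one function mirrors the content-security rationale given above: every path to publication passes through the same check.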

4 Virtual Tours in REGATTA

Virtual tours provide a realistic way to create a full exhibition of architectural sites, museums and galleries on the Internet. Virtual tours are recommended as a good alternative way to visit, especially for people with special educational needs. They should include the necessary information so that users receive the same knowledge as in an actual site visit.

According to the method of presentation, there are two kinds of virtual tours: three-dimensional and "presentation" type. In a three-dimensional virtual tour (created with the help of photogrammetry), images are projected onto surfaces such as walls, building floors, etc. In the second type of virtual tour (the so-called "presentation"), panoramic images are used.

In terms of the ways and means of implementation, the following types of virtual tours can be distinguished: text-based, photo-based, video-based, panoramic and real-time virtual reality [Sobota et al, 2009]. Text virtual tours are a narrative description of the space and content of the displayed site. When a text-to-speech program is used, the tour becomes a virtual audio tour. In photo-based virtual tours, objects are displayed through a series of images and their textual descriptions. Depending on the software, photo-based virtual tours can be interactive (e.g. clicking an artefact can zoom, or trigger an audio or text description). For example, the front presentation on the site of the Van Gogh Museum 95 is constructed as a photo-based virtual tour. In video-based virtual tours, a video view of a typical visitor's walk is offered, supplemented with audio information (a speech guide, music and/or special effects) and/or text information. One example is the Video and Audio Tour of the British Museum, London 96, which can be seen on YouTube.

95 http://www.vangoghmuseum.nl/
96 http://www.youtube.com/watch?v=b71Oi677irI


The building of panoramic virtual tours is based on a series of consecutive overlapping images that are "stitched" together to create a continuous 360° view of the object, or on a set of special panoramic shots. These tours are interactive. An excellent example is shown on the website of the Vatican Museums 97. Virtual reality in real time (3D-virtual tours) is built on the basis of software modelling of three-dimensional objects, which is used to make users feel as if they were among the exhibits. Users can move through the virtual reality as if walking among real objects.

4.1 Panoramic Virtual Tours

Panoramic virtual tours allow panoramic images to be viewed in an interactive way. These technologies enable users to examine the panorama (pan, rotate, zoom and see additional information about a specific artefact) as though they were inside the real object. Since 1996 the team of Panoguide 98 has aimed to provide a free central resource of information and discussion about panoramic photography. A virtual tour is prepared by using multiple panoramic images that are linked through so-called hot-spots. A hot-spot is a part of the panoramic image that allows interaction and can trigger an action – moving to another panoramic image or displaying further information. The most common example is a hot-spot on a door, which transfers the user to the panorama of the room behind the door. For a more realistic representation, virtual tours are sometimes accompanied by sounds. Panoramas are created in various shapes and sizes depending on the chosen projection, which determines how the perspective of the panoramic images is transformed by software to provide a full or partial 3D-scene, or a realistic 2D-scene, on the computer screen. There are several types of projections used in the creation of panoramas:

Full ball formats – all of the surrounding space is displayed: 360° visible horizontally, 90° up and 90° down. Two techniques are used here:

- Ball – the panoramic image is projected onto the inside of a sphere, and the panorama height:width ratio is 1:2;

- Cubic – the spherical view is based on a cube, i.e. using six photo walls with a height:width ratio of 1:1;

Partial formats – a partial view is shown, with a horizontal visible maximum of 360° but a vertical maximum of 120°. The techniques used are:

- Cylindrical – the display area is the inside of the surrounding wall of a cylinder, or part of it (used for landscape panoramas);

- Straightforward – the horizontal and vertical visibility is restricted to 120° (used for architectural objects);

- Partial ball – implemented as a full ball, by cutting the highest or lowest point of the panorama.

97 http://mv.vatican.va/3_EN/pages/MV_Visite.html
98 http://www.panoguide.com
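The 1:2 ratio of the "ball" format follows directly from the equirectangular mapping: 360° of longitude span the image width while 180° of latitude span its height. A small sketch of mapping a viewing direction to pixel coordinates in such a panorama (the function name and axis conventions are our illustrative assumptions):

```python
import math

def dir_to_equirect(x, y, z, width, height):
    """Map a view direction to pixel coordinates in an equirectangular
    ("ball") panorama. Longitude covers 360 degrees across the width,
    latitude 180 degrees down the height -- hence the 1:2 ratio."""
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)     # -pi .. pi, around the horizon
    lat = math.asin(y / norm)  # -pi/2 .. pi/2, up/down
    u = (lon / (2 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v
```

A viewer applies the inverse of this mapping per screen pixel to "un-wrap" the flat image back into a 3D scene.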

For recording digital 360° panoramic images, the following technologies are used [Maas and Schneider, 2004]:

- "Stitching" of images taken with an ordinary camera;

- A simple camera with a wide-angle lens (180°);

- A simple camera with a hyperbolic mirror;

- A camera with a rotating linear-array sensor;

- A camera with a multisensor system (four or more sensors, equipped with wide-angle lenses).

Also, there is a wide variety of software for making panoramas and panoramic virtual tours (PTViewer 99, Spi-V 100, QuickTime 101, 0-360 UnWrapper 102, Panoweaver 103, etc.), as well as file formats for their storage (QTVR, JPG, IVR, etc.).

99 http://www.fsoft.it/panorama/ptviewer.htm
100 http://fieldofview.com/projects/spv
101 http://www.apple.com/quicktime/
102 http://www.0-360.com/software.asp
103 http://www.easypano.com/

4.2 3D-Virtual Tours

Virtual tours can offer a way to travel back in time by reproducing objects that no longer exist, or an earlier view of existing ones. These tours can also be used for online visits to existing sites, offering a simulation of an actual visit. For this purpose, 3D computer graphics are used to create mathematical 3D-models of the objects in the scope of the virtual tour. The resulting 3D-model is "decorated" to mimic the real object (texturing elements of the model; adding windows, doors, curtains, furniture, artefacts, etc.). The specialized software provides the ability to navigate and view the model and its individual parts. The resulting virtual tour allows movement and exploration of objects in real time, without the "jumps" in space caused by the hot-spots of panoramic virtual tours. There are four popular ways to create a 3D-model:

Polygonal modelling – the shape of the model is drawn with the polygon tool, then subdivided and refined until the desired 3D-shape is acquired, and finally smoothed to make the object look realistic;



NURBS (Non-Uniform Rational B-Spline) modelling – curves are described mathematically by a set of equations with control points that change the shape of the curve;



Modelling with splines;



Modelling with primitives – use of geometric primitives such as spheres, cylinders, cones and cubes, which help to build a more complex model.
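The control-point idea behind NURBS modelling can be illustrated by its simplest special case, a Bézier curve (a NURBS curve with uniform weights and a single span), evaluated with de Casteljau's repeated linear interpolation:

```python
def de_casteljau(points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] by repeatedly
    interpolating between consecutive control points until one remains.
    Moving any control point reshapes the whole curve smoothly."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        pts = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]
```

Full NURBS evaluation adds per-point weights and a knot vector, but the interpolation principle is the same.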

There are many software programs for 3D modelling: 3DS Max and 3DS Max Design 104, Maya 105, Blender 106, DAZ Studio 107, Cinema 4D 108, Houdini 109, Poser 110, ZBrush 111, Google SketchUp 112, etc.

5 Presentation of Plovdiv Ethnographic Museum in REGATTA

The pilot implementation of presenting movable and immovable artefacts in REGATTA was made in collaboration with the Plovdiv Ethnographic Museum.

104 http://usa.autodesk.com/3ds-max/
105 http://usa.autodesk.com/maya/
106 http://www.blender.org/
107 http://www.daz3d.com/
108 http://www.maxon.net/
109 http://www.sidefx.com/
110 http://poser.smithmicro.com/
111 http://www.pixologic.com/
112 http://sketchup.google.com/

5.1 Movable Artefacts

Concerning technical interoperability [IDABC, 2004], the processes of migration from other kinds of artefact presentation have to be decided for each case separately. But the first question is: "Is there compatibility between the fields of Europeana and the fields in the existing passports of the objects?". The exhibits of the Regional Ethnographic Museum – Plovdiv are allocated to funds/departments and collections defined in [Reg.6, 2009]. The departments in the museum are "Agriculture", "Crafts", "Woven


Fabrics and Apparel", "Furniture and Interior", "Ritual Musical Instruments and Props", and "Photo Library and Artworks". The Crafts department contains collections such as "Jewels", "Wrought Iron", "Cold Steel" and others that the user himself can create. The Ethnographic Museum in Plovdiv maintains two types of documents in electronic format: (1) passports of the objects in Word format and (2) inventory books in Excel format. They contain the so-called "scientific passports", made under state requirements [SR, 2009]; these reflect the metadata that are mandatory for Bulgarian museum institutions and at the same time conform to the MARC standard. Table 2 shows the compatibility between the existing data for digital objects of the Plovdiv Ethnographic Museum and the metadata for Europeana. The sign 'Y' indicates that the metadata are available (in one of the two documents); 'N' means that the metadata are not available; 'A' indicates that the metadata can be created automatically during the process of storing objects into the corresponding digital repository.

Table 2. Relation between metadata (Europeana) and data concerning digital objects of the Plovdiv Ethnographic Museum

Strongly recommended: dc:title (Y), dcterms:alternative (Y), dc:creator (Y), dc:contributor (Y), dc:date (Y), dcterms:created (Y), dcterms:issued (A), dc:coverage (Y), dcterms:spatial (Y), dcterms:temporal (N), dc:description (Y), dcterms:isPartOf (Y), dc:language (A), dc:publisher (Y), dc:source (Y), dc:subject (Y), dc:type (Y).

Recommended: dc:format (A), dcterms:extent (Y), dcterms:medium (Y), dc:identifier (Y), dc:rights (A), dcterms:provenance (Y), dc:relation (Y), dcterms:conformsTo (A), dcterms:hasFormat (A), dcterms:isFormatOf (N), dcterms:hasVersion (N), dcterms:isVersionOf (N), dcterms:hasPart (Y), dcterms:isReferencedBy (N), dcterms:references (N), dcterms:isReplacedBy (N), dcterms:replaces (N), dcterms:isRequiredBy (N), dcterms:requires (N), dcterms:tableOfContents (Y).

Additional Europeana elements: europeana:country (A), europeana:hasObject (A), europeana:isShownAt (A), europeana:isShownBy (A), europeana:language (A), europeana:object (A), europeana:provider (A), europeana:type (A), europeana:unstored (Y), europeana:uri (A), europeana:usertag (A), europeana:year (A).
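Read as data, the 'A' flags of Table 2 tell the repository which elements it must generate itself on ingest. A sketch over an abridged fragment of the table:

```python
# Abridged fragment of Table 2: Y = available in the documents,
# N = not available, A = can be generated automatically.
AVAILABILITY = {
    "dc:title": "Y",
    "dcterms:issued": "A",
    "dcterms:temporal": "N",
    "europeana:provider": "A",
}

def fields_to_generate(availability):
    """Elements the repository must fill in automatically on ingest."""
    return sorted(k for k, v in availability.items() if v == "A")
```

Driving the export routine from such a table keeps the museum-specific readiness analysis separate from the export code itself.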


The work with these archives requires an analysis of readiness for the creation of metadata in accordance with the Europeana standards [EAH, 2010]. The research on the "Crafts" fund of the Ethnographic Museum helped us to specify the basic classes and subclasses of elements in the corresponding groups required by the Europeana metadata scheme:

"Identification signs" – contains descriptive criteria (typical for traditional metadata descriptions as well) such as title, archive number, period, place of residence, type, annotation, etc. of the specific object.

"Technical information" – includes information related to the

As an illustration we present part of the xml description in accordance with the requirements of Europeana: Pitcher Earthen Jar Unknown master from Troyan Darin Kambov 1860–1870 Revival 2007

Then some "additional elements" follow – the examination of the scientific passports of the exhibits reveals a satisfactory amount of information for their retrieval – a description of the original object, physical characteristics, data related to conservation, digitalisation, etc. Several questions are examined concerning the automatic transfer of data objects from the collections of the Ethnographic Museum: 

is there a correspondence between the fields of Europeana and the scientific fields in the passport of the object;



can data be automatically extracted from the inventory book (Excel file) and transferred into the database;



can data be automatically extracted from the passport (Word-file) and transferred into the database;



can we analyze the information from the inventory book and the passport, concerning one particular object, in order to optimize performance.

Although the aggregator was created quite recently, the automatic transfer of data concerning the objects from the collection of the Ethnographic Museum was successful [Hadjikolev et al, 2010/MathTech].


Using simple transformations (e.g. word-processing functions or MS Access), the Excel files were transferred into a database table (in this case – MySQL). After the initial transfer into a staging table, the data were distributed into the object's own tables. The transformation of Word documents into a format suitable for automated processing has proved a more difficult task. There are technological solutions, but their use for a particular file format is worthwhile only if a common solution can be found that also covers other similar tasks. Table 3 shows the correspondence between the fields of the scientific passport and the REGATTA metadata, using the example of one concrete exhibit of the museum (shown in Figure 8). Currently, the catalogue includes more than 4500 objects from the "Crafts" fund of the Ethnographic Museum. Table 3.

Scientific passport N4501 of the Plovdiv Regional Ethnographic Museum and correspondence to aggregator metadata

Passport fields | Values | Aggregator metadata
Section | Crafts | created by the administrator
Collection | Wrought Iron | created by the administrator
Name; Folk name | Candlestick, wall, Double-arm | dc:title; dcterms:alternative
Inventory number | 4501 | dc:identifier
Dating | 2005 | dc:date
Number of exemplars | 1 |
Material | Wrought iron | dcterms:medium
Sizes | H=78cm; W=38cm | dcterms:extent
Weight | | dcterms:extent
Carats | | dcterms:extent
Object is one with inv.numbers | | dc:relation
Producing place | Plovdiv |
Keeping place | Craft Fund of REM-Plovdiv |
Author of the original | Georgi Manolov | dc:creator
Technique | Craft mastering |
Technology | |
History of the object | | dcterms:provenance
The object is from the group of | |
Condition | Very good |
Conservation and restorations | Professional |
Object description | Candlestick, wrought iron, wall, double-arm; examination work of Georgi Manolov to obtain a master title from the Association of Masters of Folk Crafts | dc:description
Literature | | dcterms:references
Surrogates | no |
Object moving | |
Remarks | |
Passport maker | Sonya Semerdjieva |
Appointment | Chief organizer |

Automatically, the system filled in for this object:

europeana:isShownAt | http://www.plovdiv-eu.com/object.php?id=3
europeana:isShownBy | http://www.plovdiv-eu.com/images/user_objects/3/1374422587129031399.jpg
europeana:type | IMAGE
europeana:provider | In progress

Figure 8. Exhibit N 4501 of the Plovdiv Regional Ethnographic Museum
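The two-stage transfer described above (inventory spreadsheet, staging table, then metadata elements) can be sketched as follows. This is an illustrative reconstruction, not the actual REGATTA code: the column names and the passport-to-Dublin-Core mapping are assumptions based on Table 3, and an in-memory SQLite database stands in for MySQL.

```python
import csv
import io
import sqlite3

# Illustrative mapping from scientific-passport fields to Dublin Core
# terms, following Table 3 (assumed names, not the actual REGATTA schema).
PASSPORT_TO_DC = {
    "Name": "dc:title",
    "Inventory number": "dc:identifier",
    "Dating": "dc:date",
    "Material": "dcterms:medium",
    "Author of the original": "dc:creator",
}

# A stand-in for the exported inventory book (Excel saved as CSV).
INVENTORY_CSV = """Inventory number,Name,Dating,Material,Author of the original
4501,"Candlestick, wall",2005,Wrought iron,Georgi Manolov
"""

def load_staging(conn, csv_text):
    """Load raw rows into a secondary (staging) table, as in the chapter."""
    conn.execute("CREATE TABLE staging (inv_no TEXT, name TEXT, dating TEXT,"
                 " material TEXT, author TEXT)")
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.executemany(
        "INSERT INTO staging VALUES (?, ?, ?, ?, ?)",
        [(r["Inventory number"], r["Name"], r["Dating"],
          r["Material"], r["Author of the original"]) for r in rows])
    return rows

def to_dc_record(row):
    """Distribute one staged row into Dublin Core metadata elements."""
    return {PASSPORT_TO_DC[field]: value for field, value in row.items()
            if field in PASSPORT_TO_DC and value}

conn = sqlite3.connect(":memory:")
rows = load_staging(conn, INVENTORY_CSV)
record = to_dc_record(rows[0])
print(record["dc:identifier"], record["dc:creator"])   # 4501 Georgi Manolov
```

The staging table mirrors the "secondary table" step from the text; distributing the data into the final object tables would add per-field INSERT statements against the aggregator schema.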

5.2 Virtual Tours of the Plovdiv Ethnographic Museum

The information on immovable sites in REGATTA is still in the process of unification and enrichment. Currently, the building of proper presentations of such sites – including gathering historical materials, designing, and preparing scenarios – goes in parallel with securing the rights to represent the immovable sites in digital form from the different institutions concerned, such as the Ministry of Culture, the Holy Synod, municipalities, etc. Successful examples of integrating a learning process with the opportunity to collaboratively develop a shared syllabus and associated teaching and learning resources for humanities visualization, such as the educational and research project in Second Life presented in [Denard et al, 2010], give us confidence to start practical coursework in the informatics specialties of Plovdiv universities on 3D and VR representation of these sites. The first steps in presenting immovable sites have already been made [Stoyanov et al, 2011]: 3D and VR representations of the Plovdiv Regional Ethnographic Museum have been created, and connections have been established between the VR representation and the already digitized craft collection. The process of incorporating these resources into REGATTA is now ongoing. In parallel, the collections that represent immovable sites are being expanded with information conforming to [CARARE, 2010] in order to enable easy incorporation into Europeana. Below, two kinds of virtual tours of the Plovdiv Ethnographic Museum are presented.

5.2.1 Panoramic Virtual Tour

An experiment was carried out to create a spherical panoramic virtual tour of the Plovdiv Ethnographic Museum. A simple camera with a hyperbolic mirror was used to capture the panoramic view.

Figure 9. 360° panoramic photo image of the museum

Chapter 2: REGATTA – Regional Aggregator of Heterogeneous Cultural Artefacts


Figure 10. "Extended" 360° image of the museum

The resulting picture is shown in Figure 9. The 0-360 UnWrapper 113 software is used to generate the spherical panorama (Figure 10). The panoramic virtual tour itself is built with the Tourweaver 114 software. The resulting ".swf" file format does not require special software to be displayed; the virtual tour can be viewed with the widely used Adobe Flash Player.
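At its core, the unwrapping step performed by such tools is a polar-to-rectangular remap: each column of the output panorama corresponds to a viewing angle, and each row to a radius within the ring-shaped mirror image. A minimal sketch of that coordinate mapping, under an assumed idealized geometry (not the tool's actual algorithm), looks like this:

```python
import math

def unwrap_coords(u, v, out_w, out_h, cx, cy, r_inner, r_outer):
    """Map an output-panorama pixel (u, v) back to a source pixel in the
    ring-shaped mirror image centred at (cx, cy).

    u in [0, out_w) selects the viewing angle (full 360 degrees);
    v in [0, out_h) selects the radius between r_inner and r_outer.
    """
    theta = 2 * math.pi * u / out_w
    r = r_inner + (r_outer - r_inner) * v / out_h
    return cx + r * math.cos(theta), cy + r * math.sin(theta)

# Example: a 1024x256 panorama from a ring image centred at (500, 500).
x, y = unwrap_coords(0, 0, 1024, 256, cx=500, cy=500, r_inner=100, r_outer=400)
print(round(x), round(y))   # angle 0 at the inner radius -> (600, 500)
```

A full unwrapper would sample the source image at these coordinates (with interpolation) for every output pixel; hyperbolic-mirror optics also make the radius-to-elevation relation non-linear, which this sketch deliberately ignores.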

Figure 11. View from the panoramic virtual tour of the museum

113 www.0-360.com
114 http://tourweaver.en.softonic.com/


For easier orientation of the user, maps are added as an accompanying element of the virtual tour; they mark the possible entry-point views and the currently active view point within the museum space (Figure 11). The entry points to the views are hot-spots. The active view point is marked by a radar – a hot-spot with a very specific action: it shows on the map the position and direction of the currently displayed view. The maps of the garden and of every floor of the museum were created with the Google SketchUp 115 software.

Figure 12. Presentation of an artefact in the frame of a panoramic tour

Besides the standard navigation elements (changing direction, zooming in and out, skipping to a particular place), many hot-spots are placed for quick access to other views (rooms) or to the more important artefacts of the museum. The artefacts are presented by an image and text information (metadata) such as name, origin, date and time of creation, author, etc. (Figure 12). The actions of the hot-spots are implemented in JavaScript.

5.2.2 3D Virtual Tour

This subsection presents the development of a 3D model of the Plovdiv Ethnographic Museum and its use for the realization of 3D virtual tours. Modelling primitives are used to build the museum model: the complex objects are decomposed into primitive forms. The construction is done piece by piece, starting with the walls and ending with the roof and the surroundings. Special attention is paid to modelling some artistic details of the building. A 3D model of the museum has been built (Figure 13), both from the outside and from the inside. The modelling is based on the original dimensions of the museum.

115 http://sketchup.google.com/


The Google SketchUp software was used for the 3D modelling of the museum. The program is specifically designed for architects and civil engineers, and is easy to learn and use.

Figure 13. Monochromatic 3D model of the museum

Figure 14. Textured 3D model of the museum

The monochromatic 3D model is textured with real images from the museum in order to obtain a photorealistic model of the building (Figure 14). The photographs of the museum used for the textures need to be edited: establishing the correct perspective of the model (compensating for the shooting angle), removing unnecessary objects from the picture, such as trees and people, and adjusting the image resolution. For these purposes, the images are pre-processed with Adobe Photoshop 116. The textured 3D models are imported into the Unity3D 117 software to build the virtual tour. This software allows composing the scene, making animations, adjusting the lighting, and moving inside the museum. The movement is performed by a human figure walking through the building, guided by the user. Hot-spots pointing to the more significant artefacts of the museum can also be placed here; the artefacts are again enhanced with image and text information (metadata) (Figure 15). The actions of the hot-spots are implemented through JavaScript.

Figure 15. Display of artefact metadata in the 3D virtual tour

116 http://www.adobe.com/
117 http://unity3d.com/

6 The Next Step – Reinforcing Data Management with Data Mining Tools

In the frame of the project which supports the realization of the REGATTA aggregator, several tools were developed that use different kinds of data mining techniques for automated metadata extraction and categorization.

An approach for indirect spatial data extraction by learning restricted finite-state automata is presented in [Blagoev et al, 2009]. It uses heuristics to generalize an initial finite-state automaton that recognizes the positive examples without extracting any non-positive examples from the training data set. The resulting system, InDES, was tested on the extraction of spatial metadata from websites and shows promising results. This gives us confidence that such an approach can also be used for metadata extraction from object descriptions, and can thus be applied in the process of migration from older representations of the objects in cases where the descriptions are in non-structured form.

Association rule mining (ARM) is a popular and well-researched method for discovering interesting rules in large collections of data. It has a wide range of applications, such as market basket analysis, gene expression data analysis, building statistical thesauri from text databases, finding web access patterns in web log files, discovering associated images in huge image databases, etc. An approach to association rule mining which uses the possibilities of multi-dimensional numbered information spaces as storage structures is presented in [Mitov et al, 2011]. The ArmSquare algorithm is implemented in the data mining environment PaGaNe [Mitov, 2011]. The extraction of frequent item-sets can be used to reinforce connections between metadata elements within the created ontology.

Class-association rule (CAR) algorithms are based on similar techniques, but applied to categorization. Compared to other classification methods, associative classifiers hold some interesting advantages [Zaiane and Antonie, 2005]. Firstly, high-dimensional training sets can be handled with ease and no assumptions are made about the dependencies between attributes. Secondly, the classification is often very fast. Thirdly, the classification model is a set of rules which can be edited and interpreted by human beings. The algorithm PGN, a kind of CAR algorithm, is also implemented in PaGaNe. It has been applied to the analysis of semantic attributes extracted from art images using content-based image retrieval; the rules extracted by PGN form the semantic profiles of the examined movements [Ivanova, 2011].
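To make the idea behind frequent item-set mining concrete (the details of ArmSquare and its numbered information spaces are in the cited papers), a naive Apriori-style sketch over invented metadata "transactions" might look like this:

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Naive Apriori-style enumeration: count every candidate subset and
    keep those occurring in at least min_support transactions."""
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for cand in combinations(items, size):
            support = sum(1 for t in transactions if set(cand) <= t)
            if support >= min_support:
                frequent[cand] = support
                found = True
        if not found:        # no frequent set of this size -> stop growing
            break
    return frequent

# Toy transactions: which metadata elements co-occur per catalogued object.
records = [
    {"dc:creator", "dc:date", "dcterms:medium"},
    {"dc:creator", "dc:date"},
    {"dc:creator", "dcterms:medium"},
]
result = frequent_itemsets(records, min_support=2)
print(result[("dc:creator", "dc:date")])   # 2
```

Item-sets that pass the support threshold, such as {dc:creator, dc:date} above, are exactly the co-occurrence patterns that could back suggested links between metadata elements; a production miner would of course prune candidates rather than enumerate all subsets.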
Similarly, within the frame of the data management and access functions of the aggregator, the classifier PGN can be used to reinforce information discovery. The integration of data in the virtual space also has to be examined in the context of the global Semantic Web. As mentioned in [Berners-Lee, 2009], "the Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the Web of Data". This Web of Data enables new types of applications. RDF links enable one to navigate from a data item within one data source to related data items within other sources using a Semantic Web browser. Such links can also be followed by the crawlers of Semantic Web search engines, which may provide sophisticated search and query capabilities over the crawled data [Bizer et al, 2009]. In order to support such integration, the W3C created the Library Linked Data incubator group 118, whose mission is to help increase the global interoperability of library data on the Web by bringing together people involved in Semantic Web activities – focusing on Linked Data – in the library community and beyond, building on existing initiatives, and identifying collaboration tracks for the future.

7 Conclusion

Work on providing online access to cultural and historical objects continues through the creation of new opportunities for both groups of users – the museum curators, and the visitors with diverse interests. In modern museology the conception is already emerging that the value of an artefact depends not only on the information contained in it, but also on facilitated distributed access to it. Digital libraries are the appropriate solution for turning museums into information centres offering a wide range of information services for museum professionals as well as for visitors and users of museum information. The presented technology supports the dynamic creation of categories and standardized collections of objects. The opportunities for modification allow associating additional collections in the system. It can be used in the construction of catalogues of various objects – cultural, historical, natural, personal data, etc.

118 http://www.w3.org/2005/Incubator/lld/

Bibliography

[Berners-Lee, 2009] Berners-Lee, T.: Linked Data. http://www.w3.org/DesignIssues/LinkedData.html
[Bizer et al, 2009] Bizer, C., Heath, T., Berners-Lee, T.: Linked data – the story so far. Int. Journal SWIS, 5(3), 2009, pp. 1-22.
[Blagoev et al, 2009] Blagoev, D., Totkov, G., Staneva, M., Ivanova, K., Markov, K., Stanchev, P.: Indirect spatial data extraction from web documents. IBS ISC 14: New Trends in Intelligent Technologies, Sofia, 2009, pp. 89-100.
[CARARE, 2010] White Paper on CARARE Technical Approach. http://www.carare.eu/fre/Media/Files/White-paper-on-CARARE-technical-approach
[Chen et al, 2005] Chen, C.-C., Wactlar, H., Wang, J. Z., Kiernan, K.: Digital imagery for significant cultural and historical materials – An emerging research field bridging people, culture, and technologies. Int. Journal Digital Libraries, 5(4), 2005, pp. 275-286.
[Cousins, 2011] Cousins, J.: The cultural heritage of Europe: building the value proposition for Europeana. The Journal for the Serials Community, 24(1), 2011, pp. 69-78.
[Denard et al, 2010] Denard, H., Salvatori, E., Simi, M.: An integrated approach to digital cultural heritage. 3rd Int. Conf. dedicated on Digital Heritage EuroMed2010, Limassol, Cyprus, 2010: Short Papers, Archaeolingua, Budapest, pp. 73-78.
[DL, 2007] Digital Libraries: Research and Development. LNCS, Vol. 4877, 2007, pp. 22-35.
[EAH, 2010] Europeana Aggregators' Handbook. http://version1.europeana.eu/c/document_library/get_file?uuid=94bcddbf-3625-4e6d-8135-c7375d6bbc62&groupId=10602
[EM-TR, 2007] Bulgaria – Collective Memory and National Identity. Tech. Report, Ethnographical Museum with Institute – BAS, 2007.
[Gourley and Viterbo, 2010] Gourley, D., Viterbo, P.: A sustainable repository infrastructure for Digital Humanities: the DHO experience. 3rd Int. Conf. dedicated on Digital Heritage EuroMed2010, Limassol, Cyprus, 2010: Project Papers, LNCS 6436, pp. 473-481.
[Hadjikolev et al, 2010/MathTech] Hadjikolev, E., Vragov, G., Totkov, G.: Aggregator for standardized collection of metadata for cultural and historical objects. National Conference with International Participation MathTech, Shumen, Bulgaria, 2010, UI "Ep. K. Preslavski", 2011, vol. 1, pp. 141-149 (in Bulgarian).
[Hadjikolev et al, 2010/UBS] Hadjikolev, E., Totkov, G., Vragov, G.: Digital technologies for presenting museum collections. Proc. of Annual Scientific Conference of Union of the Bulgarian Scientists, Plovdiv, Bulgaria, 2010, pp. 192-198 (in Bulgarian).
[IDABC, 2004] European Interoperability Framework for pan-European eGovernment Services. http://ec.europa.eu/idabc/en/document/3761/5845.html
[Ivanova, 2011] Ivanova, K.: A Novel Method for Content-Based Image Retrieval in Art Image Collections Utilizing Color Semantics. PhD Thesis, Hasselt University, Belgium, 2011.
[Krasteva, 2007] Krasteva, St.: The Museology – Meeting between Alpha and Omega of Self-knowledge. Sofia, 2007 (in Bulgarian).
[Maas and Schneider, 2004] Maas, H.-G., Schneider, D.: Photogrammetric processing 360 panoramic images. GIM International 7(4), 2004, pp. 68-71.
[Mitov et al, 2011] Mitov, I., Ivanova, K., Depaire, B., Vanhoof, K.: ArmSquare: an association rule miner based on multidimensional numbered information spaces. Proc. of First Int. Conf. IMMM, Barcelona, Spain, 2011, pp. 143-148.
[Mitov, 2011] Mitov, I.: Class Association Rule Mining Using Multi-Dimensional Numbered Information Spaces. PhD Thesis, Hasselt University, Belgium, 2011.
[Nedkov, 1998] Nedkov, S.: Museums and Museology. Sofia, 1998.
[OAIS, 2002] Reference Model for an Open Archival Information System (OAIS): Blue book. Consultative Committee for Space Data Systems, January 2002. http://public.ccsds.org/publications/archive/650x0b1.PDF
[Piccininno, 2009] Piccininno, M.: Analysis of the Europeana and Athena survey for the aggregators. www.athenaeurope.org/getFile.php?id=609
[Reg.6, 2009] Regulation Number 6 from 11.12.2009 for Creating and Managing of Museum Funds. Ministry of Culture of the Republic of Bulgaria – State Gazette 2/2010 (in Bulgarian).
[Sobota et al, 2009] Sobota, B., Korecko, S., Perhac, J.: 3D modeling and visualization of historic buildings as cultural heritage preservation. 10th International Conference on Informatics, Herl'any, Slovakia, 2009.
[Somova et al, 2010] Somova, E., Vragov, G., Totkov, G.: Toward regional aggregator of digitalized cultural-historical objects. Proc. of National Conf. Education in Information Society, Plovdiv, Bulgaria, 2010, pp. 154-161 (in Bulgarian).
[Stoyanov et al, 2011] Stoyanov, K., Yordanova, B., Somova, E., Totkov, G.: Two experimental virtual tours in the Plovdiv Ethnographic Museum. Proc. of National Conf. Education in Information Society, Plovdiv, Bulgaria, 2011, pp. 35-43 (in Bulgarian).
[Zaiane and Antonie, 2005] Zaiane, O., Antonie, M.-L.: On pruning and tuning rules for associative classifiers. Knowledge-Based Intelligent Information and Engineering Systems, LNCS Vol. 3683, 2005, pp. 966-973.

Chapter 3: Automated Metadata Extraction from Art Images

Krassimira Ivanova, Evgenia Velikova, Peter Stanchev, Iliya Mitov

1 Introduction

Each encounter with an artwork builds a bridge between cultures and times. The unique specificity of visual works of art is that they are created through a cognitive process. It can therefore be instructive not only to understand the way we look at an artistic image, but also to understand how a human being creates and structures an artwork. As was mentioned in [Chen et al, 2005], "research on significant cultural and historical materials is important not only for preserving them but for preserving an interest in and respect for them". Many art masterpieces have been created over the centuries, and they are scattered throughout the world. Direct contact with these treasures is, for most people, impeded by many obstacles. On the other hand, understanding an art masterpiece is a process of learning not only about the artefact itself but also about the environment of its creation. Since its first edition, published in 1962, Janson's History of Art [Janson, 2004] has been one of the most valuable sources of information spanning the spectrum of Western art history from the Stone Age to the 20th century. It became a major introduction to art for children and a reference tool for adults trying to remember the identity of some embarrassingly familiar image. The colourful design and vast range of extraordinarily high-quality illustrations do not merely present "dry" information, but also evoke deep emotional fulfilment through contact with the masterpieces. Nowadays, however, online search engines have increased the appetite of web surfers for context and information, and there are numerous digital collections offering easy access to digital items. They present the colourfulness of art history along with relevant metadata, and provide additional information ranging from purely technical details of how the artefacts were created to deeply personal details from the lives of the creators, which help the observers to understand the original message of the masterpieces. For these purposes, the development of image retrieval techniques has become very important for creating appropriate and easy-to-use search engines.

The field of image retrieval has to overcome a major challenge: it needs to accommodate the obvious difference between the human vision system, which has evolved genetically over millennia, and digital technologies, which are limited to pixel capture and analysis. We face the hard task of developing appropriate machine algorithms to analyze the image. These algorithms are based on completely different logic and "instruments" compared to the human process of perception, but should give similar results in interpreting the input image. In the context of this chapter the challenges are even bigger, because we focus our efforts on the analysis of the aesthetic and semantic content of art images. Naturally, the interpretation of what we – humans – see is hard to characterize, and even harder to teach to a machine. Yet, over the past decade, considerable progress has been made in making computers learn to understand, index and annotate pictures representing a wide range of concepts. In spite of the fact that computers still do not wield the human vocabulary and semantics, their methods and abilities for analysing information already make them irreplaceable assistants in many fields of the study of art. Computers can analyze certain aspects of perspective, lighting, colour, and the subtleties of the shapes of brush strokes better than even a trained art scholar, artist, or expert.
As David Stork has mentioned [Stork, 2008], "the source of the power of these computer methods arises from the following:
- the computer methods can rely on visual features that are hard to determine by eye, for instance subtle relationships among the structure of a brush stroke at different scales or colours;
- the computer methods can abstract information from lots of visual evidence;
- computer methods are objective, which need not mean they are superior to subjective methods, but promise to extend the language to include terms that are not highly ambiguous."


2 Semantic Web

Over the years, the ability to process information and the ways of exchanging data have been expanding in parallel. The development of computing and communication capacities makes it possible to place the user at the central point of the process of information exchange and to let him use all the power of intellectualized tools for satisfying his wishes. Amit Agarwal [Agarwal, 2009] provides a simple and clear comparison between Web 1.0, Web 2.0 and Web 3.0 (Table 4).

Table 4. "Comparison table" between Web 1.0, Web 2.0, Web 3.0 (excerpt from [Agarwal, 2009])

Web 1.0 "the mostly read-only web" | Web 2.0 "the wildly read-write web" | Web 3.0 "the portable personal web"
Focused on companies | Focused on communities | Focused on the individual
Home pages | Blogs | Lifestream
Owning content | Sharing content | Consolidating dynamic content
Britannica Online | Wikipedia | The semantic web
Directories ("taxonomy") | Tagging ("folksonomy") | User behavior ("me-onomy")
Netscape | Google, Flickr, YouTube | iGoogle, NetVibes

Starting from read-only content and static HTML websites in Web 1.0, where people were only passive receivers of information, Web 2.0 became a participation platform, which allows users not only to consume but also to contribute information through blogs or sites like Flickr 119, YouTube 120, etc. Such sites may have an "architecture of participation" that encourages users to add value to the application as they use it. According to David Best [Best, 2006], the characteristics of Web 2.0 are: rich user experience, user participation, dynamic content, metadata, web standards and scalability. Further characteristics, such as openness, freedom and collective intelligence by way of user participation, can also be considered essential attributes of Web 2.0. The pros and cons of using this paradigm, as well as others, are many; for a good range of initiatives of social media outreach in cultural heritage institutions see [WIDWISAWN, 2008]. Let us also mention the alternatives discussed by Eric Raymond in [Raymond, 1999], concerning two fundamentally different development styles: the "cathedral" model of most of the commercial world versus the "bazaar" model of the Linux open-source world, where the advantages of such social self-build systems are shown.

119 http://www.flickr.com/
120 http://www.youtube.com/

Here the situation is similar. For instance, while the Encyclopaedia Britannica Online 121 relies upon experts to create articles and releases them periodically in publications, Wikipedia relies on anonymous users to constantly and quickly contribute information. And, as in many examples, the happy medium is the right position. Many art repositories and portals are used for educational purposes; consequently, control over the main presented text is very important. On the other hand, they are natural places for users to share their own opinions and to have a space for communication. The interest of users, measured in the number of hits and traces of their activity, grows when they are able to add their own content or to comment on existing commentaries [Ivanova et al, 2010/Euromed]. In the area of art images, social networking sites can help extend the number of users consulting an image; for example, the Library of Congress explained at the American Library Association annual conference in 2010 that images accessible both on the Library of Congress website and on flickr.com attracted a higher number of visitors on Flickr. The user-generated comments on Flickr also helped to improve the metadata records the Library maintains.

121 http://www.britannica.com/

Not much time passed before the idea of "Web 3.0" appeared. Amit Agarwal suggests that Web 3.0 is about semantics (the meaning of data), personalization (e.g. iGoogle), intelligent search and behavioural advertising, among other things [Agarwal, 2009]. While Web 2.0 uses the Internet to make connections between people, Web 3.0 will use the Internet to make connections with information. Intelligent browsers will analyze the complex requests of users made in natural language, will search the Internet for all possible answers, and will then organize the results. The adaptation to user specifics and aptitudes (personalisation) will be based on capturing historical information through searching the Web. Many experts believe that the Web 3.0 browser will act like a personal assistant. The computer and the environment will become artificial subjects, which will pretend to communicate in as natural a manner as real humans. Of course, the problems of applying rights policies in such a new atmosphere are crucial; however, addressing the rights is an additional issue which needs to be solved. A core problem in this domain continues to be finding an appropriate combination of retrieval methods and techniques which can lead to high-quality image discovery. In the era of Web 3.0, bridging the semantic gap remains crucial.


3 The Process of Image Retrieval

Information retrieval is the science of searching for digital items, based both on their content and on the metadata about them. Information retrieval can be done at different levels, from personal digital collections to world repositories on the WWW. It is interdisciplinary and attracts the interest of a wide range of researchers and developers from a number of domains: computer science, mathematics, library science, information science, information architecture, cognitive psychology, linguistics, statistics, physics, etc. Image retrieval is part of it; it focuses on the processes of browsing, searching and retrieving images from large collections of digital images. There are two basic methods in image retrieval: text-based retrieval and content-based image retrieval (CBIR), which are used separately or together. Traditional text-based indexing uses a controlled vocabulary or natural language to document what an image is or what it is about. The more recently developed content-based techniques rely on a pixel-level interpretation of the data content of the image. The upper stage of indexing techniques – concept-based indexing – is based on mixing simple text-based and content-based tools, taking into account additional information about the interconnections between the perceived information and the main player in this process – "the user".
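To make the text-based branch of this dichotomy concrete, its simplest realization is an inverted index over annotation words. A minimal sketch, with invented example annotations, might look like this:

```python
from collections import defaultdict

def build_index(annotations):
    """Map each lower-cased annotation word to the set of image ids."""
    index = defaultdict(set)
    for image_id, text in annotations.items():
        for word in text.lower().split():
            index[word].add(image_id)
    return index

def search(index, query):
    """Return ids of images annotated with every query word (AND search)."""
    sets = [index.get(w.lower(), set()) for w in query.split()]
    return set.intersection(*sets) if sets else set()

# Invented annotations standing in for manually entered image metadata.
annotations = {
    "img1": "wrought iron candlestick wall",
    "img2": "icon tempera wood",
    "img3": "iron gate wrought",
}
index = build_index(annotations)
print(sorted(search(index, "wrought iron")))   # ['img1', 'img3']
```

The limitations discussed in the next subsection are visible even here: the index knows nothing about synonyms or misspellings, and retrieval quality is bounded entirely by the quality of the manual annotations.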

3.1 Text-Based Retrieval

Search systems based on textual information contain metadata about the images, such as captions, keywords, or descriptions; retrieval is performed over the annotated words. These methods are easily implemented using already existing technologies, but require manual data input for each image in the system. Manual image annotation is time-consuming, laborious and expensive, and is a potential bottleneck because the speed of manual description and data entry is lower than the speed of digitisation. This is impractical for huge collections or automatically generated images. Text-based descriptions are usually not considered accurate and precise, and are often incomplete. Another problem with text annotation is that it often does not conform to any defined vocabulary in a particular domain and may not describe the relations between the objects in the images – besides the subjectivity of the judgements of the different people who enter the data. A further inconvenience comes from the lack of universal solutions for dealing with synonyms in a language, and from differences in the users' languages. Additionally, the rise of social web applications and the semantic web has inspired the development of several web-based image annotation tools as crowdsourcing solutions. In the frame of text-based retrieval we


can also place the context-based technique, where retrieval is based on the analysis of free textual information which forms the context of the image [Hung et al, 2007]. The current efforts in the area of structuring information in digital repositories are focused mainly in two directions:

- Assistance in the processes of ordering and classifying the meta-information (such as Getty's AAT, ULAN, TGN, CONA). The use of these ontological structures in image retrieval reduces the amount of metadata required and expands the search scope by utilizing the defined interconnections between concepts;
- Development of metadata schemas and structures to classify image information (for instance Dublin Core, VRA Core, CIDOC CRM). They provide conceptual models intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information.
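The claim that ontological interconnections expand the search scope can be illustrated with query expansion over a broader/narrower concept hierarchy. The concepts below are invented for illustration; a real system would draw on a vocabulary such as Getty's AAT:

```python
# Toy narrower-term hierarchy (invented concepts, loosely in the spirit
# of a broader/narrower thesaurus such as Getty's AAT).
NARROWER = {
    "metalwork": ["wrought iron", "silverware"],
    "wrought iron": ["candlesticks"],
}

def expand(term, hierarchy):
    """Return the term plus all transitively narrower concepts."""
    result = [term]
    for child in hierarchy.get(term, []):
        result.extend(expand(child, hierarchy))
    return result

print(expand("metalwork", NARROWER))
# ['metalwork', 'wrought iron', 'candlesticks', 'silverware']
```

A query for "metalwork" expanded this way also retrieves objects annotated only with the narrower terms, which is precisely how the interconnections reduce the metadata each object must carry.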

As example for successful project aimed to use the advantages of such directions is developed by the team of Radoslav Pavlov "Bulgarian Iconographical Digital Library (BIDL)" [Pavlov et al, 2010]. A tree-based annotation model had been developed and implemented for the semantic description of the iconographical objects. It provides options for autocompletion, reuse of values, bilingual data entry, and automated media watermarking, resizing and conversing. A designated ontological model, describing the knowledge of East Christian Iconographical Art is implemented in BIDL; it assists in the annotation and semantic indexing of iconographical artefacts. The global vision of BIDL is based on a longterm observation of the end users preferences, cognitive goals and needs, aiming to offer an optimal functionality solution for the end users [Pavlova-Draganova et al, 2010].  Ontologies as a Form of Ordering and Classifying the Metainformation A conceptualization is an abstract, simplified view of the world that we wish to represent for some purpose. Every knowledge base, knowledgebased system, or knowledge-level agent is committed to some conceptualization, explicitly or implicitly. Ontologies are used in informatics as a form of knowledge representation about the world or some part of it [Gruber, 1993]. As Gruber said "Ontology is a formal, explicit specification of a shared conceptualization". The term is borrowed from philosophy, where an ontology is a systematic account of Existence. In computer science an ontology is a formal representation of the knowledge by a set of concepts within a domain and the relationships


between those concepts. It is used to reason about the properties of that domain, and may be used to describe the domain.

According to their specificity, ontologies are: generic ontologies (synonyms are "upper-level" or "top-level" ontologies), in which the defined concepts are considered to be generic across many fields; core ontologies, where the defined concepts are generic across a set of domains; and domain ontologies, which express conceptualizations that are specific to a particular universe of discourse. The concepts in domain ontologies are often defined as specializations of concepts in the generic and core ontologies. The borderline between the different kinds of ontologies is not clearly defined, because core ontologies intend to be generic within a domain.

During image processing, ontologies on different levels, including visual, abstract and application-level concepts, can be used. The use of ontological structures in image retrieval reduces the amount of metadata and expands the search scope by using the interconnections between concepts specified in the ontologies employed.

Metadata Schemas and Structures

Resource Description Framework Schema (RDFS) is a family of specifications for the description of resources by setting the metadata for resources. RDF was developed by the World Wide Web Consortium (W3C) 122 and adopted by the Internet Society (ISOC) 123 as a standard for semantic annotation. RDFS is used as a basic method for the conceptual description or modelling of information contained in web resources with different formats and syntax. This mechanism for describing resources is a major component of the current activities of the Semantic Web, in which automated software can store, exchange and use information disseminated throughout the Internet. It gives users the opportunity to operate more efficiently and safely with information.
The ability to model heterogeneous abstract concepts through the RDF model has led to its increasing application for knowledge management of activities related to the Semantic Web. The basic idea consists in using special expressions for describing content resources. Each expression describes the relationship "subject – predicate – object", which in RDF terminology is called a triplet. The identification of subjects, predicates and objects in RDF is made by Uniform Resource Identifiers (URI). A URI is a string which uniquely identifies a resource in the digital space: a document, image, file, folder, mailbox, etc.

122 http://www.w3.org/TR/rdf-schema/
123 http://www.isoc.org/


The most popular examples of URIs are the URL (Uniform Resource Locator) and the URN (Uniform Resource Name). A URL is a URI which identifies a resource and in parallel provides information about its location. A URN is a URI which identifies a resource in a specific namespace (i.e. in context). In order to avoid the limitation of using only the Latin character set, W3C and ISOC gradually imposed a new standard, the IRI (International Resource Identifier), which is free to use any Unicode characters.

RDF expressions are represented by a labelled directed multi-graph. Such an RDF data model is naturally suited to certain types of knowledge representation, compared to the relational model and other ontological models traditionally used in computers today. However, in practice, RDF data often continue to be stored in relational databases or through dedicated descriptors, called triplestores or quadstores. RDFS (Resource Description Framework Schema) and OWL (Web Ontology Language) 124 indicate the possibility of using RDF as a base on which to build additional ontology languages.

The most critical part is to define and collect the metadata that describe the analyzed object. A number of text-based indexing initiatives deal with the development of metadata schemas and structures to classify image information. We could mention for example Dublin Core 125, which is used primarily for retrieving resources on the web; VRA Core 126, which has elements to describe both an original work of art and its surrogate; and CIDOC CRM 127, which gives a conceptual reference model intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information.
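The triplet model described above can be illustrated with a minimal sketch: statements are stored as (subject, predicate, object) tuples whose parts are URIs or literals, and a pattern with wildcards plays the role of a simple query. The example URIs for the painting and the artist are hypothetical (only the dc:creator and dc:title predicates are real Dublin Core terms), and real systems would use a triplestore rather than a Python list.

```python
# Each RDF statement is a (subject, predicate, object) triplet.
# Subjects and predicates are URIs; objects may be URIs or literals.
# The example.org URIs below are invented for illustration.
triples = [
    ("http://example.org/painting/42",
     "http://purl.org/dc/elements/1.1/creator",
     "http://example.org/artist/picasso"),
    ("http://example.org/painting/42",
     "http://purl.org/dc/elements/1.1/title",
     "Guernica"),
    ("http://example.org/artist/picasso",
     "http://example.org/ns/bornIn",
     "Malaga"),
]

def match(triples, s=None, p=None, o=None):
    """Return all triplets matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# All statements about painting 42:
for t in match(triples, s="http://example.org/painting/42"):
    print(t[1], "->", t[2])
```

Pattern matching with wildcards over such triplets is the essence of what query languages over RDF data provide, albeit over far larger stores and with graph joins.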

3.2 Content-Based Image Retrieval (CBIR)

Content-based image retrieval, as we see it today, is any technology that in principle helps to organize digital images based on their content. By this definition, anything ranging from an image similarity function to a robust image annotation engine falls within the scope of CBIR. This characterization of CBIR as a field of study places it at a unique juncture within the scientific community. While we witness continued effort in solving the fundamental open problem of robust image understanding, we also see specialists from different fields, such as computer vision, machine learning, information retrieval, human-computer interaction, database systems, Web and data mining, information theory, statistics, and psychology contributing to and becoming part of the CBIR community [Wang et al, 2006].

124 http://www.w3.org/TR/owl-features/
125 http://dublincore.org/documents/dces/
126 http://www.vraweb.org/organization/committees/datastandards/index.html
127 http://www.cidoc-crm.org/

John Eakins and Margaret Graham affirm that the term Content-Based Image Retrieval was first used in 1992 by Toshikazu Kato [Eakins and Graham, 1999], when explaining his experiments on automatic extraction of colour and shape from paintings stored in a database [Kato, 1992]. Since then the term has been used to describe the process of retrieving images from large collections using features extracted from the image content, based on their visual similarity to a query image or to image features supplied by an end user.

Before designing and constructing a CBIR system, one very important step is selecting the domain where the system will be used. Different domains pose specific functional and non-functional requirements, which have to be covered by the system. Over the years a wide spectrum of areas has turned to CBIR systems, such as medical diagnostics, geographical information and remote sensing systems, crime prevention, the military, intellectual property, photograph archives, architectural and engineering design, art collections, etc. From the point of view of the application area, the images can represent different types of sensor-related data, projected or directly received in digital formats. Digital imagery includes colour and black-and-white photographs, infrared photographs, video snapshots, radar screens, synthetic aperture radar formats, seismograph records, ultrasound, electrocardiographic, electroencephalographic, magnetic resonance images and others.

Typically, a content-based image retrieval system consists of three components:

 Feature design;
 Indexing;
 Retrieval.

The feature design component extracts the visual feature information from the images in the image database. The indexing component organizes the visual feature information to speed up query processing. The retrieval engine processes the user query and provides a user interface. During this process the central issue is to define a proper feature representation and similarity metrics. CBIR systems extract visual features from the images automatically. Similarities between two images are measured in terms of the differences between the corresponding features. To take into account the subjectivity of human perception and bridge the gap between the high-level concepts and the low-level features, relevance feedback is used as a means to enhance the retrieval performance.
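The three components can be sketched in a few lines, under simplifying assumptions: images are given as lists of (r, g, b) pixel tuples, feature design is a coarse normalized colour histogram, the "index" is a plain in-memory list, and retrieval ranks by Euclidean distance between histograms. All names and the toy images are invented for illustration.

```python
import math

def colour_histogram(pixels, bins=4):
    """Feature design: quantize each channel into `bins` levels and count."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    total = sum(hist) or 1.0
    return [h / total for h in hist]   # normalize so image sizes are comparable

def distance(f1, f2):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def retrieve(query_pixels, database):
    """Retrieval: rank (name, pixels) entries by similarity to the query."""
    q = colour_histogram(query_pixels)
    ranked = sorted(database,
                    key=lambda item: distance(q, colour_histogram(item[1])))
    return [name for name, _ in ranked]

red = [(250, 10, 10)] * 50
blue = [(10, 10, 250)] * 50
db = [("blue_painting", blue), ("red_painting", red)]
print(retrieve([(240, 20, 20)] * 10, db))   # red_painting ranked first
```

A real system would replace the linear scan with an index structure and the histogram with richer descriptors, but the division of labour between the three components stays the same.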


All these steps are highly dependent on the domain where CBIR technology is applied. For instance, in fields such as aerial image retrieval and medicine the goal is exactly defined, the objects searched for in the images have homogeneous characteristics, and the received results usually do not require interaction with the user to refine the queries. The situation is completely different in areas connected with the creative side of human beings, such as art, architecture and design. The different kinds of users also impose different requirements on the specifics of CBIR systems. Handling digital copies of artworks spans a wide spectrum of directions and concerns different types of users:

 Museum workers: Analysis of the artwork itself, its lifecycle, preservation and restoration are very important but heavy tasks, where automatic image processing techniques have proved their usability during the last decades;
 Universal citizens: Taking into account that an artwork brings a specific author's message to the viewer, the computer should provide the ability to present history, context, and relevance in order to enrich education, enhance cross-cultural understanding, and sustain one's heritage and cultural diversity;
 Computer scientists: Besides the wide range of standard questions of serving the processes of image analysis and managing repositories, the grand challenge is determining image semantics and automatically verbalizing it.

4 The Gaps

One of the most felicitous analogies for presenting the semantic gap in the area of Content-Based Image Retrieval can be found in "The Hitchhiker's Guide to the Galaxy" by Douglas Adams. In this story, a group of hyper-intelligent pan-dimensional beings demand to learn the "Answer to Life, the Universe, and Everything" from the supercomputer Deep Thought, specially built for this purpose. It takes Deep Thought 7½ million years to compute and check the answer, which turns out to be "42". The efforts at covering the semantic gap in CBIR aim to avoid such misunderstanding between human perception and ways of communication on the one hand and the computer's manner of low-level representation on the other [Ivanova and Stanchev, 2009].

Search in the context of content-based retrieval analyzes the actual contents of the image. The term content might refer to colours, shapes, textures, or any other information that can be derived from the image itself. Acknowledging the need for providing image analysis at the semantic level, research efforts focus on the automatic extraction of image descriptions matching human perception. The ultimate goal characterizing such efforts is to bridge the so-called semantic gap between the low-level visual features that can be automatically extracted from the visual content and the high-level concepts capturing the conveyed meaning [Dasiapoulou et al, 2007]. The semantic gap is not the only cause of difficulties in the process of information retrieval, where issues can arise across the whole range, starting from the primary object's complexity and ending with end-user subjectivity. Currently different gaps are being discussed in the research literature: sensory, semantic, abstraction, and subjective.

4.1 Sensory Gap

The sensory gap is "the gap between the object in the world and the information in a (computational) description derived from a recording of that scene" [Smeulders et al, 2000]. The sensory gap exists in the multimedia world as the gap between an object and the machine's capability to capture and define that object.

Digitalization faces a big challenge when applied to artworks: to develop techniques for creating digital objects which capture the paintings in good quality. Circumstances (such as the condition of the pictures, the lighting, the capabilities of the photo-cameras or scanners used, the chosen resolution, etc.) play a major role in this process. The sensory gap in this area inevitably results in the impossibility to present the real sizes of the pictures or to present all pictures at one proportional scale. One can only see the proportion between height and length. For instance, Picasso's Guernica is 3.59 m x 7.76 m, while the miniatures of Isaac and Peter Oliver do not exceed 2.5 cm x 2.5 cm. This sensory gap can be bridged only with additional metadata, taken from the camera or manually added to the picture.

The granularity of digitized sources of artworks is in accordance with their usage. Figure 16 summarizes the connections between different kinds of users and the amount and quality of the corresponding digital sources. For the purposes of professional analysis in museums, special kinds of images received from different photographic processes, such as multicolour banding, x-rays and infra-red imaging, are used. For the purposes of the professional printing industry, very high definition and quality images are needed [Maitre et al, 2001]. Royalties and copyright restrictions on one side [Mattison, 2004], the necessity of high-speed delivery on the Internet on another, and the limitations of visual devices (monitors) on a third, impose restrictions on the sizes and resolutions of digital images.

Figure 16. Digitized art images: quality and usage

Usually in the Internet space the presentations of art paintings vary across:
 About 100x100 pixels for front presentation of the paintings as thumbnails;
 Image surrogates designed for presenting the painting on the screen, usually about 500 pixels in width or height, supplemented by additional text information concerning the picture – author, sizes, techniques, locality, history of the creation, subject comment, etc.;
 High-definition images for web access, usually up to 1500 pixels in width or height;
 Finally, up to 4000 pixels in width or height, often watermarked items. Access to high-resolution and ultra-magnified images is usually governed by policies which set restrictions of use.
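The surrogate sizes listed above all amount to scaling an image so that its larger side fits a target size while the width-to-height proportion is preserved. A small sketch of that computation follows; the function name and the example pixel dimensions (a hypothetical 7760x3590 scan of Guernica, echoing its 7.76 m x 3.59 m physical size) are invented for illustration.

```python
def surrogate_size(width, height, target):
    """Return (w, h) scaled so the larger side equals `target`, aspect kept."""
    scale = target / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))

# A screen surrogate (larger side 500 px) and a thumbnail (larger side 100 px):
print(surrogate_size(7760, 3590, 500))   # → (500, 231)
print(surrogate_size(7760, 3590, 100))   # → (100, 46)
```

Note that, as the text stresses, such scaling preserves only the proportion between height and length: nothing in the surrogate conveys the enormous physical difference between a mural and a miniature.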

The fact that a digitized work of art is not the work itself but an image (instance) of this work, acquired at a certain time under specific conditions, makes semantic-based indexing and retrieval an absolute necessity in this area. For example, a query on "Mona Lisa" should retrieve all images of the painting regardless of their size, view angle, restoration procedures applied on the painting, etc. [Chen et al, 2005a].

4.2 Semantic Gap

The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data has for a user in a given situation [Smeulders et al, 2000]. The semantic gap is larger in visual arts images than in natural images since artworks are most often not perfectly realistic.


In simple terms, the semantic gap in content-based retrieval stems from the fact that multimedia data is captured by devices in a format which is optimized for storage and very simple retrieval, and cannot be used to understand what the object "means". In addition, user queries are based on semantic similarity, but the computer usually processes similarity based on low-level features.

In early systems low-level representation was considered most reliable. Later, to bridge this gap, text annotations by humans were used in conjunction with the low-level features of the objects. To extend the annotation list, different ontology systems have been used for further improving the results: generic ontologies such as WordNet 128, specialized international standard structures such as the Dublin Core Element Set 129 and the VRA Core Categories 130, or special ontologies designed for the description of artefacts, such as IconClass 131 and the Categories for the Description of Works of Art (CDWA) 132. The annotations are used to group images into a certain number of concept bins, but the mapping from image space to concept space is not one-to-one, since it is possible to describe an image using a large number of words. The labelling of images is made not only for the whole image, but also for separate parts of the image [Enser et al, 2006].

The semantic gap is very critical to content-based multimedia retrieval techniques. As the authors in [Smeulders et al, 2000] state: "The aim of content-based retrieval systems should be to provide maximum support in bridging the semantic gap between the simplicity of available visual features and the richness of the user semantics".
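The grouping of annotated images into concept bins can be sketched as an inverted index: because one image carries many keywords, the image-to-concept mapping is many-to-many, and each keyword bin collects every image annotated with it. The image names and keywords below are invented examples, not entries from any real vocabulary.

```python
from collections import defaultdict

# Hypothetical manual annotations: each image carries several keywords.
annotations = {
    "icon_01": ["saint", "gold", "halo"],
    "icon_02": ["saint", "cross"],
    "landscape_01": ["tree", "river"],
}

def build_index(annotations):
    """Invert image->keywords into keyword->images (the 'concept bins')."""
    index = defaultdict(set)
    for image, keywords in annotations.items():
        for kw in keywords:
            index[kw].add(image)
    return index

index = build_index(annotations)
print(sorted(index["saint"]))   # both icons fall into the "saint" bin
```

The many-to-many nature of the mapping is visible directly: "icon_01" appears in three bins at once, while the "saint" bin holds two different images.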

4.3 Abstraction Gap

The abstract aspects are specific to art images and differ from the semantic challenge. There are two major directions in this area, the first addressing cultural specifics and the second addressing technical differences.

Cultural abstraction relates to information inferred from cultural knowledge [Hurtut, 2010]. Artistic style analysis belongs to that category. According to the American Heritage Dictionary [Pickett et al, 2000], style is "the combination of distinctive features of artistic expression, execution or performance characterizing a particular person, group, school or era."

128 http://wordnet.princeton.edu/
129 http://www.dublincore.org/
130 http://www.vraweb.org/vracore3.htm
131 http://www.iconclass.nl
132 http://www.getty.edu/research/conducting_research/standards/cdwa/


According to [Hurtut, 2010], style and semantic depiction share the same visual atomic primitives (lines, dots, surfaces, textures). Manual style recognition is a very difficult task, requiring the knowledge of numerous art historians and experts. An artwork from Picasso's blue period, for instance, is recognizable not only because of its blue tonality, but because of many semantic features and iconographical cues.

Technical abstraction deals with questions about the real artist of an artwork (artwork authentication), but can also focus on artistic praxis such as perspective rendering, pigment identification, searching for preliminary sketches in underlying layers, pentimenti, engraving tools, etc. These aspects can be analyzed using different imaging techniques such as X-rays, UV and infrared imaging.

4.4 Subjective Gap

The subjective gap exists due to users' aspirations and the descriptions of these aspirations. It may be difficult for a user to express what he wants from a multimedia retrieval system [Agrawal, 2009]. The subjective gap also exists due to the non-availability of features which the user wants to express. Some authors [Castelli and Bergman, 2002] identify this as an intermediate level of extraction from visual content, connected with the emotional perception of images, which is usually difficult to express in rational and textual terms.

The subjective gap is similar to the semantic gap; it refers to the lack of ability of the user to describe his needs (queries) to a retrieval system. To bridge this gap, instead of defining the user's requirements at a very fine granularity level, higher-level concepts can be used. The relevance feedback technique, combined with neural networks or fuzzy systems, can bridge this subjective gap to some extent [Grosky et al, 2008].

In recent years the term "Emotional Semantic Image Retrieval" has enjoyed growing popularity in scientific publications. Visual art is an area where these features play a significant role. Typical features at this level are colour contrasts, because one of the goals of a painting is to produce specific psychological effects in the observer, which are achieved with different arrangements of colours. W. Bruce Croft introduced the concept of the "aesthetic gap" as "the lack of coincidence between the information that one can extract from low-level visual data and the interpretation of emotions that the visual data may arouse in a particular user in a given situation" [Croft, 1995]. Aesthetics is similar to quality as perceived by a viewer and is highly subjective. Modelling the aesthetics of images will evolve in the near future. The thesis of Ritendra Datta, presented in 2009 [Datta, 2009], focuses precisely on semantic and aesthetic inference for image search using statistical learning approaches.

Emotional abstraction relates to emotional responses evoked by an image. These issues are addressed in the research domain called affective computing, which enjoys widespread attention among computer scientists beyond those working on cultural heritage. In principle, artworks by their nature are images that naturally evoke affective effects. Due to their implicit stylization, one does not look at artistic images with the same kind of attention and expectation as at natural images. Many approaches try to bridge the gap between selected low-level features and several emotions expressed with pairs of words, e.g. warm-cool, action-relaxation, joy-uneasiness [Colombo et al, 1999]. It is shown in [Weining et al, 2006] that the emotional expression of an image is closely connected with such low-level characteristics as colour and luminance distributions, saturation and contrast information, as well as edge sharpness.
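Two of the low-level cues named above, saturation and luminance contrast, are simple enough to sketch directly over a list of (r, g, b) pixels. The formulas used here (HSV-style saturation, Rec. 601 luminance weights, standard deviation of luminance as a contrast proxy) are common textbook approximations, not the specific features of the cited works, and the sample pixel lists are invented.

```python
def saturation(r, g, b):
    """HSV-style saturation of one pixel, in [0, 1]."""
    mx, mn = max(r, g, b), min(r, g, b)
    return 0.0 if mx == 0 else (mx - mn) / mx

def luminance(r, g, b):
    """Perceptual luminance, standard Rec. 601 channel weights."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def image_stats(pixels):
    """Return (mean saturation, luminance contrast) for a pixel list."""
    sats = [saturation(*p) for p in pixels]
    lums = [luminance(*p) for p in pixels]
    mean_lum = sum(lums) / len(lums)
    contrast = (sum((l - mean_lum) ** 2 for l in lums) / len(lums)) ** 0.5
    return sum(sats) / len(sats), contrast

vivid = [(255, 0, 0), (0, 0, 255)]        # saturated, strongly contrasting
muted = [(120, 120, 120), (130, 130, 130)]  # grey, nearly uniform
print(image_stats(vivid))
print(image_stats(muted))
```

Features of exactly this kind are what the cited emotion-oriented approaches feed into their word-pair classifiers (warm-cool, joy-uneasiness, etc.).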

5 User Interaction

Bridging the gaps is closely connected with user interaction. This is the place where the user and the system communicate. The main focus in the creation of digital art resources has to be user-centred rather than system-centred, since most of the issues around this content are related to making it accessible and usable for the real users [Dobreva and Chowdhury, 2010].

5.1 Complexity of the Queries

In image retrieval systems, an important parameter for measuring the level of user-system interaction is the complexity of the queries supported by the system. The queries can use different modalities, such as:
 Direct entry of values of the desired features (query by percentage of properties): This method is not usually used in current systems because it is not particularly convenient for the users;
 Image, also known as query by example: Here, the user searches for images similar to a query image. Using an example image is perhaps the most popular way of querying a CBIR system in the absence of reliable metadata;
 Graphics, or query by sketch: This consists of a hand-drawn or computer-generated picture, where graphics are used as a query;
 Keywords or free text: This is a search in which the user makes a query in the form of a word or group of words, selected from a previously defined set or in free form. This is currently the most popular way in web image search engines like Google, Yahoo! and Flickr. Usually this search is based on manually attached metadata or context-driven information. Numerous current efforts are directed at finding methods for automated labelling of the images – a present-day challenge for CBIR systems;
 Composite: These are methods that combine one or more of the aforesaid modalities for querying a system. This also covers interactive querying such as the one implemented in relevance feedback systems.

Exploring user needs and behaviour is a basic and important phase of system development and is very informative when done as a front-end activity to system development. Currently users are mostly involved in usability studies when a set of digital resources has already been created and is being tested (for an overview of usability evaluation methods in the library domain see [George, 2008]). It would be really helpful to involve users at the early stages of designing and planning the functionality of the product being developed.

5.2 Relevance Feedback

Relevance feedback is a very important step in image retrieval, because it defines the goals and the means to achieve them. Relevance feedback provides a compromise between a fully automated, unsupervised system and one based on subjective user needs. It is a query modification technique which attempts to capture the user's precise needs through iterative feedback and query refinement. It can be thought of as an alternative search paradigm to paradigms such as keyword-based search. In the absence of a reliable framework for modelling high-level image semantics and the subjectivity of perception, the user's feedback provides a way to learn case-specific query semantics. A comprehensive review can be found in [Zhou and Huang, 2003] and [Crucianu et al, 2004].

The goal in relevance feedback is to optimise the amount of interaction with the user during a session. It is important to use all the available information to improve the retrieval results. Based on the user's relevance feedback, learning-based approaches are typically used to modify the feature set or similarity measure appropriately. In practice, the number of learning instances is very small. This circumstance has generated interest in novel machine-learning techniques to solve the problem, such as one-class learning, active learning, and manifold learning.

Usually, classical relevance feedback consists of multiple rounds, which leads to the user losing patience. Recent developments are directed at finding techniques for minimizing the rounds. One approach is to use information from earlier user logs in the system. Another approach is presented in [Yang et al, 2005], where a novel feedback solution for semantic retrieval is proposed: semantic feedback, which allows the system to interact with users directly at the semantic level. This approach is closely related to the new relevance feedback paradigms aimed at helping users by providing cues and hints for more specific query formulation.
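One classic realization of the query refinement described above is Rocchio-style query-point movement: after each round, the query feature vector is pulled towards the centroid of images the user marked relevant and pushed away from the centroid of non-relevant ones. This is a generic sketch, not the method of any of the cited works; the weights alpha, beta and gamma are conventional but arbitrary choices.

```python
def refine_query(query, relevant, nonrelevant,
                 alpha=1.0, beta=0.75, gamma=0.25):
    """One round of Rocchio-style feedback over feature vectors (lists)."""
    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(col) / len(vectors) for col in zip(*vectors)]
    pos = centroid(relevant)      # pull towards relevant examples
    neg = centroid(nonrelevant)   # push away from non-relevant ones
    return [alpha * q + beta * p - gamma * n
            for q, p, n in zip(query, pos, neg)]

q = [0.5, 0.5]
new_q = refine_query(q, relevant=[[1.0, 0.0]], nonrelevant=[[0.0, 1.0]])
print(new_q)   # → [1.25, 0.25]
```

Each feedback round repeats this update with the newly marked images, which is exactly why classical relevance feedback needs the multiple rounds the text mentions.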

5.3 Multimodal Fusion

Multimodal fusion is linked to the integration of information in human-machine interactive systems where several communication modalities are offered to the user. In recent years advances in hardware and communication techniques have made it possible to use the advantages of multimedia. A presentation concerning some semantic unit becomes more attractive and rich when it uses different modalities such as images, text, free text, graphics, video, and any conceivable combination of them.

Thus far, we have encountered a multitude of techniques for modelling and retrieving images, and text associated with these images. Trying to solve retrieval tasks using only independent, media-specific methods is not a good decision, because information from the context (which can be extracted from a neighbouring retrieval process over another modality) can be very helpful for the current retrieval process. The user can often best describe his queries only by a combination of media possibilities. Here lies the need for multimodal fusion as a technique for satisfying such user queries. Research in multimodal fusion therefore attempts to learn optimal combination strategies and models.

If we observe only the image retrieval process, multimodal fusion can also be considered in the case of different modalities of presenting the image. For example, if we select colour as a discriminative feature, several images may have the same colour, but when it is combined with another modality such as texture, they can be classified into their respective categories with higher confidence. Each modality extracts a certain aspect of an image, and the modalities are interdependent. In the presence of many modalities, it is important to identify the best way to fuse them. The fusion schemes can be distinguished by whether we fuse the image data from the different modalities first and then conduct experiments, or the other way round [Snoek et al, 2005]:

 Early fusion: Fusion scheme that integrates unimodal features before learning concepts;
 Late fusion: Fusion scheme that first reduces unimodal features to separately learned concept scores; these scores are then integrated to learn concepts.


In early fusion we need only a one-step learning phase, whereas late fusion requires an additional learning step. According to the architecture, fusion schemes can be grouped into three main categories:

 Parallel architecture: all the individual classifiers are invoked independently, and their results are combined. The results may be combined with equal weights, or they may be assigned different weights based on certain user-selected criteria;
 Serial combination: the individual classifiers are applied sequentially, in increasing order of their computational cost;
 Hierarchical: the individual classifiers are placed into a decision-tree-like structure.

Fusion learning is an offline process, while fusion application at real time is computationally inexpensive, which makes multimodal fusion a very useful method for image retrieval.
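Late fusion in the parallel architecture can be sketched in a few lines: each unimodal classifier (say colour and texture) independently produces a score per image, and the per-image scores are combined with chosen weights. The scores, image names and equal weights below are invented placeholders; in practice the weights would come from the offline fusion-learning step.

```python
def late_fusion(score_lists, weights):
    """Combine per-modality score dicts (image -> score) by weighted sum."""
    images = set().union(*score_lists)
    return {img: sum(w * scores.get(img, 0.0)
                     for w, scores in zip(weights, score_lists))
            for img in images}

# Hypothetical concept scores from two independent unimodal classifiers:
colour_scores = {"img1": 0.9, "img2": 0.4}
texture_scores = {"img1": 0.2, "img2": 0.8}

fused = late_fusion([colour_scores, texture_scores], weights=[0.5, 0.5])
best = max(fused, key=fused.get)
print(best, round(fused[best], 2))
```

The cheap weighted sum at query time is what makes the application phase inexpensive: all the costly work went into training the unimodal classifiers and learning the weights offline.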

6 Feature Design

The process of feature design aims to produce a mathematical description of an image, its signature, for retrieval purposes. Most CBIR systems perform feature design as a pre-processing step. Once obtained, visual features act as inputs to subsequent image analysis tasks, such as similarity estimation, concept detection, or annotation.

The extraction of signatures and the calculation of image similarity cannot be cleanly separated. On the one hand, the formulation of signatures determines the necessity of finding new definitions of similarity measures. On the other hand, intuitions are often the early motivating factors for designing similarity measures in a certain way, which puts requirements on the construction of signatures.

In terms of methodology development, a strong trend which has emerged in recent years is the employment of statistical and machine learning techniques in various aspects of CBIR technology. Automatic learning, mainly clustering and classification, is used to form either fixed or adaptive signatures, to tune similarity measures, and even to serve as the technical core of certain searching schemes, for example relevance feedback. A fixed set of visual features may not work equally well to characterize different types of images. The signatures can be tuned either based on the images alone (when some property does not characterize the image, the signatures vary according to the classification of images) or by learning from user feedback (when the user is not interested in a particular feature).


In contrast to the early years, when global feature representations for images such as colour histograms and global shape descriptors were used, the focus currently shifts towards using local features and descriptors, such as salient points, region-based features, spatial model features, and robust local shape characterizations.
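The simplest step from a global to a local representation is to compute features per cell of a regular grid, so that spatial layout is partly preserved in the signature. The sketch below computes the mean colour per cell; the input format (a 2D list of (r, g, b) rows), the function name and the 2x2 grid are arbitrary illustrative choices, far simpler than the salient-point or region-based descriptors named above.

```python
def grid_means(image, grid=2):
    """Local feature sketch: mean (r, g, b) per cell of a grid x grid split."""
    h, w = len(image), len(image[0])
    features = []
    for gy in range(grid):
        for gx in range(grid):
            cell = [image[y][x]
                    for y in range(gy * h // grid, (gy + 1) * h // grid)
                    for x in range(gx * w // grid, (gx + 1) * w // grid)]
            n = len(cell)
            features.append(tuple(sum(px[i] for px in cell) / n
                                  for i in range(3)))
    return features

# A tiny 2x2 image: top row red, bottom row blue -> four cell means.
image = [[(255, 0, 0), (255, 0, 0)],
         [(0, 0, 255), (0, 0, 255)]]
print(grid_means(image))
```

Unlike a single global histogram, this signature distinguishes a red-over-blue image from a blue-over-red one, which is exactly the spatial information the shift towards local features is meant to capture.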

6.1 Taxonomy of Art Image Content

Johannes Itten [Itten, 1961] has given a very good formulation of the messages that an artwork sends to the viewer. He points out three basic directions of evincing colour aesthetics:
 Impression (visually);
 Expression (emotionally);
 Construction (symbolically).

These characteristics are mutually connected and cannot live of full value alone: symbolism without visual accuracy and without emotional force would be merely an anaemic formalism; visually impressive effect without symbolic verity and emotional power would be a banal imitative naturalism; emotional effect without constructive symbolic content or visual strength would be limited to the plane of sentimental expression. Each artist works according to his temperament, and must emphasize one or another of these aspects [Itten, 1961]. Different styles in art paintings are connected with used techniques from one side and aesthetic expression of the artist from other side. The process of forming artist style is a very complicated one, where current fashion painting styles, social background and personal character of the artist play significant role. All these factors lead to forming some common trends in art movements and some specific features, which distinguish one movement to another, one artist style to another, one artist period to another, etc. On the other hand the theme of the paintings also stamps specifics and can be taken into account. The compositions in different types of images (portraits, landscapes, town views, mythological and religious scenes, or everyday scenes) also set some rules, aesthetically imposed for some period. When humans interpret images, they analyze image content. Computers are able to extract low-level image features like colour distribution, shapes and texture. Humans, on the other hand, have abilities that go beyond those of computers. The humans draw own subjective conclusions. They place emphasis on different parts of images, identify objects and scenes stamping theirs subjective vision and experience. The emotion that one person gets from seeing an image, and

116

Access to Digital Cultural Heritage ...

therefore associates with it, may differ from another person's point of view. In an attempt to define useful grounds for bridging the gap between human and computer interpretation of image content, several taxonomies of image content as extracted by the viewer of an image have been suggested. Alejandro Jaimes and Shih-Fu Chang focus on two aspects of image content – the visual percepts received from the observed images and the underlying abstract idea, which corresponds to concepts connected with the image content [Jaimes and Chang, 2002]. In his survey of 2D artistic image analysis, Tomas Hurtut [Hurtut, 2010] expands the taxonomy suggested by Bryan Burford, Pam Briggs and John Eakins [Burford et al, 2003]. He profiles extraction primitives and concepts, accounting for the specifics of artworks, and splits image categories into three groups: image space, object space and abstract space. For the purposes of this study we adopted Hurtut's proposition, adjusting the distribution of features across the groups. We examine image space, semantic space and abstract space:

 Image space contains the visual features needed to record an image through visual perception. Image space includes perceptual primitives (colour, textures, local edges), geometric features (strokes, contours, shapes) and design constructions (spatial arrangement, composition);

 Semantic space is related to the meaning of the elements – their potential for semantic interpretation. Semantic space consists of semantic units (objects), the 3D relationships between them (scene, perspective, depth cues) and context (illumination, shadow);

 Abstract aspects that are specific to art images and reflect cultural influences, specific techniques, as well as emotional responses evoked by an image, form the abstract space.

Our vision of classifying feature percepts and concepts is presented in Fig. 17 (it slightly differs from Hurtut's proposition), together with examples of the techniques used for extracting visual primitives and some of the closer relationships between concepts from the defined spaces [Ivanova et al, 2010/MCIS]. All concepts are mutually connected – for instance, emotional abstractions depend on the specific expressive power of the artist (which is closely connected with the visual perception primitives), on the theme of the painting (concerning the objective semantics of the painting), as well as on the viewpoint of the observer with his/her cultural and psychological peculiarities. Resolving these questions comes up against the problem that interpretation in real-world contexts is dynamic in nature. The

Chapter 3: Automated Metadata Extraction from Art Images

117

information that one can extract from the visual data with a one-time trained image recognition model does not change; the interpretation that the same data carry for a user in a given situation, however, changes across users as well as across situations.

Figure 17. A taxonomy of art image content, inspired by [Burford et al, 2003] and [Hurtut, 2010]

6.2 Visual Features

According to Figure 17, the main features extracted from image space are:

 Perceptual features, especially colour, texture and interest point features;




 Geometric features, where the main focus for art image analysis is on contours and shapes as a source for further semantic interpretation, and on strokes, which together with the former are the source for extracting technical abstractions;

 Design constructions, connected with absolute or relative spatial relations.

Image features can be extracted at a global level to represent the entire image, or the image can be split into parts and the features computed at a local level from each part. The most commonly used features include those reflecting colour, texture, shape, and salient points in an image. At the global level, features are computed to capture the overall characteristics of an image. The advantage is high speed for both processes: constructing signatures and computing similarity. Processing at the local level increases robustness to spatial transformations of the images and gives a more detailed representation of specific features of the image. Both approaches have their advantages: global features help to build an integral overview of the image, while local ones can capture more detailed specifics.

6.2.1 Colour Features

Colour features focus on the summarization of colours in an image. A set of colour descriptors reflecting different aspects of colour presence in an image is included in the MPEG-7 standard. The Dominant Colour descriptor presents the percentage of each quantized colour over the observed area. The Scalable Colour descriptor builds a colour histogram, encoded by a Haar transformation. The Colour Layout descriptor represents the spatial distribution of colour in an image, or in an arbitrarily shaped region, in a very compact, resolution-invariant form. The Colour Structure descriptor captures both colour content (similar to a colour histogram) and information about the structure of this content. The exploration of colour features is usually accompanied by converting the colour representation into other colour spaces that correspond more closely to human vision and thereby facilitate the choice of appropriate distance measures.

6.2.2 Texture Features

Texture features are intended to capture the granularity and repetitive patterns of surfaces within a picture. Their role in domain-specific image retrieval is particularly vital due to their close relation to the underlying semantics. In image processing, a popular way to form texture features is to use the coefficients of a certain transformation of the original pixel values, or statistics computed from such coefficients.
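As a toy illustration of forming texture features from transform coefficients, the sketch below (pure Python; the patches and the choice of "transform" are hypothetical) uses a simple horizontal high-pass filter and summarizes its coefficients with mean, standard deviation and energy:

```python
# A minimal sketch of texture statistics: the "transformation" here is a
# horizontal high-pass (differences of neighbouring pixels), and the texture
# feature is a few statistics computed over its coefficients.

def texture_stats(patch):
    """Summarize horizontal difference coefficients of a 2D grey-level patch."""
    coeffs = [row[i + 1] - row[i] for row in patch for i in range(len(row) - 1)]
    n = len(coeffs)
    mean = sum(coeffs) / n
    var = sum((c - mean) ** 2 for c in coeffs) / n
    energy = sum(c * c for c in coeffs) / n
    return {"mean": mean, "std": var ** 0.5, "energy": energy}

# A smooth patch yields zero energy; a striped patch yields high energy.
smooth = [[10] * 8 for _ in range(8)]
stripes = [[0 if i % 2 == 0 else 100 for i in range(8)] for _ in range(8)]
print(texture_stats(smooth)["energy"])   # 0.0
print(texture_stats(stripes)["energy"])  # 10000.0
```

Real texture descriptors use richer transforms (Gabor banks, wavelets), but the pattern – coefficients first, statistics second – is the same.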


Such descriptors encode significant, general visual characteristics into standard numerical formats that can be used for various higher-level tasks. In many application areas, for example in aerial image retrieval and in medicine, thesauri for texture have been built. In the field of art image retrieval, a thesaurus of brushwork terms for the annotation of paintings covering the period from Medieval to Modern art has been proposed, including terms such as "shading", "glazing", "mezzapasta", "grattage", "scumbling", "impasto", "pointillism", and "divisionism" [Marchenko et al, 2007]. Brushwork is defined as a combination of colour presence and contrast features with texture features, such as directional ("impasto"), non-directional ("pointillism"), contrasting ("divisionism") and smooth ("mezzapasta"); in terms of spatial homogeneity, the classes can be grouped into homogeneous ("mezzapasta" and "pointillism"), weakly homogeneous ("divisionism") and inhomogeneous ("scumbling", "shading" and "glazing").

6.2.3 Salient Point Features

Feature detection is a low-level image processing operation. It is usually performed as the first operation on an image and examines every pixel to see whether a feature is present at that pixel. If it is part of a larger algorithm, the algorithm will typically examine the image only in the region of the features. There is a very large number of feature detectors, which vary widely in the kinds of features detected, their computational complexity and their repeatability. Salient point feature detectors can be divided into detectors of edges, corners, blobs and ridges (with some overlap). A detailed review of salient point features is given in [Ivanova et al-TR, 2010]. "Edges" are points where there is a boundary between two image regions. In practice, edges are usually defined as sets of points in the image which have a strong gradient magnitude. Furthermore, some common algorithms chain high-gradient points together to form a more complete description of an edge. These algorithms usually place some constraints on the properties of an edge, such as shape, smoothness, and gradient value. Locally, edges have a one-dimensional structure. The terms "corners" and "interest points" refer to point-like features in an image which have a local two-dimensional structure. The name "corner" arose because early algorithms first performed edge detection and then analyzed the edges to find rapid changes in direction (corners). These algorithms were later developed so that explicit edge detection was no longer required, for instance by looking for high levels of curvature in the image gradient. It was then noticed that the so-called corners were also being detected on parts of the image which were not corners in the


traditional sense (for instance, a small bright spot on a dark background may be detected). "Blobs" provide a complementary description of image structures in terms of regions, as opposed to corners, which are more point-like. Blob descriptors often contain a preferred point (a local maximum of an operator response, or a gravity centre), which means that many blob detectors may also be regarded as "point of interest" operators. Blob detectors can detect areas in an image which are too smooth to be detected by a corner detector. The concept of "ridges" is a natural tool for elongated objects. A ridge descriptor computed from a grey-level image can be seen as a generalization of a medial axis. From a practical viewpoint, a ridge can be thought of as a one-dimensional curve that represents an axis of symmetry and, in addition, has an attribute of local ridge width associated with each ridge point. It is algorithmically harder to extract ridge features from general classes of grey-level images than edge, corner or blob features. Ridge descriptors are frequently used for road location in aerial images and for extracting blood vessels in medical images. Multiple scale-invariant feature extraction algorithms such as SIFT, GLOH, SURF and LESH exist; they are widely used in current object recognition. They transform an image into a large collection of feature vectors, each of which is invariant to image translation, scaling and rotation, partially invariant to illumination changes, and robust to local geometric distortion (an overview is provided in [Ivanova et al-TR, 2010]).

6.2.4 Shape Features

Shape is a key attribute of segmented image regions, and its efficient and robust representation plays an important role in retrieval. Shape representations are closely connected with the particular forms of shape similarity used in each case. The current state of the art in this area is described in detail in [Data et al, 2008], which marks the shift from the global shape representations dominant in early research to the use of more local descriptors in recent years. The MPEG-7 standard also includes the Region Shape, Contour Shape and Shape 3D descriptors. The Region Shape descriptor utilizes a set of Angular Radial Transform coefficients. The Contour Shape descriptor is based on the Curvature Scale Space representation of the contour. The Shape 3D descriptor specifies an intrinsic shape description for 3D mesh models, which exploits some local attributes of the 3D surface [ISO/IEC 15938-3]. Shape features can play a very significant role in semantic retrieval.

6.2.5 Spatial Relations

Representing spatial relations among local image entities plays a very important role in the process of preparing visual signatures. Several kinds of indexing methods are used for representing absolute or relative spatial relations.

 Partitioning

Partitioning can be defined as data-independent grouping [Data et al, 2008]. This method is not closely connected with representing absolute spatial relations, but it offers a simple way of obtaining more local information about the examined images. There are different methods for partitioning the image, depending on the type of application. The simplest method is to divide the image into non-overlapping tiles. In [Gong et al, 1996] the image is split into nine equal sub-images. [Striker and Dimai, 1997] split the image into an oval central region and four corners. These methods have low computational cost and can be used for deriving low-level characteristics more precisely than from the whole image. They are not suitable if the goal is object segmentation.

 Segmentation

Segmentation is the opposite of partitioning and is characterized as data-driven grouping. [Estrada, 2005] formulates segmentation as "the problem of defining a similarity measure between image elements that can be evaluated using image data, and the development of an algorithm that will group similar image elements into connected regions, according to some grouping criterion. The image elements can be pixels, small local neighbourhoods, or image regions produced by an earlier stage of processing, or by a previous step of an iterative segmentation procedure. The similarity function can use one or many of the available image cues (such as image intensity, colour, texture, and various filter responses), or be defined as a proximity measure on a suitable feature space that captures interesting image structure." A great variety of segmentation techniques exists.
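One simple data-driven grouping technique, k-means clustering of pixel colours, can be sketched in pure Python as follows (the toy pixels and the naive initialization are hypothetical choices; note that the number of clusters k must be supplied externally):

```python
# A minimal k-means sketch over RGB pixels: alternate an assignment step
# (nearest centroid under squared Euclidean distance) and an update step
# (centroid moves to the mean of its members).

def kmeans_pixels(pixels, k, iters=10):
    """Cluster RGB pixels into k groups; returns final centroids and labels."""
    # naive initialization: the first k distinct pixel values
    centroids = []
    for p in pixels:
        if p not in centroids:
            centroids.append(p)
        if len(centroids) == k:
            break
    labels = [0] * len(pixels)
    for _ in range(iters):
        # assignment step: each pixel joins its nearest centroid
        for i, p in enumerate(pixels):
            labels[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2
                                              for a, b in zip(p, centroids[c])))
        # update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, l in zip(pixels, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(ch) / len(members)
                                     for ch in zip(*members))
    return centroids, labels

# Two clearly separated colour populations: dark red vs bright blue.
pixels = [(200, 10, 10)] * 6 + [(10, 10, 200)] * 6
centroids, labels = kmeans_pixels(pixels, k=2)
print(sorted(set(labels)))  # [0, 1]
```

A real segmenter would cluster in a perceptual colour space and enforce spatial connectivity; this sketch only shows the grouping mechanism.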
Some approaches apply either agglomerative (merging) or divisive (splitting) hierarchical clustering with different similarity functions (based on entropy or statistical distance) and stopping criteria (minimum description length, chi-square, etc.). Agglomerative algorithms are used more frequently than divisive ones. A well-known example is the algorithm in which a "normalized cut criterion" measures both the total dissimilarity between the different groups and the total similarity within the groups [Shi and Malik, 2000].


Other algorithms are not hierarchical. The simplest and most widely used segmentation approach is based on k-means clustering. This basic approach enjoys a speed advantage, but is not as refined as some more recently developed methods. Another disadvantage is that the number of clusters is an external parameter. The mean-shift algorithm is a nonparametric clustering technique; it does not require prior knowledge of the number of clusters, and it recursively moves each data point to the kernel-smoothed centroid, looking for the point with the highest density of the data distribution [Comaniciu and Meer, 1999]. Amongst other approaches it is worth mentioning the multi-resolution segmentation of low-depth-of-field images [Wang et al, 2001], a Bayesian framework-based segmentation involving the Markov chain Monte Carlo technique [Tu and Zhu, 2002], and the EM-algorithm-based segmentation using a Gaussian mixture model [Carson et al, 2002], forming blobs suitable for image querying and retrieval. A sequential segmentation approach that starts with texture features and refines the segmentation using colour features is explored in [Chen et al, 2001]. An unsupervised approach for the segmentation of images containing homogeneous colour/texture regions has been proposed in [Deng and Manjunath, 2001]. Yet another group of algorithms are the so-called model-based segmentation algorithms. Their central assumption is that the structures of interest have a repetitive form of geometry. These algorithms work well when the segmented image contains the searched-for object, and they are widely used in medicine and radiological image retrieval.

 Presentations of Relative Relationships

Assuming that homogeneous regions or symbolic objects have already been extracted, relative relationships try to model or characterize the spatial relations between them, for instance "object A is under and on the left of object B" [Freeman, 1975].
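Relations of this kind can be sketched with simple predicates over already extracted regions; the following pure-Python example (hypothetical axis-aligned bounding boxes, image origin at the top left) is only an illustration, not a full relational model:

```python
# A toy sketch of relative spatial relations between extracted regions,
# in the spirit of [Freeman, 1975]. Regions are hypothetical bounding
# boxes (x_min, y_min, x_max, y_max); y grows downwards.

def spatial_relations(a, b):
    """Return the set of relations that hold between region a and region b."""
    rels = set()
    if a[2] <= b[0]:
        rels.add("left_of")
    if a[0] >= b[2]:
        rels.add("right_of")
    if a[3] <= b[1]:
        rels.add("above")
    if a[1] >= b[3]:
        rels.add("below")
    if a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2] and a[3] <= b[3]:
        rels.add("inside")
    return rels

# "Object A is under and on the left of object B":
a = (0, 60, 20, 80)   # lower-left region
b = (40, 0, 60, 20)   # upper-right region
print(sorted(spatial_relations(a, b)))  # ['below', 'left_of']
```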
Another convenient way of representing local spatial relations is Delaunay triangulation. This method was introduced by Boris Delaunay in 1934 for the case of Euclidean space, in which the Delaunay triangulation is the dual structure of the Voronoi diagram. Several algorithms can be used for computing a Delaunay triangulation, such as flipping, incremental, gift wrap, divide and conquer, sweep-line and sweep-hull [de Berg et al, 2000].

6.3 MPEG-7 Standard

The Moving Picture Experts Group (MPEG) [ISO/IEC JTC1/SC29 WG11] was formed by ISO in 1988 to set standards for audio and video compression and transmission133. A series of currently widespread standards, such as the audio standard MP3 (MPEG-1 Layer 3, ISO/IEC 11172) and the standards for over-the-air digital television, digital satellite TV services, digital cable television, DVD video and Blu-ray (MPEG-2, ISO/IEC 13818 and MPEG-4, ISO/IEC 14496), are outcomes of this group. In addition to the above standards, the group works on standards for describing content and environment. Here our interest is focused on the MPEG-7 standard ISO/IEC 15938, named "Multimedia Content Description Interface" [ISO/IEC 15938-3], which provides standardized core technologies allowing the description of audiovisual data content in multimedia environments. Audiovisual content that has MPEG-7 descriptions associated with it may include still pictures, graphics, 3D models, audio, speech, video, and composition information about how these elements are combined in a multimedia presentation (scenarios). The MPEG-7 descriptions of visual content are separated into three groups:

 Colour Descriptors: Colour Space, Colour Quantization, Dominant Colours, Scalable Colour, Colour Layout, Colour Structure, and GoF/GoP Colour;

 Texture Descriptors: Homogeneous Texture, Edge Histogram, and Texture Browsing;

 Shape Descriptors: Region Shape, Contour Shape, and Shape 3D.

The colour descriptors used in MPEG-7 are listed below:



 Colour Space: The feature colour space is used in the other colour-based descriptions. In the current version of the standard the following colour models are supported: RGB, YCrCb, HSV, HMMD, linear transformation matrices with reference to RGB, and monochrome;



 Colour Quantization: This descriptor defines a uniform quantization of a colour space. The number of bins which the quantizer produces is configurable, which allows for great flexibility within a wide range of applications. For a meaningful application in the context of MPEG-7, this descriptor has to be combined with Dominant Colour descriptors, e.g. to express the meaning of the values of dominant colours;

133 http://www.chiariglione.org/mpeg




 Dominant Colour(s): This colour descriptor is most suitable for representing local (object or image region) features where a small number of colours is enough to characterize the colour information in the region of interest. Whole images are also applicable – for example, flag images or colour trademark images. Colour quantization is used to extract a small number of representative colours in each region/image, and the percentage of each quantized colour in the region is calculated. A spatial coherency over the entire descriptor is also defined and used in similarity retrieval. The specific form of this descriptor allows a variety of similarity measures to be used: the Earth mover's distance [Wang et al, 2003] is the most convenient for this kind of feature, and other similarity measures are used in [Yang et al, 2008];



 Scalable Colour: This descriptor specifies a colour histogram in the HSV colour space, encoded by a Haar transformation. Its binary representation is scalable in terms of bin numbers and bit representation accuracy over a broad range of data rates. The Scalable Colour descriptor is useful for image-to-image matching and retrieval based on colour features. Retrieval accuracy increases with the number of bits used in the representation. The sum of absolute differences of coefficients (L1 metric) can be used as a distance measure;

 Colour Layout: This descriptor represents the spatial distribution of colour of visual signals in a very compact form. This compactness allows visual signal matching with high retrieval efficiency at very small computational cost. It provides image-to-image matching without dependency on image format, resolution or bit depth. It can be applied both to a whole image and to any connected or unconnected parts of an image with arbitrary shapes. It also enables a very friendly user interface based on hand-drawn sketch queries, since this descriptor captures the layout information of the colour feature; sketch queries are not supported by the other colour descriptors. The Colour Layout descriptor uses the YCbCr colour space with 8-bit quantization. The elements of Colour Layout specify integer arrays that hold a series of zigzag-scanned DCT coefficient values. The DCT coefficients of each colour component are derived from the corresponding component of the local representative colours. Standard L1 or L2 metrics can be used as similarity measures, as well as specific functions which take into account the significance of the order of the coefficients [Herrmann, 2002];


 Colour Structure: This colour feature descriptor captures both colour content (similar to a colour histogram) and information about the structure of this content. Its main functionality is image-to-image matching, and its intended use is still-image retrieval, where an image may consist of either a single rectangular frame or arbitrarily shaped, possibly disconnected, regions. The extraction method embeds colour structure information into the descriptor by taking into account all colours in a structuring element of 8x8 pixels that slides over the image, instead of considering each pixel separately. Instead of characterizing the relative frequency of individual image samples with a particular colour, this descriptor characterizes the relative frequency of structuring elements that contain an image sample with a particular colour. Hence, unlike the colour histogram, it can distinguish between two images in which a given colour is present in identical amounts but where the structure of the groups of pixels having that colour differs. Colour values are represented in the double-coned HMMD colour space, which is quantized non-uniformly into 32, 64, 128 or 256 bins; each bin amplitude value is represented by an 8-bit code. The Colour Structure descriptor thus provides additional functionality and improved similarity-based image retrieval performance for natural images compared to the ordinary colour histogram. Usually the sum of absolute normalized differences of coefficients (L1 metric) is used as a distance;



 GoF/GoP Colour: The Group of Frames/Group of Pictures Colour descriptor extends the Scalable Colour descriptor, defined for a still image, to the colour description of a video segment or a collection of still images. The same similarity/distance measures that are used to compare Scalable Colour descriptions can be employed to compare GoF/GoP Colour descriptors.
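The flavour of a Dominant Colour-style signature and an L1 comparison can be sketched as follows; this is a loose pure-Python illustration, not the normative MPEG-7 extraction, and the pixel data and quantization level are hypothetical:

```python
# A hedged sketch of a dominant-colour-style signature: colours are coarsely
# quantized and the fraction of each quantized colour is recorded; an L1
# distance then compares two signatures.

def colour_signature(pixels, levels=2):
    """Map RGB pixels to quantized colours and their fractional coverage."""
    sig = {}
    for r, g, b in pixels:
        q = (r * levels // 256, g * levels // 256, b * levels // 256)
        sig[q] = sig.get(q, 0) + 1
    return {q: n / len(pixels) for q, n in sig.items()}

def l1_distance(sig_a, sig_b):
    """Sum of absolute differences over the union of quantized colours."""
    keys = set(sig_a) | set(sig_b)
    return sum(abs(sig_a.get(k, 0) - sig_b.get(k, 0)) for k in keys)

reds = [(220, 30, 30)] * 4
blues = [(30, 30, 220)] * 4
print(l1_distance(colour_signature(reds), colour_signature(reds)))   # 0.0
print(l1_distance(colour_signature(reds), colour_signature(blues)))  # 2.0
```

The normative descriptor additionally stores colour variances and a spatial coherency value, and the Earth mover's distance is usually preferred for comparison.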

Of the texture descriptors, we will focus our attention on the following:

 Edge Histogram: This descriptor specifies the spatial distribution of five types of edges in local image regions (four directional edges – vertical, horizontal, 45-degree and 135-degree – and one non-directional) in


each local region, called a sub-image. A sub-image is a part of the original image; the sub-images are defined by dividing the image space into 4x4 non-overlapping blocks, linearized in raster scan order. For each sub-image a local edge histogram with 5 bins is generated; as a result, 16*5=80 histogram bins form the Edge Histogram descriptor array. Each sub-image is further divided into image blocks. The value of each histogram bin is related to the total number of image blocks with the corresponding edge type in the sub-image. These bin values are normalized by the total number of image blocks in the sub-image and are non-linearly quantized using quantization tables defined in the MPEG-7 standard. Any similarity measure for histograms can be used with this descriptor. [Won et al, 2002] suggest an extension of this descriptor that captures not only local edge distribution information but also semi-global and global information;

 Homogeneous Texture: This descriptor characterizes the region texture using the energy and energy deviation in a set of frequency channels, and it is applicable to similarity-based search and retrieval. The frequency space from which the texture features are extracted is partitioned with equal angles of 30 degrees in the angular direction and with an octave division in the radial direction. A 2D Gabor function is applied to the feature channels and, after quantization and coding, the average, standard deviation, energy and energy deviation are extracted.
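A much-simplified sketch of the Edge Histogram idea (not the normative MPEG-7 extraction; the filter coefficients, threshold and 16x16 test image are illustrative) classifies each 2x2 block into one of five edge types and accumulates a 5-bin histogram per 4x4 sub-image, giving 16*5 = 80 bins:

```python
# Simplified edge-histogram sketch: five edge filters over 2x2 blocks,
# one 5-bin histogram per sub-image, 16 sub-images -> 80 bins.

# filter coefficients for (vertical, horizontal, 45-degree, 135-degree,
# non-directional) edges, applied to a 2x2 block (a, b, c, d)
FILTERS = [
    (1, -1, 1, -1),                # vertical
    (1, 1, -1, -1),                # horizontal
    (2 ** 0.5, 0, 0, -2 ** 0.5),   # 45 degree
    (0, 2 ** 0.5, -2 ** 0.5, 0),   # 135 degree
    (2, -2, -2, 2),                # non-directional
]

def edge_histogram(img, threshold=10):
    n = len(img)   # assume a square image with side divisible by 8
    sub = n // 4   # side of one of the 4x4 sub-images
    bins = [0.0] * 80
    for y in range(0, n, 2):
        for x in range(0, n, 2):
            block = (img[y][x], img[y][x + 1], img[y + 1][x], img[y + 1][x + 1])
            responses = [abs(sum(f * v for f, v in zip(flt, block)))
                         for flt in FILTERS]
            best = max(range(5), key=lambda i: responses[i])
            if responses[best] >= threshold:
                sub_idx = (y // sub) * 4 + (x // sub)
                bins[sub_idx * 5 + best] += 1
    blocks_per_sub = (n // 2) ** 2 / 16  # used for normalization
    return [b / blocks_per_sub for b in bins]

# vertical stripes: every 2x2 block sees a strong vertical edge
img = [[0 if x % 2 == 0 else 100 for x in range(16)] for _ in range(16)]
hist = edge_histogram(img)
print(hist[0])  # 1.0 -> all blocks in sub-image 0 classified as vertical
```

The standard additionally quantizes the normalized bins non-linearly; that step is omitted here.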

The main issue with the MPEG-7 standard is that it focuses on the representation of descriptions and their encoding rather than on practical methods for extracting the descriptors. The creation and application of MPEG-7 descriptors are outside the scope of the standard. For example, the description schemes used in MPEG-7 specify complex structures and semantics grouping descriptors and other description schemes, such as segments and regions, which require a segmentation of the visual data. MPEG-7 does not specify how to automatically segment still images and videos into regions and segments; likewise, it does not recommend how to segment objects at the semantic level [Tremeau et al, 2008]. MPEG-7 does not strictly standardize the distance functions to be used, and sometimes does not propose a dissimilarity function at all, leaving developers the flexibility to implement their own dissimilarity/distance functions. A few techniques can be found in the MPEG-7 eXperimentation Model (XM) [MPEG-7:4062, 2001]. Apart from that, there are many general-purpose distances that may be applied in order to simplify some complex distance function or even to improve performance [Eidenberger, 2003]. A large number of successful distance measures


from different areas (statistics, psychology, medicine, social and economic sciences, etc.) can be applied to MPEG-7 data vectors [Dasiapoulou et al, 2007]. MPEG-7 is not aimed at any particular application; the elements it standardizes support as broad a range of applications as possible. The MPEG-7 descriptors are often used in image-to-image matching, similarity search, sketch queries, etc. [Stanchev et al, 2006].

7 Data Reduction

Data reduction techniques can be applied to obtain a reduced representation of the data set that is much smaller in volume, yet closely maintains the integrity of the original data. That is, mining on the reduced data set should be more efficient yet produce the same (or almost the same) analytical results. Strategies for data reduction include dimensionality reduction, where encoding mechanisms are used to reduce the data set size, and numerosity reduction, where the data are replaced or estimated by alternative, smaller data representations. Figure 18 presents a hierarchy of some data reduction techniques.

7.1 Dimensionality Reduction

The "curse of dimensionality", is a term coined by Bellman [Bellman, 1961] to describe the problem caused by the exponential increase in volume associated with adding extra dimensions to a feature space. In image clustering and retrieval applications, the feature vectors tend to use high dimensional data space and in such case to fall into "curse of dimensionality" since the search space grows exponentially with the dimensions. In image databases, the volume of the data is very large and the amount of time needed to access the feature vectors on storage devices usually dominates the time needed for a search. This problem is further complicated when the search is to be performed multiple times and in an interactive environment. Thus high dimensionality of data causes increased time and space complexity, and as a result decreases performance in searching, clustering, and indexing. When the attribute space is not high-dimensional, the standard method is representing the features as points in a feature space and using distance metrics for similarity search. The problem with this method is that with the increasing of data dimension, the maximum and minimum distances to a given query point in the high dimensional space are almost the same under a wide range of distance metrics and data distributions.


All points converge to the same distance from the query point in high dimensions, and the concept of nearest neighbours becomes meaningless.
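This concentration effect is easy to observe experimentally; the following pure-Python sketch (toy uniform data, hypothetical sample sizes) compares the relative contrast (d_max - d_min)/d_min between a low- and a high-dimensional space:

```python
# Distance concentration demo: the relative gap between the farthest and the
# nearest neighbour of a query point shrinks as dimensionality grows.

import random

def relative_contrast(dim, n_points=500, seed=7):
    """(d_max - d_min) / d_min for random uniform points around a random query."""
    rnd = random.Random(seed)
    query = [rnd.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        p = [rnd.random() for _ in range(dim)]
        dists.append(sum((a - b) ** 2 for a, b in zip(p, query)) ** 0.5)
    return (max(dists) - min(dists)) / min(dists)

low = relative_contrast(2)
high = relative_contrast(1000)
print(low > high)  # True: contrast collapses in high dimensions
```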

Figure 18. Data Reduction Techniques

There are two main ways to overcome the curse of dimensionality in image search and retrieval. The first is to search for approximate results of a multimedia query; the second is to reduce the high-dimensional input data to a low-dimensional representation. Dimensionality reduction techniques are based on either feature selection (also named attribute subset selection) or feature extraction methods.

7.1.1 Feature Selection (Attribute Subset Selection)

In feature selection, an appropriate subset of the original features is found to represent the data. This method is useful when the available data are limited in amount but represented with a large number of features [Agrawal, 2009]. It is crucial to determine a small set of relevant variables in order to estimate reliable parameters, and the advantage of selecting a small set of features is that few values need to be used in the calculations. Data sets for analysis may contain attributes which are irrelevant or redundant to the mining task. Leaving out relevant attributes or keeping irrelevant ones may hamper the data mining process. Attribute subset selection reduces the data set size by removing irrelevant or redundant attributes (dimensions). The goal of attribute subset selection is to find a minimum set of attributes such that the resulting probability distribution of the data classes is as close as possible to the original distribution obtained using all attributes. Finding an optimal subset is a hard computational problem; therefore, heuristic methods that explore a reduced search space are commonly used for attribute subset selection. Optimal feature subset selection techniques can be divided into filter, wrapper and hybrid approaches [Gheyas and Smith, 2010].

 Filter Approaches

In filter approaches, features are scored and ranked based on certain statistical criteria, and the features with the highest ranking values are selected. Commonly used filter methods are the t-test, the chi-square test, the Wilcoxon-Mann-Whitney test, mutual information, Pearson correlation coefficients and principal component analysis. Filter methods are fast but lack robustness against interactions among features and feature redundancy. In addition, it is not clear how to determine the cut-off point for the rankings so as to select only truly important features and exclude noise.

 Wrapper Approaches

In wrapper approaches, feature selection is "wrapped" in a learning algorithm.
The learning algorithm is applied to subsets of features and tested on a hold-out set, and prediction accuracy is used to determine the feature set quality. Generally, wrapper methods are more effective than filter methods. Since exhaustive search is not computationally feasible, wrapper methods must employ a designated algorithm to search for an optimal subset of features. Wrapper methods can broadly be classified into two categories based on the search strategy: (1) greedy and (2) randomized/stochastic.
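As an illustration of the filter idea, a minimal sketch that scores every feature by its absolute Pearson correlation with the target and keeps the top-ranked ones (the function name and toy data are ours, not a specific published method):

```python
import numpy as np

def filter_select(X, y, k):
    """Rank features by absolute Pearson correlation with the
    target and keep the k highest-scoring ones (a filter method)."""
    scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                       for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k]          # indices of the top-k features

# Toy data: feature 0 is informative, feature 1 is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3 * X[:, 0] + 0.1 * rng.normal(size=100)
print(filter_select(X, y, 1))                    # -> [0]
```

Note that such a ranking is computed once, independently of any learner, which is what makes filter methods fast but blind to feature interactions.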

130

Access to Digital Cultural Heritage ...

(1) Greedy wrapper methods use less computer time than other wrapper approaches. The two most commonly applied wrapper methods that use a greedy hill-climbing search strategy are:

• Sequential backward selection, in which features are sequentially removed from a full candidate set until the removal of further features increases the criterion;

• Sequential forward selection, in which features are sequentially added to an empty candidate set until the addition of further features does not decrease the criterion.

The problem with sequentially adding or removing features is that the utility of an individual feature is often not apparent on its own, but only in combination with just the right other features. (2) Stochastic algorithms developed for solving large-scale combinatorial problems, such as ant colony optimization, genetic algorithms, particle swarm optimization and simulated annealing, are used as feature subset selection approaches. These algorithms efficiently capture feature redundancy and interaction, but are computationally expensive.

• Hybrid Approaches

The idea behind the hybrid method is that filter methods are first applied to select a feature pool, and then the wrapper method is applied to find the optimal subset of features from this pool. This makes feature selection faster, since the filter method rapidly reduces the effective number of features under consideration [Gheyas and Smith, 2010].
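The greedy sequential forward selection described above can be sketched as follows; hold-out R² of a least-squares fit stands in here for the wrapped learner's prediction accuracy (an illustrative choice of criterion, with function names of our own):

```python
import numpy as np

def r2_holdout(Xs, y):
    """Hold-out R^2 of a least-squares fit: the wrapper's quality criterion."""
    n = len(y) // 2
    w, *_ = np.linalg.lstsq(np.c_[Xs[:n], np.ones(n)], y[:n], rcond=None)
    pred = np.c_[Xs[n:], np.ones(len(y) - n)] @ w
    return 1 - ((y[n:] - pred) ** 2).sum() / ((y[n:] - y[n:].mean()) ** 2).sum()

def sfs(X, y, k):
    """Sequential forward selection: start from an empty candidate set and
    greedily add the feature whose addition most improves the criterion."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < k:
        score, j = max((r2_holdout(X[:, selected + [j]], y), j) for j in remaining)
        if selected and score <= r2_holdout(X[:, selected], y):
            break                              # adding more features does not help
        selected.append(j)
        remaining.remove(j)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                  # feature 1 is pure noise
y = X[:, 0] + 2 * X[:, 2] + 0.05 * rng.normal(size=200)
print(sfs(X, y, k=2))                          # -> [2, 0]
```

Sequential backward selection is the mirror image: start from the full set and repeatedly drop the feature whose removal hurts the criterion least.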

7.1.2 Feature Extraction

In feature extraction, new features are derived from the original features without losing any important information. Feature extraction methods can be divided into linear and non-linear techniques, depending on the choice of objective function. Some of the most popular dimensionality reduction techniques are:

• Projection Pursuit (PP)

Projection pursuit is a method which finds the most "interesting" possible projections of multidimensional data. A good review of projection pursuit can be found in [Huber, 1985]. The projection index defines the "interestingness" of a direction, and the task is to optimize this index. A projection is considered interesting if it has structure in the form of trends, clusters, hyper-surfaces, or anomalies. These structures can be

Chapter 3: Automated Metadata Extraction from Art Images

131

analyzed using manual or automatic methods. The scatter-plot is one such manual method, which can be used to understand data characteristics over two selected dimensions at a time; there are many methods to automate this task.

• Principal Component Analysis (PCA)

One often used and simple projection pursuit method is Principal Component Analysis, which calculates the eigenvalues and eigenvectors of the covariance or correlation matrix and projects the data orthogonally into the space spanned by the eigenvectors belonging to the largest eigenvalues. PCA is also called the discrete Karhunen-Loève method (K-L method), the Hotelling transform, singular value decomposition (SVD), or the empirical orthogonal function (EOF) method. A good tutorial on PCA can be found in [Smith, 2002]. PCA searches for $k$ $n$-dimensional orthogonal vectors that can best be used to represent the data, where $k \leq n$. The original data are thus projected onto a much smaller space, resulting in dimensionality reduction. PCA transforms the data to a new coordinate system such that the first coordinate (also called the first principal component) is the projection of the data exhibiting the greatest variance, the second coordinate (the second principal component) exhibits the second greatest variance, and so on. In this way, the "most important" aspects of the data are retained in the lower-order principal components. PCA is computationally inexpensive, can be applied to ordered and unordered attributes, and can handle sparse and skewed data. Principal components may also be used as inputs to multiple regression and cluster analysis.

• Multidimensional Scaling (MDS)

Multidimensional scaling (MDS) is used to analyze subjective evaluations of pairwise similarities of entities. In general, the goal of the analysis is to detect meaningful underlying dimensions that allow the researcher to explain observed similarities or dissimilarities (distances) between the investigated objects.
In PCA, the similarities between objects are expressed in the covariance or correlation matrix. MDS allows analyzing any kind of similarity or dissimilarity matrix, in addition to correlation matrices. Assume there are $p$ items in $n$-dimensional space and a $p \times p$ matrix of proximity measures; MDS produces a $k$-dimensional representation ($k \leq n$) of the original data items. The distance in the new


$k$-space reflects the proximities in the data: if two items are more similar, this distance will be smaller. The distance measure can be the Euclidean distance, the Manhattan distance, the maximum norm, or another. MDS is typically used to visualize data in two or three dimensions, in order to uncover underlying hidden structure. Any dataset can be perfectly represented using $n-1$ dimensions, where $n$ is the number of items scaled. As the number of dimensions used goes down, the stress must either go up or stay the same. When the chosen dimensionality $k$ is insufficient, non-zero stress values occur, meaning that $k$ dimensions cannot perfectly represent the input data. Of course, it is not necessary that an MDS map has zero stress in order to be useful; a certain amount of distortion is tolerable. Different people have different standards regarding the amount of stress to tolerate, but the rule of thumb is that anything under 0.1 is excellent and anything over 0.15 is unacceptable. Both PCA and MDS are eigenvector methods designed to model linear variability in high dimensional data. In PCA, one computes the linear projections of greatest variance from the top eigenvectors of the data covariance matrix. Classical MDS computes the low dimensional embedding that best preserves pair-wise distances between data points. If these distances correspond to Euclidean distances, the results of metric MDS are equivalent to PCA.

• Locally Linear Embedding (LLE)

Locally Linear Embedding (LLE) is also an eigenvector method that computes low dimensional, neighbourhood preserving embeddings of high dimensional data. LLE attempts to discover nonlinear structure in high dimensional data by exploiting the local symmetries of linear reconstructions. Notably, LLE maps its inputs into a single global coordinate system of lower dimensionality, and its optimizations, though capable of generating highly nonlinear embeddings, do not involve local minima.
Like PCA and MDS, LLE is simple to implement, yet it is capable of generating highly nonlinear embeddings [Saul and Roweis, 2000].

• Spectral Clustering

The main tools for spectral clustering are graph Laplacian matrices. The technique is based on two main steps: first embedding the data points in a space in which clusters are more "obvious" (using the


eigenvectors of a Gram matrix), and then applying an algorithm, such as K-means, to separate the clusters. A good tutorial on spectral clustering can be found in [Luxburg, 2006]. Sometimes called Diffusion Maps or Laplacian Eigenmaps, these manifold-learning techniques are based on a graph-theoretic approach. A graph is built over the data items, incorporating neighbourhood information of the data set. A low dimensional representation of the data set is computed using the Laplacian of the graph, which optimally preserves local neighbourhood information. The vertices, or nodes, represent the data points, and the edges connecting the vertices represent the similarities between adjacent nodes. After representing the graph with a matrix, the spectral properties of this matrix are used to embed the data points into a lower dimensional space and gain insight into the geometry of the dataset. Though these methods perform exceptionally well with clean, well-sampled data, problems arise with the addition of noise, or when multiple sub-manifolds exist in the data.

• Vector Quantization (VQ)

Vector Quantization (VQ) is used to represent not individual values but (usually small) arrays of them. The basic idea of vector quantization is to replace the values from a multidimensional vector space with values from a lower dimensional discrete subspace. A vector quantizer maps $k$-dimensional vectors in the vector space $R^k$ into a finite set of vectors $Y = \{y_i : i = 1, \ldots, n\}$. The vector $y_i$ is called a code vector or a codeword, and the set of all the codewords $Y$ is called a codebook. Unfortunately, designing a codebook that best represents the set of input vectors is NP-hard, and there are different algorithms which try to overcome this problem. A review of vector quantization techniques used for encoding digital images is presented in [Nasrabadi and King, 1988]. VQ can be used for any large data sets in which adjacent data values are related in some way; it has been used in image, video, and audio compression.

• Curvilinear Component Analysis (CCA)

Curvilinear Component Analysis is a self-organized neural network performing two tasks: vector quantization (VQ) of the sub-manifold in the data set (input space), and nonlinear projection (P) of these quantizing vectors toward an output space, providing a revealing unfolding of the sub-manifold. After learning, the network has the ability to continuously map any new point from one space into another: forward


mapping of new points in the input space, or backward mapping of an arbitrary position in the output space [Demartines and Herault, 1997].
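As an illustration of the projection methods above, the PCA recipe (centre the data, take the top eigenvectors of the covariance matrix, project) can be sketched in a few lines of NumPy; the function name and toy data are ours:

```python
import numpy as np

def pca(X, k):
    """Project X onto the k eigenvectors of its covariance matrix
    with the largest eigenvalues (the principal components)."""
    Xc = X - X.mean(axis=0)                  # centre the data
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:k]]
    return Xc @ top                          # n x k projection

rng = np.random.default_rng(1)
# 3-D data that varies mostly along a single direction
X = rng.normal(size=(200, 1)) @ np.array([[3.0, 2.0, 1.0]]) \
    + 0.1 * rng.normal(size=(200, 3))
Z = pca(X, 1)
print(Z.shape)                               # (200, 1)
```

With this construction the single retained component captures almost all of the variance, which is exactly the "most important aspects in lower-order components" property described above.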

7.2 Numerosity Reduction

In numerosity reduction, the data are replaced or estimated by alternative, smaller data representations. These techniques may be parametric or nonparametric. For parametric methods, a model is used to estimate the data, so that typically only the model parameters need to be stored instead of the actual data (outliers may also be stored for a comprehensive data representation). Regression and log-linear models, which estimate discrete multidimensional probability distributions, are two examples. Nonparametric methods for storing reduced representations of the data include histograms, clustering, and sampling. Data discretization is a form of numerosity reduction that is very useful for the automatic generation of concept hierarchies. Discretization and concept hierarchy generation are powerful tools for data mining in that they allow the mining of data at multiple levels of abstraction.

• Regression and Log-Linear Models

Regression and log-linear models can be used to approximate the given data. They are typical examples of parametric methods [Han and Kamber, 2006]. In simple linear regression, the data are modelled to fit a straight line. A random variable $y$ (called a response variable) can be modelled as a linear function of another random variable $x$ (called a predictor variable) with the equation $y = wx + b$, where the variance of $y$ is assumed to be constant. In the context of data mining, $x$ and $y$ are both numerical attributes. The coefficients $w$ and $b$ (called regression coefficients) specify the slope of the line and the $y$-intercept, respectively. These coefficients can be solved for by the method of least squares, which minimizes the error between the actual line separating the data and the estimate of the line. Multiple linear regression is an extension of simple linear regression which allows a response variable $y$ to be modelled as a linear function of two or more predictor variables. Log-linear models approximate discrete multidimensional probability distributions using logarithmic transformations. Given a set of tuples in $n$ dimensions (e.g., described by $n$ attributes), we can consider each tuple as a point in an $n$-dimensional space. Log-linear models can be used to


estimate the probability of each point in a multidimensional space for a set of discretized attributes, based on a smaller subset of dimensional combinations. This allows a higher-dimensional data space to be constructed from lower dimensional spaces. Log-linear models are therefore also useful for dimensionality reduction (since the lower-dimensional points together typically occupy less space than the original data points) and data smoothing (since aggregate estimates in the lower-dimensional space are less subject to sampling variations than the estimates in the higher-dimensional space). Regression and log-linear models can both be used on sparse data, although their application may be limited. While both methods can handle skewed data, regression does so exceptionally well. Regression can be computationally intensive when applied to high dimensional data, whereas log-linear models show good scalability for up to 10 or so dimensions.

• Discrete Wavelet Transforms

The Discrete Wavelet Transform (DWT) is a linear signal processing technique that transforms an input vector into another vector of the same length whose elements are wavelet coefficients. A wavelet is a mathematical function used to divide a given function into different scale components, and a wavelet transform is the representation of a function by wavelets. The wavelets are scaled and translated copies (known as "daughter wavelets") of a finite-length or fast-decaying oscillating waveform (known as the "mother wavelet"). The first DWT was invented by the Hungarian mathematician Alfred Haar in 1909. The most commonly used set of discrete wavelet transforms was formulated by the Belgian mathematician Ingrid Daubechies in 1988; the Haar wavelet is the first member of the family of Daubechies wavelets. The Daubechies wavelets are a family of orthogonal wavelets defining a discrete wavelet transform and characterized by a maximal number of vanishing moments for some given support.
With each wavelet type of this class, there is a scaling function (also called the father wavelet) which generates an orthogonal multiresolution analysis [Daubechies, 1988].

• Histograms

Histograms use binning to approximate data distributions and are a popular form of data reduction [Han and Kamber, 2006]. A histogram for an attribute partitions the data distribution of the attribute into disjoint subsets, or buckets. There are several partitioning rules, including Equal-width (where the width of each bucket range is uniform), Equal-frequency


(where each bucket contains roughly the same number of contiguous data samples), V-Optimal (the histogram with the least variance) and MaxDiff (where a bucket boundary is established between each of the pairs of adjacent values having the $b-1$ largest differences, where $b$ is a user-specified number of buckets). V-Optimal and MaxDiff histograms tend to be the most accurate and practical. Histograms are highly effective at approximating both sparse and dense data, as well as highly skewed and uniform data.

• Clustering

In data reduction, the cluster representations of the data are used to replace the actual data. The effectiveness of this technique depends on the nature of the data: it is much more effective for data that can be organized into distinct clusters than for smeared data. In database systems, multidimensional index trees are primarily used for providing fast data access. They can also be used for hierarchical data reduction, providing a multi-resolution clustering of the data which can be used to provide approximate answers to queries. An index tree can store aggregate and detail data at varying levels of abstraction. It provides a hierarchy of clusterings of the data set, where each cluster has a label that holds for the data contained in the cluster. If we consider each child of a parent node as a bucket, then an index tree can be considered a hierarchical histogram. The use of multidimensional index trees as a form of data reduction relies on an ordering of the attribute values in each dimension. Multidimensional index trees include R-trees, quad-trees, and their variations. Special cases of clustering are data discretization techniques, which can be used to reduce the number of values for a given continuous attribute by dividing the range of the attribute into intervals. Interval labels can then be used to replace actual data values. Replacing numerous values of a continuous attribute by a small number of interval labels thereby reduces and simplifies the original data.
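The equal-width binning idea shared by histograms and discretization can be sketched in a few lines of NumPy (an illustrative sketch; the function name is ours):

```python
import numpy as np

def equal_width_bins(values, b):
    """Replace continuous values with one of b interval labels
    (equal-width discretization, as used by Equal-width histograms)."""
    lo, hi = values.min(), values.max()
    edges = np.linspace(lo, hi, b + 1)
    # Compare against the b-1 inner edges; labels run from 0 to b-1.
    return np.clip(np.digitize(values, edges[1:-1], right=True), 0, b - 1)

x = np.array([0.1, 0.4, 2.5, 4.9, 5.0, 9.8])
print(equal_width_bins(x, 5))    # -> [0 0 1 2 2 4]
```

Each original value is thus replaced by a small interval label, which is exactly the data reduction effect described above.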
From the point of view of using class information in the discretization process, the methods are supervised or unsupervised. Supervised discretizators usually fall into the following categories: if the process starts by finding one or a few points (called split points or cut points) to split the entire attribute range, and then repeats this recursively on the resulting intervals, it is called top-down discretization or splitting. In contrast, bottom-up discretization or merging starts by considering all continuous values as potential split points, removes some by merging neighbourhood values to form intervals, and then recursively applies this process to the resulting intervals. Discretization can be performed recursively on an attribute to provide a hierarchical or multi-resolution partitioning of the attribute


values, known as a concept hierarchy. Concept hierarchies are useful for mining at multiple levels of abstraction. We have made a brief overview of discretization techniques in [Mitov et al, 2009b].

• Sampling

Sampling allows a large data set to be represented by a much smaller random sample (or subset) of the data. According to [Han and Kamber, 2006], the most common forms of sampling for data reduction are:

• Simple random sample without replacement (SRSWOR), where all tuples are equally likely to be sampled;

• Simple random sample with replacement (SRSWR), where after a tuple is drawn, it is placed back into the primary set, so that it may be drawn again;

• Cluster sample, where all tuples are grouped into mutually disjoint "clusters", and then a simple random sample of the clusters is obtained;

• Stratified sample, where the source set is divided into mutually disjoint parts called strata, and a stratified sample is generated by obtaining a simple random sample from each stratum. This helps ensure a representative sample, especially when the data are skewed.

An advantage of sampling is that the cost of obtaining a sample is proportional to the size of the sample. When applied to data reduction, sampling is most commonly used to estimate the answer to an aggregate query.
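Three of the sampling schemes above can be sketched with the standard library (an illustrative sketch; the stratum boundaries and sizes are arbitrary):

```python
import random

data = list(range(100))
random.seed(42)

# SRSWOR: each tuple can be drawn at most once
srswor = random.sample(data, 10)

# SRSWR: a drawn tuple is placed back, so it may be drawn again
srswr = [random.choice(data) for _ in range(10)]

# Stratified sample: a simple random sample within each stratum
strata = {"low": [d for d in data if d < 50],
          "high": [d for d in data if d >= 50]}
stratified = [random.choice(s) for s in strata.values() for _ in range(5)]

print(len(srswor), len(srswr), len(stratified))   # 10 10 10
```

A cluster sample would instead group the tuples into disjoint clusters first and draw whole clusters at random.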

8 Indexing

The second component of CBIR systems is indexing. Efficient indexing is critical for the building and functioning of very large text-based databases and search engines. Research on efficient ways to index images by content has been largely overshadowed by research on efficient visual representation and similarity measures. In [Markov et al, 2008] we provide an expanded survey of different spatial access methods, based on the earlier analyses of [Ooi et al, 1993] and [Gaede and Günther, 1998]. The access methods are classified in several categories: one-dimensional; multidimensional spatial; metric; and high dimensional access methods (Figure 19). The article [Markov et al, 2008] includes a more detailed description of the interconnections between access methods, as well as references to the sources where these methods are described. Here we provide a summary of the methods.


Figure 19. Taxonomy of the Access Methods

Multidimensional Spatial Access Methods are developed to serve information about spatial objects, approximated with points, segments, polygons, polyhedrons, etc. From the point of view of spatial databases, they can be split into two main classes – Point Access Methods and Spatial Access Methods [Gaede and Günther, 1998]. Point Access Methods are used for organizing multidimensional point objects. A typical instance is traditional records, where one dimension corresponds to each attribute of the relation. These methods can be clustered in three basic groups: (1) Multidimensional Hashing; (2) Hierarchical Access Methods; (3) Space Filling Curves for Point Data. Spatial Access Methods are used to work with objects which have arbitrary form. The main idea of the spatial indexing of non-point objects is to approximate the geometry of the examined objects with simpler forms. The most commonly used approximation is the Minimum


Bounding Rectangle (MBR), i.e. the minimal rectangle whose sides are parallel to the coordinate axes and which completely encloses the object. Approaches exist for approximation with Minimum Bounding Spheres or other polytopes, as well as their combinations. The usual problem when one operates with spatial objects is their overlapping, and there are different techniques to avoid this problem. From the point of view of the techniques for organization of the spatial objects, Spatial Access Methods form four main groups: (1) Transformation – this technique transforms spatial objects to points in a space with more or fewer dimensions. Most methods spread out the space using space filling curves and then apply some point access method to the transformed data set; (2) Overlapping Regions – here the data set is separated into groups; different groups can occupy the same part of the space, but every spatial object is associated with only one of the groups. The access methods of this category operate with data in their primary space (without any transformations), possibly in overlapping segments; (3) Clipping – this technique clips one object into several sub-objects. The main goal is to escape overlapping regions, but this advantage can lead to tearing of the objects, higher resource expenses and decreased productivity of the method; (4) Multiple Layers – this technique is a variant of the Overlapping Regions technique, because the regions from different layers can also overlap. However, there are some important differences: first, the layers are organized hierarchically; second, every layer splits the primary space in a different way; third, the regions of one layer never overlap; fourth, the data regions are separated from the space extensions of the objects. Metric Access Methods deal with relative distances of data points to chosen points, named anchor points, vantage points or pivots [Moënne-Loccoz, 2005].
These methods are designed to limit the number of distance computations by first calculating distances to the anchors, and then searching for the target point in the narrowed region. They are preferred when the distance is computationally expensive, as e.g. the dynamic time warping distance between time series. Metric Access Methods are also employed to accelerate the processing of similarity queries, such as range and k-nearest neighbour queries [Chavez et al, 2001]. High Dimensional Access Methods are created to overcome the bottleneck problem which appears as dimensionality increases. These methods are based on data approximation and query approximation in a sequential scan. For query approximation two strategies can be used: (1) examine only the part of the database which is more likely to contain the resulting set – as a rule these methods are based


on clustering of the database; (2) split the database into several spaces with fewer dimensions and search in each of them, using Random Lines Projection or Locality Sensitive Hashing.
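The pivot idea behind Metric Access Methods can be illustrated with a minimal range query that uses the triangle inequality to skip distance computations (a sketch of the principle, not any particular published index; names are ours):

```python
import math

def euclid(a, b):
    return math.dist(a, b)

def range_query(points, pivot_dists, pivot, q, r):
    """Pivot-based filtering: the triangle inequality gives
    |d(q, pivot) - d(x, pivot)| <= d(q, x), so any point whose
    precomputed pivot distance differs from d(q, pivot) by more
    than r can be discarded without computing d(q, x)."""
    dq = euclid(q, pivot)
    out, computed = [], 0
    for x, dx in zip(points, pivot_dists):
        if abs(dq - dx) > r:
            continue                    # pruned: no distance computation needed
        computed += 1
        if euclid(q, x) <= r:
            out.append(x)
    return out, computed

points = [(i, 0.0) for i in range(100)]
pivot = (0.0, 0.0)
pd = [euclid(p, pivot) for p in points]   # precomputed at index build time
res, computed = range_query(points, pd, pivot, q=(10.0, 0.0), r=2.0)
print(len(res), computed)                 # 5 5
```

Here only 5 of the 100 candidate distances are actually computed; real metric indexes such as vantage-point trees organize many pivots hierarchically to the same effect.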

9 Retrieval Process

The third component of CBIR systems is served by retrieval engines. The retrieval engines build the bridge between the internal space of the system and the user making requests which need to be satisfied. On the system side, the design of these engines is closely connected with the chosen feature representation and indexing schemes, as well as the selected similarity metrics. From the user's point of view, in order to take into account the subjectivity of human perception and to bridge the gap between high-level concepts and low-level features, relevance feedback has been proposed to enhance retrieval performance. The other direction for facilitating that process is examining image retrieval in the more general frame of the multimedia retrieval process, where content-based, model-based, and text-based searching can be combined.

9.1 Similarity

In the process of image retrieval, choosing the features as well as indexing the database are closely connected with the similarity measures used for establishing nearness between queries and images, or between the images in a given digital resource in the process of categorization. The concept of similarity is very complex and is almost a whole scientific area in itself. Similarity measures aim to answer how close one object is to another. In the process of obtaining the similarity between two images, several processes of finding similarity on different levels and data types need to be resolved. An image signature is a weighted set of feature vectors; in the case of a region-based signature, each region is represented by such a vector. A natural approach to defining a region-based similarity measure is to match every two corresponding vectors and then to combine the distances between these vectors into a distance between the sets of vectors. For every level, different similarities may be used. Figure 20 presents the similarity measures used in CBIR for different feature types.


Figure 20. Different kinds of similarity measures

9.1.1 Distance-Based Similarity Measures

The most popular similarity measures are distance measures. They can be applied in any area which meets the conditions of a metric space: self-similarity, minimality, symmetry and the triangle inequality 134. In mathematical notation, the distance $d(X,Y)$ between two vectors $X$ and $Y$ is a function for which $d(X,Y) \geq 0$; if $d(X,Y) = 0$ then $X = Y$; and $d(X,Y) = d(Y,X)$. Fulfilment of the triangle inequality $d(X,Y) \leq d(X,Z) + d(Z,Y)$ defines the distance $d(X,Y)$ as a metric. Replacing the triangle inequality with the stronger condition $d(X,Y) \leq \max\{d(X,Z), d(Z,Y)\}$ defines an ultra-metric, which plays an important role in hierarchical cluster analysis. [Perner, 2003] shows a classification of some distance metrics, explaining their interconnections. For two vectors $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$ we can use different metrics:

134 http://www.britannica.com/EBchecked/topic/378781/metric-space


• The $L_p$-metric, also called the Minkowski metric, is defined by the formula $d_{L_p}(X,Y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$, where the choice of the parameter $p$ depends on the importance of the differences in the summation;

• The $L_1$-metric, also known as the rectilinear, taxi-cab, city-block or Manhattan metric, is obtained for $p = 1$: $d_{L_1}(X,Y) = \sum_{i=1}^{n} |x_i - y_i|$. This measure, however, is insensitive to outliers, since big and small differences are equally treated;

• In the case of $p = 2$ the resulting spaces are the so-called Hilbert spaces. The most popular of them are the Euclidean spaces, where the distance is calculated as $d_{Euclidean}(X,Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$. This metric gives special emphasis to big differences in the observations and is invariant to translations and orthogonal linear transformations (rotation and reflection). In image retrieval the weighted Euclidean distance is also used [Wang et al, 2001];

• In the case $p \to \infty$ the $L_\infty$-metric, also called the Chebyshev or Max Norm metric, is obtained: $d_{Chebyshev}(X,Y) = \max_{i=1}^{n} |x_i - y_i|$. This measure is useful if only the maximal distance between two variables among a set of variables is of importance, whereas the other distances do not contribute to the overall similarity.

In image matching the Hausdorff measure is often used. The Hausdorff distance measures how far two subsets of a metric space are from each other; it turns the set of non-empty compact subsets of a metric space into a metric space in its own right. Informally, two sets are close in the Hausdorff distance if every point of either set is close to some point of the other set. The Hausdorff distance is the longest distance you can be forced to travel by an adversary who chooses a point in one of the two sets, from where you then must travel to the other set. The Hausdorff distance is symmetricized by computing in addition the distance with the roles of $X$ and $Y$ reversed and choosing the larger of the two distances:

$d_{Hausdorff}(X,Y) = \max\left( \max_{i=1}^{n} \min_{j=1}^{n} d(x_i, y_j), \; \max_{j=1}^{n} \min_{i=1}^{n} d(y_j, x_i) \right)$.

Often techniques used in text matching are applied in image retrieval too. One such example is cosine similarity, a measure of similarity between two vectors of $n$ dimensions obtained by finding the cosine of


the angle between them, often used to compare documents in text mining. Given two vectors of attributes $X$ and $Y$, the cosine similarity is represented using a dot product and magnitudes as:

$d_{cosine}(X,Y) = \frac{X \cdot Y}{\|X\| \, \|Y\|} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2} \, \sqrt{\sum_{i=1}^{n} y_i^2}}$.
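The vector distance measures above are straightforward to implement; a small illustrative sketch (function names are ours):

```python
import math

def minkowski(x, y, p):
    """L_p (Minkowski) distance; p=1 is city-block, p=2 is Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def chebyshev(x, y):
    """The limit p -> infinity: the max-norm distance."""
    return max(abs(a - b) for a, b in zip(x, y))

def cosine_sim(x, y):
    """Cosine of the angle between x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.hypot(*x) * math.hypot(*y))

x, y = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(minkowski(x, y, 1))                # 7.0  (city-block)
print(minkowski(x, y, 2))                # 5.0  (Euclidean)
print(chebyshev(x, y))                   # 4.0  (max norm)
print(cosine_sim(x, (2.0, 4.0, 6.0)))    # close to 1.0 (parallel vectors)
```

Note that cosine similarity, unlike the $L_p$ metrics, is insensitive to vector magnitude: scaling either vector leaves the result unchanged.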

9.1.2 Distance Measures for Categorical Data

Categorical data are of two types – ordered and nominal. The analysis of symbolic data has led to a new branch of Data Analysis called Symbolic Data Analysis (SDA) [Esposito et al, 2002]. The degree of dissimilarity can be defined by assigning levels of dissimilarity to all the different combinations of attribute values. The mapping can be made to the discrete space [0, 1] or to more complex discrete linear spaces. Special distance coefficients have been designed for nominal attributes. The basis for the calculation of these coefficients is a contingency table whose columns and rows correspond to the status "not present" (0) or "present" (1) of a property. The cells of the table contain the frequencies of observations for which neither object has the property ($N_{00}$), only one object has it ($N_{01}$ or $N_{10}$), or both share it ($N_{11}$). Distance coefficients for nominal data can then be calculated as variants of the generalized formula [Nieddu and Rizzi, 2003]:

$$s = \frac{N_{11} + t N_{00}}{N_{11} + v(N_{10} + N_{01}) + w N_{00}}, \qquad t \in \{0,1\},\ v \in \{0,1,2\},\ w \in \{0,1\}.$$

Several well-known coefficients are special cases of this formula, such as the Jaccard coefficient ($t=0, v=1, w=0$), the Russel-Rao coefficient ($t=0, v=1, w=1$), the Sokal-Sneath coefficient ($t=0, v=2, w=0$), the Sokal-Michener coefficient ($t=1, v=1, w=1$), the Roger-Tanimoto coefficient ($t=1, v=2, w=1$), etc. Other similarity measures that do not fit in this class are the arithmetic and geometric means of the quantities $N_{11}/(N_{11}+N_{10})$ and $N_{11}/(N_{11}+N_{01})$, which represent the proportions of agreement in the marginal distributions:

$$\frac{1}{2}\left(\frac{N_{11}}{N_{11}+N_{10}} + \frac{N_{11}}{N_{11}+N_{01}}\right) \ \text{(Kulczynski)} \quad \text{and} \quad \frac{N_{11}}{\sqrt{(N_{11}+N_{10})(N_{11}+N_{01})}} \ \text{(Ochiai-Driver-Kroeber)}.$$
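The generalized formula and its special cases can be sketched as follows, with the contingency counts computed from two binary attribute vectors (an illustrative sketch; the parameter names follow the formula above):

```python
def binary_similarity(x, y, t, v, w):
    """Generalized similarity for two binary (0/1) vectors, following the
    contingency-table formula (N11 + t*N00) / (N11 + v*(N10 + N01) + w*N00)."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    return (n11 + t * n00) / (n11 + v * (n10 + n01) + w * n00)

x = [1, 1, 0, 0, 1]
y = [1, 0, 0, 1, 1]
print(binary_similarity(x, y, t=0, v=1, w=0))  # Jaccard: 2/(2+2) = 0.5
print(binary_similarity(x, y, t=1, v=1, w=1))  # Sokal-Michener: 3/5 = 0.6
```

Choosing different $(t, v, w)$ triples merely changes how the "both absent" and "mismatch" cells are weighted, which is why so many named coefficients fall out of one formula.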


More sophisticated similarity measures, concerning recent data mining techniques, take into account the distribution of combinations of the examined attributes, as presented in [Boriah et al, 2008].

9.1.3 Probability Distance Measures

The disadvantage of the metric measures is that they require independence of the attributes. A high correlation between attributes can be seen as a multiple measurement of one underlying attribute, which means the measures described above give such a feature more weight than an uncorrelated attribute would receive. Some examples of distances that account for this are shown below.

The Mahalanobis distance (or "generalized squared interpoint distance") takes into account the covariance matrix $S$ of the attributes and is defined as:

$$d(X,Y) = (X-Y)^{T} S^{-1} (X-Y).$$

The most familiar measure of dependence between two quantities is Pearson's correlation, obtained by dividing the covariance of the two variables $\mathrm{cov}(X,Y)$ by the product of their standard deviations $\sigma_X$ and $\sigma_Y$:

$$\rho(X,Y) = \frac{\mathrm{cov}(X,Y)}{\sigma_X \sigma_Y}.$$
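For instance, Pearson's correlation follows directly from the definition above (a plain-Python sketch; in practice a statistics library would be used):

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0: perfectly correlated
print(pearson([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0: perfectly anti-correlated
```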

Some similarity measures are defined in statistics. Chi-square is a quantitative measure used to determine whether a relationship exists between two categorical variables. Other similarities come from information theory. One example is the Kullback-Leibler distance, also called information divergence, information gain, or relative entropy. It is defined for discrete distributions of the compared objects $X$ and $Y$, which have probability functions $x_i$ and $y_i$:

$$d(X,Y) = \sum_{i=1}^{n} x_i \log_2 \left( \frac{x_i}{y_i} \right).$$

Although the information divergence is not a true metric, because $d(X,Y) \neq d(Y,X)$, it satisfies many useful properties and is used to measure the disparity between distributions.

An example of a probability-measure-based approach is the Earth Mover's Distance (EMD) [Rubner et al, 1998]. EMD is a measure which can be used for signatures in the form of sets of vectors. The concept was first introduced by Gaspard Monge in 1781. It is a mathematical measure of the distance between two distributions. Informally, if the distributions are interpreted as two different ways of piling up a certain amount of dirt over a region, the EMD is the minimum cost of turning one pile into the other. The cost is assumed to be the amount of dirt moved times the distance


by which it is moved. A typical signature consists of a list of pairs $((x_1, m_1), \ldots, (x_n, m_n))$, where each $x_i$ is a certain "feature" (e.g., colour, luminance, etc.), and $m_i$ is its "mass" (how many times that feature occurs). Alternatively, $x_i$ may be the centroid of a data cluster, and $m_i$ the number of entities in that cluster. To compare two such signatures with the EMD, one must define a distance between features, which is interpreted as the cost of turning a unit mass of one feature into a unit mass of the other. The EMD between two signatures is then the minimum cost of turning one of them into the other. EMD can be computed by solving an instance of the transportation problem using the so-called Hungarian algorithm [Kuhn, 1955]. The EMD is widely used to compute distances between the colour histograms of two digital images. The same technique is used for any other quantitative pixel attribute, such as luminance, gradient, etc. Several attempts have focused on proposing fast algorithms for calculating EMD. For instance, a fast algorithm for angular histograms, which give a good representation of hue or gradient distributions, is suggested in [Cha et al, 1999].
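Two of the probability-based measures above are easy to sketch in plain Python: the Kullback-Leibler divergence, and the special case of the EMD for one-dimensional histograms of equal total mass, where the EMD reduces to the sum of absolute differences of the cumulative histograms (assuming unit ground distance between adjacent bins), so no transportation solver is needed:

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence (base 2) between discrete distributions.
    Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def emd_1d(p, q):
    """EMD between two 1-D histograms of equal total mass with unit distance
    between adjacent bins: the sum of |cumulative differences|."""
    total, cumulative = 0.0, 0.0
    for pi, qi in zip(p, q):
        cumulative += pi - qi
        total += abs(cumulative)
    return total

print(kl_divergence([0.5, 0.5], [0.9, 0.1]))     # positive; note KL is not symmetric
print(emd_1d([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 1.0: all mass moves one bin
```

The asymmetry mentioned in the text is visible here: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ. For general multi-dimensional signatures, the EMD still requires solving the transportation problem.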

9.1.4 Structural Similarity Measures

Structural similarity is involved in a variety of pattern recognition problems when considered from an abstract perspective. The abstraction refers to measurements and observations whose specifics are ignored. One class of such problems is encountered in image processing, where a set of features or objects with topological interrelations is detected in several scenes. Whenever these are presumed to be similar according to position, proximity or other criteria, the degree of similarity is of interest. Structures are represented throughout by labelled graphs such as image graphs. In image graphs, vertices represent image edges, corners or regions of interest such as regions of constant intensity or homogeneous texture. Graph edges represent relations such as neighbourhoods or concept hierarchies. Edge labels represent distances, degrees of association, and the like. A special branch of measures deals with similarities in graph theory; examples of such measures are given in [Dehmer et al, 2006]. Most classical methods are based on isomorphism and sub-graph relations. For large graphs, these measures face the complexity of the sub-graph isomorphism problem. Other measures are based on graph transformations. The graph edit distance is defined as the minimum cost of the transformations (deletions, substitutions, insertions) of vertices and edges needed to transform one graph into another. The idea of finding an underlying graph grammar to which both graphs belong is used to define some further measures. The application of such measures is very complex, because the underlying grammar is difficult to define. Graph kernels take the structure of the graph into account. They work by counting the number of common random walks between two graphs. Even though the number of common random walks could potentially be exponential, polynomial-time algorithms exist for computing these kernels. From graph theory, Geodesic distances are often used in image retrieval to measure the similarities between images. The geodesic distance is defined as the shortest path between two vertices of a graph.
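As an illustration of the last point, the geodesic distance on an unweighted graph can be computed with a plain breadth-first search. The sketch below uses a toy adjacency list rather than an actual image graph:

```python
from collections import deque

def geodesic_distance(adjacency, start, goal):
    """Shortest-path (geodesic) distance in an unweighted graph via BFS.
    `adjacency` maps each vertex to a list of its neighbours."""
    if start == goal:
        return 0
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        vertex, dist = frontier.popleft()
        for neighbour in adjacency[vertex]:
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return None  # goal unreachable from start

# Toy graph: a chain a - b - c - d.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(geodesic_distance(graph, "a", "d"))  # 3
```

On a weighted image graph (e.g., edge labels encoding feature dissimilarity), Dijkstra's algorithm would replace the BFS.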

9.2 Techniques for Improving Image Retrieval

A single similarity measure is not sufficient to produce a robust, perceptually meaningful ranking of images, and the results achieved with the classical content-based approaches are often unsatisfactory for the user. As an alternative, learning-based techniques such as clustering and classification are used for speeding up image retrieval, improving accuracy, or performing automatic image annotation. Including relevance feedback in this process allows the user to refine the query-specific semantics. Bridging the gaps between the primitive feature levels produced in classic CBIR systems and the higher levels convenient for the user can be done by examining the image retrieval process within the more general frame of the multimedia retrieval process. This requires integrating multimedia semantics-based searching with other search techniques (speech, text, metadata, audio-visual features, etc.) and combining content-based, model-based, and text-based searching. It can be done in two main ways – creating the semantic space through statistical pattern recognition and machine learning techniques, or creating semantic concepts which can be incorporated into an already built semantic space (for instance, existing ontologies describing the interconnections between the examined concepts). Several techniques in these directions are used.

 Unsupervised Clustering

Unsupervised clustering techniques are a natural fit when handling large, unstructured image repositories such as the Web. Clustering methods fall roughly into three main types: pair-wise-distance-based, optimization of an overall clustering quality measure, and statistical modelling. The pair-wise distance-based methods (e.g., linkage clustering and spectral graph partitioning) are of general applicability, since the mathematical representation of the instances becomes irrelevant. One disadvantage is the high computational cost. Clustering based on the


optimization of an overall measure of clustering quality is a fundamental approach used in pattern recognition. The general idea in statistical modelling is to treat every cluster as a pattern characterized by a relatively restrictive distribution, so that the overall dataset is a mixture of these distributions. For continuous vector data, the most commonly used distribution of individual vectors is the Gaussian distribution.

 Image Categorization (Classification)

Image categorization (classification) is advantageous when the image database is well specified and labelled training samples are available. Classification methods can be divided into two major branches: discriminative and generative modelling approaches. In discriminative modelling, classification boundaries or posterior probabilities of classes are estimated directly, as in, for example, Support Vector Machines (SVM) and decision trees. In generative modelling, the density of data within each class is estimated and the Bayes formula is then used to compute the posterior. Discriminative modelling approaches are more direct when optimizing classification boundaries. On the other hand, generative modelling approaches are easier to combine with prior knowledge and can be used more conveniently when there are many classes.
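To make the generative branch concrete, the toy sketch below fits a one-dimensional Gaussian per class and applies the Bayes formula to classify a new value. The feature and class names are invented for illustration and do not come from the chapter:

```python
import math

def fit_gaussian(samples):
    """Estimate mean and variance of a 1-D class-conditional Gaussian."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, var

def density(x, mean, var):
    """Gaussian probability density at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def classify(x, models, priors):
    """Bayes rule: posterior is proportional to prior times class density."""
    scores = {c: priors[c] * density(x, *models[c]) for c in models}
    return max(scores, key=scores.get)

# Hypothetical training data: one feature (say, mean luminance) per class.
models = {"dark": fit_gaussian([0.1, 0.2, 0.15]),
          "bright": fit_gaussian([0.8, 0.9, 0.85])}
priors = {"dark": 0.5, "bright": 0.5}
print(classify(0.18, models, priors))  # "dark"
print(classify(0.75, models, priors))  # "bright"
```

A discriminative method such as an SVM would instead estimate the decision boundary between the two classes directly, without modelling either density.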

10 Conclusion

As in other cultural heritage domains, digital art images require methods to resolve art-specific issues and to experiment with and implement approaches for involving the users without compromising the trustworthiness of the resources. We believe that the areas which will develop with priority in the very near future are:

 Further refining of specialized image retrieval techniques, seeking both to improve the quality of the analysis and to overcome the semantic gap;

 Defining best practices in involving the users (individual users as well as communities of users);

 Sustaining the trustworthiness of the resources when social media tools are used to add user-generated content;

 Improving not only the information delivery but also the user experience, and expanding the delivery of information with immersive technologies.

The ultimate goal is to facilitate access to art objects in digital form and to turn it into fun and a great experience.


Bibliography [Agarwal, 2009] Agarwal, A.: Web 3.0 concepts explained in plain English. 30.05.2009. http://www.labnol.org/internet/web-3-concepts-explained/8908/ [Bellman, 1961] Bellman, R.: Adaptive Control Processes: a Guided Tour. Princeton University Press, 1961. [Best, 2006] Best, D.: Web 2.0 Next big thing or next big Internet bubble? Lecture Web Information Systems. Technische Universiteit Eindhoven, 2006. [Boriah et al, 2008] Boriah, S., Chandola, V., Kumar, V.: Similarity Measures for Categorical Data: A Comparative Evaluation, In Proc. of 2008 SIAM Data Mining Conf., 2008, Atlanta, pp. 243-254. [Burford et al, 2003] Burford, B., Briggs, P., Eakins, J.: A taxonomy of the image: on the classification of content for image retrieval. Visual Communication, 2/2, 2003, pp. 123-161. [Carson et al, 2002] Carson, C., Belongie, S., Greenspan, H., Malik J.: Blobworld: image segmentation using expectation-maximization and its application to image querying. IEEE Trans. on Pattern Analysis and Machine Intelligence, 24/8, 2002, pp. 1026-1038. [Castelli and Bergman, 2002] Castelli, V., Bergman, L. (eds.): Image Databases: Search and Retrieval of Digital Imagery, John Wiley & Sons, 2002. [Cha et al, 1999] Cha, S.-H., Shin, Y.-C., Srihari, S.: Algorithm for the Edit Distance between Angular Type Histograms. Technical report, St.Univ. of New York at Buffalo, 1999. [Chavez et al, 2001] Chavez, E., Navarro, G., Baeza-Yates, R., Marroquin, J.: Searching in metric spaces. ACM Computing Surveys, 33/3, 2001, pp. 273-321. [Chen et al, 2001] Chen, Y., Zhou, X., Huang T.: One-class SVM for learning in image retrieval. Proc. IEEE Int. Conf. on Image Processing, vol. 1, 2001, pp. 34-37. [Chen et al, 2005] Chen, C.-C., Wactlar, H., Wang, J., Kiernan, K.: Digital imagery for significant cultural and historical materials – An emerging research field bridging people, culture, and technologies. Int. J. Digital Libraries, 5/4, 2005, pp. 275–286. 
[Colombo et al, 1999] Colombo, C., Del Bimbo, A., Pala, P.: Semantics in visual information retrieval. IEEE Trans. on Multimedia 6, 3, 1999, pp. 38-53. [Comaniciu and Meer, 1999] Comaniciu, D., Meer, P.: Mean shift analysis and applications. 7th Int. Conf. on Computer Vision, Kerkyra, Greece, 1999, pp. 1197-1203. [Croft, 1995] Croft, W.: What Do People Want from Information Retrieval? (The Top 10 Research Issues for Companies that Use and Sell IR Systems). Center for Intelligent Information Retrieval Computer Science Department, University of Massachusetts, Amherst, 1995. [Crucianu et al, 2004] Crucianu, M., Ferecatu, M., Boujemaa, N.: Relevance feedback for image retrieval: a short survey. State of the Art in Audiovisual Content-Based Retrieval, Information Universal Access and Interaction Including Data Models and Languages (DELOS2 Report), 2004. [Dasiapoulou et al, 2007] Dasiapoulou, S., Spyrou, E., Kompatsiaris, Y., Avrithis, Y., Stintzis, M.: Semantic processing of colour images. Colour Image Processing: Methods and Applications, Ch. 11, CRC Press, Boca Raton, USA, 2007, pp. 259-284.


[Datta et al, 2008] Datta, R., Joshi, D., Li, J., Wang, J.: Image retrieval: ideas, influences, and trends of the new age. ACM Computing Surveys, 40/2/5, 2008, 60 p. [Datta, 2009] Datta, R.: Semantics and Aesthetic Inference for Image Search: Statistical Learning Approaches. PhD thesis, the Pennsylvania State University, 2009. [Daubechies, 1988] Daubechies, I.: Orthonormal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics. 41/7, 1988, pp.909-996. [de Berg et al, 2000] de Berg, M., van Kreveld, M., Overmars, M., Schwarzkopf O.: Computational Geometry: Algorithms and Applications (2nd revised ed.), SpringerVerlag, 2000. [Dehmer et al, 2006] Dehmer, M., Emmert-Streib, F., Wolkenhauer, O.: Perspectives of graph mining techniques. Rostocker Informatik Berichte, 30/2, 2006, pp. 47-57. [Demartines and Herault, 1997] Demartines, P., Herault, J.: Curvilinear component analysis: a self-organizing neural network for nonlinear mapping of data sets. IEEE Trans. Neural Networks, 8/1, 1997, pp. 148-154. [Deng and Manjunath, 2001] Deng, Y., Manjunath, B.: Unsupervised segmentation of colour-texture regions in images and video. IEEE Trans. Pattern Analysis and Machine Intelligence, 23/8, 2001, pp. 800-810. [Dobreva and Chowdhury, 2010] Dobreva, M., Chowdhury, S.: A User-Centric Evaluation of the Europeana Digital Library. In: ICADL 2010, LNCS 6102, 2010, pp. 148-157. [Eakins and Graham, 1999] Eakins, J., Graham, M.: Content-based Image Retrieval. University of Northumbria at Newcastle. Report: 39, JSC Technology Application Programme, 1999. [Eidenberger, 2003] Eidenberger, H.: Distance measures for MPEG-7-based retrieval. Fifth ACM SIGMM Int. Workshop on Multimedia Information Retrieval, 2003, pp. 130-137. [Enser et al, 2006] Enser, P., Sandom, Ch., Lewis, P., Hare, J.: The reality of the semantic gap in image retrieval. Tutorial, 1st Int. Conf. on Semantic and Digital Media Technologies, Athens, Greece, 2006. 
http://eprints.ecs.soton.ac.uk/13272/ [Esposito et al, 2002] Esposito, F., Malebra, D., Tamma, V., Bock H.: Classical resemblance measures, Analysis of Symbolic Data, Springer, 2002, pp. 139-152. [Estrada, 2005] Estrada, F.: Advances in Computational Image Segmentation and Perceptual Grouping. PhD Thesis, Graduate Department of Computer Science, University of Toronto, 2005. [Freeman, 1975] Freeman, J.: The modelling of spatial relations. Computer Graphics and Image Processing, 4/2, 1975, pp. 156-171. [Gaede and Günther, 1998] Gaede, V., Günther, O.: Multidimensional access methods. ACM Computing Surveys, 30/2, 1998, pp. 170-231. [George, 2008] George, C.: User-Centred Library Websites. Usability Evaluation Methods. Chandos publishing, 2008. [Gheyas and Smith, 2010] Gheyas, I., Smith, L.: Feature subset selection in large dimensionality domains. Elsevier, Pattern Recognition, 43/1, 2010, pp. 5-13. [Gong et al, 1996] Gong, Y., Chuan, C., Xiaoyi, G.: Image indexing and retrieval using colour histograms. Multimedia Tools and Applications, vol. 2, 1996, pp. 133-156. [Grosky et al, 2008] Grosky, W., Agrawal, R., Fotouchi, F.: Mind the gaps – finding the appropriate dimensional representation for semantic retrieval of multimedia assets. In Semantic Multimedia and Ontologies, Springer London, 2008, pp. 229-252.


[Gruber, 1993] Gruber, T.: A translation approach to portable ontologies. Knowledge Acquisition, 5/2, 1993, pp. 199-220. [Han and Kamber, 2006] Han, J., Kamber, M.: Data Mining: Concepts and Techniques, Second ed., Morgan Kaufmann Publishers, 2006. [Herrmann, 2002] Herrmann, S.: MPEG-7 Reference Software. Munich University of Technology. http://www.lis.e-technik.tu-muenchen.de/research/bv/topics/mmdb/e_mpeg7.html [Huber, 1985] Huber, P.: Projection pursuit. The Annals of Statistics, 13/2, 1985, pp. 435-475. [Hung et al, 2007] Hung, Sh.-H., Chen, P.-H., Hong, J.-Sh., Cruz-Lara, S.: Context-based image retrieval: A case study in background image access for multimedia presentations. IADIS Int. Conf. WWW/Internet 2007, Vila Real, Portugal, INRIA00192463, ver.1, 2007, 5 p., http://hal.inria.fr/inria-00192463/en/ [Hurtut, 2010] Hurtut, T.: 2D Artistic Images Analysis, a Content-based Survey. http://hal.archives-ouvertes.fr/hal-00459401_v1/ [ISO/IEC 15938-3] International Standard ISO/IEC 15938-3 Multimedia Content Description Interface – Part 3: Visual, http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber= 34230 [ISO/IEC JTC 1/SC29 WG11] WG11: The Moving Picture Experts Group "Coding of Moving Pictures and Audio" http://www.itscj.ipsj.or.jp/sc29/29w12911.htm [Itten, 1961] Itten, J.: The Art of Colour: the Subjective Experience and Objective Rationale of Colour, Reinhold Publishing Corporation of New York, 1961. [Ivanova and Stanchev, 2009] Ivanova, K., Stanchev, P.: Colour harmonies and contrasts search in art image collections. First Int. Conf. on Advances in Multimedia (MMEDIA), Colmar, France, 2009, pp. 180-187. [Ivanova et al, 2010/Euromed] Ivanova, K., Dobreva, M., Stanchev, P., Vanhoof K.: Discovery and use of art images on the web: an overview. Proc. of the Third Int. Euro-Mediterranean Conf. EuroMed, Lemesos, Cyprus, Archaeolingua, 2010, pp. 205-211. 
[Ivanova et al, 2010/MCIS] Ivanova, K., Stanchev, P., Vanhoof, K., Ein-Dor, Ph.: Semantic and abstraction content of art images. Proc. of Fifth Mediterranean Conf. on Information Systems, Tel Aviv, Israel, 2010, AIS Electronic Library, paper 42, http://aisel.aisnet.org/mcis2010/42. [Jaimes and Chang, 2002] Jaimes, A., Chang, S.-F.: Concepts and techniques for indexing visual semantics. Image Databases: Search and Retrieval of Digital Imagery. John Wiley & Sons, 2002, pp. 497-565. [Kato, 1992] Kato, T.: Database architecture for content-based image retrieval. Proc. of the SPIE – The International Society for Optical Engineering, San Jose, CA, USA, vol. 1662, 1992, pp. 112-113. [Kuhn, 1955] Kuhn, H.: The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, vol. 2, 1955, pp. 83-97. [Luxburg, 2006] Luxburg, U.: A Tutorial on Spectral Clustering. Tech. Report No. TR-149, Max Planck Institute for Biological Cybernetics, 2006. [Maitre et al, 2001] Maitre, H. Schmitt, F. Lahanier, C.: 15 years of image processing and the fine arts. Proc. of Int. Conf. on Image Processing, vol. 1, 2001, pp. 557-561.


[Marchenko et al, 2007] Marchenko, Y., Chua, T., Jain R.: Ontology-based annotation of paintings using transductive inference framework. Advances in Multimedia Modeling (MMM 2007), Springer, LNCS 4351, Part I, pp. 13-23. [Markov et al, 2008] Markov, K., Ivanova, K., Mitov, I., Karastanev, S.: Advance of the access methods. Int. Journal Information Technologies and Knowledge, 2/2, 2008, pp. 123-135. [Mattison, 2004] Mattison, D.: Looking for good art. Searcher – The Magazine for Database Professionals, vol. 12, 2004, part I – number 8, pp. 12-35; part II – number 9, pp. 8-19; part III – number 10, pp. 21-32. [Mitov et al, 2009b] Mitov, I., Ivanova, K., Markov, K., Velychko, V., Stanchev, P., Vanhoof, K.: Comparison of discretization methods for preprocessing data for pyramidal growing network classification method. In IBS ICS – Book No: 14. New Trends in Intelligent Technologies, Sofia, 2009, pp. 31-39. [Moënne-Loccoz, 2005] Moënne-Loccoz, N.: High-Dimensional Access Methods for Efficient Similarity Queries. Tech. Report No: 0505, University of Geneva, Computer Vision and Multimedia Laboratory, 2005. [MPEG-7:4062, 2001] MPEG-7, Visual experimentation model (xm) version 10.0. ISO/IEC/ JTC1/SC29/WG11, Doc. N4062, 2001. [Nasrabadi and King, 1988] Nasrabadi, N., King, R.: Image coding using vector quantization: a review. IEEE Trans. on Communications, 36/8, 1988, pp. 957-971. [Nieddu and Rizzi, 2003] Nieddu, L., Rizzi, A.: Proximity Measures in Symbolic Data Analysis. Statistica, 63/2, 2003, pp. 195-212. [Ooi et al, 1993] Ooi, B., Sacks-Davis, R., Han, J.: Indexing in Spatial Databases. Tech. Report. 1993. [Pavlov et al, 2010] Pavlov, R., Paneva-Marinova, D., Goynov, M., Pavlova-Draganova, L.: Services for Content Creation and Presentation in an Iconographical Digital Library. Serdica J. of Computing, 4/2, 2010, pp.279-292. 
[Pavlova-Draganova et al, 2010] Pavlova-Draganova, L., Paneva-Marinova, D., Pavlov, R., Goynov, M.: On the Wider Accessibility of the Valuable Phenomena of the Orthodox Iconography through a Digital Library. Third Int. Euro-Mediterranean Conf. EuroMed, Lemesos, Cyprus, 2010, Archaeolingua, pp. 173-178. [Perner, 2003] Perner, P.: Data Mining on Multimedia Data. Springer-Verlag NY, 2003. [Pickett et al, 2000] Pickett, J. (ed.): The American Heritage Dictionary of the English Language. Houghton Mifflin Co., 2000. [Raymond, 1999] Raymond, E.: The Cathedral & the Bazaar. O'Reilly, 1999. http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ [Rubner et al, 1998] Rubner, Y., Tomasi, C., Guibas, L.: A metric for distributions with applications to image databases. In IEEE Int. Conf. on Computer Vision, 1998, pp. 59-66. [Saul and Roweis, 2000] Saul, L., Roweis, S.: An Introduction to Locally Linear Embedding. Tech. Report, AT&T Labs and Gatsby Computational Neuroscience Unit, 2000. [Shi and Malik, 2000] Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22/8, 2000, pp. 888-905. [Smeulders et al, 2000] Smeulders, A., Worring, M., Santini, S., Gupta, A., Jain, R.: Content based image retrieval at the end of the early years. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22/12, 2000, pp. 1349-1380.


[Smith, 2002] Smith, L.: A Tutorial on Principal Components Analysis, 2002. http://csnet.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf [Snoek et al, 2005] Snoek, C., Worring, M., Smeulders, A.: Early versus late fusion in semantic video analysis. In Proc. of the 13th Annual ACM Int. Conf. on Multimedia, 2005, pp. 399-402. [Stanchev et al, 2006] Stanchev, P., Green Jr., D., Dimitrov, B.: Some issues in the art image database systems. Journal of Digital Information Management, 4/4, 2006, pp. 227-232. [Stork, 2008] Stork, D.: Computer image analysis of paintings and drawings: An introduction to the literature. Proc. of the Image processing for Artist Identification Workshop, van Gogh Museum, Amsterdam, The Netherlands, 2008. [Striker and Dimai, 1997] Striker, M., Dimai, A.: Spectral covariance and fuzzy regions for image indexing. Machine Vision and Applications, vol. 10, 1997, pp. 66-73. [Tremeau et al, 2008] Tremeau, A., Tominaga, S., Plataniotis, K.: Colour in image and video processing: most recent trends and future research directions. Journal Image Video Processing, vol. 1, 2008, pp. 1-26. [Tu and Zhu, 2002] Tu, Z., Zhu, S.: Image segmentation by data-driven Markov chain Monte Carlo. IEEE Trans. Pattern Analysis and Machine Intelligence, 24/5, 2002, pp. 657-673. [Wang et al, 2001] Wang, J., Li, J., Wiederhold, G.: SIMPLIcity: Semantics-sensitive integrated matching for picture libraries. IEEE Trans. on Pattern Analysis and Machine Intelligence, 23/9, 2001, pp. 947-963. [Wang et al, 2003] Wang, S., Chia, L., Deepu, R.: Efficient image retrieval using MPEG-7 descriptors. Int. Conf. on Image Processing, 2003, pp. 509-512. [Wang et al, 2006] Wang, J., Boujemaa, N., del Bimbo, A., Geman, D., Hauptmann, A., Tesic, J.: Diversity in multimedia information retrieval research. Proc. of the ACM SIGMM Int. Workshop on Multimedia Information Retrieval (MIR) at the Int. Conf. on Multimedia, 2006, pp. 5-12. 
[Wei-ning et al, 2006] Wei-ning, W., Ying-lin, Y., Sheng-ming, J.: Image retrieval by emotional semantics: a study of emotional space and feature extraction. IEEE Int. Conf. on Systems, Man and Cybernetics, vol. 4, 2006, pp. 3534-3539. [WIDWISAWN, 2008] Special issue on Web 2.0. vol. 6 n. 1. http://widwisawn.cdlr.strath.ac.uk/issues/vol6/issue6_1_1.html [Won et al, 2002] Won, Ch., Park, D., Park, S.: Efficient use of MPEG-7 edge histogram descriptor. ETRI Journal, 24/1, 2002, pp. 23-30. [Yang et al, 2005] Yang, N., Dong, M., Fotouhi, F.: Semantic feedback for interactive image retrieval. Proc. of the 12th ACM Int. Conf. on Multimedia, 2005, pp. 415-418. [Yang et al, 2008] Yang, N., Chang, W., Kuo, C., Li, T.: A fast MPEG-7 dominant colour extraction with new similarity measure for image retrieval, Journal of Visual Communication and Image Representation, 19/2, 2008, pp. 92-105. [Zhou and Huang, 2003] Zhou, X., Huang, T.: Relevance feedback in image retrieval: a comprehensive review. Multimedia Systems, 8/6, 2003, pp. 536-544.

Chapter 4: APICAS – Content-Based Image Retrieval in Art Image Collections Utilizing Colour Semantics

Krassimira Ivanova, Peter Stanchev, Koen Vanhoof, Milena Dobreva

1 Colour – Physiology and Psychology

Of all the senses that connect us to the world – vision, hearing, taste, smell, and touch – vision is the most important. More than 80% of our sensory experiences are visual [Holtschue, 2006]. When the brain receives a light stimulus, it first recognizes shapes and objects and separates the objects from their surrounding environment. Figure-ground separation, or pattern recognition, is the first cognitive step in the process of perception. In this process colour plays an important but secondary role. Colour responses are tied more strongly to human emotions than to intellectual judgement. Even this property on its own illustrates why colours have such a powerful influence on human perception. The presence of one or more colours in different proportions conveys different messages, which can enhance or suppress the perception of the observed objects. Jointly with shape, colour is one of the fundamental building blocks of visual symbols. It is also closely associated with mental and emotional states, and can affect them profoundly [O'Connel et al, 2009]. Colours play a major role in the field of image retrieval. Within this context it is not the colour itself but the perception of colours and colour combinations as similar or dissimilar that is crucial when one has to extract images by some criterion related to the level of emotional


perception, or to search for the specifics of the expressiveness of the artist. All these tasks fall within the already discussed abstraction aspects of image content. For instance, different painting techniques reflect technical abstractions, while the use of colour combinations as particular expressive means is grounded in emotional abstractions. Here we make an overview of the existing qualitative descriptions of these phenomena, which we will use later to propose an appropriate transition to a quantitative formal description of successful colour combinations already defined by artists and art researchers.

The nature of colour is a subject of study in various sciences. Physics studies the electromagnetic structure of light waves, physiology is interested in the perception of light waves as colours, psychology explores the problems of colour perception and its impact on the mind, and mathematics constructs techniques for structuring colour spaces and their measurement. It appears that the basic laws for establishing colour harmony, the different ways of using contrast, and the relation of colour components to other means of art such as line, plasticity, light, etc. – laws which in the theory and practice of painting were born intuitively – have scientific explanations in different disciplines (which does not mean, of course, that creating a masterpiece is a simple process of blindly following schemes).

The problems of colour can be examined from several aspects. The physicist studies the nature of the electromagnetic energy vibrations and particles involved in the phenomenon of light, the several origins of colour phenomena such as the prismatic dispersion of white light, and the problems of pigmentation. He investigates mixtures of chromatic light, spectra of the elements, and the frequencies and wavelengths of coloured light rays. Measurement and classification of colours are also topics of physical research.
The chemist studies the molecular structure of dyes and pigments, problems of colour fastness, vehicles, as well as the preparation of synthetic dyes. Colour chemistry today embraces an extraordinarily wide field of industrial research and production.

The physiologist investigates the various effects of light and colours on our visual apparatus – eye and brain – and their anatomical relationships and functions. Research on light and dark adaptation and on chromatic colour vision occupies an important place. The phenomenon of after-images is another physiological topic.

The psychologist is interested in the influence of colour radiation on the human mind and spirit. Colour symbolism, and the subjective perception and discrimination of colours, are important psychological problems. Expressive colour effects – what Goethe called the ethico-aesthetic values of colours – likewise fall within the psychologist's research [Itten, 1961]. Cultural studies and semiology are both concerned with the meaning and


interpretation of colours in different cultures. Engineering investigates the best ways of generating high-quality colours in different devices – from television sets to small portable devices.

The artist, finally, is interested in colour effects from their aesthetic aspect, and needs both physiological and psychological information. The discovery of relationships, mediated by the eye and brain, between colour agents and colour effects in man is a major concern of the artist. Visual, mental and spiritual phenomena are multiply interrelated in the realm of colour and the colour art [Itten, 1961].

1.1 Physiological Grounds of Colour Perception

From the physical point of view, colour is the part of the electromagnetic spectrum with a wavelength from 380 nm to 780 nm (usually rounded to between 400 nm and 700 nm). The colour of an object depends both on the physics of the object in its environment and on the characteristics of the perceiving eye and brain. Two complementary theories of colour vision are the trichromatic theory and the opponent process theory.

 Trichromatic Theory

Of great importance for the development of colour theory was Newton's discovery in 1666 that white light is a mix of all colours of the spectrum. In 1801 Thomas Young suggested the hypothesis that mixing only three primary colours can produce all colours. Later Hermann von Helmholtz elaborated on this theory with the assumption that the retina of the human eye has receptors responding to the three primary colours and that all colours are obtained by mixing these three colours with different intensities. The trichromatic theory was confirmed experimentally in 1960, when three types of receptors were identified in the retina, preferentially sensitive to red, green and blue light waves.

 Opponent Theory

The idea of the opponent theory emerged in the studies of Leonardo da Vinci around 1500. Similar views were expressed by Arthur Schopenhauer, but the first integral presentation of this theory was proposed in the works of Ewald Hering in 1872 [Hering, 1964]. The theory suggests that there are three channels opposing each other: red against green, blue against yellow, and black against white (the latter channel is achromatic and carries information about variations of lightness). Responses to one colour of an opponent channel are antagonistic to those to the other colour. To put it another way, there are certain pairs of colours one never sees together at the same place and at the same time. One does not see reddish greens or yellowish blues, but does see yellowish greens,

156

Access to Digital Cultural Heritage ...

bluish reds, yellowish reds, etc. One practical example of this theory is the so-called after-image phenomenon: if one looks at a unique red patch for about a minute and then switches the gaze to a homogeneous white area, one sees a greenish patch on the white area. In other words, the after-image produces colours that, in combination with the first colour, are neutral. The opponent theory was confirmed in the 1950s, when opposing colour signals were found in the optical connections between the eye and brain. At that time a pair of visual scientists working at Eastman Kodak conceived a method for quantitatively measuring the opponent process responses: Leo Hurvich and Dorothea Jameson invented the hue cancellation method to psychophysically evaluate the opponent processing nature of colour vision [Hurvich and Jameson, 1957]. Modern theories combine these two theories: the process starts with light entering the eye, which stimulates the trichromatic cones in the retina, and is further processed into three opponent signals on their way to the brain. More recent developments include the Retinex theory, proposed by Edwin Land. Experiments show that people have a considerable amount of colour constancy (i.e. colours are perceived the same even under different illumination) [Gevers, 2001].

• Colour Perception

Colour perception is not an independent process and is influenced by the conditions in which it takes place. On the one hand, physical interference of waves leads to the perception of two colours as another one, which was used by the impressionists after the invention of new techniques of laying paints. On the other hand, the perception of colour provokes mutual induction of the nerve processes; according to Pavlov, the law of mutual induction of nerve processes is one of the fundamental laws of nerve physiology [Raychev, 2005].
Mutual induction in the perception of colour leads to a change in the perception of a given colour, depending on stimuli in another part of the retina (simultaneous contrast) or on stimuli applied earlier to the same spot of the retina (consecutive contrast). Contrasting colour changes, resulting from the simultaneous operation of different colours, can be analysed through the three main features characterizing colour – hue, brightness and saturation. Exceptions are achromatic and monochromatic images, which use only the contrast of brightness. The perception of colour depends largely on the background. Under its influence, colours are seen in other tints and shades. The perception of achromatic colours placed among chromatic ones also changes. Gray colour on a red background is perceived with a

Chapter 4: APICAS – CBIR in Art Image Collections Utilizing Colour Semantics

157

greenish hue, on a yellow background – with a bluish one, on a green – with a pinkish one, on a blue – with a yellowish one; i.e. the hue of the object acquires the hue complementary to the background colour. The same principles are valid for consecutive contrast, but in this case they refer not to the background and the object, but to preceding visual stimuli, which affect the next colour laid in the same position of sight. This is a result of colour fatigue of the eye. The result is an after-image, which differs depending on the duration of the preceding visual stimulus as well as on its colour composition.

1.2

Image Harmonies and Contrasts

Contrasts are experienced when we establish differences between two observed effects. When these differences reach maximum values we speak of diametrical contrast. Our senses perceive only through comparison. For instance, an object is perceived as short when it is near a long object, and vice versa. In a similar way colour effects become stronger or weaker through contrasts. Multiple scholars have observed and examined the influence of colours on each other. Aristotle in his "Meteorologica" formulated questions about the different appearance of violet on white or on black wool [Gage, 1993]. In 1772 – the same year that Johann Heinrich Lambert constructed his colour pyramid and demonstrated for the first time that the completeness of colours can only be reproduced within a three-dimensional system [Spillmann, 1992] – another colour circle was published in Vienna by Ignaz Schiffermüller. He was one of the first to arrange the complementary colours opposite each other: blue opposite orange; yellow opposite violet; red opposite green [Gage, 1993]. Leonardo da Vinci noticed that colours, when observed adjacent to each other, influence one another's perception. Goethe, however, was the first to specifically draw attention to these associated contrasts. Johann Wolfgang von Goethe, in his book Theory of Colours, published in 1810, studied the emotional and psychological influence of colours. His six-hue spectrum of colours remains the standard for artists even nowadays [Birren, 1981]. Michel Eugène Chevreul (1786-1889) contributed to the study of contrast by establishing the law of simultaneous contrast in 1839 [Gage, 1993]. When colours interact, they are capable of changing in appearance, depending on particular relationships with adjacent or surrounding colours. Simultaneous contrast is strongly tied to the phenomenon of afterimage, also known as successive contrast, when the eye spontaneously generates the complementary colour even when the hue is


absent. The explanation of successive contrast is given by the opponent colour vision theory. Successive and simultaneous contrasts suggest that the human eye is satisfied, or in equilibrium, only when the complementary colour relation is established. Research on the mutual influences of colours manifested strongly in the work of Georges Seurat, who suggested the optical fusion theory, also called Pointillism or Illusionism. The theory behind this optical mixture was set out as early as the 2nd century by Ptolemy, who identified two ways of achieving optical fusion: one by distance, where "the angle of vision formed by rays of light from the very small patches of colour was too small for them to be identified separately by the eye, hence many points of different colours seemed together to be the same colour" [Gage, 1993]; the other related to after-images and moving objects. This theory underlies the new painting technique first shown by Seurat in his painting "Sunday Afternoon on the Island of La Grande Jatte" in 1886. He called this phenomenon "Chromoluminarisme" or "Peinture Optique". The Pointillist technique consists of "placing a quantity of small dots of two colours very near each other, and allowing them to be blended by the eye at the proper distance" [Birren, 1981]. Adolf Hoelzel suggested seven contrast groups, based on his own understanding of the colour wheels. Every contrast marks some quality of colour perception. His contrasts are: (1) Contrast of the hue; (2) Light-Dark; (3) Cold-Warm; (4) Complementary; (5) Gloss-Matte; (6) Much-Little; (7) Colour-Achromatic [Gage, 1993]. A great contribution to revealing the effects of colour interactions was made by Josef Albers (1888-1976). His book "The Interaction of Colour" [Albers, 1963] became quintessential to understanding colour relationships and human perception.
Albers stated that one colour could have many "readings", dependent both on lighting and the context in which it is placed. He felt that the comprehension of colour relationships and interactions was the key to gaining an eye for colour. According to Albers, we rarely see a colour that is not affected by other colours. Even when a colour is placed against a pure neutral of black, white, or gray, it is influenced by that neutral ground. Colours interact and are modified in appearance by other colours in accordance with three guiding rules: Light/dark value contrast, Complementary reaction, and Subtraction. Johannes Itten (1888-1967) expanded the theories of Hoelzel and Albers. He defined and identified strategies for successful colour combinations [Itten, 1961]. Through his research he devised seven methodologies for coordinating colours utilizing the hue's contrasting properties. These contrasts add other variations with respect to the


intensity of the respective hues; i.e. contrasts may be obtained at light, moderate, or dark values. He defined the following types of contrasts:

• Contrast of hue: the contrast is formed by the juxtaposition of different hues. The greater the distance between hues on a colour wheel, the greater the contrast;

• Light-dark contrast: the contrast is formed by the juxtaposition of light and dark values. This could be a monochromatic composition;

• Cold-warm contrast: the contrast is formed by the juxtaposition of hues considered "warm" or "cold";

• Complementary contrast: the contrast is formed by the juxtaposition of colour wheel or perceptual opposites;

• Simultaneous contrast: the contrast is formed when the boundaries between colours perceptually vibrate. Some interesting illusions are accomplished with this contrast;

• Contrast of saturation: the contrast is formed by the juxtaposition of light and dark values and their relative saturation;

• Contrast of extension (also known as the contrast of proportion): the contrast is formed by assigning proportional field sizes in relation to the visual weight of a colour.

1.3

Psychological Colour Aspects

The impact of colour on people depends on many factors, where physical laws and physiology are only the beginning. Psychological perception plays an important role in this process; it is influenced by the particular psychological state on the one hand, and by the socio-cultural environment in which the character of a person is formed on the other. The perception of colour engages the whole emotional and mental identity of the observer – his or her intelligence, memory, ideology, ethics, aesthetic feelings and other sensations. These feelings, as well as the philosophical, religious and other aspects of the categories of colour perception, are essential to its nature and create the relative symbolic aspects of colour impression. Using colour as a symbol dates back to antiquity. Simple natural feelings caused by colour gradually became canonized in systems of secular or religious symbols. Thus a deep stratification of religious, social, historical, moral, ethical, psychological, etc. symbolism occurs, which sometimes makes it impossible to detect the primary affective value of a particular colour. Heraldry is a typical example of such a formal system, in which every colour and composition conditionally acquired some symbolic meaning [Raychev, 2005]. Such an orderly system


of colour symbols is built into the liturgical systems of the Catholic, Orthodox and Protestant churches, in which each liturgical colour carries some message and importance and may be used only under certain circumstances. White, for example, symbolizes innocence, purity and joy. Red symbolizes fire and blood, sacrifice and martyrdom. Green brings hope and life. Purple is associated with relaxation, contemplation and repentance. Pink marks moments of joy during periods of penance and fasting. Black is associated with sorrow and sadness. Gold is allowed during holidays and is associated with praise and high spirits [Goldhammer, 1981]. Such symbolic colour systems have developed in almost all nations. For example, in China five colours – white, black, blue, yellow and red – symbolize certain concepts and attributes of the objects of the surrounding world, such as the cardinal directions, seasons, weather events, taste, character features and others [Raychev, 2005]. There are general principles of the psychological elements of colour perception, regardless of the sources from which the conditional symbolic systems in different cultures were formed. All phenomena in the field of psychological perception of colour which cannot be explained as a direct result of the visual impression of colour can be explained by way of association. Associations are built between factors which coexist constantly or frequently. For instance, the binding of green with the concept of hope is the result of the constantly repeated relationship between green plants and the hope for a future good harvest. Associations can be strictly individual – for example, a relationship between blue and the mother can arise only for a person whose mother is blue-eyed or wears blue most of the time – but they can by no means become common associations. The adopted secular and religious symbolism of colour is reflected in art because of its social function.
For example, there is no reason to be surprised that green tones are missing in West European Renaissance works of art before Constable (1776-1837). The academic style of painting at that time rendered greenery in brown tones, with a scarce presence of pure green. Red robes as a symbol of the martyrdom of Jesus seemed natural and in line with religious symbolism in El Greco's painting "The Disrobing of Christ" (1583).

2 Art Image Analyzing Systems

In the last 20 years, numerous research and development efforts addressed the image retrieval problem, adopting the similarity-based paradigm. The technical report made by Remco Veltkamp and Mirela Tanase in 2000 still remains a precise and comprehensive review of


industrial systems developed from the beginning of the work on CBIR until the end of the previous century [Veltkamp and Tanase, 2000]. While earlier developments focused on extracting visual signatures, recent ones mostly address efficient pre-processing of visual data as a means to improve the performance of neural networks and other learning algorithms in content-based classification tasks. Given the high dimensionality and redundancy of visual data, the primary goal of pre-processing is to transfer the original data to a low-dimensional representation that preserves the information relevant for the classification. The performance of such techniques is assessed on a difficult painting-classification task that requires painter-specific features to be retained in the low-dimensional representation. Most existing specialized systems focus on artwork analysis techniques over digital imagery, e.g. virtual restoration of artworks (inpainting, faded colour enhancement, crack removal, etc.). Other issues addressed are aspects requiring specific technical expertise (authentication, cracks, and art forgeries). However, our primary area of interest here is the content-based retrieval of artworks in databases, as well as studies of specific artists (colour palette statistics, creative processes, etc.) and art history investigation. The emotional and aesthetic charge is inextricably bound up with the other components that create the whole presentation of an artwork; hence, emotion-based image retrieval also has its place in the review. Although all CBIR systems have, generally speaking, a common task, they differ in their goals, depending on the specific needs for which they were created.
We examined the following systems: QBIC [Flickner et al, 1995], the PICASSO system [Del Bimbo and Pala, 1997], the pictorial portrait database of miniatures of the Austrian National Library [Sablatnig et al, 1998], a painting classification system [Keren, 2002], an art historian system [Icoglu et al, 2004], a lightweight image retrieval system for paintings [Lombardi et al, 2005], Collage [Ward et al, 2005], brushwork identification [Marchenko et al, 2006], M4ART [Broek et al, 2006], ACQUINE [Datta et al, 2006], MECOCO [Berezhnoy et al, 2007], and MARVEL [Natsev et al, 2007]. Some systems (such as QBIC, Collage, M4ART, etc.) are industrial systems which support the complete digital object lifecycle in the overall process of image retrieval. Such systems have to support a wide range of functions, from fetching the data from a repository through feature extraction, creating and keeping metadata, and building appropriate indexing


techniques for easy access, to providing a friendly user interface and accelerating relevance feedback. Others are experimental systems, aimed at studying the ability of certain features and/or algorithmic techniques to enhance image retrieval or to solve a specific classification task. Such systems do not claim to be comprehensive, but being at the frontier of the research they are focused on studying a particular concept or technique and are therefore of definite interest. Returning to our goal – studying ways of closing different kinds of gaps – in Figure 21 we show our vision of how some of the reviewed systems connect with the taxonomy of art image content. Here we do not dwell on the features and algorithms used within the analysis made by the systems; we focus on the resulting features or concepts produced by them. For instance, all systems in the rectangle engage with the field of artistic practice studies and art history investigation, notwithstanding that they use different kinds of visual primitives.

Figure 21. The systems and their connection with the taxonomy of art image content

In the figure we also incorporate our vision of which aspects will be addressed by our proposal, APICAS. APICAS is intended mostly to study the ability of colour analysis to cover mainly the abstraction gap. The analysis we have already made shows that colour plays a significant role in all three parts of the abstraction space. Colour always brings some symbolism, generated by the cultural environment. On the other hand, each artist builds his or her own colour vision, which expresses his/her


emotional and aesthetic feelings. Technical aspects are also constrained by their time. For instance, the new direction in Impressionist landscape painting, studying the influence of air and sunlight, arose only after the industrial production of paints in tubes started; it allowed artists to work easily outside their studios. Scientific efforts are gradually progressing, and step by step we can immerse into the processes of resolving the problems of classifying paintings of different styles or of different painters. New search paradigms such as mental image search with visual words enhance the user's expression of a visual target without a starting example. Alternatively, image fingerprinting, which deals with extracting unique image identifiers that are robust to image deformations (cropping, resizing, illumination changes, rotations, etc.), might be used along with query-by-example techniques to partially deal with this task. This area is still in the infancy of its research and development.

3 Proposed Features

We propose a set of visual features aiming to represent the human perception of colours. We try to formalize the qualitative achievements of Itten's theory of successful colour combinations. Itten's investigation of the "subjective timbre" [Itten, 1961] shows that the existence of laws for combining colours does not restrict the variety of colour combinations used by artists; these combinations are key to the identification of an individual's natural mode of thinking, being and doing. We use three different categories of visual features to represent the image (Figure 22).

Figure 22. Proposed visual features

The first class of features is a group of several global colour low-level attributes, which represent colour distribution in the images. The analysis


of the distribution of colour features in art images is made in order to tune the similarity measure functions. The second category is based on an attempt to formulate high-level features which represent colour harmonies and contrasts, based on the three main characteristics of colour that are closest to human perception – hue, saturation and lightness. Functions for automatic feature extraction from digital images, based on the defined low-level global colour distribution features, are defined. The third method for obtaining visual features consists of observing the tiles of the images from a chosen learning set. MPEG-7 descriptors for these tiles are extracted, and for each descriptor a set of centroids is defined after clustering. New images are split in the same way, and the calculated MPEG-7 descriptor of each tile is attached to the closest centroid from the corresponding cluster set. In this way we overcome the complexity of the MPEG-7 descriptors, which give a good presentation of different types of visual features but need specific processing and cannot be fed directly into generic classification algorithms.
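The centroid-assignment step of the third category can be sketched as follows. This is a minimal illustration under simplifying assumptions: the tile descriptors are plain Euclidean vectors (the real system uses MPEG-7 descriptors), the centroids are assumed to have been learned beforehand by clustering, and the function names are hypothetical:

```python
import math

def assign_to_centroid(descriptor, centroids):
    """Return the index of the centroid closest (Euclidean) to a tile descriptor."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(range(len(centroids)), key=lambda i: dist(descriptor, centroids[i]))

# Hypothetical 2-D "descriptors" of three cluster centres learned from the training tiles
centroids = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
tile = (0.9, 0.8)
print(assign_to_centroid(tile, centroids))  # 1 — the tile is attached to the second centroid
```

An image is then represented not by raw descriptors but by the indices (or counts) of the centroids its tiles fall into, which generic classification algorithms can consume directly.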

3.1

Colour Distribution Features

These features represent colour distribution in the images. One popular way is to use colour histograms, which are a statistic that can be viewed as an approximation of an underlying continuous distribution of colour values. We want to use these characteristics for two connected purposes:

• analyzing the colour distribution in art images;

• using these low-level features in the process of calculating higher-level colour harmony and contrast features.

For representing colours we chose colour models that describe perceptual colour relationships and are computationally simple – the HSV, HSL and HSL-artist colour models. HSV is used in the MPEG-7 descriptors. HSL better reflects the intuitive notion of "saturation" and "lightness" as two independent parameters. The HSL-artist colour model is the base for the further definition of colour harmony and contrast characteristics. Hereafter, except where specially noted, the term "lightness" is used as a collective concept for "Lightness" in the HSL colour model, "Value" in the HSV colour model, or "Luma" in the HSL-artist colour model, depending on the colour model chosen. Colour histograms represent the number of pixels that have colours in each of a fixed list of colour ranges that span the image's colour space – the set of all possible colours. The colours used in digital presentations of art paintings can take almost all possible values; because of this


we divide the dimensions of the colour model into appropriate numbers of ranges. The pixels of an image are converted into one of the chosen colour models (preferably the HSL-artist colour model). The quantization of Hue is made into 13 bins, ih = -1, ..., NH-1, NH = 12, where one value is used for achromatic colours (ih = -1) and twelve hues are used for the fundamental colours (ih = 0, ..., NH-1). The quantization of Hue is linear and equidistant when the HSL or HSV colour model is chosen. For the HSL-artist colour model the quantization function is non-linear, in order to take into account the displacement between the artists' colour wheel and the Hue definition in the HSL colour space. The quantization intervals are given in Figure 23.

Figure 23. Quantization of Hue
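The conversion and 13-bin hue quantization steps can be sketched as follows. Plain HLS from the Python standard library stands in for the custom HSL-artist model, linear equidistant bins are used (as for the HSL/HSV case), and the saturation threshold below which a pixel counts as achromatic is an assumption:

```python
import colorsys

NH = 12  # twelve chromatic hue bins

def quantize_hue(r, g, b, achromatic_threshold=0.05):
    """Map an 8-bit RGB pixel to a hue index in {-1, 0, ..., NH-1};
    -1 marks achromatic (grey-like) colours."""
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    if s < achromatic_threshold:
        return -1
    return int(h * 360 // (360 / NH))

print(quantize_hue(255, 20, 20))    # 0  (reds)
print(quantize_hue(20, 20, 255))    # 8  (blues, on the HLS wheel)
print(quantize_hue(128, 128, 128))  # -1 (achromatic)
```

Note that the bin order here follows the HLS hue circle; the HSL-artist model would remap these indices onto the RYB artists' wheel, which is exactly the non-linearity the text describes.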

The saturation and lightness in the HSL colour model, respectively the saturation and value in the HSV colour model, and the saturation and luma in the HSL-artist colour model, are linearly quantized into NS bins (is = 0, ..., NS-1), respectively NL bins (il = 0, ..., NL-1). The MPEG-7 Dominant Colour descriptor, which extracts a small number of representative colours and calculates the percentage of each quantized colour in the image, can also be used as a kind of colour distribution feature. In order to unify further definitions we map the extracted RGB values of the Dominant Colour descriptor into values in the chosen quantized feature space and use the corresponding percentage of each quantized colour as a percentage in the defined three-dimensional array. In [Ivanova and Stanchev, 2009] we used a crisp function for defining the membership of a colour characteristic in a quantization segment. Later, in [Ivanova et al, 2010/IJAS], we added the possibility to quantize the colour characteristics using a fuzzy calculation of the membership of a colour in the corresponding index (Figure 24). If the position of the examined value is in the inner part of a segment (sufficiently far from both bounds), the characteristic is considered to belong entirely to this segment. In any other case (except the endmost parts for saturation and lightness), part of the characteristic is considered to belong to this segment and the rest to the adjacent segment. That part is obtained with a linear function, which reflects the decrease of membership of the characteristic in the segment.


Figure 24. Fuzzy function for calculating quantization part of colour characteristic
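The fuzzy splitting can be sketched as follows. This is an illustrative linear scheme: a value deep inside a segment belongs fully to it, while a value near an inner boundary splits its weight linearly between the two adjacent segments. The width of the transition zone (here a quarter of a segment) is an assumption, and the chapter's exact boundary rule may differ in detail:

```python
def fuzzy_bins(value, n_bins, transition=0.25):
    """Return {bin_index: weight} for a value in [0, 1]; the weights sum to 1.
    `transition` is the fraction of a segment, adjacent to each inner boundary,
    over which membership shifts linearly to the neighbouring segment."""
    width = 1.0 / n_bins
    i = min(int(value / width), n_bins - 1)
    centre = (i + 0.5) * width
    offset = (value - centre) / width          # -0.5 .. 0.5 within segment i
    if offset > 0.5 - transition and i < n_bins - 1:   # near the right bound
        w = (0.5 - offset) / transition / 2 + 0.5
        return {i: w, i + 1: 1 - w}
    if offset < -(0.5 - transition) and i > 0:         # near the left bound
        w = (0.5 + offset) / transition / 2 + 0.5
        return {i: w, i - 1: 1 - w}
    return {i: 1.0}                                    # inner part: full membership

print(fuzzy_bins(0.5, 5))  # {2: 1.0} — mid-segment, full membership
```

A value falling exactly on an inner boundary splits its weight evenly between the two segments, while the endmost parts (below the first centre, above the last) keep full membership, matching the exception the text makes for saturation and lightness.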

As a result, every picture is represented by a three-dimensional array containing the coefficients of participation of the colours with the corresponding measured characteristics:

A  {A(ih, is, il ) | ih  -1,..., NH -1; is  0,..., NS -1; il  0,..., NL -1} . Analysis of colour distribution can be made by three directions together or only by two or one of them. On the base of three dimensional array A , using summarizing over the discarded dimension(s), we can receive corresponded projections: 

for

examining

two

of

the

dimensions:

AHS  {A(ih, is, )} ,

AHL  {A(ih, , il )} , ALS  {A(, is,il )} ; 

for representing the distribution of one dimension: AH  {A(ih, , )} ,

AS  { A(, is, )} , AL  { A(, , il )} where ih  -1,..., NH -1; is  0,..., NS -1; il  0,..., NL -1 .

3.2

Harmonies/Contrasts Features

Usually, in accordance with Johannes Itten's proposition, the colour wheel which represents the relations between hues is divided into twelve sectors. The centres of three equidistant sectors correspond to the primary colours. The secondary colours are located between them; each is the midpoint of two primary colours and, at the same time, complementary to the third colour. The quantization is expanded with the intermediate colours, which lie at the midpoints between adjacent primary and secondary hues.

Figure 25. The Artists' Colour Wheel

In Figure 25 the position of the hues in the standard artists' colour wheel is shown. This order and the correlations between hues are described by the RYB


(Red-Yellow-Blue) colour model used by artists. Let us mention that this arrangement of hues differs from many contemporary colour models – RGB (Red-Green-Blue), CMY (Cyan-Magenta-Yellow), HSL (Hue-Saturation-Luminance), HSV (Hue-Saturation-Value) – which are based on the definition of colours as primary or secondary in accordance with the trichromatic theory [Colman, 2006]. But all classic theories connected with the definition of contrast are based on the opposition of the colours as they appear in the artists' colour wheel. We use the HSL-artist colour model, which builds on the advantages of the HSL and YCbCr colour models and takes into account the disposition of hues in the RYB colour model. We present a classification of the different types of harmonies and contrasts from the point of view of the three main characteristics of colour – hue, saturation and lightness.

3.2.1

Harmonies/Contrasts from the Hue Point of View

• Harmonies/Contrasts Based on the Hues' Disposition

The figures below show only the relative disposition of the colours, not their absolute values. Some of these combinations are discussed in [Holtzschue, 2006] and [Eiseman, 2006].

Figure 26. Monotone Composition

Monotone compositions: These compositions use one hue, and the image is built by varying the lightness of the colour (Figure 26). Such images are used to suggest some kind of emotion, since every hue bears a specific psychological intensity.

Figure 27. Different Variants of Analogous Composition

Analogous hues: Analogous hues can be defined as groups of colours that are adjacent on the colour wheel (Figure 27). They contain two, but never three, primaries and have the same dominant hue in all samples.


a) Complementary; b) Double Complementary; c) Split Complementary; d) Complementary

Figure 28. Variants of Complementary Contrast

Complementary contrasts: Complementary colours are hues that are opposite one another on the colour wheel. When more than two colours take part in the composition, a harmonic disposition suggests a combination of analogous and complementary hues (Figure 28).

a) Triad; b) Partial Triad; c) Tweaked Triad

Figure 29. Triads

Triads: Three colours that are equidistant on the colour wheel form a triad. This means that all three colours are primary, or all secondary, or all intermediate. When analyzing art paintings we also observed "partial" triads in landscape images, where two of the colours forming a triad were found to be most significant for the image. In some cases a tweaked form of triad is observed as well, when two of the colours occupy two positions of a triad, but the third one is slightly skewed from the third position (Figure 29).

a) Tetrad; b) Partial Tetrad; c) Partial Tetrad

Figure 30. Tetrads

Tetrads: The tetrad includes four equidistant colours on the colour wheel. This contrast produces a very complicated scheme and can lead to disharmony. The presence of partial forms of tetrads can, of course, also be examined (Figure 30). Usually this scheme is used, with precisely coordinated saturation and lightness values, only in more schematic pictures, for instance in heraldic signs, flags, etc.
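Detecting dispositions such as complementary pairs, triads and tetrads can be reduced to checking the circular distances between the dominant hue bins on the 12-sector wheel. A sketch under the assumption that the dominant hues have already been selected from the hue histogram (the function names are illustrative):

```python
def circular_gaps(hue_indices, n=12):
    """Sorted gaps (in sectors) between the dominant hues on an n-sector wheel."""
    h = sorted(set(hue_indices))
    return sorted((h[(i + 1) % len(h)] - h[i]) % n for i in range(len(h)))

def classify_disposition(hue_indices, n=12):
    """Label a set of dominant hue bins as complementary, triad, tetrad, or other."""
    gaps = circular_gaps(hue_indices, n)
    if len(gaps) == 2 and gaps == [n // 2] * 2:
        return "complementary"
    if len(gaps) == 3 and gaps == [n // 3] * 3:
        return "triad"
    if len(gaps) == 4 and gaps == [n // 4] * 4:
        return "tetrad"
    return "other"

print(classify_disposition([0, 6]))         # complementary
print(classify_disposition([0, 4, 8]))      # triad
print(classify_disposition([1, 4, 7, 10]))  # tetrad
```

Partial and tweaked forms could be handled by relaxing the equality checks to a tolerance of one sector, in the spirit of the observations above.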


Achromatic compositions: As a special case, these are images composed of black, grey and white tones, or containing only colours with very low saturation.

• Warm-Cold Contrast

Warm and cold are two opposing qualities of hue. Warm colours are the hues around red and orange; cold colours are those around blue. The terms "warm" and "cold" are helpful for describing families of colours. The categories can be defined as follows:

• Warm: The image is warm when the composition is built from the family of warm colours;

• Cold: The image is cold when it is composed only (or predominantly) of cold colours;

• Neutral: The composition contains colours mainly from the neutral zones;

• Warm-cold: The composition falls into this category when the percentage of the cold family is in some proportion to the percentage of the warm family;

• Warm-neutral: In such compositions there is a proportion between warm colours and neutral ones;

• Cold-neutral: The image contains cold and neutral tones in some proportion.
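A rough classifier for these categories can be driven by the hue projection AH. Which bins count as warm, cold or neutral on the 12-sector wheel, and the proportion thresholds, are illustrative assumptions here:

```python
# Assumed bin families on the 12-sector artists' wheel (index 0 = red)
WARM = {11, 0, 1, 2, 3}   # around red and orange
COLD = {5, 6, 7, 8, 9}    # around blue; the remaining bins are treated as neutral

def warm_cold_category(AH, threshold=0.7, mix=0.3):
    """AH: 12 chromatic hue-bin percentages summing to 1. Returns a category label."""
    warm = sum(AH[i] for i in WARM)
    cold = sum(AH[i] for i in COLD)
    neutral = 1.0 - warm - cold
    if warm >= threshold:
        return "warm"
    if cold >= threshold:
        return "cold"
    if neutral >= threshold:
        return "neutral"
    if warm >= mix and cold >= mix:
        return "warm-cold"
    if warm >= mix and neutral >= mix:
        return "warm-neutral"
    return "cold-neutral"

print(warm_cold_category([0.5, 0.3, 0, 0, 0, 0.1, 0.1, 0, 0, 0, 0, 0]))  # warm
```

The mixed categories ("warm-cold", etc.) fire when two families each hold a substantial share, which matches the "in some proportion" wording above.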

3.2.2

Harmonies/Contrasts from the Saturation Point of View

Unlike hue, which is circular and continuous, saturation and lightness are linear. That difference determines different definitions of harmonies/contrasts for these characteristics. Saturation harmony appears together with the hue harmonies; it is used to give a different perception when the colour is changed. As a whole we can define three big groups of harmonies and contrasts:

• Dull: An image can be classified as dull when the composition is constructed mainly from unsaturated colours;

• Clear: Clear images are built mostly from clear colours (spectral and near-spectral, varying only in lightness);

• Different proportions of saturation: Usually a composition of clear colours in combination with dull ones. Depending on the content of the different saturations and on the distance between the predominant quantities, harmonies can be defined as smooth, contrary, etc.


3.2.3 Harmonies/Contrasts from the Lightness Point of View

The overall lightness of the image, as well as the light-dark contrast, is a very powerful tool in art mastery. An artwork may contain no light-dark contrast at all; in that case the image has one integral vibration of lightness. In the other case, sharp light-dark contrast is used to focus the attention on exact points of the image.

Dark: Dark compositions are built mainly from dark colours;

Light: Light images contain mostly colours near white;

Different proportions of lightness: The image is composed of light colours combined with dark ones. Depending on the amounts at the different lightness levels and the distance between the predominant quantities, contrasts can be defined as smooth, contrary, etc.
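Because saturation and lightness are both linear five-bin characteristics, the dull/clear and dark/light decisions can share one classification rule. A hypothetical sketch (the grouping of bins into "low" and "high" ends and the 70% threshold are assumptions, not values from APICAS):

```python
def linear_label(bins, low_name, high_name, threshold=0.7):
    """Classify a 5-bin linear characteristic (saturation or lightness).

    `bins` holds percentages for bins 0..4, ordered from low to high.
    If the low or high end dominates above `threshold`, the image gets
    that label; otherwise it is a mixed proportion of both.
    """
    low = bins[0] + bins[1]
    high = bins[3] + bins[4]
    if low >= threshold:
        return low_name
    if high >= threshold:
        return high_name
    return f"proportion of {low_name}/{high_name}"

# Lightness: mass in bins 3-4 -> a light composition.
print(linear_label([0.05, 0.05, 0.1, 0.4, 0.4], "dark", "light"))  # light
# Saturation: mass in bins 0-1 -> a dull composition.
print(linear_label([0.5, 0.3, 0.1, 0.05, 0.05], "dull", "clear"))  # dull
```

The mixed branch is where the finer distinctions (smooth, contrary, etc.) would be made from the distance between the predominant bins.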

3.3 Formal Description of Harmonies/Contrasts Features Using the HSL-artist Colour Model

For defining the colour harmony/contrast features we use the representation of the colour distribution as colour histograms, defined above:

A  {A(ih, is, il ) | ih  -1,..., NH -1; is  0,..., NS -1; il  0,..., NL -1} . Here NH  12 and corresponds to the number of quantized colours in Ittens' circle. "-1" index percentage of achromatic tones; "0" to " NH -1 " points percentage of colours, ordered as it is shown on Figure 25, starting from reds and ending to purples. We use NS  5 for defining harmonies' and contrasts' descriptors. Index "0" holds percentage of greys and almost achromatic tones, and "4" contains percentage of pure (in particular – spectral) tones. For indexing of luminance we use NL  5 . "0" holds percentage of very dark colours, and "4" contains percentage of very light colours. To simplify further calculation up to three arrays, containing percentage values of corresponding characteristics in the picture is calculated on the basis of this array. These arrays are (corresponded projections): 

H = (h-1, h0, ..., hNH-1) for hues (AH = {A(ih, ·, ·)}, ih = -1, ..., NH-1);

S = (s0, ..., sNS-1) for saturation (AS = {A(·, is, ·)}, is = 0, ..., NS-1);

L = (l0, ..., lNL-1) for lightness (AL = {A(·, ·, il)}, il = 0, ..., NL-1).
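With the histogram stored as a three-dimensional array, each projection is a marginal sum over the other two axes. A small NumPy sketch following the NH = 12, NS = 5, NL = 5 quantization (the extra hue row standing in for index "-1" is a representation choice for this sketch, not part of the original model):

```python
import numpy as np

NH, NS, NL = 12, 5, 5

# A: percentage histogram indexed by (hue, saturation, lightness);
# the hue axis has NH + 1 entries so that row 0 plays the role of
# index -1 (achromatic tones).
A = np.zeros((NH + 1, NS, NL))
A[0 + 1, 4, 2] = 0.6   # 60% pure red tones of medium lightness
A[6 + 1, 4, 2] = 0.4   # 40% pure blue-green tones

H = A.sum(axis=(1, 2))  # hue projection: H[ih + 1] corresponds to h_ih
S = A.sum(axis=(0, 2))  # saturation projection
L = A.sum(axis=(0, 1))  # lightness projection

print(H[1], S[4], L[2])  # 0.6 1.0 1.0
```

Because A holds percentages, each projection again sums to the total mass of the image, so the three arrays can be thresholded directly by the harmony/contrast rules above.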


 Hue Order Vector

The Hue Order Vector contains the number of dominant hues nh and the positions of the dominant hues, ordered by decreasing percentage. nh can vary from zero for achromatic paintings up to the maximum number of defined dominant colours; for the purposes of defining hue harmonies, the maximum number of dominant colours is restricted to 5. When an image is not achromatic, nh is defined as the number of ordered hues whose summed percentages exceed some (expert-defined) value x.

(nh; p1, p2, ..., pnh), pi ∈ {-1, ..., NH-1}: hpi ≥ hpi+1, h ∈ H, i ∈ {1, ..., nh-1}

nh ∈ {0, ..., 5}:

(nh = 0 if achromatic); (nh = 1 if hp1 ≥ x); (nh = min(n, 5) if hp1 + hp2 + ... + hpn ≥ x)
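A sketch of how the hue order vector (nh; p1, ..., pnh) could be computed from the hue projection H; the threshold x = 0.9 and the dictionary representation of H are illustrative assumptions:

```python
def hue_order_vector(hues, x=0.9):
    """Return (nh, positions) for a hue projection.

    `hues` maps hue index -> percentage of that hue (achromatic mass
    excluded).  Hues are ordered by decreasing percentage and counted
    until their cumulative share reaches `x`, capped at 5.
    """
    ordered = sorted((i for i in hues if hues[i] > 0),
                     key=lambda i: hues[i], reverse=True)
    if not ordered:
        return 0, []           # achromatic painting
    total, n = 0.0, 0
    for i in ordered:
        total += hues[i]
        n += 1
        if total >= x:
            break
    n = min(n, 5)
    return n, ordered[:n]

# One dominant red plus a weaker blue:
print(hue_order_vector({0: 0.7, 6: 0.25, 3: 0.05}))  # (2, [0, 6])
```

The resulting count and positions feed directly into the hue harmony rules (monochromatic, complementary, triad, tetrad, etc.) defined earlier in the chapter.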