
Faceted Search and Retrieval Based on Semantically Annotated Product Family Ontology

Soon Chong Johnson Lim, Ying Liu, Wing Bun Lee
Department of Industrial and Systems Engineering
The Hong Kong Polytechnic University
Hung Hom, Kowloon, Hong Kong S.A.R., P.R. China

{lim.js, mfyliu, wb.lee}@polyu.edu.hk

ABSTRACT
With the advent of various Semantic Web services and applications, semantic annotation has emerged as an important research area. Semantically annotated ontologies have been used in numerous information processing and retrieval tasks. One such task is product design, where a semantically annotated ontology can support many applications that aid design-related work. However, ontology development in design engineering remains a time-consuming and tedious task that demands considerable human effort. In the context of product family design, managing product information with efficient indexing, update, navigation, search and retrieval across product families is both desirable and challenging. This paper addresses this issue by proposing an information management and retrieval framework based on a semantically annotated product family ontology. In particular, we propose a document profile (DP) model to suggest semantic tags for annotation. Using a case study of digital camera families, we illustrate how faceted search and retrieval of product information can be accomplished based on the semantically annotated camera family ontology. Lastly, we briefly discuss further research and applications in design decision support, e.g. commonality and variety analysis, based on the semantically annotated product family ontology.

Categories and Subject Descriptors H.3.m [Information Storage and Retrieval]: Miscellaneous

General Terms: Design, Experimentation.

Keywords: Information management and retrieval, semantic annotation, product family, ontology.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ESAIR ’08, February 9, 2008, Barcelona, Spain. Copyright 2009 ACM 1-58113-000-0/00/0004…$5.00.

1. INTRODUCTION
Ever-changing customer preferences for product variety pose great challenges to manufacturing firms, which must offer attractive product choices while optimizing production cost. In response, mass customization has been proposed and accepted as an economically viable model. Mass customization can be viewed as a manufacturing strategy that enables a manufacturing company to better differentiate its products while satisfying production process and cost expectations. One way of realizing mass customization is to design and develop product families. Product family design is an active research area in product design that has received much attention in the past decade. Product family design and product design generally share some common challenges, for instance, the management of product-related information and knowledge. With the increasing complexity and variety of products in the market, information management in product design has become more complicated. Therefore, information and knowledge management in product design, which calls for rich representation, efficient storage and timely retrieval, remains a challenging issue in design informatics. With respect to representation, what is required is a scheme that promotes comprehensive information modeling with added features for easier change management. A number of representation schemes have been proposed for product design information, e.g. relational database structures, tree-like structures and object-oriented approaches. Though they are efficient for search and navigation on some occasions, such representations do not model the semantics across different structures or within a single structure. Among the different schemes, ontology-based representation caters especially for a semantically rich environment.
The semantic features suggest potential semantics-based applications in design engineering, such as multi-faceted search, navigation, knowledge extraction and analysis. However, ontology development in design engineering remains challenging, and the semantic annotation of product-related information, for instance, is still a time-consuming and labor-intensive task. The majority of previous studies on semantic annotation in design engineering approach this issue through intensive study of domain literature, where human annotators use their expertise to annotate domain-specific concepts and relations. Although an ontology defined in this way is essential for deriving non-trivial semantic relations and rules, the process eventually becomes a burden for human annotators as the ontology evolves with incremental information. In view of the large number of product offerings available in the market, an effective approach to semantic annotation for ontology development that requires less human effort and can adapt to changes and requirements in design information is still a fundamentally challenging issue. It would therefore be highly beneficial to have an approach that can automatically suggest semantic annotations based on the design repository.

In this study, we propose a faceted search and retrieval framework based on a semantically annotated product family ontology to address the challenges in intelligent product family information management. Section 2 surveys the related studies on semantic annotation. Current work on ontology-based information processing and retrieval in design and manufacturing is covered in Section 3. Section 4 reports the state of the art of information management in product family design. Section 5 elaborates our approach to information retrieval for product family design. Section 6 illustrates the use of the proposed framework in product family information management and retrieval with a case study of digital cameras, and Section 7 concludes.

2. RELATED WORK
Owing to the wide range of services and applications offered by the Semantic Web, semantic annotation has emerged as an important research area. Semantic annotation of textual and multimedia content enables better analysis, retrieval and exchange of digital content globally. Some practical examples of ontology-based information retrieval include an ontology-based query system for medical information retrieval [1] and ontology-based patent document mining [2]. Other applications include ontology-based information extraction [3, 4], named entity recognition [5] and sense tagging [6, 7]. Several annotation approaches have been attempted, among them ontology-based matching [1, 2], lattice-based tagging [5], word co-occurrence [7], similarity matching [6], and probabilistic approaches such as conditional random fields [8]. A more recent study by Mei et al. [9] performed semantic annotation on frequent patterns by constructing context models and searching for semantically similar patterns. Most existing approaches to text-based annotation are unsupervised and involve little or no human effort.

Multimedia annotation, which includes image and video annotation, is another important application area of semantic annotation. Semantic annotation of images attempts to link low-level image features with high-level domain concepts, which is commonly termed the semantic gap problem [10]. Ontology-based approaches have been widely used for image annotation. The domain knowledge required for annotation in the ontology, such as textual image descriptions and even captions, is extracted from other information associated with the images. The annotation can thus be accomplished semi-automatically by human annotators with tag suggestions [10-12], or by using ontology-based similarity matching between image description terms [13, 14]. Other specific methodologies such as latent semantic analysis [15], a database-centric probabilistic model [16], transductive inference [17], a confidence-based dynamic ensemble [18], the expectation-maximization algorithm [19] and a document object model approach [20] have also been attempted. Image annotation is mainly applied to image indexing and retrieval [11, 13-15, 20].

Semantic annotation of videos serves a similar purpose of indexing and retrieval, as is widely known from Content Based Image Retrieval (CBIR) [14]. Existing video annotation approaches are based on the analysis of transliterations or transcripts of a video recording [21-23], or on the motion detected and extracted from a video recording [24, 25]. The latter approach is usually ontology-based, where the ontology serves as the knowledge foundation for annotation. Currently, the efforts reported in video annotation focus on ontology-assisted manual annotation [26], clip pattern matching and annotation [24, 25] and transcript-based language processing [22, 23].

3. SEMANTIC ANNOTATION IN DESIGN ENGINEERING
Research on the application of semantic annotation in design engineering has only recently begun to attract attention. Semantic annotation has been used in the process of developing ontologies or taxonomies for information extraction and retrieval. One area of study in design information extraction is the extraction of features from design models. For instance, Au and Yuen [27] proposed a linguistic approach to modeling sculptured models, and presented the taxonomic relations between three levels of abstraction: object level, feature level and geometry level. Fu et al. [28] attempted to extract features from data exchange part models using a multilevel feature taxonomy, which defines the relationship between design features and manufacturing features for feature identification in computer-aided design models. Catalano et al. [29] performed three-dimensional car annotation using a car aesthetic ontology, with the aim of annotating the geometric properties of a 3D car model. The studies discussed so far focus on feature identification for annotating relevant feature semantics in a particular design model.

There are also studies on information extraction that are intended for information retrieval. Kim et al. [30] employed an integrated taxonomy of engineering design for answer tag extraction and generation from engineering documentation. Using rhetorical structure theory for discourse analysis of engineering documents, the annotation process was performed and its output eventually served as the knowledge base for a question answering system. Questions were analyzed syntactically and semantically in order to search for the best answers in the knowledge repository based on a scoring mechanism. Another related study is on engineering documentation retrieval by Li et al. [31, 32]. They used a previously developed engineering ontology to annotate general engineering documentation so that a domain-specific ontology could be generated, on which engineering information retrieval could then be carried out. It allows users to navigate according to the query-dependent concept categories in the ontology. Kitamura et al. [33, 34] presented the funnotation framework, an ontology-based function annotation framework for engineering document annotation. The framework assists designers in annotating documents based on the classes of product functions and the “way to achieve such a function” for a semantics-based design search system. Hence, design documents can be effectively retrieved for future design reference.

In summary, semantically annotated ontologies in design engineering are applied to feature identification in design models, and to document management and retrieval for the purpose of design reference.

4. INFORMATION MANAGEMENT IN PRODUCT FAMILY DESIGN

A product family refers to a set of similar products that share common features, yet possess some specific functionality that is differentiated in order to fulfill niche customer requirements [35]. The challenges in product family design range from designing product platforms to selecting the best product configuration for a cost-effective manufacturing strategy in product realization. From the product family perspective, knowledge management that covers product family information modeling, extraction and retrieval is important for design analysis and timely decision making. Therefore, studies focusing on product family modeling and evaluation metrics are crucial for improving product modularity, commonality, variety and cost structure.

To illustrate the concept of a product family, an example for a branded digital camera series is shown in Figure 1. The digital cameras can be divided into two product categories. The left side presents a family of digital cameras that caters for the low-end to mid-range market. The high-end model is shown on the right side. Note that products in the same family normally share the same series number. For instance, products iDC 2700 and iDC 2900 belong to the same product family iDC 2XXX, while iDC 3200 belongs to the higher-range product family iDC 3XXX.

The basic idea of a product family is to achieve as many product variants as possible with a smaller number of unique modules or components. As can be observed in Figure 1, the two product variants iDC 2700 and iDC 2900 possess some common product modules. For example, they share the same camera body, flash module, power module and screen module. These are the product modules that are common to the iDC 2XXX family. What really distinguishes the two product variants are the customizable product modules. As shown in Figure 1, the two variants have different image processing capabilities. Overall, iDC 2900 is equipped with better image processing technology, better image sensor technology and better optical zoom than iDC 2700.
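The split between common and customizable modules in the iDC 2XXX example can be sketched as a simple comparison of module assignments. This is only an illustration: the part labels such as `body-2x` and the helper `split_modules` are hypothetical and do not come from the paper; only the pattern of shared and differing modules follows the text.

```python
# Hypothetical part labels; only the shared/differing pattern follows the text.
IDC_2700 = {
    "camera body": "body-2x",
    "flash module": "flash-2x",
    "power module": "power-2x",
    "screen module": "screen-2x",
    "image processing module": "imaging-basic",
}
IDC_2900 = {
    "camera body": "body-2x",
    "flash module": "flash-2x",
    "power module": "power-2x",
    "screen module": "screen-2x",
    "image processing module": "imaging-advanced",
}


def split_modules(variant_a, variant_b):
    """Return (common, differing) module names for two product variants."""
    # A module is common when both variants use the same part for it.
    common = {name for name, part in variant_a.items()
              if variant_b.get(name) == part}
    # Everything else is a customizable (differing) module.
    differing = (set(variant_a) | set(variant_b)) - common
    return common, differing


common, differing = split_modules(IDC_2700, IDC_2900)
print(sorted(common))
print(sorted(differing))
```

A richer model would attach properties to each module, but even this flat mapping makes the commonality within a family explicit.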

Figure 1. Product family for a digital camera series

As for the high-end product, the iDC 3XXX product family has superior common product modules compared with the iDC 2XXX product family. In the example, the iDC 3XXX series has a better screen display with a touch screen function, and enjoys higher battery capacity. In terms of customized product modules, this camera also has better image processing technology, such as auto focus and face detection features, coupled with greater image sensing capability. We can also see that iDC 3200 shares the same lens module with iDC 2900.

There have been a number of studies on product family modeling. In product design, the Bill of Materials (BOM) is commonly used to model the raw materials, parts or components needed to manufacture a product. However, the amount of product information to be managed can become very large when the manufacturing company designs new product variants. The concept of the Generic Bill of Materials (GBOM) was later introduced to tackle this problem [36]. This approach attempts to model all the product variants using the least amount of information. Other work on product family modeling includes Jiao & Tseng [37], who proposed a conceptual framework of product family architecture consisting of three perspectives: functional, technical and structural. Du et al. [38] presented a product family architecture in which the classification of product modules and the evaluation of the feasibility of product variants were emphasized. A later study by Du et al. [39] focused on a rule-based approach by introducing a graph rewriting system. Besides graph representations, other studies have adopted the object-oriented approach to modeling product families [40, 41]. An object-oriented scheme allows constraints and rules to be defined and makes the modeling scheme easier to implement using the Unified Modeling Language.

Information retrieval specifically for product family design has not received much attention so far. Studies in this area involve modeling and retrieving relevant product information from the product family repository for design decision support. For instance, Ong et al. [40] used an object-oriented approach for modeling and customizing product configurations in a web-based design configuration system. The system allows design engineers to maintain and update configuration knowledge bases for product families. It also allows customers to search and navigate the product information that they are interested in. Using a hierarchical representation for BOMs, Tseng et al. [42] applied case-based reasoning to search for similar, previously established product BOMs using a product BOM query, in order to assist new product design configuration. Similarity measures were used for BOM matching and retrieval. Nanda et al. [43] proposed representing a product family using a concept lattice for navigating the product structure of variants. This concept lattice based representation is considered the first attempt to enable navigation in commonality-based family redesign.

The studies discussed so far attempt to model product families using different representation schemes. These modeling schemes are used to address certain issues in product family design, such as the configuration of product variants [40] or the optimization of product family platforms [44]. As product family analysis is focused on a particular family of products, it is desirable that search, navigation and retrieval of product family information can be conducted on a single product family that is well represented beforehand. While the effort involved in developing product family models depends heavily on human experts, such models have very limited features to support search and navigation across different product families. In other words, existing modeling approaches are insufficient to model and manage the rich information related to product families.

Although search and navigation in design, such as product structure navigation and product feature search, provide some help in accessing product-related information, they are not an efficient way of retrieving information when the product family contains many product variants, when the product structures are complex, or when the families are evolving. The search and navigation features proposed so far are suitable only when a single product family is concerned. Furthermore, search and navigation of product families are mostly limited to the structural facet of products. Information search and retrieval that simultaneously involve other facets of products, such as the functional and manufacturing facets, have not been studied so far. Therefore, multi-faceted product family information retrieval that enables search and navigation across product families and across different aspects of product variants is desirable for the timely retrieval of rich information, so that sound decisions can be made in product family design.

5. FACETED INFORMATION SEARCH AND RETRIEVAL FOR PRODUCT FAMILY DESIGN

5.1 Framework
In this paper, we propose a faceted information retrieval approach based on a semantically annotated product family ontology, shown in Figure 2. Firstly, we develop a product family ontology for a type of product of interest. The whole process starts with the collection of product information for product variants that could possibly form a product family. To illustrate the idea, we again consider the camera product family example, with three product variants in the iDC 2XXX camera family: iDC 2200, iDC 2700 and iDC 2900. Product information related to each of the product variants, such as information from third-party manufacturers, is collected first. The product entity extraction process is then performed on these collections to extract the important entities, and their corresponding properties, that make up a particular product. Secondly, product entities with their associated properties can be identified and extracted using concepts such as frequent term identification and term weighting for ranking the most relevant concepts [45]. For instance, using natural language processing techniques, we can extract possible product-related entities and their associated properties for a product variant based on the frequent terms found in the collection of product information for that variant. Each frequent term can be assigned a weight based on its term frequency in the collection, and the terms can be ranked by weight to indicate their relative importance. For example, the highly ranked frequent terms extracted from the iDC 2200 camera information collection are “flash”, “lens”, “mega-pixels” and “shutter”.
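The term weighting and ranking step can be sketched as follows. This is a minimal illustration under stated assumptions, not the weighting scheme of [45]: the weight is taken to be plain relative term frequency, and the stop-word list, the function `rank_terms` and the sample documents are all hypothetical.

```python
import re
from collections import Counter

# Illustrative stop-word list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "is", "of", "the", "with"}


def rank_terms(documents, top_k=4):
    """Rank terms in a document collection by relative term frequency."""
    counts = Counter()
    for doc in documents:
        # Lower-case and keep alphabetic tokens (hyphens allowed inside).
        tokens = re.findall(r"[a-z][a-z\-]*", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    total = sum(counts.values())
    # Weight = term frequency divided by the number of kept tokens.
    return [(term, n / total) for term, n in counts.most_common(top_k)]


# Hypothetical snippets standing in for a product information collection.
docs = [
    "The flash is bright. The flash recycles fast.",
    "The lens is sharp and the flash is strong.",
]
ranking = rank_terms(docs, top_k=1)
print(ranking)
```

Relative frequency is the simplest possible weight; a scheme such as TF-IDF could be swapped in without changing the ranking step.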
After entity extraction, the semantic annotation process continues with further processing of the extracted entities and their properties. Clustering is performed on the entities and their associated properties. One of the important steps in this annotation process is concept formation, which associates the entities with their corresponding concepts. A measure can be introduced to cluster semantically similar entities based on the co-occurrence of frequent entities in documents with different facets. Next, the relationships between different concepts are annotated. To generate semantic tags from design and manufacturing documentation for semantic annotation, we need a collection of documents that describes different facets of a product. The annotation process is largely supported by the enterprise design and manufacturing knowledge repository, which contains a document collection that mainly describes product specifications together with other related information, such as product functionality, manufacturing processes and usage reviews of products. Information sources for these documents include manufacturer catalogs, online catalogs, engineering texts, handbooks, and web information such as online product specifications and third-party information like public forums and user reviews. Previously, we proposed to automatically identify and suggest the semantic relationships among conceptual entities using a document profile modeling approach [46] based on the repository. The output of the semantic annotation process is a multi-faceted ontology. This ontology consists of different facets of product family information, e.g. structural, cost, functional and manufacturing oriented, each of which is semantically annotated.

Figure 2. Framework of information retrieval based on semantically annotated product family ontology

Once the multi-faceted ontology is developed, it can be used to perform product family related faceted search and retrieval. When a professional submits a query, the query first undergoes query processing, which typically employs several NLP techniques such as tokenization, part-of-speech (POS) tagging, word disambiguation and phrase chunking. Tokenization is the process in which a character stream is parsed and segmented into words and punctuation. POS tagging is performed next, followed by word disambiguation. Phrase chunking groups words into noun phrases, verb phrases, prepositional phrases and so on based on context-free grammars [31]. The output of query processing is a set of critical query terms extracted from the original query. The process continues with faceted search and retrieval, where the contexts of the query terms are identified by reference to the multi-faceted product family ontology. Finally, all the relevant results are identified and retrieved from the design and manufacturing repository according to the facets identified.
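A much simplified version of the query processing step is sketched below. It is an assumption-laden stand-in rather than the pipeline described above: it covers only tokenization and a crude form of chunking that splits on punctuation and stop words, and it omits POS tagging and word disambiguation entirely. The stop-word list and the functions `tokenize` and `query_terms` are hypothetical.

```python
import re

# Illustrative stop-word list used as chunk boundaries.
STOPWORDS = {"a", "an", "and", "for", "of", "or", "the", "with"}


def tokenize(query):
    """Segment a character stream into lower-case words and punctuation."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*|[^\sa-z]", query.lower())


def query_terms(query):
    """Group consecutive content words into candidate query terms."""
    terms, current = [], []
    for token in tokenize(query):
        if token[0].isalpha() and token not in STOPWORDS:
            current.append(token)          # extend the current chunk
        elif current:
            terms.append(" ".join(current))  # boundary: close the chunk
            current = []
    if current:
        terms.append(" ".join(current))
    return terms


terms = query_terms(
    "High performance, rapid charging and recyclable camera battery"
)
print(terms)
```

A grammar-based chunker would further separate a run such as "recyclable camera battery" into its adjective and noun parts, which this boundary-based sketch cannot do.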

5.2 Semantic Annotation
Fragmented texts are common in design environments as well as in online sources such as consumer product review websites and public forums, and they make semantic annotation challenging. To address this, we extend the basic idea of frequent itemset mining and propose a new approach, the document profile (DP) model, to derive suitable semantic tags for annotation. Unlike the existing approach of generating tags from word co-occurrence patterns [7], our DP model focuses on discovering frequent word patterns at the sentence level of a document, which is particularly important for short texts. Furthermore, we extend the basic idea of using pointwise mutual information (PMI) [47] to measure the strength of semantic association based on the terms and Maximal Frequent Sequences (MFSs) discovered. We propose a simple metric called averaged PMI (avgPMI) to measure the average strength of semantic association among a set of features, e.g. both the terms and the MFSs discovered. The avgPMI is defined as follows:

$$\mathrm{avgPMI} = \frac{1}{N} \sum \mathrm{PMI}(w_1, w_2), \quad \text{where} \quad \mathrm{PMI}(w_1, w_2) = \log \frac{p(w_1 \,\&\, w_2)}{p(w_1)\, p(w_2)}$$

where w1 and w2 are two terms or MFSs discovered and N is the total number of terms or sequences in the DP model. The intent of avgPMI is to maximize the strength of semantic association with a smaller number of features, i.e. tags in this context.

Essentially, DP is concerned with capturing the single words and word sequences that often bear semantic meaning at the lexical level, in order to represent documents. In our work, DPs are defined by a set of single words and MFSs. Algorithm 1 gives the details of how we generate the DP for a document. It starts with a set of sentences S_j = {s_j1, s_j2, ..., s_jn} of document d_j. Based on the pre-specified sentence support threshold σ, a set of frequent single words Γ_j = {t_j1, t_j2, ..., t_jn} is selected. Given the pre-specified word gap g, Γ_j is extended to a set of ordered word pairs. For instance, suppose g = 1 and the original sentence s_jn comprises the words “ABCDEFGHI.” After the identification of frequent single words, only the words “ABCEFI” are left for s_jn. Therefore, the ordered pairs arising out of s_jn would be ‘AB’, ‘AC’, ‘BC’, ‘CE’, and ‘EF’. Pairs like ‘BE’ and ‘CF’ are not considered because the number of words between the two words of each pair exceeds the g parameter. Each ordered pair is then stored in a hash data structure, along with its occurrence information, such as location and frequency. This is repeated for all sentences in d_j, with a corresponding update of the hash. Thereafter, each ordered pair in the hash is examined, and the pairs that are supported by at least σ sentences are considered frequent. This set of frequent pairs is named Grams_2.

Algorithm 1. Generation of Document Profile (DP)
Input: S_j = {s_j1, s_j2, ..., s_jn}: a set of n pre-processed sentences in document d_j; σ: sentence support; g: maximal word gap
 1. for all sentences s ∈ S_j, identify the frequent single words t, Γ_j ← t;
 2. expand Γ_j to Grams_2, the frequent ordered word pairs
    // Discovery phase
 3. k = 2;
 4. MFS_j = null;
 5. while Grams_k not empty
 6.     for all seq ∈ Grams_k
 7.         if seq is frequent
 8.             if seq is not a subsequence of some Seq ∈ MFS_j
                    // Expand phase: expand frequent seq
 9.                 max = expand(seq);
10.                 MFS_j = MFS_j ∪ max;
11.                 if max = seq
12.                     remove seq from Grams_k
13.         else
14.             remove seq from Grams_k
        // Join phase: generate set of (k + 1)-seqs
15.     Grams_k+1 = join(Grams_k);
16.     k = k + 1;
17. DP_j ← Γ_j + MFS_j;
18. return DP_j

The next phase, the Discovery phase, forms the main body of Algorithm 1. It is an iteration of gram expansion, for the grams in the current Grams_k, and gram joining, to form Grams_k+1. Only grams that are frequent and not subsequences of previously discovered MFSs are considered suitable for expansion. The latter condition is in place to avoid rediscovering MFSs that have already been found. This Expand-Join iteration continues until an empty gram set is produced by a Join phase.

We further break the Discovery phase into the Expand phase and the Join phase. In the Expand phase, every possible expansion of an input word sequence seq from Grams_k is explored. The expansion process continues, for that particular input word sequence seq, until the resulting sequence is no longer frequent. The last frequent sequence achieved in the expansion, seq’, is an MFS by definition, and it is stored together with its occurrence information. This process of gram expansion and information recording continues for every suitable gram in Grams_k. The Join phase follows, which consists of a simple join operation among the grams left in Grams_k to form Grams_k+1, i.e. the set of grams of length (k+1). When an empty gram set is produced by the Join phase, namely, when no more grams are left for further expansion, the sets MFS_j and Γ_j are returned for document d_j. In the end, the DP_j for document d_j is composed of Γ_j plus the MFS_j found, and this DP_j = {MFS_j + Γ_j} is used as the set of semantic tags.
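The first phase of DP generation (frequent single words and gap-limited ordered pairs) and the avgPMI metric can be sketched as below. Several points are assumptions made for this sketch rather than details given in the text: the gap is counted in positions of the original sentence, which is the reading that reproduces the “ABCDEFGHI” example; probabilities are estimated as the fraction of sentences containing a feature; the sum runs over all unordered feature pairs, skipping pairs that never co-occur; and features are single words only. The Expand and Join phases for MFS discovery are not shown, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def frequent_words(sentences, sigma):
    """Words supported by at least `sigma` sentences."""
    support = Counter()
    for sentence in sentences:
        support.update(set(sentence))  # count each word once per sentence
    return {w for w, n in support.items() if n >= sigma}


def ordered_pairs(sentence, frequent, gap):
    """Ordered pairs of frequent words with at most `gap` words between
    them, where the gap is counted in the original sentence (assumption)."""
    pairs = []
    for i, first in enumerate(sentence):
        if first not in frequent:
            continue
        for j in range(i + 1, min(i + gap + 2, len(sentence))):
            if sentence[j] in frequent:
                pairs.append(first + sentence[j])
    return pairs


def avg_pmi(sentences, features):
    """(1/N) * sum of PMI over feature pairs, N = number of features."""
    if not features:
        return 0.0
    sets = [set(s) for s in sentences]

    def p(*words):
        # Fraction of sentences that contain all the given words.
        return sum(all(w in s for w in words) for s in sets) / len(sets)

    total = 0.0
    for w1, w2 in combinations(sorted(features), 2):
        joint = p(w1, w2)
        if joint > 0:  # skip pairs that never co-occur (assumption)
            total += math.log(joint / (p(w1) * p(w2)))
    return total / len(features)


# The worked example from the text: g = 1, frequent words A, B, C, E, F, I.
pairs = ordered_pairs(list("ABCDEFGHI"), set("ABCEFI"), gap=1)
print(pairs)
```

Counting the frequent pairs across all sentences with a `Counter` and keeping those with support of at least σ would then give the initial set of frequent pairs that the Discovery phase starts from.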

6. CASE STUDY AND DISCUSSION
To demonstrate the feasibility of the proposed framework of faceted information retrieval based on a semantically annotated product ontology, a case study of a digital camera product family is presented. Data for two branded digital cameras from a major consumer electronics company were collected with the assistance of two final-year project students. The two cameras belong to the same product series. As a start, about 10 documents related to each camera were collected. The information sources include the camera manufacturer’s website and third-party websites such as public forums and professional camera reviews. Figure 3 shows the semantically annotated ontologies for two cameras, DC-S120 and DC-S150, that belong to the same product series, i.e. DC-S1XX. The ontology for the two product variants is generated through the semantic annotation process. Due to the space limit, we can only partially visualize the concepts, entities and properties of the structural and functional faceted ontology for the two cameras in Figure 4. The ones shown in Figure 4 are important concepts and entities selected from a collection of over 30 tags suggested from the documents. The structural facet contains seven concepts: flash module, power module, image sensor system, memory, screen module, image processing module and lens module, while the rest are concepts of the functional facet. The ontology for each of the product variants has twelve common structural and functional concepts, with different entities semantically annotated to them. To illustrate the concept of product family ontology based information retrieval, an example of multi-faceted information retrieval based on the semantically annotated product family ontology is presented in Figure 4.
For easier comprehension, let us consider two user queries: “High performance, rapid charging and recyclable camera battery” and “zero mercury, long lasting and fast charging battery for camera”. These two similar queries look for camera batteries that have the stated features or functionalities. As noted, there are terms and phrases that are conceptually similar, such as “high performance” and “long-lasting”; “recyclable” and “zero mercury”; and “rapid charging” and “fast charging”. First of all, the queries undergo the query processing phase, which separates each query syntactically into query terms for faceted search and retrieval. In this example, the processed query terms are a collection of noun phrases such as “camera” and “battery”, and some longer adjective phrases such as “fast charging” and “long lasting”.

The next phase is the faceted search and retrieval. The processed query terms are matched with the most semantically similar concepts in the multi-faceted ontology. The concept search is performed across the different ontologies to determine the most relevant context or facet of the query terms. In our example, the terms “camera” and “battery” indicate the type of product features that the user is interested in, and these features are present in all three facets in our case: structural, functional and green manufacturing. If we refer to Figure 3, the term “battery” matches the “battery” entity under the concept “Power Module”. Similarly, terms such as “rapid charging”, “fast charging”, “high performance” and “long lasting” are semantically associated with the functional facet. There are also two terms that are semantically similar to the green manufacturing concepts: “recyclable” and “zero mercury”.
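The two phases just described can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the semantic similarity matching is replaced by a hand-written lookup table, the syntactic query processing is reduced to greedy phrase spotting, and the concept labels "long battery life" and "low environmental impact" are hypothetical names introduced here, not concepts taken from the ontology in Figure 3.

```python
import re
from collections import defaultdict

# Hypothetical phrase index: surface phrase -> (facet, concept).
PHRASE_INDEX = {
    "camera": ("structural", "camera"),
    "battery": ("structural", "power module"),
    "rapid charging": ("functional", "fast charging"),
    "fast charging": ("functional", "fast charging"),
    "high performance": ("functional", "long battery life"),
    "long lasting": ("functional", "long battery life"),
    "recyclable": ("green manufacturing", "low environmental impact"),
    "zero mercury": ("green manufacturing", "low environmental impact"),
}


def process_query(query):
    """Split a query into known terms, preferring two-word phrases."""
    words = re.sub(r"[^a-z ]+", " ", query.lower()).split()
    terms, i = [], 0
    while i < len(words):
        two_words = " ".join(words[i:i + 2])
        if two_words in PHRASE_INDEX:
            terms.append(two_words)
            i += 2
        elif words[i] in PHRASE_INDEX:
            terms.append(words[i])
            i += 1
        else:
            i += 1  # ignore words such as "and" or "for"
    return terms


def match_facets(terms):
    """Group the concepts matched by the query terms under their facets."""
    matched = defaultdict(set)
    for term in terms:
        facet, concept = PHRASE_INDEX[term]
        matched[facet].add(concept)
    return dict(matched)
```

With this table, both example queries resolve to the same facet-to-concept mapping even though they share few surface words, which is the behaviour the faceted search relies on.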

Figure 3. Semantically annotated ontology for two cameras from similar product family

Based on the semantically similar facets determined from the query terms, the process continues with the retrieval of relevant information that fulfills the three facets. For instance, for the green manufacturing facet, a search is performed for camera batteries that fulfill the recyclable concept and contain zero mercury. A similar search is conducted for the other facets. In a nutshell, the faceted search and retrieval process looks in the design and manufacturing repository for the type of camera battery that possesses the required features from the structural, functional and green manufacturing facets, and presents the results in a consolidated fashion. Note that the design and manufacturing repository contains product family information beyond the DC-S1XX and DC-S3XX families designed by the same manufacturer. It may also include camera product family information gathered from other manufacturers for various reasons, e.g. marketing analysis. Therefore, the multi-faceted product family ontology can serve as an indexing structure for search across different families of products. In our context, the query for a camera battery may produce results from different camera families, such as the DC-S1XX, Trueshot SX and Gpix K2XX families. Each product listed under a retrieved product family uses the required type of battery. For instance, Trueshot S1, Trueshot S10 and Trueshot S20 might use the same type of battery, which offers better performance, faster charging and environmental friendliness, as shown in the detailed results in Figure 4.
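The consolidated retrieval across product families can be pictured with the following Python sketch. It assumes that each repository record lists the concepts annotated to a product under each facet. The record layout, the family and product names, and the concept labels are all placeholders invented for illustration and do not reproduce the data in Figure 4.

```python
from collections import defaultdict

# Hypothetical repository records: one per product variant.
REPOSITORY = [
    {"family": "Family A", "product": "A-1",
     "structural": {"power module"},
     "functional": {"fast charging", "long battery life"},
     "green manufacturing": {"low environmental impact"}},
    {"family": "Family A", "product": "A-2",
     "structural": {"power module"},
     "functional": {"fast charging"},
     "green manufacturing": set()},
    {"family": "Family B", "product": "B-1",
     "structural": {"power module"},
     "functional": {"fast charging", "long battery life"},
     "green manufacturing": {"low environmental impact"}},
]


def faceted_retrieve(repository, required):
    """Return {family: [products]} for records satisfying every facet.

    `required` maps a facet name to the set of concepts the product
    must be annotated with under that facet.
    """
    results = defaultdict(list)
    for record in repository:
        satisfied = all(
            concepts <= record.get(facet, set())  # subset test per facet
            for facet, concepts in required.items()
        )
        if satisfied:
            results[record["family"]].append(record["product"])
    return dict(results)
```

Grouping the hits by family gives the family-oriented view of the results; regrouping the same hits by facet would give the structural, functional or green manufacturing views.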

From the detailed results, a designer or engineer can quickly navigate either within a single product family or across product families. Within a single product family, the search results enable product designers or engineers to better differentiate the product variants that use the required battery. They can also navigate across different product families to obtain an overview of the camera families that have adopted the required battery, and to compare which product family better matches the green manufacturing benchmarking concepts. The detailed results shown in Figure 4 are presented in the camera family view. Further manipulation is possible: the results can also be presented in a structural, functional or green manufacturing view to enable better comparisons.

Besides general search and navigation, other potential applications of the semantically annotated product family ontology are also promising. Commonality is one of the important product family metrics, adopted to measure the extent to which product modules are shared among different product models. The existing commonality metrics are based mainly on the structural aspect of products. By applying the semantically annotated product family ontology, we can propose a new commonality metric that considers not only the structural facet but also other facets such as green manufacturing and cost. This would yield a multi-faceted commonality metric for product family analysis.
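To make the idea of a multi-faceted commonality metric more concrete, the sketch below computes a per-facet overlap and combines the facets with weights. This is not a metric defined in this paper: the Jaccard overlap, the weighting scheme and all names are assumptions chosen only to show how facet annotations could feed such a metric.

```python
def facet_commonality(variants, facet):
    """Jaccard overlap of one facet: shared concepts / all concepts."""
    concept_sets = [variant.get(facet, set()) for variant in variants]
    union = set().union(*concept_sets)
    if not union:
        return 0.0  # no annotations for this facet
    shared = set.intersection(*concept_sets)
    return len(shared) / len(union)


def multi_facet_commonality(variants, weights):
    """Weighted average of the per-facet overlaps (weights are assumed)."""
    total = sum(weights.values())
    return sum(
        weight * facet_commonality(variants, facet)
        for facet, weight in weights.items()
    ) / total


# Hypothetical annotations for two product variants.
variant_1 = {"structural": {"flash", "battery", "lens"}, "green": {"recyclable"}}
variant_2 = {"structural": {"flash", "battery", "screen"}, "green": {"recyclable"}}

score = multi_facet_commonality(
    [variant_1, variant_2], {"structural": 1.0, "green": 1.0}
)
```

Here the structural facet shares two of four concepts and the green facet shares its single concept, so equal weights combine an overlap of 0.5 with an overlap of 1.0.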

Figure 4. Example of multi-faceted information retrieval in product family design

Besides the evaluation metrics, it is also desirable to compare families with respect to a particular product, either to differentiate between product families or to assess the similarity or commonality between two product families. Identifying the product family of a specific product across different manufacturers can then be carried out based on the semantically annotated product family ontology, which opens the way to addressing challenging questions such as the potential for further developing a particular family. For instance, we may search the design and manufacturing repository and retrieve the functional facet of the product family ontology for two camera families, e.g. the DCS2XX and the Gpix K2XX families in Figure 4. From there, we can compare which product family is superior in terms of the variety of functionality by introducing a semantically based comparison metric.

7. CONCLUSION AND FUTURE WORK
Information management and retrieval in product family design is critical for various decision support and design analysis purposes. In this paper, we have discussed the potential of using semantically annotated product family ontology for information management and retrieval in product family design. After reviewing the status quo of information management in product family design, we proposed a framework of faceted information search and retrieval based on semantically annotated product family ontology. Using a case study of a digital camera family, the proposed conceptual framework has illustrated a promising way to search and navigate product family information. A document profile model is proposed to illustrate the approach taken in the semantic annotation process. This model can generate candidate semantic tags for annotation, which may reduce the burden on human annotators. However, document facet identification and the quality of tags generated from documents that are multi-faceted in nature, such as user reviews, remain open issues to be pursued. In the future, we shall focus more on the development and evaluation of semantic annotation for product family ontology. The semantically annotated product family ontology can also be applied as a semantically rich product information structure for various kinds of design decision support, e.g. product family related metrics and ontology based commonality or modularity analysis in product families. All these await further exploration.

8. REFERENCES
[1] Li, G.-y., Yu, S.-m., and Dai, S.-s. 2007. Ontology-Based Query System Design and Implementation. Network and Parallel Computing Workshops, 2007 NPC Workshops IFIP International Conference on. (2007). 1010-1015.
[2] Ghoula, N., Khelif, K., and Dieng-Kuntz, R. 2007. Supporting Patent Mining by using Ontology-based Semantic Annotations. Web Intelligence, IEEE/WIC/ACM International Conference on. (2007). 435-438.
[3] Boufaden, N. 2003. An ontology-based semantic tagger for IE system. Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 2. (2003). Association for Computational Linguistics, 7-14.
[4] Li, Y., and Bontcheva, K. 2007. Hierarchical, perceptron-like learning for ontology-based information extraction. Proceedings of the 16th international conference on World Wide Web. (2007). ACM, 777-786.
[5] Mayfield, J., McNamee, P., Piatko, C. et al. 2003. Lattice-based tagging using support vector machines. Proceedings of the twelfth international conference on Information and knowledge management. (2003). ACM, 303-308.
[6] Curran, J. R. 2005. Supersense tagging of unknown nouns using semantic similarity. Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. (2005). Association for Computational Linguistics, 26-33.
[7] Kim, S.-B., Seo, H.-C., and Rim, H.-C. 2004. Information retrieval using word senses: root sense tagging approach. Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval. (2004). ACM, 258-265.
[8] Grilheres, B., Canu, S., Beauce, C. et al. 2005. A platform for semantic annotations and ontology population using conditional random fields. Web Intelligence, 2005 Proceedings The 2005 IEEE/WIC/ACM International Conference on. (2005). 790-793.
[9] Mei, Q., Xin, D., Cheng, H. et al. 2007. Semantic annotation of frequent patterns. ACM Trans Knowl Discov Data 1, 3, 11-11.
[10] Hollink, L., Little, S., and Hunter, J. 2005. Evaluating the application of semantic inferencing rules to image annotation. Proceedings of the 3rd international conference on Knowledge capture. (2005). ACM, 91-98.
[11] Osman, T., Thakker, D., Schaefer, G. et al. 2007. An Integrative Semantic Framework for Image Annotation and Retrieval. Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence. (2007). IEEE Computer Society, 366-373.
[12] Petridis, K., Bloehdorn, S., Saathoff, C. et al. 2006. Knowledge representation and semantic annotation of multimedia content. Vision, Image and Signal Processing, IEE Proceedings - 153, 3, 255-262.
[13] Von-Wun, S., Chen-Yu, L., Chung-Cheng, L. et al. 2003. Automated semantic annotation and retrieval based on sharable ontology and case-based learning techniques. Digital Libraries, 2003 Proceedings 2003 Joint Conference on. (2003). 61-72.
[14] Wang, C., Zhang, L., and Zhang, H.-J. 2008. Learning to reduce the semantic gap in web image retrieval and annotation. Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval. (2008). ACM, 355-362.

[15] Pham, T.-T., Maillot, N. E., Lim, J.-H. et al. 2007. Latent semantic fusion model for image retrieval and annotation. Proceedings of the sixteenth ACM conference on Conference on information and knowledge management. (2007). ACM, 439-444.
[16] Carneiro, G., and Vasconcelos, N. 2005. A database centric view of semantic image annotation and retrieval. Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval. (2005). ACM, 559-566.
[17] Leslie, L., Chua, T.-S., and Ramesh, J. 2007. Annotation of paintings with high-level semantic concepts using transductive inference and ontology-based concept disambiguation. Proceedings of the 15th international conference on Multimedia. (2007). ACM, 443-452.
[18] Li, B., and Goh, K. 2003. Confidence-based dynamic ensemble for image annotation and semantics discovery. Proceedings of the eleventh ACM international conference on Multimedia. (2003). ACM, 195-206.
[19] Fan, J., Gao, Y., and Luo, H. 2004. Multi-level annotation of natural scenes using dominant image components and semantic concepts. Proceedings of the 12th annual ACM international conference on Multimedia. (2004). ACM, 540-547.
[20] Hua, Z., Wang, X.-J., Liu, Q. et al. 2005. Semantic knowledge extraction and annotation for web images. Proceedings of the 13th annual ACM international conference on Multimedia. (2005). ACM, 467-470.
[21] Dowman, M., Tablan, V., Cunningham, H. et al. 2005. Web-assisted annotation, semantic indexing and search of television and radio news. Proceedings of the 14th international conference on World Wide Web. (2005). ACM, 225-234.
[22] Repp, S., Linckels, S., and Meinel, C. 2007. Towards to an automatic semantic annotation for multimedia learning objects. Proceedings of the international workshop on Educational multimedia and multimedia education. (2007). ACM, 19-26.
[23] Repp, S., Linckels, S., and Meinel, C. 2008. Question answering from lecture videos based on an automatic semantic annotation. Proceedings of the 13th annual conference on Innovation and technology in computer science education. (2008). ACM, 17-21.
[24] Bertini, M., Bimbo, A. D., Cucchiara, R. et al. 2004. Semantic video adaptation based on automatic annotation of sport videos. Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval. (2004). ACM, 291-298.
[25] Bertini, M., Bimbo, A. D., and Torniai, C. 2006. Automatic annotation and semantic retrieval of video sequences using multimedia ontologies. Proceedings of the 14th annual ACM international conference on Multimedia. (2006). ACM, 679-682.
[26] Moreau, N., Leclère, M. et al. 2007. Formal and graphical annotations for digital objects. Proceedings of the 2007 international workshop on Semantically aware document processing and indexing. (2007). ACM, 69-78.
[27] Au, C. K., and Yuen, M. M. F. 2000. A semantic feature language for sculptured object modelling. Computer-Aided Design 32, 1, 63-74.

[28] Fu, M. W., Ong, S. K., Lu, W. F. et al. 2003. An approach to identify design and manufacturing features from a data exchanged part model. Computer-Aided Design 35, 11, 979-993.
[29] Catalano, C. E., Giannini, F., Monti, M. et al. 2007. A framework for the automatic annotation of car aesthetics. AI EDAM 21, 01, 73-90.
[30] Kim, S., Bracewell, R. H., and Wallace, K. M. 2007. Answering engineers' questions using semantic annotations. AI EDAM 21, 02, 155-171.
[31] Li, Z., Raskin, V., and Ramani, K. 2007. Developing Ontologies for Engineering Information Retrieval. Proceedings of the ASME 2007 IDETC/CIE 2007 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference (Las Vegas, Nevada, USA, September 4-7, 2007). 1-9.
[32] Li, Z., Raskin, V., and Ramani, K. 2008. Developing Engineering Ontology for Information Retrieval. Journal of Computing and Information Science in Engineering 8, 1, 011003-(1-13).
[33] Kitamura, Y., Washio, N., Koji, Y. et al. 2006. Towards Ontologies of Functionality and Semantic Annotation for Technical Knowledge Management. New Frontiers in Artificial Intelligence, 17-28.
[34] Kitamura, Y., Washio, N., Koji, Y. et al. 2006. An Ontology-based Annotation Framework for Representing the Functionality of Engineering Devices. Proceedings of IDETC/CIE 2006. (Philadelphia, Pennsylvania, USA, September 10-13, 2006). 1-10.
[35] Simpson, T. W., Siddique, Z., and Jiao, J. 2005. Product platform and product family design: Methods and applications. New York, Springer.
[36] Hegge, H. M. H., and Wortmann, J. C. 1991. Generic bill-of-material: a new product model. International Journal of Production Economics 23, 1-3, 117-128.
[37] Jiao, J., and Tseng, M. M. 2000. Fundamentals of product family architecture. Integrated Manufacturing Systems 11, 7, 469-483.

[38] Du, X., Jiao, J., and Tseng, M. M. 2001. Architecture of Product Family: Fundamentals and Methodology. Concurrent Engineering 9, 4, 309-325.
[39] Du, X., Jiao, J., and Tseng, M. M. 2002. Graph Grammar Based Product Family Modeling. Concurrent Engineering 10, 2, 113-128.
[40] Ong, S. K., Lin, Q., and Nee, A. Y. C. 2006. Web-based configuration design system for product customization. International Journal of Production Research 44, 2, 351-382.
[41] Zhang, J., Wang, Q., Wan, L. et al. 2005. Configuration-oriented product modelling and knowledge management for made-to-order manufacturing enterprises. The International Journal of Advanced Manufacturing Technology 25, 1, 41-52.
[42] Tseng, H.-E., Chang, C.-C., and Chang, S.-H. 2005. Applying case-based reasoning for product configuration in mass customization environments. Expert Systems with Applications 29, 4, 913-925.
[43] Nanda, J., Thevenot, H. J., Simpson, T. W. et al. 2007. Product family design knowledge representation, aggregation, reuse, and analysis. AI EDAM 21, 02, 173-192.
[44] Kumar, R., and Allada, V. 2007. Scalable platforms using ant colony optimization. Journal of Intelligent Manufacturing 18, 1, 127-142.
[45] Liu, Y., Loh, H. T., and Lu, W. F. 2007. Deriving Taxonomy from Documents at Sentence Level. Emerging Technologies of Text Mining: Techniques and Applications, Prado, H. A. d. and Ferneda, E., eds., Idea Group Inc.
[46] Yu, W., and Liu, Y. 2008. Automatic Identification of Semantic Relationships for Manufacturing Information Management. Proceedings of the 6th International Conference on Manufacturing Research (ICMR08). (2008).
[47] Church, K. W., and Hanks, P. 1989. Word association norms, mutual information and lexicography. Proceedings of the 27th Annual Conference of the Association of Computational Linguistics. (1989). 76-83.