Multimedia Answer Generation for Community Question Answering Engine: A Review

Sonali D. Ingale, Roshni K. Sorde, R. R. Deshmukh, S. N. Deshmukh
Department of Computer Science & Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad (MS), India
[email protected], [email protected], [email protected], [email protected]

Abstract: With the rapid growth of the Internet as a medium for information search, Community Question Answering (CQA) systems have gained favor over the last several years. Thousands of questions are asked and answered every day on social question-and-answer (Q&A) websites such as Yahoo! Answers. A conventional search engine returns a sorted list of web pages that satisfy the query keywords; it still faces challenges, such as understanding and processing difficult questions, and its results are not informative enough for many questions. CQA services mainly provide answers in textual form. Nearly all existing CQA systems, such as Yahoo! Answers and AskMetafilter, support only pure-text answers, which may not provide intuitive and sufficient information. Some research efforts have therefore targeted multimedia QA, which aims to answer questions using multimedia data. This paper gives an overview of technological developments in MMQA and concludes by categorizing answers into four types: text, text+image, text+video, and text+image+video.

Index Terms: Community Question Answering (CQA), Multimedia Question Answering (MMQA)

I. INTRODUCTION

With the rapid growth of the Internet, information searching has become an indispensable activity in people's daily lives. Document retrieval is currently the most widespread form of web searching: users type queries as unstructured sets of keywords, and the search engine retrieves an ordered list of pointers to web pages ranked by estimated relevance. QA is a smooth shift away from classical document search toward focused information retrieval. It aims to find a concise and accurate answer to a natural-language question instead of returning a ranked list of documents, drawing on advanced linguistic analysis, domain knowledge, data mining, and natural language processing techniques. Compared with keyword-based search systems, it greatly facilitates communication between humans and computers by letting users state their intention naturally in plain sentences. Based on answer characteristics, QA can roughly be split into three key topics: automatic textual QA, community QA, and multimedia QA.

1. Automatic textual QA: Traditionally, relevant data matching a user's interest has been delivered in textual form through information retrieval (IR) technology and systems. After a query is issued, such a system returns a list of documents related to what the user is looking for [1].

2. Community QA: Many of the posted questions are hard to answer automatically, yet the number of queries answered on CQA sites far exceeds the number handled by library reference services [2]. Compared with automated QA, CQA normally yields answers with better understandability because they are produced by human intelligence. Examples of CQA sites are Yahoo! Answers and MetaFilter, whose popularity has increased dramatically over the past several years [3]. CQA mostly supports textual answers, as shown in Fig. 1.

Fig. 1. Examples of QA pairs from several popular CQA forums: (a) an example from Answerbag; (b) an example from MetaFilter; (c) an example from Yahoo! Answers that only contains links to two videos; and (d) another example from Yahoo! Answers.

3. Multimedia QA: Multimedia QA uses a retrieval pipeline similar to that of text-based QA. MMQA can extend text QA into a complete QA paradigm in which the best answer may be a combination of text and other media such as images and videos [4]. Fig. 2 lists representative questions for each answer-medium class. The answers themselves are not shown because several of them are fairly long; correctly categorized questions are marked with a check mark.

II. APPROACHES TO CQA

Most existing CQA systems, like Yahoo! Answers, WikiAnswers, and AskMetafilter, support only pure-text answers, which may not be easy for users to absorb and understand. This limitation has driven the evolution of multimedia in CQA.

A. Text-Based CQA: Research on QA systems started in the 1960s and mainly focused on expert systems in specific domains. Text-based QA became popular after the establishment of the QA track at TREC in the 1990s [5]. Based on the question and its expected answer, QA can be categorized into the following types: 1. open-domain QA [6], 2. restricted-domain QA [7], 3. definitional QA [8], and 4. list QA [9].

B. Video-Based QA: In addition to normal textual references or instructions, visual counterparts can be an ideal complementary source of information for users, so it is natural to extend text-based QA research to video QA. Li et al. [10] presented a solution for "how-to" QA by leveraging community-contributed texts and videos. Systems of this type were designed to find multimedia answers in web-scale media resources such as YouTube.

C. Image-Based QA: An image-based QA approach was introduced in [11], which mainly focuses on finding information about physical objects. An image-based QA system allows the direct use of an image that refers to the object. Systems of this type were designed to find multimedia

Fig. 2: Prediction of answer medium

answers in web-scale media resources such as Flickr and Google Images.

D. Multimedia QA Search: Owing to the increasing amount of digital information stored on the web, searching for desired information has become an essential task. Research in this area started in the early 1980s; with the rapid growth of content-analysis technology in the 1990s, these efforts expanded to tackle video and audio retrieval problems. Fig. 3 shows an example of MMQA.

Fig. 3: Simple representation of an MMQA search engine

III. METHODOLOGY

Existing CQA systems usually provide only textual answers, which are not informative enough for many questions. Clearly, it would be much better if accompanying videos and images visually demonstrated the process or the object. By processing a large set of QA pairs and adding them to a pool, a novel multimedia question answering (MMQA) approach becomes possible: users can find multimedia answers by matching their questions against those in the pool. The methodology comprises the following steps.

1. Answer Medium Selection: This step determines whether we need to enrich the textual answer and, if so, which type of medium to add. There are existing research efforts on question classification. Li and Roth [12] developed a machine learning approach that uses the SNoW learning architecture to classify questions into five coarse classes and 50 finer classes; they represented questions with lexical and syntactic features such as part-of-speech tags, chunks, and head chunks, together with two semantic features. Zhang and Lee [13] used linear SVMs with all possible question word n-grams to perform question classification. Arguello et al. [14] investigated medium-type selection as well as search sources for a query: they analyze question, answer, and multimedia search performance, and then learn a linear SVM model for classification based on the results.

1.1 Question-Based Classification: Many questions contain multiple sentences, and some of those sentences are uninformative. Classification is therefore accomplished in two steps: first, we categorize questions based on their interrogative words; second, for the remaining questions, we perform classification with a naive Bayes classifier.

Table I: Representative interrogative words
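The two-step scheme above can be sketched in code. The interrogative-word rules are a hypothetical stand-in for Table I, the class labels and training snippets are invented for illustration, and the statistical fallback is a minimal naive Bayes with add-one smoothing rather than the authors' exact classifier:

```python
import math
from collections import Counter, defaultdict

# Step 1: rule-based routing on leading interrogative phrases.
# The phrase-to-class mapping below is illustrative only.
INTERROGATIVE_RULES = [
    ("what does", "text+image"),   # appearance questions suit images
    ("how", "text+video"),         # procedural questions suit video
    ("why", "text"),               # explanations usually stay textual
]

def rule_classify(question):
    q = question.lower().strip()
    for phrase, label in INTERROGATIVE_RULES:
        if q.startswith(phrase):
            return label
    return None  # no rule fired; fall through to step 2

# Step 2: naive Bayes with add-one (Laplace) smoothing for the rest.
class NaiveBayes:
    def fit(self, texts, labels):
        self.prior = Counter(labels)
        self.counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            for word in text.lower().split():
                self.counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        words = text.lower().split()
        total = sum(self.prior.values())
        scores = {}
        for label, n in self.prior.items():
            log_p = math.log(n / total)
            denom = sum(self.counts[label].values()) + len(self.vocab)
            for w in words:
                log_p += math.log((self.counts[label][w] + 1) / denom)
            scores[label] = log_p
        return max(scores, key=scores.get)

def classify_question(question, fallback_model):
    return rule_classify(question) or fallback_model.predict(question)
```

In practice the fallback model would be trained on a large set of labeled CQA questions; any small labeled sample demonstrates the control flow.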

1.2 Answer-Based Classification: Apart from questions, answers can also be an important clue. For answer classification, bigram text features and verbs are extracted. Verbs help judge whether an answer can be related to video content: intuitively, if a textual answer contains many complex verbs, it is more likely to describe a dynamic process and thus has a high probability of being well answered by videos. Therefore, verbs are an important clue.

1.3 Media Resource Analysis: Even after an appropriate answer medium has been determined, the related resources may be limited on the web or hard to collect, in which case it may be necessary to turn to other medium types. Search performance can be estimated from the observation that, most frequently, search results are good when the top results are quite coherent [15].
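The verb cue of step 1.2 and the coherence estimate of step 1.3 can be approximated as follows. The verb lexicon and the threshold are illustrative assumptions (a real system would extract verbs with a POS tagger), and coherence is measured here with pairwise Jaccard overlap of the top result titles:

```python
# Hypothetical action-verb lexicon; a real system would use a POS tagger.
ACTION_VERBS = {"fold", "cut", "mix", "install", "press", "rotate",
                "pour", "attach", "stir", "twist"}

def verb_density(answer):
    """Fraction of answer tokens that are (approximate) action verbs."""
    tokens = [t.strip(".,;:!?").lower() for t in answer.split()]
    if not tokens:
        return 0.0
    return sum(t in ACTION_VERBS for t in tokens) / len(tokens)

def likely_video_answer(answer, threshold=0.1):
    # Illustrative threshold: many action verbs suggest a dynamic process
    # that a video could demonstrate better than text alone.
    return verb_density(answer) >= threshold

def result_coherence(titles):
    """Mean pairwise Jaccard similarity of top search-result titles.
    High coherence suggests the web holds good resources in this medium."""
    sets = [set(t.lower().split()) for t in titles]
    pairs = [(a, b) for i, a in enumerate(sets) for b in sets[i + 1:]]
    if not pairs:
        return 0.0
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)
```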

1.4 Medium Selection Based on Multiple Evidences: Medium selection is performed by learning a three-class classification model over the results of question-based classification, answer-based classification, and media resource analysis.
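Step 1.4 can be sketched as score-level fusion. The paper learns a three-class model; here the class set, the feature names, and the weights are hand-set hypothetical values that merely show how the three evidence sources are combined into one decision:

```python
CLASSES = ["text", "text+image", "text+video"]

# Hypothetical weights standing in for a learned three-class model.
# Features: question-classifier scores (q_*), the answer-based verb
# cue (a_verb), and media-resource coherence (img_coh, vid_coh).
WEIGHTS = {
    "text":       {"q_text": 1.0, "a_verb": -0.5},
    "text+image": {"q_image": 1.0, "img_coh": 0.8},
    "text+video": {"q_video": 1.0, "a_verb": 1.2, "vid_coh": 0.8},
}

def select_medium(features):
    """Pick the class with the highest linear score over the evidence."""
    def score(label):
        weights = WEIGHTS[label]
        return sum(weights.get(name, 0.0) * value
                   for name, value in features.items())
    return max(CLASSES, key=score)
```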


2. Query Generation for Multimedia Search: To collect relevant image and video data from the web, we need to generate appropriate queries from textual QA pairs before searching multimedia search engines. The task is accomplished in two steps. The first is query extraction: textual questions and answers are usually complex sentences, and search engines frequently do not work well for long, verbose queries [16], so we extract a set of informative keywords from the question and answer for querying. The second is query selection: three different queries can be generated (one from the question, one from the answer, and one from their combination), and the most informative one must be selected for search.
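The two steps above can be sketched as follows. The stopword list is a small illustrative sample, and choosing among the candidates by length is a placeholder for a learned query-selection model:

```python
# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"how", "do", "does", "i", "a", "an", "the", "is", "are",
             "of", "to", "what", "and", "in", "for", "you", "my"}

def extract_query(text, max_terms=5):
    """Step 1: keep informative terms, preserving order, dropping duplicates."""
    seen, terms = set(), []
    for tok in text.lower().split():
        tok = tok.strip(".,;:!?\"'()")
        if tok and tok not in STOPWORDS and tok not in seen:
            seen.add(tok)
            terms.append(tok)
    return " ".join(terms[:max_terms])

def candidate_queries(question, answer):
    """Step 2 input: one query each from question, answer, and combination."""
    return [extract_query(question),
            extract_query(answer),
            extract_query(question + " " + answer)]

def select_query(candidates):
    # Placeholder selection rule: prefer the shortest non-empty candidate,
    # since verbose queries hurt search-engine performance [16].
    nonempty = [c for c in candidates if c]
    return min(nonempty, key=lambda c: len(c.split())) if nonempty else ""
```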


3. Multimedia Data Selection and Presentation: We perform searches with the generated queries to collect image and video data from Google image search and video search engines such as YouTube. However, most current commercial search engines are built on text-based indexing and usually return many irrelevant results. Therefore, re-ranking that exploits visual information is essential to reorder the initial text-based search results.

4. Re-Ranked Multimedia Results: A graph-based re-ranking method is an efficient way to re-rank documents according to their content features and clarity scores, producing results with minimal repetition.

IV. CONCLUSION

In this review, we have discussed the techniques and approaches developed in community question answering (CQA). We observed that for the past few years users have received replies to their queries in textual format, which was later enhanced by multimedia question answering in CQA, as it can tackle complex queries and achieve better performance. Through this review we found that existing CQA systems provide answer media in the form of text, images, and video; there is no related work on audio, so audio could be an extended feature of future CQA systems. There are also a few failure cases: for example, the system may fail to generate reasonable multimedia answers if the generated queries are verbose and complex.

V. ACKNOWLEDGEMENT

This review has been partially supported by the Department of Computer Science and IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad. The views expressed here are those of the authors only.

REFERENCES

[1] S. M. Harabagiu, S. J. Maiorano, and M. A. Pasca, "Open-domain textual question answering techniques," Natural Language Engineering, vol. 9, no. 3, pp. 1–38, 2003.
[2] C. Shah and J. Pomerantz, "Evaluating and predicting answer quality in community QA," in Proc. ACM SIGIR Conf., Geneva, Switzerland, Jul. 2010.
[3] L. Nie, M. Wang, Y. Gao, and Z.-J. Zha, "Beyond text QA: Multimedia answer generation by harvesting web information," IEEE Transactions on Multimedia, vol. 15, no. 2, Feb. 2013.
[4] R. Hong and M. Wang, "Multimedia question answering," IEEE MultiMedia, 2012.
[5] TREC: The Text REtrieval Conference. [Online]. Available: http://trec.nist.gov/ [Accessed: Apr. 8, 2014].
[6] S. A. Quarteroni and S. Manandhar, "Designing an interactive open domain question answering system," J. Natural Lang. Eng., vol. 15, no. 1, pp. 73–95, 2008.
[7] D. Mollá and J. L. Vicedo, "Question answering in restricted domains: An overview," Computational Linguistics, vol. 13, no. 1, pp. 41–61, 2007.
[8] H. Cui, M.-Y. Kan, and T.-S. Chua, "Soft pattern matching models for definitional question answering," ACM Trans. Inf. Syst., vol. 25, no. 2, 2007.
[9] R. C. Wang, N. Schlaefer, W. W. Cohen, and E. Nyberg, "Automatic set expansion for list question answering," in Proc. Int. Conf. Empirical Methods in Natural Language Processing, 2008.
[10] G. Li, H. Li, Z. Ming, R. Hong, S. Tang, and T.-S. Chua, "Question answering over community-contributed web video," IEEE Multimedia, vol. 17, no. 4, pp. 46–57, 2010.
[11] T. Yeh, J. J. Lee, and T. Darrell, "Photo-based question answering," in Proc. ACM Int. Conf. Multimedia, 2008.
[12] X. Li and D. Roth, "Learning question classifiers," in Proc. Int. Conf. Computational Linguistics, 2002.
[13] J. Zhang, R. Lee, and Y. J. Wang, "Support vector machine classifications for microarray expression data set," in Proc. Int. Conf. Computational Intelligence and Multimedia Applications, 2003.
[14] J. Arguello, F. Diaz, J. Callan, and J. F. Crespo, "Sources of evidence for vertical selection," in Proc. ACM Int. SIGIR Conf., 2009.
[15] S. Cronen-Townsend, Y. Zhou, and W. B. Croft, "Predicting query performance," in Proc. ACM Int. SIGIR Conf., 2002.
[16] L. Nie, S. Yan, M. Wang, R. Hong, and T.-S. Chua, "Harvesting visual concepts for image search with complex queries," in Proc. ACM Int. Conf. Multimedia, 2012.