Transitive Strategy on CLQA System for Low Resource Language Pairs via Development of Indonesian-Japanese CLQA



Ayu Purwarianti1, Masatoshi Tsuchiya2, Seiichi Nakagawa2 1 Bandung Institute of Technology, 2Toyohashi University of Technology 1 [email protected]

Abstract

We investigate a transitive strategy for CLQA (Cross Language Question Answering). The strategy makes a CLQA system easily adaptable to other language pairs. There are two transitive strategies: 1) transitive translation; 2) transitive passage retrieval. English is used as the pivot language. In the transitive translation, Indonesian keywords are translated into English and then into Japanese; the translated Japanese keywords are used to retrieve relevant Japanese passages. In the second strategy, English passage retrieval is performed first, and the relevant Japanese passages are located using the retrieved English passages. Our CLQA system also includes an Indonesian question analyzer and a Japanese answer finder. The question analyzer consists of a question shallow parser and a machine-learning-based question classifier. The Japanese answer finder uses an SVM algorithm to select the answer candidates. In the experiments, we used the test set of the NTCIR 2005 CLQA1 task, consisting of 200 questions.

1. Introduction

CLQA (Cross Language Question Answering) systems have become an interesting research area. There are several research forums for the CLQA task, such as CLEF (Cross Language Evaluation Forum) and NTCIR (NII Test Collection for IR Systems). Indonesian has been included in the CLEF test data for the Indonesian-English CLQA task since 2005 [1], and Japanese in the NTCIR test data for Japanese-English and English-Japanese CLQA since 2005 [12]. In the Indonesian-English CLQA task, two teams [7,10] adopted a common approach to building the CLQA system: an Indonesian question classifier (using some classification rules), Indonesian-English machine translation software, English passage retrieval, and an answer finder built from named entity-EAT matching rules. In the NTCIR 2005 CLQA task, 5 teams joined the English-Japanese task and 4 teams the Japanese-English task. Most of them adopted an approach similar to the Indonesian-English one above. Owing to the rich resources available between English and Japanese, the translation modules employed high-quality machine translation, large bilingual dictionaries, and statistical machine translation (parallel corpora). The highest accuracy on English-Japanese CLQA was achieved by FORST [5], which used machine translation (MT) software and a web search to translate proper nouns. In the answer finder, they scored each morpheme in the retrieved passages by matching it against the question keywords and EAT. The system achieved 15.5% accuracy for the Top1 answer. On Japanese-English CLQA, the highest accuracy was obtained by NCQAL [3] with three Japanese-English dictionaries (EDICT with 110,428 entries, ENAMDICT with 483,691 entries, and an in-house dictionary with 660,778 entries). In the answer finder, they matched the EAT with a rule-based named entity tagger (669 rules for 143 answer types). The accuracy was 31.5%, outperforming the other submitted runs, e.g. the second-ranked run with 9% accuracy. The most closely related work is QATRO [11], which submitted runs for both English-Japanese and Japanese-English. QATRO employed the machine learning technique of Maximum Entropy Models (MEMs) to extract answers from combined question and document features. The idea is to classify each word with an I/O/B label. This approach is also adopted in our Japanese answer finder module; the differences lie in the features used and the machine learning technique. While QATRO did not use the EAT information, we claim that this information is quite effective for handling an untranslated question focus. QATRO also used a POS matching feature between the question and document, which we do not use in our system. With 300 question/answer pairs of training data, QATRO achieved 0.5% accuracy for the Top1 answer in the English-Japanese task and 1% accuracy in the Japanese-English task.

For a language pair with limited resources such as Indonesian and Japanese, transitive translation using bilingual dictionaries can be one solution. Transitive translation has been widely employed in CLIR (Cross Language Information Retrieval) research [2,4,9]. Ballesteros [4] translated Spanish queries into French with English as the interlingua, using the Collins Spanish-English and English-French dictionaries. Gollins & Sanderson [9] translated German queries into English using two pivot languages (Spanish and Dutch) with EuroWordNet as the data resource. We [2] translated Indonesian into Japanese using the KEBI Indonesian-English and Eijirou English-Japanese dictionaries for Cross Lingual Information Retrieval (CLIR). In this research, we apply this transitive translation in our Indonesian-Japanese CLQA.

The translation problem in CLIR differs from the one in a CLQA system. In CLQA, not all important keywords are included in the input question: the answer itself, an important keyword, is absent from the question and must be searched for in the target documents. Another characteristic is that, in CLQA, a relevant document might not contain the question keywords as its own important keywords. For example, for the question "Who is the Japanese Prime Minister?", one could find the answer in a document that does not specifically discuss the prime minister. This is quite different from the CLIR setting. Aiming to build an easily ported CLQA system, we based the other modules of our system on machine learning and easily prepared rules.
Using a common architecture, our CLQA system consists of an Indonesian question analyzer, an Indonesian-Japanese keyword translator, a Japanese passage retriever, an English passage retriever, and a Japanese answer finder. Each of these modules is described in the next section.

2. Transitive Strategy in Japanese CLQA

2.1 Indonesian-Japanese Transitive Translation

We perform the Indonesian-Japanese translation not on the question sentence but on the keywords extracted from it. First, an Indonesian question sentence is processed by the question analyzer. The results are then processed by the translation module to obtain Japanese keywords, which are used to retrieve Japanese passages. In the last step, a Japanese answer finder classifies the answer candidates from the Japanese passages using the Japanese keywords, the Japanese question main word, and the output of the question analyzer.

Fig. 1. Architecture of Indonesian-Japanese CLQA using Transitive Translation
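As an illustration of the transitive keyword translation step, the following minimal Python sketch chains two bilingual lookups through English. The tiny dictionaries are illustrative stand-ins for the KEBI Indonesian-English and Eijirou English-Japanese dictionaries used in the actual system.

```python
# Minimal sketch of transitive keyword translation with English as the
# pivot language. ID_EN and EN_JA are toy stand-ins for the real
# bilingual dictionaries (KEBI and Eijirou).

ID_EN = {"menteri": ["minister"], "partai": ["party"],
         "koalisi": ["coalition"]}
EN_JA = {"minister": ["大臣", "公使"], "party": ["政党", "パーティー"],
         "coalition": ["連立", "連合"]}

def transitive_translate(id_word):
    """Indonesian -> English -> Japanese; an empty list marks an OOV word."""
    japanese = []
    for en in ID_EN.get(id_word, []):
        for ja in EN_JA.get(en, []):
            if ja not in japanese:
                japanese.append(ja)
    return japanese

print(transitive_translate("partai"))        # ['政党', 'パーティー']
print(transitive_translate("pemerintahan"))  # [] -> untranslated OOV
```

An OOV result (such as "pemerintahan" above) would then be handed to the fallback treatments described in Section 3.2.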

The first transitive strategy involves transitive translation in the CLQA. English is used as the pivot language in the keyword translation. The overall architecture is shown in Fig. 1.

2.2 Indonesian-Japanese Transitive Passage Retriever

Fig. 2. Architecture of Indonesian-Japanese CLQA with Transitive Passage Retriever

The main difference from the first approach lies in the passage retriever. In the first approach, only Japanese passage retrieval is conducted; in contrast, the second approach involves both English and Japanese passage retrieval (see Fig. 2 for the overall architecture). In this way, the pivot language is not used only in the translation phase. The nouns (proper and common) of the retrieved English passages are translated into Japanese and used as input for the Japanese passage retriever. The rest of the process is the same as in the first approach. We argue that by using a pivot-language passage retriever, our system can achieve better passage retrieval performance. This scheme can be seen as query expansion in the Japanese passage retriever, which can compensate for the low translation quality of Indonesian-Japanese translation.
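The query-expansion view of the transitive passage retriever can be sketched as follows: words from the retrieved English passages are translated into Japanese and merged into the Japanese query. The dictionary and passage below are toy examples, not data from the actual system.

```python
# Sketch of the transitive passage retriever as query expansion: words
# from retrieved English passages are translated into Japanese and
# appended to the Japanese query. EN_JA is an illustrative toy dictionary.

EN_JA = {"government": ["政府"], "coalition": ["連立"], "party": ["政党"]}

def expand_query(ja_keywords, english_passages):
    expanded = list(ja_keywords)
    for passage in english_passages:
        for word in passage.lower().split():
            for ja in EN_JA.get(word.strip(".,"), []):
                if ja not in expanded:
                    expanded.append(ja)  # expansion term from the pivot side
    return expanded

query = expand_query(["連立"],
                     ["The coalition government includes one more party."])
print(query)  # original keyword plus pivot-derived expansion terms
```

Even when the direct Indonesian-Japanese translation misses a keyword, a good English retrieval pass can reintroduce it on the Japanese side this way.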

3. Modules in the Indonesian-Japanese CLQA

3.1 Indonesian Question Analyzer

This module consists of two main components: a question shallow parser and a question classifier. The question shallow parser identifies question information using some simple rules. The question information includes the question keywords, the question main word, the interrogative word, and phrase information. The question classifier extracts the EAT (expected answer type) using an SVM algorithm. It classifies a question into one of the following classes: date, location, person, organization, name, and quantity. The main input to the question classifier is the question information produced by the question shallow parser. To enhance the question classifier's quality, we add a statistical value for the question main word, called the bi-gram frequency score. This score is the frequency of a bi-gram pair between the question main word and each word in a predefined word list. We also add WordNet distance information as another attribute: the distance between the question main word (translated into English) and certain WordNet synsets (taken from the 25 noun lexicographer files in WordNet). Fig. 3 shows an example output of the question analyzer.

Question: Pemerintahan koalisi Perdana Menteri Obuchi terdiri atas LDP, Komei, dan partai apa lagi?

(Prime Minister Obuchi's coalition government consists of LDP, Komei, and what other party?)
Question shallow parser result: interrogative word: apa (what); main word: partai (party); phrase information: NP-POST; question keywords: pemerintahan (government), koalisi (coalition), Perdana Menteri (prime minister), Obuchi, LDP (Liberal Democratic Party), Komei
Bi-gram frequency score: date(0,0), location(0.0101,11), name(0.0040,9), organization(0.0789,22), person(0.0005,5), quantity(0,0)
WordNet distance score: all zero except for event(0.1724), group(0.4828), and person(0.3448)
Question classifier result: organization

Fig. 3. Example of the Question Analyzer's Output
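The bi-gram frequency score from Section 3.1 can be illustrated with a small sketch. The paper does not give the exact formula, so the counting below, co-occurrence of the main word with class-indicative words, normalized by the total bi-gram count, is an assumption; the word lists and corpus are toy examples.

```python
# Illustrative sketch of a bi-gram frequency score for the question
# classifier: how often does the question main word form a bi-gram with
# class-indicative words? CLASS_WORDS and the corpus are toy assumptions,
# and the (relative frequency, raw count) output mimics Fig. 3's format.

from collections import Counter

CLASS_WORDS = {"organization": ["partai", "pemerintahan"],
               "person": ["menteri"]}

def bigram_score(main_word, corpus_tokens):
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    total = max(sum(bigrams.values()), 1)
    scores = {}
    for label, words in CLASS_WORDS.items():
        hits = sum(bigrams[(main_word, w)] + bigrams[(w, main_word)]
                   for w in words)
        scores[label] = (hits / total, hits)  # (relative freq, raw count)
    return scores

corpus = "partai koalisi pemerintahan partai menteri partai".split()
print(bigram_score("partai", corpus))
```

A score pair such as organization(0.0789,22) in Fig. 3 then acts as one numeric feature per class for the SVM.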

3.2 Indonesian-Japanese Keyword Translator

Based on observations of the collected Indonesian questions, three word types occur in Indonesian questions: native Indonesian words, English words (such as "barrel", "cherry", etc.), and transformed English words (such as "prefektur" from "prefecture", "agensi" from "agency", etc.). For native Indonesian words, we use two dictionaries: Indonesian-English (KEBI; 29,047 entries) and English-Japanese (Eijirou; 556,237 entries). If a word is not found in the Indonesian-English dictionary, we assume it is an OOV word and treat it as the second or third word type. For the second word type, we use the English-Japanese dictionary, a Japanese proper name dictionary, a Japanese corpus, and a katakana/hiragana transliteration module. The katakana entries in the Japanese proper name dictionary are transliterated into the alphabet, and word matching is done on this alphabet form. The same is done for the Japanese corpus: it is first morphologically analyzed by ChaSen (http://chasen-legacy.sourceforge.jp/), and the katakana output of ChaSen is transliterated into the alphabet. For the third word type, we use transformation rules to transform the Indonesian words into English words; the resulting English words are then treated as the second word type. An example translation result is shown in Fig. 4.

Question keywords: shown in Fig. 3
Translation:
- pemerintahan (untranslated OOV)

- koalisi 一体化, 連合, 連立
- perdana 火薬を詰, 準備, 元請業者, 青春, 全盛, 一流, 根本的, 最上, 首位
- menteri 公使
- Obuchi 小渕, 尾駮, 小淵, 雄淵, おぶち
- LDP LDP
- Komei こうめい, 光明, 公明, 広明, 昂明, 港明, 高明, 米井, こめい

Fig. 4. Example of the Question Translation's Output

3.3 Japanese Passage Retrieval

We use GETA (http://geta.ex.nii.ac.jp/) as our generic information retrieval engine. It retrieves Japanese documents for a keyword set using an IDF, TF, or TFxIDF score. The Japanese translation results are joined into one query and fed into GETA to obtain the relevant Japanese documents. All passages (2 sentences) that contain a word matching a question keyword are used as the retrieved passages. Because transitive translation through bilingual dictionaries produces many Japanese translation candidates, the translations are filtered using mutual information and the TFxIDF score. First, all combinations of the keyword translation sets are ranked by their mutual information score. Each of the top 5 sets by mutual information score is used to retrieve relevant documents. In the final phase, the 100 documents with the highest TFxIDF scores are selected from all relevant documents returned by these queries.
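The first filtering step, ranking the combinations of translation candidates, can be sketched as follows. As an assumption, summed pairwise co-occurrence counts stand in for the actual mutual information computation, and the candidate sets and counts are toy examples.

```python
# Sketch of the translation-set filtering in Section 3.3: every
# combination of candidate Japanese translations forms one keyword set,
# and sets are ranked by a mutual-information-style score. Here summed
# pairwise co-occurrence is a stand-in for the real MI computation;
# CANDIDATES and COOCCUR are illustrative assumptions.

from itertools import product, combinations

CANDIDATES = {"menteri": ["大臣", "公使"], "koalisi": ["連立", "連合"]}
COOCCUR = {("大臣", "連立"): 50, ("公使", "連立"): 2,
           ("大臣", "連合"): 5, ("公使", "連合"): 1}

def set_score(keyword_set):
    # Sum co-occurrence over all keyword pairs in the set.
    return sum(COOCCUR.get(tuple(sorted(pair)), 0)
               for pair in combinations(keyword_set, 2))

sets = [tuple(p) for p in product(*CANDIDATES.values())]
ranked = sorted(sets, key=set_score, reverse=True)
print(ranked[0])  # the top-ranked keyword set becomes the first query
```

In the actual system, the top 5 such sets each issue a query, and the union of results is then filtered by TFxIDF.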

3.4 Japanese Answer Finder

Our Japanese answer finder locates answer candidates using a machine-learning-based text chunking approach, in which the document features are directly matched with the question features. Each word in the retrieved passages is classified as "B" (first word of an answer candidate), "I" (part of an answer candidate), or "O" (not part of an answer candidate). The features used for classification include the document features, question features, EAT information, and a similarity score. The document features include POS information (four POS fields produced by ChaSen) and the lexical form. The question features include the question shallow parser result and the translated question main word. The similarity score measures the similarity between the document word and the question keywords. An example of the document features for the document word "自民党" with the question "Prime Minister Obuchi's coalition government consists of LDP, Komei, and what other party?" is shown in Fig. 5.

Document word: 自民党
POS information (ChaSen): 名詞, 固有名詞, 組織
Similarity score: 0
5 preceding words: に, よる, 連立 (similarity score is 1), 政権, は
5 succeeding words: 執行, 部, が, 小渕 (similarity score is 1), 前

Fig. 5. Example of Document Features for the Answer Finder Module

4. Experiments

4.1 Experimental Data

To obtain enough training data for the question classifier and the answer finder modules, we collected our own Indonesian-Japanese CLQA data. So far, we have 2,837 Indonesian questions and 1,903 answer-tagged Japanese passages. About 1,200 of the Japanese passages (more than half of the 1,903) were obtained from Japanese native speakers after reading the English question, the English answer, and the corresponding Japanese article. Fig. 6 shows some examples of Indonesian/English questions along with their relevant Japanese passages.

Fig. 6. Examples of Indonesian-Japanese Question-Passage Pairs
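The B/I/O labeling used by the Japanese answer finder (Section 3.4) turns answer extraction into per-word classification; once each word is labeled, contiguous B/I runs are decoded into answer candidates, as in this minimal sketch (the example passage and labels are illustrative).

```python
# Sketch of decoding B/I/O word labels into answer candidates, as in the
# answer finder of Section 3.4: "B" starts a candidate, "I" continues it,
# "O" ends it. The example words and labels are illustrative.

def decode_candidates(words, labels):
    candidates, current = [], []
    for word, label in zip(words, labels):
        if label == "B":                 # first word of a candidate
            if current:
                candidates.append("".join(current))
            current = [word]
        elif label == "I" and current:   # continuation of a candidate
            current.append(word)
        else:                            # "O": outside any candidate
            if current:
                candidates.append("".join(current))
            current = []
    if current:
        candidates.append("".join(current))
    return candidates

words  = ["連立", "政権", "は", "自民", "党", "と"]
labels = ["O", "O", "O", "B", "I", "O"]
print(decode_candidates(words, labels))  # ['自民党']
```

The SVM chunker (YamCha, in our experiments) produces the label sequence; the decoding step above is what turns it into ranked answer strings.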

As test data, we translated the 200 English test questions of the NTCIR 2005 CLQA1 task into Indonesian. The Japanese corpus is the Yomiuri Newspaper corpus, years 2000-2001 (658,719 articles). The English corpus is the Daily Yomiuri, years 2000-2001 (17,741 articles). For the question classifier, we used an SVM algorithm from the WEKA software (http://www.cs.waikato.ac.nz/ml/weka/) with a linear kernel and the "string to word vector" function, and 10-fold cross validation for the accuracy calculation.
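The "string to word vector" preprocessing turns each question string into a bag-of-words vector before it reaches the SVM. This sketch mirrors the WEKA filter in spirit only; it is not the WEKA implementation, and the training questions are toy examples.

```python
# Sketch of "string to word vector" preprocessing: each question becomes
# a term-count vector over the vocabulary of the training questions.
# This imitates the WEKA filter's idea, not its implementation.

def build_vocab(questions):
    vocab = {}
    for q in questions:
        for w in q.lower().split():
            vocab.setdefault(w, len(vocab))  # first-seen index per word
    return vocab

def to_vector(question, vocab):
    vec = [0] * len(vocab)
    for w in question.lower().split():
        if w in vocab:
            vec[vocab[w]] += 1  # unseen words are simply dropped
    return vec

train = ["siapa perdana menteri jepang", "partai apa lagi"]
vocab = build_vocab(train)
print(to_vector("partai apa", vocab))
```

The resulting fixed-length vectors, concatenated with the bi-gram frequency and WordNet distance features, form the SVM input.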

4.2 Experimental Results

In this section, we describe the performance of the passage retriever and the Japanese answer finder for the first approach (transitive translation) and the second approach (transitive translation and transitive passage retriever).

4.2.1 First Approach (Indonesian-Japanese CLQA with Transitive Translation)

The performance of the Japanese passage retriever is shown in Table 1. Two translation methods are compared in the experiments: direct translation and transitive translation, both employing bilingual dictionaries. Table 1 shows two evaluation measures: precision and recall. Precision is the average ratio of relevant passages to retrieved passages; a relevant passage is one that contains a correct answer, without considering any available supporting evidence. Recall is the ratio of questions for which at least one relevant passage was retrieved to the total number of test questions. "n-th MI score" means the input query is the keyword set with the n-th ranked MI score; "Top n MI" means the input query consists of the keywords from the 1st through the n-th ranked MI score; "MI-TFxIDF" is the combination of the mutual information score and the TFxIDF score.

In the keyword translation, even though direct translation yields many more OOV common nouns than transitive translation, direct translation generally has better retrieval performance (higher precision for all methods and higher recall for almost all methods). This indicates that the important keywords in document retrieval are mostly proper nouns (direct translation has fewer OOV proper nouns than transitive translation). Table 1 also shows that without the combined TFxIDF and mutual information filtering, the transitive translation result has a lower recall score than direct translation. The combined filtering is effective for the transitive translation result because it reduces the number of incorrect Japanese translations. For direct translation, the combined filtering is not effective because the number of Japanese translation candidates is much smaller than for transitive translation. Table 1 also shows that our translation achieved only about 50% of the accuracy of the oracle experiment (last row: document retrieval using keywords extracted from Japanese monolingual queries).

Table 1. Japanese Passage Retriever's Experimental Results on Indonesian-Japanese CLQA with Transitive Translation (Schema 1)
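The two retrieval measures defined for Table 1 can be computed as in this sketch; the per-question counts are illustrative inputs.

```python
# Sketch of the two passage retrieval measures: precision is the average
# per-question ratio of relevant to retrieved passages (0 when nothing is
# retrieved), recall is the fraction of questions with at least one
# relevant retrieved passage. Inputs are illustrative.

def retrieval_metrics(results):
    # results: list of (num_relevant_retrieved, num_retrieved) per question
    precisions = [rel / ret for rel, ret in results if ret > 0]
    precision = sum(precisions) / len(results)
    recall = sum(1 for rel, _ in results if rel > 0) / len(results)
    return precision, recall

p, r = retrieval_metrics([(2, 10), (0, 10), (5, 10), (0, 0)])
print(p, r)  # 0.175 0.5
```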

For the Japanese answer finder module, we used YamCha (http://chasen.org/~taku/software/yamcha) with its default configuration as the SVM-based text chunking software. To evaluate the answer finder module, we conducted answer finder experiments on the correct passages. The results are shown in Table 2.

Table 2. Question Answering Accuracy for Correct Japanese Documents

The evaluation scores are Top1 (the rate of correct top-1 answers), Top5 (the rate of questions with at least one correct answer among the top 5 answers), TopN (the rate of questions with at least one correct answer among all found answers), and MRR (Mean Reciprocal Rank: the average reciprocal rank 1/n of the highest rank n of a correct answer for each question). "Baseline" means we use the features described in Section 3.4. We add an additional feature, the word distance, which is the distance between the current document word and another document word that matches a question keyword. To see the effect of transitive translation on the answer finder features, we conducted two kinds of experiments on the oracle correct documents. The first used transitive translation to compute the document word similarity score (first two rows); the second used the correct translation, i.e. the Japanese keywords contained in the Japanese queries (last two rows). The comparison shows that, for the answer finder, the use of transitive translation has no significant effect, unlike the document retrieval results, where the recall of transitive translation is about half of that obtained with the correct Japanese keywords.

Table 3. Question Answering Accuracy for Retrieved Documents
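The four answer-ranking scores defined above follow directly from the rank of the first correct answer per question, as in this sketch (the ranks are illustrative).

```python
# Sketch of the answer-ranking metrics: for each question we record the
# rank of the first correct answer (None if no correct answer was found);
# Top1/Top5/TopN and MRR then follow directly. Ranks are illustrative.

def qa_metrics(first_correct_ranks):
    n = len(first_correct_ranks)
    top1 = sum(1 for r in first_correct_ranks if r == 1) / n
    top5 = sum(1 for r in first_correct_ranks
               if r is not None and r <= 5) / n
    topn = sum(1 for r in first_correct_ranks if r is not None) / n
    mrr = sum(1 / r for r in first_correct_ranks if r is not None) / n
    return top1, top5, topn, mrr

print(qa_metrics([1, 3, None, 10]))
```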

Although the CLQA result is not high, it is better than that of a similar study by QATRO [11] for English-Japanese, which achieved 0.5% accuracy for the Top1 answer. The low overall result is mostly due to the low passage retrieval score, which in turn is mainly caused by translation errors between Indonesian and Japanese. With the available resources, the translation could not resolve the OOV proper noun problem: many proper nouns that are important question keywords could not be translated using the translation resources.

4.2.2 Second Approach (Indonesian-Japanese CLQA with Transitive Passage Retriever)

The performance of the Japanese passage retriever is shown in Table 4, using the same evaluation measures described in Section 4.2.1: recall and precision. Each experiment in Table 4 uses all retrieved English passages, the result of the transitive passage retrieval, as input to the Japanese passage retrieval. The Japanese passage outputs are then filtered using the four methods listed in the first column of Table 4.

Table 4. Japanese Passage Retriever Performance of Indonesian-Japanese CLQA with Transitive Passage Retriever

For the answer finder, using the 100 highest-TFxIDF documents, the accuracy is 4% Top1, 11.5% Top5, and 26.5% TopN, with an MRR of 0.07. The accuracy is higher than that of Schema 1 (Table 3) because the passage retriever score achieved by Schema 2 is higher than that of Schema 1. The advantage of the transitive passage retriever is the better passage retrieval performance; the disadvantage is the longer processing time, since translation and passage retrieval must each be done twice. As the final experiment, we applied the same answer finder module to the passage retriever results, using the passage retriever with the combined MI and TFxIDF filtering method. The question accuracy scores are shown in Table 3. The answers were ranked using the recall score (R) of the document retrieval and the text chunking (T) score produced by YamCha.

5. Conclusions

We have built an Indonesian-Japanese CLQA system using easily adapted modules and a transitive approach. There are two transitive strategies: transitive translation and the transitive passage retriever. The experimental results show that the transitive passage retriever enhances passage retrieval performance and gives a slight improvement in answer finder performance. Using transitive translation, the document retrieval and answer finder results are comparable with those of direct translation. In addition, compared with other English-Japanese research using a similar approach, our Indonesian-Japanese CLQA gives better accuracy scores.

References

[1] Alessandro Vallin, Bernardo Magnini, Danilo Giampiccolo, Lili Aunimo, Christelle Ayache, Petya Osenova, Anselmo Peñas, Maarten de Rijke, Bogdan Sacaleanu, Diana Santos, and Richard Sutcliffe. Overview of the CLEF 2005 Multilingual Question Answering Track. In Proc. of CLEF 2005 Workshop, Vienna, Austria, September 2005.
[2] Ayu Purwarianti, Masatoshi Tsuchiya, and Seiichi Nakagawa. Indonesian-Japanese Transitive Translation using English for CLIR. Journal of Natural Language Processing, Information Processing Society of Japan, 14(2), 2007.
[3] Hideki Isozaki, Katsuhito Sudoh, and Hajime Tsukada. NTT's Japanese-English Cross-Language Question Answering System. In Proc. of NTCIR-5 Workshop Meeting, pages 186-193, Tokyo, Japan, December 2005.
[4] Lisa A. Ballesteros. Cross-language Retrieval via Transitive Translation. In W. Bruce Croft, editor, Advances in Information Retrieval, pages 203-230. Kluwer Academic Publishers, 2000.
[5] Tatsunori Mori and Masami Kawagishi. A Method of Cross Language Question Answering based on Machine Translation and Transliteration. In Proc. of NTCIR-5 Workshop Meeting, pages 215-222, Tokyo, Japan, December 2005.
[6] Kei Shimizu, Tomoyosi Akiba, Atsushi Fujii, and Katunobu Itou. Bi-directional Cross Language Question Answering using a Single Monolingual QA System. In Proc. of NTCIR-5 Workshop Meeting, pages 236-241, Tokyo, Japan, December 2005.
[7] Mirna Adriani and Rinawati. University of Indonesia Participation at Query Answering-CLEF 2005. In Proc. of CLEF 2005 Workshop, Vienna, Austria, September 2005.
[8] S. E. Robertson and S. Walker. Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In Proc. of the Seventeenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1994.
[9] Tim Gollins and Mark Sanderson. Improving cross language retrieval with triangulated translation. In Proc. of SIGIR '01, 2001.
[10] Sri Hartati Wijono, Indra Budi, Lily Fitria, and Mirna Adriani. Finding Answers to Indonesian Questions from English Documents. In Proc. of CLEF 2006 Workshop, Spain, September 2006.
[11] Yutaka Sasaki. Baseline System for NTCIR-5 CLQA1: An Experimentally Extended QBTE Approach. In Proc. of NTCIR-5 Workshop Meeting, pages 230-235, Tokyo, Japan, December 2005.
[12] Yutaka Sasaki, Hsin-Hsi Chen, Kuang-hua Chen, and Chuan-Jie Lin. Overview of the NTCIR-5 Cross-Lingual Question Answering Task (CLQA1). In Proc. of NTCIR-5 Workshop Meeting, Tokyo, Japan, December 2005.