The 6th International Conference on Science, Technology and Innovation for Sustainable Well-Being (STISWB VI), 28-30 August 2014, Apsara Angkor Resort & Conference, Siem Reap, Kingdom of Cambodia

CSE-127

Multilingual Dictionary & Phrasebook for Thai-to-ASEAN Languages on Android Smartphone

Supeeti Kulchan* and Wanthanee Prachuabsupakij
Department of Information Technology, Faculty of Industrial Technology and Management
King Mongkut’s University of Technology North Bangkok, Thailand
*Corresponding author

Abstract
Many people have trouble communicating when they travel abroad to places where English is not widely understood. This is an obstacle to obtaining services such as transport, dining and lodging. This paper presents a smartphone application for the Android operating system. Initially, the application targets Thai speakers who plan to travel in the 10 ASEAN member countries. It supports 10 official languages: Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Thai and Vietnamese. The user searches for Thai words and phrases to obtain their translation in the target language. Each search result is presented in three forms: (1) the script of the target language, (2) a romanization of the target language, and (3) a vocalization of the target language, so that the user can communicate more clearly. We expect the application to help reduce the language gap in ASEAN.

Keywords: Mobile Application, Multilingual Dictionary and Phrasebook.

1. Introduction
The ASEAN Economic Community (AEC) is due to take effect in 2015. Its main objectives are to establish (a) a single market and production base, (b) a highly competitive economic region, (c) a region of equitable economic development, and (d) a region fully integrated into the global economy [1]. A single visa among ASEAN members is expected to take effect [2], [3], allowing visitors to move freely within the region. Thai tourists could then travel in ASEAN member countries under a single visa, with a single validity for each country. ASEAN has 10 member countries with 12 different official languages, as shown in Table 1.

Table 1: List of ASEAN member countries and their official language(s).

No.  Country             Official language(s)
1    Brunei Darussalam   Malay and English
2    Cambodia            Cambodian (Khmer)
3    Indonesia           Indonesian
4    Lao PDR             Lao
5    Malaysia            Malay, English, Chinese, Tamil
6    Myanmar             Burmese
7    Philippines         Filipino, English, Spanish
8    Singapore           English, Malay, Chinese, Tamil
9    Thailand            Thai
10   Vietnam             Vietnamese

Source: http://www.asean.org/asean/asean-member-states

The diversity of languages in ASEAN is an obstacle to tourists’ activities such as trading and requesting services. Communication is difficult for travelers who do not know the local vocabulary or how to pronounce basic conversational phrases. Even with a dictionary and phrasebook, pronouncing the language is still not easy. Meanwhile, smartphones have become powerful: they complete complex computations in seconds and can store large amounts of data. Motivated by this, we have created a smartphone application for the Android operating system, named ASEAN-Tourist. Initially, the application targets Thai speakers who plan to travel in the 10 ASEAN member countries. It supports 10 official languages: Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Thai and Vietnamese. The user searches for Thai words and phrases to obtain their translation in the target language. Each search result is presented in three forms: (1) the script of the target language, (2) a romanization (transliteration) of the target language, and (3) a vocalization of the target language, so that the user can communicate more clearly.

2. The Scenario
In this section, we describe a scenario for using the ASEAN-Tourist application.

The application is initially designed for native Thai speakers. Consider Thai tourists who plan to travel among ASEAN member countries. They have no knowledge of the local languages at their destinations, and carrying a dictionary and phrasebook for every language is impractical. Moreover, they cannot read any foreign language except English. They download and install ASEAN-Tourist on their smartphones. The application helps them where English is not suitable for communication: they simply search for what they want to say and show the screen to the person they are talking to. They can also touch an icon to vocalize the word or phrase.

3. Related Work
Our review focuses on mobile applications available free of charge on Google Play (https://play.google.com/store). The three most relevant applications are discussed below.

3.1 App. “ASEAN Language AEC”
This mobile application [4] is intended for learners and novice speakers of ASEAN languages. It contains 9 languages for 9 countries, as illustrated in Figure 1(a). There are 14 categories: greeting, food, counting, time, currency, shopping, traveling, general, love phrases, color, animal, fruit, profession, and date and month.

Fig. 1 (a) “ASEAN Language AEC” application contains 9 languages for 9 countries. (b) Greeting phrases for Burmese.

Pros: (1) Words and phrases are transliterated into Thai, as shown in Figure 1(b). This makes it easier for the user to pronounce them correctly. Moreover, the user can edit a transliterated word if it is wrong.
Cons: (1) Words and phrases of the target language are not displayed in their own script, although showing the target script is a way to communicate clearly. (2) There is no vocalization function, which users might expect. In addition, the application is intended only for Thai speakers.

3.2 App. “ASEAN Languages”
This mobile application [5] is related to the previous one; the two have largely the same objective and categorization. It supports 10 languages, as shown in Figure 2(a), and includes a vocalization function, as shown in Figure 2(b): the user just touches the symbol at the end of the selected phrase. However, words and phrases of the target language are not displayed in their own script.

Fig. 2 (a) “ASEAN Languages” application contains 10 languages. (b) Greeting phrases with vocalization function for Burmese language.

3.3 App. “Multilingual in AEC”
This application [6] aims to help people who want to travel and communicate with people in ASEAN countries. It supports 5 languages: Thai, Lao, Khmer, Vietnamese and English. There are 21 categories, as shown in Figure 3(a).

Fig. 3 (a) “Multilingual in AEC” application contains 5 languages. (b) Thai phrases and their translation into Lao with vocalization function.

Pros: (1) The user interface is in English, so most people can use it. (2) The user can choose any of the 5 languages as the source language. (3) A vocalization function is provided at the end of each target-language phrase. (4) The target language is displayed in its own script, as shown in Figure 3(b).
Cons: There is no transliteration or romanization of the target language, which users of this application, like users of the two previous ones, need when they wish to pronounce the phrases themselves.

4. ASEAN-Tourist
We developed “ASEAN-Tourist” for people who travel around ASEAN member countries. Initially, the application is intended for Thai speakers. It supports 10 languages: Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Vietnamese and Thai. The application has 2 modules, a dictionary and a phrasebook. In addition, it includes romanization and vocalization functions.
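The entries served by the two modules could be modeled roughly as follows. This is a minimal sketch under stated assumptions, not the application's actual code: the class `Entry`, the table `PHRASEBOOK`, the function `lookup`, the audio file names and the sample translations are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entry:
    """One translated item as it could be shown to the user (hypothetical layout)."""
    script: str        # text in the target language's own script
    romanization: str  # Latin-alphabet rendering for the user to read aloud
    audio_file: str    # name of a synthesized or recorded clip (hypothetical)


# Illustrative sample data only; not taken from the application's database.
# Layout: Thai source phrase -> target language -> Entry.
PHRASEBOOK: Dict[str, Dict[str, Entry]] = {
    "คุณสบายดีไหม": {
        "English": Entry("How are you?", "How are you?", "en_how_are_you.mp3"),
        "Indonesian": Entry("Apa kabar?", "Apa kabar?", "id_how_are_you.mp3"),
        "Vietnamese": Entry("Bạn có khỏe không?", "Ban co khoe khong?",
                            "vi_how_are_you.mp3"),
    },
}


def lookup(thai_text: str, language: str) -> Optional[Entry]:
    """Return the stored entry for a Thai phrase in one target language, or None."""
    return PHRASEBOOK.get(thai_text.strip(), {}).get(language)


if __name__ == "__main__":
    entry = lookup("คุณสบายดีไหม", "Vietnamese")
    if entry is not None:
        print(entry.script, "|", entry.romanization, "|", entry.audio_file)
```

Keeping the three output forms in one record means a single search can drive the on-screen text, the romanized line and the playback button together.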

Fig. 4 (a) “ASEAN-Tourist” application supports 10 languages for all 10 ASEAN member countries. (b) At first, the user has to choose a country on this page.

4.1 Dictionary & Phrasebook
The dictionary module has 6 categories: food, body-manner, traveling-places, trading, number-direction-date-time and facilities. It contains 257 words. The phrasebook module is divided into 11 categories, including greeting, direction, the airport, hotel reservation, taxi, restaurant, car rental, trading, train station and bus station. It contains 101 phrases for daily use.

4.2 Translation
In this version, the application does not translate in real time. Words and phrases were translated into the target languages while the application was being developed. We obtained translations in 2 ways: machine translation and human translation.

Machine Translation (MT): We recognize that machine translation output is not excellent. Nevertheless, it is acceptable for understanding the meaning of the source language. Google Translate (https://translate.google.com) and Bing Translator (http://www.bing.com/translator/) are the best-known online machine translation services. However, neither supports all of the ASEAN languages needed by the application. Table 2 lists which of the 10 ASEAN languages Google Translate and Bing Translator support [7], [8].

Table 2: Support for the 10 ASEAN languages in Google Translate and Bing Translator.

Language     Google Translate   Bing Translator
Burmese      ✗                  ✗
Chinese      ✓                  ✓
English      ✓                  ✓
Filipino     ✓                  ✗
Indonesian   ✓                  ✓
Khmer        ✓                  ✗
Lao          ✓                  ✗
Malay        ✓                  ✓
Vietnamese   ✓                  ✓
Thai         ✓                  ✓
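The support matrix in Table 2 can also be read programmatically. The following is only a sketch: the values are transcribed from Table 2, while the names `SUPPORT`, `supported_by` and `unsupported_by_all` are hypothetical helpers introduced here for illustration.

```python
from typing import Dict, List

# Support matrix transcribed from Table 2 (True = supported, False = not supported).
SUPPORT: Dict[str, Dict[str, bool]] = {
    "Burmese":    {"Google Translate": False, "Bing Translator": False},
    "Chinese":    {"Google Translate": True,  "Bing Translator": True},
    "English":    {"Google Translate": True,  "Bing Translator": True},
    "Filipino":   {"Google Translate": True,  "Bing Translator": False},
    "Indonesian": {"Google Translate": True,  "Bing Translator": True},
    "Khmer":      {"Google Translate": True,  "Bing Translator": False},
    "Lao":        {"Google Translate": True,  "Bing Translator": False},
    "Malay":      {"Google Translate": True,  "Bing Translator": True},
    "Vietnamese": {"Google Translate": True,  "Bing Translator": True},
    "Thai":       {"Google Translate": True,  "Bing Translator": True},
}


def supported_by(engine: str) -> List[str]:
    """Languages that the given engine supports, in table order."""
    return [lang for lang, row in SUPPORT.items() if row[engine]]


def unsupported_by_all() -> List[str]:
    """Languages that no listed engine supports and that need another method."""
    return [lang for lang, row in SUPPORT.items() if not any(row.values())]
```

Under this reading, `unsupported_by_all()` identifies the languages for which machine translation is not an option at all.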

Based on Table 2, Google Translate was selected to translate the prepared words and phrases for Chinese, English, Indonesian and Malay.

Human Translation: This is an accurate method of translation. However, it is costly and extremely time-consuming. Burmese is not supported by either of the online MT services mentioned above, so we used human translation for Burmese: a native Burmese speaker who knows Thai was asked to translate the words and phrases from Thai into Burmese. For the sake of accuracy, human translation was also used for Filipino, Khmer, Lao, Vietnamese and Thai.

4.3 Romanization & Vocalization
To extend the application, we include romanization and vocalization functions, which help users communicate clearly.

Romanization: This is the writing of a language in the Roman (Latin) alphabet. For example, the Thai romanization of “สวัสดีครับ” is “swasdi khrab”. The standard for Thai romanization is set by the Royal Institute (http://www.royin.go.th) [9]. We found a Thai romanization application (http://pioneer.chula.ac.th/~awirote/resources/thai-romanization.html) with a reported system accuracy of 99.58% [10]. The other languages in our application were romanized either by machine or by hand. Google Translate has a romanization function, which we used for Chinese, Indonesian and Malay. The remaining languages were romanized by hand. In Figure 5, the Thai phrase “คุณสบายดีไหม” (How are you?) is translated into 4 languages: (a) Khmer, (b) Burmese, (c) Lao and (d) Vietnamese. The application displays each translation together with its romanized form, and users simply read the romanized phrase aloud from the screen.
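The per-language choices stated in Sections 4.2 and 4.3 can be summarized as a small lookup. This is a sketch for illustration only: the language sets follow the text above, but the function names and the returned labels are hypothetical and do not come from the application.

```python
LANGUAGES = [
    "Burmese", "Chinese", "English", "Filipino", "Indonesian",
    "Khmer", "Lao", "Malay", "Vietnamese", "Thai",
]

# Section 4.2: languages translated with Google Translate; the rest by a human.
GOOGLE_TRANSLATED = {"Chinese", "English", "Indonesian", "Malay"}

# Section 4.3: languages romanized with Google Translate; Thai uses the
# romanization application cited as [10]; the rest are romanized by hand.
GOOGLE_ROMANIZED = {"Chinese", "Indonesian", "Malay"}
TOOL_ROMANIZED = {"Thai"}


def _check(language: str) -> None:
    """Reject languages outside the 10 supported ones."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language}")


def translation_method(language: str) -> str:
    """Return how words and phrases were translated for this language."""
    _check(language)
    return "machine" if language in GOOGLE_TRANSLATED else "human"


def romanization_method(language: str) -> str:
    """Return how the romanized form was produced for this language."""
    _check(language)
    if language in TOOL_ROMANIZED:
        return "romanization tool"
    if language in GOOGLE_ROMANIZED:
        return "machine"
    return "human"
```

For example, `translation_method("Burmese")` and `romanization_method("Lao")` both fall through to the human path, matching the text.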

Fig. 5 Translation of the phrase “คุณสบายดีไหม” (How are you?) into (a) Khmer, (b) Burmese, (c) Lao and (d) Vietnamese.

Vocalization: Vocalization, or Text-to-Speech (TTS), generates speech from a given text. In the application, the audio for each language comes from one of 2 sources: machine or human. Google Translate also has a TTS function, which we used to synthesize Chinese, English and Indonesian. Although Malay is an official language in Malaysia, Singapore and Brunei, it is spoken differently in each country; we therefore used a research system to synthesize it [11]. For the remaining languages (Burmese, Filipino, Khmer, Lao and Vietnamese), we recorded a human voice for each phrase. In Figure 5, touching the symbol on the button plays the audio of the phrase.

5. Conclusion & Future Work
We developed an Android mobile application named “ASEAN-Tourist” for Thai tourists traveling in ASEAN countries. It supports 10 languages: Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Thai and Vietnamese. In future work, we plan to improve the application and add more features. In this version, the user interface is intended for Thai speakers; in the next version, we will implement a user interface that can be customized for other ASEAN languages. This version also assumes that users can pronounce English well enough to read the romanized words and phrases. We therefore plan to implement a transcription function in the next version, which will convert text from one writing system into one that is familiar to the user, helping users pronounce phrases using their own script. In the next version, we will also add Spanish and Tamil to cover all 12 ASEAN languages. Finally, we expect the application to help reduce the language gap for tourists traveling in ASEAN countries.

6. Acknowledgement
We thank the Faculty of Industrial Technology and Management, King Mongkut’s University of Technology North Bangkok, for providing financial support for this work.

7. References
[1] D. Buranasomphop. (2013). ASEAN Economic Community. URL: http://en.aectourismthai.com/content1/940, accessed on 30/06/2014.
[2] D. Buranasomphop. (2014). Single visa among Asean members. URL: http://en.aectourismthai.com/content1/1349, accessed on 30/06/2014.
[3] J. Ratanapaitoonchai. Single Visa ยกระดับความเชื่อมโยงการท่องเที่ยวอาเซียน [Single Visa: Raising ASEAN tourism connectivity]. URL: http://www.itd.or.th/research-article/406-singlevisa-, accessed on 30/06/2014.
[4] ASEAN LANGUAGE AEC, Google Play. URL: https://play.google.com, accessed on 03/07/2014.
[5] Asean Languages, Google Play. URL: https://play.google.com, accessed on 03/07/2014.
[6] Multilingual in AEC, Google Play. URL: https://play.google.com, accessed on 03/07/2014.
[7] Inside Google Translate, Google Translate. URL: http://translate.google.com/about/intl/en_ALL/, accessed on 05/07/2014.
[8] Bing Translator. URL: http://www.bing.com/translator/help/, accessed on 05/07/2014.
[9] Royal Institute. (1999). Principles of Romanization for Thai Script by Transcription Method.
[10] W. Aroonmanakun. (2004). A Unified Model of Thai Romanization and Word Segmentation. Paper presented at the 18th Pac. Asia Conf. Lang. Inf. Comput., pp. 205–214.
[11] T.-P. Tan, S.-S. Goh, and Y.-M. Khaw. (2012). A Malay Dialect Translation and Synthesis System: Proposal and Preliminary System. Paper presented at the 2012 International Conference on Asian Language Processing, Washington, DC, USA, pp. 109–112.

50th Anniversary Technique Thai-German to Rajamangala Khon Kaen