AWERProcedia Information Technology & Computer Science Vol 03 (2013) 204-211

3rd World Conference on Information Technology (WCIT-2012)

A Mobile Product Recognition System for Visually Impaired People with IPhone 4

Nilufer Yurtay *, Faculty of Computer and Informatics, Computer Engineering Department, Sakarya University, 54187-Sakarya, Turkey.
Sait Çelebi, Department of Computer Science, Istanbul Şehir University, Istanbul, Turkey.
Ayse Bilge Gunduz, Faculty of Computer and Informatics, Computer Engineering Department, Sakarya University, 54187-Sakarya, Turkey.
Yucel Bicil, The Scientific and Technical Research Council of Turkey-National Research Institute of Electronics and Cryptology, TÜBİTAK-UEKAE, Gebze, Turkey.

Suggested Citation: Yurtay, N., Çelebi, S., Gunduz, B., A. & Bicil, Y. A Mobile Product Recognition System for Visually Impaired People with IPhone 4, AWERProcedia Information Technology & Computer Science. [Online]. 2013, 3, pp 204-211. Available from: http://www.world-education-center.org/index.php/P-ITCS. Proceedings of 3rd World Conference on Information Technology (WCIT-2012), 14-16 November 2012, University of Barcelona, Barcelona, Spain.

Received 3 March, 2013; revised 13 July, 2013; accepted 26 September, 2013.
Selection and peer review under responsibility of Prof. Dr. Hafize Keser.
©2013 Academic World Education & Research Center. All rights reserved.

Abstract

Visually impaired people perceive their surroundings and collect information through their senses other than sight. One of these senses is hearing, which is crucial for them to lead a social life. However, devices that provide audible signals are not always available, and more work on such devices is needed. Mobile applications, which make our lives easier in many fields, can be used effectively for visually impaired people as well. In this study, a mobile product recognition system with a 2nd barcode reader and a Turkish speech synthesizer is developed on the iPhone 4. It is intended to help visually impaired people do their shopping.
Keywords: Text to speech, 2nd barcode reader, mobile, shopping, visually impaired;

* ADDRESS FOR CORRESPONDENCE: Nilufer Yurtay, Faculty of Computer and Informatics, Computer Engineering Department, Sakarya University, 54187-Sakarya, Turkey, E-mail address: [email protected] / Tel.: +90-264-295-5898


1. Introduction

Mobile applications have multiplied in recent years and have become a specialized field that includes many socially beneficial studies. Some of these studies are summarized below.

A Medical Decision Support System (MDSS), developed for people who take a large number of drugs and especially for the elderly, is designed to protect patients from drug side effects. It is intended mainly for patients nursed at home. This mobile MDSS, named LIFE reader, is a Personal Digital Assistant (PDA) with a barcode reader and includes a package named SafeMed. After reading a European Article Number (EAN) barcode, the PDA provides information such as a list of drugs that should not be taken together with the drug whose barcode was read [1].

The HealthReachMobile application is designed to support glycemic control in patients with diabetes. Originally web-based, it became a mobile application as technology developed, and it went through a test period after development. The participants who tested it were patients with type 1 or type 2 diabetes between the ages of 18 and 26. The application consists of three components: a wireless glucometer, automatic self-management support mail, and a graphical display of blood glucose over a year, a month and 24 hours. When glucometer users upload their glucose readings, they receive a snapshot of their status. The automatic self-management support mails educate them about their illness, and regularly uploaded glucose readings let them view graphical statistics of their illness [2].

Management of hypertension normally requires patients to visit a hospital or clinic. However, some patients feel uncomfortable doing so or have a phobia of medical settings. Developed for such patients, the Real-time Data Solution (RDS) application helps them manage hypertension on their own [3].

Mobile computing attracts the attention of researchers in communication as well. The large user base of smart phones and PDAs brings mobile applications into commerce. Research shows that companies selling mobile applications now follow different development approaches: some release open source products to their markets, while others do not support such products. However, as the number of application developers grows, so does the number of supporters of open source products [4].

One study found the iPad useful to surgeons in many fields after the tablet came on the market. Surgeons generally agree that the iPhone screen is too small for medical use but that the iPad is very useful to them [5].

Another study aimed to improve the clinical skills of health workers in a newborn unit through mobile learning. The participants, chosen among midwives, were given iPod touch devices containing short videos of "Reusable Learning Objects" to consult when needed. They were asked how mobile learning improved their clinical skills and answered with "sufficient", "insufficient", "good" or "very good". The study concluded that these videos improve midwives' clinical skills [6].

Another study examined how security is provided in mobile environments and what people and companies do about it. Participants were informed that information security matters in an environment where mobile technology is growing rapidly while few companies have security policies. Companies taking part in the survey were asked whether they secure their mobile equipment, whether they keep company documents on their phones and whether they consider an information security policy necessary. The answers show that several employees do not consider the security of mobile equipment necessary. They sometimes correspond via their phones without knowing whether that channel is secure. Nevertheless, many of them agree that an information security policy is required [7].


In a study by Choi et al., real-time training for face recognition is carried out on a large body of sorted data. For this purpose a Google Dev Phone was chosen, which runs the Android 1.5 operating system and allows applications written in C. After the application was developed in MATLAB 7.0 on a computer, it was ported to the mobile environment without C++ optimization. The study reported that the algorithm gave better results than other face recognition methods [8].

The advantages of mobile applications raise the prospect of using mobile technology for visually impaired people. In this study, a mobile product recognition system including our speech synthesizer module was designed to make shopping easier for visually impaired people. The aim is to let visually impaired people shop with as little help from others as possible.

2. Necessity of System for Visually Impaired People

According to the 2002 Turkey Disability Survey, people with disabilities make up 12.29% of the population of Turkey (11.10% male, 13.45% female), and the visually impaired account for 0.60% of this group [9]. Taking the population of Turkey as about 75,000,000, the visually impaired population can be estimated at about 55,000 (75,000,000 × 0.1229 × 0.0060 ≈ 55,000). The results of the same survey on the difficulties of visually impaired people in Turkey show that they complain about the lack of widely available Braille and audio publications, which many of them have no facility to access [10].

3. Design and Process of the Study

As shown in Figure 1, our study is designed to help visually impaired people identify products while shopping. An application running on the iPhone 4 was developed. The application contains software that connects the iPhone to the market's database and synthesizes the text returned from the database. For mobile applications, choosing the right hardware is important. The reasons for using the iPhone in our study can be summarized as follows: its 5-megapixel camera; its Wi-Fi or 3G internet connection; the large number of applications developed or under development for the iPhone; the widespread use of the iPhone; easy access to applications for a visually impaired person thanks to the iPhone's built-in screen reader; the availability of open source barcode reader applications for the iPhone; and a mobile processor fast enough for the required operations. A Mac computer was used to write the application and upload it to the phone.

Figure 1. Design of shopping system for visually impaired people


In the first step of the system, the visually impaired person captures the product barcode with the iPhone 4's camera, and the device then retrieves the product information by connecting to the market's database. However, this step involves several difficulties. Available barcode reader applications have no audible alarm system, which prevents a visually impaired person from locating the barcode on the product. Moreover, the user should be warned when there is insufficient light. For these reasons, an audible alarm system must first be integrated into the barcode reader module. The camera image must also be processed continuously to detect whether a barcode has come into view, and this must achieve the best possible performance with the limited resources of a mobile device. In addition, because light is crucial to image quality, poor lighting makes the barcode difficult to find and then to read. Furthermore, it is generally very difficult to find the barcode on large and irregularly shaped products.

In this study, the difficulties in reading barcodes were resolved with ZBar, an open source barcode reader with sufficient performance on the iPhone. An audible alarm system was added to this barcode reader so that visually impaired people can tell when they are approaching the barcode and when it has been found. The barcode reader module was then integrated with the text-to-speech (TTS) module.

The second step of the study is to synthesize the information taken from the database. The information retrieved from the market's database is synthesized by the TTS synthesizer module on the iPhone so that visually impaired people know what they are holding in their hands. The data can be transferred from the database to the iPhone via Wi-Fi or 3G, which makes the system more stable.
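The audible feedback described for the barcode-reading step can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the ZBar API: the function name `audio_cue`, the two thresholds, and the `barcode_score` input (a hypothetical detector confidence between 0 and 1) are all assumptions made for illustration. The sketch only shows how each camera frame could be mapped to one of four cues: barcode found, low light, barcode near, or still searching.

```python
from typing import Optional, Sequence

# Assumed values for illustration only; real thresholds would need tuning on the device.
LOW_LIGHT_THRESHOLD = 40.0  # mean luminance on a 0-255 scale
NEAR_SCORE = 0.5            # hypothetical detector score meaning "barcode is close"


def mean_luminance(gray_pixels: Sequence[int]) -> float:
    """Average brightness of a grayscale frame given as a flat sequence of 0-255 values."""
    if not gray_pixels:
        raise ValueError("frame has no pixels")
    return sum(gray_pixels) / len(gray_pixels)


def audio_cue(gray_pixels: Sequence[int],
              barcode_score: float,
              decoded: Optional[str]) -> str:
    """Choose which sound to play for one camera frame.

    decoded is the barcode value if the reader decoded one in this frame, else None.
    """
    if decoded is not None:
        return "found"        # a barcode was read: play the confirmation sound
    if mean_luminance(gray_pixels) < LOW_LIGHT_THRESHOLD:
        return "low_light"    # warn the user that the scene is too dark
    if barcode_score >= NEAR_SCORE:
        return "near"         # barcode-like pattern in view: signal that the user is close
    return "searching"        # nothing promising in this frame


dark_frame = [10] * 100
bright_frame = [200] * 100
print(audio_cue(dark_frame, 0.0, None))               # low_light
print(audio_cue(bright_frame, 0.8, None))             # near
print(audio_cue(bright_frame, 0.1, None))             # searching
print(audio_cue(bright_frame, 0.9, "1234567890128"))  # found
```

Checking for a decoded barcode first means a successful read is always announced, even in a dim scene; the low-light warning is only raised while the reader is still failing to decode.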
If there is no Wi-Fi connection, or it is not accessible everywhere in the market, the 3G connection should be used. For the synthesis step, the iPhone's own Turkish text-to-speech synthesizer can be used, as can any other text-to-speech synthesizer. In this system, a TTS module was used to implement the flow shown in Figure 2; we used our own Turkish text-to-speech synthesizer. The sound database of this TTS module is stored on the iPhone, so speech synthesis currently works offline. In other words, the system can work in both online and offline modes if all of the product information is uploaded to the iPhone. In this study, however, communication with the database is provided via a web service, so the system works online. Product information returns from the database to the system as a single string. It is fairly easy to apply the system to different markets. Because market databases were not openly accessible, the tests were carried out on a database populated with externally entered data.

3.1 Turkish text-to-speech synthesizer module

Speech synthesis is the automatic conversion of digital text into sound. The technology aims to render electronic text as human-like, comprehensible speech [12]. Today, text-to-speech (TTS) synthesizers can be used in various applications in the field of human-computer interaction, such as communication with disabled people, listening to electronic books and learning foreign languages. Although TTS is a popular research area and lays the groundwork for applications needed by visually impaired people, the variety of such applications is still insufficient. While a great number of text-to-speech systems have been developed for widely spoken western languages, the number for Turkish is still limited. Systems such as MBROLA [13], FESTIVAL [14], MULTEXT [15], GENGLISH [16] and HTS [17] were developed for synthesis in more than one language. Of these, MBROLA has been adapted to Turkish and made functional [18]. Despite some studies with other systems for Turkish, there is not yet a system that sounds natural enough.


Figure 2. Activity diagram
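The scan, look-up and speak flow of Section 3 can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the paper states only that product information arrives as a single string, so the `|` separator, the name and price fields, and the function names `parse_product` and `describe_product` are hypothetical. Validating the code as EAN-13 is also an assumption about the products; the check-digit rule itself is the standard one. The web service is represented by a caller-supplied function so that the sketch runs without a network.

```python
from typing import Callable, Dict, Optional


def is_valid_ean13(code: str) -> bool:
    """Check the EAN-13 check digit (weights 1, 3, 1, 3, ... over the first 12 digits)."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(c) for c in code]
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


def parse_product(record: str, sep: str = "|") -> Dict[str, str]:
    """Split the single string returned for a product (assumed layout: name|price)."""
    name, price = record.split(sep, 1)
    return {"name": name.strip(), "price": price.strip()}


def describe_product(barcode: str,
                     fetch_online: Optional[Callable[[str], Optional[str]]],
                     offline_records: Dict[str, str]) -> str:
    """Return the text that would be handed to the speech synthesizer."""
    if not is_valid_ean13(barcode):
        return "Barcode could not be read, please scan again."
    record = None
    if fetch_online is not None:
        try:
            record = fetch_online(barcode)  # online mode: ask the web service
        except OSError:
            record = None                   # connection problem: fall back to local data
    if record is None:
        record = offline_records.get(barcode)  # offline mode: data stored on the phone
    if record is None:
        return "Product not found."
    product = parse_product(record)
    return f"{product['name']}, {product['price']}"


def unreachable_service(barcode: str) -> Optional[str]:
    """Stand-in for a web service call made without network coverage."""
    raise ConnectionError("no network")


offline = {"1234567890128": "Sample wafer|1.50 TL"}  # made-up record for the demo
print(describe_product("1234567890128", None, offline))                 # Sample wafer, 1.50 TL
print(describe_product("1234567890128", unreachable_service, offline))  # Sample wafer, 1.50 TL
print(describe_product("1234567890123", None, offline))                 # Barcode could not be read, please scan again.
```

Passing the web service in as a function keeps the online and offline modes behind one call, which mirrors the remark in Section 3 that the system can work in either mode when the product data is also stored on the phone.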

An additive speech synthesizer was developed to obtain a monotonous speech signal [19]. The synthesis process in our text-to-speech synthesizer module is carried out with logotoms, which are groups of regular but meaningless words. The logotom database used was designed by The Scientific and Technological Research Council of Turkey (TÜBİTAK) – National Research Institute of Electronics and Cryptology (UEKAE) and Izmir Institute of Technology (İYTE) [20].

3.2 Performance of the system

After developing a sufficiently fast speech synthesizer on the mobile platform, we conducted the tests shown in Table 1 with a visually impaired volunteer.

Table 1. Periods of recognition of sample products for a visually impaired person (times in minutes:seconds.hundredths)

| | Product 1 | Product 2 | Product 3 | Product 4 | Product 5 | Product 6 | Total Test Time |
|---|---|---|---|---|---|---|---|
| Sample Products | Ülker çikolatalı gofret | Erikli 0.5 Litre Şaşal su | Ruffles Originals sade patates cipsi | Introduction to Cryptography Johannes A. Buchman Second Edition - Kitap | Danone 500ml kutu süt | First sensations nane gizemi 14 lü sakız | |
| Test1 | 00:21.50 | 00:09.20 | 00:16.20 | 00:36.40 | 00:14.10 | 00:07.70 | 01:45.10 |
| Test2 | 00:06.30 | 00:13.90 | 00:10.30 | 00:18.20 | 01:59.53 | 00:04.74 | 02:52.97 |
| Test3 | 00:07.43 | 00:13.42 | 00:11.93 | 00:17.57 | 00:12.13 | 00:06.58 | 01:09.06 |
| Test4 | 00:03.56 | 00:22.67 | 00:10.81 | 00:16.53 | 00:07.99 | 00:13.67 | 01:15.23 |
| Test5 | 00:04.75 | 00:25.36 | 00:21.80 | 00:08.08 | 00:09.75 | 00:10.56 | 01:20.30 |
| Test6 | 00:04.60 | 00:51.72 | 00:20.59 | 00:14.20 | 00:10.29 | 00:22.20 | 02:03.60 |
| Test7 | 00:06.77 | 00:10.68 | 01:17.27 | 00:16.94 | 00:14.52 | 00:04.36 | 02:10.54 |
| Test8 | 00:04.19 | 00:23.63 | 00:08.54 | 00:30.51 | 00:15.88 | 00:05.51 | 01:28.26 |
| Test9 | 00:04.19 | 00:24.24 | 00:17.36 | 00:23.43 | 00:12.68 | 00:16.96 | 01:38.86 |
| Test10 | 00:04.71 | 00:09.49 | 00:05.39 | 00:39.29 | 00:12.50 | 00:05.40 | 01:16.78 |
| Total Time | 01:08.00 | 03:24.31 | 03:20.19 | 03:41.15 | 03:49.37 | 01:37.68 | 17:00.70 |
| Average Time | 00:06.80 | 00:20.43 | 00:20.02 | 00:22.11 | 00:22.94 | 00:09.77 | |
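The totals and averages in Table 1 can be reproduced from the individual timings. The short Python sketch below is only an illustration (the helper names `to_centis` and `to_stamp` are not from the paper); it converts each minutes:seconds.hundredths value to whole hundredths of a second so that the sums are exact, then recomputes the Product 1 column and the Test1 row.

```python
def to_centis(stamp: str) -> int:
    """Convert 'mm:ss.cc' to hundredths of a second."""
    minutes, seconds = stamp.split(":")
    whole, frac = seconds.split(".")
    return (int(minutes) * 60 + int(whole)) * 100 + int(frac)


def to_stamp(centis: int) -> str:
    """Convert hundredths of a second back to 'mm:ss.cc'."""
    minutes, rest = divmod(centis, 6000)
    seconds, frac = divmod(rest, 100)
    return f"{minutes:02d}:{seconds:02d}.{frac:02d}"


# Product 1 column of Table 1 (Test1 to Test10).
product1 = ["00:21.50", "00:06.30", "00:07.43", "00:03.56", "00:04.75",
            "00:04.60", "00:06.77", "00:04.19", "00:04.19", "00:04.71"]

# Test1 row of Table 1 (Product 1 to Product 6).
test1 = ["00:21.50", "00:09.20", "00:16.20", "00:36.40", "00:14.10", "00:07.70"]

total_p1 = sum(to_centis(s) for s in product1)
print(to_stamp(total_p1))                         # 01:08.00 (Total Time, Product 1)
print(to_stamp(total_p1 // len(product1)))        # 00:06.80 (Average Time, Product 1)
print(to_stamp(sum(to_centis(s) for s in test1))) # 01:45.10 (Total Test Time, Test1)
```

The first attempt on Product 1 (21.50 s) is more than three times the column average of 6.80 s, while the later attempts stay between about 3.5 and 7.5 s.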

4. Discussions

Our study of a shopping system for visually impaired people can be extended. Provided that the required permissions are obtained, temporary and limited access can be given to the passwords required for the market's product database. By integrating a shopping basket, products can be added to the basket with audible confirmation. If a market does not approve access to its database, the problem can be solved by creating an open source barcode database that includes an estimated price for every product.

5. Conclusion

In this study, a mobile product recognition application was developed so that visually impaired people can do their shopping easily. An audible alarm system that helps locate the product barcode with the mobile device's camera works together with a Turkish text-to-speech module that synthesizes the product information. Its usability for visually impaired people is high. Text-to-speech synthesis is a technology that can be used in various fields today. However, the number of studies on it for Turkish is limited. Mobile environments are among the places where text-to-speech synthesis is used most. The text-to-speech synthesizer module presented here is satisfactory in terms of comprehensibility and medium-level in terms of naturalness.

Acknowledgement

This research was supported by TÜBİTAK-UEKAE Multimedia Technology Research and Development Laboratory.

References

[1] Johansson, P. E., Petersson, G. I., & Nilsson, G. C. Personal digital assistant with a barcode reader: A medical decision support system for nurses in home care. International Journal of Medical Informatics, 2010, pp. 232-242.
[2] Harris, L. T., Tufano, J., Le, T., Rees, C., Lewis, G. A., Evert, A. B., Flowers, J., ... Ralston J. D. Designing mobile support for glycemic control in patients with diabetes. Journal of Biomedical Informatics, 2010, 43, pp. S37-S40.
[3] Zeng, H., Wu, S., Wang, K., Wen, E., Liu, J., & Yang, X. An iPhone-based tele-health system for hypertension management, Chongqing Medical University, China, 2009.
[4] Holzer, A., & Ondrus, J. Mobile application market: A developer's perspective. Telematics and Informatics, 2011, 28, pp. 22-31.
[5] Dala-Ali, B. M., Lloyd, M. A., & Al-Abed, Y. The uses of the iPhone for surgeons. The Surgeon, 2010, pp. 1-5.
[6] Clay, C. A. Exploring the use of mobile technologies for the acquisition of clinical skills. Nurse Education Today, 2010, pp. 1-5.
[7] Goode, A. Managing mobile security: How are we doing? Network Security, 2010, February, pp. 12-15.
[8] Choi, K., Toh, K.-A., & Hyeran, B. Realtime training on mobile devices for face recognition applications. ScienceDirect; Pattern Recognition, 2011, 44, pp. 386-400.
[9] TUIK, Turkey Disability Survey, [Online] 29 March 2011. Available from: http://www.tuik.gov.tr/PreIstatistikTablo.do?istab_id=512, 2002.
[10] Yurtay, N., Bicil, Y., Çelebi, S., Gündüz, A. B., Yurtay, Y., & Çelik, U. Reading Of Turkish E-Book For Visually Impaired, International Educational Technology Conference, İstanbul, 2011.
[11] TUIK, Turkey Disability Survey, [Online] 29 March 2011. Available from: http://www.tuik.gov.tr/PreIstatistikTablo.do?istab_id=518, 2002.
[12] SESTEK Ses ve İletişim Bilgisayar Teknolojileri A.Ş., Konuşma Sentezi, [Online] 17 April 2011. Available from: http://www.gvz.com.tr/konusmasentezi.html, 2000.
[13] Dutoit, T., Pagel, V., Pierret, N., Bataille, F., & Van Der Vrecken, O. The MBROLA Project: Towards a set of high quality speech synthesizers of use for non commercial purposes, The 4th International Conference On Spoken Language Processing, Philadelphia, PA, USA, 1996, pp. 1393-1396.
[14] Dubuisson, T., Dutoit, T., Gosselin, B., & Remacle, M. On the Use of the Correlation between Acoustic Descriptors for the Normal/Pathological Voices Discrimination, EURASIP Journal on Advances in Signal Processing, Analysis and Signal Processing of Oesophageal and Pathological Voices, 2009.
[15] Veronis, J., Hirst, D., Espesser, R., & Ide N. NL and Speech in Multext Project, AAAI'94 Workshop on Integration of Natural Language and Speech, 1994.
[16] Dutoit, T., & Cernak, M. TTSBOX: A Matlab Toolbox For Teaching Text-To-Speech Synthesis, Proc. ICASSP'05, Philadelphia, USA, 2005.
[17] Yamagishi, J., Zen, H., Toda, T., & Tokuda, K. Speaker-Independent HMM-based Speech Synthesis System HTS-2007 System for the Blizzard Challenge 2007, Proc. of Blizzard Challenge 2007.
[18] Bozkurt, B., & Dutoit, T. An implementation and evaluation of two diphone-based synthesizers for Turkish, Proc. 4th ISCA Tutorial and Research Workshop on Speech Synthesis, Blair Atholl, Scotland, 2001, pp. 247-250.
[19] Yurtay, N., Çelebi, S., Bicil, Y., & Gündüz, A. B. Türkçe Konuşma Sentezleme, International Science and Technology Conference, Famagusta, 2010, pp. 828-833.
[20] Bicil, Y. Turkish text-to-speech synthesis. Master's Thesis, Sakarya University Computer Engineering, Sakarya, 2010.