Multimodal Interfaces: New Solutions to the Problem of Computer Accessibility for the Blind

Yacine BELLIK

Dominique BURGER

LIMSI-CNRS B.P. 133, 91403 ORSAY Cedex FRANCE Tel : 33.1.69.85.81.21 E-mail : [email protected]

INSERM UPMC B23, 75252 PARIS Cedex 05 FRANCE Tel : 33.1.44.27.34.35

KEYWORDS

User Interface, Multimodal Interface, Non Visual Interface.

ABSTRACT

This paper examines how multimodal interfaces can improve the accessibility of software applications for blind users. The approach that consists of translating visual interaction forms into non-visual modalities cannot be successfully applied to graphical interfaces. Optimising interfaces for the blind involves rethinking paradigms and building the application interface on another basis. Multimodal interfaces open new avenues for research and development in this area. This paper discusses these promising perspectives through a concrete example: a prototype multimodal text editor developed in a joint research project between INSERM and CNRS.

INTRODUCTION

Computers open up new opportunities for the social and professional integration of people with a visual handicap. People who are blind can carry on with their daily work using computer tools that have been adapted to their special requirements. They can take notes, receive and send faxes or e-mails, edit texts and consult dictionaries. Nevertheless, this great potential seems far from fully realised, mainly because of the human factors of the interfaces [1]. Thus a challenge for rehabilitation technology is to make access products for the blind as user-friendly as graphical interface-based applications are for sighted people. This implies basing their design on the specific features of non-visual communication, rather than simply trying to adapt visual dialogue methods. Multimodal interfaces open new avenues of research towards solving these problems. This paper discusses and illustrates these promising perspectives.

APPROACHES TO ADAPTING VISUAL INTERFACES

Software production has shifted from character-based interfaces to graphical user interfaces (GUIs) over the past few years. This change is even more radical for the blind than for sighted users, because it makes the access techniques previously used totally inadequate. Indeed, those techniques were based on programs able to read the characters contained in the screen memory. The first consequence is that developers have to find other means of accessing the data to be displayed on the screen. Fortunately, computer scientists can rely on well organised operating systems to develop sophisticated interfaces such as GUIs, and these are composed of software-independent components that communicate with each other by exchanging messages. This made it possible to build the so-called Off-Screen Model (OSM), first developed by Berkeley Systems. The user does not directly access the screen, but a model of it that has been rebuilt within the computer memory [2]. The access program can then retain representations of text, icons, controls and other information in a form that is close to the semantic level. It can also capture information that is not visible on the screen, which can be useful to the user [3]. Fig. 1 illustrates two approaches to the problem of adapting software applications for the blind. The traditional approach, with character-based interfaces, was to translate the visual messages on the screen into a non-visual form. Another approach, which takes advantage of recent technology, builds alternative interfaces for people with special needs.

Translation

Non Visual Interaction Form

Translation Approach

Interaction Objects

Visual Interface

Non Visual Interface

Alternative Approach

Fig. 1 : Approaches for Adapting Visual Interfaces
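To make the idea of the Off-Screen Model more tangible, the following sketch shows one way such a model could be structured: a tree of semantic objects, fed by the messages intercepted between the application and the display, that a screen reader queries instead of scraping screen memory. The class names and fields are our own illustrative assumptions, not those of any actual access product.

    # Illustrative sketch of an off-screen model (OSM): the access program keeps
    # a semantic copy of the interface objects, including ones not visible on screen.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class OsmObject:
        kind: str                 # "window", "menu", "button", "text", ...
        label: str                # the text carried by the object
        visible: bool = True      # objects scrolled off screen can still be kept
        children: List["OsmObject"] = field(default_factory=list)

    def labels_for_speech(root: OsmObject) -> List[str]:
        """Collect the labels of all objects, visible or not, so they can be
        rendered through speech or braille rather than read from the screen."""
        out = [root.label] if root.label else []
        for child in root.children:
            out.extend(labels_for_speech(child))
        return out

    # Example: a window containing a menu and an edit area
    root = OsmObject("window", "Editor", children=[
        OsmObject("menu", "Format"),
        OsmObject("text", "Hello world"),
    ])
    print(labels_for_speech(root))   # ['Editor', 'Format', 'Hello world']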

NOVAE: A PROJECT TO EXPLORE MULTIMODAL SOLUTIONS

A major goal of current research and development is to produce interface systems that allow the user to alternate between or combine several channels to interact with the computer. Such interfaces are referred to as multimodal. Multimodal interfaces have features that could help overcome some of the limitations of existing solutions for the blind. In particular, they could provide:
• better adaptation to the user's needs;
• intuitive, concise and quick interaction modes;
• easier learning and reduced memorisation effort;
• increased expressive power.

The NOVAE project was set up jointly by INSERM and CNRS in France to explore the potential of multimodal solutions for the construction of non-visual interfaces. It is supported by the Institut National des Jeunes Aveugles (INJA) and the CNAMTS. It aims to develop useful tools for the rapid prototyping and evaluation of non-visual interfaces, and a library of multimodal interaction objects is being developed. In the following we describe one tool that is currently being evaluated.

A MULTIMODAL TOOL FOR TEXT MANIPULATION

Access to rich documentation is a great concern for the blind. Not only is textual material deprived of some of its original features when embossed in braille on paper or made accessible as an electronic document, but manipulations as simple as underlining or annotating text are a great challenge for the blind. Since tools for text manipulation are basic components of user interfaces, we have developed a prototype for text manipulation that is designed to provide blind users with the main functionality of both a simple full-page editor and a simple hypertext document 1. To this end, new interaction forms were devised using existing and commercially available I/O devices (speech recognition and synthesis systems, braille terminal and keyboard), through simple actions that users are able to master, like saying words, pointing at braille cells 2 (Fig. 2), and listening to messages.

Fig. 2 : The braille terminal used (the original figure shows the layout of the terminal's function keys F1 to F14 and numeric keys 0 to 9)

The braille display is used only for text output. Spoken messages and beeps inform the user whether operations succeeded or failed, and convey information about the displayed text (attributes, word definitions, etc.). The available commands allow the user to:
• navigate within the text using spoken commands,
• perceive and change text attributes (styles, fonts, etc.),
• copy, move, and delete text,
• have the program read parts of the text,
• control speech synthesis parameters, etc.
Some examples of simple command scenarios:
To underline a word:
- Say "underline" while pointing at a character of the word.
To select a part of the text:
- Say "begin selection" while pointing at the first character.
- Say "end selection" while pointing at the last character.
To copy a part of the text:
- Select the desired part.
- Say "copy".
- Say "paste" while pointing at the desired location.

1 The programming basis for this development is SPECIMEN, a tool for the development of multimodal interfaces, developed at LIMSI.
2 The terminal we use has an embedded piezoelectric braille display with 44 cells. Each cell presents a character on an 8-dot braille matrix. A pointing button is associated with each cell; a character is pointed at by pushing the button in front of the cell displaying it.
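To make the synergy between speech and pointing in the scenarios above concrete, the sketch below shows one possible way of fusing the two event streams: a recognised spoken command is paired with the braille-cell button press that occurred closest to it in time. The event representation, the one-second window and the function names are illustrative assumptions of ours, not the actual SPECIMEN interfaces.

    # Illustrative sketch of multimodal fusion: pair a spoken command such as
    # "underline" with the nearest-in-time braille-cell button press.
    from dataclasses import dataclass
    from typing import List, Optional, Tuple

    @dataclass
    class SpeechEvent:
        word: str        # recognised command, e.g. "underline", "copy"
        time: float      # time stamp in seconds

    @dataclass
    class PointEvent:
        cell: int        # index of the braille cell whose button was pressed
        time: float

    FUSION_WINDOW = 1.0  # seconds; assumed tolerance between the two events

    def fuse(speech: SpeechEvent,
             points: List[PointEvent]) -> Optional[Tuple[str, int]]:
        """Return (command, cell) when a pointing gesture accompanies the spoken
        command closely enough in time, otherwise None (purely spoken command)."""
        candidates = [p for p in points
                      if abs(p.time - speech.time) <= FUSION_WINDOW]
        if not candidates:
            return None  # e.g. a navigation order given by voice alone
        nearest = min(candidates, key=lambda p: abs(p.time - speech.time))
        return speech.word, nearest.cell

    # Example: saying "underline" while pressing the button of cell 12
    print(fuse(SpeechEvent("underline", 3.2), [PointEvent(12, 3.1)]))
    # -> ('underline', 12)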

Presentation of text attributes

Parts of the text can be enriched with attributes such as style, colour, font, size, etc. The user is informed of such enrichments via the lower right dot (dot-8) of each braille cell. When a character carries attributes that differ from the default ones, dot-8 of the corresponding cell is raised. The user can get more information by pressing the button in front of this character; the attributes are then immediately vocalised by the text-to-speech synthesiser. If a complete word is enriched, in bold for instance, only its first character is marked so as not to disturb the process of reading braille; the synthesised speech message is then "bold word". Similarly, if a whole line is enriched, only the first character of the line is marked, and the synthesised message is "bold line".
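The marking rule can be summarised in a few lines of code. The sketch below is our own reading of the rule just described, with invented names and data layout; it computes which cells of a displayed line should have dot-8 raised.

    # Illustrative sketch of the dot-8 marking rule: if the whole line is
    # enriched, mark only its first cell; if a whole word is enriched, mark only
    # the first cell of that word; otherwise mark each enriched character.
    from typing import List

    def dot8_flags(chars: List[str], enriched: List[bool]) -> List[bool]:
        flags = [False] * len(chars)
        if chars and all(enriched):             # whole line enriched
            flags[0] = True
            return flags
        i = 0
        while i < len(chars):
            if chars[i] == " ":
                i += 1
                continue
            j = i
            while j < len(chars) and chars[j] != " ":
                j += 1                          # [i, j) is one word
            if all(enriched[i:j]):              # whole word enriched
                flags[i] = True
            else:                               # isolated enriched characters
                for k in range(i, j):
                    flags[k] = enriched[k]
            i = j
        return flags

    # Example: the word "bold" is entirely enriched, so only its first cell
    # carries dot-8.
    line = list("a bold word")
    rich = [False, False, True, True, True, True,
            False, False, False, False, False]
    print(dot8_flags(line, rich))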

Discussion

This interface has been evaluated by blind users. Although complete results are not yet available, the preliminary comments have been positive. The first users showed interest in the interface and learned to operate it almost immediately. They also commented on the synergic multimodality that combines several channels to formulate a statement. This seems particularly relevant for blind users, as the following example shows. To put a word in bold characters in a sentence, the blind user of a braille-based access program would probably have to carry out the following sequence of actions:
• put the cursor at the beginning of the word
• activate the selection key
• put the cursor at the end of the word
• activate the menu bar key
• put the cursor on Format in the menu bar
• press the Enter key
• put the cursor on Bold in the pull-down menu
• press the Enter key
In our prototype, a single action suffices:
• locate the word by reading it, press one of the corresponding buttons and simultaneously say "bold".

CONCLUSION

We have shown how multimodal interfaces can contribute to overcoming some of the limitations of existing computer access solutions for blind users. It has become quite clear that one way to optimise non-visual interfaces is to identify the conceptual items that are useful for carrying out a task, and to imagine non-visual presentation objects for these items, including interaction methods suited to the needs of blind users. The integration of multimodality into the definition of these objects must respect the work habits and representations of users. This will guide our further work on the NOVAE project.

REFERENCES

1. Burger, D. L'ergonomie dans la conception des projets informatiques. J.C. Sperandio Ed., Toulouse, France: Octares, pp. 247-263, 1993.
2. Boyd, L.H., Boyd, W.L. & Vanderheiden, G.C. The Graphical User Interface: Crisis, Danger and Opportunity. Journal of Visual Impairment and Blindness, vol. 84, pp. 496-502, 1990.
3. Berliss, J. Non-Visual Human-Computer Interactions. D. Burger and J.C. Sperandio Eds, Montrouge, France, pp. 131-143, 1993.