PROCEEDINGS of the HUMAN FACTORS and ERGONOMICS SOCIETY 55th ANNUAL MEETING - 2011


Visually-Cued Touch Gestures for Accurate Mobile Interaction

Sean T. Hayes, Eli R. Hooten, Julie A. Adams
Department of Electrical Engineering and Computer Science
Vanderbilt University, Nashville, Tennessee, USA 37235

Portable devices such as tablet personal computers can allow a user to perform tasks while walking. These devices provide new interaction opportunities and challenges. LocalSwipes is a new interaction technique that utilizes localized, multi-touch, radial stroke gestures to reduce the difficulty of interacting with large data sets on mobile devices. LocalSwipes adapts traditional point-and-click widgets to visually support gestural input. A user evaluation demonstrated that LocalSwipes results in fewer interaction errors and is preferred when compared to a touch-based interface incorporating traditional graphical user interface widgets.

Copyright 2011 by Human Factors and Ergonomics Society, Inc. All rights reserved DOI 10.1177/1071181311551231

INTRODUCTION

The long-term research objective is to develop touch-based interaction techniques to be used while walking. Today's graphical user interfaces are built with a standard set of widely adopted widgets (e.g., buttons, combo boxes, etc.). These widgets were developed specifically for cursor control via a peripheral device, such as a mouse. Employing these standard widgets often reduces the time required to learn new software. Touch technology is becoming prevalent in portable electronics, such as tablet PCs, because it reduces the need for peripheral input devices. Mobile touch interaction compounds the challenges by reducing input accuracy and touch surface area, as compared to desktop and tabletop devices, and traditional point-and-click widget designs are less effective with touch interaction.

The difficulties of point selection with touch interaction are collectively known as the "fat-finger" problem (Holz & Baudisch, 2010): fingers occlude portions of the interface during interaction, and the contact area encompasses more than a single pixel. When touch interaction is used with a point-and-click design, touch and release actions are mapped directly to mouse press and release events, respectively. This touch-to-click adaptation requires the fingers to cover the selection point, preventing the user from seeing exactly what is being selected. The finger contact area is reduced to a single selection point, which may not be located at the intended target.

One solution to the "fat-finger" problem is to increase the target area's size. Fitts' Law implies that interaction with small targets requires more time than with larger targets (Wickens, Lee, Liu, & Gordon-Becker, 2004). However, mobile devices' limited screen space restricts the target size, and new paradigms for efficient portable interaction are needed. A new widget-based interaction paradigm, called LocalSwipes, was developed.
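The target-size argument from Fitts' Law can be sketched numerically. The snippet below uses the common Shannon formulation, MT = a + b * log2(D/W + 1); the intercept and slope values are illustrative placeholders, not constants reported in this paper.

```python
import math

def fitts_movement_time(distance, width, a=0.2, b=0.1):
    """Predicted movement time (s) under the Shannon formulation of
    Fitts' Law: MT = a + b * log2(D/W + 1).

    The intercept a and slope b are illustrative placeholders; real
    values are fit empirically per device and input method."""
    return a + b * math.log2(distance / width + 1)

# Halving the target width raises the index of difficulty and the
# predicted movement time -- the motivation for larger touch targets.
print(fitts_movement_time(distance=80, width=16))
print(fitts_movement_time(distance=80, width=32))
```

The comparison makes the trade-off concrete: smaller targets cost time (and, on touch devices, accuracy), but screen space limits how large targets can grow.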
Based on lessons learned from previous research, LocalSwipes utilizes directional multi-touch gestures to allow for efficient selection of information from large data sets. An evaluation compared LocalSwipes to touch-to-click interaction. LocalSwipes reduced the number of input errors and was preferred by participants; however, task completion time was not reduced.

Related Work

A large repository of literature focusing on direct-touch interaction exists (for example, Potter, Weldon, & Shneiderman, 1988; Sears & Shneiderman, 1991; Parhi, Karlson, & Bederson, 2006). However, these techniques are limited when addressing the fat-finger problem.

Efficient use of menu systems has been a focus of touch-interaction research. Marking menus (Kurtenbach, Sellen, & Buxton, 1993), originally designed for stylus or mouse input, lend themselves to touch input. The user strokes in a defined direction to select an item. The contents of marking menus are hidden from view until the menu is activated. Marking menus generally have novice and expert modes: novice users can wait a short time after starting the gesture for the menu to become visible, while an expert can select items without waiting for the menu to appear.

Many alternative marking-menu designs have been proposed. Zone and Polygon menus (Zhao, Agrawala, & Hinckley, 2006) use multi-stroke techniques to increase menu breadth. Flower Menus (Bailly, Lecolinet, & Nigay, 2008) increase the number of selectable items by adding curved gestures to the options, which increases the maximum to 56 selectable options per level. Bailly, Lecolinet, and Guiard (2010) developed a system that recognizes combinations of finger counts on one hand and directional strokes on the other to access two-level menus; they found that using finger count was easier to learn and faster to perform on a tabletop device. Lepinski, Grossman, and Fitzmaurice (2010) presented multi-touch marking menus that utilize directional chording, which performed significantly faster than traditional hierarchical marking menus on a tabletop device.

All marking-menu designs are most efficient in expert mode, requiring users to learn the arbitrary association between marks and the command defined by each application. This limitation, addressed by LocalSwipes, makes marking menus inadequate for many tasks.
Systems often require a large amount of real-time data and options to be visible to support informed user decisions. For example, interfaces providing a supervisory level of control require that configurable information remain visible for a longer time. However, menus hide controls until activation. Therefore, more visible and flexible controls, similar to the widgets ubiquitous in today's applications, are necessary. LocalSwipes is designed to meet this need.

Three touch-interaction extensions for traditional point-to-click widgets were explored by Benko, Wilson, and Baudisch (2006). Dual-Finger Stretch uses two fingers to control the scale of portions of the interface. Dual-Finger X-Menu is a circular menu that appears when a finger touches the screen, with options to establish a cursor offset or have the cursor move slower than the touch movement, allowing the cursor to be less occluded. The Slider technique adjusts the cursor speed based on the distance between the primary and secondary fingers. While these methods were shown to reduce errors, they require more steps than mouse interaction and are designed as bimanual gestures, which are not feasible on portable devices.

Controls similar to traditional widgets are necessary, but cursor-based interaction is cumbersome and does not utilize the unique characteristics of touch input. Moscovich (2009) proposed Sliding Widgets, which are activated by stroking over the widget in the desired direction. Using the finger's full contact area instead of just one point, and alternating the sliding direction of adjacent widgets, greatly reduces selection ambiguity. Widgets are displayed as small sliders providing persistent visual gestural cues. The design was tested for buttons, but may be applied to other widget designs. Occlusion is a concern, because direct contact with the desired selection is required. Also, the sliding gesture is scale dependent; larger gestures may contact and activate multiple widgets.

Escape (Yatani, Partridge, Bern, & Newman, 2008) is a target-selection technique that also uses single-point directional gestures that are visually cued. Presented in a map-based application, Escape displays small and densely placed items as arrows, each indicating the direction of the gesture.
The user begins the gesture on or close to the item of interest and moves in the direction indicated by the arrow, disambiguating the selection from other nearby items. Using this technique requires close contact and therefore occlusion. Escape has only been studied with map-based icons, but raises questions about visual cues for widgets.

Design

LocalSwipes incorporates many strengths of the research described previously, but focuses on mobile touch interaction that behaves similarly to traditional GUI widgets. Three primary degrees of freedom are utilized: the gesture's start position, the finger count, and the angle between the gesture's start point and endpoint.

LocalSwipes gestures are localized similar to Sliding Widgets (Moscovich, 2009), but the user is not required to start the gesture directly on the desired widget. Using semi-localization, a gesture can begin anywhere within a group of related widgets. Widgets are logically grouped, similar to traditional group boxes, and each group is made clearly visible with a border. Groups are placed within the interface (e.g., panels, dialogs, dropdowns, windows, etc.). Figure 1 shows two groups of widgets. Semi-localization allows a large number of options to be visible, but is flexible enough to reduce hand occlusion by not requiring direct contact with a particular widget. In Figure 1, the user selects from a combo box using a gesture below the actual widget of interest.

Figure 1: Two LocalSwipes widget groups, with an expanded combo box. The gesture visualizer is also displayed.

Ordinal multi-touch gestures activate a specific widget within a group. Single-finger interaction maps to the first four items in the group, two fingers are used to interact with the second four items, and so forth. Sets of four are compactly placed for easy visual recognition. The direction of the gesture dictates a single item. Similar to the directional cues in Escape (Yatani et al., 2008), triangular arrowheads at the end of each widget indicate the stroke direction, and the number of arrowheads indicates the required number of fingers. The directional ordering of the widgets is always arranged so the arrows point outward from a set of four widgets; this allows the fingers to move toward the intended widget when the start position is centered in a set of four.

A gesture may be altered at any point during finger contact. The number of fingers can be increased or reduced to select from a different set, and the fingers can move around the gesture's start point to change the gesture direction. The item currently selected by the gesture is highlighted to show that ending the gesture will result in the selection of that item. In Figure 1, both the combo box and an item from the popup are highlighted for selection. The final selection occurs when all fingers leave the surface. A gesture is canceled by moving the fingers back to the start location. A gesture visualizer aids in these interaction adjustments: it marks the start point with a semi-transparent circle and indicates the current gesture's direction (see Figure 1). An expert user can perform the gestures quickly without using the visualizer or the widget highlighting.

Multiple LocalSwipes gestures can be used to select items from multi-level widgets, such as menus and combo boxes.
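The finger-count and direction mapping described above can be sketched in a few lines. This is a hypothetical illustration of the selection logic, not the authors' implementation; the function name and the direction ordering are assumptions.

```python
# Hypothetical sketch of the LocalSwipes selection mapping: the finger
# count picks a bank of four widgets within a group, and the swipe
# direction picks one widget within that bank.
DIRECTIONS = ("up", "right", "down", "left")  # assumed ordering

def widget_index(finger_count, direction):
    """Map an ordinal multi-touch gesture to a widget index.

    Fingers 1..3 address successive banks of four widgets, giving
    twelve addressable widgets per group with four directions."""
    if finger_count not in (1, 2, 3):
        raise ValueError("mobile design limits gestures to 1-3 fingers")
    if direction not in DIRECTIONS:
        raise ValueError("unknown swipe direction")
    bank = (finger_count - 1) * 4          # first four, second four, ...
    return bank + DIRECTIONS.index(direction)

print(widget_index(1, "up"))     # first widget in the group -> 0
print(widget_index(2, "down"))   # third widget of the second bank -> 6
```

Extending `DIRECTIONS` to eight sectors would double the addressable widgets per bank, matching the paper's twenty-four-items-per-group figure.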
An initial gesture selects the combo box and activates a popup, which displays up to four values. If the popup contains more than four values, a vertical scrollbar appears to the right (see Figure 1). A single-finger gesture selects an element from the popup, and a two-finger gesture controls the widget's scrolling. These gestures are also cued with triangular arrowheads.

This design visually reinforces the gesture technique. Maintaining the general look-and-feel of traditional widgets allows users to recognize each widget's purpose, and the simple, general visual gesture cues promote quick interaction recognition. Unlike marking menus, users are not required to memorize the relationship between a multitude of widgets and the possible gestures; the widgets clearly indicate the interaction possibilities.

LocalSwipes gestures are scale independent. A minimum amount of movement is required to activate the gesture, but a gesture may be as large as the touch surface permits. Scale independence is vital to increasing interaction accuracy in the mobile context. While walking, the hands and device are in motion, increasing input error, and the user's split attention limits the concentration placed on touch precision. Larger gestures can reduce the ambiguity created by these types of noise.

The number of accessible LocalSwipes items depends on several factors. For mobile devices, the use of more than three fingers may not be practical, which limits the design to twelve widgets per group. Increasing the directional options to eight allows up to twenty-four items per group. The number of LocalSwipes groups depends on the screen size and each group's size. Extremely small devices may only work effectively with a single finger. Widgets utilizing scrolling have a higher practical limit; however, increasing the set of invisible items increases the required interaction time. There is no minimum size an item must be for interaction to be efficient; instead, LocalSwipes accuracy depends on the widget-group size.
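Scale independence follows from classifying only the angle from the gesture's start point to its endpoint, regardless of stroke length. A minimal sketch of such a recognizer is below; the sector count, activation threshold, and function name are illustrative assumptions, not the paper's implementation.

```python
import math

def quantize_direction(start, end, sectors=4, min_distance=10.0):
    """Return a sector index 0..sectors-1 for the swipe from start to
    end, or None if the movement is below the activation threshold.

    Only the angle matters, so the classification is scale independent:
    a short flick and a full-screen stroke in the same direction map to
    the same sector."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    if math.hypot(dx, dy) < min_distance:
        return None                                  # not yet a swipe
    angle = math.atan2(-dy, dx) % (2 * math.pi)      # screen y grows downward
    sector_width = 2 * math.pi / sectors
    # Rotate by half a sector so sector 0 is centered on "east".
    return int(((angle + sector_width / 2) // sector_width) % sectors)

print(quantize_direction((0, 0), (100, 0)))    # east -> 0
print(quantize_direction((0, 0), (1000, 0)))   # same direction, any scale -> 0
print(quantize_direction((0, 0), (5, 5)))      # below threshold -> None
```

Passing `sectors=8` yields the eight-direction variant that raises the per-group limit to twenty-four items.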

METHOD

LocalSwipes was evaluated for map-based tasking as compared to traditional touch-to-click interaction. While it is important to evaluate this method against other gesture-based techniques, our objective was to assess LocalSwipes as a potential replacement for traditional widgets. An interface for controlling semi-autonomous robots, based on the authors' previous design (Hayes, Hooten, & Adams, 2010), was developed. Robots can be instructed to follow specified paths, cross dangerous areas, and search areas. Each task type provides options in the form of buttons and combo boxes to further customize the robot team's actions. For each task, a user must select the task type and customize the task's parameters. Each task requires interacting with three to seven combo boxes and buttons; additional interactions are necessary to correct selection errors.

Procedure

A within-subjects evaluation was performed. The independent variable was the touch-interaction paradigm: touch-to-click or LocalSwipes. Both designs provided the same widgets and layout. The interfaces for each condition were used to provide high-level commands to simulated robots that executed the specified tasks.


Each participant held the tablet with the non-dominant hand and interacted with the interface using the dominant hand. Participants were seated and allowed to rest their arms on their lap or a table. A ten-minute training session was provided prior to completing both interaction conditions, during which participants were familiarized with the interface and its unique interaction features. Participants practiced by inputting eight robot tasks (each requiring multiple widget selections) and exploring the interface's features until they were comfortable with the interactions. Task sets for the scenario were visible as a map overlay. Questions pertaining to the interface and its operation were encouraged during training.

After the training, a 20-minute trial was completed with each interaction condition. The presentation order of the interaction conditions was randomized. The trials required participants to input twelve tasks. An equal number of each task type was performed across interaction conditions, but on different maps of similar complexity. The tasks were similar in difficulty, possessing equal numbers of fields to specify for both trials.

Apparatus

The evaluation was performed on a Dell Latitude XT2 Tablet PC with a 1.33 GHz Intel Core™ 2 Duo CPU. The device has a 267 × 163 mm N-trig capacitive touchscreen with a 1280 × 800 px resolution. The interface was developed in C++ with the Qt 4.6.3 framework under Windows® 7.

Data Collection and Metrics

Objective data (e.g., trial completion times, errors committed) were automatically logged. Additional dependent measures were the NASA TLX subjective workload measure, Likert-scale interface rankings, and direct interface comparisons. Questionnaires were administered after scenario completion for each condition, and a comparison questionnaire was completed after both trials. Likert-scale questions assessed the perceived interaction difficulty and ranged in value from 1 to 9 inclusive, with 9 being the best.

Trial interaction time was measured as the time span from the start of the first task specification until the commitment of the last task. The time required for specifying areas on the map was removed because no widget interaction occurred during this time. This time span provides an accurate depiction of a participant's total interaction time for each condition.

Users' errors directly related to the interaction condition were recorded. The main indicator of an input error is a user correcting a previously entered value. Participants may also have knowingly or unknowingly left an incorrect value; this error type may be related to the participant's understanding of the scenario rather than the interaction technique. Based on observations, many errors resulted from selections that were contrary to the users' intentions. Therefore, it is worthwhile to compare corrected errors between the two interaction techniques.


Hypotheses

Performing gestures is likely more time consuming than a single touch interaction. Therefore, the hypothesis was that the LocalSwipes technique takes longer to perform than touch-to-click interaction, resulting in longer task completion times and a longer overall trial completion time. However, it was also hypothesized that the LocalSwipes technique results in fewer input errors and is generally preferred by users when compared to touch-to-click interaction.

Participants

Sixteen volunteer participants (8 women and 8 men) completed the evaluation. One participant was left-handed. Thirteen participants reported using a touch-enabled device for one or more hours per week. The average participant age was 23.7 ± 7.9 years.

RESULTS

Table 1: Interaction time descriptive statistics

                   Overall          Female           Male
                   Mean     SD      Mean     SD      Mean     SD
LocalSwipes        405.9    86.8    418.4    92.6    393.3    84.8
Touch-to-Click     347.6    70.3    333.5    72.9    361.7    69.3

The trial interaction time descriptive statistics are presented in Table 1. A Student's t-test determined that trial interaction times for the touch-to-click interaction were significantly faster than the LocalSwipes interaction (t(15) = 2.80, p = 0.007). Interaction time differences exist by gender. While no significant difference exists across conditions for males, females were significantly faster with the touch-to-click interaction (t(7) = 3.24, p = 0.007).

The mean interaction error counts with standard error bars are presented in Figure 2. These are errors users corrected. One participant committed 2.5 times more errors than any other participant and was removed as an outlier. A t-test found significantly fewer interaction errors committed with LocalSwipes (t(14) = 2.65, p = 0.009). There were no significant differences in errors by gender when using LocalSwipes (see Figure 2). However, females produced fewer touch-to-click interaction errors. The difference in errors between LocalSwipes and touch-to-click was not significant for females. Males did have significantly fewer errors with LocalSwipes (t(7) = 2.13, p = 0.04).

Figure 2: Mean interaction error count per trial with standard error bars.

The median user preferences for buttons and combo boxes are presented by condition in Figure 3. Due to the ordinal nature of Likert-scale data, Wilcoxon signed-rank tests were performed. Overall, participants significantly preferred LocalSwipes widgets to touch-to-click widgets (W = 376, p = 0.03). LocalSwipes was preferred for both individual widget types, but these differences were not significant. Females rated both touch-to-click buttons and combo boxes similarly. Females also rated all widgets similarly when comparing LocalSwipes to touch-to-click. Touch-to-click combo boxes were preferred by females, though not significantly. However, males significantly preferred LocalSwipes combo boxes (W = 11.5, p = 0.017) and LocalSwipes buttons (W = 15.5, p = 0.044). Overall, males significantly preferred LocalSwipes (W = 55.5, p < 0.01).

Figure 3: Median Likert-scale ratings for buttons, combo boxes, and all widgets, by condition and gender. Higher numbers indicate higher preference.

No significant differences were found for the NASA TLX ratings. Overall, the workload results were slightly higher for LocalSwipes (mean = 53.09 ± 12.12) than for touch-to-click interaction (mean = 49.81 ± 11.30). A similar trend exists for the individual components, except for frustration. Frustration was lower when using LocalSwipes (mean = 26.07 ± 20.57) as compared to touch-to-click interaction (mean = 32.93 ± 26.44).

Confidence intervals (CI) for the ranking survey data were calculated using the adjusted Wald method. Lower bounds greater than 50% indicate significant results with a Type I error rate of 0.05. LocalSwipes was significantly favored overall (75%, CI = 50.03% to 90.29%). LocalSwipes was rated as more natural and felt more productive by 72% of participants (CI = 54.46% to 84.60%). Overall, males significantly favored the LocalSwipes interaction (87.5%, CI = 50.78% to 99.89%). However, only 5 of 8 females preferred the LocalSwipes interaction (CI = 30.38% to 86.51%).
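The adjusted Wald (Agresti-Coull) intervals used for the ranking data can be computed in a few lines: add z²/2 pseudo-successes and z² pseudo-trials before applying the usual normal approximation. A minimal sketch, with function names of our own choosing:

```python
import math

def adjusted_wald_ci(successes, n, z=1.959964):
    """95% adjusted Wald (Agresti-Coull) confidence interval for a
    binomial proportion: adjust the counts, then use the normal
    approximation on the adjusted proportion."""
    n_adj = n + z * z
    p_adj = (successes + z * z / 2) / n_adj
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half

# 12 of 16 participants (75%) favored LocalSwipes overall; this
# reproduces the reported interval of 50.03% to 90.29%.
lo, hi = adjusted_wald_ci(12, 16)
print(f"{lo:.2%} to {hi:.2%}")
```

Because the lower bound exceeds 50%, the overall preference is significant at the 0.05 level, matching the criterion stated above.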



DISCUSSION

The results confirm the hypothesis that fewer errors occur when using LocalSwipes, but LocalSwipes results in longer trial completion times. However, it is often more desirable to perform a task correctly rather than quickly, and the effects of input errors may be dangerous in many real-time systems. We expect that interaction times will be reduced as users gain proficiency with the technique. LocalSwipes is also expected to be more robust to inaccurate input from mobile users.

The efficiency of LocalSwipes may have been hindered by the implementation. LocalSwipes was implemented from scratch, and as a result there was a visually detectable lag in application responsiveness. The traditional widgets were created from a standard library and showed no delay. This issue has been fixed, and we expect lower interaction times.

Differences in performance and preference exist between males and females. The findings indicate that males commit fewer errors with, and prefer, LocalSwipes. Females performed well with the LocalSwipes interaction, but performed better with touch-to-click, and thus did not have strong preferences. Comparing the error-correction data between genders showed no significant difference across interaction techniques for females, but a significant difference for males. An important difference between males and females is average finger size. On average, males have larger fingers (Peters, Hackeman, & Goldreich, 2009); thus, they likely have more difficulty with touch-to-click interaction, which relies on the accuracy of a single finger press over an occluded widget. If finger size is the most significant interaction difference between genders, then performance is equally dependent on widget size. Decreasing a widget's size will increase the difficulty with both interaction techniques, but will have a larger detrimental impact on touch-to-click interaction.

A limitation of the current study is that participants were seated; however, it provides an excellent baseline for the next step, mobile interaction. As individuals move (e.g., walk) and interact with a touch device, the ability to provide precise interaction may be degraded. It is expected that LocalSwipes will provide quicker, more accurate interaction for both genders under these conditions. A more efficient LocalSwipes implementation that also incorporates additional interaction capabilities will be evaluated with mobile users.

ACKNOWLEDGMENTS

This research was partially supported by a contract from the US Marine Corps Systems Command to M2 Technologies, Inc., National Science Foundation Grant IIS-0643100, and a Department of Defense National Defense Science and Engineering Graduate Fellowship. Additional partial support for underlying system development was provided by the Office of Naval Research Multidisciplinary University Research Initiative Program, award N000140710749.

CONCLUSION

LocalSwipes is a novel interaction technique for mobile users that incorporates semi-localized, multi-touch gestures with familiar widgets containing strong visual cues. An initial validation comparing LocalSwipes with touch-to-click interaction revealed that LocalSwipes reduces interaction errors and is generally preferred by seated users over basic touch-to-click interaction, especially among males. However, trial completion times were slower for LocalSwipes. The design of LocalSwipes is intended to address limitations in existing gesture-based interaction and warrants further adaptation and evaluation in a mobile context.

REFERENCES

Bailly, G., Lecolinet, E., & Guiard, Y. (2010). Finger-count & radial-stroke shortcuts: 2 techniques for augmenting linear menus on multi-touch surfaces. Proceedings of the 28th International Conference on Human Factors in Computing Systems (pp. 591-594). Atlanta, Georgia, USA: ACM.

Bailly, G., Lecolinet, E., & Nigay, L. (2008). Flower menus: a new type of marking menu with large menu breadth, within groups and efficient expert mode memorization. Proceedings of the Working Conference on Advanced Visual Interfaces (pp. 15-22). Napoli, Italy: ACM.

Benko, H., Wilson, A. D., & Baudisch, P. (2006). Precise selection techniques for multi-touch screens. Proceedings of the Conference on Human Factors in Computing Systems (pp. 1263-1272). Montréal, Québec, Canada: ACM.

Hayes, S. T., Hooten, E. R., & Adams, J. A. (2010). Multi-touch interaction for tasking robots. Proceedings of the 5th ACM/IEEE International Conference on Human-Robot Interaction, HRI '10 (pp. 97-98). New York, NY, USA: ACM.

Holz, C., & Baudisch, P. (2010). The generalized perceived input point model and how to double touch accuracy by extracting fingerprints. Proceedings of the 28th International Conference on Human Factors in Computing Systems (pp. 581-590). Atlanta, Georgia, USA: ACM.

Kurtenbach, G. P., Sellen, A. J., & Buxton, W. A. S. (1993). An empirical evaluation of some articulatory and cognitive aspects of marking menus. Human-Computer Interaction, 8(1), 1-23.

Lepinski, G. J., Grossman, T., & Fitzmaurice, G. (2010). The design and evaluation of multitouch marking menus. Proceedings of the 28th International Conference on Human Factors in Computing Systems (pp. 2233-2242). Atlanta, Georgia, USA: ACM.

Moscovich, T. (2009). Contact area interaction with sliding widgets. Proceedings of the 22nd Annual ACM Symposium on User Interface Software and Technology (pp. 13-22). New York, NY, USA: ACM.

Parhi, P., Karlson, A. K., & Bederson, B. B. (2006). Target size study for one-handed thumb use on small touchscreen devices. Proceedings of the 8th Conference on Human-Computer Interaction with Mobile Devices and Services, MobileHCI '06 (pp. 203-210). New York, NY, USA: ACM.

Peters, R. M., Hackeman, E., & Goldreich, D. (2009). Diminutive digits discern delicate details: fingertip size and the sex difference in tactile spatial acuity. Journal of Neuroscience, 29(50), 15756-15761.

Potter, R. L., Weldon, L. J., & Shneiderman, B. (1988). Improving the accuracy of touch screens: an experimental evaluation of three strategies. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '88 (pp. 27-32). New York, NY, USA: ACM.

Sears, A., & Shneiderman, B. (1991). High precision touchscreens: design strategies and comparisons with a mouse. International Journal of Man-Machine Studies, 34(4), 593-613.

Wickens, C. D., Lee, J. D., Liu, Y., & Gordon-Becker, S. (2004). Introduction to Human Factors Engineering (2nd ed.). Upper Saddle River, NJ: Prentice Hall.

Yatani, K., Partridge, K., Bern, M., & Newman, M. W. (2008). Escape: a target selection technique using visually-cued gestures. Proceedings of the 26th International Conference on Human Factors in Computing Systems (pp. 285-294). New York, NY, USA: ACM.

Zhao, S., Agrawala, M., & Hinckley, K. (2006). Zone and polygon menus: using relative position to increase the breadth of multi-stroke marking menus. In R. Grinter, T. Rodden, P. Aoki, E. Cutrell, R. Jeffries, & G. Olson (Eds.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1077-1086). New York, NY, USA: ACM.