On the Gap between Automated and In-Vivo Evaluations of Web Accessibility

Rui Lopes and Luís Carriço

LaSIGE/University of Lisbon
Campo Grande, Edifício C6, Piso III, 1749-016 Lisboa, Portugal
{rlopes,lmc}@di.fc.ul.pt

Abstract. In this paper we present an accessibility analysis framework for the specification of Web accessibility evaluation scenarios that differentiates the requirements of users with disabilities, in order to achieve personalised accessibility assessment. We apply this framework to evaluations of Web accessibility (e.g., WCAG 2.0 conformance) and to usability evaluations with users with disabilities. With this framework we leverage usability tests with these users to discern the non-automatable parts of WCAG and, consequently, remove the need for experts to analyse these parts for each personalisation scenario.

Keywords: Personalised Web Accessibility, Accessibility Theory, Accessibility Requirements.

1 Introduction

Accessibility, the ability to access information and services, is at the deep root of Human-Computer Interaction. It is the central concept of e-Inclusion for people with disabilities such as blindness, deafness, etc. However, a broader spectrum of audiences is highly dependent on accessibility factors. Accessibility must be taken into account in different situations, e.g., due to limitations or features of interaction devices (such as mobile phones or in-car multimedia systems), as well as environmental and situational aspects (e.g., lighting, noise, device handling). Therefore, the classical perspective of accessibility should take into account not just the user with disabilities, but the broader notion of a user interacting with a device in a particular environment. While in particular application domains the scope of accessibility can be constrained to more restricted audiences, there are other cases where a broader view is required. Web accessibility is one of these cases, for several reasons:

• The Web is growing in size. As the Web's contents become more diverse, it attracts more users. Consequently, the diversity of audiences widens and accessibility-dependent interactivity becomes more important;

• The underlying architecture of the Web is decentralised. The Web's growth can be directly related to its decentralised architecture [1], which allows any person to publish a Web page without any kind of quality control. Consequently, no user needs are taken into account (apart from the page creator's point of view), and Web accessibility is neglected for a wide range of audiences;
• The attention span on the Web is much shorter than with traditional software. Every Website is "a click away", and a user's attention span shortens further when interaction difficulties are found, pushing the user towards other information sources on the Web. This issue becomes highly critical for accessibility-dependent audiences, since the cost of "jumping to the next Website" is negligible.

For all of these reasons, it is paramount to understand whether a given Web site is accessible to people with disabilities, by taking into account the synergies and differences between each user (and the device and environment where interaction with the Web is performed), thus achieving a personalised accessibility assessment. Currently, most usability experts, developers, and designers (hereinafter referred to as experts) apply at least one of two complementary approaches for testing Web site accessibility:

• WCAG compliance. Through the Web Accessibility Initiative, the W3C provides the Web Content Accessibility Guidelines [2], currently at version 2.0, with a set of checkpoints that experts use to verify whether a given Web page is accessible, such as testing if an image element has alternative text. These guidelines (and similar approaches such as Section 508 [3]) strive to provide a baseline for universal accessibility on evaluated Web pages. Based on WCAG, several software tools aid experts in ensuring guideline compliance and establish quality levels on resulting assessments (from A to AAA). Each checkpoint is verified in one of two ways (a minimal code sketch of this distinction follows this list):

Automated testing. A checkpoint can be fully automated if its applicability criteria can be checked by software (e.g., detecting the presence of a specific HTML element) and its testing procedures are also fully automatable (e.g., checking if an img element has an alt attribute). When accessibility compliance fails, an error is associated with the checkpoint;

Semi-automated testing. When a checkpoint cannot be fully automated, the applicability criteria can be checked by software, but the corresponding testing procedures are out of reach of software (e.g., verifying that the text of the alt attribute of an img element conveys the semantics of the displayed image). In this case, a warning is associated with the checkpoint, relying on the expert to verify whether compliance is met.

• Usability testing. Experts usually gather a set of users representative of a Web site's target audience and verify whether any issues arise with respect to its usability, friendliness, experience, etc. While not all aspects of accessibility can be captured with traditional usability testing [4], by choosing an appropriate set of users, usability testing can nevertheless yield interesting information about a Web site's accessibility. As in WCAG compliance, usability tests also provide answers about universal accessibility, through the aggregation of the results from these studies (e.g., gathered through end-user questionnaires).
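To make the error/warning distinction concrete, here is a minimal sketch in Python using only the standard html.parser module. This is not the tooling used by the authors; the ImgAltChecker class and its reporting format are illustrative assumptions. A missing alt attribute on an img element is reported as an error (fully automatable), while a present alt attribute is reported as a warning, since judging whether its text is equivalent to the image still requires a human.

```python
from html.parser import HTMLParser

class ImgAltChecker(HTMLParser):
    """Illustrative automated/semi-automated check for img alternative text."""

    def __init__(self):
        super().__init__()
        self.errors = []    # fully automatable failures
        self.warnings = []  # require human verification

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        src = attrs.get("src", "<unknown>")
        if "alt" not in attrs:
            # Applicability and test are both automatable: missing alt is an error.
            self.errors.append(f"img {src}: missing alt attribute")
        else:
            # Applicability is automatable, but the test (is the text equivalent
            # to the image?) is not: flag a warning for human judgement.
            self.warnings.append(f"img {src}: check that alt={attrs['alt']!r} is equivalent")

checker = ImgAltChecker()
checker.feed('<img src="chart.png"><img src="logo.png" alt="Company logo">')
print(checker.errors)    # ['img chart.png: missing alt attribute']
print(checker.warnings)  # ["img logo.png: check that alt='Company logo' is equivalent"]
```

Checkpoints flagged as warnings in this way are exactly the ones that the framework presented in this paper later routes to usability testing instead of expert review.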

However, both approaches strive for universal accessibility without knowing what is wrong for each specific user scenario (e.g., a specific disability). On the WCAG compliance side, quality marks provide an overall feeling of how accessible a Web site is, disregarding, e.g., whether a checkpoint concerning a visually impaired person fails on a non-critical section of a Web page (thus dismissing the personalised aspect). On the usability testing side, results might be skewed by over-focusing on a Web site's functional features rather than on the way accessibility is tailored. Furthermore, usability analysis tends to dismiss outliers and look at average results, thus potentially dismissing an accessibility problem faced by just a single user. This typically happens because the cost of having a high number of participants in usability studies is prohibitive, especially if they include users with disabilities. Consequently, this mismatch can result in incompatible outcomes between both approaches.

In this paper we present an accessibility analysis framework for the definition of evaluation requirements centred on Web accessibility, together with associated evaluation methodologies towards personalised accessibility. We provide a set of constructs to specify audiences based on user, device, and environment characteristics, and bind them to the corresponding semantics of WCAG 2.0 compliance. Afterwards, we apply this framework to show how the effort of usability test planning can be reduced and tailored to different users and usage scenarios, centred on providing the best user experience for accessibility-dependent users.

2 An Analysis Framework for Web Accessibility Evaluation

Usability testing with real users is commonly divided into three steps [5], as depicted in Figure 1: first, planning involves selecting users, deciding which tasks they have to perform with a given user interface, and defining questionnaires for users to fill in; second, the actual testing is performed and users fill in the questionnaires accordingly; lastly, experts analyse how users interacted and gather the corresponding data (including the questionnaires). A small data-structure sketch of the planning output follows Figure 1.

Fig. 1. Usability testing lifecycle
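Purely as a reading aid (this structure is not prescribed by [5] or by the framework; every name below is hypothetical), the output of the planning step can be pictured as a small data structure:

```python
from dataclasses import dataclass, field

@dataclass
class Questionnaire:
    questions: list = field(default_factory=list)

@dataclass
class UsabilityTestPlan:
    """Planning output: who will test, which tasks they perform, what they answer."""
    participants: list            # selected users
    tasks: list                   # tasks to perform with the user interface
    questionnaire: Questionnaire  # filled in during/after testing

plan = UsabilityTestPlan(
    participants=["totally blind user with screen reader", "user with low vision"],
    tasks=["Find this month's figures on the reports page"],
    questionnaire=Questionnaire(["Could you obtain the figures from the chart?"]),
)
```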

In this paper we focus on the first stage of this lifecycle, the planning of usability tests, through the definition of an analysis framework properly tailored to personalised accessibility assessment of Web sites, leveraging the potential of both WCAG 2.0 compliance testing and traditional usability testing. The framework centres on the kernel concept of mapping the characteristics of an audience (i.e., a grouping of user, device, and environment characteristics) onto the WCAG checkpoints that are appropriate for that audience. For this we have unfolded it into three sequential phases, as presented in Figure 2: Audience modelling, WCAG testing, and Task and Questionnaire definition.


Fig. 2. Phases of the analysis framework

The first phase of the framework concerns an analysis of which audiences are representative for the Website. This includes studying user abilities and disabilities, device characteristics, and environment settings, and how all of them make sense in the context of expected scenarios of interaction with the Website. The way this information is gathered depends on several factors, including the experts' knowledge, application requirements, or even early results of contextual analysis [6]. After defining the target audiences, the second phase of the analysis framework is put into action. We have introduced here an early testing phase based on the results provided by WCAG conformance checking tools, as opposed to the traditional testing with users. WCAG testing is tailored to the modelled audiences, yielding answers on whether the Website is accessible for each of them. However, since part of WCAG is not automatable, further analysis must be performed based on the results of this testing step. Consequently, we have defined a third phase in the analysis framework concerning the definition of tasks and questionnaires for usability tests. Based on the warnings returned from WCAG testing for each audience, tasks and questionnaires are tailored to the users corresponding to each audience. This goes beyond current Web accessibility assessment practices, in that it is the outcome of the usability tests and corresponding questionnaires that will determine whether a given WCAG warning is verified positively or negatively, instead of relying on expert analysis. Next, we further detail each of these phases, as well as the way they bind to each other.
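Before detailing them, the phase ordering can be summarised as a small pipeline skeleton. This is only a sketch mirroring Figure 2; the function names, the Audience type, and the placeholder return values are assumptions, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Audience:
    name: str
    characteristics: set  # e.g. {"total blindness", "screen reader", "keyboard"}

def model_audiences() -> list:
    """Phase 1: capture user, device and environment characteristics."""
    return [Audience("blind screen reader user",
                     {"total blindness", "screen reader", "keyboard"})]

def wcag_testing(site_url: str, audiences: list) -> dict:
    """Phase 2: run the automatable WCAG checks for each audience.

    A real implementation would invoke a WCAG evaluation tool and keep only
    the checkpoints mapped to each audience's characteristics; here we return
    a placeholder warning list per audience.
    """
    return {a.name: ["1.1.1: verify alternative text equivalence"] for a in audiences}

def define_tasks_and_questionnaires(warnings: dict) -> dict:
    """Phase 3: turn each remaining warning into a task and a question."""
    return {audience: [f"task/question covering: {w}" for w in ws]
            for audience, ws in warnings.items()}

plan = define_tasks_and_questionnaires(
    wcag_testing("http://example.org", model_audiences()))
```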

3 Audience Modelling

We have defined a set of modelling concepts to specify audiences, based on ontology engineering practices [7] with the OWL Semantic Web modelling language [8]. Figure 3 presents the concepts and corresponding relationships for the definition of audiences. In a nutshell, the main concepts for defining audiences are ev:AudienceDomain, ev:AudienceClass, and ev:AudienceCharacteristic. An audience domain is composed of a set of audience classes. Each audience class represents the conjunction of user, device, and environment characteristics. Several relationships can be established to describe the semantics of joining characteristics into audiences, for model coherence purposes (e.g., ensuring that no audience targets a totally blind person with a computer screen as output). More details on how to apply this meta-model to define audiences and their semantics have already been described elsewhere [9, 10], including a set of taxonomically organised concepts for describing users, devices, and environments. An illustrative sketch of how such a model could be expressed follows Figure 3.


Fig. 3. Audience meta-model
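As a rough illustration only, an audience such as "totally blind user with a screen reader and keyboard" might be written down with rdflib as follows. The actual ev: vocabulary and its relationships are defined in [9, 10]; the namespace URI, the instance names, and the hasCharacteristic/composedOf property names below are placeholders.

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EV = Namespace("http://example.org/ev#")  # placeholder namespace URI

g = Graph()
g.bind("ev", EV)

# An audience class grouping user, device and environment characteristics.
g.add((EV.BlindScreenReaderUser, RDF.type, EV.AudienceClass))
g.add((EV.BlindScreenReaderUser, RDFS.label,
       Literal("Totally blind user with screen reader and keyboard")))

# Characteristics attached to the audience class.
for ch in (EV.TotalBlindness, EV.ScreenReader, EV.Keyboard):
    g.add((ch, RDF.type, EV.AudienceCharacteristic))
    g.add((EV.BlindScreenReaderUser, EV.hasCharacteristic, ch))

# An audience domain is composed of audience classes.
g.add((EV.EvaluationDomain, RDF.type, EV.AudienceDomain))
g.add((EV.EvaluationDomain, EV.composedOf, EV.BlindScreenReaderUser))

print(g.serialize(format="turtle"))
```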

4 WCAG Testing

After the definition of audiences according to the modelling phase detailed above, the analysis framework shifts its focus to WCAG testing. This is supported directly by the framework through a small meta-model that maps individual tests to audience characteristics, as depicted in Figure 4.

Fig. 4. Audience characteristic mapping into tests

This mapping relies on a single property that establishes the dependency between characteristics and actual Web accessibility tests. These tests are described according to the EARL reporting language [11], where a single accessibility test verifies whether the Website being evaluated complies with a given WCAG checkpoint.


Consequently, with this mapping, experts can analyse which checkpoints have been passed by the Website for each audience modelled in the previous phase. We have implemented a software component that leverages these concepts and the mappings between checkpoints and audience characteristics [12] and delivers answers tailored to personalised Web accessibility evaluations based on WCAG. This component answers questions such as "is this Website accessible to partially sighted users?". This is accomplished by selecting the subset of WCAG checkpoints that have been mapped to a partial sight disability and testing the Website for compliance with them, with respect to the errors raised by automatic verification. With such software implementing the evaluation procedure, experts can begin analysing what accessibility problems might arise for each audience and, consequently, define usability tests around the warnings raised by the evaluation software.
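The selection logic can be sketched as follows. This is one possible reading of the mapping step, not the component described in [12]; the characteristic names, the success-criteria lists, and the result format are illustrative assumptions.

```python
# Hypothetical mapping: audience characteristic -> relevant WCAG 2.0 success criteria.
CHARACTERISTIC_TO_TESTS = {
    "total-blindness": ["1.1.1", "1.2.1", "2.1.1", "2.4.4"],
    "partial-sight":   ["1.4.3", "1.4.4", "1.1.1"],
    "deafness":        ["1.2.2", "1.2.4"],
}

def tests_for_audience(characteristics):
    """Union of the tests mapped to every characteristic of the audience."""
    tests = set()
    for c in characteristics:
        tests.update(CHARACTERISTIC_TO_TESTS.get(c, []))
    return tests

def personalised_report(evaluation_results, characteristics):
    """Keep only the results relevant to this audience.

    evaluation_results: {test_id: "pass" | "fail" | "warning"}, e.g. derived
    from an EARL report produced by an evaluation tool.
    """
    relevant = tests_for_audience(characteristics)
    return {t: outcome for t, outcome in evaluation_results.items() if t in relevant}

results = {"1.1.1": "warning", "1.4.3": "fail", "1.2.2": "pass"}
print(personalised_report(results, ["partial-sight"]))
# -> {'1.1.1': 'warning', '1.4.3': 'fail'}
```

The warnings that survive this filtering are the input to the task and questionnaire definition phase described next.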

5 Task and Questionnaire Definition

The last phase of the analysis framework concerns setting up usability tests for accessibility-dependent users. This phase differs from traditional usability testing in that personalisation is at the core of the analysis framework. Consequently, the definition of tasks and the creation of questionnaires depend on the audiences modelled in the first phase. We stress that this framework only targets issues directly covered by Web accessibility evaluation procedures (e.g., those based on WCAG), thus serving as a complement to traditional usability testing. This phase is further decomposed into the following steps:

• Audience selection. The expert begins by selecting one audience from the set of audiences modelled in the first phase of the analysis framework.

• Checkpoint selection. Based on the mappings between audience characteristics and checkpoints, the expert selects a checkpoint that both corresponds to a characteristic of the selected audience and for which the evaluation has yielded a warning. As an example, suppose the expert selects an audience representing the typical setup for a user who is totally blind: total blindness, screen reader software, and keyboard. While several checkpoints can be fully automated for this audience, some require expert analysis for their verification. In WCAG 2.0, success criterion 1.1.1 states (http://www.w3.org/TR/WCAG20/#text-equiv): "All non-text content that is presented to the user has a text alternative that serves the equivalent purpose, except for the situations listed below." The associated list of techniques (http://www.w3.org/WAI/WCAG20/quickref/#qr-text-equiv-all) explains how to detect this, e.g., by verifying that the alternative texts of image elements are equivalent (i.e., that the text describes the image properly). Since this cannot be verified automatically, it is flagged as a warning for every image element instance present in every page of a Website, thus becoming a candidate test for defining a task and a question to detect this issue through usability testing.

• Task definition. Based on the scenario detailed in the previous point, the expert must define a task properly tailored to the selected audience.


Extending the scenario, suppose an image on a Web page conveys information depicted as a plot chart (e.g., representing tabular data with millions of cells). The expert must define a task centred on this piece of information, as if no disability were present, and at the same time capture the essence of the test (as detailed for all techniques, cf. http://www.w3.org/TR/2008/NOTE-WCAG20-TECHS-20081211/G92). The way each task is defined must be deeply rooted in the purpose of the conveyed information, and should take into account all the characteristics of the selected audience (e.g., no "click on the picture" subtasks when no mouse-based interaction is possible).

• Question definition. Lastly, the expert must capture whether the user performing the defined tasks has performed them correctly and what level of accomplishment is inherent in that interaction. For that, traditional usability questionnaires are commonly used. However, these must also be tailored to the tasks defined earlier. The questions must be formulated in such a way that the user validates (either positively or negatively) the corresponding WCAG checkpoint. The aggregation of results from all users (of the selected audience) performing the usability tests will dictate whether the checkpoint has passed or not. This is one crucial distinction from traditional usability testing, since with this analysis framework experts have detailed answers for personalised accessibility.

Each of these steps is performed by the expert iteratively until tasks and questionnaires have been defined for each modelled audience and every corresponding checkpoint signalled with a warning. Task lists and questionnaires should be generated for each audience and interface with the other phases of the usability testing lifecycle (as depicted earlier in Figure 1).
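A hedged sketch of how the per-audience warnings could be turned into a task list and questionnaire is shown below. The actual wording of tasks and questions remains the expert's responsibility; the data format and the templates are assumptions for illustration only.

```python
def plan_usability_tests(audience, warnings):
    """Turn each WCAG warning raised for an audience into a task and a question.

    audience: {"name": ..., "characteristics": [...]} (assumed format)
    warnings: list of {"criterion": ..., "element": ...} entries (assumed format)
    """
    tasks, questions = [], []
    for w in warnings:
        tasks.append(
            f"Using {', '.join(audience['characteristics'])}, accomplish the goal that "
            f"depends on the content at {w['element']} (covers SC {w['criterion']}).")
        questions.append(
            f"Were you able to obtain the information conveyed by {w['element']}? "
            f"(validates SC {w['criterion']} for audience '{audience['name']}')")
    return tasks, questions

audience = {"name": "totally blind, screen reader, keyboard",
            "characteristics": ["a screen reader", "the keyboard"]}
warnings = [{"criterion": "1.1.1", "element": "the sales chart image on /reports"}]
tasks, questions = plan_usability_tests(audience, warnings)
```

The aggregated answers from all users of the selected audience then decide whether the corresponding checkpoint is marked as passed, as described above.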

6 Discussion

We believe that this approach to bridging the gap between WCAG compliance checking and usability testing applied to accessibility provides more accurate and more detailed answers about whether a Website is accessible to its target audiences. We argue that this is due to abandoning the practice of expert-only analysis (since each expert might introduce a bias when analysing each checkpoint) and introducing real users into the actual process of accessibility evaluation through usability testing. Furthermore, the personalised approach to usability evaluation detailed in this paper follows the goal of achieving both Universal Accessibility [13] and Universal Usability [14]. Since the proposed analysis framework is based on the definition of audiences, we also believe that this approach is applicable to other phases of the software development process. Audience modelling can be triggered right from the beginning, during requirements gathering, and the proposed modelling technique can be used to define non-functional requirements of a Website (or Web application). Furthermore, the development of tailored user interfaces can also be driven by audience modelling, e.g., with the aid of aspect-oriented development practices (where an aspect corresponds to an audience and implements its particularities).


7 Related Work

Developers can leverage existing accessibility assessment tools to check a Web page's conformance to guidelines such as WCAG (a list of such tools is maintained by the WAI [15]). These guidelines target several disabilities (e.g., total blindness, colour blindness, hearing impairment, etc.), but this mapping is usually directed at user groups [16, 17] only at an analysis level, not at individuals and their particular requirements in assessment procedures. In [18] the authors present UGL, an XML-based modelling language to specify accessibility guidelines. This language has been successfully applied in personalised Web accessibility studies [19]. While that approach does provide answers for personalised Web accessibility, it limits the study of richer semantics of personalised Web accessibility assessment (e.g., querying accessibility knowledge both from evaluations and from the guideline specifications themselves). On the ontology side, previous work has already explored accessibility concerns, focusing on covering end-user requirements for accessibility. The ICF [20] has been a common foundation for describing disabilities, synthesised by Obrenovic et al. [21] to model multi-modal interaction design, and used as the central point to model info-mobility services within the ASK-IT project [22]. More often than not, this type of approach focuses on developing accessible and personalised Web sites (e.g., through adaptive user interfaces [23]), not on the evaluation side.

8 Conclusions and Future Work

This paper presented a novel approach to the evaluation of Web accessibility that takes personalisation into account. We detailed a three-phase analysis framework encompassing audience modelling, WCAG compliance checking, and task and questionnaire definition. By applying these phases, experts (usability specialists, designers, or developers) can plan usability tests properly tailored to accessibility-dependent users and, consequently, obtain more reliable results on whether a given Website is accessible to its target audiences. Regarding future work, we are currently working on integrating this analysis framework with questionnaire-answering software that is tailored to audience modelling both in its UI (and therefore to users' requirements) and in the collection of questionnaire results, thus enabling improved automation of usability tests tailored to Web accessibility.

Acknowledgements. This work is funded by the EU FP7 project ACCESSIBLE, contract no. 224145, and by FCT through the SFRH/BD/29150/2006 grant and the Multiannual Funding Programme.

References

1. Jacobs, I., Walsh, N. (eds.): Architecture of the World Wide Web, Volume One. W3C Recommendation, World Wide Web Consortium (2004), http://www.w3.org/TR/webarch/


2. Caldwell, B., Chisholm, W., Slatin, J., Vanderheiden, G. (eds.): Web Content Accessibility Guidelines 2.0. W3C Recommendation, World Wide Web Consortium (2008), http://www.w3.org/TR/WCAG20/
3. Section 508 Amendment to the Rehabilitation Act of 1973, http://www.section508.gov/
4. Petrie, H., Kheir, O.: The Relationship Between Accessibility and Usability of Websites. In: CHI 2007: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 397–406. ACM Press, New York (2007)
5. Mayhew, D.J.: The Usability Engineering Lifecycle: A Practitioner's Handbook for User Interface Design. Morgan Kaufmann, San Francisco (1999)
6. Beyer, H., Holtzblatt, K.: Contextual Design: A Customer-Centered Approach to Systems Design. Morgan Kaufmann, San Francisco (1998)
7. Smith, M.K., Welty, C., McGuinness, D.L. (eds.): OWL Web Ontology Language Guide. W3C Recommendation, World Wide Web Consortium (2004), http://www.w3.org/TR/owl-guide/
8. Schreiber, G., Dean, M. (eds.): OWL Web Ontology Language Reference. W3C Recommendation, World Wide Web Consortium (2004), http://www.w3.org/TR/owl-ref/
9. Lopes, R., Carriço, L.: Querying Accessibility Knowledge from Web Graphs. In: The Handbook of Research on Social Dimensions of Semantic Technologies. IGI Global (2009) (accepted)
10. Lopes, R., Carriço, L.: The Impact of Accessibility Assessment in Macro Scale Universal Usability Studies of the Web. In: W4A: 5th ACM International Cross-Disciplinary Conference on Web Accessibility, Beijing, China, April 21-22 (2008)
11. Abou-Zahra, S. (ed.): Evaluation and Reporting Language (EARL) 1.0 Schema. W3C Working Draft, World Wide Web Consortium (2007), http://www.w3.org/TR/EARL10-Schema/
12. Lopes, R., Votis, K., Carriço, L., Likothanassis, S., Tzovaras, D.: Towards the Universal Semantic Assessment of Accessibility. In: 24th Annual ACM Symposium on Applied Computing (SAC), Waikiki Beach, Honolulu, Hawaii, USA, March 8-12 (2009)
13. Obrenovic, Z., Abascal, J., Starcevic, D.: Universal Accessibility as a Multimodal Design Issue. Commun. ACM 50(5), 83–88 (2007)
14. Shneiderman, B.: Universal Usability. Commun. ACM 43(5), 84–91 (2000)
15. Selecting Web Accessibility Evaluation Tools. World Wide Web Consortium, http://www.w3.org/WAI/eval/selectingtools.html
16. Brajnik, G.: Web Accessibility Testing: When the Method Is the Culprit. In: Computers Helping People with Special Needs, pp. 156–163 (2006)
17. Report to the Access Board: Refreshed Accessibility Standards and Guidelines in Telecommunications and Electronic and Information Technology (2008), http://www.access-board.org/sec508/refresh/report/#83
18. Arrue, M., Vigo, M., Aizpurua, A., Abascal, J.: Accessibility Guidelines Management Framework. In: Stephanidis, C. (ed.) HCI 2007. LNCS, vol. 4556, pp. 3–10. Springer, Heidelberg (2007)
19. Vigo, M., Arrue, M., Brajnik, G., Lomuscio, R., Abascal, J.: Quantitative Metrics for Measuring Web Accessibility. In: W4A 2007: Proceedings of the 2007 International Cross-Disciplinary Conference on Web Accessibility, Banff, Canada. ACM, New York (2007)
20. International Classification of Functioning, Disability and Health. World Health Organisation, http://www.who.int/classifications/icf/site/


21. Obrenovic, Z., Starcevic, D.: Modeling Multimodal Human-Computer Interaction. Computer 37(9), 65–72 (2004)
22. ASK-IT Ontological Framework: Public Deliverable D1.7.1. ASK-IT FP6 Integrated Project (2007), http://www.ask-it.org/
23. Gauch, S., Speretta, M., Chandramouli, A., Micarelli, A.: User Profiles for Personalized Information Access. In: The Adaptive Web, pp. 54–89 (2007)