This is the author copy of Tool Independence for the Web Accessibility Quantitative Metric. Disability & Rehabilitation: Assistive Technology 4(4), 248-263. Informa Healthcare. Available at http://informahealthcare.com/doi/abs/10.1080/17483100902903291
Note that there might be some inconsistencies between this and the above publication so use this copy at your own risk.

Tool Independence for the Web Accessibility Quantitative Metric

Markel Vigo1, Giorgio Brajnik2, Myriam Arrue1 and Julio Abascal1
1 University of the Basque Country, Informatika Fakultatea, 20018 Donostia, Spain. markel, myriam, [email protected]
2 Dipartimento di Matematica e Informatica, Università di Udine, 33100 Udine, Italy. [email protected]

Abstract. The Web Accessibility Quantitative Metric (WAQM) aims at accurately measuring the accessibility of web pages. One of the main features of WAQM is that it is evaluation tool independent for ranking and accessibility monitoring scenarios. This paper proposes a method to attain evaluation tool independence for all foreseeable scenarios. After demonstrating that homepages have an error profile similar to that of any other web page in a given web site, 15 homepages were measured with 10000 different values of WAQM parameters using the automatic accessibility evaluation tools EvalAccess and LIFT. A similar procedure was followed with random pages and with several test files, obtaining several tuples that minimize the difference between both tools. 1449 web pages from 15 web sites were measured with these tuples and those values that minimized the difference between the tools were selected. Once WAQM was tuned, the accessibility of 15 web sites was measured with two metrics for web sites, concluding that even if similar values can be produced, obtaining the same scores is undesirable since evaluation tools behave in a different way.

1 Introduction

In recent years a great deal of research has been carried out in the field of metrics for web accessibility assessment, that is, how to measure the accessibility level of a web site.
Measuring the accessibility level is necessary since more and more scenarios are based on such levels (see examples below), and in many cases they require high accuracy in order to rate and assess the accessibility of web sites.

The most broadly used metric for accessibility is the qualitative metric proposed by the Web Accessibility Initiative1 (WAI) in the context of the Web Content Accessibility Guidelines 1.0 (Chisholm et al., 1999), which is used for measuring the conformance of a web page with that guideline set. A web page satisfying all checkpoints obtains an “AAA” rating; if it satisfies all priority 1 and priority 2 checkpoints it gets “AA”; the “A” rating is obtained if only all priority 1 checkpoints are satisfied; and finally the web page is “non conformant” otherwise. Notice how by just violating a single priority 1 checkpoint the page becomes non conformant.

1 The Web Accessibility Initiative, WAI. Available at http://www.w3.org/WAI/

This metric (which leads to ordered symbolic values {AAA, AA, A, NC}) is not precise enough to rate and classify web applications according to their accessibility level. For example, a web page fulfilling all priority 1 checkpoints would obtain the same accessibility value as another web page fulfilling all priority 1 checkpoints and almost all priority 2 checkpoints: both of them would get the A level conformance. This criterion seems to be based on the assumption that if a web page fails to accomplish one of the checkpoints in a level, it is as inaccessible as if it failed to fulfil all of them. Although this is possible, several scenarios exist that require a metric with a higher resolution than a small ordered scale of accessibility scores.

Quality Assurance (QA) within Web Engineering

Web Engineering defines specific methodologies, models and techniques for web application development. Since its final goal is to obtain high quality web applications, a Quality Assurance process is of paramount importance. This entails the application of metrics, methods and quality models throughout the development process.
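The WCAG 1.0 conformance rule described earlier in this introduction can be sketched in a few lines, which also makes its coarseness visible. The following Python sketch is illustrative only: the `PageReport` structure, its field names and the violation counts are assumptions made for this example and do not come from any evaluation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageReport:
    """Violated WCAG 1.0 checkpoints per priority (hypothetical structure)."""
    p1_violations: int
    p2_violations: int
    p3_violations: int


def conformance_level(report: PageReport) -> str:
    """Map a page report to one of the ordered symbolic values {AAA, AA, A, NC}."""
    if report.p1_violations > 0:
        return "NC"   # a single priority 1 violation makes the page non conformant
    if report.p2_violations > 0:
        return "A"    # only all priority 1 checkpoints are satisfied
    if report.p3_violations > 0:
        return "AA"   # all priority 1 and priority 2 checkpoints are satisfied
    return "AAA"      # all checkpoints are satisfied


# Illustrative counts (assumed): both pages satisfy every priority 1 checkpoint,
# but the second one violates just one priority 2 checkpoint instead of twelve.
many_p2_failures = PageReport(p1_violations=0, p2_violations=12, p3_violations=5)
almost_aa = PageReport(p1_violations=0, p2_violations=1, p3_violations=5)

print(conformance_level(many_p2_failures), conformance_level(almost_aa))
```

Both example pages map to "A" although they differ widely in the number of priority 2 violations, which is the loss of resolution that motivates a quantitative metric.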
Within such a QA process, measurement of web usability and accessibility should therefore be performed during the different stages of the lifecycle and should be capable of yielding precise and standard results: precision is needed in order to distinguish levels of accessibility that may be close to each other, and standardization is needed in order to compare different versions of a web site.

In this sense, quality models such as 2QCV3Q (Mich et al., 2003) and WebQEM (Olsina and Rossi, 2002) have been defined. The characteristics of web applications and the necessary metrics for their evaluation are included in these models. In both quality models, web accessibility is an attribute which should be measured in order to guarantee the quality of the product. For this reason, quantitative metrics are essential in order to accurately measure usability and accessibility properties, since both are needed for QA. In addition, ranking prototypes of a product according to their accessibility level is useful for assessing the impact of changes, updates in functionalities, etc. in any iterative development process, which again calls for quantitative metrics.

Considering Web Accessibility in Information Retrieval

Ivory et al. (2004) conducted a study with visually impaired users in order to determine the factors which would improve search engine results for those users.
The study concluded that some users would like to know additional details about search results, such as whether retrieved pages are accessible to them or not, and the paper recommends sorting results according to accessibility or usability criteria. Re-ranking results according to users’ visual abilities would improve their search experience.

In this sense, Google has launched “Google Accessible Search”2 where results are ordered according to the criteria stated in their FAQ: “pages with few visual distractions and pages that are likely to render well with images turned off”. However, this is not a comprehensive approach since only some guidelines for visually impaired users are being considered and users with other types of disabilities are not taken into account at all.

2 Available at http://labs.google.com/accessible/

Arrue et al. (2008) propose a model to adequately combine results provided by traditional IR systems, such as search engines for the WWW, with accurate accessibility measurement. Results are either re-ranked according to their accessibility level or shown in the order provided by the search engine but labelled with their accessibility score. The former modality attaches importance to the accessibility of results, while the latter gives preference to the relevance of the results for a given query. In any case, these modalities can only be achieved if accessibility can be measured.

Accessibility Monitoring

Once a web site has been developed, keeping track of the evolution of its accessibility level is of paramount importance since it may be subject to legal requirements. Due to the nature of the WWW, updates in web sites are quite frequent and little control can be exerted on their content (for example, consider collaborative web sites like Flickr or a news web site like CNN.com). Since updates can decrease the accessibility level, some web sites find themselves in a limbo situation which can result in legal issues and possibly administrative fines. Therefore, monitoring the evolution of web accessibility requires metrics that are precise enough to avoid those circumstances. These metrics should be used for ranking purposes, so ordinal values may be enough.
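As a rough illustration of the monitoring scenario, the sketch below ranks web sites by a quantitative accessibility score and flags sites whose score dropped between two evaluation snapshots. The site names, the scores, the `tolerance` parameter and the convention that a higher score means a more accessible site are all assumptions made for the example; the sketch does not compute WAQM itself.

```python
from typing import Dict, List, Tuple


def rank_sites(scores: Dict[str, float]) -> List[str]:
    """Order sites from highest to lowest score (ties broken by name)."""
    return sorted(scores, key=lambda site: (-scores[site], site))


def regressions(previous: Dict[str, float],
                current: Dict[str, float],
                tolerance: float = 0.0) -> List[Tuple[str, float]]:
    """Return (site, drop) pairs for sites whose score fell by more than `tolerance`."""
    drops = []
    for site, old_score in previous.items():
        new_score = current.get(site)
        if new_score is not None and old_score - new_score > tolerance:
            drops.append((site, old_score - new_score))
    # Largest drop first, so the most urgent regressions are listed at the top.
    return sorted(drops, key=lambda item: -item[1])


# Hypothetical scores from two consecutive evaluations of the same sites.
previous = {"site-a": 80.0, "site-b": 65.0, "site-c": 40.0}
current = {"site-a": 60.0, "site-b": 66.0, "site-c": 40.0}

print(rank_sites(current))
print(regressions(previous, current))
```

Only the ordering of the scores is used for the ranking, in line with the observation that ordinal values may be enough for this scenario, whereas detecting a drop between snapshots relies on the scores being comparable across evaluations.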
Accessibility monitoring processes may also be helpful for public institutions in order to keep track of their e-government web sites’ accessibility level. In addition, they may be useful for comparing the accessibility level of different web sites and creating ranking lists, as an accessibility observatory does.

Vigo et al. (2007) report a study on the behaviour of the Web Accessibility Quantitative Metric (WAQM) when evaluating the accessibility of web sites with two different tools, EvalAccess and LIFT. Using automated tools for testing accessibility is seen by many as one of the most effective ways to cope with accessibility, since only by using tools (possibly in addition to human judgment) can the solutions outlined in the previous scenarios be made viable. In the study, 1363 web pages from 15 web sites were evaluated against WCAG 1.0 using both tools and measured using WAQM. The conclusion was that values produced by WAQM on the basis of data obtained by the two different tools are completely different, but there exists a strong correlation; Spearman’s correlation test on the accessibility index produced by WAQM leads to ρ(1363)=0.719 with a high significance level (p