Tell Me Lies: A Methodology for Scientifically Rigorous Security User Studies

Serge Egelman
Brown University, Providence, RI 02912
[email protected]

Janice Y. Tsai
California Council on Science & Technology, Sacramento, CA 95814
[email protected]

Lorrie F. Cranor
Carnegie Mellon University, Pittsburgh, PA 15213
[email protected]

Abstract

Studies that examine users’ perceptions of online privacy and security are especially difficult to design because study participants must be in a similar mindset as they would be in real life. To test a system designed to protect users from a known risk, study participants must be made to believe that they are actually at risk; otherwise their resulting behaviors cannot be generalized to the real world. At the same time, ethics issues arise when the study participant is actually placed at risk. In this paper, we describe our methodologies when performing usable security experiments, and we argue that deception is a necessary component when performing human subjects experiments in the areas of privacy and security.

Keywords

Security, privacy, usability, study methodologies, scientific validity

ACM Classification Keywords

H.5.2 Information Interfaces and Presentation: User Interfaces – Evaluation/methodology; H.5.2 User Interfaces: User-centered design; H.5.3 Group and Organization Interfaces: Evaluation/methodology, collaborative computing; K.4.1 Public Policy Issues: Privacy.

Copyright is held by the author/owner(s). CHI 2010, April 10–15, 2010, Atlanta, Georgia, USA. ACM 978-1-60558-930-5/10/04.

Introduction

The results of an experiment are only as valid as the methodology used to design said experiment. The study of human factors in online privacy and security is an area that requires even more attention to detail when designing user studies because data on the user’s primary task is rarely the objective. That is, users do not usually sit down at the computer to “do security;” security is often seen as an impediment to completing another task, and it is not a task unto itself [5]. Usability studies that frame security as the primary task are often flawed because their results cannot be generalized to users’ behavior in their natural environments. As an additional constraint, if study participants are aware that online security behaviors are being studied, they may alter their actions to “succeed” in the study. Thus, to yield scientifically valid results from user studies relating to online security and privacy, we are forced to deceive our participants about the nature of our studies.

The measures that people take to increase their online privacy and security often come at a cost—time or money—and, therefore, a rational person would only take these measures voluntarily when she believes she is legitimately at risk. For instance, it is not rational for a user to create a long unmemorizable password to protect information that is already public, because nothing is at stake. Similarly, when study participants are performing tasks that require them to make trust decisions, their decisions are of little value if they do not believe they are legitimately at risk. However, ethical guidelines prevent us from putting study participants in actual danger (in addition to creating subsequent participant recruitment problems). These concerns create another set of lies that we must tell our study participants to yield scientifically valid results.

In this paper, we argue that deception is often required when conducting usability studies of online privacy and security systems. We specifically discuss two types of lies that we tell study participants:

1. Priming can be minimized by deceiving participants about the purpose of the study and introducing subterfuge tasks.

2. Observed trust decisions are only generalizable when participants are led to believe they are at risk.

In the next section, we explain why priming is especially a problem for usable security researchers and why deceiving study participants is necessary to minimize priming effects. We describe several usability studies that our team has conducted that involved deceiving participants into thinking they were at risk, in order for us to yield valid results. These studies were in the areas of website privacy policies and web browser phishing warnings. While these studies were conducted in our laboratory, the purpose was to gain a better understanding of how users behave online.

Studies

Over the course of the past five years, we have designed and conducted several studies to examine users’ online privacy and security perceptions, as well as how they interact with systems designed to enhance their online privacy and security. In this section we provide overviews of our methodologies.

Privacy Premiums

In 2004, we developed Privacy Finder, a new search interface that displays privacy information as search result annotations (http://www.privacyfinder.org/). This way, web users can make decisions about which website to visit based on privacy policies. To examine the effectiveness of this interface, we conducted a series of usability studies [3, 4, 2].

Due to the aforementioned problems with simply asking people to state their privacy preferences, we did not wish to prime participants to the purpose of these studies. Therefore, we advertised each experiment as an “online shopping and searching study.” When participants arrived at our laboratory, we told them that we were generally interested in how they interact with search engines when making online purchases, and we would therefore be observing them use our custom search engine. So as to minimize priming effects, we changed the name from “Privacy Finder” to simply “Finder.”

We created a cost for increased privacy by pre-selecting the search results such that purchasing from the high-privacy merchants cost more. Since we paid participants a static amount for their participation, the premium cost of higher privacy came directly out of the participants’ pockets.

We created an experimental condition that annotated search results with icons representing privacy levels, as well as a control condition where these icons were absent or relabeled to represent irrelevant information. At no time did the experimenter discuss the icons or privacy itself, though a printed screenshot annotating the search engine features was provided in a packet of materials to each participant for their reference. When using the computer in their natural environments, no experimenter is present to prompt shoppers about the search results that are most in line with their privacy preferences; however, they may have access to help files.

We included subterfuge tasks that involved searching for product information (e.g., “what is the average cost for a pair of Ugg boots?”). These tasks served both to familiarize participants with the search interface and to deceive them into thinking that the purchasing tasks were not the only thing we were studying.

Finally, we were concerned that if participants did not believe they were facing legitimate privacy risks, they would not pay any attention to privacy information—why should they? For this reason we required participants to make actual purchases from unfamiliar (but real) merchants. Participants used their personal credit cards and billing information so that their concerns for privacy would approximate the concerns they would have when making purchases under normal circumstances. In this manner, participants understood that the risks they faced in our laboratory were the same as the risks they faced in their natural environments when making online purchases.

Phishing Warnings

Security warnings are a web browser’s last line of defense against many of the online threats that face users. These warnings attempt to alert users to potential phishing websites, man-in-the-middle attacks, or other types of insecure websites. When users encounter these warnings, they are often in a mindset to fall for an attack. For instance, when a user views a phishing warning after clicking a link in a fraudulent email, she incorrectly trusts the email and is prepared to transmit her credentials to the phishing website. To properly study these warnings, study participants must be in a similar mindset. We conducted a study to examine the usability of current web browser phishing warnings [1]. This study required particular attention to study design to minimize priming effects and to simulate participants’ natural environments.

The first problem we encountered was framing the study so that participants were not primed to phishing concerns. To do this, we advertised the study as another “online shopping study,” and told participants that we would be observing their purchasing behaviors. Each participant purchased items from Amazon and eBay using his or her own billing information.

After each purchasing task was completed, the experimenter provided the participant with a survey on her shopping experience. This survey served as subterfuge while the experimenter sent the participant a phishing message spoofing either Amazon or eBay. Before the participant proceeded to subsequent tasks, she was asked to check her email for the order confirmation—at which point she also encountered the phishing message, subsequently followed the link to the spoofed website, and then encountered a security warning. While this particular attack was highly targeted (participants were more likely to believe the spoofed emails because they had just done business with the website in question), it put participants in the same mindset as they would have been when viewing a phishing warning in their natural environments: they viewed the email as legitimate, and then the web browser warning was the last defense against viewing the phishing website.

The second problem that we needed to address was creating an actual sense of risk, such that participants would be forced to make a value judgment (e.g., is it worth ignoring this warning?). To approximate a real phishing attack, we registered two domain names and designed websites that were indistinguishable from actual phishing websites. This was the closest approximation we could make, since using real phishing websites would have been unethical due to the severe risks it would have placed on participants. Since participants were under the impression that we were studying the usability of shopping websites, they did not believe that the warnings were part of the experiment. Participants therefore had reason to believe there was a potential risk when ignoring the warnings, and thus we approximated the conditions in which they would be viewing similar warnings in their natural environments.

Conclusion

To best evaluate the effectiveness of online privacy and security systems and interfaces, researchers must attempt to capture how users interact in their natural environments. This is difficult because users often say they are very concerned about their privacy and security, but act in ways that are not consistent with their concerns. We believe that in order to yield valid study results, we must deceive participants as to the purpose of the study and create an environment where users perceive that they are subject to real risk.

References

[1] S. Egelman, L. F. Cranor, and J. Hong. You’ve been warned: An empirical study of the effectiveness of web browser phishing warnings. In Proceedings of the ACM Computer-Human Interaction Conference, New York, NY, USA, April 2008. ACM Press.

[2] S. Egelman, J. Tsai, L. Cranor, and A. Acquisti. Timing is Everything? The Effects of Timing and Placement of Online Privacy Indicators. In Proceedings of the ACM Computer-Human Interaction Conference, New York, NY, USA, 2009. ACM Press.

[3] J. Gideon, S. Egelman, L. Cranor, and A. Acquisti. Power Strips, Prophylactics, and Privacy, Oh My! In Proceedings of the 2006 Symposium on Usable Privacy and Security, pages 133–144, July 12–14, 2006.

[4] J. Tsai, S. Egelman, L. Cranor, and A. Acquisti. The effect of online privacy information on purchasing behavior: An experimental study. In Proceedings of the 2007 Workshop on the Economics of Information Security (WEIS’07), Pittsburgh, PA, USA, 2007.

[5] A. Whitten and J. D. Tygar. Why Johnny Can’t Encrypt: A Usability Evaluation of PGP 5.0. In Proceedings of the 8th USENIX Security Symposium, August 1999.