Journal of Computer-Mediated Communication

Learning to Search and Searching to Learn: Income, Education, and Experience Online

Philip N. Howard
Adrienne Massanari
Department of Communication
University of Washington

Using data from the Pew Internet and American Life Project surveys, this article explores changing trends in reported sophistication and satisfaction with search skills and with search engines. We find that the proportion of Internet users searching online for answers to specific questions, as opposed to casual browsing, has grown significantly. Moreover, as users get more experience online, they become increasingly dependent on search engines, confident in their findings, and savvy about how search engines structure information, privilege paid results, and track users. When other factors are controlled, years of online experience is a strong predictor of the likelihood of a person doing specific searches on a daily basis, and experience can have an even stronger positive effect than education and income. We also find that years of online experience, frequency of use, and sophistication with multiple search engines can overcome socioeconomic status in predicting how active a person is in searching across different topics.

doi:10.1111/j.1083-6101.2007.00353.x

Introduction

In important ways, searching for information is a social act, shaped by the political, economic, and cultural contexts in which we develop our research skills and learn to use new technologies. There is a growing literature about user search habits and the mechanisms of search. Statistical models have consistently revealed that being more educated or being in a higher income bracket predicts most of the variation in Internet users’ experience, from their ability to learn new things online, to their ability to maintain contact with family and friends (Howard, 2004). Research has also found that the design of information technology (both hardware and software architecture) structures the kinds of information people find online (Elmer, 2004; Howard, 2006; Introna & Nissenbaum, 2000; Shaw & Sandvig, 2005). Search engines organize content, providing a structured, mediated context in which people explore the political, economic, and cultural world. Such patterning can take implicit forms through the default settings provided by software engineers or more explicit forms through paid positioning of links in search results.

Journal of Computer-Mediated Communication 12 (2007) 846–865 © 2007 International Communication Association

Search engines are not only technological systems but political ones (Hindman, Tsioutsiouliklis, & Johnson, 2003). They deliberately structure information and are structured instantiations of the norms of software designers (Carmel, 1997; Gerhart, 2004). These informational structures systematically exclude (in some cases by design and in some accidentally) certain websites in favor of others, systematically giving prominence to some at the expense of others. Both the retrieval of results and the structure of retrieved information erode over time, especially as search engines update their indices regularly, overwrite web pages with newer ones, add new pages to the index, and lose older ones (Hellsten, Leydesdorff, & Wouters, 2006). The process of searching for information also has a social side. Research has shown that learning new Internet search skills, whether watching others work at public library terminals or having other family members share their search tips, is a social experience (Hargittai, 2004b; Sandvig, 2001). Race, gender, age, income, and education all potentially have a bearing on the likelihood that adults will try sophisticated online searches and on how likely they are to be satisfied by the results. If socio-economic status bears on search sophistication and satisfaction, to what degree can this status be overcome? In other words, can individuals’ online experiences over time compensate for lower levels of education or income? Many adults in the United States have now had Internet access for a decade, and by 2006, over 65% of the adult population (about 92 million people) had Internet access on a daily basis (Pew Internet and American Life Project, 2006). Yet relatively little is known about the daily habits of people online. For example, the effect that Internet experience has on a person’s online searching behavior has yet to be determined.
What is known is that people in higher income brackets may have better computer equipment and faster connections, and people with more education may have larger search vocabularies and more experience using computers to search for information in a variety of contexts, from work to home to public spaces like libraries and schools. Searching the Internet for an answer to a specific question involves fairly focused research, often using dedicated search websites such as Google, AltaVista, or AlltheWeb. In contrast, more casual research activities include browsing the content, links, and site maps at websites where users expect to find material related to their informational needs or using the search engines that are native to that particular website. This less systematic way of researching content often involves reading RSS (Really Simple Syndication) feeds from top news sites, obtaining information from portals like Yahoo!, MSN.com, or AOL.com, or exploring websites that are expected to host the content (such as news searches at CNN.com, forecast searches at Weather.com, or sports-related searches at ESPN.com). It may be wrong to expect individual users to know the relevant search terms they need before they begin their research. With time and effort, users may refine their search terms (and, in the process, their search skills) to access the information in which they are interested. This kind of search sophistication, we suggest below, may come with practice. While it is possible to define a sophisticated search by the specific
techniques used to query a search engine, it is also useful to define a sophisticated search in a socially situated context: the attributes of sophisticated searchers relative to the larger population of Internet users. Thus, we define search sophistication as: (1) the use of the Internet to look for information about specific questions, (2) in multiple domains of life, (3) several times a day and with multiple search engines, and (4) finding what was sought and being confident in the results. Studies of how student users search for political and health information have revealed that several years of Internet experience can help overcome deficits in income and education (Tewksbury & Althaus, 2000). Interestingly, research into how adults in Minnesota searched for political content during the 2002 election suggests that some of these effects may be cohort specific (Shah, Kwak, & Holbert, 2001). Having multiple years of online experience appears to predict voter turnout and bring some respondents political sophistication as much as several of the more traditional explanatory variables such as income, education, and level of interest in politics. Additionally, education plays a significant role in an individual’s searching behavior. For example, having a college degree may mean that a person learned to use Boolean search terms to query library databases, and having a good income may mean that a person has a fast connection for retrieving information from the Internet. Therefore, it is reasonable that income and education may explain much of a user’s sophistication or satisfaction with Internet searches. However, if someone does not have a high level of income or education, can that person overcome socioeconomic barriers with practice and online experience? As some research has shown, people often learn to use tools on an ad hoc basis, getting tips from friends and family as they go (Hargittai, 2004b). 
Therefore, it is reasonable to expect that individuals may acquire search skills over time, and that this experience might compensate for lower income or education levels. Thus, the question becomes: As people spend more time online conducting searches for information, are they actually learning how to search? Becoming adept at searching for information online actually involves learning a number of different searching techniques that one can tailor and change on the fly. For example, experiments have revealed that directory-based searching may not produce faster, more relevant results than query-based searching, and that as Internet users discover new search strategies, they discard less effective ones (Dennis, Bruza, & McArthur, 2002). Reformulating queries can significantly improve the relevance of the documents through which the user must trawl, although this can become a time-intensive task. However, the vast majority of Internet users will not learn this from academic journals on information retrieval. Instead, Internet users acquire this kind of knowledge as they use the technology, experimenting with different techniques and adapting research strategies. Their search skills may improve over time as they utilize and compare the differently structured contexts of search engines. This kind of behavior can sometimes be predicted by what statisticians call a learning algorithm. The learning algorithm reveals the classification rate that
occurs when someone uses intuition, rather than clear information, to make a decision. The process of doing an Internet search may be best thought of as an experimental process in which users test particular Internet tools and terms. Through experience (success and failure), users build up a repertoire of terms, skills, and Internet tools that makes them productive searchers. Whereas controlled experiments provide a snapshot of users’ search skills and behaviors at a single point in time, such experiments may miss the longer-term evolutionary process by which Internet users’ skills evolve (Schapire, Freund, Bartlett, & Lee, 1998). This learning algorithm results in salient differences between people who are new to technology, often labeled “newbies,” and those who are more experienced (Hölscher & Strube, 2000). Concomitantly, users’ learning algorithms are formed through their frustration at not finding what they hope to find, as inexperienced users are much more likely to give up on a particular search (or on the Internet as a search tool) if searching is too difficult (Slone, 2002). Although it takes participant observation to report accurately on the search techniques of particular individuals, national survey data can teach us something about broad patterns in reported search behavior and experience. In 2002, more than half of the adults who went online to search for political, campaign, or election news did a broad search, while only one third went straight to a particular site for their information (Pew Internet and American Life Project, 2002). Slightly less than half of the adults began with a general site like MSN or the AOL homepage with organized topics to browse, while one third of the adults went to a site like Google and typed in keywords to get a list of the websites that might have the information they were seeking. Less than one fifth of the adults went to a news site or a politically oriented site to begin their search.
Once they had a list of search results to explore, fully two thirds of the adults read the explanation of each website and chose the one that best fit what they were looking for, and one quarter went systematically down the list in order. Only one in ten chose a search result because they recognized a name or sponsor among the results. Two thirds of the adults reported visiting three or fewer websites in their search for political and election information, and 29% had to visit four or more websites. In the end, over half of the adults found most or all the information they were looking for, and almost one third reported running out of time and giving up. We often assume that income or education explains most of the variation in user search sophistication and satisfaction, but to what degree does online experience compensate for income or education?

Methods and Data

To test the hypothesis that Internet experience can overcome the effects of education and income in predicting a user’s search sophistication and satisfaction, we explored data from the Pew Internet and American Life Project 2000 and 2004 surveys. This research group conducted a daily tracking survey about adults and their Internet use,
employing a random digit dial process to sample from the U.S. adult population. The 2000 data were collected between March and December of that year. In this sample, Pew reports with 95% confidence that the error attributable to sampling and other random effects for the full sample is ±3%, and for the sub-sample of Internet users, ±4%. The 2004 sample, respondents aged 18 and older, was collected between February and December 2004. In this sample, Pew reports with 95% confidence that the error attributable to sampling and other random effects is ±3%. In 2000 and 2004, the survey response rates were 32% (Pew Internet and American Life Project, 2000, 2004).1 Table 1 presents some basic descriptive statistics about the samples. The first Internet users were often male, well educated, white, and often occupied a high income bracket (Howard, Rainie, & Jones, 2001; Katz & Rice, 2002). In this way, Table 1 reveals that the sample is consistent with other national-level surveys of Internet use. Moreover, there are important differences between the population of Internet users in 2000 and 2004. While the population of Internet users grew significantly and became more diverse in terms of race, gender, income, and education, the average level of search experience diminished slightly. In 2000, respondents were asked if they had been using the Internet for one year or less, between one and three years, or for more than three years. By 2004, many people had some online experience, so this question was changed to ask respondents the number of years they had had Internet access. These 2004 data allow us to compute categories that can be compared with the 2000 data. We find a large rise in the proportion of Internet users with more than three years of experience, a jump from 35% in 2000 to 82% in 2004.
Over this period there was a slight rise in the proportion of Internet users searching on a daily basis, and by 2004, 56% of daily users queried more than one search engine a day.

Findings and Analysis

Since the descriptive statistics suggest important differences between the levels of search experience in 2000 and 2004, we begin with a comparison of search habits, and look at the different kinds of searches conducted by Internet users in these years. Table 2 reveals that the experience we call “search” is not limited to, or even dominated by, search engines. People use the Internet as a source for many different kinds of information. Search can involve looking through edited listings such as Yahoo! categories and stock tickers or checking an online newspaper for sports scores (Hargittai, 2004a). Thus, Table 2 is organized into domains of search activity: searching for fun and culture, life and health queries, more utilitarian and transactional searches, and searching for news and information about politics. In general, individuals who reported going online “yesterday” are more active Internet users than the population that reported “ever” going online. Among those who had ever done these activities, the largest change is in the portion of people using the Internet to search for weather reports and news about politics. Among the

Table 1 Descriptive statistics

                                        2000 (Percent)            2004 (Percent)
                                        All     Internet Users    All     Internet Users
Gender
  Male                                  47      51                47      49
  Female                                53      49                53      51
Education
  No college degree                     70      58                69      58
  College degree or more                29      42                31      42
Income
  Less than $50,000                     49      41                46      39
  $50,000 or more                       28      41                33      46
  Don’t know or refused                 23      18                21      15
Hispanic                                7       7                 7       6
Race
  White                                 79      82                82      84
  African American                      11      9                 10      8
  Asian American                        2       2                 2       2
  Other                                 5       5                 4       4
  Don’t know or refused                 2       2                 2       2
Internet users
  Ever gone online?                     ..      100               ..      100
Went online yesterday?
  Yes, went online yesterday            ..      56                ..      57
  No, did not go online yesterday       ..      44                ..      42
  Don’t know or refused                 ..      0                 ..      1
Internet experience
  Less than 1 year of experience        ..      13                ..      2
  1-3 years of experience               ..      52                ..      16
  More than 3 years of experience       ..      35                ..      82
  Years online (mean)(a)                ..      ..                ..      6.58
Frequency of use
  At least once a day                   ..      61                ..      66
  1-5 days a week                       ..      28                ..      25
  Every few weeks or less often         ..      6                 ..      5
  Don’t know or refused                 ..      5                 ..      4
Search engines
  Only uses one (b)                     ..      ..                ..      44
  More than one (b)                     ..      ..                ..      56
Unweighted N                            26,094  13,921            7,518   4,631

Notes: (a) Sample size for this variable is 4,550 because some respondents answered “Do not know.” (b) Sample size for this variable is 1,144 because the question was fielded for a shorter period of time.


Table 2 Research habits of U.S. adults with Internet access, 2000 and 2004

                                                        Did This Ever? (Percent)  Did This Yesterday? (Percent)
Domains of Research Activity                            2000    2004    Change    2000    2004    Change
Do an Internet search to find the answer to a           79      84      5         17      31      14
    specific question you have
Fun & Culture
  Check sports scores and information                   38      42      4         10      11      1
  Check weather reports and forecasts                   62      78      16        17      22      5
  Get information about travel, such as checking        66      73      7         7       8       1
    airline ticket prices or hotel rates
  Look for information about a hobby or interest        77      77      0         19      19      0
  Look for information about movies, books or           63      ..      ..        9       ..      ..
    other leisure activities
Life & Health
  Gone on the Internet to look for information          ..      32      ..        ..      ..      ..
    about prescription drugs?(a)
  Look for health or medical information                56      ..      ..        6       ..      ..
  Look for information about a job                      38      38      0         5       4       -1
  Look for information about a place to live            27      32      5         2       3       1
  Look for religious or spiritual information           22      30      8         3       3       0
Utility for Information and Transactions
  Look for information about a product or service       73      79      6         13      16      3
    you are thinking about buying
  Get financial information such as stock quotes        45      44      -1        14      9       -5
    or mortgage interest rates
  Do research for school or training                    54      58      4         10      11      1
  Look up a phone number or address online              ..      7       ..        ..      ..      ..
  Not including email, do any type of work or           51      51      0         17      19      2
    research online for your job
  Search for a map or driving directions                ..      7       ..        ..      ..      ..
Politics & News
  Get news online                                       61      70      9         22      29      7
  Go online for news or updates about the Olympics      18      ..      ..        8       ..      ..
  Look for information from a local, state, or          50      57      7         7       11      4
    federal government website
  Look for news or information about politics and       40      51      11        13      15      2
    the campaign
  Searched online for particular news stories,          ..      23      ..        ..      ..      ..
    photographs, or videos that media outlets
    decided NOT to cover?(a)
Percent doing 3 or more of the research activities(b)   97      64      -33       28      12      -16
Percent doing 5 or more of the research activities(b)   83      22      -61       13      2       -11
Unweighted N                                            13,919  4,631   ..        13,919  4,631   ..

Notes: (a) Respondents who had ever done these activities, not just those who had done this yesterday. (b) In 2000, respondents were asked about 17 different informational activities, and in 2004 respondents were asked about 18 different informational activities.

population who were online the day before being surveyed, the largest change is in the portion of people who did narrow searches on specific questions and who searched for news. In 2000, only 17% of online adults were searching for specific answers to questions on a daily basis, but by 2004, 31% of the adult online population was doing this on a daily basis. Some kinds of topics came in and out of cultural relevance, so some lines of research were only explored in one of the two years analyzed here. Table 2 reveals the diversity of topics and the diversity of means people have for retrieving information from the Internet. As large numbers of people came online between 2000 and 2004, the level of popularity of certain topics and activities changed. The proportion of people getting
news, weather, and government information increased, while the proportion of people looking online for jobs or financial information diminished. Overall, Table 2 tells a consistent story. Even though the Internet was “massified” by large numbers of adults who gained Internet access in recent years, the proportion of people doing focused research on specific topics increased. However, the portion of people doing searches across multiple domains declined. Over time, the percentage looking for information in any three or five domains dropped, even among the daily Internet users. Given that many of the survey questions ask about looking for and getting information online, this suggests that for many Internet users, learning to use the Internet is a process of transitioning from casual “looking” for information to more focused searching for the answer to a “specific question.” While the proportion of people searching for answers to specific queries has grown, the proportion of people who report doing five research activities, or even three, has dropped. In 2000, 28% of daily users did three or more of the research activities in Table 2, but by 2004, only 12% did three or more of these activities. Whereas Table 2 reveals some important trends on how people search for types of information, Table 3 reveals how different people approach searching online for answers to specific questions. By 2004, 37% of online males and 25% of online females were doing this sort of daily, directed query. People with at least a college degree did this kind of searching more than those without. People in the higher income bracket, with a college degree or more, report doing more searches, especially on a daily basis. In terms of race and ethnicity, the passing of four years saw interesting changes in search trends.
Moreover, the number of people who had “ever” done research on these topics rose, such that by 2004 the vast majority of the population, whether male or female, equipped with a college degree or not, or earning above or below $50,000 a year, had searched online for answers to specific questions. Yet, a strong difference remained between the levels at which people had “ever” done these things and the proportion of people who did those things on a daily basis. For those doing specific searches on a daily basis, gender, education, and income were still important considerations. Internet experience is one of the more revealing categories in Table 3. In the samples of people collected in 2000 and 2004, there seems to be a relationship between the amount of experience respondents had online and their use of the Internet for specific research questions. Beginners are defined as having less than a year of Internet experience, novice users are defined as having between one and three years of Internet experience, and the category of experienced users is defined as having more than three years of experience. To learn more about how experience with the Internet may have an impact on search sophistication and satisfaction, we looked deeper into the cohort of respondents in the 2004 survey. Table 4 reveals interesting differences between new users and users who are more experienced. Even though Tables 1 and 2 reveal a wide range of topics and means of doing research online, more experienced Internet users were more likely to use

Table 3 U.S. adults who have Internet access and search online for answers to specific questions, 2000 and 2004

                                                  Did This Ever? (Percent)  Did This Yesterday? (Percent)
Type of Internet User                             2000    2004    Change    2000    2004    Change
Gender
  Male                                            79      88      9         19      37      18
  Female                                          80      80      0         15      25      10
Education
  No college degree                               75      81      6         14      23      9
  College degree or more                          86      88      2         21      41      20
Income
  Less than $50,000                               77      80      3         15      24      9
  $50,000 or more                                 84      89      5         20      37      17
Hispanic                                          73      71      -2        15      28      13
Race
  White                                           80      84      4         17      32      15
  African American                                74      78      4         14      16      2
  Asian American(a)                               74      88      14        19      33      15
  Other                                           78      81      3         25      29      4
Internet experience
  Beginner, less than 1 year of Internet use      63      52      -11       8       4       -4
  Novice, 1-3 years of Internet use               78      72      -6        13      13      0
  Experienced, more than 3 years of Internet use  87      87      0         25      35      10
Weighted N                                        9,552   3,994             9,552   3,994

Note: (a) In 2004, this category included Pacific Islanders.
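
The experience categories in Table 3 are defined in the text by years of Internet access: beginners have less than one year, novices between one and three years, and experienced users more than three years. A minimal sketch of that recoding follows. The function name and the numeric input field are hypothetical, and the handling of boundary values (exactly one or exactly three years counted as novice) is an assumption, since the article does not say how such cases were coded.

```python
from typing import Optional


def experience_category(years_online: Optional[float]) -> Optional[str]:
    """Map years of Internet access onto the three categories of Table 3.

    Hypothetical helper: the name and the boundary handling (exactly 1 or
    exactly 3 years counted as "novice") are assumptions, not the authors'
    documented coding rules.
    """
    if years_online is None:  # e.g. a "do not know" answer
        return None
    if years_online < 0:
        raise ValueError("years_online must be non-negative")
    if years_online < 1:
        return "beginner"
    if years_online <= 3:
        return "novice"
    return "experienced"


# Illustrative values only; these are not survey records.
sample_years = [0.5, 1, 2.5, 3, 6.58, None]
categories = [experience_category(y) for y in sample_years]
print(categories)
```

Recoding the 2004 "number of years" answers this way is one plausible reading of how they could be made comparable with the three answer options offered in 2000.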

search engines on a daily basis to answer specific questions. Internet users who have been online for more than three years will use search engines at least once a day, and when they do, they are more likely to use more than one search engine. In fact, search engines are a key component of the Internet for the cohort of experienced users: 33% report that they could not live without search engines, 88% report that they find what they look for, and 50% report that they are very confident in their findings. Beginners and novice users report lower values for these three things. In 2004, respondents were asked how much they would miss using search engines. Predictably, more experienced users felt they could not live without them, and less experienced users felt they would not really miss using search engines. Yet half of the most experienced users reported that they could go back to other ways of finding information online, a higher proportion than the new users. In part, this may be explained by the self-confidence experienced users have; even if they cannot access a search engine, they may think their research skills are good enough to find the information they need by other means. Experienced users are more satisfied that they are finding the results they want and express more confidence in their search skills.2 However, Table 4 also reveals that search sophistication involves more than being able to use complex search terms while conducting directed research. Being

Table 4 Satisfaction and sophistication among Internet users who have used a search engine, by years of experience online, 2004

                                                          Less than 1    1-3 Years of   More Than 3
                                                          Year of        Experience     Years of
                                                          Experience     (Percent)      Experience
                                                          (Percent)                     (Percent)
How often do you use search engines to find information online?
  several times a day, or once a day                      16             19             38
How many different search engines do you use on a regular basis?
  mean                                                    1.23           1.51           1.55
How much do you, personally, rely on Internet search engines as a way of finding information?
  couldn’t live without Internet search engines           15             18             34
  could go back to other ways of finding information      38             51             51
  wouldn’t really miss Internet search engines            38             29             14
When you use a search engine to look for information online, how often do you actually find the information you’re looking for?
  always or most of the time                              69             84             88
How confident do you feel about your searching abilities when using a search engine to find information online?
  very confident                                          23             27             50
In general, do you think Internet search engines are a fair and unbiased source of information?
  yes                                                     77             75             67
Have you ever used a search engine that provided customers [with sponsored or unpaid] results?
  yes                                                     25             51             69
  no                                                      25             49             26
  don’t know, or refuse to answer                         50             0              5
Can you always tell the difference between the PAID results and UNPAID results you get from a search engine, or are you not always able to tell?
  can always tell                                         25             27             48
  not always able to tell, don’t know, or refuse answer   75             73             52
In general, do you APPROVE or DISAPPROVE of search engines keeping track of how each customer uses their search engine and what they search for?
  approve                                                 23             39             36
If you learned that a search engine was accepting fees from websites, and was listing those websites without making it clear that they were paid or sponsored, would you stop using that search engine or continue to use it?
  yes, would stop using it                                54             44             45
Some search engines keep track of how each customer uses their search engine and what they search for. Search engines say this helps them provide customers with better search results. Have you heard or read about this?
  yes                                                     23             30             46
If you learned that a search engine was keeping track of YOUR searches, would you stop using that search engine, or would you continue to use it?
  would stop using it                                     80             69             66
  would continue to use it                                0              19             26
  don’t know, or refuse to answer                         20             12             8
Weighted N                                                13             171            964

Note: These survey data were collected between May and June 2004. For ease of analysis, some response categories are collapsed as indicated.


a sophisticated searcher must also mean being aware of some of the political aspects of search engine design. The most experienced Internet users express slight skepticism about whether search engines are returning unbiased information. They are more aware of the issue of paid and sponsored results creeping into their search findings. The stark evidence of this is that 48% of experienced users are aware that they use a search engine that returns sponsored results, while only 25% of beginners can identify whether they have been given sponsored results. In addition, experienced users are more likely to think they can identify sponsored search results. Surprisingly, these more experienced users also seem relatively nonchalant about sponsored search results. It may be that more experienced users are resigned to being tracked and shown sponsored results, and they perceive these as minor inconveniences in an otherwise useful and free service. Tables 1 and 2 revealed that even though many of the new users who came online between 2000 and 2004 had lower levels of income and education, they were conducting focused searches to find answers to specific questions, but they were doing fewer types of research activities. The third table revealed that whether they were asked about their habits in 2000 or 2004, the vast majority of respondents with more than three years of Internet experience had done this kind of search. The fourth table revealed that Internet experience is associated not only with a high frequency of search engine use, but also with some awareness of the political and informational realities of search engines, an understanding of potential differences between paid and unpaid results, some awareness of how search engines track users, and acknowledgment of bias in search results. 
With this more nuanced picture of what makes a sophisticated searcher online, the next step is to model the impact of Internet experience, controlling for education, income, and other factors, on becoming a better searcher.

Modeling Daily Specific and Active Search Online

To set into sharp relief the relative contributions of income, education, Internet experience, and other demographic factors to search activity, we developed several statistical models. We worked with the 2004 dataset because it contains a more complete range of possible explanatory variables. We created two kinds of dependent variables. First, we made a dummy variable based on the question, “Did you do an Internet search to find the answer to a specific question you have?” Descriptive statistics for this variable appear in Table 3. Respondents could make two types of positive response, saying either that they had done this activity sometime in the past or that they had done this activity yesterday. To be conservative, we work with only the sub-sample of people who did this activity yesterday, so that we can explain variation in the pool of people who do a specific search on a daily basis, or do any of the various search activities identified in Table 2. While we certainly would have more respondents who had “ever” searched online, this smaller population probably had clearer memories of their most recent search experience. Second, we created a new categorical index based on the number of search topics searched in a day, with the highest category being five topics or more. Table 1 reveals that even though there is a wide range of topics that people research online, very few people searched in more than five of these areas on a daily basis. Thus, we compressed the range so that the maximum category of this variable is actually five or more topics searched in a day.3

It is rare to find a good proxy for search sophistication in survey data. Measuring search sophistication can be easier in quasi-experimental research where the subjects are closely supervised and given a range of tasks to complete in a timely manner. However, we argue that sophisticated Internet researchers, or those who use the Internet to do research in multiple domains of life, are more likely to treat the Internet as a tool for finding answers to very specific questions. Our assumption is that most beginners will find information online by browsing, tentatively following links, and honing their ability to distinguish between paid and unpaid results. Using a search engine to drill to specific information is a more advanced skill, so we use this as a measure of search sophistication.

Table 5 presents the results of several ways of predicting whether people reported doing an online search for a specific answer to a question. To help explain the relationship between years of experience online and search sophistication, we modeled key variables for predicting such habits. Although it is common to report the coefficients from the logistic regression of independent variables onto a dependent variable, the exponentiated coefficients are the more intuitive odds ratios.
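The step from a logistic coefficient to an odds ratio, and from an odds ratio to a percentage change in the odds, can be sketched in a few lines. This is a minimal illustration rather than the authors' own code: the function names are hypothetical, and the two sample odds ratios (2.547 for a college degree, 0.549 for female) are taken from the Model 1 column of Table 5.

```python
import math


def odds_ratio(log_odds_coefficient: float) -> float:
    """Exponentiate a logistic regression coefficient (B) to get an odds ratio."""
    return math.exp(log_odds_coefficient)


def percent_change_in_odds(ratio: float) -> float:
    """Percentage change in the odds implied by an odds ratio."""
    return (ratio - 1.0) * 100.0


# Illustrative input: the coefficient whose exponent is the odds ratio 2.547.
b_college = math.log(2.547)
or_college = odds_ratio(b_college)
pct_college = percent_change_in_odds(or_college)

# An odds ratio below 1 corresponds to a decrease in the odds.
pct_female = percent_change_in_odds(0.549)

print(f"college degree: odds ratio {or_college:.3f}, change in odds {pct_college:+.1f}%")
print(f"female: odds ratio 0.549, change in odds {pct_female:+.1f}%")
```

An odds ratio above 1 raises the odds and one below 1 lowers them, which is why the table entries can be read directly as multipliers.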
The odds ratio expresses how much the odds of a positive response to a question change with one variable, controlling for all the other factors in a model. For example, in Model 1, the odds that a respondent with a bachelor’s degree did a daily online search are 154.7% greater ((2.547 − 1) × 100) than the odds for a respondent with less than a bachelor’s degree. Using Model 1, it is possible to predict the odds that a particular respondent did a specific search. The odds that a 30-year-old woman without a bachelor’s degree reported doing a specific search online, if she earned less than $50,000 a year and was African American and not Hispanic, are almost 6 to 1. In contrast, if this woman did have a bachelor’s degree, the odds that she searched online are much better, over 15 to 1.4

To help isolate the effect of experience online on search sophistication, we controlled for demographic variables such as age, gender, and race, in addition to socioeconomic variables such as income and education. Of the statistically significant effects in Model 1, being older, female, not having a college degree, earning less than $50,000 a year, or self-identifying as African American decreases the odds that respondents did daily searches. Controlling for other effects, respondents who self-identify as African American, Asian American, Hispanic, or other race seem less likely than whites to have done daily searches, although not all of these effects are statistically significant. Controlling for other variables, the largest single effects are education and income. Having a college degree and a household income of more than $50,000 per year increases the odds that a respondent reported doing specific, focused searches on a daily basis.

Table 5 Logistic regression explaining the likelihood of having daily searched online for specific answers to a question

Model 1: Demography, Education, and Income on Specific Search. Model 2: Demography, Education, Income, and Online Experience on Specific Search. Entries are Log Odds (Standard Error).

Variable                                    Model 1              Model 2
Constant (B instead of Beta)                0.852 (0.223)        0.115 (0.526) **
Age                                         0.978 (0.004) **     0.979 (0.005) **
Female                                      0.549 (0.124) **     0.637 (0.141) **
Education, college degree or more           2.547 (0.128) **     2.060 (0.147) **
Income, $50,000 or more                     1.192 (0.090) *      0.984 (0.100)
Hispanic                                    0.868 (0.275)        1.216 (0.328)
Race (white as reference)
  African American                          0.442 (0.290) **     0.461 (0.324) *
  Asian American                            0.594 (0.458)        0.433 (0.481)
  Other                                     0.877 (0.345)        0.909 (0.389)
Years online                                ..                   1.044 (0.019) *
Frequency of use (every few weeks or less often as reference)
  1–5 days a week                           ..                   2.494 (0.500)
  At least once a day                       ..                   14.077 (0.472) **
Search engines, more than one               ..                   0.943 (0.141)
Nagelkerke R2                               0.118                0.261
Unweighted N                                1,373                1,112

Notes: These survey data were collected between May and June 2004. ** Significant at 0.01, * significant at 0.05.

In Model 2, we add further explanatory variables: the number of years that a person has had access to the Internet, frequency of use, and the number of search engines used. The number of years online and being a daily Internet user proved to be statistically significant effects, and the variation explained in the model improved. Of the statistically significant effects, being older, female, or African American decreases the odds that a respondent did a daily search. Interestingly, the addition of variables for years of online experience and frequency of use may have blunted the negative effect of being older or female. Controlling for other variables, having a college degree is still a large, positive predictor of daily search habits. The number of years online has a statistically significant impact, but it appears to be a smaller effect than education, gender, or race. However, the impact of this effect rises quickly for respondents who have multiple years of online experience. For respondents with one or two years of Internet experience, the effect of having a college degree will be larger than the effect of their Internet experience. For respondents with more years of Internet access, however, the effect of Internet experience quickly outpaces the effects of education.

For example, we can compare several 30-year-old female Internet users who all earn less than $50,000 a year and self-identify as African American but not Hispanic. The first respondent has been online for a year but has not gone to college, and according to Model 2, the odds that she did an online search yesterday are 1 to 1. The second respondent has been online for a year and is a college graduate, so the odds that she did an online search yesterday are 2 to 1. The third respondent is not a college graduate, but she has had Internet access for 4 years, and the odds that she did an online search yesterday are 4 to 1. In comparison, the effect of having several years of Internet access quickly becomes greater than the effect of having a college degree. In terms of predicting whether someone from this sample reported doing a specific search, one or two years of Internet access are comparable to having a college degree or high income. Beyond this, the effect of having multiple years of Internet access is larger than the education effects found in this sample.

Our final step in the analysis is to look for predictors of how often people search. We composed an index of many different topics for research and found that on a daily basis, even the most active online researchers rarely complete more than five research tasks. We created a five-point index of search activity for the population who reported being online yesterday. Table 6 presents our OLS regression models predicting how much research the respondents had done.
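Returning to the comparison of three respondents under Model 2, the rounded odds quoted in the text (1 to 1, 2 to 1, and 4 to 1) can be reproduced from the Model 2 column of Table 5. The sketch below is a hedged reconstruction, not the authors' code. It follows the arithmetic as it appears in Note 4, where each odds ratio is multiplied by the value of its variable and zero-valued terms drop out. It also assumes that the income, Hispanic, other-race, frequency-of-use, and multiple-search-engine terms are zero for all three respondents. The function and dictionary names are hypothetical.

```python
# Odds ratios for the terms that are non-zero in the three profiles,
# copied from the Model 2 column of Table 5.
MODEL_2 = {
    "age": 0.979,
    "female": 0.637,
    "college_degree": 2.060,
    "african_american": 0.461,
    "years_online": 1.044,
}
MODEL_2_CONSTANT = 0.115


def odds_as_printed(profile):
    """Multiply the constant by (odds ratio * variable value) for every
    non-zero variable, mirroring the arithmetic shown in Note 4."""
    odds = MODEL_2_CONSTANT
    for name, value in profile.items():
        if value != 0:  # zero-valued terms drop out of the product
            odds *= MODEL_2[name] * value
    return odds


# Shared characteristics: 30 years old, female, African American.
base = {"age": 30, "female": 1, "african_american": 1}

first = odds_as_printed({**base, "college_degree": 0, "years_online": 1})
second = odds_as_printed({**base, "college_degree": 1, "years_online": 1})
third = odds_as_printed({**base, "college_degree": 0, "years_online": 4})

for label, value in [("first", first), ("second", second), ("third", third)]:
    print(f"{label} respondent: about {value:.2f} to 1")
```

Under the more usual reading of a logistic model, each odds ratio would be raised to the power of its variable rather than multiplied by it, and the resulting odds would differ from the figures quoted in the text. The sketch therefore only shows how the quoted figures can be obtained.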
The beta coefficients represent the results of predicting the number of research activities an Internet user did online in 2004. Respondents were queried about 18 different research activities, broken into domains of fun and culture, life and health, utility and transactional searches, and news and politics. To find out what factors predict search activity across multiple domains, our search activity index indicates whether a person in the sample did one, two, three, four, or five or more of the 18 activities described in Table 2. In Model 3, being more educated or having more income are the strong positive predictors of whether an Internet user is also an active searcher.5 Consistent with our previous findings, the statistically significant negative effects include being older, female, and African American. The strong, statistically significant positive predictors of how many search activities are done by someone in the sample include education and income. The total amount of explained variation in Model 3 is fairly small, and in Model 4 we add the variables on search experience. Adding Internet experience as a predictor, which we do in Model 4, seems to decrease the education and income effects. The education effect, however, remains a strong predictor of the diversity of topics researched on a daily basis online. Age, gender, and race are still statistically significant negative effects. The effect of high income is still positive, but it loses its statistical significance in Model 4. In this model, several of the Internet experience variables are statistically significant.

Table 6 OLS regression on daily search activity index

Model 3: Demography, Education, and Income on Search Activity Index. Model 4: Demography, Education, Income, and Online Experience on Search Activity Index. Entries are Beta (Standard Errors).

Variable                                    Model 3              Model 4
Constant (B instead of Beta)                0.868 (0.067) **     0.192 (0.199)
Age                                         -0.086 (0.001) **    -0.093 (0.003) **
Female                                      -0.109 (0.036) **    -0.111 (0.079) **
Education, college degree or more           0.215 (0.038) **     0.154 (0.084) **
Income, $50,000 or more                     0.080 (0.026) **     0.011 (0.057)
Hispanic                                    0.000 (0.079)        0.067 (0.184) *
Race (white as reference)
  African American                          -0.068 (0.068) **    -0.053 (0.157) *
  Asian American                            0.010 (0.123)        -0.014 (0.288)
  Other                                     -0.009 (0.102)       -0.042 (0.222)
Years online                                ..                   0.107 (0.011) **
Frequency of use (every few weeks or less often as reference)
  1–5 days a week                           ..                   0.056 (0.160)
  At least once a day                       ..                   0.379 (0.148) **
Search engines, more than one               ..                   0.068 (0.079) *
Nagelkerke R2                               0.085                0.224
N                                           1,793                1,793

Notes: These survey data were collected between May and June 2004. ** Significant at 0.01, * significant at 0.05.

The OLS regression distinguishes the relative contribution of demographic and Internet experience effects to predicting how many different kinds of searches a respondent did the day before being surveyed. The more years of online experience a respondent has, the more variety in topical research the respondent was doing online, and relative to those who used the Internet every few weeks, those who used the Internet at least once a day were much more likely to be active searchers. In fact, the effect of being a high-frequency user is greater than the effect of education. Additionally, Internet users who reported using more than one search engine as a daily habit are more likely to research several topics online.

Conclusion

We find that under certain conditions, controlling for demographic and socioeconomic variables, hands-on experience with the Internet has a major impact on a user’s search sophistication and satisfaction and can make up for deficits in socioeconomic status. Just as important, the analysis helps us understand what it means to learn to become a sophisticated searcher relative to the whole population of Internet users. We defined a sophisticated searcher as someone who uses the Internet to search for information on specific questions and in multiple domains of life, uses multiple search engines several times a day, finds what they want, and is confident in their results. Early Internet adopters were highly educated and were likely to use the Web for research, but our study shows that a large proportion of even the new Internet users in 2000 and 2004 were doing focused searches online. Moreover, a few years of Internet experience seem to make people more sophisticated searchers, in terms of the likelihood that they will direct their queries to search engines, and in their awareness of some of the politics and biases of search engines. In explaining which daily Internet users were more likely to search for an answer to a specific question online, we found that the effect of having multiple years of Internet access can be larger than the effect of having a college degree. Controlling for several factors, education and years of online experience had comparable effects in our model explaining which daily Internet users were likely to do a wide range of research activities online.

These findings are significant because they suggest that people can develop sophisticated search skills through their general experience online, even if they do not have high levels of education and income. However, income and education are important contextual factors that explain what people will get out of their Internet use. In addition, socioeconomic status provides advantages for some people but barriers for other people. We found that Internet experience alone could overcome some of these barriers. This suggests that, over time, individuals may be able to acquire the search skills they need to complete various tasks online, even if they do not fall into more privileged demographic categories.
Furthermore, these findings suggest that to a measurable degree, search sophistication can come from Internet use, thus lending further support to those who advocate for the end of the digital divide in terms of technology access. The digital divide may also be about education and literacy, but in a concrete and fundamental way, it is about access alone. While time online may not solve all users’ difficulties accessing information, it may encourage a willingness to experiment with different search strategies and an attention to the political/social limitations of search engines. This, in turn, may create a new class of users who are technically sophisticated searchers, having overcome significant barriers to their entry into the digital world.

Acknowledgment

For her advice on earlier drafts, we are grateful to Gina Neff.

Notes

1 Non-response in telephone interviews produces some known biases in survey-derived estimates, because participation tends to vary for different subgroups of the population, and these subgroups are likely to vary also on questions of substantive interest (Witte & Howard, 2002). Pew computes demographic weighting parameters from a special analysis of the most recently available Census Bureau’s Current Population Survey at the time of each survey. This analysis produces population parameters for the demographic characteristics of American adults age 18 or older who are living in households that contain a telephone landline. These parameters are then compared with the sample characteristics to construct sample weights. The weights are derived using an iterative technique that simultaneously balances the distribution of all weighting parameters. In each table, we specify whether or not we used the sample weights. In addition to sampling error, question wording and practical difficulties in conducting telephone surveys may introduce some error or bias into the findings of opinion polls.

2 Of the survey questions analyzed here, these responses are most likely to suffer from some social desirability effects. For example, many respondents would have wanted to sound confident in their own search skills. Nonetheless, there are plausible differences between the levels of confidence that new users and experienced users have. Moreover, experienced users may over-report familiarity with multiple search engines, the complexity of their searches, or the range of domains in which they actively search for information.

3 The interviewers were instructed to rotate through the list of options for the respondents.

4 In the first example, the odds equal 0.852 * 0.978(Age) * 0.549(Female) * 2.547(College Degree or More) * 1.192($50,000 or More) * 0.868(Hispanic) * 0.442(African American) * 0.594(Asian American) * 0.877(Other), and with e(0) = 1, the odds equal 0.852 * 0.978(30) * 0.549(1) * 2.547(0) * 1.192(0) * 0.868(0) * 0.442(1) * 0.594(0) * 0.877(0) = 6.066. In the second example, the difference is that the respondent has a college degree, so the odds equal 0.852 * 0.978(30) * 0.549(1) * 2.547(1) * 1.192(0) * 0.868(0) * 0.442(1) * 0.594(0) * 0.877(0) = 15.450.

5 Even though the explained variation in Models 1 through 4 is under 30%, these models are significantly better predictors of reported behavior than the baseline odds of the constant alone.

6 Education is a seven-point index based on the question, “What is the last grade or class you completed in school?” The response options, from least to greatest, included: none, or grades 1–8; high school incomplete (grades 9–11); high school graduate (grade 12 or GED certificate); technical, trade, or vocational school AFTER high school; some college, no 4-year degree (includes associate degree); college graduate (B.S., B.A., or other 4-year degree); and finally post-graduate training/professional school after college (toward a Masters/Ph.D., Law, or Medical school). Income is an 8-point index based on the question, “Last year, that is in 2003, what was your total family income from all sources, before taxes?” The response options, from least to greatest, included: less than $10,000; $10,000 to under $20,000; $20,000 to under $30,000; $30,000 to under $40,000; $40,000 to under $50,000; $50,000 to under $75,000; $75,000 to under $100,000; and finally $100,000 or more. These gradations were tested in modeling, but were found to be statistically insignificant. For the sake of parsimony, income and education were ultimately collapsed to binary variables.
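The arithmetic in Note 4 can be checked with a short script. The sketch below is an illustration under stated assumptions and not the authors' code: the dictionary keys and function name are hypothetical, and the odds ratios are those in the Model 1 column of Table 5. The printed totals (6.066 and 15.450) are matched when the age odds ratio is multiplied by 30. Raising it to the 30th power, which is the usual reading of a logistic model, gives a much smaller figure, so the function supports both conventions for comparison.

```python
# Odds ratios from the Model 1 column of Table 5.
MODEL_1 = {
    "age": 0.978,
    "female": 0.549,
    "college_degree": 2.547,
    "income_50k_plus": 1.192,
    "hispanic": 0.868,
    "african_american": 0.442,
    "asian_american": 0.594,
    "other_race": 0.877,
}
MODEL_1_CONSTANT = 0.852


def predicted_odds(profile, exponentiate):
    """Combine the constant with each non-zero term.

    exponentiate=False: multiply the odds ratio by the variable's value,
                        which matches the totals printed in Note 4.
    exponentiate=True:  raise the odds ratio to the variable's value,
                        the usual convention for a logistic model.
    """
    odds = MODEL_1_CONSTANT
    for name, ratio in MODEL_1.items():
        value = profile.get(name, 0)
        if value == 0:
            continue  # a zero-valued term contributes a factor of 1
        odds *= ratio ** value if exponentiate else ratio * value
    return odds


no_degree = {"age": 30, "female": 1, "african_american": 1}
with_degree = {**no_degree, "college_degree": 1}

for label, profile in [("no degree", no_degree), ("with degree", with_degree)]:
    printed = predicted_odds(profile, exponentiate=False)
    powered = predicted_odds(profile, exponentiate=True)
    print(f"{label}: as printed {printed:.3f}, exponentiated {powered:.3f}")
```

For the binary variables the two conventions coincide, so the difference between them comes entirely from how the age term is handled.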

References

Carmel, E. (1997). American hegemony in packaged software trade and the ‘culture of software.’ The Information Society, 13(1), 125–142.
Dennis, S., Bruza, P., & McArthur, R. (2002). Web searching: A process-oriented experimental study of three interactive search paradigms. Journal of the American Society for Information Science and Technology, 53(2), 120–133.
Elmer, G. (2004). Profiling Machines: Mapping the Personal Information Economy. Cambridge, MA: MIT Press.
Gerhart, S. (2004). Do Web search engines suppress controversy? First Monday, 9(1). Retrieved February 24, 2007 from http://www.firstmonday.org/issues/issue9_1/gerhart/index.html
Hargittai, E. (2004a). Classifying and coding online actions. Social Science Computer Review, 22(2), 210–227.
Hargittai, E. (2004b). Informed Web surfing: The social context of user sophistication. In P. N. Howard & S. Jones (Eds.), Society Online: The Internet in Context (pp. 257–274). Thousand Oaks, CA: Sage Press.
Hellsten, I., Leydesdorff, L., & Wouters, P. (2006). Multiple presents: How search engines rewrite the past. New Media & Society, 8(6), 901–924.
Hindman, M., Tsioutsiouliklisz, K., & Johnson, J. A. (2003, August). “Googlearchy”: How a few heavily-linked sites dominate politics on the Web. Paper presented at the Annual Meeting of the Midwest Political Science Association, Philadelphia.
Hölscher, C., & Strube, G. (2000). Web search behavior of Internet experts and newbies. Computer Networks, 33(1–6), 337–346.
Howard, P. N. (2004). Embedded media: Who we know, what we know, and the context of life online. In P. N. Howard & S. Jones (Eds.), Society Online: The Internet in Context (pp. 1–27). Thousand Oaks, CA: Sage.
Howard, P. N. (2006). New Media Campaigns and the Managed Citizen. New York: Cambridge University Press.
Howard, P. N., Rainie, L. H., & Jones, S. (2001). Days and nights on the Internet: The impact of a diffusing technology. American Behavioral Scientist, 45(3), 383–404.
Introna, L., & Nissenbum, H. (2000). Shaping the Web: Why the politics of search engines matters. The Information Society, 16(3), 1–17.
Katz, J. E., & Rice, R. E. (2002). Social Consequences of Internet Use: Access, Involvement, and Interaction. Cambridge, MA: MIT Press.
Pew Internet and American Life Project. (2000). Daily Tracking Survey—November 2000. Washington, DC.
Pew Internet and American Life Project. (2002). Daily Tracking Survey—November 2002. Washington, DC.
Pew Internet and American Life Project. (2004). Daily Tracking Survey—November 2004. Washington, DC.
Pew Internet and American Life Project. (2006). Daily Tracking Survey—November 2006. Washington, DC.
Sandvig, C. (2001). Unexpected outcomes in digital divide policy: What children really do in the public library? In M. Compaine & S. Greenstein (Eds.), Communications Policy in Transition: The Internet and Beyond (pp. 265–293). Cambridge, MA: MIT Press.

Schapire, R. E., Freund, Y., Bartlett, P., & Lee, W. S. (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics, 26(5), 1651–1686.
Shah, D., Kwak, N., & Holbert, L. (2001). “Connecting” and “disconnecting” with civic life: Patterns of Internet use and the production of social capital. Political Communication, 18(2), 141–162.
Shaw, R., & Sandvig, C. (2005, September). Software defaults as de facto regulation: The case of wireless APs. Paper presented at the 33rd Research Conference on Communication, Information and Internet Policy, Arlington, VA.
Slone, D. J. (2002). The influence of mental models and goals on search patterns during Web interaction. Journal of the American Society for Information Science and Technology, 53(13), 1152–1169.
Tewksbury, D., & Althaus, S. (2000). Differences in knowledge acquisition among readers of the paper and online versions of a national newspaper. Journalism and Mass Communication Quarterly, 77(3), 457–479.
Witte, J., & Howard, P. N. (2002). The future of polling: Relational inference and the development of Internet survey instruments. In J. Manza, F. L. Cook, & B. I. Page (Eds.), Navigating Public Opinion: Polls, Policy and the Future of American Democracy (pp. 272–289). New York: Oxford University Press.

About the Authors

Philip N. Howard is an assistant professor in the Communication Department at the University of Washington. His book New Media Campaigns and the Managed Citizen (Cambridge University Press, 2006) is about the role of information technology in campaign strategy and political culture. Currently, he directs the World Information Access Project (http://www.wiareport.org).
Address: Department of Communication, 141 Communications Building, Box 353740, University of Washington, Seattle, Washington, 98195-3740, USA

Adrienne Massanari is a doctoral candidate in the Department of Communication at the University of Washington. Her research interests include the discourse and practice of information architecture and user-centered design fields.
Address: Department of Communication, University of Washington, Box 353740, Seattle, WA 98195-3740, USA
