IDC 2012

SHORT PAPERS

12th-15th June, Bremen, Germany

Children's Web Search With Google: The Effectiveness of Natural Language Queries

Yvonne Kammerer, Maja Bohnacker
Knowledge Media Research Center
Schleichstr. 6, 72076 Tübingen, Germany

{y.kammerer, m.bohnacker}@iwm-kmrc.de

ABSTRACT

In this paper, we present work in progress on how elementary school children use modern search engines to solve informational search tasks. Specifically, in a laboratory study with 21 children aged 8-10 we investigated whether the use of natural-language queries leads to more successful search outcomes than keyword queries when searching the Internet with Google. Both quantitative and qualitative data are reported that indicate the advantages of natural-language queries. In addition, based on our observations, we present a query-reformulation tool for a search engine interface for children that we are currently developing.

Categories and Subject Descriptors

H.3.3 [Information Storage and Retrieval]: Information search and retrieval – Query formulation, Search process, Selection process; H.5.2 [Information interfaces and presentation]: User Interfaces – Graphical user interfaces, Natural language, Screen design, User-centered design

General Terms

Performance, Human Factors.

Keywords

Children, World Wide Web, information-seeking behavior, search engine, query formulation, typing, search results.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. IDC 2012, June 12–15, 2012, Bremen, Germany. Copyright 2012 ACM 978-1-4503-1007-9...$10.00.

1. INTRODUCTION

In recent years the World Wide Web (hereinafter referred to as the Web) has evolved into a major information source not only for adults but also for children, offering enormous amounts of information of varying quality. According to the latest report of the KIM study, a German large-scale study on children's media habits, 57% of German children aged 6-13 years used the Internet in 2010, with information seeking via search engines being the most popular activity [1]. Children increasingly use the Web to search for information for homework assignments, followed by information about celebrities, computer games, news, or pets. They usually express high confidence in their ability to search the Web [1, 11], which, however, often does not correlate with their objective search success. Previous studies have shown that children face various problems when using search engines such as Google. They usually have difficulties with defining information problems and specifying search terms (e.g., [2, 5, 6]), as well as with adequately evaluating search results (e.g., [8, 9]) and understanding the content of Web pages (e.g., [4]). With regard to the formulation of search queries, a frequently addressed issue is children's use of natural-language or full-sentence queries instead of keyword searches [2, 5, 6, 9, 13].

However, results by [9] indicate that children still perform better with Google than with search interfaces that were specifically designed for children. In particular, owing to query expansion techniques, the use of natural-language queries is no longer a problem today for general Web search engines that are designed to search unconstrained information spaces. Kids' search interfaces, in contrast, are often based on editorially reviewed, rather small directories of children's Web sites and, thus, often do not implement the more advanced search methods within their local search functionality.

In the present paper, we investigate whether children's use of natural-language queries is nowadays not only no longer a problem but actually a more beneficial search strategy when searching vast information spaces with a general Web search engine. The paper reports work in progress on how elementary school children use the search engine Google to solve informational tasks and how the use of natural-language queries seems to benefit their search success. Finally, we introduce a prototype of a query-reformulation tool we designed based on our observations.

2. RELATED WORK

Previous research on children's Web search behavior has shown that children often experience difficulties when using Web search engines or Web catalogues. First, they have difficulties with formulating adequate search queries because of their limited vocabulary [14]. Their ability to name and associate abstract concepts is only beginning to emerge [7], and they barely have any understanding of the operating mechanisms of search engines [3, 14]. All of this causes them to enter ineffective search terms (e.g., too few or too many terms, wording either too general or too specific [2, 3, 5, 6, 14]). Besides, especially for younger children the use of a keyboard as a text entry device can be tedious [5]. Most elementary school students are not able to type and watch the screen at the same time. In fact, they spend a considerable amount of time searching for the letters on the keyboard and rarely look up. They also frequently make massive and unusual spelling errors that are hard for search engines to interpret (e.g., [5, 6]).

In addition, many children enter natural-language or full-sentence queries. In an early study, Marchionini [12] showed that especially 8-10 year olds, as compared to 11-12 year olds, frequently used natural language or phrases instead of keywords while searching an electronic encyclopedia, resulting in unsuccessful searches. Similarly, in a study by Bilal [2], 35% of seventh-grade children searching for factual information with the online catalogue Yahooligans started their search with a natural-language question. The online catalogue, however, produced empty result sets with this kind of query. In another study [13], as many as 63% of 32 fifth- and sixth-grade children used full-sentence queries on Web search engines. In line with that, more recent studies also showed that many elementary school students used natural language as opposed to keywords when searching with search engines [9], particularly when searching for more complex information [5].

In contrast to the early days of the Web, however, natural-language queries nowadays do not seem to cause any problems. Also, in contrast to earlier research, Jochmann-Mannak and colleagues [9] showed that most children gratefully and successfully made use of the spelling correction tool ‘Did you mean’ in Google. Furthermore, many children took notice and made use of the dynamic query suggestions that Google offers on a character-by-character basis during typing. They used these query suggestions as an online ‘spelling checker’ or as ‘type help’, so that they had to type only a few letters and then clicked on the query suggestion that appeared in the drop-down box. Based on the results by [9], we expected that natural-language queries in combination with the use of spelling suggestions and dynamic query suggestions might be a more beneficial search strategy for elementary school children than keyword queries.

Finally, once a search query has been entered, children can also have difficulties understanding the information presented to them on the results page and on the Web pages. Often they are overwhelmed or even frustrated by the huge number of results presented, seldom explore search results past the first search engine results page (SERP), and have problems with judging the relevance and trustworthiness of the results [8, 9, 10, 13]. We expected that this might be another advantage of natural-language queries, as they might yield fewer and more specific search results than single keywords.

Thus, in the study presented in this paper we examined the extent to which elementary school students used natural-language queries when using Google to conduct a series of informational search tasks. More importantly, we aimed at investigating whether this type of query would lead to more successful search outcomes than keyword queries.

3. METHOD

Between January and March 2012 we conducted a study to investigate how children search for information using Google. More precisely, the children were asked to conduct a set of informational search tasks of different complexity about buildings, food, and animals, and we analyzed which type of queries (natural-language vs. keyword queries) children entered to solve the tasks. In this paper we focus only on the third task, which consisted of two subtasks (one fact-based task with a yes/no answer, and one more complex research task requiring a more sophisticated explanation).

3.1 Participants

For our study we recruited children aged 8-10 through letters to parents distributed at elementary schools in a mid-size city in southwest Germany. 21 children participated in our study: 13 boys and 8 girls; 5 aged eight, 6 aged nine, and 10 aged ten. All participants had computers and an Internet connection at home.

3.2 Data collection and tasks

The study was conducted in a laboratory setting, with each child being tested individually. Parents were asked to wait outside during testing.

3.2.1 Survey data

A profile survey was used in which children were asked about demographic data, such as age, grade, and gender, as well as their self-rated Web search experience and skills. On 4-point scales they had to rate (a) how much they liked searching the Web for information, (b) how successful they usually are when searching the Web, and (c) how often they needed help during Web search. Furthermore, they were asked which search engines they knew, and whether they used search engines alone or with the help of others. In addition, the parents were asked to fill out another survey that consisted of a set of more specific questions about children's Web search experience and skills.

3.2.2 Task performance

To examine how children search and find information when using Google, we asked our participants to conduct a set of three search tasks. The first task was "Which is the highest building of the world? How high is it and where does it stand?" The second task was "What is civet coffee?" (the literal translation of the German term would be "cat coffee"). The third task, which we analyzed for the present paper, consisted of the following two subtasks: (a) "Kangaroos are marsupials. But do all kangaroos have pouches?" and (b), after the first subtask had been answered, "Ok, because the pouch is a place for keeping the young babies, only female kangaroos have them. But how does the baby kangaroo stay put inside of the pouch while the mummy jumps around all the time?" We borrowed this latter subtask from [9]. The two subtasks differ in their complexity. The first one can be defined as a fact-based task with a yes/no answer (No, male kangaroos do not have pouches), whereas the latter is a more complex research task requiring a more sophisticated explanation (Inside the pouch the baby grabs onto one of four teats and remains attached to it for several months).

During the search tasks, all browser activities (i.e., typing, clicking, scrolling, etc.) were recorded. Additionally, a Web cam was used to videotape the child in front of the computer and to record the spoken comments of both the child and the experimenter. Finally, the children’s eye movements on the screen during task performance were recorded using a 60 Hz remote eye-tracking system by SensoMotoric Instruments.

3.3 Procedure

The children were tested in individual sessions of approximately 45 minutes. First, the study procedure was explained to both the child and the parent, and both were asked for their consent for the child to participate in the study. Subsequently, the parent was asked to leave the laboratory and to fill out the questionnaire on the child's Web search experience and skills. Before the children started with the Web search tasks, they also filled out, together with the experimenter, a paper-and-pencil questionnaire about their Web search habits, experience, and skills as well as about their prior knowledge of the topics used in the search tasks.

Then, they started with the Google search. The experimenter presented each task verbally to the children, in order to prevent them from simply copying keywords from a written task instead of thinking about the formulation of the queries and the spelling of the words [cf. 9]. During task performance, the experimenter sat next to the children to reassure them if necessary. However, if the experimenter had to assist in finding the correct answer after several unsuccessful attempts by the child, the task was counted as unsolved. For each task we analyzed the queries typed into the search box, the time taken to complete the task, and whether the task was solved successfully without the help of the experimenter.


4. RESULTS


one. Then, after being confronted with the second subtask, she types "How do the baby kengaroos stay inside the pouch", again using the ‘Did you mean’ suggestion to correct the spelling error. She scans all search results of the first and second SERP and finally selects the last search result on the second SERP, which is a pdf. After some reading, she finds the correct answer in the pdf. Solving the second subtask took her 8.0 minutes. Note, however, that both subtasks also could have been solved by selecting the first search result that was provided by Google for the first query.

4.1 Survey data

According to the parents, the children, on average, have been using computers for 2.8 years (SD = 1.9) and the Internet for 1.8 years (SD = 1.9), but not on a daily basis (76% once a week or less). In line with the KIM study [1], parents reported that their children predominantly searched for information on the Web for school assignments (62%) and games (57%), followed by music (29%) and handicraft instructions (19%). The majority of the children were highly confident in their ability to search for information on the Web, with 86% indicating that they often or always found what they were looking for and 81% stating that they seldom or never needed help. Only one child did not search the Web at all. 76% indicated that they liked searching the Web. When the children were asked which search engines they knew, 76% mentioned Google. Of the parents, as many as 90% reported that their children used Google.

Lisa, ten years old, who searches for information on the Web about once a week, first enters the keyword "kangaroo", resulting in many ambiguous search results on the first SERP (e.g., a city magazine and the homepage of a mathematics competition, both of which are called "kangaroo"). She selects two children's encyclopedias also presented on the first SERP, which, however, do not provide the answer to the first subtask. The experimenter finally advises her to enter more specific search terms, and after a second hint to include the term "pouch" she types in "kangaroo pouch". Then, she accesses the second Web page presented on the first SERP, which contains the answer to both subtasks. However, she does not find the answers until the experimenter shows her the respective paragraphs.

4.2 Task performance

4.2.1 Quantitative data

To answer the two subtasks about the kangaroos (see 3.2.2 for the task descriptions), 12 children used natural-language queries such as "Do all kangaroos have a pouch?" or "How does the baby kangaroo stay inside the pouch?", whereas 8 children used keyword queries such as "kangaroo(s)", "kangaroo pouch", "kangaroo baby", "kangaroo without pouch", or "kangaroo baby in the pouch" (for most of the children, the last two, more specific queries were more successful than the first three). One child first used an unspecific keyword query (i.e., "kangaroo"), but then switched to natural-language queries, with which he finally solved the first subtask successfully, but not the second.
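The distinction between the two query types can be illustrated with a small heuristic. This is only a sketch under stated assumptions: the paper does not describe how queries were coded, the study's queries were in German, and the English word list `QUESTION_STARTERS` and the function `query_type` are hypothetical names introduced here for illustration.

```python
# Hypothetical list of English words that typically open a full-sentence question.
QUESTION_STARTERS = {
    "do", "does", "did", "how", "what", "which", "why",
    "where", "who", "when", "is", "are", "can",
}

def query_type(query):
    """Label a query as 'natural-language' or 'keyword' with a rough heuristic."""
    words = query.lower().split()
    if not words:
        return "empty"
    # A trailing question mark or a question-opening first word suggests a full sentence.
    if query.strip().endswith("?") or words[0] in QUESTION_STARTERS:
        return "natural-language"
    return "keyword"

for q in ["Do all kangaroos have a pouch?", "kangaroo pouch", "kangaroo baby in the pouch"]:
    print(q, "->", query_type(q))
```

Such a rule would misjudge statements typed as full sentences without a question word, so it should be read as a starting point rather than a coding scheme.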

Andy, nine years old, who uses the Internet on a daily basis and searches for information on the Web 4-6 days a week, also starts by entering "kangaroo" as the first search term, but then refines the search terms several times during his search, entering "kangaroo without pouch", "red kangaroo", "kangaroo man", "kangaroo male", and "kangaroo male pouch", and accesses several Web pages (among them 4 Wikipedia articles). After having entered the last query he finally finds the answer to the first subtask on a Q&A community site (after 8.5 minutes). Then, he continues his search with the keywords "kangaroo baby in pouch" and "kangaroo baby wikipedia". Finally, he also correctly solves the second subtask (after another 7.1 minutes) by accessing the same Q&A community site as before. Interestingly, there he uses the local search and enters the natural-language query "How do babies stay inside the pouch".

Of the children who used natural-language queries, 8 solved both subtasks correctly, 4 solved at least the first subtask, and only 1 child could not solve either of the two subtasks without help. In contrast, of the children who used keyword queries, only 3 solved both subtasks correctly, 2 solved the first one, and 3 were not able to correctly solve either of the two subtasks. The share of children who correctly solved at least one subtask was marginally significantly higher among those who used natural-language queries (92.3%) than among those who used keyword searches (62.5%), χ²(1, N = 21) = 2.85, p = .09. The time taken for the two subtasks did not differ significantly between children who used natural-language queries (M = 9.0 minutes, SD = 2.8) and those who used keyword searches (M = 10.5 minutes, SD = 4.5), t(19) = -0.99, p = .34. The three children who solved both subtasks successfully by using keyword searches were older boys with high search skills as judged by their parents.
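The reported chi-square value can be retraced from the counts given above. The following is a minimal sketch, assuming a Pearson chi-square test without continuity correction on the 2 x 2 table of query type by success (12 of 13 natural-language users and 5 of 8 keyword users solved at least one subtask); the paper does not name the exact test variant, and the function name `chi_square_2x2` is introduced here for illustration.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: natural-language users, keyword users.
# Columns: solved at least one subtask, solved none.
chi2, p = chi_square_2x2(12, 1, 5, 3)
print(f"chi2 = {chi2:.2f}, p = {p:.2f}")  # chi2 = 2.85, p = 0.09
```

Under this assumption the sketch reproduces the reported statistic. With cell counts this small, an exact test would be a common alternative.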

5. CONCLUSION AND FUTURE WORK

Our observations confirm previous findings on children's query formulation [2, 3, 6, 7, 9]. When searching with keywords, children mostly entered only one or two unspecific search terms, leading to search results that did not match their information needs. First, many of the presented search results were more general information sources that required deep exploration to find the needed information. Second, due to confusingly similar brand and product names and homonyms, result sets retrieved by the children when using keywords frequently contained ambiguous result items. In the case of the kangaroo task, for instance, several search results on the first SERP led to the Website "MathKangaroo", a mathematics competition, and to ‘Das Känguru-Manifest’, a current German bestselling book series. Ambiguous search results, however, require more sophisticated evaluation techniques, involving fast reading and scanning, matching one's information needs to delivered content, and eventually altering query terms accordingly.

4.2.2 Qualitative data

To provide a more detailed and qualitative account of how the children solved the tasks, in the following we briefly present the cases of three children (using fictitious names): Amy, a successful natural-language query user; Lisa, an unsuccessful keyword-query user; and Andy, a successful keyword-query user. Amy, eight years old, who had not used the Web to search for information before, first enters the question "Do all kengaroos have a pouch", then uses Google's ‘Did you mean’ suggestion to correct the spelling error. She quickly scans some of the search results on the first SERP, goes to the second SERP, and selects the third Web page, which provides the answer in its very first sentence, such that Amy after 3.2 minutes correctly solves subtask

Elementary school children, in most cases, have not yet acquired the necessary skills (such as reading techniques and reflection or abstraction skills) to successfully and effortlessly deal with ambiguous search results. Thus, evaluating ambiguous and/or rather general information sources often resulted in working memory overload and weariness. To conclude, especially for younger and less-experienced children, keyword searches seem not to meet their needs while searching the Web with today’s search engines. In contrast, natural-language queries led to more explicit results, comprising more understandable titles and often containing the answer directly in the search result snippet. Accordingly, in many cases natural-language queries resulted in greater search success than keyword searches in our study, with a particular benefit for children with little or no Web search experience (e.g., one inexperienced boy using natural-language queries was able to find all of the answers by only reading the search result snippets on the Google SERPs).

6. REFERENCES

[1] Behrens, P., Rathgeb, T., König, T., and Schmid, T. 2010. KIM-Studie 2010. Kinder + Medien, Computer + Internet. Medienpädagogischer Forschungsverbund Südwest, Stuttgart.
[2] Bilal, D. 2000. Children's use of the Yahooligans! web search engine: I. Cognitive, physical, and affective behaviors on fact-based search tasks. J AM SOC INFORM SCI. 51, 7, 646-665.
[3] Bilal, D. 2002. Children's use of the Yahooligans! web search engine. III. Cognitive and physical behaviors on fully self-generated search tasks. J AM SOC INF SCI TEC. 53, 13, 1170-1183.

Overall, we observed that all children who used keywords experienced some difficulties trying to apply the rules of the common search paradigm. While we may teach children to use keywords when searching (which, in fact, is the reason that those searching with keywords gave for doing so), they still lack the general concepts that would allow them to use this “search knowledge” adequately. This applies to choosing entry terms and refining their queries with additional terms, as well as to understanding how these relate to the results returned. Some children, who struggled trying to search “correctly” or as they had been told before by parents, asked during the search whether it was OK to put a whole question in the box and seemed relieved when the experimenter affirmed this.

[4] De Belder, J., and Moens, M. 2010. Text simplification for children. In Proceedings of the SIGIR '10 workshop on accessible search systems (Geneva, Switzerland). ACM, New York, NY, 19-26.
[5] Druin, A., Hutchinson, H., Foss, E., Hatley, L., Golub, E., Leigh Guha, M., and Fails, J. 2009. How children search the internet with keyword interfaces. In Proceedings of the 8th International Conference on Interaction Design and Children (IDC '09, Como, Italy). ACM, New York, NY, 89-96.

We are currently developing an experimental search user interface (retrieving search results from the Google Custom Search API) that targets the needs of elementary school children (see Figure 1). Based on our observation that children succeed better using their own strategies than trying to apply adult strategies, we aim at enabling them to learn how a search system works and to enter the next level of search, namely keyword search, while using natural-language queries. Specifically, we have designed a scaffolding feature that highlights the most relevant terms of a natural-language query and visualizes the query as manipulative standalone “word-objects” or “phrase-objects”, as a search engine would interpret them. The feature will allow users to click through result sets for narrowed-down selections of search terms, enabling them to “keep” their natural-language searches while exploring alternative search result sets. We assume that the reversible query-reformulation support offered by our prototype will stimulate children to experiment with search queries and allow their search strategies to grow with their evaluation and reflection skills. We have recently finished first usability tests of the newly developed interface feature with a similar user group. In the next step we plan to test the query-tool functionality in multiple sessions in a school environment.
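The idea of turning a natural-language query into selectable "word-objects" can be sketched in a few lines. This is not the prototype's implementation, which the paper does not detail: the stop-word list, the functions `word_objects` and `narrowed_queries`, and the rule that every non-stop-word counts as relevant are all assumptions made for illustration, and the sketch uses English although the study's queries were German.

```python
import re
from itertools import combinations

# Hypothetical, deliberately tiny English stop-word list.
STOP_WORDS = {
    "a", "an", "the", "do", "does", "how", "what", "which", "is", "are",
    "all", "have", "has", "of", "in", "to", "and",
}

def word_objects(query):
    """Split a query into tokens and flag those a keyword search would likely keep."""
    tokens = re.findall(r"\w+", query.lower())
    return [(tok, tok not in STOP_WORDS) for tok in tokens]

def narrowed_queries(query, size):
    """Build keyword queries from `size` of the highlighted terms, in original order."""
    terms = [tok for tok, keep in word_objects(query) if keep]
    return [" ".join(combo) for combo in combinations(terms, size)]

print(word_objects("Do all kangaroos have a pouch?"))
print(narrowed_queries("How does the baby kangaroo stay inside the pouch?", 2))
```

Because the original query is never modified, each narrowed-down selection can be undone by returning to the full question, which mirrors the reversibility described above.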

[6] Druin, A., Foss, E., Hutchinson, H., Golub, E., and Hatley, L. 2010. Children’s roles using keyword search interfaces at home. In Proceedings of the 28th International Conference on Human Factors in Computing Systems (CHI '10, Atlanta, GA, USA). ACM, New York, NY, 413-422.
[7] Eastin, M. S. 2008. Toward a cognitive developmental approach to youth perceptions of credibility. In M.J. Metzger & A.J. Flanagin (Eds.), Digital Media, Youth, and Credibility, pp. 29-48. Cambridge, Massachusetts: MIT Press.
[8] Hirsh, S.G. 1999. Children’s relevance criteria and information seeking on electronic resources. J AM SOC INFORM SCI. 50, 14, 1265-1283.
[9] Jochmann-Mannak, H., Huibers, T., Lentz, L., and Sanders, T. 2010. Children searching information on the Internet: Performance on children’s interfaces compared to Google. In Proceedings of the SIGIR '10 workshop on accessible search systems (Geneva, Switzerland). ACM, New York, 27-35.
[10] Large, A., and Beheshti, J. 2000. The web as a classroom resource: Reactions from the users. J AM SOC INFORM SCI. 51, 12, 1069-80.
[11] Livingstone, S., Bober, M., and Helsper, E. 2005. Internet literacy among children and young people. London: LSE Report.
[12] Marchionini, G. 1989. Information-seeking strategies of novices using a full-text electronic encyclopedia. J AM SOC INFORM SCI. 40, 1, 54-66.
[13] Schacter, J., Chung, G., and Dorr, A. 1998. Children's internet searching on complex problems: Performance and process analyses. J AM SOC INFORM SCI. 840-849.
[14] van der Sluis, F. and van Dijk, B. 2010. A closer look at children's information retrieval usage: Towards child-centered relevance. In Proceedings of the SIGIR '10 workshop on accessible search systems (Geneva, Switzerland). ACM, New York, NY, 3-10.

Figure 1. Our experimental search user interface that visualizes the entered query as manipulative "word-objects".
